SlideShare une entreprise Scribd logo
1  sur  9
Improving MapReduce:
GridGain and Scala To The Rescue!


      Nikita Ivanov, Founder & CEO   GridGain Systems
      October 2012                    www.gridgain.com




                                           #gridgain
Table Of Contents:




         >        30%                         >       70%
              >    Why Real Time Hadoop/MR?       >   Live Coding
              >    GridGain Overview              >   Real Time & Streaming Word Count
              >    In-Memory Computing
              >    Compute & Data Grids




www.gridgain.com                                                                   Slide 2
Real Time MapReduce/Hadoop:




                               Why?




www.gridgain.com                      Slide 3
In-Memory Computing: Why Now?

                                 “In-memory will have an industry impact
                              comparable to web and cloud. RAM is a new disk,
                                          and disk is a new tape.”


Technology & Cost:                                                       Performance & Scalability Matters:
>   64-bit CPU can address 16 exabytes                                   >   Citi: 100ms == $1M loss
    Entire active data set on the planet is addressable by just 1 CPU.       Forex trading

>   Disk up to 105 times slower than DRAM                                >   Google: 500ms == 20% traffic drop
                                                                             Dropping 20% of revenue
    SSD drives are up to 103 times slower

>   Super effective in-memory parallelization
                                                                         >   SAP sees +206% in profit in Q112
                                                                             For in-memory SAP HANA products
    Enabled by modern multicore CPUs
                                                                         >   Software AG sees 3x revenue in 2012
>   DRAM prices drop 30% every 18 months                                     For in-memory Terracota products
    1TB RAM & 48 cores cluster ~ $40K (< $20K in 3 years)




www.gridgain.com                                                                                                Slide 4
GridGain: In-Memory Data Platform
   >   In-Memory Compute Grid +                 >   Full ACID Transactions
       In-Memory Data Grid                          Fully distributed ACID transactions

   >   Real Time & Streaming MapReduce, CEP     >   Simplicity and Productivity
                                                    Dramatically reduces cost of application development
   >   Three Editions:                              Demonstrably faster time-to-market
        > In-Memory HPC
            HPC market
                                                    Example:
        >   In-Memory Data Grid
                                                    Full source code in Scala of world’s shortest real time MapReduce app
            Transactional Data Caching market
                                                    built with GridGain. Works on one or thousands of computers with no
                                                    code changes required.
        >   In-Memory Hadoop
            Real Time Big Data market

   >   Language support:
       Server: Java, Scala, Groovy,
       Clients: .NET, PHP, REST, C++

   >   Mobile platforms support:
       iOS/ObjectiveC, Android clients




www.gridgain.com                                                                                             Slide 5
GridGain: In-Memory Compute Grid


>   Direct API for split and aggregation
>   Pluggable failover, topology and collision resolution
>   Distributed task session
>   Distributed continuations & recursive split
>   Support for Streaming MapReduce
>   Support for Complex Event Processing (CEP)
>   Node-local cache
>   AOP-based, OOP/FP-based execution modes
>   Direct closure distribution in Java, Scala and Groovy
>   Cron-based scheduling
>   Direct redundant mapping support
>   Zero deployment with P2P class loading
>   Partial asynchronous reduction
>   Direct support for weighted and adaptive mapping
>   State checkpoints for long running tasks
>   Early and late load balancing
>   Affinity routing with data grid




    www.gridgain.com                                        Slide 6
GridGain: In-Memory Data Grid
>   Zero deployment for data
>   Local, full replicable and partitioned cache types
>   Pluggable expiration policies (LRU, LIRS, random, time-based)
>   Read-through and write-through logic with pluggable cache store
>   Synchronous and asynchronous cache operations
>   MVCC-based concurrency
>   Pluggable data overflow storage via new swap space SPI
>   PESSIMISTIC, OPTIMISTIC transactions
>   Standard isolation levels, JTA/JCA integration
>   Master/Master data replication/invalidation
>   Write-behind cache store support
>   Concurrent and transactional data preloading
>   Delayed preloading support
>   Affinity routing with compute grid
>   Partitioned cache with active replicas
>   Structured and unstructured data
>   Datacenter replication
>   JDBC driver for in-memory object data store
>   Off-heap memory support
>   Pluggable indexing via Indexing SPI
>   Tiered storage with on-heap, off-heap, swap space, SQL, and Hadoop
>   Distributed in-memory query capability
>   SQL, H2, Lucene, predicate-based affinity co-located queries



    www.gridgain.com                                                     Slide 7
Live Coding: GridGain + Scala




>       100% Live Coding:
    >       Nothing pre-built
    >       Every line & character
    >       Everything from the start




        www.gridgain.com                 Slide 8
Thank You!



  #gridgain

Contenu connexe

Plus de JAX London

Clojure made-simple - John Stevenson
Clojure made-simple - John StevensonClojure made-simple - John Stevenson
Clojure made-simple - John StevensonJAX London
 
HTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias Wessendorf
HTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias WessendorfHTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias Wessendorf
HTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias WessendorfJAX London
 
Play framework 2 : Peter Hilton
Play framework 2 : Peter HiltonPlay framework 2 : Peter Hilton
Play framework 2 : Peter HiltonJAX London
 
Complexity theory and software development : Tim Berglund
Complexity theory and software development : Tim BerglundComplexity theory and software development : Tim Berglund
Complexity theory and software development : Tim BerglundJAX London
 
Why FLOSS is a Java developer's best friend: Dave Gruber
Why FLOSS is a Java developer's best friend: Dave GruberWhy FLOSS is a Java developer's best friend: Dave Gruber
Why FLOSS is a Java developer's best friend: Dave GruberJAX London
 
Akka in Action: Heiko Seeburger
Akka in Action: Heiko SeeburgerAkka in Action: Heiko Seeburger
Akka in Action: Heiko SeeburgerJAX London
 
NoSQL Smackdown 2012 : Tim Berglund
NoSQL Smackdown 2012 : Tim BerglundNoSQL Smackdown 2012 : Tim Berglund
NoSQL Smackdown 2012 : Tim BerglundJAX London
 
Closures, the next "Big Thing" in Java: Russel Winder
Closures, the next "Big Thing" in Java: Russel WinderClosures, the next "Big Thing" in Java: Russel Winder
Closures, the next "Big Thing" in Java: Russel WinderJAX London
 
Java and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk PepperdineJava and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk PepperdineJAX London
 
Mongo DB on the JVM - Brendan McAdams
Mongo DB on the JVM - Brendan McAdamsMongo DB on the JVM - Brendan McAdams
Mongo DB on the JVM - Brendan McAdamsJAX London
 
New opportunities for connected data - Ian Robinson
New opportunities for connected data - Ian RobinsonNew opportunities for connected data - Ian Robinson
New opportunities for connected data - Ian RobinsonJAX London
 
HTML5 Websockets and Java - Arun Gupta
HTML5 Websockets and Java - Arun GuptaHTML5 Websockets and Java - Arun Gupta
HTML5 Websockets and Java - Arun GuptaJAX London
 
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian PloskerThe Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian PloskerJAX London
 
Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...
Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...
Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...JAX London
 
No Crash Allowed - Patterns for fault tolerance : Uwe Friedrichsen
No Crash Allowed - Patterns for fault tolerance : Uwe FriedrichsenNo Crash Allowed - Patterns for fault tolerance : Uwe Friedrichsen
No Crash Allowed - Patterns for fault tolerance : Uwe FriedrichsenJAX London
 
Size does matter - Patterns for high scalability: Uwe Friedrichsen
Size does matter - Patterns for high scalability: Uwe FriedrichsenSize does matter - Patterns for high scalability: Uwe Friedrichsen
Size does matter - Patterns for high scalability: Uwe FriedrichsenJAX London
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars GeorgeJAX London
 
Scala in Action - Heiko Seeburger
Scala in Action - Heiko SeeburgerScala in Action - Heiko Seeburger
Scala in Action - Heiko SeeburgerJAX London
 
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...JAX London
 
Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...
Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...
Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...JAX London
 

Plus de JAX London (20)

Clojure made-simple - John Stevenson
Clojure made-simple - John StevensonClojure made-simple - John Stevenson
Clojure made-simple - John Stevenson
 
HTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias Wessendorf
HTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias WessendorfHTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias Wessendorf
HTML alchemy: the secrets of mixing JavaScript and Java EE - Matthias Wessendorf
 
Play framework 2 : Peter Hilton
Play framework 2 : Peter HiltonPlay framework 2 : Peter Hilton
Play framework 2 : Peter Hilton
 
Complexity theory and software development : Tim Berglund
Complexity theory and software development : Tim BerglundComplexity theory and software development : Tim Berglund
Complexity theory and software development : Tim Berglund
 
Why FLOSS is a Java developer's best friend: Dave Gruber
Why FLOSS is a Java developer's best friend: Dave GruberWhy FLOSS is a Java developer's best friend: Dave Gruber
Why FLOSS is a Java developer's best friend: Dave Gruber
 
Akka in Action: Heiko Seeburger
Akka in Action: Heiko SeeburgerAkka in Action: Heiko Seeburger
Akka in Action: Heiko Seeburger
 
NoSQL Smackdown 2012 : Tim Berglund
NoSQL Smackdown 2012 : Tim BerglundNoSQL Smackdown 2012 : Tim Berglund
NoSQL Smackdown 2012 : Tim Berglund
 
Closures, the next "Big Thing" in Java: Russel Winder
Closures, the next "Big Thing" in Java: Russel WinderClosures, the next "Big Thing" in Java: Russel Winder
Closures, the next "Big Thing" in Java: Russel Winder
 
Java and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk PepperdineJava and the machine - Martijn Verburg and Kirk Pepperdine
Java and the machine - Martijn Verburg and Kirk Pepperdine
 
Mongo DB on the JVM - Brendan McAdams
Mongo DB on the JVM - Brendan McAdamsMongo DB on the JVM - Brendan McAdams
Mongo DB on the JVM - Brendan McAdams
 
New opportunities for connected data - Ian Robinson
New opportunities for connected data - Ian RobinsonNew opportunities for connected data - Ian Robinson
New opportunities for connected data - Ian Robinson
 
HTML5 Websockets and Java - Arun Gupta
HTML5 Websockets and Java - Arun GuptaHTML5 Websockets and Java - Arun Gupta
HTML5 Websockets and Java - Arun Gupta
 
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian PloskerThe Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
 
Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...
Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...
Bluffers guide to elitist jargon - Martijn Verburg, Richard Warburton, James ...
 
No Crash Allowed - Patterns for fault tolerance : Uwe Friedrichsen
No Crash Allowed - Patterns for fault tolerance : Uwe FriedrichsenNo Crash Allowed - Patterns for fault tolerance : Uwe Friedrichsen
No Crash Allowed - Patterns for fault tolerance : Uwe Friedrichsen
 
Size does matter - Patterns for high scalability: Uwe Friedrichsen
Size does matter - Patterns for high scalability: Uwe FriedrichsenSize does matter - Patterns for high scalability: Uwe Friedrichsen
Size does matter - Patterns for high scalability: Uwe Friedrichsen
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars George
 
Scala in Action - Heiko Seeburger
Scala in Action - Heiko SeeburgerScala in Action - Heiko Seeburger
Scala in Action - Heiko Seeburger
 
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
 
Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...
Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...
Choosing the right Agile innovation practices: Scrum vs Kanban vs Lean Startu...
 

Dernier

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 

Dernier (20)

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

Improving MapReduce: Scala and Gridgain to the rescue! Nikita Ivanov

  • 1. Improving MapReduce: GridGain and Scala To The Rescue! Nikita Ivanov, Founder & CEO GridGain Systems October 2012 www.gridgain.com #gridgain
  • 2. Table Of Contents: > 30% > 70% > Why Real Time Hadoop/MR? > Live Coding > GridGain Overview > Real Time & Streaming Word Count > In-Memory Computing > Compute & Data Grids www.gridgain.com Slide 2
  • 3. Real Time MapReduce/Hadoop: Why? www.gridgain.com Slide 3
  • 4. In-Memory Computing: Why Now? “In-memory will have an industry impact comparable to web and cloud. RAM is a new disk, and disk is a new tape.” Technology & Cost: Performance & Scalability Matters: > 64-bit CPU can address 16 exabytes > Citi: 100ms == $1M loss Entire active data set on the planet is addressable by just 1 CPU. Forex trading > Disk up to 105 times slower than DRAM > Google: 500ms == 20% traffic drop Dropping 20% of revenue SSD drives are up to 103 times slower > Super effective in-memory parallelization > SAP sees +206% in profit in Q112 For in-memory SAP HANA products Enabled by modern multicore CPUs > Software AG sees 3x revenue in 2012 > DRAM prices drop 30% every 18 months For in-memory Terracota products 1TB RAM & 48 cores cluster ~ $40K (< $20K in 3 years) www.gridgain.com Slide 4
  • 5. GridGain: In-Memory Data Platform > In-Memory Compute Grid + > Full ACID Transactions In-Memory Data Grid Fully distributed ACID transactions > Real Time & Streaming MapReduce, CEP > Simplicity and Productivity Dramatically reduces cost of application development > Three Editions: Demonstrably faster time-to-market > In-Memory HPC HPC market Example: > In-Memory Data Grid Full source code in Scala of world’s shortest real time MapReduce app Transactional Data Caching market built with GridGain. Works on one or thousands of computers with no code changes required. > In-Memory Hadoop Real Time Big Data market > Language support: Server: Java, Scala, Groovy, Clients: .NET, PHP, REST, C++ > Mobile platforms support: iOS/ObjectiveC, Android clients www.gridgain.com Slide 5
  • 6. GridGain: In-Memory Compute Grid > Direct API for split and aggregation > Pluggable failover, topology and collision resolution > Distributed task session > Distributed continuations & recursive split > Support for Streaming MapReduce > Support for Complex Event Processing (CEP) > Node-local cache > AOP-based, OOP/FP-based execution modes > Direct closure distribution in Java, Scala and Groovy > Cron-based scheduling > Direct redundant mapping support > Zero deployment with P2P class loading > Partial asynchronous reduction > Direct support for weighted and adaptive mapping > State checkpoints for long running tasks > Early and late load balancing > Affinity routing with data grid www.gridgain.com Slide 6
  • 7. GridGain: In-Memory Data Grid > Zero deployment for data > Local, full replicable and partitioned cache types > Pluggable expiration policies (LRU, LIRS, random, time-based) > Read-through and write-through logic with pluggable cache store > Synchronous and asynchronous cache operations > MVCC-based concurrency > Pluggable data overflow storage via new swap space SPI > PESSIMISTIC, OPTIMISTIC transactions > Standard isolation levels, JTA/JCA integration > Master/Master data replication/invalidation > Write-behind cache store support > Concurrent and transactional data preloading > Delayed preloading support > Affinity routing with compute grid > Partitioned cache with active replicas > Structured and unstructured data > Datacenter replication > JDBC driver for in-memory object data store > Off-heap memory support > Pluggable indexing via Indexing SPI > Tiered storage with on-heap, off-heap, swap space, SQL, and Hadoop > Distributed in-memory query capability > SQL, H2, Lucene, predicate-based affinity co-located queries www.gridgain.com Slide 7
  • 8. Live Coding: GridGain + Scala > 100% Live Coding: > Nothing pre-built > Every line & character > Everything from the start www.gridgain.com Slide 8
  • 9. Thank You! #gridgain