SCALABLE DATABASES
From Relational Databases To Polyglot Persistence

Sergio Bossa
sergio.bossa@gmail.com
http://twitter.com/sbtourist




Sergio Bossa – sergio.bossa@gmail.com
Javaday IV – Rome – 30 January 2010
About Me
●   Software architect and engineer
    ●   Gioco Digitale (online gambling and casinos)
●   Open Source enthusiast
    ●   Terracotta Messaging (http://forge.terracotta.org)
    ●   Terrastore (http://code.google.com/p/terrastore)
    ●   Actorom (http://code.google.com/p/actorom)
●   (Micro-)Blogger
    ●   http://twitter.com/sbtourist
    ●   http://sbtourist.blogspot.com

Five fallacies of data-centric systems


      Data model is static.
   Data volume is predictable.
 Data access load is predictable.
Database topology doesn't change.
      Database never fails.




Scalable databases in action
●   Scale your database to address the fallacies above.
    ●   Scale to handle heterogeneous data.
    ●   Scale to handle more data.
    ●   Scale to handle more load.
    ●   Scale to handle topology changes due to:
        ●   Unplanned growth.
        ●   Unpredictable failures.


Scaling Relational Databases




Master-Slave replication
●   Master - Slave replication.
    ●   One (and only one) master
        database.
    ●   One or more slaves.
    ●   All writes go to the master.
        ●   Replicated to slaves.
    ●   Reads are balanced among master
        and slaves.
●   Major issues:
    ●   Single point of failure.
    ●   Single point of bottleneck.
    ●   Static topology.
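
The routing rule on this slide can be sketched in a few lines. This is a toy illustration only: the `ReplicatedDatabase` class is hypothetical, plain dictionaries stand in for database nodes, and replication is done synchronously for simplicity (real systems often replicate asynchronously, so slaves may lag).

```python
import itertools


class ReplicatedDatabase:
    """Toy master-slave router: writes hit the master, reads are balanced."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]
        # Reads rotate over the master and every slave (round robin).
        self._readers = itertools.cycle([self.master] + self.slaves)

    def write(self, key, value):
        # Every write goes to the single master first ...
        self.master[key] = value
        # ... and is then replicated to all slaves.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        return next(self._readers).get(key)


db = ReplicatedDatabase(slave_count=2)
db.write("user:1", "alice")
reads = [db.read("user:1") for _ in range(3)]  # one read per node
```

Every write passes through `self.master`, which is why the master is both the single point of failure and the write bottleneck listed above.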


Master-Master replication
●   Master - Master replication.
    ●   Two or more masters.
    ●   Writes and reads can go to any
        master node.
        ●   Writes are replicated among
            masters.
●   Major issues:
    ●   Limited performance and scalability
        (typically due to two-phase commit, 2PC).
    ●   Complexity.
    ●   Static topology.




Vertical partitioning
●   Vertical partitioning.
    ●   Put tables belonging to different
        functional areas on different
        database nodes.
        ●   Scale your data and load by
            function.
        ●   Move joins to the application
            level.
●   Major issues:
    ●   No longer truly relational.
    ●   What if a functional area grows too
        much?
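
A minimal sketch of what "move joins to the application level" can look like, assuming two functional areas (users and orders) stored on separate nodes. The data, the node variables and the `orders_with_user_names` helper are all made up for illustration.

```python
# Two functional areas living on two separate (hypothetical) database nodes.
users_node = {1: {"name": "alice"}, 2: {"name": "bob"}}
orders_node = [
    {"order_id": 10, "user_id": 1, "total": 25.0},
    {"order_id": 11, "user_id": 2, "total": 40.0},
    {"order_id": 12, "user_id": 1, "total": 15.0},
]


def orders_with_user_names(orders, users):
    """Join orders to users in application code instead of in SQL."""
    return [
        {
            "order_id": order["order_id"],
            "name": users[order["user_id"]]["name"],
            "total": order["total"],
        }
        for order in orders
    ]


joined = orders_with_user_names(orders_node, users_node)
```

The database no longer enforces the relationship between the two tables, which is the sense in which the system stops being truly relational.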




Horizontal partitioning
●   Horizontal partitioning.
    ●   Split tables by key and put
        partitions (shards) on different
        nodes.
        ●   Scale your data and load by key.
        ●   Move joins to the application
            level.
        ●   Needs some kind of routing.
●   Major issues:
    ●   No longer truly relational.
    ●   What if your partition grows too
        much?
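
The "some kind of routing" mentioned above could be as simple as hashing the key. The sketch below is one possible scheme, not the only one; `ShardRouter` is a hypothetical name and dictionaries stand in for the shard nodes.

```python
import hashlib


class ShardRouter:
    """Routes each key to one of a fixed number of shards."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, key):
        # Use a stable hash: Python's built-in hash() is salted per process
        # for strings, so it would route differently after a restart.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, row):
        self.shards[self.shard_for(key)][key] = row

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


router = ShardRouter(shard_count=4)
for user_id in ("user:1", "user:2", "user:3"):
    router.put(user_id, {"id": user_id})
```

With plain modulo routing, changing the number of shards remaps most keys, which is one reason a partition that grows too much is hard to deal with.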



Caching
●   Put a cache in front of your database.
    ●   Distribute.
    ●   Write-through for scaling reads.
    ●   Write-behind for scaling reads and
        writes.
●   Saves you a lot of pain, but ...
    ●   “Only” scales read/write load.
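
The difference between the two write strategies can be sketched as follows. The class names are hypothetical and a dictionary stands in for the database; a real write-behind cache would flush in the background rather than on an explicit call.

```python
class Cache:
    """Shared read path: serve from memory, fall back to the database."""

    def __init__(self, database):
        self.database = database  # a plain dict stands in for the database
        self.entries = {}

    def get(self, key):
        if key not in self.entries and key in self.database:
            self.entries[key] = self.database[key]
        return self.entries.get(key)


class WriteThroughCache(Cache):
    def put(self, key, value):
        self.entries[key] = value
        self.database[key] = value  # the database is written on every put


class WriteBehindCache(Cache):
    def __init__(self, database):
        super().__init__(database)
        self.pending = {}

    def put(self, key, value):
        self.entries[key] = value
        self.pending[key] = value  # the database write is deferred

    def flush(self):
        # Deferred writes reach the database in one batch.
        self.database.update(self.pending)
        self.pending.clear()


through_db, behind_db = {}, {}
through = WriteThroughCache(through_db)
through.put("a", 1)
behind = WriteBehindCache(behind_db)
behind.put("a", 1)
```

After these calls `through_db` already holds the value, while `behind_db` stays empty until `behind.flush()` runs. Deferring and batching is what lets write-behind absorb write load, at the cost of a window in which the database is out of date.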




Did we solve our fallacies?
●   We tried, but ...
    ●   Still bound to the relational model.
    ●   Replication only covers a few use cases.
    ●   Partitioning is hard.
    ●   Caching is good, but not definitive.
    ●   ...
●   Can we do any better?


It's Not Only SQL




NOSQL Characteristics
●   Main characterizing traits:
    ●   Data Model.
    ●   Data Processing.
    ●   Consistency Model.
    ●   Scale Out.




Data Model (1)
●   Column-family based.
●   Structure:
    ●   Key-identified rows with a sparse set of columns.
    ●   Columns grouped in families.
    ●   Multiple families for the same key.
●   Highlights:
    ●   Dynamically add and remove columns.
    ●   Efficiently access columns in the same group (column
        family).
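
As a rough mental model, the structure above behaves like nested maps: row key, then column family, then column. The sketch below uses made-up row keys and family names and says nothing about how any real product stores data on disk.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))

table["user:1"]["profile"]["name"] = "alice"
table["user:1"]["profile"]["city"] = "Rome"
table["user:1"]["stats"]["logins"] = 42
table["user:2"]["profile"]["name"] = "bob"  # sparse: no city, no stats family

# Columns can be added and removed per row at any time.
table["user:2"]["profile"]["nickname"] = "bobby"
del table["user:1"]["profile"]["city"]

# Reading one family returns all of its columns together.
profile = dict(table["user:1"]["profile"])
```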
Data Model (2)
●   Document based.
●   Structure:
    ●   Key-identified documents.
    ●   Schema-less (but optionally constrained).
        – JSON, XML ...
●   Highlights:
    ●   Dynamically change the inner structure of documents.
    ●   Efficiently access documents as a unit.
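
A minimal sketch of a key-identified, schema-less document store with an optional constraint. The `put` and `get` helpers and the `required` parameter are hypothetical and only illustrate the idea.

```python
import json

documents = {}  # key -> serialized JSON document


def put(key, document, required=()):
    """Store a document; optionally insist on a few required fields."""
    missing = [field for field in required if field not in document]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    documents[key] = json.dumps(document)


def get(key):
    """Return the whole document as a unit."""
    return json.loads(documents[key])


# Two documents with different inner structure under the same store.
put("user:1", {"name": "alice", "emails": ["alice@example.com"]})
put("user:2", {"name": "bob", "address": {"city": "Rome"}}, required=("name",))
```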

Data Model (3)
●   Graph based.
●   Structure:
    ●   Nodes to represent your data.
    ●   Relations as meaningful links between nodes.
    ●   Properties to enrich both.
●   Highlights:
    ●   Rich data model.
    ●   Efficient, fast traversal of nodes and relations.
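
The node, relation and property structure can be sketched with plain Python collections. The names and data below are invented, and scanning a list of relations is far from the efficient traversal a graph database provides; the point is only the shape of the model.

```python
from collections import deque

# Nodes and relations both carry properties.
nodes = {"alice": {"age": 30}, "bob": {"age": 25}, "carol": {"age": 35}, "dave": {"age": 40}}
relations = [
    ("alice", "KNOWS", "bob", {"since": 2005}),
    ("bob", "KNOWS", "carol", {"since": 2008}),
    ("carol", "WORKS_WITH", "dave", {}),
]


def traverse(start, relation_type):
    """Breadth-first traversal following only relations of the given type."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for source, kind, target, _properties in relations:
            if source == current and kind == relation_type and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


reachable = traverse("alice", "KNOWS")  # → ["alice", "bob", "carol"]
```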

Data Model (4)
●   Key-Value based.
●   Structure:
    ●   Key-identified opaque values.
●   Highlights:
    ●   Great flexibility.
    ●   Fast reads/writes for single entries.




Data Processing
●   Several options:
    ●   Map/Reduce.
    ●   Predicates.
    ●   Range Queries.
    ●   ...
●   One common principle:
    ●   Move processing toward related data.
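
"Move processing toward related data" can be sketched with a tiny Map/Reduce: each partition computes a small partial result where its data lives, and only those partials travel to be combined. The partitions here are in-memory lists and all names are illustrative.

```python
from collections import Counter
from functools import reduce

# Each inner list stands for the data held by one node.
partitions = [
    ["red", "blue", "red"],
    ["blue", "green"],
    ["red"],
]


def map_partition(values):
    """Runs next to the data and returns a small partial result."""
    return Counter(values)


def reduce_counts(left, right):
    """Combines two partial results into one."""
    return left + right


partials = [map_partition(partition) for partition in partitions]
totals = reduce(reduce_counts, partials, Counter())
```

Here `totals` ends up as red 3, blue 2, green 1, and only three small counters had to move rather than the raw values.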


Consistency Model (1)
●   Strict Consistency.
    ●   All nodes ...
    ●   At every point in time ...
    ●   See a consistent view of the stored data.
        –   Per-key consistency.
        –   Multi-key consistency.




Consistency Model (2)
●   Eventual Consistency.
    ●   Only a subset of all nodes ...
    ●   At a specific point in time ...
    ●   See a consistent view of the stored data.
         –   Other nodes will serve stale data.
         –   Other nodes will eventually get the updates.
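
A toy model of the behaviour described above, assuming two nodes and an explicit delivery step that stands in for asynchronous replication. The names `primary`, `secondary`, `pending` and `deliver_pending` are hypothetical.

```python
primary, secondary = {}, {}
pending = []  # updates accepted by one node but not yet seen by the other


def write(key, value):
    primary[key] = value
    pending.append((key, value))  # replication happens later


def deliver_pending():
    """Stands in for asynchronous replication catching up."""
    while pending:
        key, value = pending.pop(0)
        secondary[key] = value


write("cart:1", ["book"])
stale_read = secondary.get("cart:1")  # the other node has not seen the write yet
deliver_pending()
fresh_read = secondary.get("cart:1")  # now both nodes agree
```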




Scale Out (1)
●   Master-based.
    ●   Membership managed and
        broadcast by masters.
    ●   Data consistency guaranteed by
        masters.
    ●   No single point of failure (SPOF) with
        active/passive masters.
    ●   No single point of bottleneck (SPOB) with
        active/active masters or cluster-cluster
        replication.
    ●   Prone to network partitioning failures.




Scale Out (2)
●   Peer-to-peer.
    ●   Membership is maintained through
        multicast or gossip-based protocols.
    ●   Data consistency is maintained
        through quorum protocols.
    ●   Easier to scale.
    ●   Harder to maintain consistency.
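
One common form of quorum protocol uses N replicas, a write quorum W and a read quorum R; when R + W > N, every read set overlaps the latest write set. The sketch below is a simplified illustration under that assumption. `QuorumStore` is a hypothetical class, and the replicas are deliberately picked so that read and write sets overlap as little as possible.

```python
class QuorumStore:
    """Toy N/W/R quorum store over in-memory replicas."""

    def __init__(self, replica_count, write_quorum, read_quorum):
        self.replicas = [{} for _ in range(replica_count)]
        self.write_quorum = write_quorum
        self.read_quorum = read_quorum
        self.version = 0

    def write(self, key, value):
        self.version += 1
        # Only the first W replicas acknowledge; the others stay stale.
        for replica in self.replicas[: self.write_quorum]:
            replica[key] = (self.version, value)

    def read(self, key):
        # Ask the last R replicas: the worst case for overlapping the write set.
        answers = [
            replica[key]
            for replica in self.replicas[-self.read_quorum :]
            if key in replica
        ]
        if not answers:
            return None
        # The highest version wins.
        return max(answers, key=lambda answer: answer[0])[1]


strong = QuorumStore(replica_count=3, write_quorum=2, read_quorum=2)  # R + W > N
strong.write("k", "v1")
strong.write("k", "v2")

weak = QuorumStore(replica_count=3, write_quorum=1, read_quorum=1)  # R + W <= N
weak.write("k", "v1")
```

With these settings `strong.read("k")` sees the latest write, while `weak.read("k")` can miss it entirely, which shows why consistency is harder to maintain in a peer-to-peer design.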




NOSQL Use Cases
●   Use cases revolve around the following kinds of data:
    ●   Rich.
    ●   Runtime.
    ●   Hot Spot.
    ●   Massive.
    ●   Computational.
●   Do not use the same product for all cases.
    ●   Pick multiple products for different use cases.

NOSQL Products - Cassandra
●   Cassandra (http://incubator.apache.org/cassandra)
●   Data Model:
    ●   Column-family based.
●   Data Processing:
    ●   Range queries, Predicates.
●   Consistency:
    ●   Eventual consistency.
●   Scalability:
    ●   Peer-to-peer, gossip based.
NOSQL Products - MongoDB
●   MongoDB (http://www.mongodb.org)
●   Data Model:
    ●   Document based (JSON).
●   Data Processing:
    ●   Map/Reduce, SQL-like queries.
●   Consistency:
    ●   Per-document strict consistency.
●   Scalability:
    ●   Replication, partitioning (alpha).
NOSQL Products - Neo4j
●   Neo4j (http://neo4j.org)
●   Data Model:
    ●   Graph based.
●   Data Processing:
    ●   Path traversal, Index-based search.
●   Consistency:
    ●   Strict consistency.
●   Scalability:
    ●   Replication.
NOSQL Products - Riak
●   Riak (http://riak.basho.com)
●   Data Model:
    ●   Document based (JSON).
●   Data Processing:
    ●   Map/Reduce.
●   Consistency:
    ●   Eventual consistency.
●   Scalability:
    ●   Peer-to-peer, gossip based.
NOSQL Products - Terrastore
●   Terrastore (http://code.google.com/p/terrastore)
●   Data Model:
    ●   Document based (JSON).
●   Data Processing:
    ●   Range queries, Predicates.
●   Consistency:
    ●   Per-document strict consistency.
●   Scalability:
    ●   Master-based.
NOSQL Products - Voldemort
●   Voldemort (http://project-voldemort.com)
●   Data Model:
    ●   Key-Value.
●   Data Processing:
    ●   None.
●   Consistency:
    ●   Eventual consistency.
●   Scalability:
    ●   Peer-to-peer, gossip based.
NOSQL Products and Use Cases




Final words
●   A New World.
    ●   New paradigms.
    ●   New use cases.
    ●   New products.
●   Don't dismiss the old stuff.
    ●   Relational databases still have their place.
●   Embrace change.
    ●   May the NOSQL power be with you.
●   Let the Polyglot Persistence era begin!
