SlideShare une entreprise Scribd logo
1  sur  36
Introduction to  NoSQL Databases San Diego NoSQL Meetup – Aug 2010 By Derek Stainer http://nosqldatabases.com
Agenda Introduction Objective Explore NoSQL Databases Conclusion
Introduction UCSD Graduate in Computer Science Java Developer for 10 years Creator of http://nosqldatabases.com Curator of NoSQL information
Objective Deeper dive into each type of NoSQL database Discuss 1-2 NoSQL databases  in each family of databases
NoSQL Taxonomy Key/Value Document Column Graph Others Geospatial File System Object
Key/Value Databases Global collection of Key/Value pairs Inspired by Amazon’s Dynamo and Distributed Hashtables Designed to handle massive load Multiple Types In memory i.e. Memcache On Disk i.e. Redis, SimpleDB Eventually Consistent i.e. Dynamo, Voldemort
Key/Value: Voldemort Created by LinkedIn, now open source Inspired by Amazon’s Dynamo Written in Java Pluggable Storage BerkeleyDB, In Memory, MySQL Pluggable Serialization JSON, Thrift, Protocol Buffers, etc. Cluster Rebalancing
Key/Value: Voldemort Versioning, based on Vector Clocks Reconciliation occurs on reads. Partitioning and Replication based on Dynamo Consistent Hashing Virtual Nodes Gossip
Other Key/Value Stores Other Key/Value Stores Amazon’s Dynamo Riak Redis Memcache SimpleDB
Document Databases Similar to a Key/Value database but with a major difference, value is a document Inspired by Lotus Notes Flexible Schema Any number of fields can be added Documents stored in JSON or BSON formats Examples: CouchDB, MongoDB
Sample Document {      "day": [ 2010, 01, 23 ],      "products": {          "apple": { "price": 10 "quantity": 6 },          "kiwi": { "price": 20 "quantity": 2 }      },      "checkout": 100  }
Document: CouchDB Development began ~ 2005 by Damien Katz former Lotus Notes Developer Couch – Cluster Of Unreliable Commodity Hardware Top level Apache Project Commercially supported by CouchIO Licensed under Apache License Written in Erlang Documents are stored in JSON
Document: CouchDB [cont’d] B-Tree Storage Engine MVCC model, no locking  No joins, primary key or foreign key (UUIDs are auto assigned)  Built bi-directional replication Can even run offline, come back and sync back changes Custom persistent views using MapReduce REST API
Document: MongoDB Development started in 2007 Commercially supported and developed by 10Gen Stores documents using BSON Supports AdHoc queries Can query against embedded objects and arrays Support multiples types of indexing
Document: MongoDB [cont’d] Officially supported drivers available for multiple languages C, C++, Java, Javascript, Perl, PHP, Python and Ruby Community supported drivers include: Scala, Node.js, Haskell, Erlang, Smalltalk Replication uses a master/slave model Scales horizontally via sharding Written C++
Column Family Databases Each key is associated with multiple attributes (i.e. Columns) Hybrid row/column stores Inspired by Google BigTable Examples: HBase, Cassandra
Column: HBase Based on Google’s BigTable Apache Project TLP Cloudera (certifications, EC2 AMI’s, etc.) Layered over HDFS (Hadoop Distributed File System) Input/Output for MapReduce Jobs APIs Thrift, REST
Column: Hbase [cont’d] Automatic partitioning Automatic re-balancing/re-partitioning Fault tolerant HDFS  Multiple Replicas Highly distributed
Column: Hbase [cont’d] Lars George
Column: Cassandra Created at Facebook for Inbox search Facebook -> Google Code -> ASF Commercial Support available from Riptano Features taken from both Dynamo and BigTable Dynamo – Consistent hashing, Partitioning, Replication Big Table – Column Familes, MemTables, SSTables
Column: Cassandra [cont’d] Symmetric nodes No single point of failure Linearly scalable Ease of administration Flexible/Automated Provisioning Flexible Replica Replacement High Availability Eventual Consistency However, consistency is tuneable
Column: Cassandra [cont’d] Partitioning Random Good distribution of data between nodes Range scans not possible Order Preserving Can lead to unbalanced nodes Range scans, Natural Order Custom Extremely fast reads/writes (low latency) Thrift API
Column: Cassandra [cont’d] Column Basic unit of storage Column Family Collection of like records Record level atomicity Indexed Keyspace Top level namespace Usually one per application
Column: Cassandra [cont’d] Eric Evans
Column: Cassandra [cont’d] Column Details Name byte[] Queried against Determines sort order Value byte[] Opaque to Cassandra Timestamp long Conflict resolution (last write wins)
Graph Databases Inspired by Euler Graph Theory, G=(E,V) Focused on modeling the structure of the data Property Graph Data Model Examples: Neo4j, InfiniteGraph
Sample Property Graph[] Todd Hoff
Graph: Neo4j Data Model: Property Graph Nodes – Person, Place, Thing, etc. Relationships – Lives, Likes, Owns, etc. Properties on Both Primary operation is graph traversal between nodes Written in Java Embedded database
Graph: Neo4j [cont’d] Disk-based Graph stored in custom binary format Transactional JTA/JTS, XA, 2PC, MVCC Scales Billions of nodes/relationships/properties per JVM Robust 6+ years in 24/7 production
Graph: Neo4j [cont’d] Multiple language binds Jython, Cpython Jruby (including RESTful API) Clojure Scala (including RESTful API) Uses Social Graph i.e. Facebook Recommendation Engines Financial Audit
Graph: Neo4j [cont’d] Licensed under AGPLv3 Dual Commercial License Available First server is free Second server Inexpensive Commercial support provided by Neo Technologies
Other Graph Databases Other graph databases InfiniteGraph HyperGraphDB sones
Conclusion
Thank You!
References NoSQL Databases - Part 1 – Landscape, Vineet Guptahttp://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html NoSQL for Dummies, Tobias Ivarssonhttp://www.slideshare.net/thobe/nosql-for-dummies NoSQL Databases, Marin Dimitrovhttp://www.slideshare.net/marin_dimitrov/nosql-databases-3584443 CouchDB vs. MongoDB, Gabriele Lanahttp://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288 Hbase, Ryan Rawsonhttp://www.slideshare.net/adorepump/hbase-nosql Introduction to Cassandra, Gary Dusbabekhttp://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010 Cassandra Explained, Eric Evanshttp://www.slideshare.net/jericevans/cassandra-explained Towards Robust Distributed Systems, Eric Brewerhttp://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf Cassandra - A Decentralized Structured Storage System, Lakshman, Ladishttp://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
References [cont’d] Bigtable: A Distributed Storage System for Structured Data, Google Inc.http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf Dynamo: Amazon’s Highly Available Key-value Store, Amazon Inc.http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf HBase Architecture 101 – Storage, Lars Georgehttp://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html BASE: An ACID Alternative, Dan Pritchett

Contenu connexe

Tendances

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In DepthFabio Fumarola
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
MongoDB Fundamentals
MongoDB FundamentalsMongoDB Fundamentals
MongoDB FundamentalsMongoDB
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB Habilelabs
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performanceDaum DNA
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBLee Theobald
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql DatabasePrashant Gupta
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMongoDB
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 

Tendances (20)

NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
NOSQL vs SQL
NOSQL vs SQLNOSQL vs SQL
NOSQL vs SQL
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
MongoDB Fundamentals
MongoDB FundamentalsMongoDB Fundamentals
MongoDB Fundamentals
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB
 
Document Database
Document DatabaseDocument Database
Document Database
 
Mongodb - Scaling write performance
Mongodb - Scaling write performanceMongodb - Scaling write performance
Mongodb - Scaling write performance
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
 
NoSQL
NoSQLNoSQL
NoSQL
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql Database
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
RDBMS vs NoSQL
RDBMS vs NoSQLRDBMS vs NoSQL
RDBMS vs NoSQL
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 

En vedette

Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQLRTigger
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL DatabasesBADR
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBWilliam LaForest
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and consFabio Fumarola
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQLMike Crabb
 
Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Susanna-Assunta Sansone
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
 
mini MAXI art exhibition
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibitionAnna Casey
 
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetupiwrigley
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Boris Otto
 

En vedette (15)

Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQL
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDB
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and cons
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQL
 
Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
mini MAXI art exhibition
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibition
 
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
 

Similaire à Introduction to NoSQL Databases

NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and HowBigBlueHat
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionKelum Senanayake
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataRoger Xia
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"Jihyun Ahn
 
DynamoDB Gluecon 2012
DynamoDB Gluecon 2012DynamoDB Gluecon 2012
DynamoDB Gluecon 2012Appirio
 
Gluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDBGluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDBJeff Douglas
 
JS App Architecture
JS App ArchitectureJS App Architecture
JS App ArchitectureCorey Butler
 
Microsoft Azure e Open Source
Microsoft Azure e Open SourceMicrosoft Azure e Open Source
Microsoft Azure e Open SourceDanilo Bordini
 
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBBenchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBAthiq Ahamed
 
Software development - the java perspective
Software development - the java perspectiveSoftware development - the java perspective
Software development - the java perspectiveAlin Pandichi
 
MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...Ram Murat Sharma
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 Andrey Vykhodtsev
 
What you need to know about ceph
What you need to know about cephWhat you need to know about ceph
What you need to know about cephEmma Haruka Iwao
 
03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...DneprCiklumEvents
 
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015Andrey Vykhodtsev
 

Similaire à Introduction to NoSQL Databases (20)

NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
Intro to RavenDB
Intro to RavenDBIntro to RavenDB
Intro to RavenDB
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another Introduction
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
 
DynamoDB Gluecon 2012
DynamoDB Gluecon 2012DynamoDB Gluecon 2012
DynamoDB Gluecon 2012
 
Gluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDBGluecon 2012 - DynamoDB
Gluecon 2012 - DynamoDB
 
JS App Architecture
JS App ArchitectureJS App Architecture
JS App Architecture
 
MongoDB is the MashupDB
MongoDB is the MashupDBMongoDB is the MashupDB
MongoDB is the MashupDB
 
Microsoft Azure e Open Source
Microsoft Azure e Open SourceMicrosoft Azure e Open Source
Microsoft Azure e Open Source
 
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBBenchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
 
No sq lv2
No sq lv2No sq lv2
No sq lv2
 
Software development - the java perspective
Software development - the java perspectiveSoftware development - the java perspective
Software development - the java perspective
 
MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...MongoDB - A next-generation database that lets you create applications never ...
MongoDB - A next-generation database that lets you create applications never ...
 
20150716 introduction to apache spark v3
20150716 introduction to apache spark v3 20150716 introduction to apache spark v3
20150716 introduction to apache spark v3
 
What you need to know about ceph
What you need to know about cephWhat you need to know about ceph
What you need to know about ceph
 
03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...03 net saturday anton samarskyy ''document oriented databases for the .net pl...
03 net saturday anton samarskyy ''document oriented databases for the .net pl...
 
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
 

Dernier

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Dernier (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

Introduction to NoSQL Databases

  • 1. Introduction to NoSQL Databases San Diego NoSQL Meetup – Aug 2010 By Derek Stainer http://nosqldatabases.com
  • 2. Agenda Introduction Objective Explore NoSQL Databases Conclusion
  • 3. Introduction UCSD Graduate in Computer Science Java Developer for 10 years Creator of http://nosqldatabases.com Curator of NoSQL information
  • 4. Objective Deeper dive into each type of NoSQL database Discuss 1-2 NoSQL databases in each family of databases
  • 5. NoSQL Taxonomy Key/Value Document Column Graph Others Geospatial File System Object
  • 6. Key/Value Databases Global collection of Key/Value pairs Inspired by Amazon’s Dynamo and Distributed Hashtables Designed to handle massive load Multiple Types In memory i.e. Memcache On Disk i.e. Redis, SimpleDB Eventually Consistent i.e. Dynamo, Voldemort
  • 7. Key/Value: Voldemort Created by LinkedIn, now open source Inspired by Amazon’s Dynamo Written in Java Pluggable Storage BerkeleyDB, In Memory, MySQL Pluggable Serialization JSON, Thrift, Protocol Buffers, etc. Cluster Rebalancing
  • 8. Key/Value: Voldemort Versioning, based on Vector Clocks Reconciliation occurs on reads. Partitioning and Replication based on Dynamo Consistent Hashing Virtual Nodes Gossip
  • 9. Other Key/Value Stores Other Key/Value Stores Amazon’s Dynamo Riak Redis Memcache SimpleDB
  • 10. Document Databases Similar to a Key/Value database but with a major difference, value is a document Inspired by Lotus Notes Flexible Schema Any number of fields can be added Documents stored in JSON or BSON formats Examples: CouchDB, MongoDB
  • 11. Sample Document { "day": [ 2010, 01, 23 ], "products": { "apple": { "price": 10 "quantity": 6 }, "kiwi": { "price": 20 "quantity": 2 } }, "checkout": 100 }
  • 12. Document: CouchDB Development began ~ 2005 by Damien Katz former Lotus Notes Developer Couch – Cluster Of Unreliable Commodity Hardware Top level Apache Project Commercially supported by CouchIO Licensed under Apache License Written in Erlang Documents are stored in JSON
  • 13. Document: CouchDB [cont’d] B-Tree Storage Engine MVCC model, no locking No joins, primary key or foreign key (UUIDs are auto assigned) Built bi-directional replication Can even run offline, come back and sync back changes Custom persistent views using MapReduce REST API
  • 14. Document: MongoDB Development started in 2007 Commercially supported and developed by 10Gen Stores documents using BSON Supports AdHoc queries Can query against embedded objects and arrays Support multiples types of indexing
  • 15. Document: MongoDB [cont’d] Officially supported drivers available for multiple languages C, C++, Java, Javascript, Perl, PHP, Python and Ruby Community supported drivers include: Scala, Node.js, Haskell, Erlang, Smalltalk Replication uses a master/slave model Scales horizontally via sharding Written C++
  • 16. Column Family Databases Each key is associated with multiple attributes (i.e. Columns) Hybrid row/column stores Inspired by Google BigTable Examples: HBase, Cassandra
  • 17. Column: HBase Based on Google’s BigTable Apache Project TLP Cloudera (certifications, EC2 AMI’s, etc.) Layered over HDFS (Hadoop Distributed File System) Input/Output for MapReduce Jobs APIs Thrift, REST
  • 18. Column: Hbase [cont’d] Automatic partitioning Automatic re-balancing/re-partitioning Fault tolerant HDFS Multiple Replicas Highly distributed
  • 20. Column: Cassandra Created at Facebook for Inbox search Facebook -> Google Code -> ASF Commercial Support available from Riptano Features taken from both Dynamo and BigTable Dynamo – Consistent hashing, Partitioning, Replication Big Table – Column Familes, MemTables, SSTables
  • 21. Column: Cassandra [cont’d] Symmetric nodes No single point of failure Linearly scalable Ease of administration Flexible/Automated Provisioning Flexible Replica Replacement High Availability Eventual Consistency However, consistency is tuneable
  • 22. Column: Cassandra [cont’d] Partitioning Random Good distribution of data between nodes Range scans not possible Order Preserving Can lead to unbalanced nodes Range scans, Natural Order Custom Extremely fast reads/writes (low latency) Thrift API
  • 23. Column: Cassandra [cont’d] Column Basic unit of storage Column Family Collection of like records Record level atomicity Indexed Keyspace Top level namespace Usually one per application
  • 25. Column: Cassandra [cont’d] Column Details Name byte[] Queried against Determines sort order Value byte[] Opaque to Cassandra Timestamp long Conflict resolution (last write wins)
  • 26. Graph Databases Inspired by Euler Graph Theory, G=(E,V) Focused on modeling the structure of the data Property Graph Data Model Examples: Neo4j, InfiniteGraph
  • 28. Graph: Neo4j Data Model: Property Graph Nodes – Person, Place, Thing, etc. Relationships – Lives, Likes, Owns, etc. Properties on Both Primary operation is graph traversal between nodes Written in Java Embedded database
  • 29. Graph: Neo4j [cont’d] Disk-based Graph stored in custom binary format Transactional JTA/JTS, XA, 2PC, MVCC Scales Billions of nodes/relationships/properties per JVM Robust 6+ years in 24/7 production
  • 30. Graph: Neo4j [cont’d] Multiple language binds Jython, Cpython Jruby (including RESTful API) Clojure Scala (including RESTful API) Uses Social Graph i.e. Facebook Recommendation Engines Financial Audit
  • 31. Graph: Neo4j [cont’d] Licensed under AGPLv3 Dual Commercial License Available First server is free Second server Inexpensive Commercial support provided by Neo Technologies
  • 32. Other Graph Databases Other graph databases InfiniteGraph HyperGraphDB sones
  • 35. References NoSQL Databases - Part 1 – Landscape, Vineet Guptahttp://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html NoSQL for Dummies, Tobias Ivarssonhttp://www.slideshare.net/thobe/nosql-for-dummies NoSQL Databases, Marin Dimitrovhttp://www.slideshare.net/marin_dimitrov/nosql-databases-3584443 CouchDB vs. MongoDB, Gabriele Lanahttp://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288 Hbase, Ryan Rawsonhttp://www.slideshare.net/adorepump/hbase-nosql Introduction to Cassandra, Gary Dusbabekhttp://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010 Cassandra Explained, Eric Evanshttp://www.slideshare.net/jericevans/cassandra-explained Towards Robust Distributed Systems, Eric Brewerhttp://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf Cassandra - A Decentralized Structured Storage System, Lakshman, Ladishttp://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
  • 36. References [cont’d] Bigtable: A Distributed Storage System for Structured Data, Google Inc.http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf Dynamo: Amazon’s Highly Available Key-value Store, Amazon Inc.http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf HBase Architecture 101 – Storage, Lars Georgehttp://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html BASE: An ACID Alternative, Dan Pritchett

Notes de l'éditeur

  1. Surveying the NoSQL Landscape, By Derek Stainer
  2. Indexing types include, single-key, compound, unique, non-unique, and geospatial
  3. Surveying the NoSQL Landscape, By Derek Stainer
  4. Surveying the NoSQL Landscape, By Derek Stainer