SlideShare une entreprise Scribd logo
1  sur  53
An introduction to
NoSQL databases
POOYAN MEHRPARVAR
DEC 2014To get more references visit:
http://bit.ly/nosql_srbiau
1
What is covered in this
presentation:
 A brief history of data bases
 NoSQL why, what and when?
 Aggregate Data Models
 BASE vs ACID
 CAP theorem
 Polyglot persistence : the future of database systems
2
Why did we choose this topic?
 Is NoSQL replacing traditional databases?
 Where should we use NoSQL databases?
 Should we use NoSQL in any kind of projects?
3
A brief history of databases 4
Relational databases
Benefits of Relational databases:
Designed for all purposes
ACID
Strong consistancy, concurrency,
recovery
Mathematical background
Standard Query language (SQL)
Lots of tools to use with i.e: Reporting
services, entity frameworks, ...
Vertical scaling (up scaling)
Object / Object-relational databases
were not practical. Mainly because of
Impedance mismatch
5
Era of Distributed Computing
But...
 Relational databases were not built for
distributed applications.
Because...
 Joins are expensive
 Hard to scale horizontally
 Impedance mismatch occurs
 Expensive (product cost, hardware,
Maintenance)
6
Era of Distributed Computing
But...
 Relational databases were not built for
distributed applications.
Because...
 Joins are expensive
 Hard to scale horizontally
 Impedance mismatch occurs
 Expensive (product cost, hardware,
Maintenance)
And....
It’s weak in:
 Speed (performance)
 High availability
 Partition tolerance
7
Rise of Big data
Three V(s) of Bigdata:
 Volume
 Velocity
 Variety
8
Rise of Big data 9
Rise of Big data
 Wallmart: 1 million transactions per
hour
 Facebook: 40 billion photos
 People are talking about petabytes
today
10
NoSQL why, what and when?
 Google & Amazon bulit their own databases (Big table & Dynamo)
 Facebook invented Cassandra and is using thousands of them
 #NoSQL was a twitter hashtag for a conference in 2009
 The name doesn’t indicate its characteristics
 There is no strict defenition for NoSQL databases
 There are more than 150 NoSQL databases (nosql-database.org)
11
Characteristics of NoSQL databases
 Non relational
 Cluster friendly
 Schema-less
 21 century web
 Open-source
12
Characteristics of NoSQL databases
NoSQL avoids:
 Overhead of ACID transactions
 Complexity of SQL query
 Burden of up-front schema design
 DBA presence
 Transactions (It should be handled at
application layer)
Provides:
 Easy and frequent changes to DB
 Horizontal scaling (scaling out)
 Solution to Impedance mismatch
 Fast development
13
NoSQL is getting more & more popular 14
What is a schema-less datamodel?
In relational Databases:
 You can’t add a record which does not fit
the schema
 You need to add NULLs to unused items in
a row
 We should consider the datatypes. i.e :
you can’t add a stirng to an interger field
 You can’t add multiple items in a field
(You should create another table:
primary-key, foreign key, joins,
normalization, ... !!!)
15
What is a schema-less datamodel?
In NoSQL Databases:
 There is no schema to consider
 There is no unused cell
 There is no datatype (implicit)
 Most of considerations are done in
application layer
 We gather all items in an aggregate
(document)
16
What is Aggregation?
 The term comes from Domain Driven Design
 Shared nothing architecture
 An aggregate is a cluster of domain objects that can be treated as
a single unit
 Aggregates are the basic element of transfer of data storage - you
request to load or save whole aggregates
 Transactions should not cross aggregate boundaries
 This mechanism reduces the join operations to a minimal level
17
What is Aggregation? 18
What is Aggregation? 19
What is Aggregation? 20
Aggregate Data Models
NoSQL databases are classified in four major datamodels:
 Key-value
 Document
 Column family
 Graph
Each DB has its own query language
21
Key-value data model
 The main idea is the use of a hash table
 Access data (values) by strings called keys
 Data has no required format – data may have any format
 Data model: (key, value) pairs
 Basic Operations:
Insert(key,value), Fetch(key),Update(key), Delete(key)
22
Key-value data model
 “Value” is stored as a “blob”
- Without caring or knowing what is inside
- Application is responsible for understanding the
data
 Main observation from Amazon (using Dynamo)
– “There are many services on Amazon’s platform
that only need primary-key access to a data
store.”
E.g. Best seller lists, shopping carts, customer
preferences, session management, sales rank,
product catalog
23
Column family data model
 The column is lowest/smallest instance of
data.
 It is a tuple that contains a name, a value
and a timestamp
24
Column family data model
Some statistics about Facebook Search (using Cassandra)
 MySQL > 50 GB Data
 Writes Average : ~300 ms
 Reads Average : ~350 ms
 Rewritten with Cassandra > 50 GB Data
 Writes Average : 0.12 ms
 Reads Average : 15 ms
25
Graph data model
 Based on Graph Theory.
 Scale vertically, no clustering.
 You can use graph algorithms easily
 Transactions
 ACID
26
Document-based datamodel
 Usually JSON like interchange model.
 Query Model: JavaScript-like or custom.
 Aggregations: Map/Reduce
 Indexes are done via B-Trees.
 unlike simple key-value stores, both keys
and values are fully searchable in
document databases.
27
Document-based datamodel 28
Overview of a Document-based datamodel 29
Overview of a Document-based datamodel 30
Overview of a Document-based datamodel 31
Overview of a Document-based datamodel 32
A sample MongoDB query 33
MySQL:
MongoDB:
There is no join in MongoDB query
Because we are using an aggregate data model
What we need?
 We need a distributed database system having such
features:
 – Fault tolerance
 – High availability
 – Consistency
 – Scalability
34
What we need?
 We need a distributed database system having such
features:
 – Fault tolerance
 – High availability
 – Consistency
 – Scalability
Which is impossible!!!
According to CAP theorem
35
Should we...?
 In some cases getting an answer quickly is
more important than getting a correct
answer
 By giving up ACID properties, one can
achieve higher performance and scalability.
 Any data store can achieve Atomicity,
Isolation and Durability but do you always
need consistency?
 Maybe we should implement Asynchronous
Inserts and updates and should not wait for
confirmation?
36
BASE
Almost the opposite of ACID.
 Basically available: Nodes in the a distributed
environment can go down, but the whole
system shouldn’t be affected.
 Soft State (scalable): The state of the system and
data changes over time.
 Eventual Consistency: Given enough time, data
will be consistent across the distributed system.
37
BASE vs ACID 38
CAP theorem
Consistency: Clients should
read the same data. There
are many levels of
consistency.
o Strict Consistency – RDBMS.
o Tunable Consistency –
Cassandra.
o Eventual Consistency –
Mongodb.
Availability: Data to be
available.
Partial Tolerance: Data to
be partitioned across
network segments due to
network failures.
39
CAP theorem in different SQL/NoSQL
databases
We can not achieve all the three items
In distributed database systems (center) Proven by Nancy Lynch et al. MIT labs.
40
CAP theorem : A simple proof 41
CAP theorem : A simple proof 42
CAP theorem : A simple proof 43
Which data model to choose 44
Polyglot persistence : the future of database
systems
 Future databases are the combination of SQL & NoSQL
 We still need relational databases
45
Overview of a polygot db 46
New approach to database systems:
 Integrated databases has its own
advantages and disadvantages
 With the advent of webservices it
seems now it’s the time to switch
to decentralized data bases
 Single point of failure, Bottlenecks
would be avoided
 Clustering & replication would be
much easier
47
Conclusion:
Before you choose NoSQL as a solution:
Consider these items, ...
 Needs a precise evaluation, Maybe NoSQL is not the right thing
 Needs to read lots of case study papers
 Aggregation is totally a different approach
 NoSQL is still immature
 Needs lots of hours of studing and working to expert in a particular
NoSQL db
 There is no standard query language
 Most of controls have to be implemented at the application layer
 Relational databases are still the strongest in transactional environments
and provide the best solutions in consistancy and concurrency control
48
Conclusion:
Before you choose NoSQL as a solution:
49
Say hello to... 50
NewSQL a brief defenition
 NewSQL group was founded in 2011
Michael Stonebraker’s Definition …
 SQL as the primary interface.
 ACID support for transactions
 Non-locking concurrency control.
 High per-node performance.
 Parallel, shared-nothing architecture – each node is
independent and self-sufficient – do not share memory or storage
51
Technology is still in its infancy...
In 2000 no one even thought database
systems could be a hot topic again!
To get more references visit:
http://bit.ly/nosql_srbiau
52
References:
 NoSQL distilled, Martin Fowler
 Martin Fowler’s presentation at Goto conference
 www.mongodb.org
53

Contenu connexe

Tendances

Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQLRTigger
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.Navdeep Charan
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 
Sql vs NoSQL-Presentation
 Sql vs NoSQL-Presentation Sql vs NoSQL-Presentation
Sql vs NoSQL-PresentationShubham Tomar
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB Habilelabs
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databasesAshwani Kumar
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)Prashant Gupta
 

Tendances (20)

Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
NOSQL vs SQL
NOSQL vs SQLNOSQL vs SQL
NOSQL vs SQL
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQL
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.
 
RDBMS vs NoSQL
RDBMS vs NoSQLRDBMS vs NoSQL
RDBMS vs NoSQL
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
Sql vs NoSQL-Presentation
 Sql vs NoSQL-Presentation Sql vs NoSQL-Presentation
Sql vs NoSQL-Presentation
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB
 
Introduction to mongodb
Introduction to mongodbIntroduction to mongodb
Introduction to mongodb
 
NoSql
NoSqlNoSql
NoSql
 
NoSql
NoSqlNoSql
NoSql
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 

Similaire à NoSQL databases - An introduction

NoSQL - 05March2014 Seminar
NoSQL - 05March2014 SeminarNoSQL - 05March2014 Seminar
NoSQL - 05March2014 SeminarJainul Musani
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQLbalwinders
 
Presentation on NoSQL Database related RDBMS
Presentation on NoSQL Database related RDBMSPresentation on NoSQL Database related RDBMS
Presentation on NoSQL Database related RDBMSabdurrobsoyon
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
Modern databases and its challenges (SQL ,NoSQL, NewSQL)
Modern databases and its challenges (SQL ,NoSQL, NewSQL)Modern databases and its challenges (SQL ,NoSQL, NewSQL)
Modern databases and its challenges (SQL ,NoSQL, NewSQL)Mohamed Galal
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?Venu Anuganti
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
مقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربيمقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربيMohamed Galal
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesMaynooth University
 
Enterprise NoSQL: Silver Bullet or Poison Pill
Enterprise NoSQL: Silver Bullet or Poison PillEnterprise NoSQL: Silver Bullet or Poison Pill
Enterprise NoSQL: Silver Bullet or Poison PillBilly Newport
 
No sqlpresentation
No sqlpresentationNo sqlpresentation
No sqlpresentationSalma Gouia
 
Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Ahmed Rashwan
 

Similaire à NoSQL databases - An introduction (20)

NoSQL - 05March2014 Seminar
NoSQL - 05March2014 SeminarNoSQL - 05March2014 Seminar
NoSQL - 05March2014 Seminar
 
NOSQL
NOSQLNOSQL
NOSQL
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
NoSQL Basics and MongDB
NoSQL Basics and  MongDBNoSQL Basics and  MongDB
NoSQL Basics and MongDB
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Presentation on NoSQL Database related RDBMS
Presentation on NoSQL Database related RDBMSPresentation on NoSQL Database related RDBMS
Presentation on NoSQL Database related RDBMS
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Modern databases and its challenges (SQL ,NoSQL, NewSQL)
Modern databases and its challenges (SQL ,NoSQL, NewSQL)Modern databases and its challenges (SQL ,NoSQL, NewSQL)
Modern databases and its challenges (SQL ,NoSQL, NewSQL)
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?
 
NoSql Brownbag
NoSql BrownbagNoSql Brownbag
NoSql Brownbag
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
مقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربيمقدمة عن NoSQL بالعربي
مقدمة عن NoSQL بالعربي
 
NoSql Databases
NoSql DatabasesNoSql Databases
NoSql Databases
 
Chapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choicesChapter1: NoSQL: It’s about making intelligent choices
Chapter1: NoSQL: It’s about making intelligent choices
 
nosql.pptx
nosql.pptxnosql.pptx
nosql.pptx
 
Enterprise NoSQL: Silver Bullet or Poison Pill
Enterprise NoSQL: Silver Bullet or Poison PillEnterprise NoSQL: Silver Bullet or Poison Pill
Enterprise NoSQL: Silver Bullet or Poison Pill
 
No sqlpresentation
No sqlpresentationNo sqlpresentation
No sqlpresentation
 
Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?Why no sql ? Why Couchbase ?
Why no sql ? Why Couchbase ?
 

Dernier

VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...software pro Development
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 

Dernier (20)

VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 

NoSQL databases - An introduction

  • 1. An introduction to NoSQL databases POOYAN MEHRPARVAR DEC 2014To get more references visit: http://bit.ly/nosql_srbiau 1
  • 2. What is covered in this presentation:  A brief history of data bases  NoSQL why, what and when?  Aggregate Data Models  BASE vs ACID  CAP theorem  Polyglot persistence : the future of database systems 2
  • 3. Why did we choose this topic?  Is NoSQL replacing traditional databases?  Where should we use NoSQL databases?  Should we use NoSQL in any kind of projects? 3
  • 4. A brief history of databases 4
  • 5. Relational databases Benefits of Relational databases: Designed for all purposes ACID Strong consistancy, concurrency, recovery Mathematical background Standard Query language (SQL) Lots of tools to use with i.e: Reporting services, entity frameworks, ... Vertical scaling (up scaling) Object / Object-relational databases were not practical. Mainly because of Impedance mismatch 5
  • 6. Era of Distributed Computing But...  Relational databases were not built for distributed applications. Because...  Joins are expensive  Hard to scale horizontally  Impedance mismatch occurs  Expensive (product cost, hardware, Maintenance) 6
  • 7. Era of Distributed Computing But...  Relational databases were not built for distributed applications. Because...  Joins are expensive  Hard to scale horizontally  Impedance mismatch occurs  Expensive (product cost, hardware, Maintenance) And.... It’s weak in:  Speed (performance)  High availability  Partition tolerance 7
  • 8. Rise of Big data Three V(s) of Bigdata:  Volume  Velocity  Variety 8
  • 9. Rise of Big data 9
  • 10. Rise of Big data  Wallmart: 1 million transactions per hour  Facebook: 40 billion photos  People are talking about petabytes today 10
  • 11. NoSQL why, what and when?  Google & Amazon bulit their own databases (Big table & Dynamo)  Facebook invented Cassandra and is using thousands of them  #NoSQL was a twitter hashtag for a conference in 2009  The name doesn’t indicate its characteristics  There is no strict defenition for NoSQL databases  There are more than 150 NoSQL databases (nosql-database.org) 11
  • 12. Characteristics of NoSQL databases  Non relational  Cluster friendly  Schema-less  21 century web  Open-source 12
  • 13. Characteristics of NoSQL databases NoSQL avoids:  Overhead of ACID transactions  Complexity of SQL query  Burden of up-front schema design  DBA presence  Transactions (It should be handled at application layer) Provides:  Easy and frequent changes to DB  Horizontal scaling (scaling out)  Solution to Impedance mismatch  Fast development 13
  • 14. NoSQL is getting more & more popular 14
  • 15. What is a schema-less datamodel? In relational Databases:  You can’t add a record which does not fit the schema  You need to add NULLs to unused items in a row  We should consider the datatypes. i.e : you can’t add a stirng to an interger field  You can’t add multiple items in a field (You should create another table: primary-key, foreign key, joins, normalization, ... !!!) 15
  • 16. What is a schema-less datamodel? In NoSQL Databases:  There is no schema to consider  There is no unused cell  There is no datatype (implicit)  Most of considerations are done in application layer  We gather all items in an aggregate (document) 16
  • 17. What is Aggregation?  The term comes from Domain Driven Design  Shared nothing architecture  An aggregate is a cluster of domain objects that can be treated as a single unit  Aggregates are the basic element of transfer of data storage - you request to load or save whole aggregates  Transactions should not cross aggregate boundaries  This mechanism reduces the join operations to a minimal level 17
  • 21. Aggregate Data Models NoSQL databases are classified in four major datamodels:  Key-value  Document  Column family  Graph Each DB has its own query language 21
  • 22. Key-value data model  The main idea is the use of a hash table  Access data (values) by strings called keys  Data has no required format – data may have any format  Data model: (key, value) pairs  Basic Operations: Insert(key,value), Fetch(key),Update(key), Delete(key) 22
  • 23. Key-value data model  “Value” is stored as a “blob” - Without caring or knowing what is inside - Application is responsible for understanding the data  Main observation from Amazon (using Dynamo) – “There are many services on Amazon’s platform that only need primary-key access to a data store.” E.g. Best seller lists, shopping carts, customer preferences, session management, sales rank, product catalog 23
  • 24. Column family data model  The column is lowest/smallest instance of data.  It is a tuple that contains a name, a value and a timestamp 24
  • 25. Column family data model Some statistics about Facebook Search (using Cassandra)  MySQL > 50 GB Data  Writes Average : ~300 ms  Reads Average : ~350 ms  Rewritten with Cassandra > 50 GB Data  Writes Average : 0.12 ms  Reads Average : 15 ms 25
  • 26. Graph data model  Based on Graph Theory.  Scale vertically, no clustering.  You can use graph algorithms easily  Transactions  ACID 26
  • 27. Document-based datamodel  Usually JSON like interchange model.  Query Model: JavaScript-like or custom.  Aggregations: Map/Reduce  Indexes are done via B-Trees.  unlike simple key-value stores, both keys and values are fully searchable in document databases. 27
  • 29. Overview of a Document-based datamodel 29
  • 30. Overview of a Document-based datamodel 30
  • 31. Overview of a Document-based datamodel 31
  • 32. Overview of a Document-based datamodel 32
  • 33. A sample MongoDB query 33 MySQL: MongoDB: There is no join in MongoDB query Because we are using an aggregate data model
  • 34. What we need?  We need a distributed database system having such features:  – Fault tolerance  – High availability  – Consistency  – Scalability 34
  • 35. What we need?  We need a distributed database system having such features:  – Fault tolerance  – High availability  – Consistency  – Scalability Which is impossible!!! According to CAP theorem 35
  • 36. Should we...?  In some cases getting an answer quickly is more important than getting a correct answer  By giving up ACID properties, one can achieve higher performance and scalability.  Any data store can achieve Atomicity, Isolation and Durability but do you always need consistency?  Maybe we should implement Asynchronous Inserts and updates and should not wait for confirmation? 36
  • 37. BASE Almost the opposite of ACID.  Basically available: Nodes in the a distributed environment can go down, but the whole system shouldn’t be affected.  Soft State (scalable): The state of the system and data changes over time.  Eventual Consistency: Given enough time, data will be consistent across the distributed system. 37
  • 39. CAP theorem Consistency: Clients should read the same data. There are many levels of consistency. o Strict Consistency – RDBMS. o Tunable Consistency – Cassandra. o Eventual Consistency – Mongodb. Availability: Data to be available. Partial Tolerance: Data to be partitioned across network segments due to network failures. 39
  • 40. CAP theorem in different SQL/NoSQL databases We can not achieve all the three items In distributed database systems (center) Proven by Nancy Lynch et al. MIT labs. 40
  • 41. CAP theorem : A simple proof 41
  • 42. CAP theorem : A simple proof 42
  • 43. CAP theorem : A simple proof 43
  • 44. Which data model to choose 44
  • 45. Polyglot persistence : the future of database systems  Future databases are the combination of SQL & NoSQL  We still need relational databases 45
  • 46. Overview of a polygot db 46
  • 47. New approach to database systems:  Integrated databases has its own advantages and disadvantages  With the advent of webservices it seems now it’s the time to switch to decentralized data bases  Single point of failure, Bottlenecks would be avoided  Clustering & replication would be much easier 47
  • 48. Conclusion: Before you choose NoSQL as a solution: Consider these items, ...  Needs a precise evaluation, Maybe NoSQL is not the right thing  Needs to read lots of case study papers  Aggregation is totally a different approach  NoSQL is still immature  Needs lots of hours of studing and working to expert in a particular NoSQL db  There is no standard query language  Most of controls have to be implemented at the application layer  Relational databases are still the strongest in transactional environments and provide the best solutions in consistancy and concurrency control 48
  • 49. Conclusion: Before you choose NoSQL as a solution: 49
  • 51. NewSQL a brief defenition  NewSQL group was founded in 2011 Michael Stonebraker’s Definition …  SQL as the primary interface.  ACID support for transactions  Non-locking concurrency control.  High per-node performance.  Parallel, shared-nothing architecture – each node is independent and self-sufficient – do not share memory or storage 51
  • 52. Technology is still in its infancy... In 2000 no one even thought database systems could be a hot topic again! To get more references visit: http://bit.ly/nosql_srbiau 52
  • 53. References:  NoSQL distilled, Martin Fowler  Martin Fowler’s presentation at Goto conference  www.mongodb.org 53

Notes de l'éditeur

  1. Who is familiar with NoSQL? Who has worked with a practical distributed database? First of all you have to forgot about the SQL view. NoSQL is a kind of new approach
  2. A friend chose Mongodb as a solution for their log db. But it faild because they have some difficulties about transactions. (there is no transaction) Maybe it’s due to their lack of knowlage about NoSQL dbs
  3. Partitioning and Memcache in RDMSs Scale up is expensive – we need to scale out
  4. We have acid transactions in graph databases We have atomicity in an aggregate (a document in MongoDB)
  5. Facebook is using +6000 cassandra dbs. Have you seen the same with oracle or db2 (RDMS are suitable to upscaling) That’s how google extend its clusters everyday.
  6. Good for social networks & CS projects
  7. Mongodb has auto sharding, map/reduce module
  8. -> As the data is written, the latest version is on at least one node. The data is then versioned/replicated to other nodes within the system. -> Eventually, the same version is on all nodes.
  9. For a given accepted update and a given node, eventually either the update reaches the node or the node is removed from service
  10. Consistency (all nodes see the same data at the same time) Availability (a guarantee that every request receives a response about whether it succeeded or failed) Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
  11. In nosql we prefer Avaibility. When there is a inconsistancy in shoping, the most important thing is to shop!! (Amazon’s example)
  12. A RDMS will do such a thing. The whole system will be down until it comes to a consistant level
  13. We choose between C & A (it’s not a binary decision) It depends on our domain to decide about the inconsistency window (we should talk to the domain experts)
  14. Distribution methods: replication (master-slave, peer to peer) and sharding. Cassandra uses sharding and peer to peer Master-slave (single point of failure – good in consistency) Peer to peer (consistency is expensive) Replica sets are used for data redundancy, automated failover, read scaling, server maintenance without downtime.
  15. An idea for the thesis: Test nosql (e.g. mongodb) ability to scale out with virtual machines (various Lubuntu machines)
  16. Dental clinic example – xml as a solution – nosql as solution Facebook query example : show me a female < 30 who is intested in y music living in z Most of projects have some custom tables