Why NOSQL? Ok. But, Why So many?
December 23rd, 2013
Why NOSQL? Ok. But, Why So many?
www.aditi.com
TABLE OF CONTENTS
1. INTRODUCTION
2. WHY NOSQL?
3. WHY SO MANY?
4. TYPES OF NOSQL DATABASES
5. KEY ASPECTS
6. NEWSQL
7. CONCLUSION
1. INTRODUCTION
NOSQL ("Not Only SQL") is a term generally used for data stores that are not centered on SQL or the relational model.
2. WHY NOSQL?
Necessity is the mother of invention. Here is a look at what prompted the creation of NOSQL databases.
1. Exorbitant growth of data:
a. Large datasets become onerous to manage when stored in relational databases
b. Query execution time increases, creating performance bottlenecks
2. Data model/structure mismatch: storing hierarchical, graph or relationship data as rows and columns is
highly inefficient, and so is storing serialized objects
3. Distributed caching infrastructure introduced on top of relational data storage for performance, along with
its related consistency problems
4. Heavy usage of blob storage, which defeats the purpose of a relational database
5. Massive scale-out
6. High availability: the ability to always write, with massive write performance and small, continuous, volatile
reads and writes
7. Need for faster key-value access
8. Difficulty in handling volatility in schema and data types, some relating to changes in the business and some
due to data acquisition
9. Complexity in partitioning/sharding, which is done mostly for manageability, performance or availability
10. Performance in large databases
11. Relational databases are too generic; there is a need for specialist databases
12. Cost-based optimization, though it simplifies things for naïve developers, is unpredictable, more so when
high-resource queries are being executed concurrently
13. Resource contention, resource concurrency, blocking queries, index updates, and concurrent disk activity such
as log backups and checkpointing
Is NOSQL the answer to everything stated above? No, but it certainly helps in resolving a few of these issues.
In short, what NOSQL promises is high performance and flexibility with high availability and scalability.
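Item 9 above mentions partitioning/sharding. The following is a minimal sketch of hash-based shard routing, assuming a toy in-memory store; the names `shard_for` and `ShardedStore` are hypothetical and do not come from any particular database.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store whose keys are spread across several dictionaries."""

    def __init__(self, num_shards: int = 4) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
for user_id in ("user:1", "user:2", "user:3", "user:4", "user:5"):
    store.put(user_id, {"id": user_id})

# Show how many keys landed on each shard.
print([len(shard) for shard in store.shards])
```

Plain modulo routing moves most keys when the shard count changes, which is part of the complexity the list refers to; consistent hashing is a common way to reduce that movement.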
3. WHY SO MANY?
What NOSQL databases do not promise is ACID. NOSQL database implementations vary in conforming to various
consistency semantics; most tend to conform to BASE. Let’s look at what these are.
ACID
“Atomic: All operations in a transaction succeed or every operation is rolled back.
Consistent: On transaction completion, the database is structurally sound.
Isolated: Transactions do not contend with one another. Contentious access to state is moderated by the database
so that transactions appear to run sequentially.
Durable: The results of applying a transaction are permanent, even in the presence of failures - Wikipedia”
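As an illustration of the "Atomic" property, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the sample balances are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)   # would overdraw alice
after_failure = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
print(failed, after_failure, succeeded, after_success)
```

The credit is applied before the debit on purpose, so that the failed transfer shows the earlier update being undone rather than simply never happening.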
BASE
“Basic availability: The store appears to work most of the time.
Soft-state: Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the
time.
Eventual consistency: Stores exhibit consistency at some later point (e.g., lazily at read time) – O’Reilly”
It is important to note that not all NOSQL databases conform to eventual consistency.
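To make "soft state" and "eventual consistency" concrete, the sketch below simulates replicas that may disagree until a background synchronisation runs. It assumes caller-supplied version numbers and a last-write-wins rule; the `Replica` class and `anti_entropy` function are hypothetical, and real stores typically rely on timestamps or vector clocks instead.

```python
class Replica:
    """One copy of the data; keeps the highest version seen for each key."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        # Last-write-wins: only accept the write if it is newer than what we hold.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def anti_entropy(replicas):
    """Push every replica's entries to every other replica (one full pass)."""
    for source in replicas:
        for target in replicas:
            if source is not target:
                for key, (version, value) in list(source.data.items()):
                    target.write(key, value, version)


a, b, c = Replica("a"), Replica("b"), Replica("c")

a.write("cart", ["book"], version=1)          # only replica a has seen this write
stale_read = b.read("cart")                    # b has not caught up yet

b.write("cart", ["book", "pen"], version=2)   # a newer write lands on b
anti_entropy([a, b, c])                        # replicas converge

converged = [r.read("cart") for r in (a, b, c)]
print(stale_read, converged)
```

Between the write and the synchronisation pass, a reader can observe stale or missing data; after it, all replicas return the same value.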
Apart from the need for specialist databases supporting specialised data structures, the CAP theorem is another reason for the variety. Let’s look at it.
“The CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system
to simultaneously provide all three of the following guarantees
Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it was successful or failed)
Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
According to the theorem, a distributed system cannot satisfy all three of these guarantees at the same time”
- Wikipedia
With drastically different business dynamics and priorities among enterprises, NOSQL databases tend to pick
two of the three characteristics above.
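The trade-off can be sketched with a toy two-node register. This is a simplified illustration, not a model of any real product: the `TwoNodeRegister` class, its `mode` flag and the `Unavailable` exception are all assumptions. During a partition, the "CP" variant refuses writes to stay consistent, while the "AP" variant keeps accepting writes and lets the nodes diverge.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to preserve consistency."""


class TwoNodeRegister:
    """A single value replicated on two nodes, n1 and n2."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = {"n1": 0, "n2": 0}
        self.partitioned = False

    def write(self, node, value):
        if not self.partitioned:
            # Healthy network: the write reaches both nodes.
            for name in self.values:
                self.values[name] = value
            return
        if self.mode == "CP":
            # Cannot reach the other node, so refuse rather than diverge.
            raise Unavailable(f"{node} cannot replicate during a partition")
        # AP: accept the write locally; the other node keeps its old value.
        self.values[node] = value

    def read(self, node):
        return self.values[node]


cp = TwoNodeRegister("CP")
cp.write("n1", 1)
cp.partitioned = True
try:
    cp.write("n1", 2)
    cp_write_accepted = True
except Unavailable:
    cp_write_accepted = False

ap = TwoNodeRegister("AP")
ap.write("n1", 1)
ap.partitioned = True
ap.write("n1", 2)

print(cp_write_accepted, cp.values, ap.values)
```

In this sketch the CP register stays consistent but rejects the write, and the AP register stays available but its two nodes return different values until the partition heals and they are reconciled.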
Given the need for flexibility in data structure, there are a multitude of NOSQL databases being introduced, see
figure below
Data Reference: http://nosql-database.org/
4. TYPES OF NOSQL DATABASES
1. Wide Column Store (Column Families): The data model stores columns of data together, instead of rows, and is
optimized for queries over large datasets
2. Document Store: Pairs each key with a complex data structure known as a document. Documents can
contain many different key-value pairs, key-array pairs, or even nested documents
3. Key Value/Tuple Store: Every single item in the database is stored as an attribute name (or "key"),
together with its value
4. Graph Databases: A graph is a set of nodes and the relationships that connect them. Some graph databases
use native graph storage, while others serialize the graph data and store it in a relational, object or other data store
5. Multi Model Databases: Serve multiple data models
6. Object Databases: Data is persisted in the form of objects
7. Grid and Cloud Database Solutions: Data is persisted across multiple servers that work together to manage
information and related operations
8. XML Databases: Data is persisted in XML format
9. Multidimensional Databases: Optimized for data warehouse and online analytical
processing (OLAP) applications
10. Multi Value Databases: Data is persisted as keys and multiple values. They have features that support and
encourage the use of attributes which can take a list of values, rather than all attributes being single-valued
11. Event Sourcing: Persists an application's state by storing the history that determines the current state of the
application
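To compare a few of the data models listed above, the sketch below expresses them as plain Python structures. It is only an illustration of the shapes involved; the sample user, keys and helper names (`neighbours`, `replay`) are invented for this example and do not reflect any specific product's storage format.

```python
import json

# Key-value store: the value is opaque to the store and looked up by key only.
kv_store = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Document store: the value is a structured, possibly nested document.
doc_store = {
    "42": {
        "name": "Asha",
        "city": "Pune",
        "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    }
}

# Wide column store: row key -> column family -> column name -> value.
wide_column = {
    "42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "orders": {"2013-12-01": "A1", "2013-12-15": "B7"},
    }
}

# Graph: nodes plus typed relationships between them.
nodes = {"42": {"name": "Asha"}, "7": {"name": "Ravi"}}
edges = [("42", "FOLLOWS", "7")]


def neighbours(node_id, relation):
    """Return the ids reachable from node_id over the given relationship type."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


# Event sourcing: the current state is derived by replaying the stored history.
events = [("deposited", 100), ("withdrew", 30), ("deposited", 5)]


def replay(history):
    balance = 0
    for kind, amount in history:
        balance += amount if kind == "deposited" else -amount
    return balance


print(json.loads(kv_store["user:42"])["name"])
print(len(doc_store["42"]["orders"]))
print(sorted(wide_column["42"]["orders"]))
print(neighbours("42", "FOLLOWS"))
print(replay(events))
```

The same user appears in each shape, which shows why one model can suit a workload that another handles awkwardly: the graph form answers relationship queries directly, while the document form returns a user and their orders in one lookup.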
5. KEY ASPECTS
1. NOSQL is not an all-in-one solution; certain scenarios mentioned above naturally fit NOSQL semantics.
NOSQL is certainly not a replacement for relational stores
2. Consider NOSQL for real-time analytics on operational data
3. Consider NOSQL when data arrives from many systems, including streaming sources
4. NOSQL databases provide a linear approach to database scaling, making scaling easier and more intuitive
5. Most NOSQL databases are designed to be distributed, scalable databases
6. Data duplication and denormalization are the norm
7. Consider NOSQL for hierarchical data, content caching, distributed file systems, social networking,
recommendation engines and graph-like data
8. NOSQL databases can support unstructured and unpredictable data
9. NOSQL databases use a cluster of servers to store data. Data and operations are usually spread across
the cluster
10. Consider NOSQL databases which provide integrated caching
11. Many NOSQL databases are designed for continuous availability
12. Certain NOSQL implementations provide configurable consistency models (strong vs eventual), but this
will have performance implications
13. Only a few NOSQL databases support ACID
14. Only a few NOSQL databases support transactions
15. Consider NOSQL databases when you have large amounts of data, too large to fit on one physical
server
16. Consider NOSQL databases when you have an object-relational impedance mismatch
17. Many NOSQL databases trade off consistency for efficiency
18. Consider NOSQL databases when you need schema flexibility
19. Consider NOSQL databases if you are looking for massive write performance
20. Consider NOSQL databases if you are looking for fast key-value access
21. NOSQL provides horizontal scaling
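Item 12 above mentions configurable consistency. One common formulation in quorum-replicated stores is that, with N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to overlap when R + W > N. The sketch below assumes a toy `QuorumStore` (a hypothetical class) that deliberately writes to the first W replicas and reads from the last R, which is the worst case for overlap.

```python
def overlaps(n: int, w: int, r: int) -> bool:
    """True when every read quorum must share a replica with every write quorum."""
    return r + w > n


class QuorumStore:
    """Toy replicated store with tunable write (w) and read (r) quorum sizes."""

    def __init__(self, n: int, w: int, r: int) -> None:
        if not (1 <= w <= n and 1 <= r <= n):
            raise ValueError("w and r must be between 1 and n")
        self.n, self.w, self.r = n, w, r
        self.replicas = [{} for _ in range(n)]
        self.version = 0

    def write(self, key, value) -> None:
        self.version += 1
        # Worst case: only the first w replicas receive the write.
        for replica in self.replicas[: self.w]:
            replica[key] = (self.version, value)

    def read(self, key):
        # Worst case: only the last r replicas are consulted.
        answers = [rep[key] for rep in self.replicas[self.n - self.r:] if key in rep]
        if not answers:
            return None
        # Return the value carrying the highest version among the answers.
        return max(answers, key=lambda entry: entry[0])[1]


strong = QuorumStore(n=3, w=2, r=2)      # 2 + 2 > 3: quorums overlap
strong.write("k", "v1")
strong.write("k", "v2")

eventual = QuorumStore(n=3, w=1, r=1)    # 1 + 1 <= 3: a read may miss the write
eventual.write("k", "v1")

print(overlaps(3, 2, 2), strong.read("k"))
print(overlaps(3, 1, 1), eventual.read("k"))
```

Larger quorums give stronger read guarantees but require more replicas to respond to each request, which is the performance implication the list refers to.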
6. NEWSQL
“NewSQL is a class of modern relational database management systems that seek to provide the same scalable
performance of NoSQL systems for online transaction processing (read-write) workloads while still maintaining the
ACID guarantees of a traditional database system – Wikipedia”
As we have seen above, NOSQL databases have been developed to serve different purposes, with one of the main
advantages being scale-out. NewSQL is an attempt to provide all the benefits of NOSQL while continuing to support
ACID.
Google Spanner is one of the main contenders, with a semi-relational data model, while NuoDB achieves this by splitting
the transactional (in-memory) tier and the storage tier, accompanied by peer-to-peer coordination.
http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf
http://www.nuodb.com/explore/newsql-cloud-database-how-it-works/
7. CONCLUSION
Be it mergers and acquisitions, changes in business dynamics, or the need for agility in development, large enterprises are
bound to have hybrid solutions. Having multiple RDBMSs, data warehouses and data marts in one environment is not
unseen or unheard of. It is more than likely that enterprises will add NOSQL/NewSQL databases to the mix. Be on
the lookout for true shared-nothing distributed architectures!
Prashanth B Panduranga (Shan)
Director-Technology | 725-976-7006 | pandurangap@aditi.com
Connect with us: Blog | Twitter | LinkedIn | Facebook
