Prepared by Dr. Dipali Meher
Source: NoSQL Distilled
Dr. Dipali P. Meher
MCS, M.Phil, NET, Ph.D
Modern College of Arts, Science and Commerce,
Ganeshkhind, Pune 16
mailtomeher@gmail.com/dipalimeher@moderncollegegk.org
DATA MODELS
NoSQL databases are able to run on large clusters.
As the size of the data increases, scaling up becomes difficult:
we always need to buy a bigger server as the data grows.
One solution is to scale out and run the database on a cluster
of servers.
Running a database on a cluster increases the complexity of
the database.
2 paths for DATA DISTRIBUTION
Replication and Sharding
Replication
Replication takes the same data and copies it over multiple nodes
(servers), so each piece of data can be found in multiple places.
Replication: Two Forms
Master-slave and peer-to-peer
Replication: Master-Slave
• Data is replicated across multiple nodes.
• One node is designated as the master, or primary.
• The other nodes are slaves, or secondaries.
MASTER
• It is the authoritative source for the data.
• It is usually responsible for processing any updates to that data.
• It can be appointed manually or automatically.
SLAVE
• A replication process synchronizes the slaves with the master.
• After a failure of the master, a slave can be appointed as the new master very quickly.
MASTER
The master can be appointed manually or automatically.
Manually: when you configure your cluster, you configure one node as the
master.
Automatically: you create a cluster of nodes and they elect
one of themselves to be the master.
Automatic appointment also means that the cluster can appoint
a new master when the master fails, reducing downtime.
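Automatic appointment can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the `Node` and `Cluster` classes are hypothetical, and the election rule (the live node with the lowest id wins) is a deliberate simplification of the consensus protocols that real databases use.

```python
class Node:
    """One server holding a full copy of the data (hypothetical model)."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.data = {}
        self.alive = True


class Cluster:
    def __init__(self, node_ids):
        self.nodes = [Node(i) for i in node_ids]
        self.master = None
        self.elect_master()

    def elect_master(self):
        # Assumed election rule: the live node with the lowest id wins.
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise RuntimeError("no live nodes left to elect")
        self.master = min(live, key=lambda n: n.node_id)
        return self.master

    def write(self, key, value):
        if not self.master.alive:
            self.elect_master()  # automatic appointment after a failure
        self.master.data[key] = value
        # Replication step: copy the update to every live slave.
        for node in self.nodes:
            if node.alive and node is not self.master:
                node.data[key] = value


cluster = Cluster([1, 2, 3])
cluster.write("user:42", "Ada")
cluster.master.alive = False        # the master (node 1) fails
cluster.write("user:43", "Grace")   # triggers the election of node 2
print(cluster.master.node_id)       # 2
```

With manual appointment, the `elect_master` call would be replaced by an operator changing the configuration, which is why automatic appointment tends to reduce downtime.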
[Figure: Master-slave replication]
Pros and cons of Master-Slave Replication
PROS
• Scales to handle more read requests: add more slave nodes and
ensure that all read requests are routed to the slaves.
• Read resilience: should the master fail, the slaves can still handle
read requests.
• Good for read-intensive datasets.
CONS
• The master is a bottleneck: the system is limited by the master's
ability to process updates and to pass those updates on.
• Failure of the master eliminates the ability to handle writes until
the master is restored or a new master is appointed.
• Inconsistency due to slow propagation of changes to
the slaves.
• Bad for datasets with heavy write traffic.
Read resilience
Read-intensive workload: more and more read requests.
To get read resilience, you have to ensure that the read and write
paths in your application are different.
A failure in the write path can then be handled separately while reads
continue.
[Figure: separate read path and write path]
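A minimal sketch of separate read and write paths follows, assuming a hypothetical `ReplicatedStore` class in which plain dictionaries stand in for the master and its slaves. Replication is done synchronously here only to keep the example short.

```python
import itertools


class WriteUnavailable(Exception):
    """Raised when the master cannot accept writes."""


class ReplicatedStore:
    def __init__(self, slave_count=2):
        self.master = {}
        self.master_up = True
        self.slaves = [{} for _ in range(slave_count)]
        self._next_slave = itertools.cycle(range(slave_count))

    def write(self, key, value):
        # Write path: only the master accepts updates.
        if not self.master_up:
            raise WriteUnavailable("master is down")
        self.master[key] = value
        for slave in self.slaves:   # synchronous copy, for simplicity
            slave[key] = value

    def read(self, key):
        # Read path: round-robin over the slaves, never touches the master.
        return self.slaves[next(self._next_slave)].get(key)


store = ReplicatedStore()
store.write("cart:7", ["book"])
store.master_up = False             # simulate a master failure

try:
    store.write("cart:7", ["book", "pen"])
    write_ok = True
except WriteUnavailable:
    write_ok = False

print(write_ok, store.read("cart:7"))   # False ['book']
```

Because `read` never goes through the master, the failed write is reported on its own path and reads keep being served from the slaves.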
Master-slave replication is good for read resilience.
It does not provide write resilience and does not scale well for writes.
The master is also a bottleneck for updates
(write requests).
Peer-to-peer replication addresses these issues.
Replication: Peer-to-Peer
• All the replicas have equal weight.
• All replicas can accept write requests.
• Loss of any one replica does not prevent
access to the data store.
If any node fails, work continues on the other nodes,
i.e. the user can ride over
node failures without losing access to data.
Nodes can easily be added to improve
performance (though complications may increase).
Complications in peer-to-peer
Biggest complication: consistency
When you can write to two different places
you run the risk that two people will
attempt to update the same record at the
same time—a write-write conflict.
Inconsistencies on read lead to problems
but at least they are relatively transient.
Inconsistent writes are forever.
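A write-write conflict can be reproduced with two peers that both accept an update to the same record. The sketch below is only illustrative: the `Peer` class, the `find_conflicts` helper, the record key and the per-record version counter are all assumptions, and real systems typically rely on richer version stamps.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.records = {}   # key -> (version, value)

    def write(self, key, value):
        version, _ = self.records.get(key, (0, None))
        self.records[key] = (version + 1, value)


def find_conflicts(a, b):
    """Keys where both peers hold the same version but different values."""
    conflicts = []
    for key in a.records.keys() & b.records.keys():
        (va, xa), (vb, xb) = a.records[key], b.records[key]
        if va == vb and xa != xb:
            conflicts.append(key)
    return sorted(conflicts)


peer_one, peer_two = Peer("peer-one"), Peer("peer-two")
for peer in (peer_one, peer_two):
    peer.records["room:101"] = (1, "free")   # replicas start in sync

# Two users update the same record at the same time on different peers.
peer_one.write("room:101", "booked by alice")
peer_two.write("room:101", "booked by bob")

print(find_conflicts(peer_one, peer_two))    # ['room:101']
```

Both peers end up at version 2 with different values, so neither copy can simply overwrite the other without losing an update.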
[Figure: Peer-to-peer replication]
Peer-to-peer replication
There are two broad ways to deal with inconsistent writes.
At one end, we can ensure that whenever we write data, the
replicas coordinate to ensure we avoid a conflict.
We don’t need all the replicas to agree on the write,
just a majority, so we can still survive losing a
minority of the replica nodes.
At the other end, we can decide to cope with an inconsistent write.
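The majority rule can be worked through with a little arithmetic. This is a sketch under stated assumptions: `majority` and `quorum_write` are hypothetical helper names, and each replica is reduced to a boolean saying whether it acknowledged the write.

```python
def majority(replica_count):
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def quorum_write(acks):
    """Accept the write only if a majority of replicas acknowledged it.

    `acks` holds one boolean per replica (True = acknowledged).
    """
    return sum(acks) >= majority(len(acks))


# Five replicas, two unreachable: 3 acks >= 3, so the write succeeds.
print(quorum_write([True, True, True, False, False]))    # True
# Three unreachable: only 2 acks, so the write is rejected.
print(quorum_write([True, True, False, False, False]))   # False
```

With five replicas the cluster therefore survives the loss of any two nodes, which is the "minority" mentioned above.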
Sharding
A busy data store is busy because different people
are accessing different parts of the dataset.
In these circumstances we can support
horizontal scalability by putting different
parts of the data onto different
servers, a technique that’s called sharding.
[Figure: Sharding puts different data on different nodes]
Sharding: Ideal Case
We have different users all talking to different
server nodes.
Each user only has to talk to one server, so
gets rapid responses from that server.
The load is balanced out nicely between
servers—for example, if we have ten servers,
each one only has to handle 10% of the load.
To get close to the ideal case, we have to ensure that data that’s accessed
together is clumped together on the same node and that these clumps are
arranged on the nodes to provide the best data access.
How to clump the data? Use aggregates: aggregates are
designed to combine data that’s commonly accessed
together, so aggregates leap out as an obvious unit of
distribution.
When arranging aggregates on the nodes, two factors help performance:
1) Physical location: where the aggregates are stored matters; data can be
placed close to where it is accessed.
2) Even loading: try to arrange aggregates so they are evenly
distributed across the nodes, which then all get equal amounts of the
load.
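One common way to spread aggregates evenly is to hash the aggregate's key. The sketch below assumes that scheme; `shard_for`, the `customer:<id>` key format and the shard count are all illustrative choices, not something prescribed by the source.

```python
import hashlib
from collections import Counter


def shard_for(aggregate_id, shard_count):
    """Map an aggregate id to a shard using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


SHARDS = 10
# Each customer aggregate (the customer plus its orders) lives on one shard,
# so data that is accessed together stays together.
load = Counter(shard_for(f"customer:{i}", SHARDS) for i in range(10_000))

# The same aggregate always maps to the same shard.
print(shard_for("customer:42", SHARDS) == shard_for("customer:42", SHARDS))  # True

# Aggregates per shard; with a good hash these counts are roughly equal.
for shard in sorted(load):
    print(shard, load[shard])
```

Hashing gives even loading but ignores physical location; placing data near its users would need a key that encodes the location instead.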
Replication and sharding are orthogonal
techniques: you can use
either or both of them.
Many NoSQL databases offer
auto-sharding, where the database
takes on the responsibility of allocating
data to shards and ensuring that data
access goes to the right shard. This
can make it much easier to use
sharding in an application.
Sharding is particularly valuable for
performance. It can improve both
read and write performance.
In particular, it provides a way to horizontally scale writes.
Combining Sharding and Replication
If we use both master-slave replication
and sharding, we have
multiple masters, but each data item
has only a single master. Depending on
your configuration, you may choose a
node to be a master for some data and
a slave for other data, or you may dedicate
nodes to master or slave duties.
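The idea that a node can be master for some data and slave for other data can be sketched as a simple layout table. The node names, shard count and rotation rule below are assumptions chosen for illustration.

```python
NODES = ["node-a", "node-b", "node-c"]
SHARD_COUNT = 6


def layout(nodes, shard_count, slaves_per_shard=1):
    """Give each shard one master and its slaves by rotating over the nodes.

    Assumes slaves_per_shard < len(nodes), so a master is never its own slave.
    """
    plan = {}
    for shard in range(shard_count):
        master = nodes[shard % len(nodes)]
        slaves = [nodes[(shard + k) % len(nodes)]
                  for k in range(1, slaves_per_shard + 1)]
        plan[shard] = {"master": master, "slaves": slaves}
    return plan


plan = layout(NODES, SHARD_COUNT)
for shard, roles in plan.items():
    print(shard, roles["master"], roles["slaves"])
```

In this layout node-a is the master for shards 0 and 3 and a slave for shards 2 and 5; dedicating nodes to master or slave duties would simply use two disjoint node lists.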
[Figure: Combining Sharding and Replication]
Combining Sharding and Replication
Using peer-to-peer replication together with sharding is a common
strategy for column-family databases.
Example: tens or hundreds of nodes in a
cluster with data sharded over them.
A good starting point for peer-to-peer
replication is a replication factor of 3,
so each shard is present on three nodes.
Should a node fail, the shards on that
node will be rebuilt on the other nodes.
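A replication factor of 3 and the rebuild after a node failure can be sketched as follows. The ring-style placement in `place` and the "first spare node" choice in `rebuild` are assumptions made to keep the example short; real databases use their own placement and repair strategies.

```python
REPLICATION_FACTOR = 3


def place(shard, nodes, rf=REPLICATION_FACTOR):
    """Put a shard on rf consecutive nodes of a ring (assumed placement rule)."""
    start = shard % len(nodes)
    return [nodes[(start + k) % len(nodes)] for k in range(rf)]


def rebuild(placement, failed, nodes, rf=REPLICATION_FACTOR):
    """Re-create the replicas lost on `failed` on other surviving nodes."""
    survivors = [n for n in nodes if n != failed]
    repaired = {}
    for shard, holders in placement.items():
        kept = [n for n in holders if n != failed]
        spares = [n for n in survivors if n not in kept]
        # Naive choice: take the first spare nodes until rf is reached again.
        repaired[shard] = kept + spares[: rf - len(kept)]
    return repaired


nodes = ["n0", "n1", "n2", "n3", "n4"]
before = {shard: place(shard, nodes) for shard in range(5)}
after = rebuild(before, "n1", nodes)

print(before[0])   # ['n0', 'n1', 'n2']
print(after[0])    # ['n0', 'n2', 'n3']
```

Shards that never lived on the failed node keep their placement unchanged; only the shards that lost a replica are copied to another node.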
[Figure: Peer-to-peer replication with sharding]
Thank You

Data models in NoSQL


Editor's notes

  1. Read intensive: more and more read requests. Write intensive: more and more write requests. Read resilience: should the master fail, the slaves can still handle read requests. Again, this is useful if most of your data access is reads. The failure of the master does eliminate the ability to handle writes until either the master is restored or a new master is appointed. However, having slaves as replicas of the master does speed up recovery after a failure of the master, since a slave can be appointed as the new master very quickly. Meaning of resilience: the capacity to recover quickly from difficulties (toughness).
  2. A transient database object exists only as long as an application has an open connection to the database. All transient objects disappear when the application shuts down the database. This means that a persistent database does not need to be re-indexed after re-opening.
  3. A "data clump" is a name given to any group of variables which are passed around together (in a clump) throughout various parts of the program. A clump is a grouping.