Consistency and Availability tradeoff in database clusters
Grokking Techtalk 40
About Me
● About Me
● Introduce Segmentation Platform
About Me
● Joined Grab 2 years ago; currently the lead engineer of the Segmentation Platform project
● Leads the Research Database group in Grokking Lab
About Segmentation Platform (SegP)
● Technology
○ Programming languages: Go, Java, Scala
○ Batch processing (Spark, Scala)
○ Caching (Redis)
○ Message queues (SQS, Kafka)
○ Relational database (MySQL)
○ Non-relational databases (Cassandra, DynamoDB, Elasticsearch)
● Team's scope
○ Feature development. Coordinate with business owners to develop a platform for segmentation. Similar to segments.io but for internal users.
○ Batch data processing.
○ Real-time traffic. Build and maintain gRPC APIs to serve online traffic.
What we'll discuss in this talk
● CAP theorem
● The cluster architecture of Redis, Elasticsearch, and Cassandra
● How the C-A tradeoff is reflected in their designs
CAP Theorem
Consistency
The system is considered consistent if client 2 receives v1 whenever its read request (2) happens after the write request (1).
[Diagram: the value of v is currently v0. Client 1 sends (1) Update v=v1 to the system (DB); Client 2 then sends (2) Get v.]
Availability
Every request is handled by an algorithm designed for it, made up of a number of steps.
If the system cannot go through the algorithm designed for that request, the system is considered "not available" to that client.
[Diagram: (1) The client sends a request. (2) If the system goes through the algorithm defined for this request, (3) it returns 2xx or 4xx. If the system cannot go through the algorithm defined for this request, the client receives a 500.]
Network partition
A network partition happens when some of the nodes cannot communicate properly with each other and each side believes that the others are offline.
For example, Node 1 cannot communicate with Node 2, so Node 1 thinks that Node 2 is offline. But Node 2 is still alive and still serves requests.
[Diagram: Client 1 talks to Node 1 and Client 2 talks to Node 2, while the link between Node 1 and Node 2 is broken.]
CAP Theorem
A distributed database has three very desirable properties:
1. Tolerance towards Network Partition
2. Consistency
3. Availability
The CAP theorem states: you can have at most two of these properties for any shared-data system.
[Diagram: a triangle with Consistency, Availability, and Partition tolerance at its corners.]
Redis Cluster
What is Redis
- Stands for Remote Dictionary Server
- Is a fast, open-source, in-memory key-value data store for use as a database, cache,
message broker, and queue.
- Delivers sub-millisecond response times, enabling millions of requests per second for real-time applications in Gaming, Ad-Tech, Financial Services, Healthcare, and IoT.
- Popular choice for caching, session management, gaming, leaderboards, real-time analytics,
geospatial, ride-hailing, chat/messaging, media streaming, and pub/sub apps.
Redis cluster - Multi-master
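Each key is hashed into one of 16384 hash slots (Redis numbers them 0 to 16383; the diagram calls them tokens and numbers them 1 to 16384). Depending on the hash value, the value is read from (and written to) the node that owns that slot range.
[Diagram: a client stores key 5 -> "ho chi minh" and key 6 -> "ha noi". Key 5 hashes to 18 and key 6 hashes to 8003. Redis node 1 owns tokens 1->8000 and stores 5 -> "ho chi minh"; Redis node 2 owns tokens 8001->16384 and stores 6 -> "ha noi".]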
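The key-to-node routing above can be sketched in a few lines. Redis Cluster computes the slot as CRC16(key) mod 16384; the sketch below uses Python's `binascii.crc_hqx`, which implements the same CRC16 variant. The two-node layout and the node names are assumptions that mirror the diagram, and hash tags (`{...}`) are deliberately not handled.

```python
import binascii

NUM_SLOTS = 16384

# Hypothetical two-node layout mirroring the diagram (0-based slot numbers).
SLOT_RANGES = {
    "redis-node-1": range(0, 8000),
    "redis-node-2": range(8000, NUM_SLOTS),
}


def key_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the key, modulo 16384. Hash tags are ignored."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def node_for_key(key: str) -> str:
    """Return the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise LookupError(f"no node owns slot {slot}")


if __name__ == "__main__":
    for key in ("5", "6", "foo"):
        print(key, "-> slot", key_slot(key), "->", node_for_key(key))
```

A real client does not hard-code this table; it learns the slot ownership from the cluster and follows redirections when the layout changes.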
Redis cluster - Master/Replica
Redis uses asynchronous replication, with asynchronous replica-to-master acknowledgements of the amount of data processed.
A master can have multiple replicas.
Clients write to the master, but can read from a replica.
[Diagram: Client 1 sends a write command to the Redis master; the master sends async updates to two Redis replicas; Client 2 sends a read command to a replica.]
Ref: https://redis.io/topics/replication
C-A tradeoff
Redis uses asynchronous replication by default, which means that, by default, it is AP.
If a network partition happens between the master and a replica, clients reading from that replica will see inconsistent data.
[Diagram: Client 1 sends a write command to the Redis master; the async updates do not reach the replicas; Client 2's read command to a replica returns stale data.]
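The stale read can be reproduced with a toy model of asynchronous replication. This is a minimal in-memory sketch, not Redis code: the `Master` and `Replica` classes, the `partitioned` flag, and the explicit `replicate()` call (standing in for the background replication stream) are all assumptions made for illustration.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)


class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.backlog = deque()    # writes not yet shipped to the replicas
        self.partitioned = False  # True while the replicas are unreachable

    def set(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"  # acknowledged before any replica has the write

    def replicate(self):
        """Ship pending writes to the replicas unless the network is partitioned."""
        if self.partitioned:
            return
        while self.backlog:
            key, value = self.backlog.popleft()
            for replica in self.replicas:
                replica.data[key] = value


if __name__ == "__main__":
    replica = Replica()
    master = Master([replica])
    master.set("v", "v0")
    master.replicate()
    master.partitioned = True
    master.set("v", "v1")     # still succeeds: the master stays available
    master.replicate()
    print(replica.get("v"))   # stale read while partitioned
    master.partitioned = False
    master.replicate()
    print(replica.get("v"))   # replica catches up after the partition heals
```

The write is acknowledged as soon as the master has applied it, which is what keeps the master available during the partition and what lets the replica fall behind.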
Elasticsearch cluster
What is Elasticsearch
● Elasticsearch is a distributed, open source search and analytics engine for all types of data,
including textual, numerical, geospatial, structured, and unstructured.
● Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V.
(now known as Elastic).
Roles in Elasticsearch Cluster
● Master node: Manages the overall operation of a cluster and keeps track of the cluster state.
● Data node: Stores and searches data. Performs all data-related operations (indexing, searching, aggregating) on local shards.
● Coordinator node: Delegates client requests to the shards on the data nodes, collects and aggregates the results into one final result, and sends this result back to the client.
[Diagram: Client 1 and Client 2, a coordinator node, a master node, and two data nodes.]
Primary and Replica shards
Steps for primary shards:
● Validate the incoming operation and reject it if structurally invalid.
● Execute the operation locally.
● Forward the operation to each replica in the current in-sync copies set.
● Once all replicas have successfully performed the operation and responded to the primary, the primary acknowledges the successful completion of the request to the client.
[Diagram: a document (user1, auth, a1, "login from homepage") goes from the coordination node to the destination node. Primary shards p1 and p2 forward the operation to their replicas (r11, r12 and r21, r22).]
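The four primary-shard steps can be written down as a small toy model. This is only a sketch of the steps as listed on the slide: `PrimaryShard`, `ReplicaShard`, the `reachable` flag, and the returned dictionaries are hypothetical names, not the Elasticsearch API, and the model does not cover how a real cluster reacts to a failed replica.

```python
class ReplicaShard:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable  # False simulates a network partition
        self.docs = {}

    def apply(self, doc_id, doc):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.docs[doc_id] = doc


class PrimaryShard:
    def __init__(self, in_sync_replicas):
        self.docs = {}
        self.in_sync_replicas = in_sync_replicas

    def index(self, doc_id, doc):
        # Step 1: validate the operation and reject it if structurally invalid.
        if not doc_id or not isinstance(doc, dict):
            return {"acknowledged": False, "reason": "invalid operation"}
        # Step 2: execute the operation locally.
        self.docs[doc_id] = doc
        # Step 3: forward the operation to every in-sync replica.
        for replica in self.in_sync_replicas:
            try:
                replica.apply(doc_id, doc)
            except ConnectionError as err:
                return {"acknowledged": False, "reason": str(err)}
        # Step 4: acknowledge only after all replicas have responded.
        return {"acknowledged": True}


if __name__ == "__main__":
    healthy = PrimaryShard([ReplicaShard("r11"), ReplicaShard("r12")])
    print(healthy.index("a1", {"user": "user1", "event": "login from homepage"}))

    partitioned = PrimaryShard([ReplicaShard("r21"), ReplicaShard("r22", reachable=False)])
    print(partitioned.index("a1", {"user": "user1"}))
```

Because the acknowledgement waits for every in-sync replica, an unreachable replica blocks the write instead of letting the copies diverge.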
C-A tradeoff
If a network partition happens, the primary shard cannot write to its replica shards, which makes the primary shard unavailable.
By default, Elasticsearch leans toward CP.
[Diagram: destination node with primary shards p1 and p2 forwarding the operation to replicas r11, r12, r21, and r22.]
Cassandra
What is Cassandra
Apache Cassandra is an open source, distributed NoSQL database that began internally at
Facebook and was released as an open-source project in July 2008.
Cassandra delivers continuous availability (zero downtime), high performance, and linear
scalability that modern applications require, while also offering operational simplicity and effortless
replication across data centers and geographies.
Cassandra Ring Cluster
- Each node is assigned a range of tokens.
- A client can connect to any node to write; that node becomes the coordinator node.
- The partition key is hashed into a token. The coordinator uses the token to determine which node stores the data.
[Diagram: a ring of eight nodes owning token ranges 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, and 141-160. A row (user1, auth, a1, "login from homepage") hashes to token 44; the coordinator node forwards it to the destination node.]
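The token lookup on the ring can be sketched with a sorted list of range upper bounds. The token space of 1 to 160 comes from the diagram; the node names and the MD5-based `toy_token` function are assumptions for illustration only and are not the partitioner Cassandra uses.

```python
import bisect
import hashlib

# Upper bound of each node's token range, as in the diagram (1-20, 21-40, ..., 141-160).
RING = [(20, "node-1"), (40, "node-2"), (60, "node-3"), (80, "node-4"),
        (100, "node-5"), (120, "node-6"), (140, "node-7"), (160, "node-8")]
UPPER_BOUNDS = [upper for upper, _ in RING]


def toy_token(partition_key: str) -> int:
    """Map a partition key to a token in 1..160 (toy stand-in for a real partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 160 + 1


def owner_of(token: int) -> str:
    """Return the node whose token range contains the given token."""
    index = bisect.bisect_left(UPPER_BOUNDS, token)
    return RING[index][1]


if __name__ == "__main__":
    print(owner_of(44))  # node-3, which owns range 41-60
    print(owner_of(toy_token("user1")))
```

Any node can run this lookup, which is why a client may connect to any node and let it act as coordinator.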
Replication Factor
- Replication Factor (RF) = the number of copies of the data we want to store.
- The replica nodes are chosen by the Replication Strategy.
- Simple strategy: with RF = 3, the next two nodes on the ring hold the additional replicas.
[Diagram: the same ring of eight token ranges (1-20 through 141-160). The row (user1, auth, a1, "login from homepage") with token 44 goes from the coordinator node to its owner node and to the replication nodes that follow it on the ring.]
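Replica placement under the simple strategy amounts to walking clockwise from the node that owns the token. A minimal sketch, assuming the eight equal ranges of 20 tokens from the diagram and hypothetical node names:

```python
# Nodes in ring order; node-1 owns tokens 1-20, node-2 owns 21-40, and so on.
NODES = ["node-1", "node-2", "node-3", "node-4",
         "node-5", "node-6", "node-7", "node-8"]
RANGE_SIZE = 20


def replicas_for(token: int, rf: int = 3) -> list:
    """Return the owner of the token followed by the next rf - 1 nodes on the ring."""
    owner_index = (token - 1) // RANGE_SIZE
    return [NODES[(owner_index + i) % len(NODES)] for i in range(rf)]


if __name__ == "__main__":
    print(replicas_for(44))   # ['node-3', 'node-4', 'node-5']
    print(replicas_for(150))  # ['node-8', 'node-1', 'node-2'] (wraps around the ring)
```

The modulo in the index is what closes the ring: replicas for the last range spill over onto the first nodes.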
Data Consistency
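- Client 1 connects to C1 to write. C1 writes the data to three nodes (A, B, and C), but the write fails at node B.
- Client 2 connects to C2 to read the same data.
What would happen?
[Diagram: Client 1 writes data with token 44 through coordinator C1; Client 2 reads data with token 44 through coordinator C2. Two replicas hold v2, while node B still holds v1.]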
Consistency Level (Write)
Level: One
  Read: Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent.
  Write: A write must be written to the commit log and memtable of at least one replica node.
Level: Quorum
  Read: Returns the record after a quorum of replicas has responded.
  Write: A write must be written to the commit log and memtable on a quorum of replica nodes.
Level: All
  Read: Returns the record after all replicas have responded. The read operation will fail if a replica does not respond.
  Write: A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
Write with CL=ALL
- All replicas succeeded -> success
- 1 replica failed -> failed
Result: Failed
[Diagram: Client 1 writes data with token 44 through coordinator C1 to replicas A, B, and C. Two replicas hold v2, while node B still holds v1.]
Write with CL=QUORUM
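Quorum = floor(RF / 2) + 1 = 2 (for RF = 3)
- Two replicas succeeded -> success
- Fewer than two succeeded -> failed
Result: Success
[Diagram: Client 1 writes data with token 44 through coordinator C1 to replicas A, B, and C. Two replicas hold v2, while node B still holds v1.]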
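The outcome of the same write under each consistency level can be checked with a few lines. This sketch only counts acknowledgements; the function names are made up for illustration, and the quorum size uses floor(RF / 2) + 1, which gives 2 for RF = 3.

```python
def quorum(rf: int) -> int:
    """Smallest majority of rf replicas."""
    return rf // 2 + 1


def required_acks(level: str, rf: int) -> int:
    """Number of replica acknowledgements a write needs at the given level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def write_succeeds(level: str, rf: int, acks: int) -> bool:
    return acks >= required_acks(level, rf)


if __name__ == "__main__":
    # RF = 3, nodes A and C acknowledged, node B failed.
    for level in ("ALL", "QUORUM", "ONE"):
        print(level, write_succeeds(level, rf=3, acks=2))
```

With two acknowledgements out of three, the write fails at ALL but succeeds at QUORUM and ONE, matching the results above.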
Consistency Level (Read)
Write=QUORUM, Read=One
Potentially inconsistent read. If client 2 reads from node B, client 2 receives stale data.
W (Quorum) + R (1) -> eventually consistent
[Diagram: Client 1 writes data with token 44 through coordinator C1; Client 2 reads data with token 44 through coordinator C2. Two replicas hold v2, while node B still holds v1.]
Write=QUORUM, Read=QUORUM
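Every combination of two replicas includes at least one replica that holds v2, so a quorum read always returns v2.
W (Quorum) + R (Quorum) -> consistent
[Diagram: Client 1 writes data with token 44 through coordinator C1; Client 2 reads data with token 44 through coordinator C2. Two replicas hold v2, while node B still holds v1.]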
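The difference between reading at ONE and at QUORUM can be checked by enumerating every set of replicas a read might contact. The sketch assumes the coordinator returns the value with the newest write timestamp among the replicas it contacted; the timestamps 1 and 2 are made-up values for v1 and v2.

```python
from itertools import combinations

# Replica state after the quorum write: node B missed it and still holds v1.
# Each entry is (write_timestamp, value); the timestamps are illustrative.
REPLICAS = {"A": (2, "v2"), "B": (1, "v1"), "C": (2, "v2")}


def possible_reads(replicas_contacted: int) -> set:
    """All values a read can return when it contacts that many replicas."""
    results = set()
    for nodes in combinations(REPLICAS, replicas_contacted):
        newest = max(REPLICAS[node] for node in nodes)  # newest timestamp wins
        results.add(newest[1])
    return results


if __name__ == "__main__":
    print(possible_reads(1))  # ONE: may return v1 or v2
    print(possible_reads(2))  # QUORUM: always v2
```

Reading one replica can land on node B and return v1, while any two replicas always include a copy of v2.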
Write=All, Read=One
If the write at ALL succeeds, every replica holds v2, so a read from any single node returns v2.
W (All) + R (1) -> consistent
[Diagram: Client 1 writes data with token 44 through coordinator C1; Client 2 reads data with token 44 through coordinator C2.]
Summary of Read and Write CL
WRITE    READ     Consistency    Read Availability   Write Availability
All      All      Consistent     Low                 Low
Quorum   All      Consistent     Low                 Medium
One      All      Consistent     Low                 High
All      Quorum   Consistent     Medium              Low
Quorum   Quorum   Consistent     Medium              Medium
One      Quorum   Inconsistent   Medium              High
All      One      Consistent     High                Low
Quorum   One      Inconsistent   High                Medium
One      One      Inconsistent   High                High
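The Consistency column of the table follows from one rule: reads see the latest successful write when the read and write replica sets must overlap, that is, when R + W > RF. A short sketch, assuming RF = 3 as in the earlier examples (the function names are illustrative):

```python
from itertools import product

LEVELS = ("ALL", "QUORUM", "ONE")


def replicas_needed(level: str, rf: int) -> int:
    """Replicas that must respond at the given consistency level."""
    return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[level]


def is_consistent(write_level: str, read_level: str, rf: int = 3) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return replicas_needed(write_level, rf) + replicas_needed(read_level, rf) > rf


if __name__ == "__main__":
    for read_level, write_level in product(LEVELS, LEVELS):
        verdict = "Consistent" if is_consistent(write_level, read_level) else "Inconsistent"
        print(f"W={write_level:<7} R={read_level:<7} {verdict}")
```

The availability columns move in the opposite direction: the more replicas a level requires, the fewer node failures that operation can tolerate.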
Summary
Redis: Availability > Consistency
Cassandra: Tweakable availability and consistency
Elasticsearch: Availability < Consistency
Q&A

More Related Content

What's hot

Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Building Bizweb Microservices with Docker
Building Bizweb Microservices with DockerBuilding Bizweb Microservices with Docker
Building Bizweb Microservices with DockerKhôi Nguyễn Minh
 
Real World Event Sourcing and CQRS
Real World Event Sourcing and CQRSReal World Event Sourcing and CQRS
Real World Event Sourcing and CQRSMatthew Hawkins
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compilerGrokking VN
 
Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources confluent
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaJiangjie Qin
 
SQL Transactions - What they are good for and how they work
SQL Transactions - What they are good for and how they workSQL Transactions - What they are good for and how they work
SQL Transactions - What they are good for and how they workMarkus Winand
 
How to tune Kafka® for production
How to tune Kafka® for productionHow to tune Kafka® for production
How to tune Kafka® for productionconfluent
 
Dual write strategies for microservices
Dual write strategies for microservicesDual write strategies for microservices
Dual write strategies for microservicesBilgin Ibryam
 
Architecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker servicesArchitecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker servicesLINE Corporation
 
How Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferHow Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferScyllaDB
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking VN
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practicesconfluent
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 

What's hot (20)

Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
kafka
kafkakafka
kafka
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Building Bizweb Microservices with Docker
Building Bizweb Microservices with DockerBuilding Bizweb Microservices with Docker
Building Bizweb Microservices with Docker
 
Sapo Microservices Architecture
Sapo Microservices ArchitectureSapo Microservices Architecture
Sapo Microservices Architecture
 
Real World Event Sourcing and CQRS
Real World Event Sourcing and CQRSReal World Event Sourcing and CQRS
Real World Event Sourcing and CQRS
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources Automate Your Kafka Cluster with Kubernetes Custom Resources
Automate Your Kafka Cluster with Kubernetes Custom Resources
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
SQL Transactions - What they are good for and how they work
SQL Transactions - What they are good for and how they workSQL Transactions - What they are good for and how they work
SQL Transactions - What they are good for and how they work
 
How to tune Kafka® for production
How to tune Kafka® for productionHow to tune Kafka® for production
How to tune Kafka® for production
 
Dual write strategies for microservices
Dual write strategies for microservicesDual write strategies for microservices
Dual write strategies for microservices
 
Architecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker servicesArchitecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker services
 
How Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and SaferHow Scylla Make Adding and Removing Nodes Faster and Safer
How Scylla Make Adding and Removing Nodes Faster and Safer
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practices
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 

Similar to Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster

Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applicationsDing Li
 
MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...
MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...
MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...MongoDB
 
The Apache Cassandra ecosystem
The Apache Cassandra ecosystemThe Apache Cassandra ecosystem
The Apache Cassandra ecosystemAlex Thompson
 
RAC - The Savior of DBA
RAC - The Savior of DBARAC - The Savior of DBA
RAC - The Savior of DBANikhil Kumar
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processingconfluent
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDKKernel TLV
 
Data correlation using PySpark and HDFS
Data correlation using PySpark and HDFSData correlation using PySpark and HDFS
Data correlation using PySpark and HDFSJohn Conley
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at ScaleSean Zhong
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingApache Apex
 
Webinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetWebinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetMongoDB
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistencyScyllaDB
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...HostedbyConfluent
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterDataStax Academy
 

Similar to Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster (20)

Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
MYSQL
MYSQLMYSQL
MYSQL
 
MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...
MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...
MongoDB World 2018: Active-Active Application Architectures: Become a MongoDB...
 
The Apache Cassandra ecosystem
The Apache Cassandra ecosystemThe Apache Cassandra ecosystem
The Apache Cassandra ecosystem
 
RAC - The Savior of DBA
RAC - The Savior of DBARAC - The Savior of DBA
RAC - The Savior of DBA
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
 
infiniband.pdf
infiniband.pdfinfiniband.pdf
infiniband.pdf
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
Data correlation using PySpark and HDFS
Data correlation using PySpark and HDFSData correlation using PySpark and HDFS
Data correlation using PySpark and HDFS
 
Lecture9
Lecture9Lecture9
Lecture9
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
 
Polyraptor
PolyraptorPolyraptor
Polyraptor
 
Postgres clusters
Postgres clustersPostgres clusters
Postgres clusters
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
Webinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetWebinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica Set
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistency
 
Polyraptor
PolyraptorPolyraptor
Polyraptor
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra Cluster
 
PortLand.pptx
PortLand.pptxPortLand.pptx
PortLand.pptx
 

More from Grokking VN

Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking VN
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking VN
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking VN
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoringGrokking VN
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...Grokking VN
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking VN
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking VN
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking VN
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking VN
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking VN
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking VN
 
Grokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking VN
 
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...Grokking VN
 
Grokking TechTalk #24: Kafka's principles and protocols
Grokking TechTalk #24: Kafka's principles and protocolsGrokking TechTalk #24: Kafka's principles and protocols
Grokking TechTalk #24: Kafka's principles and protocolsGrokking VN
 
Grokking TechTalk #21: Deep Learning in Computer Vision
Grokking TechTalk #21: Deep Learning in Computer VisionGrokking TechTalk #21: Deep Learning in Computer Vision
Grokking TechTalk #21: Deep Learning in Computer VisionGrokking VN
 
Grokking TechTalk #20: PostgreSQL Internals 101
Grokking TechTalk #20: PostgreSQL Internals 101Grokking TechTalk #20: PostgreSQL Internals 101
Grokking TechTalk #20: PostgreSQL Internals 101Grokking VN
 
Grokking TechTalk #19: Software Development Cycle In The International Moneta...
Grokking TechTalk #19: Software Development Cycle In The International Moneta...Grokking TechTalk #19: Software Development Cycle In The International Moneta...
Grokking TechTalk #19: Software Development Cycle In The International Moneta...Grokking VN
 
Grokking TechTalk #18B: Giới thiệu về Viễn thông Di động
Grokking TechTalk #18B:  Giới thiệu về Viễn thông Di độngGrokking TechTalk #18B:  Giới thiệu về Viễn thông Di động
Grokking TechTalk #18B: Giới thiệu về Viễn thông Di độngGrokking VN
 

More from Grokking VN (20)

Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
 

 

Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster

  • 1. Consistency and Availability tradeoff in database clusters Grokking Techtalk 40
  • 2. About Me ● About me ● Introduction to the Segmentation Platform
  • 3. About Me ● Joined Grab 2 years ago; currently working as the lead engineer of the Segmentation Platform project ● Leads the Research Database group in Grokking Lab
  • 4. About Segmentation Platform (SegP) ● Technology ○ Programming languages: Go, Java, Scala ○ Batch processing (Spark, Scala) ○ Caching (Redis) ○ Message queues (SQS, Kafka) ○ Relational database (MySQL) ○ Non-relational databases (Cassandra, DynamoDB, Elasticsearch) ● Team's scope ○ Feature development: coordinate with business owners to develop a platform for segmentation, similar to segments.io but for internal users ○ Batch data processing ○ Real-time traffic: build and maintain gRPC APIs that serve online traffic
  • 5. What we'll discuss in this talk ● CAP theorem ● The cluster architecture of Redis, Elasticsearch, and Cassandra ● How the C-A tradeoff is reflected in their designs
  • 7. Consistency: The system is considered consistent if a read request (2) that happens after a write request (1) returns the written value v1. (Diagram: the value of v in the DB system is currently v0; Client 1 sends (1) Update v=v1; Client 2 sends (2) Get v.)
  • 8. Availability: For each request, the system has an algorithm, a defined sequence of steps, designed to handle it. If the system cannot go through the algorithm designed for that request, it is considered "not available" to that client. (Diagram: (1) the client sends a request; (2) if the system goes through the algorithm defined for this request, (3) it returns 2xx or 4xx; if the system cannot go through the algorithm, it returns 500.)
  • 9. Network partition: A network partition happens when some of the nodes cannot communicate properly with each other and each side believes that the other is offline. For example, Node 1 cannot communicate with Node 2, so Node 1 thinks that Node 2 is offline, but Node 2 is still alive and still serving requests. (Diagram: Node 1, Node 2, Client 1, Client 2.)
  • 10. CAP Theorem: A distributed database has three very desirable properties: 1. Tolerance of network partitions 2. Consistency 3. Availability. The CAP theorem states that a shared-data system can have at most two of these properties. (Diagram: Consistency, Availability, Partition tolerance.)
  • 12. What is Redis - Stands for Remote Dictionary Server - A fast, open-source, in-memory key-value data store for use as a database, cache, message broker, and queue - Delivers sub-millisecond response times, enabling millions of requests per second for real-time applications in gaming, ad tech, financial services, healthcare, and IoT - A popular choice for caching, session management, gaming, leaderboards, real-time analytics, geospatial, ride-hailing, chat/messaging, media streaming, and pub/sub apps
  • 13. Redis cluster - Multi-master: Each key is hashed into one of 16384 hash slots (0-16383). Depending on the hash value, the key is read from and written to the node that is assigned that slot (token) range. (Diagram: the client stores 5 -> "ho chi minh" and 6 -> "ha noi"; key 5 hashes to 18 and key 6 hashes to 8003; Redis node 1 owns tokens 1->8000 and holds 5 -> "ho chi minh"; Redis node 2 owns tokens 8001->16384 and holds 6 -> "ha noi".)
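The key-to-node mapping on this slide can be sketched in a few lines. This is a minimal illustration, not Redis's own code: it assumes the CRC16 (XMODEM) hash that Redis Cluster documents for slot selection, ignores hash tags, and uses a hypothetical two-node slot layout (`SLOT_RANGES`) that only loosely mirrors the split drawn on the slide.

```python
import binascii

NUM_SLOTS = 16384  # Redis Cluster hash slots are numbered 0..16383

# Hypothetical two-node layout for illustration only.
SLOT_RANGES = {
    "redis-node-1": range(0, 8001),          # slots 0..8000
    "redis-node-2": range(8001, NUM_SLOTS),  # slots 8001..16383
}


def slot_for_key(key: str) -> int:
    """Return CRC16 (XMODEM variant) of the key, modulo the slot count."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def node_for_key(key: str) -> str:
    """Find the node whose slot range contains the key's slot."""
    slot = slot_for_key(key)
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise ValueError(f"slot {slot} is not assigned to any node")
```

Calling `node_for_key("user:42")` returns the name of the node that would serve both reads and writes for that key, which is why every client that knows the slot layout can go straight to the right master.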
  • 14. Redis cluster - Master/Replica: Redis uses asynchronous replication, with asynchronous replica-to-master acknowledgements of the amount of data processed. A master can have multiple replicas. Clients write to the master but can read from a replica. (Diagram: Client 1 sends a write command to the Redis master, which sends async updates to two Redis replicas; Client 2 sends a read command to a replica.) Ref: https://redis.io/topics/replication
  • 15. C-A tradeoff: Redis uses asynchronous replication by default, which means that by default it is AP. If a network partition happens between the master and a replica, clients reading from that replica see inconsistent data. (Diagram: Client 1 sends a write command to the Redis master; the master sends async updates to two Redis replicas; Client 2's read command on a replica returns stale data.)
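The stale read described on this slide can be reproduced with a toy model. The sketch below is not Redis code: `Master`, `Replica`, `backlog`, and `link_up` are hypothetical names, and replication is reduced to draining a queue when the link is up.

```python
from collections import deque


class Replica:
    """A read-only copy that only changes when the master ships updates."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Master:
    """Acknowledges writes immediately and replicates them later."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.backlog = deque()  # updates not yet shipped to the replica
        self.link_up = True     # False simulates a network partition

    def write(self, key, value):
        # The client gets "OK" as soon as the master has applied the write.
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"

    def replicate(self):
        """Ship queued updates if the link is up; return how many were sent."""
        if not self.link_up:
            return 0
        sent = 0
        while self.backlog:
            key, value = self.backlog.popleft()
            self.replica.data[key] = value
            sent += 1
        return sent
```

With `link_up = False`, a write still returns "OK" (the system stays available), while `replica.read` keeps returning the old value until the link heals and `replicate()` runs again.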
  • 17. What is Elasticsearch ● Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. ● Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now known as Elastic).
  • 18. Roles in an Elasticsearch cluster ● Master node: manages the overall operation of the cluster and keeps track of the cluster state ● Data node: stores and searches data; performs all data-related operations (indexing, searching, aggregating) on local shards ● Coordinator node: delegates client requests to the shards on the data nodes, collects and aggregates the results into one final result, and sends this result back to the client (Diagram: Client 1, Client 2, a coordinator node, a master node, and two data nodes.)
  • 19. Primary and Replica shards. Steps for primary shards: ● Validate the incoming operation and reject it if structurally invalid ● Execute the operation locally ● Forward the operation to each replica in the current in-sync copies set ● Once all replicas have successfully performed the operation and responded to the primary, the primary acknowledges the successful completion of the request to the client (Diagram: a document {user1, auth, a1, "login from homepage"} passes through the coordination node to the destination node; primary shards p1 and p2 forward the operation to replicas r11, r12, r21, r22.)
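The four steps listed for the primary shard can be sketched as a toy model. This is an illustration of the flow as the slide describes it, not Elasticsearch code: `PrimaryShard`, `ReplicaShard`, `reachable`, and the returned status strings are hypothetical, and the model omits what a real cluster does when a replica fails.

```python
class ReplicaShard:
    """A replica copy; `reachable=False` simulates a partitioned replica."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.docs = {}

    def apply(self, doc_id, doc):
        if not self.reachable:
            return False
        self.docs[doc_id] = doc
        return True


class PrimaryShard:
    def __init__(self, in_sync_replicas):
        self.docs = {}
        self.in_sync = list(in_sync_replicas)

    def index(self, doc_id, doc):
        # Step 1: validate the operation and reject it if structurally invalid.
        if not isinstance(doc_id, str) or not isinstance(doc, dict):
            return "rejected"
        # Step 2: execute the operation locally.
        self.docs[doc_id] = doc
        # Step 3: forward the operation to every in-sync replica.
        acks = [replica.apply(doc_id, doc) for replica in self.in_sync]
        # Step 4: acknowledge only once all replicas have responded.
        return "acknowledged" if all(acks) else "not acknowledged"
```

Because the acknowledgement waits for every in-sync copy, a client that receives "acknowledged" can read the document from any of those copies, at the cost of the write stalling when a replica is unreachable.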
  • 20. C-A tradeoff: If a network partition happens, the primary shard cannot write to its replica shards, which makes the primary shard unavailable. By default, Elasticsearch leans toward CP. (Diagram: on the destination node, primary shards p1 and p2 forward the operation to replicas r11, r12, r21, r22.)
  • 22. What is Cassandra: Apache Cassandra is an open source, distributed NoSQL database that began internally at Facebook and was released as an open-source project in July 2008. Cassandra delivers continuous availability (zero downtime), high performance, and linear scalability that modern applications require, while also offering operational simplicity and effortless replication across data centers and geographies.
  • 23. Cassandra Ring Cluster - Each node is assigned a range of tokens - A client can connect to any node to write; that node becomes the coordinator node - Partition keys are hashed into a token. The coordinator uses the token to determine which node stores the data (Diagram: a ring of eight nodes owning token ranges 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, and 141-160; a row {user1, auth, a1, "login from homepage"} hashes to token 44, and the coordinator node routes it to the destination node.)
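The coordinator's lookup from token to owning node can be sketched with a binary search over the range boundaries. The ranges are the ones drawn on the slide; the function names are hypothetical, and a real cluster uses a far larger token space.

```python
from bisect import bisect_left

# Upper bound of each node's token range, in ring order (taken from the slide).
RANGE_ENDS = [20, 40, 60, 80, 100, 120, 140, 160]


def owner_index(token: int) -> int:
    """Return the position on the ring of the node that owns the token."""
    if not 1 <= token <= RANGE_ENDS[-1]:
        raise ValueError(f"token {token} is outside the ring")
    # First range whose upper bound is >= token.
    return bisect_left(RANGE_ENDS, token)


def owner_range(token: int) -> tuple:
    """Return the (start, end) token range of the owning node."""
    i = owner_index(token)
    start = 1 if i == 0 else RANGE_ENDS[i - 1] + 1
    return (start, RANGE_ENDS[i])
```

For the slide's example, `owner_range(44)` gives `(41, 60)`, so whichever node the client connected to forwards the write to the node that owns that range.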
  • 24. Replication Factor - Replication Factor (RF) = the number of copies of the data we want to store - The replica nodes are determined by the replication strategy - Simple strategy with RF = 3: the node that owns the token stores the data, and the next two nodes on the ring hold the other copies (Diagram: the same ring of eight nodes; a row {user1, auth, a1, "login from homepage"} hashes to token 44 and the coordinator node sends it to the replica nodes.)
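Replica placement under the simple strategy can be sketched as "the owner plus the next RF - 1 nodes clockwise". This is a minimal illustration using the slide's eight token ranges as node names; `replicas_for` and `RANGE_WIDTH` are hypothetical.

```python
from typing import List

# Nodes in ring order, named after the token range each one owns on the slide.
NODES = ["1-20", "21-40", "41-60", "61-80",
         "81-100", "101-120", "121-140", "141-160"]
RANGE_WIDTH = 20  # each node owns 20 consecutive tokens in this example


def replicas_for(token: int, rf: int = 3) -> List[str]:
    """Return the owner of the token followed by the next rf - 1 ring nodes."""
    if not 1 <= token <= RANGE_WIDTH * len(NODES):
        raise ValueError(f"token {token} is outside the ring")
    if not 1 <= rf <= len(NODES):
        raise ValueError("rf must be between 1 and the number of nodes")
    first = (token - 1) // RANGE_WIDTH
    # Wrap around the end of the list so the ring closes on itself.
    return [NODES[(first + i) % len(NODES)] for i in range(rf)]
```

With RF = 3, token 44 lands on the nodes owning 41-60, 61-80, and 81-100, and a token near the end of the ring wraps around to the first nodes.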
  • 25. Data Consistency - Client 1 connects to C1 to write; C1 writes the data to three nodes, but the write fails at node B. - Client 2 connects to C2 to read the data. What would happen? (Diagram: Client 1 writes data with token 44 through coordinator C1; Client 2 reads data with token 44 through coordinator C2; two replicas hold v2 while node B still holds v1.)
  • 26. Consistency Level (Write)
      Level | Read | Write
      One | Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent. | A write must be written to the commit log and memtable of at least one replica node.
      Quorum | Returns the record after a quorum of replicas has responded. | A write must be written to the commit log and memtable on a quorum of replica nodes.
      All | Returns the record after all replicas have responded. The read operation will fail if a replica does not respond. | A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
  • 27. Write with CL=ALL - All replicas succeed -> success - 1 replica fails -> failure. Result: Failed (Diagram: Client 1 writes data with token 44 through C1; two replicas store v2, but node B still holds v1.)
  • 28. Write with CL=QUORUM: Quorum = floor(RF / 2) + 1 = 2 for RF = 3 - Two replicas succeed -> success - Fewer than two succeed -> failure. Result: Success (Diagram: Client 1 writes data with token 44 through C1; two replicas store v2, but node B still holds v1.)
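The outcomes on the CL=ALL and CL=QUORUM slides follow from counting acknowledgements. A minimal sketch, assuming the usual quorum formula of floor(RF / 2) + 1 (which gives 2 for RF = 3, as on the slide); `required_acks` and `write_succeeds` are hypothetical helper names.

```python
def quorum(rf: int) -> int:
    """A majority of the replicas: floor(rf / 2) + 1."""
    return rf // 2 + 1


def required_acks(level: str, rf: int) -> int:
    """Number of replica acknowledgements a consistency level needs."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def write_succeeds(level: str, rf: int, acks: int) -> bool:
    """A write succeeds when enough replicas acknowledged it."""
    return acks >= required_acks(level, rf)
```

In the scenario shown, RF = 3 and node B fails, so only two replicas acknowledge: `write_succeeds("ALL", 3, 2)` is false while `write_succeeds("QUORUM", 3, 2)` is true.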
  • 29. Consistency Level (Read)
      Level | Read | Write
      One | Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent. | A write must be written to the commit log and memtable of at least one replica node.
      Quorum | Returns the record after a quorum of replicas has responded. | A write must be written to the commit log and memtable on a quorum of replica nodes.
      All | Returns the record after all replicas have responded. The read operation will fail if a replica does not respond. | A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
  • 30. Write=QUORUM, Read=ONE: Potentially inconsistent read. If Client 2 reads from node B, it receives stale data. W (Quorum) + R (One) -> eventually consistent (Diagram: Client 1 writes data with token 44 through C1; Client 2 reads data with token 44 through C2; two replicas hold v2 and node B holds v1.)
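  • 31. Write=QUORUM, Read=QUORUM: A quorum read contacts two of the three replicas, and at least one of them took part in the quorum write, so any combination of replicas returns v2. W (Quorum) + R (Quorum) -> consistent (Diagram: Client 1 writes data with token 44 through C1; Client 2 reads data with token 44 through C2; two replicas hold v2 and node B holds v1.)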
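  • 32. Write=ALL, Read=ONE: A write at ALL succeeds only once every replica has stored it, so after a successful write a read from any single replica returns the latest value. In the scenario shown, node B fails, so the write itself fails. W (All) + R (One) -> consistent (Diagram: Client 1 writes data with token 44 through C1; Client 2 reads data with token 44 through C2; two replicas hold v2 and node B holds v1.)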
  • 33. Summary of Read and Write CL
      WRITE  | READ   | Consistency  | Read Availability | Write Availability
      All    | All    | Consistent   | Low               | Low
      Quorum | All    | Consistent   | Low               | Medium
      One    | All    | Consistent   | Low               | High
      All    | Quorum | Consistent   | Medium            | Low
      Quorum | Quorum | Consistent   | Medium            | Medium
      One    | Quorum | Inconsistent | Medium            | High
      All    | One    | Consistent   | High              | Low
      Quorum | One    | Inconsistent | High              | Medium
      One    | One    | Inconsistent | High              | High
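The consistency column of this table can be regenerated from one rule: reads see the latest write when the write and read replica counts overlap, that is, when W + R > RF. The slide does not state the rule explicitly, so treat the sketch below as one way to derive the table for RF = 3; the availability labels are simply copied from the slide.

```python
from itertools import product

RF = 3
# Replicas that must respond at each consistency level, for RF = 3.
REQUIRED = {"All": RF, "Quorum": RF // 2 + 1, "One": 1}
# Availability labels as given on the slide: fewer required replicas, higher availability.
AVAILABILITY = {"All": "Low", "Quorum": "Medium", "One": "High"}


def is_consistent(write_level: str, read_level: str) -> bool:
    """True when every read set must overlap every write set."""
    return REQUIRED[write_level] + REQUIRED[read_level] > RF


def summary_rows():
    """Rows of (write, read, consistency, read availability, write availability)."""
    rows = []
    # Read level varies slowest, matching the row order on the slide.
    for read_level, write_level in product(REQUIRED, REQUIRED):
        verdict = "Consistent" if is_consistent(write_level, read_level) else "Inconsistent"
        rows.append((write_level, read_level, verdict,
                     AVAILABILITY[read_level], AVAILABILITY[write_level]))
    return rows
```

Printing `summary_rows()` yields the nine rows in the slide's order, with the three inconsistent combinations being exactly those where W + R is 3 or less.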
  • 34. Summary
      Redis | Cassandra | Elasticsearch
      Availability > Consistency | Tunable availability and consistency | Availability < Consistency