SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
Cloud Database Engineering
Making Non-Distributed Databases, Distributed
- Shailesh Birari
- Ioannis Papapanagiotou, PhD
Dynomite Ecosystem
● Dynomite
● Dynomite-manager
● Dyno client
Cloud Database Engg (CDE) Team
● Develop and operate data stores in AWS
- Cassandra, Dynomite, Elastic Search,
RDS, S3
● Ensure availability, scalability, durability and
latency SLAs
● Database expertise, client libraries, tools and
best practices
● Cassandra not a speed demon (reads)
● Needed a data store:
o Scalable & highly available
o High throughput, low latency
o Active-active multi datacenter replication
● Usage of Redis increasing:
o Netflix use case is active-active, highly available
o Does not have bi-directional replication
o Cannot withstand a Monkey attack
Problems & Observations
What is Dynomite?
● A framework that makes non-distributed data
stores, distributed.
o Can be used with many key-value storage engines
like Redis, Memcached, LMDB, etc.
o Focus: performance, cross-datacenter active-active
replication and high availability
o Features: node warmup (cold bootstrapping),
tunable consistency, S3 backups/restores
Dynomite @ Netflix
● Running around 1.5 years in PROD
● ~1000 customer facing nodes
● 1M OPS at peak
● Largest cluster: 6TB
● Quarterly upgrades in PROD
Dynomite Overview
● Layer on top of a non-distributed key value
data store
○ Peer-peer, Shared Nothing
○ Auto Sharding
○ Multi-datacenter
○ Linear scale
○ Replication(Encrypted)
○ Gossiping
● Each rack contains one
copy of data, partitioned
across multiple nodes in
that rack
● Multiple Racks == Higher
Availability (HA)
Topology
Replication
● A client can connect to any node on
the Dynomite cluster when sending
requests.
o If node owns the data,
▪ data are written in local
data-store and
asynchronously replicated.
o If node does not own the data
▪ node acts as a coordinator
and sends the data in the
same rack & replicates to
other nodes in other racks
and DC.
The Dynomite Ecosystem
RESP = Redis
Serialization Protocol
Consistency
● DC_ONE
o Reads and writes are propagated synchronously only to the
node in local rack and asynchronously replicated to other racks
and data centers
● DC_QUORUM
o Reads and writes are propagated synchronously to quorum
number of nodes in the local region and asynchronously to the
rest
● Consistency can be configured dynamically for read or write
operations separately (cluster-wide)
Performance Setup
● Instance Type:
○ Dynomite: r3.2xlarge (1Gbps)
○ Pappy/Dyno: m2.2xls (typical of an app@Netflix)
● Replication factor: 3
○ Deployed Dynomite in 3 zones in us-east-1
○ Every zone had the same number of servers
● Demo app used simple workloads key/value pairs
○ Redis: GET and SET
● Payload
○ Size: 1024 Bytes
○ 80%/20% reads over writes
Performance (Dynomite Speed)
● Throughput scales linearly with number of nodes.
● Dynomite can reach >1Million Client requests with ~24 nodes.
Performance (Latency - average/P50)
● Dynomite’s latency on average is 0.16ms.
● Client side latency is 0.6ms and does not increase as the cluster scales
up/down
Performance (Latency - P99)
● The major contributor to latency at P99 is the network.
● Dynomite affects <10%
Dynomite-manager
● Token management for multi-region deployments
● Support AWS environment
● Automated security group update in multi-region environment
● Monitoring of Dynomite and the underlying storage engine
● Node cold bootstrap (warm up)
● S3 backups and restores
● REST API
Dynomite-manager: warm up
1. Dynomite-manager identifies which node has the same token in the
same DC
2. Sets Redis to “Slave” mode of that node
3. Checks for peer syncing
a. difference between master and slave offset
4. Once master and slave are in sync, Dynomite is set to allow write only
5. Dynomite is set back to normal state
6. Checks for health of the node - Done!
Warm up (node terminated)
Warm up (auto-scale)
Warm up (node with same token)
Warm up (Redis replication)
Warm up (Streaming data)
Warm up (Nodes in sync)
Dynomite: S3 backups/restores
● Why?
o Disaster recovery
o Data corruption
● How?
o Redis dumps data on the instance drive
o Dynomite-manager sends data to S3 buckets
● Data per node are not large so no need for incrementals.
● Use case:
o clusters that use Dynomite as a storage layer
o Not enabled in clusters that have short TTL or use Dynomite as a
cache
Dynomite S3 backups (operation)
1. Perform backup
a. Dynomite-manager performs it on a pre-defined interval
b. Dynomite-manger REST call:
i. curl http://localhost:8080/REST/v1/admin/s3backup
2. Perform a Redis BGREWRITEAOF or BGSAVE.
a. Check the size of the persisted file. If the size is zero, which means that there
was an issue with Redis or no data are there, then we do not perform S3
backups
3. S3 backup key: backup/region/clustername-ASG/token/date
Dynomite S3 restores
1. Perform restore:
a. Dynomite-manager performs once it starts if configuration is enabled
b. Dynomite-manger REST call:
i. curl http://localhost:8080/REST/v1/admin/s3backup
2. Stop Dynomite process:
a. We perform this to notify Discovery that Dynomite is not accessible
b. Stop Redis process
3. Restore the data from a specific date
a. provided in the configuration
4. Start Redis process and check if the data has been loaded.
5. Start Dynomite and check if process is up
Dyno Client - Java API
● Connection Pooling
● Load Balancing
● Effective failover
● Pipelining
● Scatter/Gather
● Metrics, e.g. Netflix Insights
Dyno Load Balancing
● Dyno client employs token
aware load balancing.
● Dyno client is aware of the
cluster topology of Dynomite
within the region,
can write to specific node
using consistent
hashing.
Dyno Failover
● Dyno will route
requests to
different racks in
failure scenarios.
Roadmap
● Multi-threaded support for Dynomite
● Data reconciliation & repair v2
● Dynomite-spark connector
● Investigation for persistent stores
● Async Dyno Client
● Others….
More information
● Netflix OSS:
o https://github.com/Netflix/dynomite
o https://github.com/Netflix/dyno
● Contact:
o email: {sbirari, ipapapanagiotou}@netflix.com
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis Papapanagiotou, Shailesh Birari, Jason Cacciatore, Netflix

Contenu connexe

Tendances

Tendances (20)

Tuning Apache/MySQL/PHP para desenvolvedores
Tuning Apache/MySQL/PHP para desenvolvedoresTuning Apache/MySQL/PHP para desenvolvedores
Tuning Apache/MySQL/PHP para desenvolvedores
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deployment
 
MongoDB Database Replication
MongoDB Database ReplicationMongoDB Database Replication
MongoDB Database Replication
 
Automated master failover
Automated master failoverAutomated master failover
Automated master failover
 
さいきんの InnoDB Adaptive Flushing (仮)
さいきんの InnoDB Adaptive Flushing (仮)さいきんの InnoDB Adaptive Flushing (仮)
さいきんの InnoDB Adaptive Flushing (仮)
 
DockerでWordPressサイトを開発してみよう
DockerでWordPressサイトを開発してみようDockerでWordPressサイトを開発してみよう
DockerでWordPressサイトを開発してみよう
 
The JVM is your friend
The JVM is your friendThe JVM is your friend
The JVM is your friend
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High Availability
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
Dockerイメージの理解とコンテナのライフサイクル
Dockerイメージの理解とコンテナのライフサイクルDockerイメージの理解とコンテナのライフサイクル
Dockerイメージの理解とコンテナのライフサイクル
 
MySQL5.7 GA の Multi-threaded slave
MySQL5.7 GA の Multi-threaded slaveMySQL5.7 GA の Multi-threaded slave
MySQL5.7 GA の Multi-threaded slave
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling Examples
 
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
PG-REXで学ぶPacemaker運用の実例
PG-REXで学ぶPacemaker運用の実例PG-REXで学ぶPacemaker運用の実例
PG-REXで学ぶPacemaker運用の実例
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
 
トランザクションの並行処理制御
トランザクションの並行処理制御トランザクションの並行処理制御
トランザクションの並行処理制御
 
[Cloud OnAir] GCP 上でストリーミングデータ処理基盤を構築してみよう! 2018年9月13日 放送
[Cloud OnAir] GCP 上でストリーミングデータ処理基盤を構築してみよう! 2018年9月13日 放送[Cloud OnAir] GCP 上でストリーミングデータ処理基盤を構築してみよう! 2018年9月13日 放送
[Cloud OnAir] GCP 上でストリーミングデータ処理基盤を構築してみよう! 2018年9月13日 放送
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
 

En vedette

Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...
Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...
Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...
Igalia
 

En vedette (20)

What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis Labs
 
Dynomite @ Redis Conference 2016
Dynomite @ Redis Conference 2016Dynomite @ Redis Conference 2016
Dynomite @ Redis Conference 2016
 
Troubleshooting Redis- DaeMyung Kang, Kakao
Troubleshooting Redis- DaeMyung Kang, KakaoTroubleshooting Redis- DaeMyung Kang, Kakao
Troubleshooting Redis- DaeMyung Kang, Kakao
 
Leveraging Probabilistic Data Structures for Real Time Analytics with Redis M...
Leveraging Probabilistic Data Structures for Real Time Analytics with Redis M...Leveraging Probabilistic Data Structures for Real Time Analytics with Redis M...
Leveraging Probabilistic Data Structures for Real Time Analytics with Redis M...
 
Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team
Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops TeamManaging 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team
Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team
 
Highly scalable caching service on cloud - Redis
Highly scalable caching service on cloud - RedisHighly scalable caching service on cloud - Redis
Highly scalable caching service on cloud - Redis
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
Salvatore Sanfilippo – How Redis Cluster works, and why - NoSQL matters Barce...
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
WAN Replication: Hazelcast Enterprise Lightning Talk
WAN Replication: Hazelcast Enterprise Lightning TalkWAN Replication: Hazelcast Enterprise Lightning Talk
WAN Replication: Hazelcast Enterprise Lightning Talk
 
Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...
Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...
Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...
 
Redis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsRedis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale Apps
 
Getting Started with Redis
Getting Started with RedisGetting Started with Redis
Getting Started with Redis
 
This is redis - feature and usecase
This is redis - feature and usecaseThis is redis - feature and usecase
This is redis - feature and usecase
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4
 
Honest performance testing with NDBench
Honest performance testing with NDBenchHonest performance testing with NDBench
Honest performance testing with NDBench
 
Get more than a cache back! The Microsoft Azure Redis Cache (NDC Oslo)
Get more than a cache back! The Microsoft Azure Redis Cache (NDC Oslo)Get more than a cache back! The Microsoft Azure Redis Cache (NDC Oslo)
Get more than a cache back! The Microsoft Azure Redis Cache (NDC Oslo)
 
Redis Developers Day 2015 - Secondary Indexes and State of Lua
Redis Developers Day 2015 - Secondary Indexes and State of LuaRedis Developers Day 2015 - Secondary Indexes and State of Lua
Redis Developers Day 2015 - Secondary Indexes and State of Lua
 
Getting Started with Redis
Getting Started with RedisGetting Started with Redis
Getting Started with Redis
 
Use Redis in Odd and Unusual Ways
Use Redis in Odd and Unusual WaysUse Redis in Odd and Unusual Ways
Use Redis in Odd and Unusual Ways
 

Similaire à Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis Papapanagiotou, Shailesh Birari, Jason Cacciatore, Netflix

AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
Omid Vahdaty
 

Similaire à Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis Papapanagiotou, Shailesh Birari, Jason Cacciatore, Netflix (20)

Dynomite - PerconaLive 2017
Dynomite  - PerconaLive 2017Dynomite  - PerconaLive 2017
Dynomite - PerconaLive 2017
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan Lambright
 
Dynomite @ RedisConf 2017
Dynomite @ RedisConf 2017Dynomite @ RedisConf 2017
Dynomite @ RedisConf 2017
 
RedisConf17 - Dynomite - Making Non-distributed Databases Distributed
RedisConf17 - Dynomite - Making Non-distributed Databases DistributedRedisConf17 - Dynomite - Making Non-distributed Databases Distributed
RedisConf17 - Dynomite - Making Non-distributed Databases Distributed
 
Data has a better idea the in-memory data grid
Data has a better idea   the in-memory data gridData has a better idea   the in-memory data grid
Data has a better idea the in-memory data grid
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Big data nyu
Big data nyuBig data nyu
Big data nyu
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
 
Cloud arch patterns
Cloud arch patternsCloud arch patterns
Cloud arch patterns
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
AWS Database Services
AWS Database ServicesAWS Database Services
AWS Database Services
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
 
Security sizing meetup
Security sizing meetupSecurity sizing meetup
Security sizing meetup
 
Effectively deploying hadoop to the cloud
Effectively  deploying hadoop to the cloudEffectively  deploying hadoop to the cloud
Effectively deploying hadoop to the cloud
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
 
EQUNIX - PPT 11DB-Postgres™.pdf
EQUNIX - PPT 11DB-Postgres™.pdfEQUNIX - PPT 11DB-Postgres™.pdf
EQUNIX - PPT 11DB-Postgres™.pdf
 
Megastore by Google
Megastore by GoogleMegastore by Google
Megastore by Google
 
Druid Summit 2023 : Changing Druid Ingestion from 3 hours to 5 minutes
Druid Summit 2023 : Changing Druid Ingestion from 3 hours to 5 minutesDruid Summit 2023 : Changing Druid Ingestion from 3 hours to 5 minutes
Druid Summit 2023 : Changing Druid Ingestion from 3 hours to 5 minutes
 

Plus de Redis Labs

SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
Redis Labs
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Redis Labs
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 

Plus de Redis Labs (20)

Redis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redisRedis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redis
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
 
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
 
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
 
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of OracleRedis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
 
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
 
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
 
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
 
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
 
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
 
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
 
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
 
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Dernier (20)

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis Papapanagiotou, Shailesh Birari, Jason Cacciatore, Netflix

  • 1. Cloud Database Engineering Making Non-Distributed Databases, Distributed - Shailesh Birari - Ioannis Papapanagiotou, PhD
  • 2. Dynomite Ecosystem ● Dynomite ● Dynomite-manager ● Dyno client
  • 3. Cloud Database Engg (CDE) Team ● Develop and operate data stores in AWS - Cassandra, Dynomite, Elastic Search, RDS, S3 ● Ensure availability, scalability, durability and latency SLAs ● Database expertise, client libraries, tools and best practices
  • 4. ● Cassandra not a speed demon (reads) ● Needed a data store: o Scalable & highly available o High throughput, low latency o Active-active multi datacenter replication ● Usage of Redis increasing: o Netflix use case is active-active, highly available o Does not have bi-directional replication o Cannot withstand a Monkey attack Problems & Observations
  • 5. What is Dynomite? ● A framework that makes non-distributed data stores, distributed. o Can be used with many key-value storage engines like Redis, Memcached, LMDB, etc. o Focus: performance, cross-datacenter active-active replication and high availability o Features: node warmup (cold bootstrapping), tunable consistency, S3 backups/restores
  • 6. Dynomite @ Netflix ● Running around 1.5 years in PROD ● ~1000 customer facing nodes ● 1M OPS at peak ● Largest cluster: 6TB ● Quarterly upgrades in PROD
  • 7. Dynomite Overview ● Layer on top of a non-distributed key value data store ○ Peer-peer, Shared Nothing ○ Auto Sharding ○ Multi-datacenter ○ Linear scale ○ Replication(Encrypted) ○ Gossiping
  • 8. ● Each rack contains one copy of data, partitioned across multiple nodes in that rack ● Multiple Racks == Higher Availability (HA) Topology
  • 9. Replication ● A client can connect to any node on the Dynomite cluster when sending requests. o If node owns the data, ▪ data are written in local data-store and asynchronously replicated. o If node does not own the data ▪ node acts as a coordinator and sends the data in the same rack & replicates to other nodes in other racks and DC.
  • 10. The Dynomite Ecosystem RESP = Redis Serialization Protocol
  • 11. Consistency ● DC_ONE o Reads and writes are propagated synchronously only to the node in local rack and asynchronously replicated to other racks and data centers ● DC_QUORUM o Reads and writes are propagated synchronously to quorum number of nodes in the local region and asynchronously to the rest ● Consistency can be configured dynamically for read or write operations separately (cluster-wide)
  • 12. Performance Setup ● Instance Type: ○ Dynomite: r3.2xlarge (1Gbps) ○ Pappy/Dyno: m2.2xls (typical of an app@Netflix) ● Replication factor: 3 ○ Deployed Dynomite in 3 zones in us-east-1 ○ Every zone had the same number of servers ● Demo app used simple workloads key/value pairs ○ Redis: GET and SET ● Payload ○ Size: 1024 Bytes ○ 80%/20% reads over writes
  • 13. Performance (Dynomite Speed) ● Throughput scales linearly with number of nodes. ● Dynomite can reach >1Million Client requests with ~24 nodes.
  • 14. Performance (Latency - average/P50) ● Dynomite’s latency on average is 0.16ms. ● Client side latency is 0.6ms and does not increase as the cluster scales up/down
  • 15. Performance (Latency - P99) ● The major contributor to latency at P99 is the network. ● Dynomite affects <10%
  • 16. Dynomite-manager ● Token management for multi-region deployments ● Support AWS environment ● Automated security group update in multi-region environment ● Monitoring of Dynomite and the underlying storage engine ● Node cold bootstrap (warm up) ● S3 backups and restores ● REST API
  • 17. Dynomite-manager: warm up 1. Dynomite-manager identifies which node has the same token in the same DC 2. Sets Redis to “Slave” mode of that node 3. Checks for peer syncing a. difference between master and slave offset 4. Once master and slave are in sync, Dynomite is set to allow write only 5. Dynomite is set back to normal state 6. Checks for health of the node - Done!
  • 18. Warm up (node terminated)
  • 20. Warm up (node with same token)
  • 21. Warm up (Redis replication)
  • 23. Warm up (Nodes in sync)
  • 24. Dynomite: S3 backups/restores ● Why? o Disaster recovery o Data corruption ● How? o Redis dumps data on the instance drive o Dynomite-manager sends data to S3 buckets ● Data per node are not large so no need for incrementals. ● Use case: o clusters that use Dynomite as a storage layer o Not enabled in clusters that have short TTL or use Dynomite as a cache
  • 25. Dynomite S3 backups (operation) 1. Perform backup a. Dynomite-manager performs it on a pre-defined interval b. Dynomite-manger REST call: i. curl http://localhost:8080/REST/v1/admin/s3backup 2. Perform a Redis BGREWRITEAOF or BGSAVE. a. Check the size of the persisted file. If the size is zero, which means that there was an issue with Redis or no data are there, then we do not perform S3 backups 3. S3 backup key: backup/region/clustername-ASG/token/date
  • 26. Dynomite S3 restores 1. Perform restore: a. Dynomite-manager performs once it starts if configuration is enabled b. Dynomite-manger REST call: i. curl http://localhost:8080/REST/v1/admin/s3backup 2. Stop Dynomite process: a. We perform this to notify Discovery that Dynomite is not accessible b. Stop Redis process 3. Restore the data from a specific date a. provided in the configuration 4. Start Redis process and check if the data has been loaded. 5. Start Dynomite and check if process is up
  • 27. Dyno Client - Java API ● Connection Pooling ● Load Balancing ● Effective failover ● Pipelining ● Scatter/Gather ● Metrics, e.g. Netflix Insights
  • 28. Dyno Load Balancing ● Dyno client employs token aware load balancing. ● Dyno client is aware of the cluster topology of Dynomite within the region, can write to specific node using consistent hashing.
  • 29. Dyno Failover ● Dyno will route requests to different racks in failure scenarios.
  • 30. Roadmap ● Multi-threaded support for Dynomite ● Data reconciliation & repair v2 ● Dynomite-spark connector ● Investigation for persistent stores ● Async Dyno Client ● Others….
  • 31. More information ● Netflix OSS: o https://github.com/Netflix/dynomite o https://github.com/Netflix/dyno ● Contact: o email: {sbirari, ipapapanagiotou}@netflix.com