Running Cassandra in the Cloud:
An Introduction to Priam
Jason Brown

@jasobrown jasedbrown@gmail.com
www.linkedin.com/in/jasedbrown
About me
● Senior Software Engineer, Netflix
● Apache Cassandra committer

● E-Commerce Architect, Major League
  Baseball Advanced Media
● Wireless developer (J2ME and BREW)
Netflix Databases
● Oracle in the datacenter
● Migrate to EC2
  ○ SimpleDB at first
  ○ Cassandra
Cassandra meet EC2
● shell script(s)
● python scripts
  ● backup / restore
  ● centralized model
  ● installing Python 2.7 broke CentOS yum
  ● first time we ran it in prod, my cluster was
    destroyed
Hello, Priam!
Priam, the father of Cassandra
  (http://en.wikipedia.org/wiki/Priam)



Java web app
● Token Assignment
● Backup / Restore
● Multi-region support
● Configuration management
Branches
each priam branch corresponds to a c* version
● priam 1.1 -> c* 1.1
● priam master -> c* 1.2
● ??? -> c* trunk
Token Assignment
● Cassandra needs an assigned token
● Priam tries to
  ○ replace a dead instance
  ○ join as a new node


● External storage for known cluster members
  ○ host name/IP addr/instance id
  ○ token
  ○ region/availability zone
Replacing a dead node
● Get known nodes in region/AZ from storage
  ○ {A, B, C}
● Get live nodes in region/AZ from ASG api
  ○ {A, B}
● Take over a dead node's token
  ○ C


● uses c*'s replace_token
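
A minimal Java sketch of the comparison above: known nodes minus live nodes gives the dead node whose token can be taken over. The class and method names are hypothetical, and in-memory collections stand in for the external storage and the ASG API; this is not Priam's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: pick the token of a known-but-not-live instance.
class DeadNodeSketch {

    // knownTokens: instance id -> token, as registered in external storage.
    // liveInstances: instance ids currently reported by the ASG.
    // Returns the token of the first known instance that is no longer live.
    static Optional<String> tokenToTakeOver(Map<String, String> knownTokens,
                                            Set<String> liveInstances) {
        for (Map.Entry<String, String> entry : knownTokens.entrySet()) {
            if (!liveInstances.contains(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        // No dead node found: the caller would join as a new node instead.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> known = new LinkedHashMap<>();
        known.put("A", "0");
        known.put("B", "10");
        known.put("C", "20");
        // C is known but not live, so its token is the candidate.
        System.out.println(tokenToTakeOver(known, Set.of("A", "B")));
    }
}
```

An empty result corresponds to the other branch on the Token Assignment slide: join as a new node.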
Joining as a new node
● Calculate token
  ○ per-region offset
  ○ determine 'slot' in region/AZ
  ○ derive token
Region hash offset
● Each region needs a different base offset
   ○ avoids token collisions


int hash = "us-east-1".hashCode();
Determining slot
A new node takes the next numbered slot in its AZ
- looks for other registered nodes in SimpleDB (sdb)
Node Slotting Layout
        +--------+--------+--------+
        | zone A | zone B | zone C |
        +--------+--------+--------+
        |    0   |    1   |    2   |
        +--------+--------+--------+
        |    3   |    4   |    5   |
        +--------+--------+--------+
        |    6   |    7   |    8   |
        +--------+--------+--------+
        |    9   |   10   |   11   |
        +--------+--------+--------+
                 (ascii art rocks)
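
One way to read the layout above: slots are numbered row by row across zones, so a zone's slots are its column index plus multiples of the zone count. The Java sketch below picks the next slot under that reading. The class, method, and parameter names are hypothetical, and a plain list stands in for the SimpleDB lookup.

```java
import java.util.List;

// Hypothetical sketch of slot selection for the row-major layout.
class SlotSketch {

    // zoneIndex: 0-based column of this node's AZ (zone A = 0, B = 1, C = 2).
    // zoneCount: number of AZs in the region.
    // takenSlots: slots already registered by other nodes in the region.
    static int nextSlot(int zoneIndex, int zoneCount, List<Integer> takenSlots) {
        int maxInZone = -1;
        for (int slot : takenSlots) {
            // A slot belongs to this zone when it falls in the zone's column.
            if (slot % zoneCount == zoneIndex && slot > maxInZone) {
                maxInZone = slot;
            }
        }
        // Empty zone: start at the column index; otherwise move one row down.
        return maxInZone < 0 ? zoneIndex : maxInZone + zoneCount;
    }

    public static void main(String[] args) {
        // Zones A, B, C with slots 0..5 taken; a new node joins zone B.
        System.out.println(nextSlot(1, 3, List.of(0, 1, 2, 3, 4, 5)));
    }
}
```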
Here's your token
MAXIMUM_TOKEN
 .divide(regionNodeCount)
 .multiply(mySlot)
 .add(regionHashOffset);

example:
100 / 10 (ten nodes in region)
  * 3 (in slot three)
  + 12 (region hash offset)
  = 42
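
The same calculation as a self-contained Java sketch, so the example numbers can be checked. The class and method names are hypothetical, and the value used for MAXIMUM_TOKEN (2^127, the top of the RandomPartitioner range) is an assumption, since the slide does not define it.

```java
import java.math.BigInteger;

// Hypothetical sketch of the token formula from the slide.
class TokenSketch {

    // Assumption: the token space tops out at 2^127.
    static final BigInteger MAXIMUM_TOKEN = BigInteger.ONE.shiftLeft(127);

    // Divide first, then multiply, in the same order as the slide.
    static BigInteger token(BigInteger maximumToken, int regionNodeCount,
                            int mySlot, int regionHashOffset) {
        return maximumToken
                .divide(BigInteger.valueOf(regionNodeCount))
                .multiply(BigInteger.valueOf(mySlot))
                .add(BigInteger.valueOf(regionHashOffset));
    }

    public static void main(String[] args) {
        // The slide's toy example: token space of 100, ten nodes, slot three, offset 12.
        System.out.println(token(BigInteger.valueOf(100), 10, 3, 12));
        // The same slot and offset against the assumed full token range.
        System.out.println(token(MAXIMUM_TOKEN, 10, 3, 12));
    }
}
```

Because the division is an integer division, every slot in a region is spaced by the same step and the region offset shifts the whole ring by a constant.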
Seeds
● first node in each AZ, in every region
● except if current node is in the first slot
   ○ seeds cannot auto bootstrap
Multi-region communication
AWS security groups block ingress requests

Intra-region: whitelist by other in-region SG

Inter-region: whitelist by IP address
   ○ must use public IP address!
Whitelisting IP address
● Seed nodes compare
  ○ current region's SG IP address
  ○ entries in SimpleDB database
● Add new nodes' IP addresses to SG
● Remove dead nodes from SG
                      ++
     us-east-1        ||          eu-west-1
    +-------------+   ||
    | SimpleDB    |   ||
    +-------------+   ||
                      ||
               +--+   ||   +--+
               |S |   ||   |S |
               |e |   ||   |e |
               |c |   ||   |c |
+----------+   |G |   ||   |G |      +----------+
| c* 1     |   |r |   ||   |r |      | c* 2     |
+----------+   |p |   ||   |p |      +----------+
               |  |   ||   |  |
               |1 |   ||   |2 |
               +--+   ||   +--+
                      ||
                      ++
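
The whitelisting comparison can be sketched in Java as two set differences between the IPs registered in SimpleDB and the IPs currently allowed by the security group. The class and method names are hypothetical, the addresses are documentation-range placeholders, and the actual AWS calls that would apply the changes are left out.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: work out which ingress rules to add and remove.
class WhitelistSketch {

    // IPs registered for the cluster but not yet allowed by the security group.
    static Set<String> toAdd(Set<String> registeredIps, Set<String> whitelistedIps) {
        Set<String> result = new TreeSet<>(registeredIps);
        result.removeAll(whitelistedIps);
        return result;
    }

    // IPs still allowed by the security group but no longer registered (dead nodes).
    static Set<String> toRemove(Set<String> registeredIps, Set<String> whitelistedIps) {
        Set<String> result = new TreeSet<>(whitelistedIps);
        result.removeAll(registeredIps);
        return result;
    }

    public static void main(String[] args) {
        Set<String> registered = Set.of("203.0.113.1", "203.0.113.2");
        Set<String> whitelisted = Set.of("203.0.113.2", "203.0.113.9");
        System.out.println("add:    " + toAdd(registered, whitelisted));
        System.out.println("remove: " + toRemove(registered, whitelisted));
    }
}
```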
Backup
Two types:
● Snapshot
  ○ invokes nodetool snapshot
  ○ once a day, cron-like
● Incremental
  ○ copy all newly flushed sstables
Backup location
Upload to S3 bucket in same region

Bucket lifecycle rules
● configure TTL for data
Backup path
Bucket: netflix-cassandra-data

Path:
base dir / region / cluster name / token / snapshot time /
 [SNP | SST | META] / keyspace / column family / data file

example:
test_backup/us-east-1/cass_jasobrown/42/1234567/
  SNP/jasobrown/dog/jasobrown-dog-ja-1-Data.db
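
A small Java sketch that assembles a path in the layout above and reproduces the example. The class, enum, and method names are hypothetical; only the ordering of the path components comes from the slide.

```java
// Hypothetical sketch of the backup path layout.
class BackupPathSketch {

    // The three file types named on the slide.
    enum FileType { SNP, SST, META }

    // Joins the components in the order shown on the slide.
    static String remotePath(String baseDir, String region, String clusterName,
                             String token, String snapshotTime, FileType type,
                             String keyspace, String columnFamily, String dataFile) {
        return String.join("/", baseDir, region, clusterName, token, snapshotTime,
                type.name(), keyspace, columnFamily, dataFile);
    }

    public static void main(String[] args) {
        System.out.println(remotePath("test_backup", "us-east-1", "cass_jasobrown",
                "42", "1234567", FileType.SNP, "jasobrown", "dog",
                "jasobrown-dog-ja-1-Data.db"));
    }
}
```

Keying the path by region, cluster name, and token is what lets a restored node find the files that belong to its own token range.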
Restore
● best with same size cluster as source
● best if tokens match with source

Uses (besides the obvious)
● prod to test refresh
● reproduce prod data problems
● incremental restore - WIP
Configuration Management
Control aspects of priam and c*
● yaml
● startup script(s) env values

Netflix needs this as we have ~55 production
clusters, with slightly different configs
So, does Netflix actually use Priam?
55 production clusters, > 750 nodes

Internal extensions
● Hook into internal DNS, properties systems
● Alternative storage to SimpleDB
● BI messaging integration - WIP
● C* JMX monitoring
Monitoring
● Poll C* every 60 seconds
● selected JMX metrics
● publish to internal metrics aggregator
  ○ currently uses Netflix's OSS Servo library
    (github.com/Netflix/servo)
Next directions
Commit log backups
DataStax Enterprise support
● security
● solr
● configuration

c* 1.2 virtual nodes (a/k/a vnodes)
auto scaling
Thank you!



             Q & A time



              @jasobrown
