SlideShare une entreprise Scribd logo
1  sur  15
Télécharger pour lire hors ligne
Cassandra
diegopacheco
@diego_pacheco
Diego Pacheco
http://cassandra.apache.org/
Why Apache Cassandra?
❏ Open Source
❏ Written in Java
❏ Scalability & High Availability
❏ Fault Tolerance
❏ Replication across multiple datacenters
❏ Async Masterless Replication
❏ No Single Point of Failure
❏ Based on Amazon Dynamo paper
❏ Created by Facebook, open sourced to apache in 2008
Battle Tested by
http://planetcassandra.org/apache-cassandra-use-cases/
Benchmark: 1 million writes per second
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
CAP: Consistency VS Availability
Cluster
Murmur3Partitioner
❏ Murmur3Partitioner
❏ Default
❏ 3-5x faster than RandomPartitioner
❏ Based on Tokens hash values
❏ Uniform
❏ RandomPartitioner
❏ Uniform
❏ MD5 hash
❏ ByteOrderedPartitioner
❏ Lexically ordered by key bytes
❏ Ordered partition
❏ Not Recommended:
❏ Difficult LB, Hot Spots, Uneven LB multiple tables.
Replication
❏ Concepts:
❏ Virtual Nodes: Data ownership to machines
❏ Partitioner: Partitions data on the cluster
❏ Replication Strategy: Determine Replicas for each row of data
❏ Snitch: Topology, information about replicas and strategy.
❏ Client writes to any node
❏ Node coordinates with replicas
❏ Replication happens in parallel
❏ Replication Factor = How many nodes with same data? I.E. 3.
❏ SimpleStrategy VS NetworkTopologyStrategy
❏ Design: Nodes, Racks, Data Centers great for Cloud Computing!
Replication
Consistency
❏ Tunable Consistency: Reads and Writes:
❏ Consistency VS Availability Trade Offs:
❏ ONE, TWO, THREE
❏ QUORUM(majority = N /2 + 1) - LOCAL_QUORUM(majority local dc)
❏ EACH_QUORUM (majority all dcs)
❏ LOCAL_ONE
❏ ALL
❏ ANY (Just for writes)
❏ Disaster Recovery scenarios:
❏ SERIAL
❏ LOCAL_SERIAL
Reads and Index
❏ Partition key Cache
❏ Off Heap
❏ Configurable
❏ Row Cache
❏ Off Heap
❏ Configurable
❏ Secondary Index
❏ Filter data on table by non-primary key
❏ ALLOW FILTERING - Could be problematic
❏ Cassandra 3.4 - SASI Secondary Index
❏ Better Performance
❏ In memory mapped B+ tree
❏ Can't use with collections
Storage
❏ Log-Structured Merge Tree(don't use B-TREE)
❏ Avoid Read before write
❏ Flavors Latency
❏ Cass Groups Insert/Updates in memory. Periodically SYNC to disk(sequential append).
❏ Immutable Data
❏ Check before write? Use Lightweight Transactions.
❏ Writes: Commit Log -> Memtable -> Flush -> Disk SSTABLE
❏ All writes are versioned
❏ No Delete: Tombstones
❏ Reads: Bloom filter
❏ Off Heap structure for SSTable
Java Driver
❏ Specific to Cassandra
❏ Prepared Statements
❏ Connection Pooling
❏ Reconnection Policies
❏ Load Balancing Policies
❏ Retry Policies
❏ Async Netty
❏ Native Protocol
Cassandra
diegopacheco
@diego_pacheco
Diego Pacheco

Contenu connexe

Tendances

HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProRedis Labs
 
Antoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudAntoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudShapeBlue
 
Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13Dave Gardner
 
5 Hidden Performance Problems for ASP.NET
5 Hidden Performance Problems for ASP.NET5 Hidden Performance Problems for ASP.NET
5 Hidden Performance Problems for ASP.NETMatt Watson
 
Virtual training Intro to the Tick stack and InfluxEnterprise
Virtual training  Intro to the Tick stack and InfluxEnterpriseVirtual training  Intro to the Tick stack and InfluxEnterprise
Virtual training Intro to the Tick stack and InfluxEnterpriseInfluxData
 
How to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCM
How to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCMHow to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCM
How to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCMAnant Corporation
 
ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...
ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...
ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...DynamicInfraDays
 
Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudRevolution Analytics
 
Boyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceBoyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceShapeBlue
 
Wido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStackWido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStackShapeBlue
 
K8s in 2hours
K8s in 2hoursK8s in 2hours
K8s in 2hoursDEV Cafe
 
AR/VR The Next Big Thing
AR/VR  The Next Big ThingAR/VR  The Next Big Thing
AR/VR The Next Big ThingDiego Pacheco
 
Ceph with CloudStack
Ceph with CloudStackCeph with CloudStack
Ceph with CloudStackShapeBlue
 
Дмитро Волошин "High[Page]load"
Дмитро Волошин "High[Page]load"Дмитро Волошин "High[Page]load"
Дмитро Волошин "High[Page]load"Fwdays
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-PatternsMatthew Dennis
 
CGSpace technical overview
CGSpace technical overviewCGSpace technical overview
CGSpace technical overviewILRI
 
Node.Js: Basics Concepts and Introduction
Node.Js: Basics Concepts and Introduction Node.Js: Basics Concepts and Introduction
Node.Js: Basics Concepts and Introduction Kanika Gera
 
Building software defined clouds - Boyan Ivanov
Building software defined clouds - Boyan Ivanov  Building software defined clouds - Boyan Ivanov
Building software defined clouds - Boyan Ivanov ShapeBlue
 

Tendances (20)

HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoPro
 
Antoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudAntoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloud
 
To AWS with Ansible
To AWS with AnsibleTo AWS with Ansible
To AWS with Ansible
 
Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13
 
5 Hidden Performance Problems for ASP.NET
5 Hidden Performance Problems for ASP.NET5 Hidden Performance Problems for ASP.NET
5 Hidden Performance Problems for ASP.NET
 
Virtual training Intro to the Tick stack and InfluxEnterprise
Virtual training  Intro to the Tick stack and InfluxEnterpriseVirtual training  Intro to the Tick stack and InfluxEnterprise
Virtual training Intro to the Tick stack and InfluxEnterprise
 
How to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCM
How to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCMHow to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCM
How to Build a Multi-DC Cassandra Cluster in AWS with OpsCenter LCM
 
ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...
ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...
ContainerDays NYC 2016: "From Hello World to Real World: Building a Productio...
 
Cassandra On EC2
Cassandra On EC2Cassandra On EC2
Cassandra On EC2
 
Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the Cloud
 
Boyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceBoyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experience
 
Wido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStackWido den Hollander - building highly available cloud with Ceph and CloudStack
Wido den Hollander - building highly available cloud with Ceph and CloudStack
 
K8s in 2hours
K8s in 2hoursK8s in 2hours
K8s in 2hours
 
AR/VR The Next Big Thing
AR/VR  The Next Big ThingAR/VR  The Next Big Thing
AR/VR The Next Big Thing
 
Ceph with CloudStack
Ceph with CloudStackCeph with CloudStack
Ceph with CloudStack
 
Дмитро Волошин "High[Page]load"
Дмитро Волошин "High[Page]load"Дмитро Волошин "High[Page]load"
Дмитро Волошин "High[Page]load"
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-Patterns
 
CGSpace technical overview
CGSpace technical overviewCGSpace technical overview
CGSpace technical overview
 
Node.Js: Basics Concepts and Introduction
Node.Js: Basics Concepts and Introduction Node.Js: Basics Concepts and Introduction
Node.Js: Basics Concepts and Introduction
 
Building software defined clouds - Boyan Ivanov
Building software defined clouds - Boyan Ivanov  Building software defined clouds - Boyan Ivanov
Building software defined clouds - Boyan Ivanov
 

En vedette

Spring framework 2.0 pt_BR
Spring framework 2.0 pt_BRSpring framework 2.0 pt_BR
Spring framework 2.0 pt_BRDiego Pacheco
 
TI na ERA DEVOPS
TI na ERA DEVOPSTI na ERA DEVOPS
TI na ERA DEVOPSilegra
 
Stream Processing with Kafka and Samza
Stream Processing with Kafka and SamzaStream Processing with Kafka and Samza
Stream Processing with Kafka and SamzaDiego Pacheco
 
Spring framework 2.5
Spring framework 2.5Spring framework 2.5
Spring framework 2.5Diego Pacheco
 
(ARC312) Processing Money in the Cloud | AWS re:Invent 2014
(ARC312) Processing Money in the Cloud | AWS re:Invent 2014(ARC312) Processing Money in the Cloud | AWS re:Invent 2014
(ARC312) Processing Money in the Cloud | AWS re:Invent 2014Amazon Web Services
 
Microservices reativos usando a stack do Netflix na AWS
Microservices reativos usando a stack do Netflix na AWSMicroservices reativos usando a stack do Netflix na AWS
Microservices reativos usando a stack do Netflix na AWSDiego Pacheco
 
DevOps - Estado da Arte
DevOps - Estado da ArteDevOps - Estado da Arte
DevOps - Estado da Arteilegra
 
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...Diego Pacheco
 
Lean/Agile/DevOps 2016 part 3
Lean/Agile/DevOps 2016 part 3Lean/Agile/DevOps 2016 part 3
Lean/Agile/DevOps 2016 part 3Diego Pacheco
 
Apache Cassandra - part 2
Apache Cassandra - part 2Apache Cassandra - part 2
Apache Cassandra - part 2Diego Pacheco
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and ScalableAmazon Web Services
 

En vedette (15)

Spring framework 2.0 pt_BR
Spring framework 2.0 pt_BRSpring framework 2.0 pt_BR
Spring framework 2.0 pt_BR
 
TI na ERA DEVOPS
TI na ERA DEVOPSTI na ERA DEVOPS
TI na ERA DEVOPS
 
Stream Processing with Kafka and Samza
Stream Processing with Kafka and SamzaStream Processing with Kafka and Samza
Stream Processing with Kafka and Samza
 
Spring framework 2.5
Spring framework 2.5Spring framework 2.5
Spring framework 2.5
 
(ARC312) Processing Money in the Cloud | AWS re:Invent 2014
(ARC312) Processing Money in the Cloud | AWS re:Invent 2014(ARC312) Processing Money in the Cloud | AWS re:Invent 2014
(ARC312) Processing Money in the Cloud | AWS re:Invent 2014
 
Dev opsdaykeynote
Dev opsdaykeynoteDev opsdaykeynote
Dev opsdaykeynote
 
Microservices reativos usando a stack do Netflix na AWS
Microservices reativos usando a stack do Netflix na AWSMicroservices reativos usando a stack do Netflix na AWS
Microservices reativos usando a stack do Netflix na AWS
 
Elassandra
ElassandraElassandra
Elassandra
 
DevOps - Estado da Arte
DevOps - Estado da ArteDevOps - Estado da Arte
DevOps - Estado da Arte
 
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
 
Lean/Agile/DevOps 2016 part 3
Lean/Agile/DevOps 2016 part 3Lean/Agile/DevOps 2016 part 3
Lean/Agile/DevOps 2016 part 3
 
Apache Cassandra - part 2
Apache Cassandra - part 2Apache Cassandra - part 2
Apache Cassandra - part 2
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
(DAT303) Oracle on AWS and Amazon RDS: Secure, Fast, and Scalable
 
Culture
CultureCulture
Culture
 

Similaire à Cassandra

Basic stuff You Need to Know about Cassandra
Basic stuff You Need to Know about CassandraBasic stuff You Need to Know about Cassandra
Basic stuff You Need to Know about CassandraYu-Chang Ho
 
Experiences building a multi region cassandra operations orchestrator on aws
Experiences building a multi region cassandra operations orchestrator on awsExperiences building a multi region cassandra operations orchestrator on aws
Experiences building a multi region cassandra operations orchestrator on awsDiego Pacheco
 
Building the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for HadoopBuilding the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for HadoopAll Things Open
 
Edge performance with in memory nosql
Edge performance with in memory nosqlEdge performance with in memory nosql
Edge performance with in memory nosqlLiviu Costea
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache CassandraStu Hood
 
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...Data Con LA
 
MariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly AvailableMariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly AvailableMariaDB Corporation
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 
Cassandra from tarball to production
Cassandra   from tarball to productionCassandra   from tarball to production
Cassandra from tarball to productionRon Kuris
 
Amazon builder Library notes
Amazon builder Library notesAmazon builder Library notes
Amazon builder Library notesDiego Pacheco
 
Stream or segment : what is the best way to access your events in Pulsar_Neng
Stream or segment : what is the best way to access your events in Pulsar_NengStream or segment : what is the best way to access your events in Pulsar_Neng
Stream or segment : what is the best way to access your events in Pulsar_NengStreamNative
 
Cloud-Native DevOps Engineering
Cloud-Native DevOps EngineeringCloud-Native DevOps Engineering
Cloud-Native DevOps EngineeringDiego Pacheco
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingSergey Bushik
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big DataDataStax Academy
 

Similaire à Cassandra (20)

Basic stuff You Need to Know about Cassandra
Basic stuff You Need to Know about CassandraBasic stuff You Need to Know about Cassandra
Basic stuff You Need to Know about Cassandra
 
Experiences building a multi region cassandra operations orchestrator on aws
Experiences building a multi region cassandra operations orchestrator on awsExperiences building a multi region cassandra operations orchestrator on aws
Experiences building a multi region cassandra operations orchestrator on aws
 
Building the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for HadoopBuilding the Right Platform Architecture for Hadoop
Building the Right Platform Architecture for Hadoop
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
ES & Kafka
ES & KafkaES & Kafka
ES & Kafka
 
Cassandra Redis
Cassandra RedisCassandra Redis
Cassandra Redis
 
Edge performance with in memory nosql
Edge performance with in memory nosqlEdge performance with in memory nosql
Edge performance with in memory nosql
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
 
MariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly AvailableMariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly Available
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
Cassandra from tarball to production
Cassandra   from tarball to productionCassandra   from tarball to production
Cassandra from tarball to production
 
NoSQL
NoSQLNoSQL
NoSQL
 
Amazon builder Library notes
Amazon builder Library notesAmazon builder Library notes
Amazon builder Library notes
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
 
Stream or segment : what is the best way to access your events in Pulsar_Neng
Stream or segment : what is the best way to access your events in Pulsar_NengStream or segment : what is the best way to access your events in Pulsar_Neng
Stream or segment : what is the best way to access your events in Pulsar_Neng
 
Cloud-Native DevOps Engineering
Cloud-Native DevOps EngineeringCloud-Native DevOps Engineering
Cloud-Native DevOps Engineering
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for Benchmarking
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
 

Plus de Diego Pacheco

Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!Diego Pacheco
 
Continuous Discovery Habits Book Review.pdf
Continuous Discovery Habits  Book Review.pdfContinuous Discovery Habits  Book Review.pdf
Continuous Discovery Habits Book Review.pdfDiego Pacheco
 
Thoughts about Shape Up
Thoughts about Shape UpThoughts about Shape Up
Thoughts about Shape UpDiego Pacheco
 
Encryption Deep Dive
Encryption Deep DiveEncryption Deep Dive
Encryption Deep DiveDiego Pacheco
 
Management: Doing the non-obvious! III
Management: Doing the non-obvious! IIIManagement: Doing the non-obvious! III
Management: Doing the non-obvious! IIIDiego Pacheco
 
Design is not Subjective
Design is not SubjectiveDesign is not Subjective
Design is not SubjectiveDiego Pacheco
 
Architecture & Engineering : Doing the non-obvious!
Architecture & Engineering :  Doing the non-obvious!Architecture & Engineering :  Doing the non-obvious!
Architecture & Engineering : Doing the non-obvious!Diego Pacheco
 
Management doing the non-obvious II
Management doing the non-obvious II Management doing the non-obvious II
Management doing the non-obvious II Diego Pacheco
 
Testing in production
Testing in productionTesting in production
Testing in productionDiego Pacheco
 
Nine lies about work
Nine lies about workNine lies about work
Nine lies about workDiego Pacheco
 
Management: doing the nonobvious!
Management: doing the nonobvious!Management: doing the nonobvious!
Management: doing the nonobvious!Diego Pacheco
 
Dealing with dependencies
Dealing  with dependenciesDealing  with dependencies
Dealing with dependenciesDiego Pacheco
 
Dealing with dependencies in tests
Dealing  with dependencies in testsDealing  with dependencies in tests
Dealing with dependencies in testsDiego Pacheco
 

Plus de Diego Pacheco (20)

Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!
 
Continuous Discovery Habits Book Review.pdf
Continuous Discovery Habits  Book Review.pdfContinuous Discovery Habits  Book Review.pdf
Continuous Discovery Habits Book Review.pdf
 
Thoughts about Shape Up
Thoughts about Shape UpThoughts about Shape Up
Thoughts about Shape Up
 
Holacracy
HolacracyHolacracy
Holacracy
 
AWS IAM
AWS IAMAWS IAM
AWS IAM
 
CDKs
CDKsCDKs
CDKs
 
Encryption Deep Dive
Encryption Deep DiveEncryption Deep Dive
Encryption Deep Dive
 
Sec 101
Sec 101Sec 101
Sec 101
 
Reflections on SCM
Reflections on SCMReflections on SCM
Reflections on SCM
 
Management: Doing the non-obvious! III
Management: Doing the non-obvious! IIIManagement: Doing the non-obvious! III
Management: Doing the non-obvious! III
 
Design is not Subjective
Design is not SubjectiveDesign is not Subjective
Design is not Subjective
 
Architecture & Engineering : Doing the non-obvious!
Architecture & Engineering :  Doing the non-obvious!Architecture & Engineering :  Doing the non-obvious!
Architecture & Engineering : Doing the non-obvious!
 
Management doing the non-obvious II
Management doing the non-obvious II Management doing the non-obvious II
Management doing the non-obvious II
 
Testing in production
Testing in productionTesting in production
Testing in production
 
Nine lies about work
Nine lies about workNine lies about work
Nine lies about work
 
Management: doing the nonobvious!
Management: doing the nonobvious!Management: doing the nonobvious!
Management: doing the nonobvious!
 
AI and the Future
AI and the FutureAI and the Future
AI and the Future
 
Dealing with dependencies
Dealing  with dependenciesDealing  with dependencies
Dealing with dependencies
 
Dealing with dependencies in tests
Dealing  with dependencies in testsDealing  with dependencies in tests
Dealing with dependencies in tests
 
Kanban 2020
Kanban 2020Kanban 2020
Kanban 2020
 

Dernier

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Cassandra

  • 3. Why Apache Cassandra? ❏ Open Source ❏ Written in Java ❏ Scalability & High Availability ❏ Fault Tolerance ❏ Replication across multiple datacenters ❏ Async Masterless Replication ❏ No Single Point of Failure ❏ Based on Amazon Dynamo paper ❏ Created by Facebook, open sourced to apache in 2008
  • 5. Benchmark: 1 million writes per second http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
  • 6. CAP: Consistency VS Availability
  • 8. Murmur3Partitioner ❏ Murmur3Partitioner ❏ Default ❏ 3-5x faster than RandomPartitioner ❏ Based on Tokens hash values ❏ Uniform ❏ RandomPartitioner ❏ Uniform ❏ MD5 hash ❏ ByteOrderedPartitioner ❏ Lexically ordered by key bytes ❏ Ordered partition ❏ Not Recommended: ❏ Difficult LB, Hot Spots, Uneven LB multiple tables.
  • 9. Replication ❏ Concepts: ❏ Virtual Nodes: Data ownership to machines ❏ Partitioner: Partitions data on the cluster ❏ Replication Strategy: Determine Replicas for each row of data ❏ Snitch: Topology, information about replicas and strategy. ❏ Client writes to any node ❏ Node coordinates with replicas ❏ Replication happens in parallel ❏ Replication Factor = How many nodes with same data? I.E. 3. ❏ SimpleStrategy VS NetworkTopologyStrategy ❏ Design: Nodes, Racks, Data Centers great for Cloud Computing!
  • 11. Consistency ❏ Tunable Consistency: Reads and Writes: ❏ Consistency VS Availability Trade Offs: ❏ ONE, TWO, THREE ❏ QUORUM(majority = N /2 + 1) - LOCAL_QUORUM(majority local dc) ❏ EACH_QUORUM (majority all dcs) ❏ LOCAL_ONE ❏ ALL ❏ ANY (Just for writes) ❏ Disaster Recovery scenarios: ❏ SERIAL ❏ LOCAL_SERIAL
  • 12. Reads and Index ❏ Partition key Cache ❏ Off Heap ❏ Configurable ❏ Row Cache ❏ Off Heap ❏ Configurable ❏ Secondary Index ❏ Filter data on table by non-primary key ❏ ALLOW FILTERING - Could be problematic ❏ Cassandra 3.4 - SASI Secondary Index ❏ Better Performance ❏ In memory mapped B+ tree ❏ Can't use with collections
  • 13. Storage ❏ Log-Structured Merge Tree(don't use B-TREE) ❏ Avoid Read before write ❏ Flavors Latency ❏ Cass Groups Insert/Updates in memory. Periodically SYNC to disk(sequential append). ❏ Immutable Data ❏ Check before write? Use Lightweight Transactions. ❏ Writes: Commit Log -> Memtable -> Flush -> Disk SSTABLE ❏ All writes are versioned ❏ No Delete: Tombstones ❏ Reads: Bloom filter ❏ Off Heap structure for SSTable
  • 14. Java Driver ❏ Specific to Cassandra ❏ Prepared Statements ❏ Connection Pooling ❏ Reconnection Policies ❏ Load Balancing Policies ❏ Retry Policies ❏ Async Netty ❏ Native Protocol