Cassandra 
Pretty Cool
History 
Google Bigtable
Amazon Dynamo
Today
Why Should You Care 
● Horizontal Scaling (basically auto sharding) 
● Multiple Nodes - Highly Available 
● Really Fast Writes 
● Not too shabby at reads either - SLICES!! 
● Bright Future
The Cluster 
● replication factor (rf) 
● read consistency (r) 
● write consistency (w) 
● clustering - shard on partition key
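How rf, r and w interact can be sketched with a little arithmetic. This is a minimal illustration, assuming the usual rule that a read is guaranteed to see the latest write when r + w > rf; the function names are made up for this example and are not part of any Cassandra API.

```python
def quorum(rf: int) -> int:
    """Smallest number of replicas that forms a majority of rf."""
    return rf // 2 + 1


def reads_see_latest_write(rf: int, r: int, w: int) -> bool:
    """True when every set of r read replicas overlaps every set of w written replicas."""
    return r + w > rf


rf = 3
q = quorum(rf)  # 2 of 3 replicas
print(reads_see_latest_write(rf, r=q, w=q))  # quorum reads + quorum writes overlap
print(reads_see_latest_write(rf, r=1, w=1))  # one/one may hit a stale replica
```

Lower r and w buy latency and availability at the cost of possibly stale reads, which is the trade-off these three knobs expose.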
The One Ring
Storage - Vnodes
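A rough sketch of the ring idea, under loud assumptions: MD5 stands in for the real partitioner (Cassandra's default is Murmur3Partitioner), replica placement is a simple clockwise walk, and the `Ring` class and node names are hypothetical. It only shows how a partition key maps to a token and how vnodes give each node several positions on the ring.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (MD5 here, purely for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes_per_node=8):
        # Each physical node owns several tokens (vnodes) scattered around the ring.
        self._tokens = sorted(
            (token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._positions = [t for t, _ in self._tokens]

    def replicas(self, partition_key: str, rf: int):
        """Walk clockwise from the key's token, collecting rf distinct nodes."""
        start = bisect.bisect_left(self._positions, token(partition_key))
        found = []
        for offset in range(len(self._tokens)):
            _, node = self._tokens[(start + offset) % len(self._tokens)]
            if node not in found:
                found.append(node)
            if len(found) == rf:
                break
        return found


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("111", rf=3))
```

Because the token depends only on the partition key, any node can work out which replicas own a row without a central lookup.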
Data Model 
● Wide rows 
● Slice Queries
● Denormalization 
● Index tables
Data Model - Simple Key 
CREATE TABLE email_app.emails (
user_id text,
subject text,
to_add text,
cc text,
body text,
PRIMARY KEY (user_id));
ROW KEY: user_id
Data Model - Simple Inserts 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('999', 'wat', 'horse@b.com', 'giraffe@b.com', 'is going on?');
Data Model Simple Inserts Result 
SELECT * FROM email_app.emails;
row key | subject | to_add | cc       | body
111     | party   | cat@   | hippo@   | at my place
999     | wat     | horse@ | giraffe@ | is going on?
Mental Model - Nested Hash 
Row Keys: 111, 999
Columns (under each row key): subject, to, cc, body
Column Values: one value per column
Data Model - Simple Insert - Again 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
row key | subject | to_add | cc       | body
111     | party   | cat@   | hippo@   | at my place
999     | wat     | horse@ | giraffe@ | is going on?
IDEMPOTENT: repeating the insert leaves the table unchanged.
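The nested hash model and the idempotent insert can be sketched with a plain dictionary. This is a toy model, not how Cassandra stores data on disk; the `insert` helper is a hypothetical name for this example only.

```python
table = {}  # row key -> {column name -> value}


def insert(table, user_id, **columns):
    """Upsert: create the row if missing, otherwise overwrite the named columns."""
    table.setdefault(user_id, {}).update(columns)


row = dict(subject="party", to_add="cat@b.com", cc="hippo@b.com", body="at my place")
insert(table, "111", **row)
snapshot = {key: dict(cols) for key, cols in table.items()}

insert(table, "111", **row)   # same statement again
print(table == snapshot)      # compares the table before and after the repeat
```

An insert here is really "set these columns under this key", so replaying it (for example after a client timeout) does not create a duplicate row.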
Data Model - Composite Key 1 
CREATE TABLE email_app.emails ( 
user_id text, 
subject text, 
to_add text, 
cc text, 
body text, 
PRIMARY KEY (user_id, subject));
ROW KEY: user_id, CLUSTERING KEY: subject
Data Model - Composite Insert 1 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
Same as Before. 
Right???
Data Model Composite Insert Result 
SELECT * FROM emails WHERE user_id = '111';
row key | party|to_add | party|cc | party|body
111     | cat@         | hippo@   | at my place
Each cell name is prefixed with the subject ("party").
Mental Model - Nested Hash 
Row Key (user_id): 111
Clustering Column (subject): party
Columns (under the clustering value): to_add, cc, body
Column Values: one value per column
Data Model - Composite Insert 2 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'swim', 'cat@b.com', 'hippo@b.com', 'in the pool');
Composite Insert 2 Result 
SELECT * FROM emails WHERE user_id = '111';
row key | party|to_add | party|cc | party|body
111     | cat@         | hippo@   | at my place
        | swim|to_add  | swim|cc  | swim|body
        | cat@         | hippo@   | in the pool
Sorted by clustering column - "subject"
Mental Model - Nested Sorted Hash 
Row Key (user_id): 111
Clustering Column (subject): party -> Columns: to, cc, body
Clustering Column (subject): swim -> Columns: to, cc, body
Column Values: one value per column
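The nested sorted hash can be sketched like this: one dictionary keyed by row key, and inside each entry a list of clustering values kept in sorted order. The `Partition` class is a hypothetical teaching model, not Cassandra's storage engine.

```python
import bisect


class Partition:
    """One row key's data: clustering values kept sorted, each mapping to columns."""

    def __init__(self):
        self.keys = []    # sorted clustering values, e.g. subjects
        self.cells = {}   # clustering value -> {column name -> value}

    def upsert(self, clustering, **columns):
        if clustering not in self.cells:
            bisect.insort(self.keys, clustering)  # keep sort order on insert
            self.cells[clustering] = {}
        self.cells[clustering].update(columns)


table = {}  # row key (user_id) -> Partition
p = table.setdefault("111", Partition())
p.upsert("swim", to_add="cat@b.com", cc="hippo@b.com", body="in the pool")
p.upsert("party", to_add="cat@b.com", cc="hippo@b.com", body="at my place")

print(p.keys)  # subjects come back in sorted order, not insertion order
```

Paying for the ordering at write time is what makes range reads over the clustering column cheap later.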
Why sorted? 
SLICE QUERIES!!
SELECT * FROM emails WHERE user_id = '111'
AND (subject) >= ('s') AND (subject) < ('t');
row key | party|to_add | party|cc | party|body
111     | cat@         | hippo@   | at my place
        | swim|to_add  | swim|cc  | swim|body     <- the slice returned
        | cat@         | hippo@   | in the pool
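Why sorting makes slices cheap can be sketched with two binary searches over the sorted clustering values. The data mirrors the slide; `slice_range` is a hypothetical helper, and real Cassandra reads go through SSTables and indexes rather than a Python list.

```python
import bisect

# Clustering values (subjects) of partition '111', already sorted.
subjects = ["party", "swim"]
bodies = {"party": "at my place", "swim": "in the pool"}


def slice_range(sorted_keys, lower, upper):
    """Return keys k with lower <= k < upper using two binary searches."""
    start = bisect.bisect_left(sorted_keys, lower)
    end = bisect.bisect_left(sorted_keys, upper)
    return sorted_keys[start:end]


for subject in slice_range(subjects, "s", "t"):
    print(subject, "->", bodies[subject])
```

The matching entries sit next to each other, so the read is one contiguous scan instead of a filter over every row.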
DM - Compound Composite Key 
CREATE TABLE email_app.emails ( 
user_id text, 
subject text, 
to_add text, 
cc text, 
body text, 
PRIMARY KEY ((user_id, subject), to_add));
ROW KEY: (user_id, subject), CLUSTERING KEY: to_add
Composite / Compound Inserts 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'wat', 'horse@b.com', 'giraffe@b.com', 'is going on?');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
Composite Insert 2 Result 
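SELECT * FROM emails WHERE user_id = '111';
(not allowed as written: the row key is (user_id, subject), so both must be given)
SELECT * FROM emails WHERE user_id = '111'
AND subject = 'party';
row key   | cat@|cc | cat@|body
111:party | hippo@  | at my place
Each cell name is prefixed with to_add ("cat@").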
Data Model - Composite Insert 1 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'dog@b.com', 'hippo@b.com', 'all the time');
SELECT * FROM emails WHERE user_id = '111' AND
subject = 'party';
row key   | cat@|cc | cat@|body
111:party | hippo@  | at my place
          | dog@|cc | dog@|body
          | hippo@  | all the time
Sorting / slice on - "to_add"
DM - Compound Composite Key 2 
CREATE TABLE email_app.emails ( 
user_id text, 
subject text, 
to_add text, 
cc text, 
body text, 
PRIMARY KEY ((user_id, subject), to_add, cc));
ROW KEY: (user_id, subject), CLUSTERING KEYS: to_add, cc
Composite / Clustered Inserts 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'dog@b.com', 'hippo@b.com', 'all the time');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'At my place');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'mouse@b.com', 'At my place');
DM - Composite / Clustered Inserts 
SELECT * FROM emails WHERE user_id = '111' AND
subject = 'party';
row key   | cat@|hippo@|body | cat@|mouse@|body | dog@|hippo@|body
111|party | at my place      | at my place      | all the time
Slice on (to_add) OR (to_add, cc)
Mental Model - Nested Sorted Hash 
Row Key (user_id + subject): 111|party
Clustering Columns (to_add, then cc): cat -> hippo, mouse; dog -> hippo
Columns (under each to_add/cc pair): body
Column Values: one value per column
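The compound version of the mental model can be sketched with tuples: the row key becomes a (user_id, subject) pair and the clustering key becomes a (to_add, cc) pair, compared left to right. The `insert` helper and the list-of-pairs layout are assumptions made for this illustration only.

```python
import bisect

# (user_id, subject) -> sorted list of ((to_add, cc), body)
table = {}


def insert(user_id, subject, to_add, cc, body):
    rows = table.setdefault((user_id, subject), [])
    clustering = (to_add, cc)
    keys = [k for k, _ in rows]
    i = bisect.bisect_left(keys, clustering)
    if i < len(rows) and rows[i][0] == clustering:
        rows[i] = (clustering, body)        # same primary key: overwrite
    else:
        rows.insert(i, (clustering, body))  # new clustering value: keep sorted


insert("111", "party", "dog@b.com", "hippo@b.com", "all the time")
insert("111", "party", "cat@b.com", "hippo@b.com", "At my place")
insert("111", "party", "cat@b.com", "mouse@b.com", "At my place")

for (to_add, cc), body in table[("111", "party")]:
    print(to_add, cc, body)
```

Because tuples sort by their first element and then their second, a slice can fix to_add alone or to_add plus a range of cc, which matches the "(to_add) OR (to_add, cc)" note on the slide.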
Part 2 / 8 of this 7 hour talk 
● Denormalization 
● Index Column Families 
● Cassandra Internals (memtables, SSTables, compaction, repair)
Part 8 / 8: The Future 
● Continually improving 
● More and more adoption 
● Awesome projects 
● http://www.datastax.com/documentation/cassandra/2.0/pdf/cassandra20.pdf
● http://planetcassandra.org/

HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database
