SlideShare une entreprise Scribd logo
1  sur  15
Télécharger pour lire hors ligne
Introduction to Cassandra 
Jon Haddad, Technical Evangelist, DataStax 
Luke Tillman, Language Evangelist, DataStax 
@rustyrazorblade, @LukeTillman 
©2013 DataStax Confidential. Do not distribute without consent. 
1
High Level Architecture 
• Ring based replication 
• Only 1 type of server (cassandra) 
• All nodes hold data and can answer 
queries 
• No SPOF 
• Build for HA & Scalability 
• Multi-DC 
• Data is found by key (CQL) 
• Runs on JVM
Hash Ring 
• Key is hashed to a position on the 
ring 
• Data is replicated to RF=N servers 
• Using a snitch (build in feature) you 
can ensure replicas are located on 
different racks / AZ
CAP Tradeoffs 
! 
• Cassandra chooses Availability & 
Partition Tolerance over Consistency 
• Replication factor (RF=3) 
• Queries have tunable consistency level 
• ALL, QUORUM, ONE
The Write Path 
• Writes are written to any node in the cluster 
(coordinator) 
• Writes are written to commit log, then to 
meltable 
• Memtable flushed to disk periodically 
(sstable) 
• New memtable is created in memory 
• Deletes are actually a special write case, 
called a “tombstone”
What is an SSTable? 
• Immutable data file 
• Deletes are written as tombstones 
• Every write includes a timestamp of when it 
was written 
• Partition is spread across multiple SSTables 
• Same column can be in multiple SSTables 
• Merged through compaction, only latest 
timestamp is kept 
sstable 
sstable sstable sstable
The Read Path 
• Any server may be queried, it acts as the 
coordinator 
• Contacts nodes with the requested key 
• On each node, data is pulled from 
SSTables and merged 
• Consistency< ALL performs read repair 
in background (read_repair_chance)
Data Structures 
• Like an RDBMS, Cassandra uses a Table to 
store data 
• But there’s where the similarities end 
• Partitions within tables 
• Rows within partitions (or a single row) 
• CQL to create tables & query data 
• Partition keys determine where a partition 
is found 
• Clustering keys determine ordering of rows 
within a partition 
Keyspace 
Table 
Partition 
Row
Example: Single Row Partition 
• Simple User system 
• Identified by name (pk) 
• 1 Row per partition 
• This is familiar territory 
name age job 
jon 33 evangelist 
luke 33 evangelist 
old pete 108 retired 
s. seagal 62 actor 
JCVD 53 actor 
cqlsh:demo> select * from user WHERE name = 'JCVD' 
cqlsh:demo> create table user 
(name text primary key, 
age int, 
job text);
Example: Multiple Rows 
• Comments on photos 
• Comments are always selected by 
the photo_id 
• There are only 4 rows in 2 partitions 
• In the real world, use UUIDs instead 
of int for PK 
photo_id comment_id user comment 
5 1 jon hi 
5 2 luke oh hey 
5 3 JCVD AHHHHH!!! 
6 4 jon great pic 
select * from comment where photo_id=5 
create table comment 
( photo_id int, 
comment_id int, 
user text, 
comment text, 
primary key (photo_id, comment_id));
Partition with Clustering 
• Multiple rows are transposed into a single partition 
• Partitions vary in size 
• Old terminology - "wide row" 
photo_id comment_id name comment comment_id user comment comment_id user comment 
5 1 jon hi 2 luke oh hey 3 JCVD AHHHHH!!! 
6 4 jon great pic
CQL Data Types 
Basic Types Collections 
text uuid counter map 
int timeuuid list 
decimal set 
blob 
Read the CQL documentation for the full list of types
Pro Data Modeling 
• How do I query my data if I can only 
query by key? 
• Denormalize! 
• Create multiple views into your data 
(multiple tables) 
• Cassandra is built for fast writes 
• Use fast writes to do as few reads as 
possible
Let’s Download Cassandra! 
• http://planetcassandra.org/ 
• click downloads 
• use drop down to pick OS
©2013 DataStax Confidential. Do not distribute without consent. 15

Contenu connexe

Tendances

Tendances (20)

Денис Резник "Моя база данных не справляется с нагрузкой. Что делать?"
Денис Резник "Моя база данных не справляется с нагрузкой. Что делать?"Денис Резник "Моя база данных не справляется с нагрузкой. Что делать?"
Денис Резник "Моя база данных не справляется с нагрузкой. Что делать?"
 
Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHP
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
 
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
 
Real time fulltext search with sphinx
Real time fulltext search with sphinxReal time fulltext search with sphinx
Real time fulltext search with sphinx
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scale
 
Fluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, ScalableFluentd - Flexible, Stable, Scalable
Fluentd - Flexible, Stable, Scalable
 
Intro to Apache Solr
Intro to Apache SolrIntro to Apache Solr
Intro to Apache Solr
 
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
 
To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my...
To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my...To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my...
To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my...
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use cases
 
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff DavisDeep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
 
A Survey of Elasticsearch Usage
A Survey of Elasticsearch UsageA Survey of Elasticsearch Usage
A Survey of Elasticsearch Usage
 
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, AolMail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
Tips & Tricks SQL in the City Seattle 2014
Tips & Tricks SQL in the City Seattle 2014Tips & Tricks SQL in the City Seattle 2014
Tips & Tricks SQL in the City Seattle 2014
 
LuceneRDD for (Geospatial) Search and Entity Linkage
LuceneRDD for (Geospatial) Search and Entity LinkageLuceneRDD for (Geospatial) Search and Entity Linkage
LuceneRDD for (Geospatial) Search and Entity Linkage
 
Finite State Queries In Lucene
Finite State Queries In LuceneFinite State Queries In Lucene
Finite State Queries In Lucene
 

En vedette

En vedette (19)

Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Cassandra Day Denver 2014: Building Java Applications with Apache Cassandra
Cassandra Day Denver 2014: Building Java Applications with Apache CassandraCassandra Day Denver 2014: Building Java Applications with Apache Cassandra
Cassandra Day Denver 2014: Building Java Applications with Apache Cassandra
 
Cassandra Day Denver 2014: Transitioning to Cassandra for an Already Giant Pr...
Cassandra Day Denver 2014: Transitioning to Cassandra for an Already Giant Pr...Cassandra Day Denver 2014: Transitioning to Cassandra for an Already Giant Pr...
Cassandra Day Denver 2014: Transitioning to Cassandra for an Already Giant Pr...
 
Cassandra Day Denver 2014: Setting up a DataStax Enterprise Instance on Micro...
Cassandra Day Denver 2014: Setting up a DataStax Enterprise Instance on Micro...Cassandra Day Denver 2014: Setting up a DataStax Enterprise Instance on Micro...
Cassandra Day Denver 2014: Setting up a DataStax Enterprise Instance on Micro...
 
Cassandra Day Denver 2014: Python & Cassandra Best Friends
Cassandra Day Denver 2014: Python & Cassandra Best FriendsCassandra Day Denver 2014: Python & Cassandra Best Friends
Cassandra Day Denver 2014: Python & Cassandra Best Friends
 
Cassandra Day Denver 2014: Cassandra Anti-Pattern Jeopardy
Cassandra Day Denver 2014: Cassandra Anti-Pattern JeopardyCassandra Day Denver 2014: Cassandra Anti-Pattern Jeopardy
Cassandra Day Denver 2014: Cassandra Anti-Pattern Jeopardy
 
Cassandra Day Denver 2014: So, You Want to Use Cassandra?
Cassandra Day Denver 2014: So, You Want to Use Cassandra?Cassandra Day Denver 2014: So, You Want to Use Cassandra?
Cassandra Day Denver 2014: So, You Want to Use Cassandra?
 
Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014
 
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
 
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
 
Cassandra Day Denver 2014: A Cassandra Data Model for Serving up Cat Videos
Cassandra Day Denver 2014: A Cassandra Data Model for Serving up Cat VideosCassandra Day Denver 2014: A Cassandra Data Model for Serving up Cat Videos
Cassandra Day Denver 2014: A Cassandra Data Model for Serving up Cat Videos
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecture
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data model
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseries
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 

Similaire à Cassandra Day Denver 2014: Introduction to Apache Cassandra

Crash course intro to cassandra
Crash course   intro to cassandraCrash course   intro to cassandra
Crash course intro to cassandra
Jon Haddad
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
Boris Yen
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
Murat Çakal
 

Similaire à Cassandra Day Denver 2014: Introduction to Apache Cassandra (20)

Crash course intro to cassandra
Crash course   intro to cassandraCrash course   intro to cassandra
Crash course intro to cassandra
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Cassandra Day Chicago 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Chicago 2015: Introduction to Apache Cassandra & DataStax Enter...Cassandra Day Chicago 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Chicago 2015: Introduction to Apache Cassandra & DataStax Enter...
 
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Cassandra 101
Cassandra 101Cassandra 101
Cassandra 101
 
An Introduction to Cassandra - Oracle User Group
An Introduction to Cassandra - Oracle User GroupAn Introduction to Cassandra - Oracle User Group
An Introduction to Cassandra - Oracle User Group
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into Cassandra
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical data
 
Cassandra tech talk
Cassandra tech talkCassandra tech talk
Cassandra tech talk
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache Cassandra
 
What Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database ScalabilityWhat Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database Scalability
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra Model
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
 

Plus de DataStax Academy

Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 

Plus de DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
 

Dernier

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Dernier (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Cassandra Day Denver 2014: Introduction to Apache Cassandra

  • 1. Introduction to Cassandra Jon Haddad, Technical Evangelist, DataStax Luke Tillman, Language Evangelist, DataStax @rustyrazorblade, @LukeTillman ©2013 DataStax Confidential. Do not distribute without consent. 1
  • 2. High Level Architecture • Ring based replication • Only 1 type of server (cassandra) • All nodes hold data and can answer queries • No SPOF • Build for HA & Scalability • Multi-DC • Data is found by key (CQL) • Runs on JVM
  • 3. Hash Ring • Key is hashed to a position on the ring • Data is replicated to RF=N servers • Using a snitch (build in feature) you can ensure replicas are located on different racks / AZ
  • 4. CAP Tradeoffs ! • Cassandra chooses Availability & Partition Tolerance over Consistency • Replication factor (RF=3) • Queries have tunable consistency level • ALL, QUORUM, ONE
  • 5. The Write Path • Writes are written to any node in the cluster (coordinator) • Writes are written to commit log, then to meltable • Memtable flushed to disk periodically (sstable) • New memtable is created in memory • Deletes are actually a special write case, called a “tombstone”
  • 6. What is an SSTable? • Immutable data file • Deletes are written as tombstones • Every write includes a timestamp of when it was written • Partition is spread across multiple SSTables • Same column can be in multiple SSTables • Merged through compaction, only latest timestamp is kept sstable sstable sstable sstable
  • 7. The Read Path • Any server may be queried, it acts as the coordinator • Contacts nodes with the requested key • On each node, data is pulled from SSTables and merged • Consistency< ALL performs read repair in background (read_repair_chance)
  • 8. Data Structures • Like an RDBMS, Cassandra uses a Table to store data • But there’s where the similarities end • Partitions within tables • Rows within partitions (or a single row) • CQL to create tables & query data • Partition keys determine where a partition is found • Clustering keys determine ordering of rows within a partition Keyspace Table Partition Row
  • 9. Example: Single Row Partition • Simple User system • Identified by name (pk) • 1 Row per partition • This is familiar territory name age job jon 33 evangelist luke 33 evangelist old pete 108 retired s. seagal 62 actor JCVD 53 actor cqlsh:demo> select * from user WHERE name = 'JCVD' cqlsh:demo> create table user (name text primary key, age int, job text);
  • 10. Example: Multiple Rows • Comments on photos • Comments are always selected by the photo_id • There are only 4 rows in 2 partitions • In the real world, use UUIDs instead of int for PK photo_id comment_id user comment 5 1 jon hi 5 2 luke oh hey 5 3 JCVD AHHHHH!!! 6 4 jon great pic select * from comment where photo_id=5 create table comment ( photo_id int, comment_id int, user text, comment text, primary key (photo_id, comment_id));
  • 11. Partition with Clustering • Multiple rows are transposed into a single partition • Partitions vary in size • Old terminology - "wide row" photo_id comment_id name comment comment_id user comment comment_id user comment 5 1 jon hi 2 luke oh hey 3 JCVD AHHHHH!!! 6 4 jon great pic
  • 12. CQL Data Types Basic Types Collections text uuid counter map int timeuuid list decimal set blob Read the CQL documentation for the full list of types
  • 13. Pro Data Modeling • How do I query my data if I can only query by key? • Denormalize! • Create multiple views into your data (multiple tables) • Cassandra is built for fast writes • Use fast writes to do as few reads as possible
  • 14. Let’s Download Cassandra! • http://planetcassandra.org/ • click downloads • use drop down to pick OS
  • 15. ©2013 DataStax Confidential. Do not distribute without consent. 15