On Cassandra's evolution
Berlin Buzzwords (June 4th, 2013)
Sylvain Lebresne

Apache Cassandra
#bbuzz
· Fully distributed database
· Massively scalable
· High performance
· Highly reliable/available

Cassandra: the past
· Cassandra 0.7 (Jan 2011):
  - Dynamic schema creation
  - Expiring columns (TTL)
  - Secondary indexes
· Cassandra 0.8 (Jun 2011):
  - Counters
  - First version of CQL
  - Automatic memtable tuning
· Cassandra 1.0 (Oct 2011):
  - Compression
  - Leveled compaction
· Cassandra 1.1 (Apr 2012):
  - Row level isolation
  - Concurrent schema changes
  - Support for mixed SSD+HDD nodes
  - Self-tuning key/row caches

Cassandra: the present
Cassandra 1.2 (Jan 2013):
· Virtual nodes
· CQL3
· Native protocol
· Tracing
· ...

Data distribution without virtual nodes [diagram]
Virtual nodes [diagram]
Repairing without virtual nodes [diagram]
Virtual nodes: repairing [diagram]

Virtual nodes
· Not really "virtual nodes", more "multiple tokens per node" (but we still call them vnodes).
· Faster rebuilds.
· Allows heterogeneous nodes.
· Simpler load balancing when adding nodes.
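
The "multiple tokens per node" idea can be sketched with a toy token ring. This is only an illustration under stated assumptions: the `token` hash, the node names and the helper functions are hypothetical and are not Cassandra's partitioner or its API. It shows that a new node holding a single token takes data from one existing node, while a new node holding many tokens takes a little from several nodes, which is why rebuilds and load balancing get easier.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def token(value: str) -> int:
    """Hypothetical token function: hash a string onto a ring of 2**32 positions."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def build_ring(nodes, tokens_per_node):
    """Return a sorted list of (token, node) pairs, several tokens per node."""
    return sorted(
        (token(f"{node}-{i}"), node)
        for node in nodes
        for i in range(tokens_per_node)
    )


def owner(ring, key):
    """A key belongs to the node holding the first token after the key's token."""
    tokens = [t for t, _ in ring]
    idx = bisect_right(tokens, token(key)) % len(ring)  # wrap around the ring
    return ring[idx][1]


def donors_when_adding(nodes, new_node, tokens_per_node, keys):
    """Count, per existing node, how many keys it hands over to new_node."""
    before = build_ring(nodes, tokens_per_node)
    after = build_ring(nodes + [new_node], tokens_per_node)
    moved = Counter()
    for key in keys:
        old, new = owner(before, key), owner(after, key)
        if old != new:
            # Adding tokens to the ring only ever moves keys to the new node.
            assert new == new_node
            moved[old] += 1
    return moved


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4"]
    keys = [f"key-{i}" for i in range(2000)]
    print("1 token per node: ", dict(donors_when_adding(nodes, "n5", 1, keys)))
    print("64 tokens per node:", dict(donors_when_adding(nodes, "n5", 64, keys)))
```

With one token per node the new node splits exactly one existing range, so at most one node streams data to it; with many tokens the work is spread across the cluster.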

The Cassandra Query Language
· Initial version introduced in Cassandra 0.8. Version 3 (described here) is a major, more ambitious revision.
· Goal: provide a much simpler, more abstracted user interface than the legacy Thrift one.
· Kind of a "denormalized SQL".
· Strictly real-time oriented:
  - No joins
  - No sub-queries
  - No aggregation/GROUP BY
  - Limited ORDER BY

Storing songs

 id          | title               | artist         | album        | tags             | track
-------------+---------------------+----------------+--------------+------------------+-------
 a3e64f8f... | La Grange           | ZZTop          | Tres Hombres | { blues, 1973 }  |     3
 8a172618... | Moving in Stereo    | Fu Manchu      | We Must Obey | { covers, 2003 } |     9
 2b09185b... | Outside Woman Blues | Back Door Slam | Roll Away    |                  |     3

CQL:
    CREATE TABLE songs (
        id uuid PRIMARY KEY,
        title text,
        artist text,
        album text,
        track int,
        tags set<text>
    );

    -- Atomic and isolated
    INSERT INTO songs (id, title, artist, album, tags, track)
    VALUES (a3e64f8f..., 'La Grange', 'ZTop', 'Tres Hombres', {'blues', '1973'}, 3);
    UPDATE songs SET artist = 'ZZTop' WHERE id = a3e64f8f...;

Playlists

 user_id | playlist_name | title               | artist         | album        | song_id
---------+---------------+---------------------+----------------+--------------+-------------
 pcmanus | My list       | La Grange           | ZZTop          | Tres Hombres | a3e64f8f...
 pcmanus | My list       | Moving in Stereo    | Fu Manchu      | We Must Obey | 8a172618...
 pcmanus | Other list    | La Grange           | ZZTop          | Tres Hombres | a3e64f8f...
 pcmanus | Other list    | Outside Woman Blues | Back Door Slam | Roll Away    | 2b09185b...

· Partition key: determines the physical node holding the row
· Clustering key: induces physical ordering of the rows on disk

CQL:
    CREATE TABLE playlists (
        user_id text,
        playlist_name text,
        title text,
        artist text,
        album text,
        song_id uuid,
        PRIMARY KEY ((user_id, playlist_name), title, album, artist)
    );

Querying a playlist

CQL:
    -- Songs in 'My list' with a title starting with 'b' or 'c'
    SELECT * FROM playlists
    WHERE user_id = 'pcmanus'
      AND playlist_name = 'My list'
      AND title >= 'b' AND title < 'd';

    -- Last 50 songs in 'My list'
    SELECT * FROM playlists
    WHERE user_id = 'pcmanus'
      AND playlist_name = 'My list'
      ORDER BY title DESC
      LIMIT 50;
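
Why these two queries are cheap can be sketched with a small in-memory model. It is a simplification and an assumption, not how Cassandra stores data: a dict keyed by the partition key stands in for "which node holds the rows", and a list kept sorted by the clustering key stands in for the on-disk order. The function names are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

# partition key -> list of (clustering key, song_id), kept sorted by clustering key
table = defaultdict(list)


def insert(user_id, playlist_name, title, album, artist, song_id):
    """Upsert one row into the partition (user_id, playlist_name)."""
    partition = table[(user_id, playlist_name)]
    clustering = (title, album, artist)
    idx = bisect_left(partition, (clustering,))
    if idx < len(partition) and partition[idx][0] == clustering:
        partition[idx] = (clustering, song_id)  # same primary key: overwrite
    else:
        partition.insert(idx, (clustering, song_id))


def title_slice(user_id, playlist_name, start, end):
    """Rows with start <= title < end: one contiguous slice of one partition."""
    partition = table[(user_id, playlist_name)]
    lo = bisect_left(partition, ((start,),))
    hi = bisect_left(partition, ((end,),))
    return partition[lo:hi]


def last_n(user_id, playlist_name, n):
    """Like ORDER BY title DESC LIMIT n: read the partition backwards."""
    partition = table[(user_id, playlist_name)]
    return partition[::-1][:n]


# The rows from the playlists table above (song ids kept elided as in the slides).
insert("pcmanus", "My list", "La Grange", "Tres Hombres", "ZZTop", "a3e64f8f...")
insert("pcmanus", "My list", "Moving in Stereo", "We Must Obey", "Fu Manchu", "8a172618...")
insert("pcmanus", "Other list", "La Grange", "Tres Hombres", "ZZTop", "a3e64f8f...")
insert("pcmanus", "Other list", "Outside Woman Blues", "Roll Away", "Back Door Slam", "2b09185b...")

if __name__ == "__main__":
    print(title_slice("pcmanus", "My list", "M", "N"))
    print(last_n("pcmanus", "Other list", 50))
```

Both queries touch a single partition and read a contiguous run of rows, which is the access pattern the primary key was designed for.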

Native protocol
Binary transport protocol for CQL3 (replaces the Thrift transport):
· Asynchronous (fewer connections)
· Server notifications for new nodes, schema changes, etc.
· Optimized for CQL3
See the DataStax Java Driver for a mature driver using this new protocol
(https://github.com/datastax/java-driver).

Request tracing

CQL:
    cqlsh:foo> TRACING ON;
    cqlsh:foo> INSERT INTO bar (i, j) VALUES (6, 2);

 activity                            | timestamp    | source    | elapsed
-------------------------------------+--------------+-----------+---------
 Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 |     540
 Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 |     779
 Message received from /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |      63
 Applying mutation                   | 00:02:37,016 | 127.0.0.2 |     220
 Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 |     250
 Appending to commitlog              | 00:02:37,016 | 127.0.0.2 |     277
 Adding to memtable                  | 00:02:37,016 | 127.0.0.2 |     378
 Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |     710
 Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 |     888
 Message received from /127.0.0.2    | 00:02:37,017 | 127.0.0.1 |    2334
 Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 |    2550

Tracing an anti-pattern

 id       | created_at | value
----------+------------+----------------
 my_queue | 1399121331 | 0x9b0450d30de9
 my_queue | 1439051021 | 0xfc7aee5f6a66
 my_queue | 1440134565 | 0x668fdb3a2196
 my_queue | 1445219887 | 0xdaf420a01c09
 my_queue | 1479138491 | 0x3241ad893ff0

CQL:
    CREATE TABLE queues (
        id text,
        created_at timestamp,
        value blob,
        PRIMARY KEY (id, created_at)
    );

Tracing an anti-pattern

CQL:
    cqlsh:foo> TRACING ON;
    cqlsh:foo> SELECT * FROM queues WHERE id = 'my_queue' ORDER BY created_at LIMIT 1;

 activity                                 | timestamp    | source    | elapsed
------------------------------------------+--------------+-----------+---------
 execute_cql3_query                       | 19:31:05,650 | 127.0.0.1 |       0
 Sending message to /127.0.0.3            | 19:31:05,651 | 127.0.0.1 |     541
 Message received from /127.0.0.1         | 19:31:05,651 | 127.0.0.3 |      39
 Executing single-partition query         | 19:31:05,652 | 127.0.0.3 |     943
 Acquiring sstable references             | 19:31:05,652 | 127.0.0.3 |     973
 Merging memtable contents                | 19:31:05,652 | 127.0.0.3 |    1020
 Merging data from memtables and sstables | 19:31:05,652 | 127.0.0.3 |    1081
 Read 1 live cells and 100000 tombstoned  | 19:31:05,686 | 127.0.0.3 |   35072
 Enqueuing response to /127.0.0.1         | 19:31:05,687 | 127.0.0.3 |   35220
 Sending message to /127.0.0.1            | 19:31:05,687 | 127.0.0.3 |   35314
 Message received from /127.0.0.3         | 19:31:05,687 | 127.0.0.1 |   36908
 Processing response from /127.0.0.3      | 19:31:05,688 | 127.0.0.1 |   37650
 Request complete                         | 19:31:05,688 | 127.0.0.1 |   38047
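
The "Read 1 live cells and 100000 tombstoned" line is the heart of the anti-pattern: consuming a queue deletes its head, and each delete leaves a tombstone that later reads of the head must skip. The toy model below is a sketch under that assumption only; the `Cell` class and both functions are hypothetical and do not reflect Cassandra's storage engine.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cell:
    created_at: int
    value: Optional[bytes]  # None marks a tombstone (a deleted entry)


def read_head(partition: List[Cell]) -> Tuple[Optional[Cell], int]:
    """Like ORDER BY created_at LIMIT 1: return the first live cell and
    the number of tombstones that had to be skipped to reach it."""
    skipped = 0
    for cell in partition:  # cells are ordered by created_at
        if cell.value is None:
            skipped += 1
            continue
        return cell, skipped
    return None, skipped


def consume(partition: List[Cell]) -> Optional[bytes]:
    """Pop the head of the queue by replacing it with a tombstone."""
    cell, _ = read_head(partition)
    if cell is None:
        return None
    value, cell.value = cell.value, None
    return value


if __name__ == "__main__":
    # A queue from which 100000 entries were already consumed, one still pending.
    queue = [Cell(i, None) for i in range(100_000)] + [Cell(100_000, b"payload")]
    head, skipped = read_head(queue)
    print(f"read 1 live cell after skipping {skipped} tombstones")
```

Every read of the head pays for all the entries consumed before it until the tombstones are purged, which matches the jump in elapsed time on that line of the trace.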

But also ...
· Concurrent schema creation
· Improved JBOD support
· Off-heap bloom filters and compression metadata
· Faster (murmur3 based) partitioner
· ...

What's next?
Cassandra 2.0 is scheduled for July:
· Improvements to CQL3 and the native protocol (automatic query paging)
· Compare-and-swap
    UPDATE users SET login = 'pcmanus', name = 'Sylvain Lebresne' IF NOT EXISTS;
    UPDATE users SET email = 'sylvain@datastax.com' IF email = 'slebresne@datastax.com';
· Triggers (experimental)
· Eager retries
· Performance improvements (single-pass compaction, more efficient tombstone removal, ...)
· ...
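
The semantics of the two conditional statements can be sketched in a single process. This is only an illustration of compare-and-swap: the `Users` class and its methods are hypothetical, keying rows by login is an assumption (the statements on the slide show no WHERE clause), and a local lock stands in for whatever coordination Cassandra performs across replicas.

```python
import threading


class Users:
    """Toy in-memory table with conditional writes guarded by a lock."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, login, **columns) -> bool:
        """Create the row only if no row with this login exists yet."""
        with self._lock:
            if login in self._rows:
                return False  # condition failed, nothing written
            self._rows[login] = dict(columns)
            return True

    def update_if(self, login, column, expected, new_value) -> bool:
        """Set column to new_value only if it currently equals expected."""
        with self._lock:
            row = self._rows.get(login)
            if row is None or row.get(column) != expected:
                return False  # condition failed, nothing written
            row[column] = new_value
            return True


if __name__ == "__main__":
    users = Users()
    print(users.insert_if_not_exists("pcmanus", name="Sylvain Lebresne",
                                     email="slebresne@datastax.com"))
    print(users.update_if("pcmanus", "email",
                          "slebresne@datastax.com", "sylvain@datastax.com"))
```

The check and the write happen as one step, so two clients racing on the same condition cannot both succeed.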

Thank you!
Questions?
www: http://cassandra.apache.org/
twitter: @pcmanus
github: github.com/pcmanus