This document summarizes how Cassandra Query Language (CQL) works under the hood compared to previous Cassandra APIs. It explains that while CQL provides a SQL-like interface, the underlying data model and storage remain the same. CQL addresses issues with prior APIs like Thrift by introducing a common query language, supporting cursors to avoid loading entire result sets into memory, and standardizing schema definitions and features across clients. The document also describes how CQL queries map to the underlying storage layout using concepts like partition keys, clustering columns, and composite keys to organize data across partitions and determine retrieval order.
3. Target Audience
• Veterans who don’t use or don’t understand CQL
• Newcomers who only know CQL, but don’t know what’s happening underneath
4. Definitions
• Thrift: legacy RPC protocol + code generation tool
  • batch_mutate, get_range_slices, multiget_slice, etc.
  • Deprecated in favor of native protocol
• Native protocol: replacement for Thrift
  • Only works with CQL
5. Definitions
• Storage rows: keys + columns as stored on disk
• CQL rows: abstraction layer on top of storage rows
  • Not usually a direct mapping to storage rows
  • Make use of predefined schema
  • Data still sparse – no space used for null columns
6. In the Old Days …
• Lots of clients with lots of APIs
• No common language for describing schemas or queries
• Steep learning curve
7. In the Old Days …
• No cursors, so entire result set must fit in memory on client and server
• Hard to add new features
• Lag time between release of new features and client library adoption
11. Reality
• Don’t panic
  • Thrift problems solved
  • You didn’t lose anything
  • Underlying storage is unchanged
• Don’t get lazy
  • CQL is not SQL
  • You need to know what you’re doing
  • No, you can’t just index everything
12. Simple CQL
CREATE TABLE Books (
  title varchar,
  author varchar,
  year int,
  PRIMARY KEY (title)
);

INSERT INTO Books (title, author, year)
VALUES ('Patriot Games', 'Tom Clancy', 1987);
INSERT INTO Books (title, author, year)
VALUES ('Without Remorse', 'Tom Clancy', 1993);

SELECT * FROM Books;

 title           | author     | year
-----------------+------------+------
 Without Remorse | Tom Clancy | 1993
 Patriot Games   | Tom Clancy | 1987
13. Storage Rows
[default@unknown] create keyspace Library;
[default@unknown] use Library;
[default@Library] create column family Books
...     with key_validation_class=UTF8Type
...     and comparator=UTF8Type
...     and default_validation_class=UTF8Type;
[default@Library] set Books['Patriot Games']['author'] = 'Tom Clancy';
[default@Library] set Books['Patriot Games']['year'] = '1987';
[default@Library] list Books;

RowKey: Patriot Games
=> (name=author, value=Tom Clancy, timestamp=1393102991499000)
=> (name=year, value=1987, timestamp=1393103015955000)
14. Storage Rows
• Row keys are placed by a random hash, so there is no ordering across rows
15. Storage Rows
• Within a row, columns are ordered by name (author sorts before year)
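The two properties of storage rows shown above (rows placed by hash, columns sorted by name) can be sketched as a toy model. This is only an illustration: the token function uses MD5 as a stand-in for whatever partitioner the cluster runs, and the dictionary layout is an assumption, not Cassandra's on-disk format.

```python
import hashlib

def token(row_key: str) -> int:
    """Toy token: MD5 of the row key as an integer (stand-in for the partitioner)."""
    return int.from_bytes(hashlib.md5(row_key.encode("utf-8")).digest(), "big")

# Toy column family: row key -> {column name: value}.
# Columns are deliberately written out of order.
books = {
    "Patriot Games": {"year": "1987", "author": "Tom Clancy"},
    "Without Remorse": {"year": "1993", "author": "Tom Clancy"},
}

def list_rows(column_family):
    """Yield rows in token order, each with its columns sorted by name."""
    for key in sorted(column_family, key=token):
        yield key, sorted(column_family[key].items())

for key, columns in list_rows(books):
    print("RowKey:", key)
    for name, value in columns:
        print(f"=> (name={name}, value={value})")
```

The order in which the two row keys print depends only on their hashes, which is why a listing cannot be relied on to return rows in key order.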
16. Compound Key
CREATE TABLE authors (
  name text,
  year int,
  title text,
  isbn text,
  publisher text,
  PRIMARY KEY (name, year, title)
);

 name       | year | title           | isbn          | publisher
------------+------+-----------------+---------------+-----------
 Tom Clancy | 1987 | Patriot Games   | 0-399-13241-4 | Putnam
 Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam
17. Compound Key
• Partition key (row key): name, the first column of the PRIMARY KEY
18. Compound Key
• Clustering columns: year and title, the remaining PRIMARY KEY columns, in that order
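As a rough sketch of what the compound key means for storage: the partition key becomes the row key, and the clustering values become a prefix on every cell name. The function name and the ':' separator below are assumptions for illustration; the separator only mimics how the CLI listings display composite names, and the real encoding is binary.

```python
def to_storage_cells(row, partition_key, clustering, regular):
    """Flatten one CQL row into (row_key, [(cell_name, value), ...])."""
    row_key = ":".join(str(row[c]) for c in partition_key)
    prefix = ":".join(str(row[c]) for c in clustering)
    # One cell per non-null regular column; nulls take no space.
    cells = [
        (f"{prefix}:{column}", row[column])
        for column in regular
        if row.get(column) is not None
    ]
    return row_key, cells

row = {
    "name": "Tom Clancy",
    "year": 1987,
    "title": "Patriot Games",
    "isbn": "0-399-13241-4",
    "publisher": "Putnam",
}

row_key, cells = to_storage_cells(
    row,
    partition_key=["name"],
    clustering=["year", "title"],
    regular=["isbn", "publisher"],
)
print("RowKey:", row_key)
for cell_name, value in cells:
    print(f"=> (name={cell_name}, value={value})")
```

Every CQL row for the same author lands in the same storage row, sorted by the year:title prefix.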
28. Why This Matters
• Queries must respect the underlying storage, or they will be either slow or impossible
• You have to know your partition key at query time
• If you want fast multi-record queries, select a range, in storage order
• With clustering columns, order matters
29. Example
CREATE TABLE authors (
  name text,
  year int,
  title text,
  isbn text,
  publisher text,
  PRIMARY KEY (name, year, title)
) WITH CLUSTERING ORDER BY (year DESC);
30. Example
 name       | year | title                       | isbn          | publisher
------------+------+-----------------------------+---------------+-----------
 Tom Clancy | 1996 | Executive Orders            | 0-399-13825-0 | Putnam
 Tom Clancy | 1994 | Debt of Honor               | 0-399-13826-1 | Putnam
 Tom Clancy | 1993 | Without Remorse             | 0-399-13927-0 | Putnam
 Tom Clancy | 1991 | The Sum of All Fears        | 0-399-12341-6 | Putnam
 Tom Clancy | 1989 | Clear and Present Danger    | 0-399-13341-1 | Putnam
 Tom Clancy | 1988 | The Cardinal of the Kremlin | 0-399-13241-4 | Putnam
 Tom Clancy | 1987 | Patriot Games               | 0-399-13231-4 | Putnam
 Tom Clancy | 1986 | Red Storm Rising            | 0-399-13230-2 | Putnam
 Tom Clancy | 1984 | The Hunt for Red October    | 0-399-13251-1 | Putnam
31. Query by Key
SELECT * FROM authors
WHERE name = 'Tom Clancy'
(CL = QUORUM, RF = 3)
[Diagram: cluster ring in which three nodes each hold a replica of the Tom Clancy partition]
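For reference, QUORUM means a majority of the replicas, so with RF = 3 the coordinator waits for two of the three replicas. A minimal sketch of that arithmetic (the function name is just for illustration):

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1

for rf in (1, 3, 5):
    print(f"RF={rf}: QUORUM needs {quorum(rf)} replica(s)")

# A quorum write and a quorum read always overlap in at least one replica.
rf = 3
assert quorum(rf) + quorum(rf) > rf
```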
32. Query by Key
Find partition key
RowKey: Tom Clancy
=> (name=1996:Executive Orders:publisher, value=Putnam, timestamp=1393104083773000)
=> (name=1996:Executive Orders:ISBN, value=0-399-13825-0, timestamp=1393104109214000)
=> (name=1994:Debt of Honor:publisher, value=Putnam, timestamp=1393104083773000)
=> (name=1994:Debt of Honor:ISBN, value=0-399-13826-1, timestamp=1393104109214000)
=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)
=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)
=> (name=1991:The Sum of All Fears:publisher, value=Putnam, timestamp=1393103948577000)
=> (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6, timestamp=1393104011458000)
...
=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)
=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)
33. Query by Key
Scan all columns in order (same storage row as in slide 32)
34. Range Query
SELECT * FROM authors
WHERE name = 'Tom Clancy'
AND year >= 1990
(CL = QUORUM)
[Diagram: cluster ring in which three nodes each hold a replica of the Tom Clancy partition]
35. Range Query
Find partition key (same storage row as in slide 32)
36. Range Query
Scan in storage order (newest year first) and stop at the first column with year < 1990
37. Querying Tail of Range
SELECT * FROM authors
WHERE name = 'Tom Clancy'
AND year <= 1990
(CL = QUORUM)
[Diagram: cluster ring in which three nodes each hold a replica of the Tom Clancy partition]
38. Querying Tail of Range
Find partition key (same storage row as in slide 32)
39. Querying Tail of Range
Scan all, then filter <= 1990
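The difference between the two range queries in slides 34 to 39 can be sketched by counting how many entries each one examines. This follows the slides' simplified model rather than Cassandra's actual read path, and the function names are made up for the example. The years are the ones from the example table, in the table's DESC clustering order.

```python
# Clustering order from the example table: year DESC.
YEARS_DESC = [1996, 1994, 1993, 1991, 1989, 1988, 1987, 1986, 1984]

def scan_head(years, min_year):
    """year >= min_year: read from the start and stop at the first miss."""
    examined, hits = 0, []
    for year in years:
        examined += 1
        if year < min_year:
            break  # everything after this point is older still
        hits.append(year)
    return hits, examined

def scan_then_filter(years, max_year):
    """year <= max_year, as the slides model it: read everything, then filter."""
    hits = [year for year in years if year <= max_year]
    return hits, len(years)

print(scan_head(YEARS_DESC, 1990))         # ([1996, 1994, 1993, 1991], 5)
print(scan_then_filter(YEARS_DESC, 1990))  # ([1989, 1988, 1987, 1986, 1984], 9)
```

The head of the range stops early because the data is already in the order the query wants, which is the point of choosing the clustering order to match the most common read.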
40. Multiple Keys
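SELECT * FROM authors
WHERE name IN
('Tom Clancy',
'Dean Koontz',
'Malcolm Gladwell')
(CL = QUORUM)
[Diagram: cluster ring in which the replicas for Dean Koontz (DK), Malcolm Gladwell (MG), and Tom Clancy (TC) sit on different nodes]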
41. Lessons Learned
• Sequential == fast
• Query by key / clustering column (range) == Sequential
• Multi-key query often == lots of nodes
• Write in the intended read order
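To see why a multi-key query tends to involve many nodes, here is a toy ring. Everything in it is an assumption for illustration: nine nodes, RF = 3, MD5 as the hash, and replicas placed on the next nodes around the ring. Which nodes a given key lands on depends on the hash, so the sketch prints the result instead of promising one.

```python
import hashlib

NODES = 9  # assumed cluster size
RF = 3     # assumed replication factor

def replicas(partition_key, nodes=NODES, rf=RF):
    """Nodes holding a partition: the hashed owner plus the next rf - 1 nodes."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    primary = int.from_bytes(digest, "big") % nodes
    return {(primary + i) % nodes for i in range(rf)}

def nodes_holding(partition_keys):
    """Union of all nodes that hold at least one of the requested partitions."""
    touched = set()
    for key in partition_keys:
        touched |= replicas(key)
    return touched

single = nodes_holding(["Tom Clancy"])
multi = nodes_holding(["Tom Clancy", "Dean Koontz", "Malcolm Gladwell"])
print("one key:", sorted(single))
print("three keys:", sorted(multi))
```

A single-key query only ever concerns RF nodes, while each extra key in the IN list can add up to RF more.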
42. Collections
• Sets: unordered, unique
• Lists: ordered, allow duplicates
• Maps: key/value pairs – can be a good substitute for dynamic columns
• Max 64k items, 64k per item
• Always returns entire collection
44. Sets
RowKey: Tom Clancy
=> (name=books:50617472696f742047616d6573, value=, ...)
=> (name=books:576974686f75742052656d6f727365, value=, ...)
• Column name: set name (books) + set item (hex-encoded)
• Value: empty
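The set layout above can be reproduced with a few lines: each item is stored in the cell name, hex-encoded in the listing, and the value stays empty. The function name is hypothetical; the hex strings it produces match the ones in the listing.

```python
def set_cells(set_name, items):
    """One cell per item: name = '<set name>:<hex of item>', value empty."""
    return [
        (f"{set_name}:{item.encode('utf-8').hex()}", "")
        for item in sorted(items)  # cells are kept in name order
    ]

for name, value in set_cells("books", {"Without Remorse", "Patriot Games"}):
    print(f"=> (name={name}, value={value})")
```

Because the item is part of the cell name, adding the same item twice overwrites the same cell, which is how uniqueness falls out of the storage layout.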
46. Lists
RowKey: Tom Clancy
=> (name=books:d36de8b0305011e4a0dddbbeade718be, value=576974686f75742052656d6f727365, ...)
=> (name=books:d36de8b1305011e4a0dddbbeade718be, value=50617472696f742047616d6573, ...)
• Column name: list name (books) + ordering ID
• Value: list item (hex-encoded)
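Reading the list back from the two cells above might look like the following sketch. It treats the ordering ID as a time-based (version 1) UUID and sorts on its embedded timestamp; that interpretation, and the function name, are assumptions made for the example.

```python
import uuid

# The two cells from the listing, deliberately given out of order.
cells = [
    ("books:d36de8b1305011e4a0dddbbeade718be", "50617472696f742047616d6573"),
    ("books:d36de8b0305011e4a0dddbbeade718be", "576974686f75742052656d6f727365"),
]

def read_list(list_cells):
    """Rebuild the list: sort cells by ordering ID, then decode each value."""
    def ordering_time(cell):
        cell_name, _ = cell
        ordering_id = cell_name.split(":", 1)[1]
        return uuid.UUID(hex=ordering_id).time  # timestamp inside the UUID
    return [
        bytes.fromhex(value).decode("utf-8")
        for _, value in sorted(list_cells, key=ordering_time)
    ]

print(read_list(cells))  # ['Without Remorse', 'Patriot Games']
```

Unlike a set, the item lives in the value, so the same item can appear under several ordering IDs.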
50. Indices
• Allow query by value in certain cases
• Partitioned based on row key of indexed table
• Are updated atomically along with the data being inserted
• Must be low cardinality, or it won’t scale well (to large cluster sizes)
• But not too low, or it’s sort of pointless
52. Index Distribution
Node 1
  Authors
    “Tom Clancy” : “Putnam”
    “Mark Twain” : “Putnam”
  Index
    “Putnam” : “Tom Clancy”
    “Putnam” : “Mark Twain”
Node 2
  Authors
    “Mark Twain” : “Putnam”
    “Dan Brown” : “Putnam”
  Index
    “Putnam” : “Mark Twain”
    “Putnam” : “Dan Brown”
Node 3
  Authors
    “Dan Brown” : “Putnam”
    “Tom Clancy” : “Putnam”
  Index
    “Putnam” : “Dan Brown”
    “Putnam” : “Tom Clancy”
… but node distribution is based on the original table key
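A toy version of the picture above: each node indexes only the rows it stores, so a query by value cannot be routed to one place. The data is taken from the diagram; the function names are made up, and asking every node is a simplification (a real coordinator only needs enough nodes to cover all the data).

```python
# Rows stored on each node, as in the diagram (author -> publisher).
NODES = {
    "Node 1": {"Tom Clancy": "Putnam", "Mark Twain": "Putnam"},
    "Node 2": {"Mark Twain": "Putnam", "Dan Brown": "Putnam"},
    "Node 3": {"Dan Brown": "Putnam", "Tom Clancy": "Putnam"},
}

def build_local_index(rows):
    """Index a node's own rows only: publisher -> set of authors."""
    index = {}
    for author, publisher in rows.items():
        index.setdefault(publisher, set()).add(author)
    return index

LOCAL_INDEXES = {node: build_local_index(rows) for node, rows in NODES.items()}

def query_by_publisher(publisher):
    """Ask every node's local index and merge the answers."""
    authors, contacted = set(), []
    for node, index in LOCAL_INDEXES.items():
        contacted.append(node)
        authors |= index.get(publisher, set())
    return authors, contacted

authors, contacted = query_by_publisher("Putnam")
print(sorted(authors), "from", len(contacted), "nodes")
```

Even a value that matches nothing still costs a round trip to every node asked, which is why indexed queries scale worse than queries by partition key.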
53. Querying by Value
SELECT * FROM authors
WHERE publisher = 'Putnam'
(CL = QUORUM)
[Diagram: cluster ring in which every node shown holds its own local publisher index (pub idx)]
54. Deletes
-- Explicit delete
DELETE FROM authors WHERE name = 'Tom Clancy';

-- Expiring data (TTL)
INSERT INTO authors (name, year, title)
VALUES ('Tom Clancy', 1987, 'Patriot Games')
USING TTL 86400;

-- Inserting a null
INSERT INTO authors (name, year, title, isbn)
VALUES ('Tom Clancy', 1987, 'Patriot Games', null);

-- Updating a column to null
UPDATE authors SET publisher = null
WHERE name = 'Tom Clancy' AND year = 1987 AND title = 'Patriot Games';
55. Deletes
• Log-structured storage, so writes are immutable
• Deletes create tombstones, one for each deleted column
• Cassandra must read the tombstones to make sure it doesn’t revive deleted data
• Lots of deletes is an anti-pattern
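The bullets above can be illustrated with a toy read path over immutable files. This is a sketch under simple assumptions: each "sstable" is a dictionary of cell name to (timestamp, value), the newest timestamp wins, and a sentinel object plays the tombstone. None of these names come from Cassandra itself.

```python
TOMBSTONE = object()  # marker written by a delete

def read_cell(name, sstables):
    """Return the live value of a cell, or None if it is absent or deleted."""
    newest = None  # (timestamp, value) with the highest timestamp seen so far
    for table in sstables:
        if name in table:
            timestamp, value = table[name]
            if newest is None or timestamp > newest[0]:
                newest = (timestamp, value)
    if newest is None or newest[1] is TOMBSTONE:
        return None
    return newest[1]

cell = "1987:Patriot Games:publisher"
older = {cell: (1, "Putnam")}   # original write, never modified
newer = {cell: (2, TOMBSTONE)}  # the delete is just another write

print(read_cell(cell, [older, newer]))  # None: the tombstone shadows the value
print(read_cell(cell, [older]))         # Putnam: without the tombstone the data comes back
```

The second call shows why tombstones have to be read along with the data: skipping them would make deleted values reappear.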
56. Missing Columns
INSERT INTO authors (title, name, year)
VALUES ('Without Remorse', 'Tom Clancy', 1993);

 name       | year | title           | isbn | publisher
------------+------+-----------------+------+-----------
 Tom Clancy | 1993 | Without Remorse | null | null

RowKey: Tom Clancy
=> (name=1993:Without Remorse:, value=, timestamp=1409936754170000)
57. Missing Columns
activity | timestamp | source | source_elapsed
---------------------------------------------------------------------------+--------------+-----------+----------------
execute_cql3_query | 11:51:55,975 | 127.0.0.1 | 0
Parsing select * from authors where name = 'Tom Clancy' LIMIT 10000; | 11:51:55,975 | 127.0.0.1 | 47
Preparing statement | 11:51:55,975 | 127.0.0.1 | 105
Executing single-partition query on authors | 11:51:55,975 | 127.0.0.1 | 307
Acquiring sstable references | 11:51:55,975 | 127.0.0.1 | 315
Merging memtable tombstones | 11:51:55,975 | 127.0.0.1 | 328
Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones | 11:51:55,975 | 127.0.0.1 | 374
Merging data from memtables and 0 sstables | 11:51:55,975 | 127.0.0.1 | 383
Read 1 live and 0 tombstoned cells | 11:51:55,975 | 127.0.0.1 | 420
Request complete | 11:51:55,975 | 127.0.0.1 | 585
• Only 1 read required (1 live cell, 0 tombstoned cells)
58. Null Columns
INSERT INTO authors (title, name, year, isbn, publisher)
VALUES ('Without Remorse', 'Tom Clancy', 1993, null, null);

 name       | year | title           | isbn | publisher
------------+------+-----------------+------+-----------
 Tom Clancy | 1993 | Without Remorse | null | null
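Both inserts return the same CQL row, but they do not write the same cells. The sketch below is one way to picture the difference: an omitted column writes nothing, while an explicit null writes a tombstone for that column. The function name, the tombstone marker, and the ':' naming are illustrative assumptions; the empty-named row marker cell mirrors the storage listing in slide 56.

```python
TOMBSTONE = "<tombstone>"

def cells_for_insert(values, clustering=("year", "title"), regular=("isbn", "publisher")):
    """Cells written by one INSERT into the authors table (toy model)."""
    prefix = ":".join(str(values[column]) for column in clustering)
    cells = [(f"{prefix}:", "")]  # row marker, always written
    for column in regular:
        if column not in values:
            continue  # omitted column: nothing is written
        value = values[column]
        cells.append((f"{prefix}:{column}", TOMBSTONE if value is None else value))
    return cells

missing = cells_for_insert({"name": "Tom Clancy", "year": 1993, "title": "Without Remorse"})
nulls = cells_for_insert({"name": "Tom Clancy", "year": 1993, "title": "Without Remorse",
                          "isbn": None, "publisher": None})

print(len(missing), "cell written with columns omitted")
print(len(nulls), "cells written with explicit nulls")
```

Leaving unknown columns out of the statement is therefore the cheaper habit, since explicit nulls add tombstones that later reads have to process.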
60. Why Queries Fail
SELECT name FROM authors WHERE title = 'Patriot Games';
Bad Request: PRIMARY KEY part title cannot be restricted (preceding part year is either not restricted or by a non-EQ relation)
61. Why Queries Fail
• Failure to provide the full partition key
• Querying by value without an index
• Misunderstanding clustering columns
62. Missing Key Parts
CREATE TABLE authors (
  name text,
  year int,
  title text,
  isbn text,
  publisher text,
  PRIMARY KEY ((name, year), title)
);

SELECT name FROM authors WHERE title
63. Missing Key Parts
[default@Library] list authors;

Partition keys (row key)
RowKey: Tom Clancy:1993
=> (name=Without Remorse:isbn, value=0-399-13241-4, timestamp=1409344246457000)
=> (name=Without Remorse:publisher, value=5075746e616d, timestamp=1409344246457000)
-------------------
RowKey: Tom Clancy:1987
=> (name=Patriot Games:isbn, value=0-399-13825-0, timestamp=1409344245715000)
=> (name=Patriot Games:publisher, value=5075746e616d, timestamp=1409344245715000)
64. Missing Key Parts
SELECT * FROM authors WHERE name = 'Tom Clancy';
Bad Request: Partition key part year must be restricted since preceding part is
65. Missing Key Parts
SELECT * FROM authors WHERE year = 1987;
Bad Request: partition key part year cannot be restricted (preceding part name is either not restricted or by a non-EQ relation)
66. Missing Key Parts
SELECT * FROM authors WHERE year >= 1987;
Bad Request: partition key part year cannot be restricted (preceding part name is either not restricted or by a non-EQ relation)
67. Missing Key Parts
SELECT * FROM authors
WHERE name = 'Tom Clancy' AND year >= 1987;

Bad Request: Only EQ and IN relation are supported on the partition key (unless you use the token() function)
68. Querying by Value
SELECT * FROM authors WHERE isbn = '0-399-13241-4';
Bad Request: No indexed columns present in by-columns clause with Equal operator
71. Querying Clustering Columns
SELECT * FROM authors
WHERE name = 'Tom Clancy'
AND year = 1993;

 name       | year | title           | isbn          | publisher
------------+------+-----------------+---------------+-----------
 Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam
73. Querying Clustering Columns
SELECT * FROM authors
WHERE name = 'Tom Clancy'
AND year <= 1993;

 name       | year | title           | isbn          | publisher
------------+------+-----------------+---------------+-----------
 Tom Clancy | 1987 | Patriot Games   | 0-399-13241-4 | Putnam
 Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam
75. Querying Clustering Columns
SELECT * FROM authors
WHERE name = 'Tom Clancy'
AND title = 'Patriot Games';

Bad Request: PRIMARY KEY part title cannot be restricted (preceding part year is either not restricted or by a non-EQ relation)
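The failures in slides 60 to 75 follow a small set of rules, which can be approximated with a checker like the one below. It is a simplification, not Cassandra's validation logic: it ignores indexes, token(), and IN on clustering columns, and the returned messages are my own wording, not the server's.

```python
def check_where(restrictions, partition_key, clustering):
    """restrictions maps column -> operator. Returns None if OK, else a reason."""
    key_columns = set(partition_key) | set(clustering)
    for column in restrictions:
        if column not in key_columns:
            return f"{column} is not a key column; it needs an index"
    # Every partition key part must be pinned with = or IN.
    for column in partition_key:
        op = restrictions.get(column)
        if op is None:
            return f"partition key part {column} must be restricted"
        if op not in ("=", "IN"):
            return f"partition key part {column} only supports = and IN"
    # Clustering columns: left to right, and only = may precede another restriction.
    blocked_by = None
    for column in clustering:
        op = restrictions.get(column)
        if op is not None and blocked_by is not None:
            return f"{column} cannot be restricted (preceding part {blocked_by} is not restricted by =)"
        if op != "=":
            blocked_by = blocked_by or column
    return None

# PRIMARY KEY (name, year, title)
print(check_where({"name": "=", "year": "<="}, ["name"], ["year", "title"]))
print(check_where({"name": "=", "title": "="}, ["name"], ["year", "title"]))
# PRIMARY KEY ((name, year), title)
print(check_where({"name": "="}, ["name", "year"], ["title"]))
```

Each rule corresponds to something the storage layer can do cheaply: hash a full partition key to find the row, then read one contiguous slice of its sorted columns.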