CQL3 in depth

Cassandra Conference in Tokyo, 11/29/2012

Yuki Morishita
Software Engineer@DataStax / Apache Cassandra Committer

©2012 DataStax

Agenda
• Why CQL3?
• CQL3 walkthrough
  • Defining Schema
  • Querying / Mutating Data
  • New features
• Related topics
  • Native transport


Why CQL3?

Cassandra Storage
create column family profiles
with key_validation_class = UTF8Type
and comparator = UTF8Type
and column_metadata = [
{column_name: first_name, validation_class: UTF8Type},
{column_name: last_name, validation_class: UTF8Type},
{column_name: year, validation_class: IntegerType}
];

row key    columns (name, value)

nobu       first_name  Nobunaga
           last_name   Oda
           year        1582

• Values are validated by validation_class
• Columns are sorted in comparator order
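
As a rough illustration of this storage model, the Python sketch below treats a row as a set of named columns, checks each value against a per-column validator, and returns the columns in name order. `VALIDATORS` and `store_row` are names invented for this sketch, not Cassandra APIs, and Python's string sort only approximates a UTF8Type comparator (the two agree for ASCII names like these).

```python
# Toy model of one row: columns are validated, then kept sorted by name.
# VALIDATORS and store_row are illustrative names, not Cassandra APIs.

VALIDATORS = {
    "first_name": str,  # stands in for UTF8Type
    "last_name": str,   # stands in for UTF8Type
    "year": int,        # stands in for IntegerType
}

def store_row(columns):
    """Validate each value, then return (name, value) pairs in comparator order."""
    for name, value in columns.items():
        expected = VALIDATORS.get(name)
        if expected is not None and not isinstance(value, expected):
            raise TypeError(f"{name}: expected {expected.__name__}")
    # Sorting by name mimics a UTF8Type comparator for ASCII column names.
    return sorted(columns.items())

row = store_row({"year": 1582, "last_name": "Oda", "first_name": "Nobunaga"})
print(row)
```

Whatever order the columns are written in, they come back sorted by name, which is what makes slice queries over column ranges possible.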


Thrift API
• Low level: get, get_slice, mutate...
• Directly exposes the internal storage structure
• Hard to change the API signatures


Inserting data with Thrift
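import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.Mutation;

// Assumes an open Cassandra.Client `client` and a ConsistencyLevel `consistencyLevel`.

// A single column: name, value, and write timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a ColumnOrSuperColumn, then in a Mutation
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);

Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);

List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

// Mutations grouped by column family name
Map<String, List<Mutation>> cf = new HashMap<String, List<Mutation>>();
cf.put("Standard1", mutations);

// Column family maps grouped by row key
Map<ByteBuffer, Map<String, List<Mutation>>> records
    = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
records.put(ByteBuffer.wrap("key".getBytes()), cf);

client.batch_mutate(records, consistencyLevel);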


... with Cassandra Query Language

INSERT INTO "Standard1" (key, name)
VALUES ('key', 'value');

• Introduced in 0.8 (CQL), updated in 1.0 (CQL2)
• Syntax similar to SQL
• More extensible than the Thrift API

CQL2 Problems
• Almost a one-to-one mapping to the Thrift API, so it does not compose with the row-oriented parts of SQL
• No support for CompositeType


CQL3
• Maps storage to a more natural rows-and-columns representation using CompositeType
  • Wide rows are “transposed” and unpacked into named columns
• Beta in 1.1, default in 1.2
• New features
  • Collection support


CQL3 walkthrough

Defining Keyspace
• Syntax has changed since CQL2

CREATE KEYSPACE my_keyspace WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': 2
};


Defining Static Column Family
• “Strict” schema definition (and that’s a good thing)
  • You cannot add columns arbitrarily
  • You need ALTER TABLE ... ADD column first
• Columns are defined and sorted using a CompositeType comparator


Defining Static Column Family

CREATE TABLE profiles (
    user_id text PRIMARY KEY,
    first_name text,
    last_name text,
    year int
)

 user_id | first_name | last_name | year
---------+------------+-----------+------
    nobu |   Nobunaga |       Oda | 1582

Internal storage, comparator CompositeType(UTF8Type):

row key (user_id)    columns (name: value)

nobu                 :
                     first_name: Nobunaga
                     last_name: Oda
                     year: 1582

• Values are validated by the type definition
• Columns are sorted in comparator order


Defining Dynamic Column Family
• Then how can we add columns dynamically to time series data, as we did before?
• Use a compound key


Compound key
CREATE TABLE comments (
article_id uuid,
posted_at timestamp,
author text,
content text,
PRIMARY KEY (article_id, posted_at)
)

Internal storage, comparator CompositeType(DateType, UTF8Type):

row key (article_id)    columns (name  value)

550e8400-..             1350499616:
                        1350499616:author   yukim
                        1350499616:content  blah, blah, blah
                        1368499616:
                        1368499616:author   yukim
                        1368499616:content  well, well, well
                        ...

• Values are validated by the type definition
• Columns are sorted in comparator order, first by date, and then by column name
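
The layout above can be sketched in Python: each CQL3 row of `comments` becomes several internal cells whose names combine the `posted_at` value with the CQL column name, and reading reverses the process. `to_cells` and `to_rows` are made-up helpers for this sketch, tuples stand in for CompositeType, and the empty-named cell mirrors the bare `1350499616:` entry in the diagram.

```python
# Sketch: "transposing" CQL3 rows into one wide internal row and back.
# to_cells / to_rows are hypothetical helpers, not part of any Cassandra API.

def to_cells(rows):
    """Turn the CQL rows (dicts) of one article into sorted (name, value) cells."""
    cells = []
    for row in rows:
        ts = row["posted_at"]
        cells.append(((ts, ""), ""))  # entry with an empty column name, as in the diagram
        cells.append(((ts, "author"), row["author"]))
        cells.append(((ts, "content"), row["content"]))
    # Tuple ordering: first by date, then by column name.
    return sorted(cells)

def to_rows(cells):
    """Group cells by the first name component to rebuild the CQL rows."""
    rows = {}
    for (ts, column), value in cells:
        row = rows.setdefault(ts, {"posted_at": ts})
        if column:  # skip the empty-named entry
            row[column] = value
    return [rows[ts] for ts in sorted(rows)]

comments = [
    {"posted_at": 1368499616, "author": "yukim", "content": "well, well, well"},
    {"posted_at": 1350499616, "author": "yukim", "content": "blah, blah, blah"},
]
cells = to_cells(comments)
for name, value in cells:
    print(name, value)
```

Adding a new comment only appends more cells to the same internal row, which is how a compound key keeps the "dynamic columns" pattern while still presenting fixed named columns.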

Changes worth noting
• Identifiers (keyspace/table/column names) are case-insensitive by default
  • Use double quotes (") to preserve case
• Compaction settings are now a map
CREATE TABLE test (
    ...
) WITH COMPACTION = {
    'class': 'SizeTieredCompactionStrategy',
    'min_threshold': 2,
    'max_threshold': 4
};
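
The identifier rule can be pictured with a small Python sketch: an unquoted name is folded to lower case, while a double-quoted name keeps its case. `normalize_identifier` is a hypothetical helper written for this illustration; it does not handle escaped quotes inside a name.

```python
# Sketch of CQL3 identifier case handling; normalize_identifier is a made-up helper.

def normalize_identifier(ident):
    """Quoted identifiers keep their case; unquoted ones are lowercased."""
    if len(ident) >= 2 and ident.startswith('"') and ident.endswith('"'):
        return ident[1:-1]
    return ident.lower()

print(normalize_identifier('Standard1'))
print(normalize_identifier('"Standard1"'))
```

Under this rule `Standard1`, `STANDARD1`, and `standard1` all name the same table, while `"Standard1"` names a different one.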

Changes worth noting
• system.schema_*
  • All schema information is stored in the system keyspace
  • schema_keyspaces, schema_columnfamilies, schema_columns
  • The system tables themselves are defined with CQL3 schemas
• CQL3 schemas are not visible through cassandra-cli’s ‘describe’ command
  • Use cqlsh’s ‘describe columnfamily’ instead

More on CQL3 schema
• Thrift to CQL3 migration
• http://www.datastax.com/dev/blog/thrift-to-cql3

• For better understanding
• http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
• http://www.datastax.com/dev/blog/cql3-evolutions
• http://www.datastax.com/dev/blog/cql3-for-cassandra-experts


Mutating Data

INSERT INTO example (id, name) VALUES (...)

UPDATE example SET f = 'foo' WHERE ...

DELETE FROM example WHERE ...

• No more USING CONSISTENCY
  • The consistency level setting has moved to the protocol level

Batch Mutate
BEGIN BATCH
    INSERT INTO aaa (id, col) VALUES (...)
    UPDATE bbb SET col1 = 'val1' WHERE ...
    ...
APPLY BATCH;

• Batches are atomic by default as of 1.2
  • Atomic does not mean isolated (mutations within a single row have been isolated since 1.1)
  • Some performance penalty because of the batch log process

Batch Mutate
• Use a non-atomic batch if you need performance rather than atomicity
BEGIN UNLOGGED BATCH
...
APPLY BATCH;

• More on dev blog
• http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2


Querying Data

SELECT article_id, posted_at, author
FROM comments
WHERE article_id = '...'
ORDER BY posted_at DESC
LIMIT 100;
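
Assuming the query targets a single `article_id`, the ordering and limit can be pictured as a reverse scan over that article's rows, which are already stored sorted by `posted_at`. The Python sketch below is only a model of that idea; `select_latest` is a hypothetical helper, not a driver or Cassandra API.

```python
# Sketch: answering the SELECT from one article's rows, kept sorted by posted_at.
# select_latest is a hypothetical helper, not a driver or Cassandra API.

def select_latest(partition_rows, limit):
    """partition_rows: rows of a single article_id, sorted by posted_at ascending."""
    result = []
    for row in reversed(partition_rows):  # ORDER BY posted_at DESC
        if len(result) >= limit:          # LIMIT
            break
        result.append((row["posted_at"], row["author"]))
    return result

rows = [
    {"posted_at": 1350499616, "author": "yukim"},
    {"posted_at": 1368499616, "author": "yukim"},
]
print(select_latest(rows, limit=1))
```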


Querying Data
• TTL/WRITETIME
  • You can query the TTL or write time of a column

cqlsh:ks> SELECT WRITETIME(author) FROM comments;

writetime(author)
-------------------
1354146105288000
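
The value above reads more easily as a date. The Python sketch below assumes the write timestamp is in microseconds since the Unix epoch, which is the usual convention but is ultimately up to whichever client wrote the data; `writetime_to_datetime` is a name chosen for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def writetime_to_datetime(writetime_us):
    """Convert a WRITETIME value (assumed microseconds since the epoch) to UTC."""
    # Integer microseconds avoid the rounding a float division could introduce.
    return EPOCH + timedelta(microseconds=writetime_us)

print(writetime_to_datetime(1354146105288000).isoformat())
# 2012-11-28T23:41:45.288000+00:00
```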


Collection support
• Collection
  • Set
    • Unordered, no duplicates
  • List
    • Ordered, allows duplicates
  • Map
    • Keys and associated values


Collection support

CREATE TABLE example (
id uuid PRIMARY KEY,
tags set<text>,
points list<int>,
attributes map<text, text>
);

• Collections are typed, but cannot be nested (no list<list<text>>)
• No secondary index on collections

Collection support

INSERT INTO example (id, tags, points, attributes)
VALUES (
    62c36092-82a1-3a00-93d1-46196ee77204,
    {'foo', 'bar', 'baz'},  // set
    [100, 20, 93],          // list
    {'abc': 'def'}          // map
);


Collection support
• Set
UPDATE example SET tags = tags + {'qux'} WHERE ...
UPDATE example SET tags = tags - {'foo'} WHERE ...

• List
UPDATE example SET points = points + [20, 30] WHERE ...
UPDATE example SET points = points - [100] WHERE ...

• Map
UPDATE example SET attributes['ghi'] = 'jkl' WHERE ...
DELETE attributes['abc'] FROM example WHERE ...
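
To make the effect of these statements concrete, the Python sketch below replays them on plain Python collections, starting from the values inserted earlier. It only models the semantics as an illustration: list subtraction is modeled as removing every occurrence of the listed values.

```python
# Toy replay of the UPDATE/DELETE statements above on plain Python collections.
tags = {"foo", "bar", "baz"}
points = [100, 20, 93]
attributes = {"abc": "def"}

tags = tags | {"qux"}   # SET tags = tags + {'qux'}
tags = tags - {"foo"}   # SET tags = tags - {'foo'}

points = points + [20, 30]                        # SET points = points + [20, 30]
points = [p for p in points if p not in (100,)]   # SET points = points - [100]

attributes["ghi"] = "jkl"    # SET attributes['ghi'] = 'jkl'
attributes.pop("abc", None)  # DELETE attributes['abc']

print(sorted(tags), points, attributes)
```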


Collection support

SELECT tags, points, attributes FROM example;

tags | points | attributes
-----------------+---------------+--------------
{baz, foo, bar} | [100, 20, 93] | {abc: def}

• You cannot retrieve an individual item from a collection


Collection support
• Each element of a collection is stored internally as one Cassandra column
• More on dev blog
• http://www.datastax.com/dev/blog/cql3_collections
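
One way to picture "one column per element" is the Python sketch below. The exact column-name layout is an assumption made for illustration (the slides only state that each element is one column), `collection_columns` is a hypothetical helper, and a plain index stands in for whatever ordering component a list uses internally.

```python
# Hypothetical illustration: one internal column per collection element.

def collection_columns(name, value):
    """Return (column_name, column_value) pairs for a set, list, or map."""
    if isinstance(value, (set, frozenset)):
        # Set: the element goes into the column name; the value is empty.
        return [((name, element), "") for element in sorted(value)]
    if isinstance(value, list):
        # List: an ordering component keeps position (an index stands in here).
        return [((name, index), element) for index, element in enumerate(value)]
    if isinstance(value, dict):
        # Map: the key goes into the column name; the map value is the column value.
        return [((name, key), value[key]) for key in sorted(value)]
    raise TypeError("not a collection")

for column in collection_columns("tags", {"foo", "bar", "baz"}):
    print(column)
```

A layout like this would explain why adding or removing a single element does not require reading the whole collection first.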
