CQL3 in depth
Cassandra Conference in Tokyo, 11/29/2012


Yuki Morishita
Software Engineer@DataStax / Apache Cassandra Committer


©2012 DataStax
Agenda!
   • Why CQL3?
   • CQL3 walkthrough
            • Defining Schema
            • Querying / Mutating Data
            • New features
   • Related topics
            • Native transport


Why CQL3?
Cassandra Storage
                 create column family profiles
                 with key_validation_class = UTF8Type
                 and comparator = UTF8Type
                 and column_metadata = [
                    {column_name: first_name, validation_class: UTF8Type},
                    {column_name: last_name, validation_class: UTF8Type},
                    {column_name: year, validation_class: IntegerType}
                 ];




                      row key          columns        values are validated by validation_class


                                nobu     first_name   Nobunaga
                                                                                 columns are sorted
                                         last_name    Oda
                                                                                 in comparator order
                                         year         1582
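
A minimal sketch of the storage model pictured above, assuming a row is simply a map from column name to value that is kept in comparator order. `StorageRowSketch` and its methods are hypothetical names for illustration, not Cassandra classes, and a `TreeMap` with natural `String` ordering stands in for the UTF8Type comparator (the two agree for ASCII names like these).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of one column family: row key -> columns sorted by name.
class StorageRowSketch {
    // The inner TreeMap keeps column names sorted, standing in for the comparator.
    private final Map<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Store one column value under a row key, creating the row if needed.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Column names of a row, in comparator (here: lexical) order.
    List<String> columnNames(String rowKey) {
        TreeMap<String, String> columns = rows.get(rowKey);
        return columns == null ? new ArrayList<>() : new ArrayList<>(columns.keySet());
    }

    public static void main(String[] args) {
        StorageRowSketch profiles = new StorageRowSketch();
        // Insertion order does not matter; the columns come back sorted.
        profiles.put("nobu", "year", "1582");
        profiles.put("nobu", "last_name", "Oda");
        profiles.put("nobu", "first_name", "Nobunaga");
        System.out.println(profiles.columnNames("nobu"));
    }
}
```

The sketch leaves out validation: in the real column family each value would also be checked against the column's validation_class.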




Thrift API
   • Low level: get, get_slice, mutate...
   • Directly exposes internal storage
     structure
   • Hard to change the API signatures




Inserting data with Thrift
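   // Assumes imports from java.nio, java.nio.charset, java.util and
   // org.apache.cassandra.thrift, plus an open Cassandra.Client `client`
   // and a ConsistencyLevel `consistencyLevel` defined elsewhere.

   // One column: name, value, and a client-supplied timestamp
   // (microseconds since the epoch by convention).
   Column col = new Column(ByteBuffer.wrap("name".getBytes(StandardCharsets.UTF_8)));
   col.setValue(ByteBuffer.wrap("value".getBytes(StandardCharsets.UTF_8)));
   col.setTimestamp(System.currentTimeMillis() * 1000);

   // Wrap the column in a Mutation.
   ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
   cosc.setColumn(col);

   Mutation mutation = new Mutation();
   mutation.setColumn_or_supercolumn(cosc);

   List<Mutation> mutations = new ArrayList<Mutation>();
   mutations.add(mutation);

   // Column family name -> mutations.
   Map<String, List<Mutation>> cf = new HashMap<String, List<Mutation>>();
   cf.put("Standard1", mutations);

   // Row key -> (column family name -> mutations).
   Map<ByteBuffer, Map<String, List<Mutation>>> records
                        = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
   records.put(ByteBuffer.wrap("key".getBytes(StandardCharsets.UTF_8)), cf);

   client.batch_mutate(records, consistencyLevel);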




... with Cassandra Query Language


                 INSERT INTO "Standard1" (key, name)
                 VALUES ('key', 'value');




    • Introduced in 0.8 (CQL), updated in
      1.0 (CQL2)
    • Syntax similar to SQL
    • More extensible than Thrift API
CQL2 Problems
   • Almost a one-to-one mapping to the Thrift
     API, so it does not compose with the
     row-oriented parts of SQL
   • No support for CompositeType




CQL3
   • Maps storage to a more natural
     rows-and-columns representation using
     CompositeType
            • Wide rows are “transposed” and unpacked
              into named columns
   • Beta in 1.1, default in 1.2
   • New features
            • Collection support

CQL3 walkthrough
Defining Keyspace
   • Syntax has changed since CQL2

     CREATE KEYSPACE my_keyspace WITH replication = {
         'class': 'SimpleStrategy',
         'replication_factor': 2
     };




Defining Static Column Family
   • “Strict” schema definition (and that’s a
     good thing)
            • You cannot add columns arbitrarily
            • You need ALTER TABLE ... ADD column
              first
   • Columns are defined and sorted using
     CompositeType comparator
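
The “strict” schema rule above can be pictured with a small sketch: a write that names an undeclared column is rejected until the column has been added. `StrictSchemaSketch`, `addColumn` and `validateWrite` are hypothetical names standing in for ALTER TABLE ... ADD and the server-side check; this is an illustration under those assumptions, not Cassandra's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a CQL3 table definition; not a Cassandra class.
class StrictSchemaSketch {
    // Declared columns: name -> CQL type name.
    private final Map<String, String> columns = new LinkedHashMap<>();

    // Plays the role of ALTER TABLE ... ADD <column> <type>.
    void addColumn(String name, String type) {
        columns.put(name, type);
    }

    // Plays the role of the check performed on INSERT/UPDATE:
    // writing to a column that was never declared is an error.
    void validateWrite(String column) {
        if (!columns.containsKey(column)) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
    }

    public static void main(String[] args) {
        StrictSchemaSketch profiles = new StrictSchemaSketch();
        profiles.addColumn("first_name", "text");
        profiles.validateWrite("first_name");      // accepted
        try {
            profiles.validateWrite("email");       // rejected: not declared yet
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        profiles.addColumn("email", "text");       // ALTER TABLE ... ADD email text
        profiles.validateWrite("email");           // accepted now
    }
}
```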


Defining Static Column Family

   CREATE TABLE profiles (
     user_id text PRIMARY KEY,
     first_name text,
     last_name text,
     year int
   );

    user_id | first_name | last_name | year
   ---------+------------+-----------+------
       nobu |   Nobunaga |       Oda | 1582




                            CompositeType(UTF8Type)
                 user_id                        values are validated by type definition


                     nobu         :

                                  first_name:    Nobunaga
                                                                           columns are sorted
                                  last_name:     Oda
                                                                           in comparator order
                                  year:          1582

Defining Dynamic Column Family
   • Then how can we add columns
     dynamically to our time series data, as
     we did before?
            • Use a compound key




Compound key
                        CREATE TABLE comments (
                            article_id uuid,
                            posted_at timestamp,
                            author text,
                            content text,
                            PRIMARY KEY (article_id, posted_at)
                        )

           CompositeType(DateType, UTF8Type)


    article_id                     values are validated by type definition

   550e8400-..       1350499616:

                     1350499616:author              yukim
                                                                            columns are sorted
                     1350499616:content             blah, blah, blah        in comparator order,
                                                                            first by date, and then
                     1368499616:                                            column name
                     1368499616:author              yukim

                     1368499616:content             well, well, well
                                              ...
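
The layout in the diagram above can be sketched as follows: each CQL3 row of one article_id becomes several cells whose names combine posted_at with the CQL column name, and the cells sort first by date and then by column name. `CompoundKeySketch`, `CellName` and the `date:column` rendering are hypothetical, chosen only to mirror the diagram; the empty-named cell corresponds to the `1350499616:` entry shown there.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of one partition (one article_id) of the comments table.
class CompoundKeySketch {
    // Composite cell name: (posted_at, CQL column name).
    static final class CellName {
        final long postedAt;
        final String column;

        CellName(long postedAt, String column) {
            this.postedAt = postedAt;
            this.column = column;
        }

        @Override
        public String toString() {
            return postedAt + ":" + column;
        }
    }

    // Sort first by date, then by column name, as the diagram describes.
    static final Comparator<CellName> COMPOSITE = (a, b) -> {
        int byDate = Long.compare(a.postedAt, b.postedAt);
        return byDate != 0 ? byDate : a.column.compareTo(b.column);
    };

    // Composite cell name -> value, kept in comparator order.
    private final TreeMap<CellName, String> cells = new TreeMap<>(COMPOSITE);

    // One CQL3 row turns into a marker cell plus one cell per non-key column.
    void insert(long postedAt, String author, String content) {
        cells.put(new CellName(postedAt, ""), "");
        cells.put(new CellName(postedAt, "author"), author);
        cells.put(new CellName(postedAt, "content"), content);
    }

    // Cell names in stored order, rendered as "date:column".
    List<String> cellNames() {
        List<String> names = new ArrayList<>();
        for (CellName name : cells.keySet()) {
            names.add(name.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        CompoundKeySketch partition = new CompoundKeySketch();
        partition.insert(1368499616L, "yukim", "well, well, well");
        partition.insert(1350499616L, "yukim", "blah, blah, blah");
        for (String name : partition.cellNames()) {
            System.out.println(name);
        }
    }
}
```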
Compound key

cqlsh:ks> SELECT * FROM comments;

 article_id   | posted_at                | author | content
--------------+--------------------------+--------+------------------
 550e8400-... | 1970-01-17 00:08:19+0900 |  yukim | blah, blah, blah
 550e8400-... | 1970-01-17 05:08:19+0900 |  yukim | well, well, well

cqlsh:ks> SELECT * FROM comments WHERE posted_at >= '1970-01-17 05:08:19+0900';


 article_id   | posted_at                | author | content
--------------+--------------------------+--------+------------------
 550e8400-... | 1970-01-17 05:08:19+0900 |  yukim | well, well, well
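
Because cells within one article_id are stored sorted by posted_at, a range condition such as the one above amounts to a slice of an ordered structure. A minimal sketch, assuming a single partition modelled as a `TreeMap` keyed by posted_at in epoch milliseconds; `ClusteringRangeSketch` and `since` are hypothetical names, not driver or server APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of one article_id: posted_at (epoch millis) -> content.
class ClusteringRangeSketch {
    private final TreeMap<Long, String> byPostedAt = new TreeMap<>();

    void insert(long postedAt, String content) {
        byPostedAt.put(postedAt, content);
    }

    // Counterpart of WHERE posted_at >= from: an inclusive tail slice,
    // returned in ascending posted_at order.
    List<String> since(long from) {
        return new ArrayList<>(byPostedAt.tailMap(from, true).values());
    }

    public static void main(String[] args) {
        ClusteringRangeSketch comments = new ClusteringRangeSketch();
        comments.insert(1350499616L, "blah, blah, blah");
        comments.insert(1368499616L, "well, well, well");
        System.out.println(comments.since(1368499616L));
    }
}
```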




Changes worth noting
   • Identifiers (keyspace, table, and column
     names) are case-insensitive by default
            • Use double quotes (") to preserve case
   • Compaction settings are now a map
            CREATE TABLE test (
                 ...
            ) WITH COMPACTION = {
                 'class': 'SizeTieredCompactionStrategy',
                 'min_threshold': 2,
                 'max_threshold': 4
            };
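
The identifier rule on this slide can be sketched as a small normalization function: unquoted identifiers are folded to lower case, while double-quoted identifiers keep their case. `IdentifierSketch.normalize` is a hypothetical helper for illustration and handles only the simple cases shown here.

```java
import java.util.Locale;

// Hypothetical helper illustrating CQL3 identifier case handling.
class IdentifierSketch {
    // Returns the name as it would be resolved: lower-cased when unquoted,
    // unchanged (minus the surrounding quotes) when double-quoted.
    static String normalize(String identifier) {
        boolean quoted = identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"");
        if (quoted) {
            return identifier.substring(1, identifier.length() - 1);
        }
        return identifier.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Standard1"));     // unquoted: case is folded
        System.out.println(normalize("\"Standard1\"")); // quoted: case is kept
    }
}
```

Under this rule, `Profiles`, `PROFILES` and `profiles` all name the same table, while `"Profiles"` names a different one.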
Changes worth noting
   • system.schema_*
            • All schema information is stored in the system
              keyspace
                 • schema_keyspaces, schema_columnfamilies,
                   schema_columns
            • The system tables themselves use CQL3 schemas
   • CQL3 schemas are not visible through
     cassandra-cli’s ‘describe’ command.
            • Use cqlsh’s ‘describe columnfamily’ instead
More on CQL3 schema
   • Thrift to CQL3 migration
            • http://www.datastax.com/dev/blog/thrift-to-cql3

   • For better understanding
            • http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
            • http://www.datastax.com/dev/blog/cql3-evolutions
            • http://www.datastax.com/dev/blog/cql3-for-cassandra-experts




Mutating Data


                 INSERT INTO example (id, name) VALUES (...)

                 UPDATE example SET f = 'foo' WHERE ...

                 DELETE FROM example WHERE ...




   • No more USING CONSISTENCY
            • The consistency level setting has moved to the
              protocol level
Batch Mutate
                 BEGIN BATCH
                     INSERT INTO aaa (id, col) VALUES (...)
                     UPDATE bbb SET col1 = 'val1' WHERE ...
                     ...
                 APPLY BATCH;


   • Batches are atomic by default as of 1.2
            • This does not mean mutations are isolated
              (mutations within a row are isolated as of 1.1)
            • Some performance penalty because of the batch
              log process
Batch Mutate
   • Use a non-atomic batch if you need
     performance rather than atomicity
                 BEGIN UNLOGGED BATCH
                     ...
                 APPLY BATCH;



   • More on dev blog
            • http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2




Querying Data

                 SELECT article_id, posted_at, author
                 FROM comments
                 WHERE
                   article_id = ...
                 ORDER BY posted_at DESC
                 LIMIT 100;




Querying Data
   • TTL/WRITETIME
            • You can query TTL or write time of the column.

                   cqlsh:ks> SELECT WRITETIME(author) FROM comments;

                    writetime(author)
                   -------------------
                     1354146105288000
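
The value above reads more easily as a date. A minimal sketch, assuming the write timestamp is in microseconds since the Unix epoch (the usual convention when the server or cqlsh assigns it); `WriteTimeSketch.toInstant` is a hypothetical helper, not part of any driver.

```java
import java.time.Instant;

// Hypothetical helper: converts a WRITETIME value to a java.time.Instant.
class WriteTimeSketch {
    // Assumes the timestamp is in microseconds since the Unix epoch.
    static Instant toInstant(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        // The value from the cqlsh output above.
        System.out.println(toInstant(1354146105288000L)); // 2012-11-28T23:41:45.288Z
    }
}
```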




Collection support
   • Collection
            • Set
                 • Unordered, no duplicates
            • List
                 • Ordered, allows duplicates
            • Map
                 • Keys and associated values




Collection support

                 CREATE TABLE example (
                    id uuid PRIMARY KEY,
                    tags set<text>,
                    points list<int>,
                    attributes map<text, text>
                 );




   • Collections are typed, but cannot be
     nested (no list<list<text>>)
   • No secondary index on collections
Collection support

           INSERT INTO example (id, tags, points, attributes)
           VALUES (
               62c36092-82a1-3a00-93d1-46196ee77204,
               {'foo', 'bar', 'baz'},  // set
               [100, 20, 93],          // list
               {'abc': 'def'}          // map
           );




Collection support
   • Set
    UPDATE example SET tags = tags + {'qux'} WHERE ...
    UPDATE example SET tags = tags - {'foo'} WHERE ...


   • List
    UPDATE example SET points = points + [20, 30] WHERE ...
    UPDATE example SET points = points - [100] WHERE ...


   • Map
    UPDATE example SET attributes['ghi'] = 'jkl' WHERE ...
    DELETE attributes['abc'] FROM example WHERE ...
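
For readers more at home with in-memory collections, the updates above behave much like the following `java.util` operations, applied to the values inserted earlier. This is only an analogy under that assumption: `CollectionUpdateSketch` is a hypothetical class, and the sketch says nothing about how the server executes these statements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory analogy for the CQL3 collection updates.
class CollectionUpdateSketch {
    static Set<String> tags() {
        Set<String> tags = new TreeSet<>(Arrays.asList("foo", "bar", "baz"));
        tags.addAll(Arrays.asList("qux"));    // tags = tags + {'qux'}
        tags.removeAll(Arrays.asList("foo")); // tags = tags - {'foo'}
        return tags;
    }

    static List<Integer> points() {
        List<Integer> points = new ArrayList<>(Arrays.asList(100, 20, 93));
        points.addAll(Arrays.asList(20, 30));  // points = points + [20, 30] appends
        points.removeAll(Arrays.asList(100));  // points = points - [100] drops every 100
        return points;
    }

    static Map<String, String> attributes() {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("abc", "def");
        attributes.put("ghi", "jkl"); // attributes['ghi'] = 'jkl'
        attributes.remove("abc");     // DELETE attributes['abc']
        return attributes;
    }

    public static void main(String[] args) {
        System.out.println(tags());
        System.out.println(points());
        System.out.println(attributes());
    }
}
```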




Collection support

           SELECT tags, points, attributes FROM example;

            tags            | points        | attributes
           -----------------+---------------+--------------
            {baz, foo, bar} | [100, 20, 93] | {abc: def}




   • You cannot retrieve items in a collection
     individually

Collection support
   • Each element in a collection is stored
     internally as one Cassandra column
   • More on dev blog
            • http://www.datastax.com/dev/blog/cql3_collections
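
A rough sketch of the “one column per element” idea, under stated assumptions: the cell name combines the CQL column name with an element key, a set element lives in the name with an empty value, a map entry puts its key in the name and its value in the cell, and a list element is keyed by an ordering id. The `column:key` rendering, the class `CollectionCellsSketch`, and the counter that stands in for the list's ordering id are all hypothetical simplifications.

```java
import java.util.TreeMap;

// Hypothetical model of the cells behind one CQL3 row that has collections.
class CollectionCellsSketch {
    // Cell name ("column:elementKey") -> cell value, sorted by name.
    final TreeMap<String, String> cells = new TreeMap<>();

    // Stand-in for the ordering id a real list element would carry.
    private int listPosition = 0;

    // Set: the element itself is part of the cell name; the value is empty.
    void addToSet(String column, String element) {
        cells.put(column + ":" + element, "");
    }

    // List: an ordering id is part of the cell name; the value is the element.
    void appendToList(String column, int element) {
        cells.put(String.format("%s:%04d", column, listPosition++), Integer.toString(element));
    }

    // Map: the key is part of the cell name; the value is the map value.
    void putInMap(String column, String key, String value) {
        cells.put(column + ":" + key, value);
    }

    public static void main(String[] args) {
        CollectionCellsSketch row = new CollectionCellsSketch();
        row.addToSet("tags", "foo");
        row.addToSet("tags", "bar");
        row.appendToList("points", 100);
        row.appendToList("points", 20);
        row.putInMap("attributes", "abc", "def");
        row.cells.forEach((name, value) -> System.out.println(name + " => " + value));
    }
}
```

One cell per element is what lets a single element be added or removed without rewriting the whole collection.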




Related topics
Native Transport
   • CQL3 still goes through Thrift’s
     execute_cql3_query API
   • Native transport support introduces
     Cassandra’s own binary protocol
            • Async IO, server event push, ...
            • http://www.datastax.com/dev/blog/binary-protocol

   • Try the DataStax Java native driver with
     C* 1.2 beta today!
            • https://github.com/datastax/java-driver

Questions?

                 Or contact me later if you have one
                         yuki@datastax.com
                         yukim (IRC, twitter)

                 Now hiring talented engineers from all over the world!




©2012 DataStax
                                                                                33

Contenu connexe

Tendances

PGDAY FR 2014 : presentation de Postgresql chez leboncoin.fr
PGDAY FR 2014 : presentation de Postgresql chez leboncoin.frPGDAY FR 2014 : presentation de Postgresql chez leboncoin.fr
PGDAY FR 2014 : presentation de Postgresql chez leboncoin.frjlb666
 
PostgreSQL Materialized Views with Active Record
PostgreSQL Materialized Views with Active RecordPostgreSQL Materialized Views with Active Record
PostgreSQL Materialized Views with Active RecordDavid Roberts
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedis Labs
 
Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Databricks
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDatabricks
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Databricks
 
Database Anti Patterns
Database Anti PatternsDatabase Anti Patterns
Database Anti PatternsRobert Treat
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practicelarsgeorge
 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesWebscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesJonathan Katz
 
GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™Databricks
 
Let's Learn to Talk to GC Logs in Java 9
Let's Learn to Talk to GC Logs in Java 9Let's Learn to Talk to GC Logs in Java 9
Let's Learn to Talk to GC Logs in Java 9Poonam Bajaj Parhar
 
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018Amazon Web Services
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
 
PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...
PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...
PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...Edureka!
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkDatabricks
 
Hive join optimizations
Hive join optimizationsHive join optimizations
Hive join optimizationsSzehon Ho
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Roopa Tangirala
 

Tendances (20)

PGDAY FR 2014 : presentation de Postgresql chez leboncoin.fr
PGDAY FR 2014 : presentation de Postgresql chez leboncoin.frPGDAY FR 2014 : presentation de Postgresql chez leboncoin.fr
PGDAY FR 2014 : presentation de Postgresql chez leboncoin.fr
 
PostgreSQL Materialized Views with Active Record
PostgreSQL Materialized Views with Active RecordPostgreSQL Materialized Views with Active Record
PostgreSQL Materialized Views with Active Record
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
 
Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...
 
Deep Dive - DynamoDB
Deep Dive - DynamoDBDeep Dive - DynamoDB
Deep Dive - DynamoDB
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things Right
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0
 
Database Anti Patterns
Database Anti PatternsDatabase Anti Patterns
Database Anti Patterns
 
HBase in Practice
HBase in PracticeHBase in Practice
HBase in Practice
 
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesWebscale PostgreSQL - JSONB and Horizontal Scaling Strategies
Webscale PostgreSQL - JSONB and Horizontal Scaling Strategies
 
GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™GraphFrames: DataFrame-based graphs for Apache® Spark™
GraphFrames: DataFrame-based graphs for Apache® Spark™
 
Let's Learn to Talk to GC Logs in Java 9
Let's Learn to Talk to GC Logs in Java 9Let's Learn to Talk to GC Logs in Java 9
Let's Learn to Talk to GC Logs in Java 9
 
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Catalyst optimizer
Catalyst optimizerCatalyst optimizer
Catalyst optimizer
 
PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...
PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...
PySpark Programming | PySpark Concepts with Hands-On | PySpark Training | Edu...
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
 
Hive join optimizations
Hive join optimizationsHive join optimizations
Hive join optimizations
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup)
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
 

En vedette

Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1Johnny Miller
 
Data Modeling with Cassandra Column Families
Data Modeling with Cassandra Column FamiliesData Modeling with Cassandra Column Families
Data Modeling with Cassandra Column Familiesgdusbabek
 
How to find Zero day vulnerabilities
How to find Zero day vulnerabilitiesHow to find Zero day vulnerabilities
How to find Zero day vulnerabilitiesMohammed A. Imran
 
Building a distributed Key-Value store with Cassandra
Building a distributed Key-Value store with CassandraBuilding a distributed Key-Value store with Cassandra
Building a distributed Key-Value store with Cassandraaaronmorton
 
NoSQL Database- cassandra column Base DB
NoSQL Database- cassandra column Base DBNoSQL Database- cassandra column Base DB
NoSQL Database- cassandra column Base DBsadegh salehi
 
Advanced excel 2010 & 2013 updated Terrabiz
Advanced excel 2010 & 2013 updated TerrabizAdvanced excel 2010 & 2013 updated Terrabiz
Advanced excel 2010 & 2013 updated TerrabizAhmed Yasir Khan
 
Sql queries with answers
Sql queries with answersSql queries with answers
Sql queries with answersvijaybusu
 

En vedette (8)

Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1Cassandra 2.0 to 2.1
Cassandra 2.0 to 2.1
 
Data Modeling with Cassandra Column Families
Data Modeling with Cassandra Column FamiliesData Modeling with Cassandra Column Families
Data Modeling with Cassandra Column Families
 
How to find Zero day vulnerabilities
How to find Zero day vulnerabilitiesHow to find Zero day vulnerabilities
How to find Zero day vulnerabilities
 
Building a distributed Key-Value store with Cassandra
Building a distributed Key-Value store with CassandraBuilding a distributed Key-Value store with Cassandra
Building a distributed Key-Value store with Cassandra
 
NoSQL Database- cassandra column Base DB
NoSQL Database- cassandra column Base DBNoSQL Database- cassandra column Base DB
NoSQL Database- cassandra column Base DB
 
CQL Under the Hood
CQL Under the HoodCQL Under the Hood
CQL Under the Hood
 
Advanced excel 2010 & 2013 updated Terrabiz
Advanced excel 2010 & 2013 updated TerrabizAdvanced excel 2010 & 2013 updated Terrabiz
Advanced excel 2010 & 2013 updated Terrabiz
 
Sql queries with answers
Sql queries with answersSql queries with answers
Sql queries with answers
 

Similaire à CQL3 in depth

Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...Cloudera, Inc.
 
Kafka meetup - kafka connect
Kafka meetup -  kafka connectKafka meetup -  kafka connect
Kafka meetup - kafka connectYi Zhang
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cqlzznate
 
Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013Prosanta Ghosh
 
dbs class 7.ppt
dbs class 7.pptdbs class 7.ppt
dbs class 7.pptMARasheed3
 
Tk2323 lecture 7 sql
Tk2323 lecture 7   sql Tk2323 lecture 7   sql
Tk2323 lecture 7 sql MengChun Lam
 
Dbms oracle
Dbms oracle Dbms oracle
Dbms oracle Abrar ali
 
Oracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesOracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesKellyn Pot'Vin-Gorman
 
Creating and Managing Tables -Oracle Data base
Creating and Managing Tables -Oracle Data base Creating and Managing Tables -Oracle Data base
Creating and Managing Tables -Oracle Data base Salman Memon
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentationMichael Keane
 
Introduction to SQL, SQL*Plus
Introduction to SQL, SQL*PlusIntroduction to SQL, SQL*Plus
Introduction to SQL, SQL*PlusChhom Karath
 
08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadatarehaniltifat
 
Apache MetaModel - unified access to all your data points
Apache MetaModel - unified access to all your data pointsApache MetaModel - unified access to all your data points
Apache MetaModel - unified access to all your data pointsKasper Sørensen
 

Similaire à CQL3 in depth (20)

Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
Friction-free ETL: Automating data transformation with Impala | Strata + Hado...
 
Kafka meetup - kafka connect
Kafka meetup -  kafka connectKafka meetup -  kafka connect
Kafka meetup - kafka connect
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cql
 
PL/SQL 3 DML
PL/SQL 3 DMLPL/SQL 3 DML
PL/SQL 3 DML
 
Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013
 
dbs class 7.ppt
dbs class 7.pptdbs class 7.ppt
dbs class 7.ppt
 
DBMS Chapter-3.ppsx
DBMS Chapter-3.ppsxDBMS Chapter-3.ppsx
DBMS Chapter-3.ppsx
 
Tk2323 lecture 7 sql
Tk2323 lecture 7   sql Tk2323 lecture 7   sql
Tk2323 lecture 7 sql
 
plsql Les09
 plsql Les09 plsql Les09
plsql Les09
 
Dbms oracle
Dbms oracle Dbms oracle
Dbms oracle
 
Oracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesOracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the Indices
 
Creating and Managing Tables -Oracle Data base
Creating and Managing Tables -Oracle Data base Creating and Managing Tables -Oracle Data base
Creating and Managing Tables -Oracle Data base
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentation
 
Les09
Les09Les09
Les09
 
Sql
SqlSql
Sql
 
Les09.ppt
Les09.pptLes09.ppt
Les09.ppt
 
Introduction to SQL, SQL*Plus
Introduction to SQL, SQL*PlusIntroduction to SQL, SQL*Plus
Introduction to SQL, SQL*Plus
 
08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata
 
Apache MetaModel - unified access to all your data points
Apache MetaModel - unified access to all your data pointsApache MetaModel - unified access to all your data points
Apache MetaModel - unified access to all your data points
 

Plus de Yuki Morishita

DataStax EnterpriseでApache Tinkerpop入門
DataStax EnterpriseでApache Tinkerpop入門DataStax EnterpriseでApache Tinkerpop入門
DataStax EnterpriseでApache Tinkerpop入門Yuki Morishita
 
Apache tinkerpopとグラフデータベースの世界
Apache tinkerpopとグラフデータベースの世界Apache tinkerpopとグラフデータベースの世界
Apache tinkerpopとグラフデータベースの世界Yuki Morishita
 
DataStax Enterpriseによる大規模グラフ解析
DataStax Enterpriseによる大規模グラフ解析DataStax Enterpriseによる大規模グラフ解析
DataStax Enterpriseによる大規模グラフ解析Yuki Morishita
 
サンプルアプリケーションで学ぶApache Cassandraを使ったJavaアプリケーションの作り方
サンプルアプリケーションで学ぶApache Cassandraを使ったJavaアプリケーションの作り方サンプルアプリケーションで学ぶApache Cassandraを使ったJavaアプリケーションの作り方
サンプルアプリケーションで学ぶApache Cassandraを使ったJavaアプリケーションの作り方Yuki Morishita
 
サンプルで学ぶCassandraアプリケーションの作り方
サンプルで学ぶCassandraアプリケーションの作り方サンプルで学ぶCassandraアプリケーションの作り方
サンプルで学ぶCassandraアプリケーションの作り方Yuki Morishita
 
RDB開発者のためのApache Cassandra データモデリング入門
RDB開発者のためのApache Cassandra データモデリング入門RDB開発者のためのApache Cassandra データモデリング入門
RDB開発者のためのApache Cassandra データモデリング入門Yuki Morishita
 
分散グラフデータベース DataStax Enterprise Graph
CQL3 in depth

  • 1. CQL3 in depth
       Cassandra Conference in Tokyo, 11/29/2012
       Yuki Morishita
       Software Engineer @ DataStax / Apache Cassandra Committer
       ©2012 DataStax
  • 2. Agenda!
       • Why CQL3?
       • CQL3 walkthrough
            • Defining Schema
            • Querying / Mutating Data
            • New features
       • Related topics
            • Native transport
  • 4. Cassandra Storage
       create column family profiles
       with key_validation_class = UTF8Type
       and comparator = UTF8Type
       and column_metadata = [
          {column_name: first_name, validation_class: UTF8Type},
          {column_name: last_name, validation_class: UTF8Type},
          {column_name: year, validation_class: IntegerType}
       ];
       row key | columns (sorted in comparator order) | values (validated by validation_class)
       nobu    | first_name                           | Nobunaga
               | last_name                            | Oda
               | year                                 | 1582
  • 5. Thrift API
       • Low level: get, get_slice, mutate...
       • Directly exposes internal storage structure
       • Hard to change the signature of API
  • 6. Inserting data with Thrift
       // Build one column: name, value and timestamp
       Column col = new Column(ByteBuffer.wrap("name".getBytes()));
       col.setValue(ByteBuffer.wrap("value".getBytes()));
       col.setTimestamp(System.currentTimeMillis());
       // Wrap the column in a mutation
       ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
       cosc.setColumn(col);
       Mutation mutation = new Mutation();
       mutation.setColumn_or_supercolumn(cosc);
       List<Mutation> mutations = new ArrayList<Mutation>();
       mutations.add(mutation);
       // Column family name -> mutations
       Map<String, List<Mutation>> cf = new HashMap<String, List<Mutation>>();
       cf.put("Standard1", mutations);
       // Row key -> column family -> mutations
       Map<ByteBuffer, Map<String, List<Mutation>>> records =
           new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
       records.put(ByteBuffer.wrap("key".getBytes()), cf);
       client.batch_mutate(records, consistencyLevel);
  • 7. ... with Cassandra Query Language
       INSERT INTO "Standard1" (key, name) VALUES ('key', 'value');
       • Introduced in 0.8 (CQL), updated in 1.0 (CQL2)
       • Syntax similar to SQL
       • More extensible than the Thrift API
  • 8. CQL2 Problems
       • Almost a 1-to-1 mapping to the Thrift API, so it does not compose with the row-oriented parts of SQL
       • No support for CompositeType
  • 9. CQL3
       • Maps storage to a more natural rows-and-columns representation using CompositeType
            • Wide rows are "transposed" and unpacked into named columns
       • Beta in 1.1, default in 1.2
       • New features
            • Collection support
  • 11. Defining Keyspace
       • Syntax is changed from CQL2
       CREATE KEYSPACE my_keyspace WITH replication = {
           'class': 'SimpleStrategy',
           'replication_factor': 2
       };
  • 12. Defining Static Column Family
       • "Strict" schema definition (and that is a good thing)
            • You cannot add columns arbitrarily
            • You need ALTER TABLE ... ADD column first
       • Columns are defined and sorted using the CompositeType comparator
  • 13. Defining Static Column Family
       CREATE TABLE profiles (
           user_id text PRIMARY KEY,
           first_name text,
           last_name text,
           year int
       )
        user_id | first_name | last_name | year
       ---------+------------+-----------+------
           nobu |   Nobunaga |       Oda | 1582
       Internal storage, comparator CompositeType(UTF8Type):
       row key (user_id) | columns (sorted in comparator order) | values (validated by type definition)
       nobu              | :                                    |
                         | first_name:                          | Nobunaga
                         | last_name:                           | Oda
                         | year:                                | 1582
  • 14. Defining Dynamic Column Family
       • Then, how can we add columns dynamically to our time series data like we did before?
       • Use compound key
  • 15. Compound key
       CREATE TABLE comments (
           article_id uuid,
           posted_at timestamp,
           author text,
           content text,
           PRIMARY KEY (article_id, posted_at)
       )
       Internal storage, comparator CompositeType(DateType, UTF8Type):
       row key (article_id) | columns            | values (validated by type definition)
       550e8400-..          | 1350499616:        |
                            | 1350499616:author  | yukim
                            | 1350499616:content | blah, blah, blah
                            | 1368499616:        |
                            | 1368499616:author  | yukim
                            | 1368499616:content | well, well, well
                            | ...                |
       Columns are sorted in comparator order, first by date, and then by column name.
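       The transposition on this slide can be sketched in a few lines of Python. This is a simplified, hypothetical model of the diagram only, not Cassandra's storage code: the function name transpose, the Cell alias and the dict-based rows are assumptions made for illustration, and the empty-named marker cell simply mirrors the "1350499616:" entry in the diagram.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified model of the slide's diagram; not Cassandra's code.
# A cell is ((posted_at, column_name), value).
Cell = Tuple[Tuple[int, str], str]


def transpose(rows: List[dict]) -> Dict[str, List[Cell]]:
    """Group CQL rows by partition key and flatten each one into sorted cells."""
    partitions: Dict[str, List[Cell]] = {}
    for row in rows:
        cells = partitions.setdefault(row["article_id"], [])
        # Marker cell with an empty column name, like "1350499616:" on the slide.
        cells.append(((row["posted_at"], ""), ""))
        for name in ("author", "content"):
            cells.append(((row["posted_at"], name), row[name]))
    for cells in partitions.values():
        # Comparator order: first by date, then by column name.
        cells.sort(key=lambda cell: cell[0])
    return partitions


rows = [
    {"article_id": "550e8400-..", "posted_at": 1368499616,
     "author": "yukim", "content": "well, well, well"},
    {"article_id": "550e8400-..", "posted_at": 1350499616,
     "author": "yukim", "content": "blah, blah, blah"},
]
storage = transpose(rows)
for (posted_at, name), value in storage["550e8400-.."]:
    print(f"{posted_at}:{name}", value)
```

       Both CQL rows end up in one internal row keyed by article_id, which is why a range over posted_at inside a single article is a contiguous slice of cells.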
  • 16. Compound key
       cqlsh:ks> SELECT * FROM comments;
        article_id   | posted_at                | author | content
       --------------+--------------------------+--------+------------------
        550e8400-... | 1970-01-17 00:08:19+0900 | yukim  | blah, blah, blah
        550e8400-... | 1970-01-17 05:08:19+0900 | yukim  | well, well, well
       cqlsh:ks> SELECT * FROM comments WHERE posted_at >= '1970-01-17 05:08:19+0900';
        article_id   | posted_at                | author | content
       --------------+--------------------------+--------+------------------
        550e8400-... | 1970-01-17 05:08:19+0900 | yukim  | well, well, well
  • 17. Changes worth noting
       • Identifiers (keyspace/table/column names) are case-insensitive by default
            • Use double quotes (") to preserve case
       • Compaction settings are now a map
       CREATE TABLE test (
           ...
       ) WITH COMPACTION = {
           'class': 'SizeTieredCompactionStrategy',
           'min_threshold': 2,
           'max_threshold': 4
       };
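       The identifier rule can be illustrated with a small Python sketch. The helper name normalize_identifier is hypothetical and the code only models the rule stated on the slide (unquoted names are folded to lower case, double-quoted names keep their case); it is not a CQL parser.

```python
def normalize_identifier(identifier: str) -> str:
    """Return the name a CQL3 identifier refers to, as written in a query."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Quoted: case is preserved; a doubled quote stands for one quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted: folded to lower case, so Standard1 and STANDARD1 are the same name.
    return identifier.lower()


print(normalize_identifier("Standard1"))
print(normalize_identifier('"Standard1"'))
```

       Under this rule a table created as "Standard1" (quoted) has to be quoted in every later statement, while one created as Standard1 can be written in any case.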
  • 18. Changes worth noting
       • system.schema_*
            • All schema information is stored in the system keyspace
            • schema_keyspaces, schema_columnfamilies, schema_columns
            • The system tables themselves use CQL3 schemas
       • CQL3 schemas are not visible through cassandra-cli's 'describe' command
            • Use cqlsh's 'describe columnfamily' instead
  • 19. More on CQL3 schema
       • Thrift to CQL3 migration
            • http://www.datastax.com/dev/blog/thrift-to-cql3
       • For better understanding
            • http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
            • http://www.datastax.com/dev/blog/cql3-evolutions
            • http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
  • 20. Mutating Data
       INSERT INTO example (id, name) VALUES (...)
       UPDATE example SET f = 'foo' WHERE ...
       DELETE FROM example WHERE ...
       • No more USING CONSISTENCY
            • Consistency level setting is moved to protocol level
  • 21. Batch Mutate
       BEGIN BATCH
           INSERT INTO aaa (id, col) VALUES (...)
           UPDATE bbb SET col1 = 'val1' WHERE ...
           ...
       APPLY BATCH;
       • Batches are atomic by default as of 1.2
            • This does not mean mutations are isolated (mutations within a row have been isolated since 1.1)
            • There is some performance penalty because of the batch log process
  • 22. Batch Mutate
       • Use a non-atomic batch if you need performance rather than atomicity
       BEGIN UNLOGGED BATCH
           ...
       APPLY BATCH;
       • More on dev blog
            • http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2
  • 23. Querying Data
       SELECT article_id, posted_at, author
       FROM comments
       WHERE article_id = '...'
       ORDER BY posted_at DESC
       LIMIT 100;
  • 24. Querying Data
       • TTL/WRITETIME
            • You can query the TTL or write time of a column.
       cqlsh:ks> SELECT WRITETIME(author) FROM comments;
        writetime(author)
       -------------------
         1354146105288000
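       The value returned by WRITETIME is easier to read once converted. The Python sketch below assumes the timestamp is in microseconds since the Unix epoch, which is the usual Cassandra convention; since clients may supply their own timestamps, treat that as an assumption. The helper name writetime_to_datetime is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def writetime_to_datetime(writetime_micros: int) -> datetime:
    """Convert a WRITETIME value (assumed microseconds since epoch) to UTC."""
    # Integer timedelta arithmetic avoids floating-point rounding.
    return EPOCH + timedelta(microseconds=writetime_micros)


written_at = writetime_to_datetime(1354146105288000)
print(written_at.isoformat())  # 2012-11-28T23:41:45.288000+00:00
```

       Shifted to Japan Standard Time (+09:00), the value from the slide falls on the morning of 2012-11-29, the day of the talk.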
  • 25. Collection support
       • Collection
            • Set
                 • Unordered, no duplicates
            • List
                 • Ordered, allows duplicates
            • Map
                 • Keys and associated values
  • 26. Collection support
       CREATE TABLE example (
           id uuid PRIMARY KEY,
           tags set<text>,
           points list<int>,
           attributes map<text, text>
       );
       • Collections are typed, but cannot be nested (no list<list<text>>)
       • No secondary index on collections
  • 27. Collection support
       INSERT INTO example (id, tags, points, attributes)
       VALUES (
           '62c36092-82a1-3a00-93d1-46196ee77204',
           {'foo', 'bar', 'baz'}, // set
           [100, 20, 93],         // list
           {'abc': 'def'}         // map
       );
  • 28. Collection support
       • Set
            UPDATE example SET tags = tags + {'qux'} WHERE ...
            UPDATE example SET tags = tags - {'foo'} WHERE ...
       • List
            UPDATE example SET points = points + [20, 30] WHERE ...
            UPDATE example SET points = points - [100] WHERE ...
       • Map
            UPDATE example SET attributes['ghi'] = 'jkl' WHERE ...
            DELETE attributes['abc'] FROM example WHERE ...
  • 29. Collection support
       SELECT tags, points, attributes FROM example;
        tags            | points        | attributes
       -----------------+---------------+--------------
        {baz, foo, bar} | [100, 20, 93] | {abc: def}
       • You cannot retrieve individual items from a collection
  • 30. Collection support
       • Each element in a collection is internally stored as one Cassandra column
       • More on dev blog
            • http://www.datastax.com/dev/blog/cql3_collections
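       To make "one Cassandra column per element" concrete, here is a rough Python sketch. It is an illustrative model under stated assumptions, not the server's actual encoding: the function collection_cells and the Cell alias are hypothetical, set elements and map keys are placed in the cell name, and a plain integer position stands in for whatever ordering key the server really uses for lists.

```python
from typing import List, Tuple

# Hypothetical model: a cell is ((collection_column, element_key), value).
Cell = Tuple[Tuple[str, object], object]


def collection_cells(column: str, value) -> List[Cell]:
    """Flatten one collection value into one cell per element."""
    if isinstance(value, (set, frozenset)):
        # Set: the element is part of the cell name, the cell value is empty.
        return [((column, element), "") for element in sorted(value)]
    if isinstance(value, list):
        # List: an integer position stands in for the real ordering key.
        return [((column, position), element)
                for position, element in enumerate(value)]
    if isinstance(value, dict):
        # Map: the key is part of the cell name, the cell value is the map value.
        return [((column, key), value[key]) for key in sorted(value)]
    raise TypeError(f"not a collection: {type(value).__name__}")


row = {
    "tags": {"foo", "bar", "baz"},
    "points": [100, 20, 93],
    "attributes": {"abc": "def"},
}
cells = [cell for column, value in row.items()
         for cell in collection_cells(column, value)]
for name, value in cells:
    print(name, repr(value))
```

       With this layout, adding or removing one element touches a single cell, so an update such as tags = tags + {'qux'} does not need to read the existing collection first.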
  • 32. Native Transport
       • CQL3 still goes through Thrift's execute_cql3_query API
       • Native Transport support introduces Cassandra's own binary protocol
            • Async IO, server event push, ...
            • http://www.datastax.com/dev/blog/binary-protocol
       • Try the DataStax Java native driver with C* 1.2 beta today!
            • https://github.com/datastax/java-driver
  • 33. Questions?
       Or contact me later if you have one:
       yuki@datastax.com
       yukim (IRC, Twitter)
       Now hiring talented engineers from all over the world!