Introduction to Cassandra
Nick Bailey
@nickmbailey

Monday, October 28, 13
Who am I?
©2012 DataStax

What’s DataStax?

On to the good stuff!

Why Cassandra?
Cluster Architecture
Node Architecture
Data Modeling
Wrap up

Why Cassandra?

Time for buzz words!

Big Data!
NoSQL!

Big Data
• Gartner: “...high-volume, high-velocity and high-variety...”
• 2 sides of ‘big data’
  • Analytics
  • Real-time

NoSQL
• A terrible label
• Covers a wide range of DBs
  • Cassandra
  • Redis
  • MongoDB
  • HBase
  • ...

Started by Facebook

Dynamo (Amazon)
+
Bigtable (Google)

Cassandra is great for...
• Massive, linear scaling (e.g. CERN hadron collider, Barracuda Networks)
• Extremely heavy writes (e.g. BlueMountain Capital – financial tick data)
• High availability (e.g. eBay, Eventbrite, Netflix, SoundCloud, HealthCare Anytime, Comcast, GoDaddy, Sony Entertainment Network)

http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html

One size does not fit all
Polyglot persistence

More Resources
• PlanetCassandra.org
• Blog
• 5 minute interviews

Cluster Architecture

Data Distribution
[Figure: token ring with nodes at tokens 0, 25, 50 and 75]
Hash_Function(Partition Key) >> Token

Replication
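
The ring on the Data Distribution slide and the idea behind the Replication slide can be sketched in a few lines of Python. This is an illustration only, not Cassandra's implementation: the node names, the 0-99 token range and the use of MD5 are assumptions made for the example (Cassandra's partitioners use a far larger token range). It shows a partition key hashing to a token, the token landing on the first node at or after it on the ring, and extra replicas going to the next nodes clockwise.

```python
import bisect
import hashlib

# Hypothetical 4-node ring matching the slide: each node owns the tokens
# after the previous node's token, up to and including its own.
RING = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
TOKENS = [token for token, _ in RING]


def token_for(partition_key: str) -> int:
    """Map a partition key to a token in [0, 100) using MD5 (illustrative only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % 100


def replicas_for(token: int, replication_factor: int) -> list:
    """Return the owning node plus the next nodes clockwise around the ring."""
    # First node whose token is >= the key's token; wrap to the start if none.
    start = bisect.bisect_left(TOKENS, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(replication_factor)]


if __name__ == "__main__":
    token = token_for("nickmbailey")
    print(token, replicas_for(token, replication_factor=3))
```

With a replication factor of 3, a key whose token is 10 would be stored on node-b, node-c and node-d in this toy ring.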

Failure Modes

Consistency Level
• Multiple options
  • ONE
  • QUORUM
  • ALL
  • LOCAL_QUORUM
  • ...
• Can be specified per request

Quorum
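
The arithmetic behind QUORUM can be written down as a small sketch. The function names here are hypothetical, chosen for the example. A quorum is a strict majority of the replicas, and a read is guaranteed to overlap the latest successful write when the number of replicas written plus the number of replicas read exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at QUORUM: a strict majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when any set of read replicas must overlap any set of written replicas."""
    return write_acks + read_acks > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(q, reads_see_latest_write(rf, q, q))
```

With a replication factor of 3, QUORUM means 2 replicas, so QUORUM writes combined with QUORUM reads overlap on at least one replica, while ONE plus ONE does not guarantee that.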

Consistency
Write
CL: ONE

Consistency
Read
CL: ONE

Failure Types
• UnavailableException
  • Didn’t even try
• TimedOutException
  • Possible success or failure
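
The difference between the two failure types can be illustrated with a toy coordinator check. The class and function names below are hypothetical stand-ins rather than the driver's API, and the logic is a simplification: if too few replicas are known to be alive, the request is rejected before anything is sent; if it was sent but too few replicas answered in time, the outcome is unknown.

```python
class Unavailable(Exception):
    """Stand-in for UnavailableException: the request was never attempted."""


class TimedOut(Exception):
    """Stand-in for TimedOutException: the outcome is unknown."""


def coordinate(required: int, live_replicas: int, responses: int) -> str:
    """Toy coordinator check for one request at a given consistency level."""
    if live_replicas < required:
        # Not enough replicas are known to be up, so nothing is sent.
        raise Unavailable(f"need {required}, only {live_replicas} alive")
    if responses < required:
        # The request was sent, but too few replicas answered in time.
        raise TimedOut(f"need {required}, got {responses} responses")
    return "ok"
```

A client can safely retry after the first kind of failure, whereas after a timeout a write may or may not have been applied.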

Multi DC

Gossip
• Manages cluster state
  • Nodes up/down
  • Nodes joining/leaving
• Decentralized

Snitch
• Responsible for determining cluster topology
• Tracks node responsiveness
• SimpleSnitch, PropertyFileSnitch, Ec2Snitch, etc.

Node Architecture

Write Path
[Figure: Write → commit log (disk) and Memtable (memory); Memtable → SSTable (disk)]

Read Path
[Figure: Read → Memtable (memory) and SSTables (disk)]
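
The write and read paths in the two figures can be mimicked with a toy in-memory model. This is a sketch under simplifying assumptions, not how Cassandra is implemented: `ToyNode` and its `memtable_limit` parameter are invented for the example, Python lists and dicts stand in for files on disk, and reads simply prefer the newest copy instead of reconciling cells by write timestamp.

```python
class ToyNode:
    """Toy model of one node's storage: commit log, memtable, SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # recent writes held in memory
        self.sstables = []     # immutable "on-disk" tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        # Writes are logged first for durability, then applied in memory.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # A full memtable is written out as a new immutable SSTable.
        self.sstables.append(dict(self.memtable))
        self.memtable = {}
        self.commit_log.clear()

    def read(self, key):
        # Check memory first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None
```

Because a read may have to consult several SSTables, the write path is cheaper than the read path in this model, which matches the "extremely heavy writes" use case mentioned earlier.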

Data Modeling

CQL
Cassandra Query Language

Terminology
• Keyspace
• Table (Column Family)
• Row
• Column
• Partition Key
• Clustering Key (Optional)

For Example:
-- Simple placement with a single replica:
CREATE KEYSPACE packagetracker WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
-- Alternatively, per-data-center replica counts:
CREATE KEYSPACE packagetracker WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2 };
CREATE TABLE events (
    package_id text,
    status_timestamp timestamp,
    location text,
    notes text,
    PRIMARY KEY (package_id, status_timestamp)
);

Constructs

Basic Data Types
• blob
• int
• text
• bigint (long)
• uuid
• etc

More Data Modeling Constructs
• Collections
  • map, set, list
• Time to live (TTL)
• Counters
• Secondary Indexes

Approaching Data Modeling
• Model your queries, not your data
  • Optimize your data model for reads
• Don’t be afraid to denormalize
• You will get it wrong, iterate

An Example:
User Logins

The Query
What are the last 10 locations nickmbailey logged in from?
SELECT time, location FROM logins WHERE user = 'nickmbailey' ORDER BY time DESC LIMIT 10;

• Partition Key: user
• Clustering Key: time
• Additional Columns: location

CREATE TABLE logins (
    user text,
    time timestamp,
    location text,
    PRIMARY KEY (user, time)
);

Partition key: User. Primary key: (User, Time).

User        | Time                | Location
nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
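
One way to picture why this table answers the query cheaply is to model it as a map from partition key to rows keyed by clustering key. The sketch below is an illustration in plain Python, not driver code: the helper names are hypothetical, timestamps are kept as ISO-style strings so that string order matches time order, and rows are sorted at read time, whereas Cassandra keeps them sorted on disk.

```python
# partition key (user) -> {clustering key (time) -> remaining columns}
logins = {}


def insert_login(user, time, location):
    """Upsert one row; the same (user, time) pair overwrites the old row."""
    logins.setdefault(user, {})[time] = {"location": location}


def last_locations(user, limit=10):
    """Emulate: SELECT time, location ... ORDER BY time DESC LIMIT <limit>."""
    partition = logins.get(user, {})
    newest_first = sorted(partition.items(), key=lambda row: row[0], reverse=True)
    return [(time, columns["location"]) for time, columns in newest_first[:limit]]


insert_login("nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")
insert_login("nickmbailey", "2013-07-19 14:49:27", "Blacksburg, Virginia")
insert_login("jsmith", "2013-07-20 07:59:34", "Atlanta, Georgia")

if __name__ == "__main__":
    print(last_locations("nickmbailey"))
```

The query touches a single partition and reads its rows in clustering order, which is the access pattern the table was designed for.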

Time-series data
• By far, the most common data model
• Event logs
• Metrics
• Sensor Data
• Etc

Another Query
When was the last time nickmbailey logged in from San Francisco, California?
SELECT time FROM logins WHERE user = 'nickmbailey' AND location = 'San Francisco, California';

User        | Time                | Location
nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
nickmbailey | 2013-07-19 14:49:27 | Austin, Texas
nickmbailey | 2013-05-19 14:49:27 | Austin, Texas
nickmbailey | 2013-04-19 14:49:27 | San Francisco, California
...         | ...                 | ...
jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia

• location is not part of the primary key of logins, so answering this means scanning every row in nickmbailey’s partition

Another Query
When was the last time nickmbailey logged in from San Francisco, California?
SELECT time FROM logins_by_location WHERE user = 'nickmbailey' AND location = 'San Francisco, California';

CREATE TABLE logins_by_location (
    user text,
    time timestamp,
    location text,
    PRIMARY KEY (user, location)
);

User        | Location                  | Time
nickmbailey | Austin, Texas             | 2013-07-19 09:22:18
nickmbailey | Blacksburg, Virginia      | 2013-07-19 14:49:27
nickmbailey | San Francisco, California | 2013-07-19 14:49:27

Denormalize
• Create materialized views of the same data to support different queries
• Storage space is cheap, Cassandra is fast
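
Denormalizing in this sense means the application writes each login to both tables. A minimal sketch, using plain dictionaries in place of the two tables and hypothetical helper names, and assuming logins are recorded in chronological order so that the latest write per (user, location) is also the latest login:

```python
logins = {}              # (user, time) -> location
logins_by_location = {}  # (user, location) -> time


def record_login(user, time, location):
    """Write the same event to both tables, one per query it must serve."""
    logins[(user, time)] = location
    # Inserting an existing primary key overwrites the row (an upsert),
    # so this table keeps only the most recently written time per location.
    logins_by_location[(user, location)] = time


def last_login_from(user, location):
    """Emulate: SELECT time FROM logins_by_location WHERE user = ? AND location = ?"""
    return logins_by_location.get((user, location))


record_login("nickmbailey", "2013-04-19 14:49:27", "San Francisco, California")
record_login("nickmbailey", "2013-05-19 14:49:27", "Austin, Texas")
record_login("nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")

if __name__ == "__main__":
    print(last_login_from("nickmbailey", "Austin, Texas"))
```

The cost is one extra write per login, and the second query becomes a single-row lookup instead of a scan.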

Debugging your data model
cqlsh> tracing on;
Now tracing requests.
cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

 activity                          | timestamp    | source    | source_elapsed
-----------------------------------+--------------+-----------+----------------
 execute_cql3_query                | 00:02:37,015 | 127.0.0.1 |              0
 Parsing statement                 | 00:02:37,015 | 127.0.0.1 |             81
 Preparing statement               | 00:02:37,015 | 127.0.0.1 |            273
 Determining replicas for mutation | 00:02:37,015 | 127.0.0.1 |            540
 Sending message to /127.0.0.2     | 00:02:37,015 | 127.0.0.1 |            779
 Messsage received from /127.0.0.1 | 00:02:37,016 | 127.0.0.2 |             63
 Applying mutation                 | 00:02:37,016 | 127.0.0.2 |            220
 Acquiring switchLock              | 00:02:37,016 | 127.0.0.2 |            250
 Appending to commitlog            | 00:02:37,016 | 127.0.0.2 |            277
 Adding to memtable                | 00:02:37,016 | 127.0.0.2 |            378
 Enqueuing response to /127.0.0.1  | 00:02:37,016 | 127.0.0.2 |            710
 Sending message to /127.0.0.1     | 00:02:37,016 | 127.0.0.2 |            888

A note on Transactions
• In general, you want to construct your data model so that you don’t need them
• The latest version of Cassandra (2.0) has ‘Compare and swap’
  • An implementation of Paxos
  • ...IF NOT EXISTS;
  • ...IF column1 = 'value';

Try it out

CCM
• CCM - Cassandra Cluster Manager
  • https://github.com/pcmanus/ccm
• Warning: not lightweight
• Example:
  • ccm create test -v 2.0.1
  • ccm populate -n 3
  • ccm start

Clients
• Cqlsh
  • Bundled with Cassandra
• Drivers
  • java: https://github.com/datastax/java-driver
  • python: https://github.com/datastax/python-driver
  • .net: https://github.com/datastax/csharp-driver
  • and more: http://www.datastax.com/download/clientdrivers

Get Help
• IRC: #cassandra on freenode
• Mailing Lists
• Stack Overflow
• DataStax Docs
  • http://www.datastax.com/docs

Questions?
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Introduction to Cassandra Basics

  • 1. Introduction to Cassandra Nick Bailey @nickmbailey Monday, October 28, 13
  • 2. Who am I? ©2012 DataStax
  • 4. On to the good stuff!
  • 5. Why Cassandra? • Cluster Architecture • Node Architecture • Data Modeling • Wrap up
  • 7. Time for buzz words! • Big Data! • NoSQL!
  • 8. Big Data • Gartner: “...high-volume, high-velocity and high-variety...” • 2 sides of ‘big data’: Analytics, Real-time
  • 9. NoSQL • A terrible label • Covers a wide range of DBs: Cassandra, Redis, MongoDB, HBase, ...
  • 10. Started by Facebook
  • 11. Dynamo (Amazon) + Big Table (Google)
  • 13. Cassandra is great for... • Massive, linear scaling (e.g. CERN hadron collider, Barracuda Networks) • Extremely heavy writes (e.g. BlueMountain Capital – financial tick data) • High availability (e.g. eBay, Eventbrite, Netflix, SoundCloud, HealthCare Anytime, Comcast, GoDaddy, Sony Entertainment Network)
  • 17. One size does not fit all • Polyglot persistence
  • 18. More Resources • PlanetCassandra.org • Blog • 5 minute interviews
  • 20. Data Distribution • Hash_Function(Partition Key) >> Token • [Diagram: token ring with positions 0, 25, 50, 75]
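
The Data Distribution slide maps a partition key to a token and a token to a position on the ring. A minimal Python sketch of that idea follows. It assumes a toy token space of 0 to 99 with nodes at 0, 25, 50 and 75 as on the slide, uses MD5 purely as a stand-in hash, and assumes each node owns the range that ends at its own token. The names `token_for` and `owner_of` are hypothetical and not part of any Cassandra API.

```python
import bisect
import hashlib

# Toy ring matching the slide: four node tokens on a 0..99 token space.
# Real Cassandra token ranges are far larger; this is illustration only.
RING_SIZE = 100
NODE_TOKENS = [0, 25, 50, 75]


def token_for(partition_key: str) -> int:
    """Hash a partition key onto the toy ring (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner_of(token: int) -> int:
    """Return the token of the node that owns `token`.

    Assumed convention: a node owns the range ending at its own token,
    so we pick the first node token >= the key's token and wrap around
    to the first node when no node token is that large.
    """
    index = bisect.bisect_left(NODE_TOKENS, token)
    return NODE_TOKENS[index % len(NODE_TOKENS)]


if __name__ == "__main__":
    token = token_for("nickmbailey")
    print("token:", token, "owned by node at token", owner_of(token))
```

Because the owner depends only on the hash of the partition key, any node can work out where a row lives without consulting a central directory.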
  • 23. Consistency Level • Multiple options: ONE, QUORUM, ALL, LOCAL_QUORUM, ... • Can be specified per request
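
To make the consistency levels concrete, here is a rough Python sketch of how many replicas must respond at each level for a given replication factor. It assumes the usual majority rule for QUORUM (floor(RF / 2) + 1) and leaves out LOCAL_QUORUM, which depends on the replication factor of the local datacenter. The function name `required_acks` is hypothetical.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Number of replicas that must respond before a request succeeds."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        # Majority of replicas: floor(RF / 2) + 1.
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


if __name__ == "__main__":
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, required_acks(level, replication_factor=3))
```

With a replication factor of 3, QUORUM needs 2 responses, so a request at that level can still succeed with one replica down, while ALL cannot.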
  • 28. Failure Types • UnavailableException: Didn’t even try • TimedOutException: Possible success or failure
  • 30. Gossip • Manages cluster state: Nodes up/down, Nodes joining/leaving • Decentralized
  • 31. Snitch • Responsible for determining cluster topology • Tracks node responsiveness • Simple, PropertyFile, Ec2Snitch, etc...
  • 33. Write Path • [Diagram: Write → commit log (Disk) and Memtable (Memory); Memtable → SSTable (Disk)]
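
The write path on this slide can be sketched in a few lines of Python. This is a simplified, single-process illustration and not how Cassandra is implemented: the class `NodeStorage`, its `flush_threshold` parameter, and the use of in-memory lists to stand in for on-disk files are all assumptions made for the example.

```python
class NodeStorage:
    """Toy model of one node's write path: commit log, memtable, SSTables."""

    def __init__(self, flush_threshold: int = 2):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in-memory view of recent writes
        self.sstables = []     # stands in for immutable sorted files on disk
        self.flush_threshold = flush_threshold  # hypothetical flush trigger

    def write(self, key, value):
        # 1. Append to the commit log so the write survives a crash.
        self.commit_log.append((key, value))
        # 2. Apply the write to the memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable is "full" (here: a simple entry count).
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out as a sorted, immutable table.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        # The flushed data no longer needs its log entries; this sketch
        # simply clears the whole log.
        self.commit_log.clear()


if __name__ == "__main__":
    node = NodeStorage(flush_threshold=2)
    node.write("b", 1)
    node.write("a", 2)
    print(node.sstables)
```

The point of the design is that a write only needs a sequential append plus an in-memory update, which is one reason the deck lists heavy write workloads as a good fit.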
  • 36. CQL • Cassandra Query Language
  • 37. Terminology • Keyspace • Table (Column Family) • Row • Column • Partition Key • Clustering Key (Optional)
  • 38. For Example:
        CREATE KEYSPACE packagetracker WITH REPLICATION =
          { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

        CREATE KEYSPACE packagetracker WITH REPLICATION =
          { 'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2 };

        CREATE TABLE events (
          package_id text,
          status_timestamp timestamp,
          location text,
          notes text,
          PRIMARY KEY (package_id, status_timestamp)
        );
  • 40. Basic Data Types • blob • int • text • long • uuid • etc
  • 41. More Data Modeling Constructs • Collections: map, set, list • Time to live (TTL) • Counters • Secondary Indexes
  • 42. Approaching Data Modeling • Model your queries, not your data • Optimize your data model for reads • Don’t be afraid to denormalize • You will get it wrong, iterate
  • 43. An Example: User Logins
  • 44. The Query • What are the last 10 locations nickmbailey logged in from?
        SELECT time, location FROM logins
        WHERE user = 'nickmbailey'
        ORDER BY time DESC LIMIT 10;
  • 45. The Query (build of slide 44) • Partition Key: user
  • 46. The Query (build of slide 44) • Partition Key: user • Clustering Key: time
  • 47. The Query (build of slide 44) • Partition Key: user • Clustering Key: time • Additional Columns: location
  • 48. The Query (build of slide 47)
        CREATE TABLE logins (
          user text,
          time timestamp,
          location text,
          PRIMARY KEY (user, time));
  • 49. The Query (build of slide 48) • Partition key: user • Primary key: (user, time)
        User        | Time                | Location
        ------------+---------------------+---------------------
        nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
        nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
        jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
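
One way to picture why this primary key suits the query: the partition key (`user`) selects one partition, and rows inside it are kept ordered by the clustering key (`time`), so "the last 10 logins" is a short read from one end of a sorted run. The Python sketch below models that with a dict of sorted lists. It is an illustration under those assumptions, not Cassandra's storage engine, and the function names are hypothetical.

```python
import bisect
from collections import defaultdict

# partition key (user) -> rows sorted by clustering key (time)
logins = defaultdict(list)


def insert_login(user: str, time: str, location: str) -> None:
    """Insert a row, keeping each partition sorted by time."""
    rows = logins[user]
    times = [t for t, _ in rows]
    i = bisect.bisect_left(times, time)
    if i < len(rows) and rows[i][0] == time:
        rows[i] = (time, location)   # same primary key: overwrite the row
    else:
        rows.insert(i, (time, location))


def last_locations(user: str, limit: int = 10):
    """Equivalent of ORDER BY time DESC LIMIT n within one partition."""
    rows = logins.get(user, [])
    return list(reversed(rows))[:limit]


if __name__ == "__main__":
    insert_login("nickmbailey", "2013-07-19 14:49:27", "Blacksburg, Virginia")
    insert_login("nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")
    insert_login("jsmith", "2013-07-20 07:59:34", "Atlanta, Georgia")
    print(last_locations("nickmbailey", limit=10))
```

Timestamps are plain strings here because the `YYYY-MM-DD HH:MM:SS` format sorts correctly as text.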
  • 50. Time-series data • By far, the most common data model • Event logs • Metrics • Sensor Data • Etc
  • 51. Another Query • When was the last time nickmbailey logged in from San Francisco, California?
        SELECT time FROM logins
        WHERE user = 'nickmbailey' AND location = 'San Francisco, California';

        User        | Time                | Location
        ------------+---------------------+--------------------------
        nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
        nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
        nickmbailey | 2013-07-19 14:49:27 | Austin, Texas
        nickmbailey | 2013-05-19 14:49:27 | Austin, Texas
        nickmbailey | 2013-04-19 14:49:27 | San Francisco, California
        ...         | ...                 | ...
        jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
  • 52. Another Query • When was the last time nickmbailey logged in from San Francisco, California?
        SELECT time FROM logins_by_location
        WHERE user = 'nickmbailey' AND location = 'San Francisco, California';

        CREATE TABLE logins_by_location (
          user text,
          time timestamp,
          location text,
          PRIMARY KEY (user, location));
  • 53. Another Query (build of slide 52)
        User        | Location                  | Time
        ------------+---------------------------+--------------------
        nickmbailey | Austin, Texas             | 2013-07-19 09:22:18
        nickmbailey | Blacksburg, Virginia      | 2013-07-19 14:49:27
        nickmbailey | San Francisco, California | 2013-07-19 14:49:27
  • 54. Denormalize • Create materialized views of the same data to support different queries • Storage space is cheap, Cassandra is fast
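
The denormalization advice amounts to writing each login twice, once per query shape. The Python sketch below mimics the two tables from the earlier slides with plain dicts keyed by their primary keys. It is only an illustration: the helper names are hypothetical, and it assumes logins are recorded in chronological order so that the latest write per (user, location) is also the most recent login.

```python
# Keyed like PRIMARY KEY (user, time): full login history.
logins = {}
# Keyed like PRIMARY KEY (user, location): one row per place, overwritten
# on every login, so it holds the time of the latest write.
logins_by_location = {}


def record_login(user: str, time: str, location: str) -> None:
    """Write the same event to both tables, one per supported query."""
    logins[(user, time)] = location
    logins_by_location[(user, location)] = time


def last_login_from(user: str, location: str):
    """Single-key lookup instead of scanning the user's whole history."""
    return logins_by_location.get((user, location))


if __name__ == "__main__":
    record_login("nickmbailey", "2013-04-19 14:49:27", "San Francisco, California")
    record_login("nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")
    print(last_login_from("nickmbailey", "San Francisco, California"))
```

The cost is extra storage and an extra write per event, which is the trade the slide argues is worth making.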
  • 55. Debugging your data model
        cqlsh> tracing on;
        Now tracing requests.
        cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
        Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

        activity                          | timestamp    | source    | source_elapsed
        ----------------------------------+--------------+-----------+---------------
        execute_cql3_query                | 00:02:37,015 | 127.0.0.1 |              0
        Parsing statement                 | 00:02:37,015 | 127.0.0.1 |             81
        Preparing statement               | 00:02:37,015 | 127.0.0.1 |            273
        Determining replicas for mutation | 00:02:37,015 | 127.0.0.1 |            540
        Sending message to /127.0.0.2     | 00:02:37,015 | 127.0.0.1 |            779
        Messsage received from /127.0.0.1 | 00:02:37,016 | 127.0.0.2 |             63
        Applying mutation                 | 00:02:37,016 | 127.0.0.2 |            220
        Acquiring switchLock              | 00:02:37,016 | 127.0.0.2 |            250
        Appending to commitlog            | 00:02:37,016 | 127.0.0.2 |            277
        Adding to memtable                | 00:02:37,016 | 127.0.0.2 |            378
        Enqueuing response to /127.0.0.1  | 00:02:37,016 | 127.0.0.2 |            710
        Sending message to /127.0.0.1     | 00:02:37,016 | 127.0.0.2 |            888
  • 56. A note on Transactions • In general, you want to construct your data model around them • The latest version of Cassandra has ‘Compare and swap’ • An implementation of Paxos • ...IF NOT EXISTS; • ...IF column1 = 'value';
  • 57. Try it out
  • 58. CCM • CCM - Cassandra Cluster Manager • https://github.com/pcmanus/ccm • Example:
        ccm create test -v 2.0.1
        ccm populate -n 3
        ccm start
    • Warning: not lightweight
  • 59. Clients • Cqlsh: Bundled with Cassandra • Drivers • java: https://github.com/datastax/java-driver • python: https://github.com/datastax/python-driver • .net: https://github.com/datastax/csharp-driver • and more: http://www.datastax.com/download/clientdrivers
  • 60. Get Help • IRC: #cassandra on freenode • Mailing Lists • Stack Overflow • DataStax Docs: http://www.datastax.com/docs