SlideShare une entreprise Scribd logo
1  sur  40
Télécharger pour lire hors ligne
©2013 DataStax Confidential. Do not distribute without consent.
@PatrickMcFadin
Patrick McFadin

Chief Evangelist
Data Modeling Time Series
1
Internet Of Things
• 15B devices by 2015
• 40B devices by 2020!
Why Cassandra for Time Series
Scales
Resilient
Good data model
Efficient Storage Model
What about that?
Example 1: Weather Station
• Weather station collects data
• Cassandra stores in sequence
• Application reads in sequence
Use case
• Store data per weather station
• Store time series in order: first to last
• Get all data for one weather station
• Get data for a single date and time
• Get data for a range of dates and times
Needed Queries
Data Model to support queries
Data Model
• Weather Station Id and Time
are unique
• Store as many as needed
CREATE TABLE temperature (
weatherstation_id text,
event_time timestamp,
temperature text,
PRIMARY KEY (weatherstation_id,event_time)
);
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:01:00','72F');
!
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:02:00','73F');
!
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:03:00','73F');
!
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:04:00','74F');
Storage Model - Logical View
2013-04-03 07:01:00
72F
2013-04-03 07:02:00
73F
2013-04-03 07:03:00
73F
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';
1234ABCD
1234ABCD
1234ABCD
weatherstation_id event_time temperature
2013-04-03 07:04:00
74F
1234ABCD
Storage Model - Disk Layout
2013-04-03 07:01:00
72F
2013-04-03 07:02:00
73F
2013-04-03 07:03:00
73F
1234ABCD
2013-04-03 07:04:00
74F
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';
Merged, Sorted and Stored Sequentially
2013-04-03 07:05:00!
!
74F
2013-04-03 07:06:00!
!
75F
Query patterns
• Range queries
• “Slice” operation on disk
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD'
AND event_time >= '2013-04-03 07:01:00'
AND event_time <= '2013-04-03 07:04:00';
2013-04-03 07:01:00
72F
2013-04-03 07:02:00
73F
2013-04-03 07:03:00
73F
1234ABCD
2013-04-03 07:04:00
74F
2013-04-03 07:05:00!
!
74F
2013-04-03 07:06:00!
!
75F
Single seek on disk
Query patterns
• Range queries
• “Slice” operation on disk
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD'
AND event_time >= '2013-04-03 07:01:00'
AND event_time <= '2013-04-03 07:04:00';
2013-04-03 07:01:00
72F
2013-04-03 07:02:00
73F
2013-04-03 07:03:00
73F
1234ABCD
2013-04-03 07:04:00
74F
weatherstation_id event_time temperature
1234ABCD
1234ABCD
1234ABCD
Programmers like this
Sorted by event_time
Additional help on the storage engine
SSTable seeks
• Each read minimum
1 seek
• Cache and bloom
filter help minimize
Total seek time = Disk Latency * number of seeks
The key to speed
Use the first part of the primary key to get the node
(data localization)
Minimize seeks for SStables
(Key Cache, Bloom Filter)
Find the data fast in the SSTable
(Indexes)
Min/Max Value Hint
• New since 2.0
• Range index on primary key values per SSTable
• Minimizes seeks on range data
CASSANDRA-5514 if you are interested in details
SELECT temperature
FROM event_time,temperature
WHERE weatherstation_id='1234ABCD'
AND event_time > '2013-04-03 07:01:00'
AND event_time < '2013-04-03 07:04:00';
Row Key: 1234ABCD
Min event_time: 2013-04-01 00:00:00
Max event_time: 2013-04-04 23:59:59
Row Key: 1234ABCD
Min event_time: 2013-04-05 00:00:00
Max event_time: 2013-04-09 23:59:59
Row Key: 1234ABCD
Min event_time: 2013-03-27 00:00:00
Max event_time: 2013-03-31 23:59:59
?
This one
Ingestion models
• Apache Kafka
• Apache Flume
• Storm
• Custom Applications
Apache Kafka
Your totally!
killer!
application
Kafka + Storm
• Kafka provides reliable queuing
• Storm processes (rollups, counts)
• Cassandra stores at the same speed
• Storm lookup on Cassandra
Apache Kafka
Apache Storm
Queue Process Store
Flume
• Source accepts data
• Channel buffers data
• Sink processes and stores
• Popular for log processing
Sink
Channel
Source
Application
Load
Balancer
Syslog
Dealing with data at speed
• 1 million writes per second?
• 1 insert every microsecond
• Collisions?
• Primary Key determines node
placement
• Random partitioning
• Special data type - TimeUUID
Your totally!
killer!
application weatherstation_id='1234ABCD'
weatherstation_id='5678EFGH'
How does data replicate?
Primary key determines placement*
Partitioning
jim age: 36 car: camaro gender: M
carol age: 37 car: subaru gender: F
johnny age:12 gender: M
suzy age:10 gender: F
jim
carol
johnny
suzy
PK
5e02739678...
a9a0198010...
f4eb27cea7...
78b421309e...
MD5 Hash
MD5* hash
operation yields
a 128-bit
number for keys
of any size.
Key Hashing
Node A
Node D Node C
Node B
The Token Ring
jim 5e02739678...
carol a9a0198010...
johnny f4eb27cea7...
suzy 78b421309e...
Start End
A 0xc000000000..1 0x0000000000..0
B 0x0000000000..1 0x4000000000..0
C 0x4000000000..1 0x8000000000..0
D 0x8000000000..1 0xc000000000..0
jim 5e02739678...
carol a9a0198010...
johnny f4eb27cea7...
suzy 78b421309e...
Start End
A 0xc000000000..1 0x0000000000..0
B 0x0000000000..1 0x4000000000..0
C 0x4000000000..1 0x8000000000..0
D 0x8000000000..1 0xc000000000..0
jim 5e02739678...
carol a9a0198010...
johnny f4eb27cea7...
suzy 78b421309e...
Start End
A 0xc000000000..1 0x0000000000..0
B 0x0000000000..1 0x4000000000..0
C 0x4000000000..1 0x8000000000..0
D 0x8000000000..1 0xc000000000..0
jim 5e02739678...
carol a9a0198010...
johnny f4eb27cea7...
suzy 78b421309e...
Start End
A 0xc000000000..1 0x0000000000..0
B 0x0000000000..1 0x4000000000..0
C 0x4000000000..1 0x8000000000..0
D 0x8000000000..1 0xc000000000..0
jim 5e02739678...
carol a9a0198010...
johnny f4eb27cea7...
suzy 78b421309e...
Start End
A 0xc000000000..1 0x0000000000..0
B 0x0000000000..1 0x4000000000..0
C 0x4000000000..1 0x8000000000..0
D 0x8000000000..1 0xc000000000..0
Node A
Node D Node C
Node B
carol a9a0198010...
Replication
Node A
Node D Node C
Node B
carol a9a0198010...
Replication
Node A
Node D Node C
Node B
carol a9a0198010...
Replication
Replication factor = 3
Consistency is a
different topic for
later
TimeUUID
• Also known as a Version 1 UUID
• Sortable
• Reversible
Timestamp to Microsecond + UUID = TimeUUID
04d580b0-9412-11e3-baa8-0800200c9a66 Wednesday, February 12, 2014 6:18:06 PM GMT
http://www.famkruithof.net/uuid/uuidgen
=
Example 2: Financial Transactions
• Trading of stocks
• When did they happen?
• Massive speeds and volumes
“Sirca, a non-profit university consortium based in Sydney, is the world’s biggest broker of
financial data, ingesting into its database 2million pieces of information a second from every
major trading exchange.”*
* http://www.theage.com.au/it-pro/business-it/help-poverty-theres-an-app-for-that-20140120-hv948.html
Use case
• Store data per symbol and date
• Store time series in reverse order: last to first
• Make sure every transaction is unique
• Get all trades for symbol and day
• Get trade for a single date and time
• Get last 10 trades for symbol and date
Needed Queries
Data Model to support queries
Data Model
• date is int of days since epoch
• timeuuid keeps it unique
• Reverse the times for later
queries
CREATE TABLE stock_ticks (
symbol text,
date int,
trade timeuuid,
trade_details text,
PRIMARY KEY ((symbol, date), trade)
) WITH CLUSTERING ORDER BY (trade DESC);
INSERT INTO stock_ticks(symbol, date, trade, trade_details)
VALUES (‘NFLX’,340,04d580b0-1431-1e33-baf8-0833200c98a6,'BUY:2000');
!
INSERT INTO stock_ticks(symbol, date, trade, trade_details)
VALUES (‘NFLX’,340,05d580b0-6472-1ef3-a3a8-0430200c9a66,'BUY:300');
!
INSERT INTO stock_ticks(symbol, date, trade, trade_details)
VALUES (‘NFLX’,340,02d580b0-9412-d223-55a8-0976200c9a25,'SELL:450');
!
INSERT INTO stock_ticks(symbol, date, trade, trade_details)
VALUES (‘NFLX’,340,08d580b0-4482-11e3-5fd3-3421200c9a65,'SELL:3000');
Storage Model - Logical View
08d580b0-4482-11e3-5fd3-
3421200c9a65
SELL:3000
02d580b0-9412-
d223-55a8-0976200c9a25
SELL:450
05d580b0-6472-1ef3-
a3a8-0430200c9a66
BUY:300
SELECT trade,trade_details
FROM stock_ticks
WHERE symbol =‘NFLX’ AND date=‘340’;
NFLX:340
NFLX:340
NFLX:340
symbol:date trade trade_details
04d580b0-1431-1e33-
baf8-0833200c98a6
BUY:2000
NFLX:340
04d580b0-1431-1e33-
baf8-0833200c98a6
05d580b0-6472-1ef3-
a3a8-0430200c9a66
02d580b0-9412-d223-55a8
BUY:2000BUY:300
08d580b0-4482-11e3-5fd3-
3421200c9a65
SELL:3000 SELL:450
Storage Model - Disk Layout
NFLX:340
Order is from last trade to first
SELECT trade,trade_details
FROM stock_ticks
WHERE symbol =‘NFLX’ AND date=‘340’;
04d580b0-1431-1e33-
baf8-0833200c98a6
05d580b0-6472-1ef3-
a3a8-0430200c9a66
02d580b0-9412-
d223-55a8-0976200c9a25
Query patterns
• Limit queries
• Get last X trades
From here
SELECT trade,trade_details
FROM stock_ticks
WHERE symbol =‘NFLX’ AND date=‘340’
LIMIT 3;
BUY:2000BUY:300
08d580b0-4482-11e3-5fd3-
3421200c9a65
SELL:3000 SELL:450
NFLX:340
to here
Query patterns
Reverse sorted by trade
Last 3 trades
08d580b0-4482-11e3-5fd3-
3421200c9a65
SELL:3000
02d580b0-9412-
d223-55a8-0976200c9a25
SELL:450
05d580b0-6472-1ef3-
a3a8-0430200c9a66
BUY:300
NFLX:340
NFLX:340
NFLX:340
symbol:date trade trade_details
• Limit queries
• Get last X trades
SELECT trade,trade_details
FROM stock_ticks
WHERE symbol =‘NFLX’ AND date=‘340’
LIMIT 3;
Way more examples
• 5 minute interviews
• Use cases
• Free training!
!
www.planetcassandra.org
Thank You!
Follow me for more updates all the time: @PatrickMcFadin

Contenu connexe

Tendances

Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced previewPatrick McFadin
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Markus Klems
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015StampedeCon
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized ViewsCarl Yeksigian
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraDataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraDataStax Academy
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesPatrick McFadin
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data modelPatrick McFadin
 
Cassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerCassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerDataStax
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Eric Evans
 
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...StampedeCon
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014jbellis
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflixnkorla1share
 
Datastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basicsDatastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basicsDuyhai Doan
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1jbellis
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseDataStax Academy
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015jbellis
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraPatrick McFadin
 

Tendances (20)

Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced preview
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015Cassandra 3.0 - JSON at scale - StampedeCon 2015
Cassandra 3.0 - JSON at scale - StampedeCon 2015
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized Views
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseries
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data model
 
Cassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerCassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super Modeler
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflix
 
Datastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basicsDatastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basics
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandra
 

Similaire à Cassandra Time Series Data Modeling for IoT and Financial Use Cases

Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionPatrick McFadin
 
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...it-people
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2DataStax Academy
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Databasejavier ramirez
 
Extra performance out of thin air
Extra performance out of thin airExtra performance out of thin air
Extra performance out of thin airKonstantine Krutiy
 
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101DataStax Academy
 
Cassandra Day London 2015: Data Modeling 101
Cassandra Day London 2015: Data Modeling 101Cassandra Day London 2015: Data Modeling 101
Cassandra Day London 2015: Data Modeling 101DataStax Academy
 
Cassandra Day Atlanta 2015: Data Modeling 101
Cassandra Day Atlanta 2015: Data Modeling 101Cassandra Day Atlanta 2015: Data Modeling 101
Cassandra Day Atlanta 2015: Data Modeling 101DataStax Academy
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databasejavier ramirez
 
Data Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkData Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkChristopher Batey
 
Cassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on FireCassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on FireDataStax
 
Introduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraIntroduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraDataStax Academy
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015Patrick McFadin
 
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...Databricks
 
Speedment & Sencha at Oracle Open World 2015
Speedment & Sencha at Oracle Open World 2015Speedment & Sencha at Oracle Open World 2015
Speedment & Sencha at Oracle Open World 2015Speedment, Inc.
 
Managing your Black Friday Logs NDC Oslo
Managing your  Black Friday Logs NDC OsloManaging your  Black Friday Logs NDC Oslo
Managing your Black Friday Logs NDC OsloDavid Pilato
 
Managing your black friday logs - Code Europe
Managing your black friday logs - Code EuropeManaging your black friday logs - Code Europe
Managing your black friday logs - Code EuropeDavid Pilato
 

Similaire à Cassandra Time Series Data Modeling for IoT and Financial Use Cases (20)

Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long version
 
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
Jonathan Ellis "Apache Cassandra 2.0 and 2.1". Выступление на Cassandra conf ...
 
1 Dundee - Cassandra 101
1 Dundee - Cassandra 1011 Dundee - Cassandra 101
1 Dundee - Cassandra 101
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Database
 
Extra performance out of thin air
Extra performance out of thin airExtra performance out of thin air
Extra performance out of thin air
 
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
Cassandra Day Chicago 2015: Apache Cassandra Data Modeling 101
 
Cassandra Day London 2015: Data Modeling 101
Cassandra Day London 2015: Data Modeling 101Cassandra Day London 2015: Data Modeling 101
Cassandra Day London 2015: Data Modeling 101
 
Cassandra Day Atlanta 2015: Data Modeling 101
Cassandra Day Atlanta 2015: Data Modeling 101Cassandra Day Atlanta 2015: Data Modeling 101
Cassandra Day Atlanta 2015: Data Modeling 101
 
MySQL vs. MonetDB
MySQL vs. MonetDBMySQL vs. MonetDB
MySQL vs. MonetDB
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series database
 
Data Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkData Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and Spark
 
Cassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on FireCassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on Fire
 
Introduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for CassandraIntroduction to Dating Modeling for Cassandra
Introduction to Dating Modeling for Cassandra
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015
 
WSO2 Complex Event Processor
WSO2 Complex Event ProcessorWSO2 Complex Event Processor
WSO2 Complex Event Processor
 
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
Analyzing 2TB of Raw Trace Data from a Manufacturing Process: A First Use Cas...
 
Speedment & Sencha at Oracle Open World 2015
Speedment & Sencha at Oracle Open World 2015Speedment & Sencha at Oracle Open World 2015
Speedment & Sencha at Oracle Open World 2015
 
Managing your Black Friday Logs NDC Oslo
Managing your  Black Friday Logs NDC OsloManaging your  Black Friday Logs NDC Oslo
Managing your Black Friday Logs NDC Oslo
 
Managing your black friday logs - Code Europe
Managing your black friday logs - Code EuropeManaging your black friday logs - Code Europe
Managing your black friday logs - Code Europe
 

Plus de DataStax Academy

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftDataStax Academy
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseDataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraDataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsDataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingDataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackDataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache CassandraDataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready CassandraDataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonDataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First ClusterDataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with DseDataStax Academy
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and DriversDataStax Academy
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph DatabasesDataStax Academy
 
Cassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkCassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkDataStax Academy
 

Plus de DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
 
Cassandra Data Maintenance with Spark
Cassandra Data Maintenance with SparkCassandra Data Maintenance with Spark
Cassandra Data Maintenance with Spark
 

Dernier

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 

Cassandra Time Series Data Modeling for IoT and Financial Use Cases

  • 1. ©2013 DataStax Confidential. Do not distribute without consent. @PatrickMcFadin Patrick McFadin
 Chief Evangelist Data Modeling Time Series 1
  • 2. Internet Of Things • 15B devices by 2015 • 40B devices by 2020!
  • 3. Why Cassandra for Time Series Scales Resilient Good data model Efficient Storage Model What about that?
  • 4. Example 1: Weather Station • Weather station collects data • Cassandra stores in sequence • Application reads in sequence
  • 5. Use case • Store data per weather station • Store time series in order: first to last • Get all data for one weather station • Get data for a single date and time • Get data for a range of dates and times Needed Queries Data Model to support queries
  • 6. Data Model • Weather Station Id and Time are unique • Store as many as needed CREATE TABLE temperature ( weatherstation_id text, event_time timestamp, temperature text, PRIMARY KEY (weatherstation_id,event_time) ); INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:01:00','72F'); ! INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:02:00','73F'); ! INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:03:00','73F'); ! INSERT INTO temperature(weatherstation_id,event_time,temperature) VALUES ('1234ABCD','2013-04-03 07:04:00','74F');
  • 7. Storage Model - Logical View 2013-04-03 07:01:00 72F 2013-04-03 07:02:00 73F 2013-04-03 07:03:00 73F SELECT weatherstation_id,event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD'; 1234ABCD 1234ABCD 1234ABCD weatherstation_id event_time temperature 2013-04-03 07:04:00 74F 1234ABCD
  • 8. Storage Model - Disk Layout 2013-04-03 07:01:00 72F 2013-04-03 07:02:00 73F 2013-04-03 07:03:00 73F 1234ABCD 2013-04-03 07:04:00 74F SELECT weatherstation_id,event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD'; Merged, Sorted and Stored Sequentially 2013-04-03 07:05:00! ! 74F 2013-04-03 07:06:00! ! 75F
  • 9. Query patterns • Range queries • “Slice” operation on disk SELECT weatherstation_id,event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD' AND event_time >= '2013-04-03 07:01:00' AND event_time <= '2013-04-03 07:04:00'; 2013-04-03 07:01:00 72F 2013-04-03 07:02:00 73F 2013-04-03 07:03:00 73F 1234ABCD 2013-04-03 07:04:00 74F 2013-04-03 07:05:00! ! 74F 2013-04-03 07:06:00! ! 75F Single seek on disk
  • 10. Query patterns • Range queries • “Slice” operation on disk SELECT weatherstation_id,event_time,temperature FROM temperature WHERE weatherstation_id='1234ABCD' AND event_time >= '2013-04-03 07:01:00' AND event_time <= '2013-04-03 07:04:00'; 2013-04-03 07:01:00 72F 2013-04-03 07:02:00 73F 2013-04-03 07:03:00 73F 1234ABCD 2013-04-03 07:04:00 74F weatherstation_id event_time temperature 1234ABCD 1234ABCD 1234ABCD Programmers like this Sorted by event_time
  • 11. Additional help on the storage engine
  • 12. SSTable seeks • Each read minimum 1 seek • Cache and bloom filter help minimize Total seek time = Disk Latency * number of seeks
  • 13. The key to speed Use the first part of the primary key to get the node (data localization) Minimize seeks for SStables (Key Cache, Bloom Filter) Find the data fast in the SSTable (Indexes)
  • 14. Min/Max Value Hint • New since 2.0 • Range index on primary key values per SSTable • Minimizes seeks on range data CASSANDRA-5514 if you are interested in details SELECT temperature FROM event_time,temperature WHERE weatherstation_id='1234ABCD' AND event_time > '2013-04-03 07:01:00' AND event_time < '2013-04-03 07:04:00'; Row Key: 1234ABCD Min event_time: 2013-04-01 00:00:00 Max event_time: 2013-04-04 23:59:59 Row Key: 1234ABCD Min event_time: 2013-04-05 00:00:00 Max event_time: 2013-04-09 23:59:59 Row Key: 1234ABCD Min event_time: 2013-03-27 00:00:00 Max event_time: 2013-03-31 23:59:59 ? This one
  • 15. Ingestion models • Apache Kafka • Apache Flume • Storm • Custom Applications Apache Kafka Your totally! killer! application
  • 16. Kafka + Storm • Kafka provides reliable queuing • Storm processes (rollups, counts) • Cassandra stores at the same speed • Storm lookup on Cassandra Apache Kafka Apache Storm Queue Process Store
  • 17. Flume • Source accepts data • Channel buffers data • Sink processes and stores • Popular for log processing Sink Channel Source Application Load Balancer Syslog
  • 18. Dealing with data at speed • 1 million writes per second? • 1 insert every microsecond • Collisions? • Primary Key determines node placement • Random partitioning • Special data type - TimeUUID Your totally! killer! application weatherstation_id='1234ABCD' weatherstation_id='5678EFGH'
  • 19. How does data replicate?
  • 20. Primary key determines placement* Partitioning jim age: 36 car: camaro gender: M carol age: 37 car: subaru gender: F johnny age:12 gender: M suzy age:10 gender: F
  • 22. Node A Node D Node C Node B The Token Ring
  • 23. jim 5e02739678... carol a9a0198010... johnny f4eb27cea7... suzy 78b421309e... Start End A 0xc000000000..1 0x0000000000..0 B 0x0000000000..1 0x4000000000..0 C 0x4000000000..1 0x8000000000..0 D 0x8000000000..1 0xc000000000..0
  • 24. jim 5e02739678... carol a9a0198010... johnny f4eb27cea7... suzy 78b421309e... Start End A 0xc000000000..1 0x0000000000..0 B 0x0000000000..1 0x4000000000..0 C 0x4000000000..1 0x8000000000..0 D 0x8000000000..1 0xc000000000..0
  • 25. jim 5e02739678... carol a9a0198010... johnny f4eb27cea7... suzy 78b421309e... Start End A 0xc000000000..1 0x0000000000..0 B 0x0000000000..1 0x4000000000..0 C 0x4000000000..1 0x8000000000..0 D 0x8000000000..1 0xc000000000..0
  • 26. jim 5e02739678... carol a9a0198010... johnny f4eb27cea7... suzy 78b421309e... Start End A 0xc000000000..1 0x0000000000..0 B 0x0000000000..1 0x4000000000..0 C 0x4000000000..1 0x8000000000..0 D 0x8000000000..1 0xc000000000..0
  • 27. jim 5e02739678... carol a9a0198010... johnny f4eb27cea7... suzy 78b421309e... Start End A 0xc000000000..1 0x0000000000..0 B 0x0000000000..1 0x4000000000..0 C 0x4000000000..1 0x8000000000..0 D 0x8000000000..1 0xc000000000..0
  • 28. Node A Node D Node C Node B carol a9a0198010... Replication
  • 29. Node A Node D Node C Node B carol a9a0198010... Replication
  • 30. Node A Node D Node C Node B carol a9a0198010... Replication Replication factor = 3 Consistency is a different topic for later
  • 31. TimeUUID • Also known as a Version 1 UUID • Sortable • Reversible Timestamp to Microsecond + UUID = TimeUUID 04d580b0-9412-11e3-baa8-0800200c9a66 Wednesday, February 12, 2014 6:18:06 PM GMT http://www.famkruithof.net/uuid/uuidgen =
  • 32. Example 2: Financial Transactions • Trading of stocks • When did they happen? • Massive speeds and volumes “Sirca, a non-profit university consortium based in Sydney, is the world’s biggest broker of financial data, ingesting into its database 2million pieces of information a second from every major trading exchange.”* * http://www.theage.com.au/it-pro/business-it/help-poverty-theres-an-app-for-that-20140120-hv948.html
  • 33. Use case • Store data per symbol and date • Store time series in reverse order: last to first • Make sure every transaction is unique • Get all trades for symbol and day • Get trade for a single date and time • Get last 10 trades for symbol and date Needed Queries Data Model to support queries
  • 34. Data Model • date is int of days since epoch • timeuuid keeps it unique • Reverse the times for later queries CREATE TABLE stock_ticks ( symbol text, date int, trade timeuuid, trade_details text, PRIMARY KEY ((symbol, date), trade) ) WITH CLUSTERING ORDER BY (trade DESC); INSERT INTO stock_ticks(symbol, date, trade, trade_details) VALUES (‘NFLX’,340,04d580b0-1431-1e33-baf8-0833200c98a6,'BUY:2000'); ! INSERT INTO stock_ticks(symbol, date, trade, trade_details) VALUES (‘NFLX’,340,05d580b0-6472-1ef3-a3a8-0430200c9a66,'BUY:300'); ! INSERT INTO stock_ticks(symbol, date, trade, trade_details) VALUES (‘NFLX’,340,02d580b0-9412-d223-55a8-0976200c9a25,'SELL:450'); ! INSERT INTO stock_ticks(symbol, date, trade, trade_details) VALUES (‘NFLX’,340,08d580b0-4482-11e3-5fd3-3421200c9a65,'SELL:3000');
  • 35. Storage Model - Logical View 08d580b0-4482-11e3-5fd3- 3421200c9a65 SELL:3000 02d580b0-9412- d223-55a8-0976200c9a25 SELL:450 05d580b0-6472-1ef3- a3a8-0430200c9a66 BUY:300 SELECT trade,trade_details FROM stock_ticks WHERE symbol =‘NFLX’ AND date=‘340’; NFLX:340 NFLX:340 NFLX:340 symbol:date trade trade_details 04d580b0-1431-1e33- baf8-0833200c98a6 BUY:2000 NFLX:340
  • 36. 04d580b0-1431-1e33- baf8-0833200c98a6 05d580b0-6472-1ef3- a3a8-0430200c9a66 02d580b0-9412-d223-55a8 BUY:2000BUY:300 08d580b0-4482-11e3-5fd3- 3421200c9a65 SELL:3000 SELL:450 Storage Model - Disk Layout NFLX:340 Order is from last trade to first SELECT trade,trade_details FROM stock_ticks WHERE symbol =‘NFLX’ AND date=‘340’;
  • 37. 04d580b0-1431-1e33- baf8-0833200c98a6 05d580b0-6472-1ef3- a3a8-0430200c9a66 02d580b0-9412- d223-55a8-0976200c9a25 Query patterns • Limit queries • Get last X trades From here SELECT trade,trade_details FROM stock_ticks WHERE symbol =‘NFLX’ AND date=‘340’ LIMIT 3; BUY:2000BUY:300 08d580b0-4482-11e3-5fd3- 3421200c9a65 SELL:3000 SELL:450 NFLX:340 to here
  • 38. Query patterns Reverse sorted by trade Last 3 trades 08d580b0-4482-11e3-5fd3- 3421200c9a65 SELL:3000 02d580b0-9412- d223-55a8-0976200c9a25 SELL:450 05d580b0-6472-1ef3- a3a8-0430200c9a66 BUY:300 NFLX:340 NFLX:340 NFLX:340 symbol:date trade trade_details • Limit queries • Get last X trades SELECT trade,trade_details FROM stock_ticks WHERE symbol =‘NFLX’ AND date=‘340’ LIMIT 3;
  • 39. Way more examples • 5 minute interviews • Use cases • Free training! ! www.planetcassandra.org
  • 40. Thank You! Follow me for more updates all the time: @PatrickMcFadin