Modeling the IoT with TitanDB and Cassandra
Intro
• Ted Wilmes
• Data warehouse engineer at WellAware - wellaware.us
• Building a SaaS oil and gas production monitoring and
analytics platform
• Collect production O&G data from the field via cellular,
satellite, and other means and deliver to our customers via
mobile and browser clients
© 2015. All Rights Reserved. 2
1 The property graph model and TitanDB
2 Modeling IoT
3 Time series and performance
Property graph model
[Figure: two person vertices (name: Ted and name: George) joined by a knows edge from Ted to George (metOn: June 1, 2012)]
Querying with Gremlin
gremlin> g.V().hasLabel("person").has("name", "Ted").out("knows").values("name")
==>George
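To make the traversal steps concrete, here is a minimal in-memory sketch in Python. It is not Titan or TinkerPop code: the `Vertex`, `Edge`, and `out_values` names are hypothetical and exist only for this illustration of how the query walks from a labeled vertex over a labeled edge to a property value.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    label: str
    props: dict
    out_edges: list = field(default_factory=list)


@dataclass
class Edge:
    label: str
    target: Vertex
    props: dict


def out_values(vertices, v_label, key, value, edge_label, out_key):
    """Mimic hasLabel(v_label).has(key, value).out(edge_label).values(out_key)."""
    results = []
    for v in vertices:
        # hasLabel(...).has(...): keep only matching start vertices
        if v.label == v_label and v.props.get(key) == value:
            # out(...): follow outgoing edges with the requested label
            for e in v.out_edges:
                if e.label == edge_label and out_key in e.target.props:
                    # values(...): emit the requested property of the neighbor
                    results.append(e.target.props[out_key])
    return results


ted = Vertex("person", {"name": "Ted"})
george = Vertex("person", {"name": "George"})
ted.out_edges.append(Edge("knows", george, {"metOn": "June 1, 2012"}))

names = out_values([ted, george], "person", "name", "Ted", "knows", "name")
print(names)
```

Note that the edge carries its own property (metOn), which is what distinguishes the property graph model from a plain adjacency list.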
TitanDB
• Graph database that supports pluggable storage layers
• Designed from the ground up to provide OLTP performance against large graphs, with a particular focus on supporting high-degree vertices (vertices with many edges)
• Implements the Apache TinkerPop 3 APIs
• Cassandra acts as a solid foundation, providing high availability, performance, and ease of operation
Our Internet of Things
Things
People
Organizations
Places
Time
A hypothetical use case: IoT… in SPACE
[Figure: example things: spaceship, Mars base, space station, rocket, satellite]
Many dimensions
[Figure: a Rocket vertex connected to Starfleet, Acme Rockets, Delta Booster, Major Tom, and Joyce by operates, builds, isModel, pilots, and maintains edges]
Many times, a “thing” is a system of systems
[Figure: a rocket broken down into its parts: 1st stage, 2nd stage, 3rd stage, interstage, fuel tank, oxidizer, guidance electronics, CPU, memory. Image source: http://stardust.jpl.nasa.gov/mission/delta2.html]
Continuing to zoom in
[Figure: zooming in on the guidance electronics: CPU, memory, JVM, heap usage, thread count]
[Figure: heap usage and thread count on the JVM, plus an alarm condition, an alarm, and Joyce, linked by monitors, triggers, and notifies edges]
[Figure: Major Tom, Joyce, Starfleet, the alarm, and the alarm condition, linked by triggers, notifies, reports to, and employs edges]
IoT modeling in summary
• Things can be interconnected systems of other things
• A high-fidelity model of ‘reality’ supports a wide variety of use cases, unlike a disconnected set of entities
• An IoT app is only partly about things; don’t forget to include everything else (social, organizational, etc.)
Time series & Performance
Time series in Titan
[Figure: a Heap Usage vertex attached to the JVM vertex, with a question mark where its time series data should go]
Our basic time series requirements
• Support a large volume of low-latency writes
• Low-latency retrieval, primarily of the most recent data
A selection of factors affecting Titan performance
• Titan deployment topology and configuration
• All your usual Cassandra tuning tips and tricks
• Titan JVM tuning
  • Selection of an appropriate garbage collector
  • GC parameters
    • As with Cassandra, it is worthwhile to adjust NewSize
• Data modeling
• Indexing
  • Global graph indices (native Titan vs. external)
  • Vertex-centric indices
• Titan’s different caches: the transaction cache and the database-level cache
Deployment options
[Figure: deployment options (local, embedded, remote), labeled mars-north-1]
But first, time series with CQL
* Brady Gentile - https://academy.datastax.com/demos/getting-started-time-series-data-modeling
First approach
[Figure: a Heap Usage vertex linked to two Chunk vertices (chunkStart: 1442880000000, chunkEnd: 1442966400000; chunkStart: 1442966400000, chunkEnd: 1443052800000), with Observation vertices (tstamp: 1442880000001, tstamp: 1442880000002) hanging off a chunk]
• Intuitive and easy to query
• You can imagine adding further levels to the hierarchy following a year->month->day format
• Individual observations can be associated with other pieces of data
• Observations can be filtered by timestamp with an edge filter, but you still have to retrieve a large number of disparate vertices
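The first chunk's boundaries (1442880000000 to 1442966400000) are 86,400,000 ms apart, which suggests one chunk per day. Under that assumption, a writer could locate the chunk for an incoming observation roughly as follows. The `chunk_bounds` helper and the half-open interval (start inclusive, end exclusive) are illustrative choices, not something the slides specify.

```python
DAY_MS = 24 * 60 * 60 * 1000  # assumed chunk width: one day, in milliseconds


def chunk_bounds(tstamp_ms, width_ms=DAY_MS):
    """Return (chunkStart, chunkEnd) of the chunk that contains tstamp_ms.

    Chunks are aligned to multiples of width_ms since the Unix epoch, and each
    chunk covers the half-open range [chunkStart, chunkEnd).
    """
    start = tstamp_ms - (tstamp_ms % width_ms)
    return start, start + width_ms


bounds = chunk_bounds(1442880000001)
print(bounds)
```

With a deeper year->month->day hierarchy, the same arithmetic would be applied once per level with a different width.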
A further refinement
• How do we reduce the number of vertices (think Cassandra partitions) that we need to retrieve?
1. Move all properties (timestamp, value) to the edge
2. Make the edge “undirected”
Or, a combination of the two approaches:
1. Copy the properties to the edge
2. Keep the discrete observation vertex
[Figure: a Heap Usage vertex linked to a Chunk vertex whose observation edges carry tstamp and value properties]
Chunk vertex with its observations
Row layout: Vertex ID | chunkStart | chunkEnd | obs. @ t2 | obs. @ t1 | obs. @ t0
Observations are stored in time-descending order
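The benefit of keeping a chunk's observations in time-descending order is that the most recent points sit at the front of the row. The sketch below models that idea in plain Python. `ChunkRow` is a hypothetical stand-in and does not reflect Titan's actual on-disk encoding.

```python
import bisect


class ChunkRow:
    """Hypothetical model of one chunk's row: observations kept newest first."""

    def __init__(self, chunk_start, chunk_end):
        self.chunk_start = chunk_start
        self.chunk_end = chunk_end
        self._keys = []    # negated timestamps, ascending, so timestamps descend
        self._values = []

    def add(self, tstamp, value):
        # Insert at the position that keeps the row sorted newest first.
        i = bisect.bisect_left(self._keys, -tstamp)
        self._keys.insert(i, -tstamp)
        self._values.insert(i, value)

    def latest(self, n=1):
        """Return the n most recent (tstamp, value) pairs from the row's front."""
        return [(-k, v) for k, v in zip(self._keys[:n], self._values[:n])]


row = ChunkRow(1442880000000, 1442966400000)
for t, v in [(1442880000001, 10.0), (1442880000003, 30.0), (1442880000002, 20.0)]:
    row.add(t, v)
print(row.latest(2))
```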
Sample Gremlin queries
• Observations after 1442162072000
  • chunk.outE().has("tstamp", gt(1442162072000))
• Observations between 1442162072000 and 1442162073000
  • chunk.outE().has("tstamp", between(1442162072000, 1442162073000))
• Most recent observation before now
  • chunk.outE().has("tstamp", lte(System.currentTimeMillis())).order().by("tstamp", decr).limit(1)
• You can wrap this in your own time series specific API
  • new SeriesQuery(series1).interval(startTstamp, endTstamp).decr().limit(1)
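The slides show only the call chain of the time series wrapper, not its implementation. The following is one possible shape for such a fluent API, sketched in Python over an in-memory list instead of Gremlin edges. The `run()` method, the half-open interval (start inclusive, end exclusive), and the internal field names are all assumptions made for this example.

```python
class SeriesQuery:
    """Hypothetical fluent query builder over (tstamp, value) observations."""

    def __init__(self, observations):
        # Stand-in for the observation edges of one series.
        self._obs = list(observations)
        self._start = None
        self._end = None
        self._descending = False
        self._limit = None

    def interval(self, start_tstamp, end_tstamp):
        self._start, self._end = start_tstamp, end_tstamp
        return self

    def decr(self):
        self._descending = True
        return self

    def limit(self, n):
        self._limit = n
        return self

    def run(self):
        # Filter to [start, end), then order by timestamp and truncate.
        hits = [
            (t, v) for t, v in self._obs
            if (self._start is None or t >= self._start)
            and (self._end is None or t < self._end)
        ]
        hits.sort(key=lambda tv: tv[0], reverse=self._descending)
        return hits if self._limit is None else hits[: self._limit]


series1 = [(1442162072000, 1.0), (1442162072500, 2.0), (1442162073000, 3.0)]
latest = SeriesQuery(series1).interval(1442162072000, 1442162073000).decr().limit(1).run()
print(latest)
```

A real implementation would translate the builder state into a traversal against the relevant chunks, so callers never deal with chunk boundaries directly.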
Pros and cons vs. separate CQL or other tsdb
• Pros
  • Allows for a single unified view of your IoT data, maintaining direct connectivity between sensor data and the other entities
  • Gremlin works well for processing streams of time series data
• Cons
  • Storage format is not as compact
  • Extra overhead of managing ‘chunks’, versus the CQL primary key taking care of that for us (e.g. a chunk cache)
A simple query: retrieve all the heap usage chunks
[Figure: a Heap Usage vertex with two hasChunk edges to Chunk vertices (chunkStart: 1442880000000, chunkEnd: 1442966400000; chunkStart: 1442966400000, chunkEnd: 1443052800000)]
gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000
Getting a vertex by id
gremlin> g.V(4)
==>v[4]
[Figure: one round trip to Cassandra: "Does this vertex exist?" answered with "Yes"]
The vertex is now loaded in the Titan transaction cache
Aside - a tool of the trade
Profiler with socket tracing
Retrieving properties
gremlin> g.V(4).valueMap()
==>[sensorType:[heap usage], units:[bytes]]
[Figure: a further round trip to Cassandra: "Retrieve properties" answered with two properties]
The vertex properties are now loaded in the Titan transaction cache
2 round trips
[Figure: round trip 1: "Does this vertex exist?" answered with "Yes"; round trip 2: "Retrieve properties" answered with two properties]
• Not a big deal for a single vertex lookup with property retrieval, but it can add up
• Exacerbated by the magnitude of latency between Titan and Cassandra
Querying for adjacent vertices
gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000
[Figure: sequential requests to Cassandra: "Does this vertex exist?", "Get edges", "Get 1st chunk properties", "Get 2nd chunk properties"]
Batch requests
• query.batch = true
• “Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend.” - http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/titan-config-ref.html
gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000
[Figure: "Does this vertex exist?" and "Get edges", followed by the 1st and 2nd chunk property requests batched together]
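To see why batching matters more as latency and fan-out grow, the round trips for this traversal can be counted with a simple model. This is a back-of-the-envelope sketch: the `traversal_latency_ms` function, its parameters, and the 2 ms round-trip time are hypothetical, and real timings also depend on caching, payload size, and concurrency.

```python
def traversal_latency_ms(num_adjacent, rtt_ms, batched, exists_check=True):
    """Estimate backend wait time for g.V(id).out(label).values(key).

    Simplified model: one round trip for the optional exists check, one for the
    edges, then either one round trip per adjacent vertex (unbatched) or a
    single round trip for all adjacent vertices (batched).
    """
    trips = (1 if exists_check else 0) + 1
    if num_adjacent > 0:
        trips += 1 if batched else num_adjacent
    return trips * rtt_ms


unbatched_ms = traversal_latency_ms(num_adjacent=2, rtt_ms=2.0, batched=False)
batched_ms = traversal_latency_ms(num_adjacent=2, rtt_ms=2.0, batched=True)
print(unbatched_ms, batched_ms)
```

With only two chunks the saving is one round trip, but in this model the unbatched cost grows linearly with the number of adjacent vertices while the batched cost stays constant.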
Remove initial exists query
• storage.batch-loading = true
• WARNING: this disables vertex ‘exists’ checks
gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000
[Figure: the same request sequence with the initial "Does this vertex exist?" request removed]
Optimizing your writes
gremlin> chunk.addEdge("hasObservation", chunk, "tstamp", 1442162072000, "value", 500.123)
[Figure: "Does this vertex exist?" followed by "Write new edge"]
• Remove the read from your write path: storage.batch-loading = true
• Batch your commits; measure latency and throughput on your system to find a good commit size
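Batching commits means adding many observation edges inside one transaction before committing. The sketch below shows the control flow only. `FakeTx` is a hypothetical stand-in that counts calls instead of talking to a graph, and the commit size of 1000 is an arbitrary starting point to be tuned by measurement.

```python
def in_batches(points, commit_size):
    """Yield lists of at most commit_size points; each list maps to one commit."""
    batch = []
    for p in points:
        batch.append(p)
        if len(batch) == commit_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


class FakeTx:
    """Hypothetical stand-in for a graph transaction; it only counts calls."""

    def __init__(self):
        self.edges = 0
        self.commits = 0

    def add_observation(self, tstamp, value):
        self.edges += 1

    def commit(self):
        self.commits += 1


tx = FakeTx()
points = [(1442162072000 + i, float(i)) for i in range(2500)]
for batch in in_batches(points, commit_size=1000):
    for tstamp, value in batch:
        tx.add_observation(tstamp, value)
    tx.commit()
print(tx.edges, tx.commits)
```

Larger batches amortize commit overhead but hold more uncommitted state and lose more work on failure, which is why the commit size is worth measuring on your own system.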
[Figures: comparison of storage.batch-loading=false and storage.batch-loading=true]
Quick and dirty write performance numbers
[Figure: chart of writes per second (wps), y-axis from 0 to 90,000]
• 9 m3.2xlarge nodes w/ C* 2.2, RF = 3, writing @ quorum, default C* settings
• 1 m3.2xlarge “client” w/ Titan 1.0-SNAPSHOT, 10 write threads writing 100 million points in total across 100,000 series
In summary
• Understanding the underlying data storage format can help with performance tuning
• Writes
  • Remove reads from the write path where possible
  • Test different batch commit sizes
  • When writing vertices, you may need to adjust ids.block-size and ids.renew-percentage
• Reads
  • Batch communication between Titan and Cassandra with query.batch=true
  • Make use of global and vertex-centric indices when possible
What questions do you have and thanks!
Thanks to the Apache TinkerPop and TitanDB teams, my awesome coworkers, and the folks at DataStax for putting on an excellent summit!
Ted Wilmes
Data Warehouse Engineer
@trwilmes
tedwilmes@wellaware.us
Thank you

 
Turnkey Riak KV Cluster
Turnkey Riak KV ClusterTurnkey Riak KV Cluster
Turnkey Riak KV Cluster
 
The Flink - Apache Bigtop integration
The Flink - Apache Bigtop integrationThe Flink - Apache Bigtop integration
The Flink - Apache Bigtop integration
 
GC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance EngineerGC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance Engineer
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
 
Observer, a "real life" time series application
Observer, a "real life" time series applicationObserver, a "real life" time series application
Observer, a "real life" time series application
 
Chicago DevOps Meetup Nov2019
Chicago DevOps Meetup Nov2019Chicago DevOps Meetup Nov2019
Chicago DevOps Meetup Nov2019
 
Viavi_TeraVM Core Emulator.pptx
Viavi_TeraVM Core Emulator.pptxViavi_TeraVM Core Emulator.pptx
Viavi_TeraVM Core Emulator.pptx
 
Real-Time Health Score Application using Apache Spark on Kubernetes
Real-Time Health Score Application using Apache Spark on KubernetesReal-Time Health Score Application using Apache Spark on Kubernetes
Real-Time Health Score Application using Apache Spark on Kubernetes
 
Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospect
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3
 
Volta: Logging, Metrics, and Monitoring as a Service
Volta: Logging, Metrics, and Monitoring as a ServiceVolta: Logging, Metrics, and Monitoring as a Service
Volta: Logging, Metrics, and Monitoring as a Service
 
Durability Simulator Design for OpenStack Swift
Durability Simulator Design for OpenStack SwiftDurability Simulator Design for OpenStack Swift
Durability Simulator Design for OpenStack Swift
 
Intel open stack-summit-session-nov13-final
Intel open stack-summit-session-nov13-finalIntel open stack-summit-session-nov13-final
Intel open stack-summit-session-nov13-final
 
Kirin User Story: Migrating Mission Critical Applications to OpenStack Privat...
Kirin User Story: Migrating Mission Critical Applications to OpenStack Privat...Kirin User Story: Migrating Mission Critical Applications to OpenStack Privat...
Kirin User Story: Migrating Mission Critical Applications to OpenStack Privat...
 

Dernier

Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Copilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotCopilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotEdgard Alejos
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainAbdul Ahad
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 

Dernier (20)

Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Copilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotCopilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform Copilot
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software Domain
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 

Modeling the IoT with TitanDB and Cassandra

  • 1. Modeling the IoT with TitanDB and Cassandra
  • 2. Intro
      • Ted Wilmes
      • Data warehouse engineer at WellAware - wellaware.us
      • Building a SaaS oil and gas production monitoring and analytics platform
      • Collect production O&G data from the field via cellular, satellite, and other means and deliver it to our customers via mobile and browser clients
      © 2015. All Rights Reserved.
  • 3. Outline of the deck
      1. The property graph model and TitanDB
      2. Modeling IoT
      3. Time series and performance
  • 4. Property graph model
      person (name: Ted) -knows (metOn: June 1, 2012)-> person (name: George)
  • 5. Querying with Gremlin
      person (name: Ted) -knows (metOn: June 1, 2012)-> person (name: George)
      g.V().hasLabel("person").has("name", "Ted").out("knows").values("name")
      > George
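The traversal on slide 5 can be pictured with a small in-memory stand-in. This is only a Python sketch of the property graph idea, not Titan or TinkerPop code; the dictionary layout and the `names_known_by` helper are assumptions made for illustration.

```python
# Minimal in-memory property graph: vertices and edges both carry a label
# and arbitrary key/value properties. The layout is an illustrative
# assumption, not how Titan stores data.
vertices = {
    1: {"label": "person", "name": "Ted"},
    2: {"label": "person", "name": "George"},
}
edges = [
    {"label": "knows", "out": 1, "in": 2, "metOn": "June 1, 2012"},
]


def names_known_by(name):
    """Rough equivalent of
    g.V().hasLabel("person").has("name", name).out("knows").values("name")."""
    # Step 1: find the starting vertices by label and property.
    starts = {vid for vid, v in vertices.items()
              if v["label"] == "person" and v["name"] == name}
    # Step 2: follow outgoing "knows" edges and read the "name" property.
    return [vertices[e["in"]]["name"] for e in edges
            if e["label"] == "knows" and e["out"] in starts]


print(names_known_by("Ted"))
```

The edge is directed from Ted to George, so the same lookup starting from George returns nothing.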
  • 6. TitanDB
      • Graph database that supports pluggable storage layers
      • Designed from the ground up to provide OLTP performance against large graphs, with a particular focus on supporting high degree vertices (vertices with many edges)
      • Implements the Apache TinkerPop 3 APIs
      • Cassandra acts as a solid foundation, providing high availability, performance, and ease of operation
  • 7. Our Internet of Things: Things, People, Organizations, Places, Time
  • 8. A hypothetical use case: IoT… in SPACE
  • 9. Spaceship, Mars Base, Space Station, Rocket, Satellite
  • 11. Many dimensions
      Vertices: Rocket, Starfleet, Acme Rockets, Delta Booster, Major Tom, Joyce
      Edge labels: operates, builds, isModel, pilots, maintains
  • 12. Many times, a “thing” is a system of systems
      Image source: http://stardust.jpl.nasa.gov/mission/delta2.html
      Components shown: Rocket, 1st Stage, 2nd Stage, 3rd Stage, Interstage, Fuel Tank, Oxidizer, Guidance Electronics, CPU, Memory
  • 13. Continuing to zoom in
      Components shown: Guidance Electronics, CPU, Memory, JVM, Heap Usage, Thread Count
  • 14. Vertices: JVM, Heap Usage, Thread Count, Alarm, Alarm Condition, Joyce
      Edge labels: triggers, notifies, monitors
  • 15. Vertices: Major Tom, Alarm, Alarm Condition, Joyce, Starfleet
      Edge labels: triggers, notifies, reports to, employs, employs
  • 16. IoT modeling in summary
      • Things can be interconnecting systems of other things
      • A high fidelity model of ‘reality’ supports a wide variety of use cases, versus a disconnected set of entities
      • An IoT app is only partly about things; don’t forget to include everything else (social, organizational, etc.)
  • 17. Time series & Performance
  • 18. Components shown: Guidance Electronics, CPU, Memory, JVM, Heap Usage, Thread Count
  • 19. Time series in Titan (diagram: Heap Usage, JVM, ?)
  • 20. Our basic time series requirements
      • Support a large volume of low latency writes
      • Low latency retrieval, primarily of the most recent data
  • 21. A selection of factors affecting Titan performance
      • Titan deployment topology and configuration
      • All your usual Cassandra tuning tips and tricks
      • Titan JVM tuning
          • selection of appropriate garbage collector
          • GC parameters
          • like Cassandra, worthwhile to adjust NewSize
      • Data modeling
      • Indexing
          • Global graph indices (native Titan vs. external)
          • Vertex centric indices
      • Titan's different caches - transaction cache & the database-level cache
  • 22. A selection of factors affecting Titan performance (same list as slide 21, without the "native Titan vs. external" note)
  • 23. Deployment options: Local, Embedded, Remote (diagram region label: mars-north-1)
  • 26. But first, time series with CQL
      * Brady Gentile - https://academy.datastax.com/demos/getting-started-time-series-data-modeling
  • 27. But first, time series with CQL (CQL diagram)
      * Brady Gentile - https://academy.datastax.com/demos/getting-started-time-series-data-modeling
  • 28. First approach
      Heap Usage -> Chunk (chunkStart: 1442880000000, chunkEnd: 1442966400000) -> Observation (tstamp: 1442880000001), Observation (tstamp: 1442880000002)
      Heap Usage -> Chunk (chunkStart: 1442966400000, chunkEnd: 1443052800000)
      • Intuitive and easy to query
      • You can imagine adding further levels to the hierarchy, following a year->month->day format
      • Individual observations can be associated with other pieces of data
      • Observations can be filtered by timestamp with an edge filter, but you still have to retrieve a large number of disparate vertices
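To make the chunking on slide 28 concrete, here is a small Python sketch of how an observation timestamp could be mapped to its chunk. It assumes day-wide chunks aligned to UTC midnight (the first chunk on the slide spans exactly 86,400,000 ms) and half-open bounds; `CHUNK_MS`, `chunk_bounds`, and `group_by_chunk` are hypothetical names, not part of Titan.

```python
CHUNK_MS = 24 * 60 * 60 * 1000  # assumed chunk width: one day, in milliseconds


def chunk_bounds(tstamp_ms, chunk_ms=CHUNK_MS):
    """Return (chunkStart, chunkEnd) for the chunk containing tstamp_ms.

    Assumption: chunks are aligned to multiples of chunk_ms and are
    half-open, i.e. chunkStart <= tstamp < chunkEnd.
    """
    start = tstamp_ms - (tstamp_ms % chunk_ms)
    return start, start + chunk_ms


def group_by_chunk(tstamps):
    """Group observation timestamps under the chunk they would hang off."""
    chunks = {}
    for t in tstamps:
        chunks.setdefault(chunk_bounds(t), []).append(t)
    return chunks


print(chunk_bounds(1442880000001))
```

With this layout a writer only needs the observation's timestamp to decide which chunk vertex to attach it to.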
  • 29. A further refinement
      Heap Usage -> Chunk (chunkStart: 1442880000000, chunkEnd: 1442966400000)
      Heap Usage -> Chunk (chunkStart: 1442966400000, chunkEnd: 1443052800000)
      • How do we reduce the number of vertices (think Cassandra partitions) that we need to retrieve?
  • 30. Observation properties: timestamp, value
      1. Move all properties to the edge
      2. Make the edge “undirected”
      or, a combo of the two approaches
      1. Copy the properties to the edge
      2. Keep the discrete observation vertex
      (diagram: Heap Usage, Chunk, edge carrying tstamp and value)
  • 31. Chunk vertex with its observations
      Row layout: Vertex ID | chunkStart | chunkEnd | obs. @ t2 | obs. @ t1 | obs. @ t0
      Observations in time descending order
  • 32. Sample Gremlin queries
      • observations > 1442162072000
          chunk.outE().has("tstamp", gt(1442162072000))
      • observations between 1442162072000 and 1442162073000
          chunk.outE().has("tstamp", between(1442162072000, 1442162073000))
      • Most recent observation before now
          chunk.outE().has("tstamp", lte(System.currentTimeMillis())).order().by("tstamp", decr).limit(1)
      • You can wrap this in your own time series specific API
          new SeriesQuery(series1).interval(startTstamp, endTstamp).decr().limit(1)
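The deck only shows how a `SeriesQuery` wrapper would be called, not how it is built. The following Python sketch is one possible shape for such a fluent API over an in-memory list of `(tstamp, value)` pairs. Everything here is an assumption for illustration: the `run()` method, the in-memory storage, and the half-open interval, which mirrors TinkerPop's `between` (start inclusive, end exclusive).

```python
from bisect import bisect_left


class SeriesQuery:
    """Hypothetical fluent query over one series of (tstamp, value) pairs."""

    def __init__(self, observations):
        self._obs = sorted(observations)  # ascending by tstamp
        self._start = None
        self._end = None
        self._descending = False
        self._limit = None

    def interval(self, start_tstamp, end_tstamp):
        # Half-open interval: start_tstamp <= tstamp < end_tstamp.
        self._start, self._end = start_tstamp, end_tstamp
        return self

    def decr(self):
        # Return results newest first.
        self._descending = True
        return self

    def limit(self, n):
        self._limit = n
        return self

    def run(self):
        keys = [t for t, _ in self._obs]
        lo = 0 if self._start is None else bisect_left(keys, self._start)
        hi = len(keys) if self._end is None else bisect_left(keys, self._end)
        result = self._obs[lo:hi]
        if self._descending:
            result = result[::-1]
        return result if self._limit is None else result[:self._limit]


series1 = [(1442162072000, 500.123), (1442162072500, 501.0), (1442162073000, 502.5)]
latest = SeriesQuery(series1).interval(1442162072000, 1442162073000).decr().limit(1).run()
print(latest)
```

A real wrapper would translate the same builder calls into the edge traversals listed on the slide rather than slicing a list.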
  • 33. Pros and cons vs. separate CQL or other tsdb
      • Pros
          • Allows for a single unified view of your IoT data, maintaining direct connectivity between sensor data & the other entities
          • Gremlin works well for processing streams of time series data
      • Cons
          • Storage format is not as compact
          • Extra overhead of managing ‘chunks’ versus the CQL primary key taking care of that for us (e.g. chunk cache)
  • 34. A simple query - retrieve all the heap usage chunks
      Heap Usage -hasChunk-> Chunk (chunkStart: 1442880000000, chunkEnd: 1442966400000)
      Heap Usage -hasChunk-> Chunk (chunkStart: 1442966400000, chunkEnd: 1443052800000)
      gremlin> g.V(4).out('hasChunk').values('chunkStart')
      ==> 1442880000000
      ==> 1442966400000
  • 35. Getting a vertex by id
      gremlin> g.V(4)
      ==>v[4]
      (diagram: Does this vertex exist? -> Yes)
      Vertex is now loaded in the Titan transaction cache
  • 36. Aside - a tool of the trade: profiler with socket tracing
  • 39. Retrieving properties
      gremlin> g.V(4).valueMap()
      ==>[sensorType:[heap usage], units:[bytes]]
      (diagram: Retrieve properties -> Two properties)
      Vertex properties are now loaded in the Titan transaction cache
  • 40. 2 round trips
      (diagram: Does this vertex exist? -> Yes; Retrieve properties -> Two properties)
      • Not a big deal for a single vertex lookup with property retrieval, but it can add up
      • Exacerbated by the magnitude of latency between Titan and Cassandra
  • 41. Querying for adjacent vertices
      gremlin> g.V(4).out('hasChunk').values('chunkStart')
      ==> 1442880000000
      ==> 1442966400000
      (diagram steps: Does this vertex exist?; Get edges; Get 1st chunk properties; Get 2nd chunk properties)
  • 42. Batch requests
      (diagram steps: Does this vertex exist?; Get edges; Get 1st chunk properties; Get 2nd chunk properties)
      • query.batch = true
      • “Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend.” - http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/titan-config-ref.html
      gremlin> g.V(4).out('hasChunk').values('chunkStart')
      ==> 1442880000000
      ==> 1442966400000
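The effect of batching on round trips can be illustrated with a toy model. The Python sketch below is not Titan's storage layer; `CountingBackend`, its methods, and the chunk vertex ids 10 and 11 are hypothetical. It simply counts one round trip per backend call, following the request steps drawn on slides 41 and 42, and assumes that batching merges the per-chunk property reads into a single request.

```python
class CountingBackend:
    """Stand-in for a storage backend that counts round trips instead of doing I/O."""

    def __init__(self, vertices, edges):
        self.vertices = vertices  # vertex id -> property dict
        self.edges = edges        # vertex id -> list of adjacent vertex ids
        self.round_trips = 0

    def exists(self, vid):
        self.round_trips += 1
        return vid in self.vertices

    def out_edges(self, vid):
        self.round_trips += 1
        return self.edges.get(vid, [])

    def get_properties(self, vid):
        self.round_trips += 1
        return self.vertices[vid]

    def multi_get_properties(self, vids):
        self.round_trips += 1  # one request carries every key
        return [self.vertices[v] for v in vids]


def chunk_starts(backend, sensor_id, batch):
    """Toy version of g.V(sensor_id).out('hasChunk').values('chunkStart')."""
    if not backend.exists(sensor_id):
        return []
    chunk_ids = backend.out_edges(sensor_id)
    if batch:
        rows = backend.multi_get_properties(chunk_ids)
    else:
        rows = [backend.get_properties(c) for c in chunk_ids]
    return [row["chunkStart"] for row in rows]


def make_backend():
    return CountingBackend(
        vertices={4: {"sensorType": "heap usage"},
                  10: {"chunkStart": 1442880000000},
                  11: {"chunkStart": 1442966400000}},
        edges={4: [10, 11]},
    )


unbatched, batched = make_backend(), make_backend()
chunk_starts(unbatched, 4, batch=False)
chunk_starts(batched, 4, batch=True)
print(unbatched.round_trips, batched.round_trips)
```

In this model the saving grows with the number of chunks, since the unbatched path pays one round trip per adjacent vertex while the batched path pays one in total.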
  • 43. Remove initial exists query
      • storage.batch-loading = true
      • WARNING - this disables vertex ‘exists’ checks
      gremlin> g.V(4).out('hasChunk').values('chunkStart')
      ==> 1442880000000
      ==> 1442966400000
      (diagram steps: Does this vertex exist?; Get edges; Get 1st chunk properties; Get 2nd chunk properties)
  • 44. Optimizing your writes
      gremlin> chunk.addEdge("hasObservation", chunk, "tstamp", 1442162072000, "value", 500.123)
      (diagram steps: Does this vertex exist?; Write new edge)
  • 45. Optimizing your writes
      gremlin> chunk.addEdge("hasObservation", chunk, "tstamp", 1442162072000, "value", 500.123)
      (diagram steps: Does this vertex exist?; Write new edge)
      • Remove the read from your write path - storage.batch-loading = true
      • Batch your commits; measure latency and throughput on your system to find a good commit size
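The "batch your commits" advice can be sketched as a small buffering loop. This Python example is an illustration only: `write_in_batches` and the `commit` callback are hypothetical, and in a real system the callback would add the edges and commit the Titan transaction.

```python
def write_in_batches(points, commit, batch_size):
    """Buffer points and call commit(batch) once per batch_size points.

    Returns the number of commits issued. `commit` is a caller-supplied
    callback (an assumption of this sketch).
    """
    pending, commits = [], 0
    for point in points:
        pending.append(point)
        if len(pending) >= batch_size:
            commit(pending)
            commits += 1
            pending = []  # start a fresh buffer; the committed list is left untouched
    if pending:
        # Flush the final, partially filled batch.
        commit(pending)
        commits += 1
    return commits


committed = []
n = write_in_batches(range(10), committed.append, batch_size=4)
print(n, committed)
```

Timing this loop for several `batch_size` values against your own cluster is one way to follow the slide's suggestion to measure before settling on a commit size.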
  • 46. storage.batch-loading=false
  • 47. storage.batch-loading=true
  • 48. Quick and dirty write performance numbers
      (chart: writes per second (wps), axis from 0 to 90,000)
      • 9 m3.2xlarge nodes w/ C* 2.2, RF = 3, writing @ quorum, default C* settings
      • 1 m3.2xlarge “client” w/ Titan 1.0-SNAPSHOT, 10 write threads writing 100 million points in total across 100,000 series
  • 49. In summary
      • Understanding the underlying data storage format can help with performance tuning
      • Writes
          • remove reads from the write path where possible
          • test different batch commit sizes
          • when writing vertices you may need to adjust ids.block-size and ids.renew-percentage
      • Reads
          • batch communication between Titan and Cassandra with query.batch=true
          • make use of global and vertex centric indices when possible
  • 50. What questions do you have, and thanks!
      Thanks to the Apache TinkerPop and TitanDB teams, my awesome coworkers, and the folks at DataStax for putting on an excellent summit!
      Ted Wilmes, Data Warehouse Engineer, @trwilmes, tedwilmes@wellaware.us