SlideShare une entreprise Scribd logo
1  sur  38
#MongoDB 
The Weather of the Century: 
Design and High Performance 
André Spiegel 
Consulting Engineer, MongoDB
What was the weather 
when you were born?
Data Format: Raw and in MongoDB 
0303725053947282013060322517+40779-073969FM-15+0048KNYC 
V0309999C00005030485MN0080475N5+02115+02005100975 
ADDAA101000095AU100001015AW1105GA1025+016765999GA2045+024385999 
GA3075+030485999GD11991+0167659GD22991+0243859GD33991+0304859... 
{ 
"st" : "u725053", 
"ts" : ISODate("2013-06-03T22:51:00Z"), 
"airTemperature" : { 
"value" : 21.1, 
"quality" : "5" 
}, 
"atmosphericPressure" : { 
"value" : 1009.7, 
"quality" : "5" 
} 
}
Data Format: Raw and in MongoDB 
0303725053947282013060322517+40779-073969FM-15+0048KNYC 
V0309999C00005030485MN0080475N5+02115+02005100975 
ADDAA101000095AU100001015AW1105GA1025+016765999GA2045+024385999 
GA3075+030485999GD11991+0167659GD22991+0243859GD33991+0304859... 
{ 
"st" : "u725053", 
"ts" : ISODate("2013-06-03T22:51:00Z"), 
"airTemperature" : { 
"value" : 21.1, 
"quality" : "5" 
}, 
"atmosphericPressure" : { 
"value" : 1009.7, 
"quality" : "5" 
} 
} 
Station Identifier 
(»NYC Central Park«)
How Big Is It? 
• 2.5 billion data points 
• 4 Terabyte (1.6k per document) 
• “moderately big”
How to do this with MongoDB?
First Deployment 
• A single server with a really big disk 
Application mongod 
i2.8xlarge 
251 GB RAM 
6 TB SSD 
c3.8xlarge
Second Deployment 
• A really big cluster where everything is in RAM 
Application / mongos 
... 
100 x r3.2xlarge 
@ 
61 GB RAM 
100 GB disk 
mongod 
c3.8xlarge
Second Deployment 
• A really big cluster where everything is in RAM 
Application / mongos 
... 
100 x r3.2xlarge 
@ 
61 GB RAM 
100 GB disk 
mongod
Now... how much would you pay? 
..
Now... how much would you pay? 
$60,000 / yr 
..
Now... how much would you pay? 
$60,000 / yr 
.. 
$700,000 / yr
Use Cases 
• Bulk loading 
– getting all data into the system 
• Latency and throughput for queries 
– point in space-time 
– one station, one year 
– the whole world, once upon a time 
• Aggregation and Exploration 
– warmest and coldest day ever, etc.
Bulk Loading: Principles 
• On the application side: 
– batch size 
– number of client threads 
– use unordered bulk writes 
• On the server side: 
– Journaling off ( temporarily! ) 
– Index later 
– In cluster: pre-split, no balancing
Bulk Loading: Single Server 
batch 
size 
threads 
through 
put
Bulk Loading: Single Server 
batch 
size 
threads 
through 
put 
8 threads, 
batch size 100 
→ 85,000 doc/s
Bulk Loading: Single Server 
• Settings: 8 threads 
batch size 100 
• Total loading time: 10 h 20 min 
• Documents per second: 70,000 
• Index build time: 7 h 40 min (ts_1_st_1)
Bulk Loading: Cluster
Bulk Loading: Cluster 
144 threads, 
batch size 200 
→ 220,000 doc/s
Bulk Loading: Cluster 
• Shard Key: Station ID, hashed 
• Settings: 10 mongos @ 144 
threads 
batch size 200 
• Total loading time: 3 h 10 min 
• Documents per second: 228,000 
• Index build time: 5 min (ts_1_st_1)
Queries: Point in Space-Time 
db.data.find({"st" : "u747940", 
"ts" : ISODate("1969-07-16T12:00:00Z")})
Queries: Point in Space-Time 
db.data.find({"st" : "u747940", 
1.6 
1.4 
1.2 
1 
0.8 
0.6 
0.4 
0.2 
0 
"ts" : ISODate("1969-07-16T12:00:00Z")}) 
single server cluster 
ms 
avg 
95th 
99th 
max. 
throughput: 
40,000/s 610,000/s 
(10 mongos)
Queries: One Station, One Year 
db.data.find({"st" : "u103840", 
"ts" : {"$gte": ISODate("1989-01-01"), 
"$lt" : ISODate("1990-01-01")}})
Queries: One Station, One Year 
db.data.find({"st" : "u103840", 
5000 
4000 
3000 
2000 
1000 
0 
"ts" : {"$gte": ISODate("1989-01-01"), 
"$lt" : ISODate("1990-01-01")}}) 
single server cluster 
ms 
avg 
95th 
99th 
max. 
throughput: 20/s 430/s 
(10 mongos) 
targeted query
Queries: The Whole World, Once 
Upon... 
db.data.find({"ts" : ISODate("2000-01-01T00:00:00Z")})
Queries: The Whole World, Once 
Upon... 
db.data.find({"ts" : ISODate("2000-01-01T00:00:00Z")}) 
10000 
8000 
6000 
4000 
2000 
0 
single server cluster 
ms 
avg 
95th 
99th 
max. 
throughput: 8/s 
310/s 
(10 mongos) 
scatter/gather query
Analytics and Exploration 
• Analytics means ad-hoc queries for which 
we do not have an index 
– Find all tornados 
– Maximum reported temperature 
• We cannot just index everything 
– memory 
– write performance
Analytics: Find all Tornados 
db.data.find ({ 
"presentWeatherObservation.condition" : "99" 
})
Analytics: Find all Tornados 
db.data.find ({ 
"presentWeatherObservation.condition" : "99" 
}) 
1 h 28 min 
Single Server
Analytics: Find all Tornados 
db.data.find ({ 
"presentWeatherObservation.condition" : "99" 
}) 
47 s 
Cluster 
1 h 28 min 
Single Server
Analytics: Maximum Temperature 
db.data.aggregate ([ 
{ "$match" : { "airTemperature.quality" : 
{ "$in" : [ "1", "5" ] } } }, 
{ "$group" : { "_id" : null, 
"maxTemp" : { "$max" : 
"$airTemperature.value" } } } 
])
Analytics: Maximum Temperature 
db.data.aggregate ([ 
{ "$match" : { "airTemperature.quality" : 
{ "$in" : [ "1", "5" ] } } }, 
{ "$group" : { "_id" : null, 
"maxTemp" : { "$max" : 
"$airTemperature.value" } } } 
]) 
61.8 °C = 143 °F
Analytics: Maximum Temperature 
db.data.aggregate ([ 
{ "$match" : { "airTemperature.quality" : 
{ "$in" : [ "1", "5" ] } } }, 
{ "$group" : { "_id" : null, 
"maxTemp" : { "$max" : 
"$airTemperature.value" } } } 
]) 
61.8 °C = 143 °F 
4 h 45 min 
Single Server
Analytics: Maximum Temperature 
db.data.aggregate ([ 
{ "$match" : { "airTemperature.quality" : 
{ "$in" : [ "1", "5" ] } } }, 
{ "$group" : { "_id" : null, 
"maxTemp" : { "$max" : 
"$airTemperature.value" } } } 
]) 
61.8 °C = 143 °F 
2 min 
Cluster 
4 h 45 min 
Single Server
Summary: Single Server 
Pro 
• Cost-effective 
• Very good latency for single queries 
Con 
• Some operations are prohibitive: 
– Indexing 
– Table Scans
Summary: Cluster 
Con 
• High cost 
Pro 
• High throughput 
• Very good latency for single queries 
• Scatter-gather yields significant speed-up 
• Analytics are possible 
..
Thank you.

Contenu connexe

Tendances

The Ring programming language version 1.6 book - Part 69 of 189
The Ring programming language version 1.6 book - Part 69 of 189The Ring programming language version 1.6 book - Part 69 of 189
The Ring programming language version 1.6 book - Part 69 of 189Mahmoud Samir Fayed
 
The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210Mahmoud Samir Fayed
 
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB
 
The Ring programming language version 1.5.1 book - Part 63 of 180
The Ring programming language version 1.5.1 book - Part 63 of 180The Ring programming language version 1.5.1 book - Part 63 of 180
The Ring programming language version 1.5.1 book - Part 63 of 180Mahmoud Samir Fayed
 
CloudClustering: Toward a scalable machine learning toolkit for Windows Azure
CloudClustering: Toward a scalable machine learning toolkit for Windows AzureCloudClustering: Toward a scalable machine learning toolkit for Windows Azure
CloudClustering: Toward a scalable machine learning toolkit for Windows AzureAnkur Dave
 
AJUG April 2011 Raw hadoop example
AJUG April 2011 Raw hadoop exampleAJUG April 2011 Raw hadoop example
AJUG April 2011 Raw hadoop exampleChristopher Curtin
 
Q-learning and Deep Q Network (Reinforcement Learning)
Q-learning and Deep Q Network (Reinforcement Learning)Q-learning and Deep Q Network (Reinforcement Learning)
Q-learning and Deep Q Network (Reinforcement Learning)Thom Lane
 
"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod PolyakovYulia Shcherbachova
 
JVM performance options. How it works
JVM performance options. How it worksJVM performance options. How it works
JVM performance options. How it worksDmitriy Dumanskiy
 
Python Development (MongoSF)
Python Development (MongoSF)Python Development (MongoSF)
Python Development (MongoSF)Mike Dirolf
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 
Data structure programs in c++
Data structure programs in c++Data structure programs in c++
Data structure programs in c++mmirfan
 
The Ring programming language version 1.6 book - Part 70 of 189
The Ring programming language version 1.6 book - Part 70 of 189The Ring programming language version 1.6 book - Part 70 of 189
The Ring programming language version 1.6 book - Part 70 of 189Mahmoud Samir Fayed
 
The Ring programming language version 1.7 book - Part 67 of 196
The Ring programming language version 1.7 book - Part 67 of 196The Ring programming language version 1.7 book - Part 67 of 196
The Ring programming language version 1.7 book - Part 67 of 196Mahmoud Samir Fayed
 
The Ring programming language version 1.9 book - Part 73 of 210
The Ring programming language version 1.9 book - Part 73 of 210The Ring programming language version 1.9 book - Part 73 of 210
The Ring programming language version 1.9 book - Part 73 of 210Mahmoud Samir Fayed
 
The Ring programming language version 1.3 book - Part 44 of 88
The Ring programming language version 1.3 book - Part 44 of 88The Ring programming language version 1.3 book - Part 44 of 88
The Ring programming language version 1.3 book - Part 44 of 88Mahmoud Samir Fayed
 
A Century Of Weather Data - Midwest.io
A Century Of Weather Data - Midwest.ioA Century Of Weather Data - Midwest.io
A Century Of Weather Data - Midwest.ioRandall Hunt
 
The Ring programming language version 1.5.3 book - Part 71 of 184
The Ring programming language version 1.5.3 book - Part 71 of 184The Ring programming language version 1.5.3 book - Part 71 of 184
The Ring programming language version 1.5.3 book - Part 71 of 184Mahmoud Samir Fayed
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineJason Terpko
 

Tendances (20)

The Ring programming language version 1.6 book - Part 69 of 189
The Ring programming language version 1.6 book - Part 69 of 189The Ring programming language version 1.6 book - Part 69 of 189
The Ring programming language version 1.6 book - Part 69 of 189
 
The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210
 
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
 
The Ring programming language version 1.5.1 book - Part 63 of 180
The Ring programming language version 1.5.1 book - Part 63 of 180The Ring programming language version 1.5.1 book - Part 63 of 180
The Ring programming language version 1.5.1 book - Part 63 of 180
 
CloudClustering: Toward a scalable machine learning toolkit for Windows Azure
CloudClustering: Toward a scalable machine learning toolkit for Windows AzureCloudClustering: Toward a scalable machine learning toolkit for Windows Azure
CloudClustering: Toward a scalable machine learning toolkit for Windows Azure
 
AJUG April 2011 Raw hadoop example
AJUG April 2011 Raw hadoop exampleAJUG April 2011 Raw hadoop example
AJUG April 2011 Raw hadoop example
 
Q-learning and Deep Q Network (Reinforcement Learning)
Q-learning and Deep Q Network (Reinforcement Learning)Q-learning and Deep Q Network (Reinforcement Learning)
Q-learning and Deep Q Network (Reinforcement Learning)
 
"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov
 
JVM performance options. How it works
JVM performance options. How it worksJVM performance options. How it works
JVM performance options. How it works
 
Python Development (MongoSF)
Python Development (MongoSF)Python Development (MongoSF)
Python Development (MongoSF)
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
Data structure programs in c++
Data structure programs in c++Data structure programs in c++
Data structure programs in c++
 
The Ring programming language version 1.6 book - Part 70 of 189
The Ring programming language version 1.6 book - Part 70 of 189The Ring programming language version 1.6 book - Part 70 of 189
The Ring programming language version 1.6 book - Part 70 of 189
 
The Ring programming language version 1.7 book - Part 67 of 196
The Ring programming language version 1.7 book - Part 67 of 196The Ring programming language version 1.7 book - Part 67 of 196
The Ring programming language version 1.7 book - Part 67 of 196
 
The Ring programming language version 1.9 book - Part 73 of 210
The Ring programming language version 1.9 book - Part 73 of 210The Ring programming language version 1.9 book - Part 73 of 210
The Ring programming language version 1.9 book - Part 73 of 210
 
The Ring programming language version 1.3 book - Part 44 of 88
The Ring programming language version 1.3 book - Part 44 of 88The Ring programming language version 1.3 book - Part 44 of 88
The Ring programming language version 1.3 book - Part 44 of 88
 
Tricks
TricksTricks
Tricks
 
A Century Of Weather Data - Midwest.io
A Century Of Weather Data - Midwest.ioA Century Of Weather Data - Midwest.io
A Century Of Weather Data - Midwest.io
 
The Ring programming language version 1.5.3 book - Part 71 of 184
The Ring programming language version 1.5.3 book - Part 71 of 184The Ring programming language version 1.5.3 book - Part 71 of 184
The Ring programming language version 1.5.3 book - Part 71 of 184
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
 

En vedette

Big Dating at eHarmony
Big Dating at eHarmonyBig Dating at eHarmony
Big Dating at eHarmonyMongoDB
 
Case Interview Magna
Case Interview MagnaCase Interview Magna
Case Interview Magnajayathirth
 
Write Yourself a Storage Engine in 40 Minutes
Write Yourself a Storage Engine in 40 MinutesWrite Yourself a Storage Engine in 40 Minutes
Write Yourself a Storage Engine in 40 MinutesMongoDB
 
Solving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDBSolving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDBMongoDB
 
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChangerZero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChangerMongoDB
 
MongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, AnalyticsMongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, AnalyticsMongoDB
 
Lightning Talk: Get Even More Value from MongoDB Applications
Lightning Talk: Get Even More Value from MongoDB ApplicationsLightning Talk: Get Even More Value from MongoDB Applications
Lightning Talk: Get Even More Value from MongoDB ApplicationsMongoDB
 

En vedette (7)

Big Dating at eHarmony
Big Dating at eHarmonyBig Dating at eHarmony
Big Dating at eHarmony
 
Case Interview Magna
Case Interview MagnaCase Interview Magna
Case Interview Magna
 
Write Yourself a Storage Engine in 40 Minutes
Write Yourself a Storage Engine in 40 MinutesWrite Yourself a Storage Engine in 40 Minutes
Write Yourself a Storage Engine in 40 Minutes
 
Solving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDBSolving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDB
 
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChangerZero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
Zero to 1 Billion+ Records: A True Story of Learning & Scaling GameChanger
 
MongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, AnalyticsMongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, Analytics
 
Lightning Talk: Get Even More Value from MongoDB Applications
Lightning Talk: Get Even More Value from MongoDB ApplicationsLightning Talk: Get Even More Value from MongoDB Applications
Lightning Talk: Get Even More Value from MongoDB Applications
 

Similaire à The Weather of the Century: MongoDB for Storing and Analyzing Historical Weather Data

MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB
 
Building and Scaling the Internet of Things with MongoDB at Vivint
Building and Scaling the Internet of Things with MongoDB at Vivint Building and Scaling the Internet of Things with MongoDB at Vivint
Building and Scaling the Internet of Things with MongoDB at Vivint MongoDB
 
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...MongoDB
 
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...MongoDB
 
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...InfluxData
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance TuningMongoDB
 
WiredTiger MongoDB Integration
WiredTiger MongoDB Integration WiredTiger MongoDB Integration
WiredTiger MongoDB Integration MongoDB
 
Mongo db world 2014 billrun
Mongo db world 2014   billrunMongo db world 2014   billrun
Mongo db world 2014 billrunMongoDB
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series DataMongoDB
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Handling 20 billion requests a month
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a monthDmitriy Dumanskiy
 
Intravert Server side processing for Cassandra
Intravert Server side processing for CassandraIntravert Server side processing for Cassandra
Intravert Server side processing for CassandraEdward Capriolo
 
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"DataStax Academy
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseAll Things Open
 
MongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB
 
Getting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETGetting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETTomas Jansson
 
MongoDB World 2014 - BillRun, Billing on top of MongoDB
MongoDB World 2014 - BillRun, Billing on top of MongoDBMongoDB World 2014 - BillRun, Billing on top of MongoDB
MongoDB World 2014 - BillRun, Billing on top of MongoDBOfer Cohen
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance TuningPuneet Behl
 

Similaire à The Weather of the Century: MongoDB for Storing and Analyzing Historical Weather Data (20)

MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor Management
 
Building and Scaling the Internet of Things with MongoDB at Vivint
Building and Scaling the Internet of Things with MongoDB at Vivint Building and Scaling the Internet of Things with MongoDB at Vivint
Building and Scaling the Internet of Things with MongoDB at Vivint
 
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...
MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Agg...
 
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
MongoDB for Time Series Data: Analyzing Time Series Data Using the Aggregatio...
 
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
WiredTiger MongoDB Integration
WiredTiger MongoDB Integration WiredTiger MongoDB Integration
WiredTiger MongoDB Integration
 
Mongo db world 2014 billrun
Mongo db world 2014   billrunMongo db world 2014   billrun
Mongo db world 2014 billrun
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
 
Handling 20 billion requests a month
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a month
 
Intravert Server side processing for Cassandra
Intravert Server side processing for CassandraIntravert Server side processing for Cassandra
Intravert Server side processing for Cassandra
 
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series Database
 
MongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB and the Internet of Things
MongoDB and the Internet of Things
 
dfl
dfldfl
dfl
 
Getting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETGetting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NET
 
MongoDB World 2014 - BillRun, Billing on top of MongoDB
MongoDB World 2014 - BillRun, Billing on top of MongoDBMongoDB World 2014 - BillRun, Billing on top of MongoDB
MongoDB World 2014 - BillRun, Billing on top of MongoDB
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
Sensmon couchdb
Sensmon couchdbSensmon couchdb
Sensmon couchdb
 

Plus de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Plus de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Dernier

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Dernier (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

The Weather of the Century: MongoDB for Storing and Analyzing Historical Weather Data

  • 1. #MongoDB The Weather of the Century: Design and High Performance André Spiegel Consulting Engineer, MongoDB
  • 2. What was the weather when you were born?
  • 3.
  • 4. Data Format: Raw and in MongoDB 0303725053947282013060322517+40779-073969FM-15+0048KNYC V0309999C00005030485MN0080475N5+02115+02005100975 ADDAA101000095AU100001015AW1105GA1025+016765999GA2045+024385999 GA3075+030485999GD11991+0167659GD22991+0243859GD33991+0304859... { "st" : "u725053", "ts" : ISODate("2013-06-03T22:51:00Z"), "airTemperature" : { "value" : 21.1, "quality" : "5" }, "atmosphericPressure" : { "value" : 1009.7, "quality" : "5" } }
  • 5. Data Format: Raw and in MongoDB 0303725053947282013060322517+40779-073969FM-15+0048KNYC V0309999C00005030485MN0080475N5+02115+02005100975 ADDAA101000095AU100001015AW1105GA1025+016765999GA2045+024385999 GA3075+030485999GD11991+0167659GD22991+0243859GD33991+0304859... { "st" : "u725053", "ts" : ISODate("2013-06-03T22:51:00Z"), "airTemperature" : { "value" : 21.1, "quality" : "5" }, "atmosphericPressure" : { "value" : 1009.7, "quality" : "5" } } Station Identifier (»NYC Central Park«)
  • 6. How Big Is It? • 2.5 billion data points • 4 Terabyte (1.6k per document) • “moderately big”
  • 7. How to do this with MongoDB?
  • 8. First Deployment • A single server with a really big disk Application mongod i2.8xlarge 251 GB RAM 6 TB SSD c3.8xlarge
  • 9. Second Deployment • A really big cluster where everything is in RAM Application / mongos ... 100 x r3.2xlarge @ 61 GB RAM 100 GB disk mongod c3.8xlarge
  • 10. Second Deployment • A really big cluster where everything is in RAM Application / mongos ... 100 x r3.2xlarge @ 61 GB RAM 100 GB disk mongod
  • 11. Now... how much would you pay? ..
  • 12. Now... how much would you pay? $60,000 / yr ..
  • 13. Now... how much would you pay? $60,000 / yr .. $700,000 / yr
  • 14. Use Cases • Bulk loading – getting all data into the system • Latency and throughput for queries – point in space-time – one station, one year – the whole world, once upon a time • Aggregation and Exploration – warmest and coldest day ever, etc.
  • 15. Bulk Loading: Principles • On the application side: – batch size – number of client threads – use unordered bulk writes • On the server side: – Journaling off ( temporarily! ) – Index later – In cluster: pre-split, no balancing
  • 16. Bulk Loading: Single Server batch size threads through put
  • 17. Bulk Loading: Single Server batch size threads through put 8 threads, batch size 100 → 85,000 doc/s
  • 18. Bulk Loading: Single Server • Settings: 8 threads batch size 100 • Total loading time: 10 h 20 min • Documents per second: 70,000 • Index build time: 7 h 40 min (ts_1_st_1)
  • 20. Bulk Loading: Cluster 144 threads, batch size 200 → 220,000 doc/s
  • 21. Bulk Loading: Cluster • Shard Key: Station ID, hashed • Settings: 10 mongos @ 144 threads batch size 200 • Total loading time: 3 h 10 min • Documents per second: 228,000 • Index build time: 5 min (ts_1_st_1)
  • 22. Queries: Point in Space-Time db.data.find({"st" : "u747940", "ts" : ISODate("1969-07-16T12:00:00Z")})
  • 23. Queries: Point in Space-Time db.data.find({"st" : "u747940", 1.6 1.4 1.2 1 0.8 0.6 0.4 0.2 0 "ts" : ISODate("1969-07-16T12:00:00Z")}) single server cluster ms avg 95th 99th max. throughput: 40,000/s 610,000/s (10 mongos)
  • 24. Queries: One Station, One Year db.data.find({"st" : "u103840", "ts" : {"$gte": ISODate("1989-01-01"), "$lt" : ISODate("1990-01-01")}})
  • 25. Queries: One Station, One Year db.data.find({"st" : "u103840", 5000 4000 3000 2000 1000 0 "ts" : {"$gte": ISODate("1989-01-01"), "$lt" : ISODate("1990-01-01")}}) single server cluster ms avg 95th 99th max. throughput: 20/s 430/s (10 mongos) targeted query
  • 26. Queries: The Whole World, Once Upon... db.data.find({"ts" : ISODate("2000-01-01T00:00:00Z")})
  • 27. Queries: The Whole World, Once Upon... db.data.find({"ts" : ISODate("2000-01-01T00:00:00Z")}) 10000 8000 6000 4000 2000 0 single server cluster ms avg 95th 99th max. throughput: 8/s 310/s (10 mongos) scatter/gather query
  • 28. Analytics and Exploration • Analytics means ad-hoc queries for which we do not have an index – Find all tornados – Maximum reported temperature • We cannot just index everything – memory – write performance
  • 29. Analytics: Find all Tornados db.data.find ({ "presentWeatherObservation.condition" : "99" })
  • 30. Analytics: Find all Tornados db.data.find ({ "presentWeatherObservation.condition" : "99" }) 1 h 28 min Single Server
  • 31. Analytics: Find all Tornados db.data.find ({ "presentWeatherObservation.condition" : "99" }) 47 s Cluster 1 h 28 min Single Server
  • 32. Analytics: Maximum Temperature db.data.aggregate ([ { "$match" : { "airTemperature.quality" : { "$in" : [ "1", "5" ] } } }, { "$group" : { "_id" : null, "maxTemp" : { "$max" : "$airTemperature.value" } } } ])
  • 33. Analytics: Maximum Temperature db.data.aggregate ([ { "$match" : { "airTemperature.quality" : { "$in" : [ "1", "5" ] } } }, { "$group" : { "_id" : null, "maxTemp" : { "$max" : "$airTemperature.value" } } } ]) 61.8 °C = 143 °F
  • 34. Analytics: Maximum Temperature db.data.aggregate ([ { "$match" : { "airTemperature.quality" : { "$in" : [ "1", "5" ] } } }, { "$group" : { "_id" : null, "maxTemp" : { "$max" : "$airTemperature.value" } } } ]) 61.8 °C = 143 °F 4 h 45 min Single Server
  • 35. Analytics: Maximum Temperature db.data.aggregate ([ { "$match" : { "airTemperature.quality" : { "$in" : [ "1", "5" ] } } }, { "$group" : { "_id" : null, "maxTemp" : { "$max" : "$airTemperature.value" } } } ]) 61.8 °C = 143 °F 2 min Cluster 4 h 45 min Single Server
  • 36. Summary: Single Server Pro • Cost-effective • Very good latency for single queries Con • Some operations are prohibitive: – Indexing – Table Scans
  • 37. Summary: Cluster Con • High cost Pro • High throughput • Very good latency for single queries • Scatter-gather yields significant speed-up • Analytics are possible ..