#MDBW17
Hitting hardware bottlenecks
PERFORMANCE TIPPING POINTS
Akira Kurogane
Senior TSE, MongoDB (Sydney)
WHY YOU'RE HERE TODAY
Curious about DB performance
Responsible for DB performance
You've experienced a bad "Everything in production is slow!" day at work before, and you'd like to never have one again.
DB PERFORMANCE IS A
COMPLEX EQUATION
• "What if the queries per second rate increased by 50%
compared to now?"
• "What if the queries and aggregations get larger on average?"
• "What if the read to write ratio changes?"
• "What if I downsize the server or use cheaper disk storage to
reduce cost?"
δx = ?
DIFFERENT TECHNOLOGIES,
DIFFERENT PHYSICAL LIMITS
• CPU
• SRAM
• DRAM
• Flash memory
• (Magnetic) Disk heads
Powers of ten apart in data delivery
speed.
TWO DIFFERENT PERFORMANCE-LIMITING
MECHANISMS
1. For any channel
The channel's throughput capacity is saturated -> bottleneck.
Mainly influenced by: the rate of db ops/sec * the cost of the average op
2. Storage I/O channels
The small, fast storage layer is full -> the next, slower level is used
L1 cache -> L2 -> L3 -> RAM -> disk
Mainly influenced by: how much data is 'active'
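The first mechanism is just arithmetic, and worth doing as a back-of-envelope check. A minimal sketch, with entirely hypothetical workload numbers (none of these come from the demos):

```python
# Rough saturation check for any single channel (network, disk, etc.).
# All numbers are hypothetical, not measurements.
ops_per_sec = 5_000            # db ops/sec hitting the server
avg_cost_per_op_mb = 0.02      # avg MB moved per op (docs + index + result)
channel_capacity_mbps = 125    # e.g. a 1 Gbit/s NIC is ~125 MB/s

demand_mbps = ops_per_sec * avg_cost_per_op_mb
utilization = demand_mbps / channel_capacity_mbps
print(f"demand: {demand_mbps:.0f} MB/s, utilization: {utilization:.0%}")
# 5000 ops/s * 0.02 MB = 100 MB/s -> 80% utilized: near the tipping point.
```

Either term growing (more ops, or costlier ops) pushes utilization toward 100%, which is exactly the "what if" questions on the earlier slide.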
"ACTIVE DATA SET"
Active data set size is not derived simply from total data size, or server
specs.
My definition: The portion of your data where 99%* of reads
are expected to be completed within a fixed latency.
* or 99.9%, or 99.99%, etc., according to your preference
Example:
At PerformanceShopper.com we found 99.9% of reads are either on recently-inserted documents, or from certain small collections.
If we get in-memory latencies for that 99.9%, then disk latency for the other 0.1% is fine.
Our total data size may be ~1 TB but the "active" documents are < 100 GB.
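A hedged back-of-envelope estimate in the spirit of the PerformanceShopper.com example above (the per-category sizes are hypothetical):

```python
# Hypothetical active-data-set estimate: active = the data serving
# the 99.9% of reads we want at in-memory latency.
total_data_gb = 1000          # ~1 TB total on disk
recent_docs_gb = 80           # recently-inserted documents still being read
hot_collections_gb = 15       # certain small, constantly-read collections

active_gb = recent_docs_gb + hot_collections_gb
print(f"active set: {active_gb} GB of {total_data_gb} GB "
      f"({active_gb / total_data_gb:.0%})")
# 95 GB active out of ~1 TB, matching the "< 100 GB" shape of the example.
```

The point is that the active set comes from your read pattern, not from the total data size or the server specs.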
DEMONSTRATION
DEMO #1: CPU BOTTLENECK
db.collection.aggregate([
  { $match: { city: X } },  // indexed lookup
  { $limit: 20000 }
])

db.collection.aggregate([
  { $match: { city: X } },  // indexed lookup
  { $limit: 20000 },
  { $sort: { first_name: 1 } }  // the test difference: sort on an unindexed field
])
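The extra work the second pipeline adds is an in-memory sort of 20,000 documents on an unindexed field, which is pure CPU. A rough, server-free feel for that in plain Python (the fake documents and timings are illustrative only, not MongoDB's implementation):

```python
import random
import string
import timeit

# Build 20,000 fake documents with a random, unindexed field,
# mirroring the $limit: 20000 batch in the demo pipeline.
docs = [{"first_name": "".join(random.choices(string.ascii_lowercase, k=8))}
        for _ in range(20_000)]

copy_only = timeit.timeit(lambda: list(docs), number=100)
copy_and_sort = timeit.timeit(
    lambda: sorted(docs, key=lambda d: d["first_name"]), number=100)
print(f"copy: {copy_only:.3f}s  copy+sort: {copy_and_sort:.3f}s")
# The sort dominates: run enough of these concurrently and the CPU,
# not storage, is the channel that saturates.
```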
A SUDDEN CHANGE IN PERFORMANCE

[Chart: query/sec over time. A normal query/sec rate, then an unwelcome surprise: a 100% CPU bottleneck.]
DEMO #2: NETWORK BOTTLENECK
A simple query that can be switched between tiny and huge result
sizes.
db.collection.find(
  /* find */    { _id: X, nested_array: Y },
                // nested_array is large in every document
  /* project */ { _id: true, "nested_array.$": true }
                // Oops: let's 'accidentally' forget the ".$"
)

Avg result is ~0.4 kB when "nested_array.$" is used; ~1.8 MB when it is not.
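The bandwidth arithmetic for those two result sizes (the NIC capacity is an assumption; the result sizes are from the demo):

```python
# How fast does the 'forgotten projection' saturate the network?
nic_capacity_mb_s = 125          # ~1 Gbit/s NIC (assumption)
small_result_mb = 0.4 / 1024     # ~0.4 kB with "nested_array.$"
big_result_mb = 1.8              # ~1.8 MB without it

print(f"with projection:    {nic_capacity_mb_s / small_result_mb:,.0f} ops/s")
print(f"without projection: {nic_capacity_mb_s / big_result_mb:,.0f} ops/s")
# 125 / 1.8 is ~69 ops/s: a few dozen careless queries per second
# are enough to saturate a gigabit link.
```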
[Chart: network throughput, from ~1 MB/s up to ~143 MB/s. Observations:
• Large drop in ops/sec
• (Test-created) huge change in network usage
• Disproportionately small change to server-side latency]

DEMO 2B: NETWORK BOTTLENECK (MIXED)
DEMO #3: ACTIVE DATA SET GROWS BEYOND
WIREDTIGER CACHE
Collection "foo": 160 GB. Average document size is 1kb,
when uncompressed in the WiredTiger cache.
RAM on the server is 15 GB: WiredTiger cache set to
10GB.
Test query:
db.foo.find({_id: <val>})
for random _id within limited range.
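For uniform random reads over a range, the expected cache-hit rate follows from a very simple model (a sketch, ignoring compression and cache overhead; the queried-range size is an assumption for illustration):

```python
# Expected WiredTiger cache-hit rate for uniform random _id lookups.
# Simple model: hits ~ fraction of the queried range that fits in cache.
wt_cache_gb = 10
queried_range_gb = 8     # the "limited range" being queried (hypothetical)

hit_rate = min(1.0, wt_cache_gb / queried_range_gb)
print(f"expected cache hit rate: {hit_rate:.0%}")
# While the queried range fits inside the 10 GB cache, hits stay ~100%
# and latencies stay in-memory. The tipping point comes when it doesn't.
```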
WHAT YOU SEE IN MONGODB OPCOUNTERS
SLOW COMMANDS IN THE LOGS
2017-05-28T23:03:29.710+0000 I COMMAND [conn19764] command ptp_as_demo.foo command: find
{ find: "foo", filter: { _id: 24401243 }, projection: { long_string: false } }
planSummary: IDHACK keysExamined:1 docsExamined:1 cursorExhausted:1
numYields:0 nreturned:1 reslen:459
locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } },
Collection: { acquireCount: { r: 1 } } }
protocol:op_query 1ms
2017-05-28T23:16:39.359+0000 I COMMAND [conn19993] command ptp_as_demo.foo command: find
{ find: "foo", filter: { _id: 26384539 }, projection: { long_string: false } }
planSummary: IDHACK keysExamined:1 docsExamined:1 cursorExhausted:1
numYields:1 nreturned:1 reslen:459
locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } },
Collection: { acquireCount: { r: 2 } } }
protocol:op_query 12ms
THE STORY IN WIREDTIGER CACHE ACTIVITY

[Chart: bytes read into the WiredTiger cache, 100–300 MB/s scale, comparing a 2-core and an 8-core run]

N.b. to make decompression run slower than typical in this test I artificially constrained CPU cores to just 2.

So far this test is rigged to be pure RAM; disk was avoided.
• Default WiredTiger cache size: 60% of RAM.
• That leaves 40% for the OS and filesystem cache (say ~35% for the filesystem page cache).
This test's active data set size was kept within that combined in-RAM size. Even with mildly compressible document data, more than twice the cache size of document data will be in RAM; it just needs to be decompressed on the fly.
DEMO #4: ACTIVE DATA SET GROWS INTO DISK RANGES
Continue the same test, gradually increasing the range of data being queried to ~10x the WiredTiger cache size.
Rough calculation: by the end > 70% of queries will need to wait for disk.
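The rough calculation can be written down. The split between WiredTiger cache and filesystem page cache, and the compression ratio, are assumptions loosely matching the demo setup:

```python
# Rough model behind the "> 70% of queries wait for disk" estimate.
wt_cache_gb = 10          # uncompressed documents in the WiredTiger cache
fs_cache_gb = 5           # OS page cache, holding *compressed* blocks
compression_ratio = 3     # assume compressed blocks expand ~3x in cache

in_ram_equiv_gb = wt_cache_gb + fs_cache_gb * compression_ratio
queried_range_gb = 100    # ~10x the WiredTiger cache size

miss_fraction = max(0.0, 1 - in_ram_equiv_gb / queried_range_gb)
print(f"queries needing disk: ~{miss_fraction:.0%}")
# ~75%: the same ballpark as the slide's "> 70%".
```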
WHAT YOU SEE IN MONGODB OPCOUNTERS

[Chart: opcounters falling away as the active data set size increases linearly. Welcome to disk-land!]
THE STORY IN WIREDTIGER CACHE ACTIVITY

[Diagram: WT cache activity. Before: RAM ↓ decompress ↓ RAM. Where now? Disk ↓ decompress ↓ RAM.]
AGGREGATE SUMS OF DIFFERENT LATENCIES

Classic latency comparison numbers
--------------------------------------
Main memory reference                 0.1 μs
Compress 1K bytes with snappy           3 μs
Read 4K randomly from SSD             150 μs
Read 1 MB sequentially from memory    250 μs
Read 1 MB seq'ly from 300MB/s SSD   3,300 μs
Disk seek                          10,000 μs
Read 1 MB sequentially from disk   20,000 μs

(RAM to magnetic disk: ~10^5x on random access, ~10^2x on sequential reads.)
THEORETICAL 100MB READ LATENCIES: magnetic disk

RAM vs Disk %   Latency    What the users say
100 / 0         25 ms      (normal)
99 / 1          42 ms      "Everything's really slow"
90 / 10         200 ms     "Everything's broken"
50 / 50         1000 ms    "What do you mean ETA ..."
0 / 100         2000 ms    "... is next week?!"
THEORETICAL 100MB READ LATENCIES: low-end SSD, 300 MB/s

RAM vs SSD %   Latency    What the users say
100 / 0        25 ms      (normal)
99 / 1         28 ms      (normal)
90 / 10        50 ms      "Everything's really slow"
50 / 50        175 ms     "Everything's broken"
0 / 100        330 ms     "ETA within today?"
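Both tables are a simple weighted sum of the per-MB latencies from the comparison numbers above (the slides round; this sketch reproduces the endpoints exactly and the blends approximately):

```python
# Theoretical latency to read 100 MB as a RAM/disk or RAM/SSD blend.
# Per-MB figures from the classic latency comparison table.
RAM_US_PER_MB = 250       # read 1 MB sequentially from memory
DISK_US_PER_MB = 20_000   # read 1 MB sequentially from magnetic disk
SSD_US_PER_MB = 3_300     # read 1 MB sequentially from a 300 MB/s SSD

def read_100mb_ms(ram_fraction, slow_us_per_mb):
    """Blend 100 MB of reads between RAM and the slower medium."""
    us = 100 * (ram_fraction * RAM_US_PER_MB
                + (1 - ram_fraction) * slow_us_per_mb)
    return us / 1000  # microseconds -> milliseconds

for frac in (1.0, 0.99, 0.90, 0.50, 0.0):
    print(f"RAM {frac:4.0%}:"
          f" disk {read_100mb_ms(frac, DISK_US_PER_MB):7.1f} ms"
          f"  ssd {read_100mb_ms(frac, SSD_US_PER_MB):6.1f} ms")
# Endpoints: 100% RAM -> 25 ms; 100% disk -> 2000 ms; 100% SSD -> 330 ms.
```

Note how a mere 1% spill to magnetic disk nearly doubles the total, while 1% to SSD barely registers; that is the whole tipping-point story in one formula.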
WRITE LOADS
The previous demonstrations focused on read-only cases alone. Writes are more I/O bound than reads: every write involves disk access at two points.
• First, all writes go to the journal (commits ~10 times per second).
• Asynchronously, WiredTiger cache blocks marked 'dirty' are compressed and fdatasync'ed to disk (once per minute).
Key point: focus even more on disk util% and WiredTiger cache activity than we did in the previous demonstrations.
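A back-of-envelope sketch of the disk load from those two touch-points (all workload numbers and the compression ratio are hypothetical, and it ignores repeated updates to the same blocks):

```python
# Rough disk load from writes, via the two touch-points above.
writes_per_sec = 2_000
avg_doc_kb = 1.0

# 1) Journal: every write, group-committed ~10x/sec, sequential I/O.
journal_mb_s = writes_per_sec * avg_doc_kb / 1024

# 2) Checkpoint: dirty cache compressed and fdatasync'ed ~once per minute.
checkpoint_interval_s = 60
compression_ratio = 0.5   # assume ~2:1 block compression
checkpoint_burst_mb = (writes_per_sec * avg_doc_kb / 1024
                       * checkpoint_interval_s * compression_ratio)

print(f"journal: ~{journal_mb_s:.1f} MB/s steady; "
      f"checkpoint: ~{checkpoint_burst_mb:.0f} MB burst each minute")
```

The steady journal stream and the once-a-minute checkpoint burst are why disk util% spikes periodically under write load even when reads are fully cached.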
SUMMARY
CPU, NETWORK BOTTLENECKS
• It's unlikely you're suffering from these,
• but on the other hand it's not hard to check them.
• Check it, forget it, move on to storage I/O.
STORAGE
• On a logarithmic scale the difference between disk latency and RAM latency doesn't look so bad ... but here in the real, linear-time universe it is.
• Increased read MB/s into the WiredTiger cache is not a problem if it's being read from the filesystem page cache, but:
• that metric growing from near-zero to hundreds of MB/s warns you that the active data set is getting closer to 'disk-land'.
QUESTIONS? ....

MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Performance Tipping Points - Hitting Hardware Bottlenecks

  • 1. #MDBW17 Hitting hardware bottlenecks PERFORMANCE TIPPING POINTS Akira Kurogane Senior TSE, MongoDB (Sydney)
  • 2. WHY YOU'RE HERE TODAY Curious about DB performance Responsible for DB performance You've experienced a bad "Everything in production is slow!" day at work before, and you'd like to never have one again.
  • 3. #MDBW17 DB PERFORMANCE IS A COMPLEX EQUATION • "What if the queries per second rate increased by 50% compared to now?" • "What if the queries and aggregations get larger on average?" • "What if the read to write ratio changes?" • "What if I downsize the server or use cheaper disk storage to reduce cost?" δx = . . . . .?
  • 4. #MDBW17 DIFFERENT TECHNOLOGIES, DIFFERENT PHYSICAL LIMITS • CPU • SRAM • DRAM • Flash memory • (Magnetic) Disk heads Powers of ten apart in data delivery speed.
  • 5. #MDBW17 TWO DIFFERENT PERFORMANCE-LIMITING MECHANISMS 1. For any channel The channel's throughput capacity is saturated -> bottleneck. Mainly influenced by: The rate of db ops / sec * cost of avg op 2. Storage I/O channels Small, fast storage layer is full -> the next, slower level is used L1 Cache -> L2 -> L3 -> RAM -> Disk Mainly influenced by: How much data is 'active'
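The first mechanism above reduces to simple arithmetic: a channel becomes the bottleneck once ops/sec times average per-op cost reaches its capacity. A minimal sketch in JavaScript (the example numbers are hypothetical, not from the talk):

```javascript
// Utilization of any channel: (ops/sec * avg cost per op) / capacity.
// A value >= 1.0 means the channel is saturated and is the bottleneck.
function channelUtilization(opsPerSec, avgCostPerOp, channelCapacity) {
  return (opsPerSec * avgCostPerOp) / channelCapacity;
}

// e.g. 5,000 queries/sec, each shipping ~0.2 MB, over a ~125 MB/s (1 Gbit/s) NIC:
console.log(channelUtilization(5000, 0.2, 125)); // 8.0 -> saturated 8x over
```

The same function works for any channel (CPU-seconds, disk MB/s, network MB/s) as long as cost and capacity are in the same units.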
  • 6. #MDBW17 "ACTIVE DATA SET" Active data set size is not derived simply from total data size, or server specs. My definition: The portion of your data where 99%* of reads are expected to be completed within a fixed latency. (* or 99.9%, or 99.99%, etc., according to your preference.) Example: At PerformanceShopper.com we found 99.9% of reads are either on recently-inserted documents, or from certain small collections. If we get in-memory latencies for that 99.9%, then disk latency for the other 0.1% is fine. Our total data size may be ~1 TB but the "active" documents are < 100 GB.
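The PerformanceShopper example can be turned into a small calculation: sort data segments hottest-first and take the smallest prefix that covers the chosen share of reads. A sketch (the profile numbers are illustrative, loosely modeled on the example above):

```javascript
// Smallest amount of hottest-first data covering targetShare of all reads.
function activeDataSetBytes(segments, targetShare) {
  const totalReads = segments.reduce((sum, s) => sum + s.reads, 0);
  let covered = 0, bytes = 0;
  for (const seg of segments) {
    bytes += seg.bytes;
    covered += seg.reads;
    if (covered / totalReads >= targetShare) break;
  }
  return bytes;
}

// Hypothetical read profile: reads per 1000, hottest segments first.
const profile = [
  { bytes: 80e9,  reads: 950 }, // recently-inserted documents
  { bytes: 15e9,  reads: 49 },  // certain small collections
  { bytes: 905e9, reads: 1 },   // long tail across ~1 TB total
];
console.log(activeDataSetBytes(profile, 0.999) / 1e9); // 95 (GB) "active"
```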
  • 8. #MDBW17 DEMO #1: CPU BOTTLENECK db.collection.aggregate([ { $match: { city: X } }, //Indexed lookup { $limit: 20000 } ]) db.collection.aggregate([ { $match: { city: X } }, //Indexed lookup { $limit: 20000 }, { $sort: { first_name: 1 } } //Test diff. Unindexed field ])
  • 9. #MDBW17 Normal query/sec rate Unwelcome surprise 100% CPU bottleneck A SUDDEN CHANGE IN PERFORMANCE
  • 10. #MDBW17 DEMO #2: NETWORK BOTTLENECK A simple query that can be switched between tiny and huge result sizes. db.collection.find( /* find */ { _id: X, nested_array: Y }, //Nested array is large in every document /*project*/ { _id: true, "nested_array.$": true } //Oops: Let's 'accidentally' forget the ".$". ) Avg result is ~ 0.4 kB when nested_array.$ is used; ~1.8 MB when it is not
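The tipping point here is easy to estimate: divide network capacity by the average result size. A sketch, assuming a ~125 MB/s (1 Gbit/s) NIC (the NIC figure is an assumption, not from the talk):

```javascript
// Upper bound on queries/sec before result traffic saturates the network link.
function maxQueriesPerSec(avgResultBytes, nicBytesPerSec) {
  return nicBytesPerSec / avgResultBytes;
}

console.log(maxQueriesPerSec(400, 125e6));   // 312500 q/s with the "$" projection
console.log(maxQueriesPerSec(1.8e6, 125e6)); // ~69 q/s without it
```

A ~4500x difference in result size becomes a ~4500x difference in the query rate the link can carry.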
  • 12. #MDBW17 Large drop in ops/sec (Test-created) huge change in network usage Disproportionately small change to server-side latency DEMO 2B: NETWORK BOTTLENECK (MIXED)
  • 13. < - - - - >
  • 14. #MDBW17 DEMO #3: ACTIVE DATA SET GROWS BEYOND WIREDTIGER CACHE Collection "foo": 160 GB. Average document size is 1 kB when uncompressed in the WiredTiger cache. RAM on the server is 15 GB; WiredTiger cache set to 10 GB. Test query: db.foo.find({_id: <val>}) for a random _id within a limited range.
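Whether a given _id range stays cache-resident is a one-line ratio: active documents times average document size over cache size. A sketch (the document counts are hypothetical; the 1 kB document size and 10 GB cache come from the demo setup):

```javascript
// Ratio of uncompressed active data to the WiredTiger cache size.
// > 1.0 means the active set cannot fit and evictions/re-reads begin.
function cacheFitRatio(numActiveDocs, avgDocBytes, cacheBytes) {
  return (numActiveDocs * avgDocBytes) / cacheBytes;
}

const cache = 10 * 1024 ** 3; // 10 GB WiredTiger cache
console.log(cacheFitRatio(2e6, 1024, cache));  // ~0.19: comfortably cached
console.log(cacheFitRatio(20e6, 1024, cache)); // ~1.9: spills out of cache
```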
  • 15. #MDBW17 WHAT YOU SEE IN MONGODB OPCOUNTERS
  • 16. #MDBW17 SLOW COMMANDS IN THE LOGS 2017-05-28T23:03:29.710+0000 I COMMAND [conn19764] command ptp_as_demo.foo command: find { find: "foo", filter: { _id: 24401243 }, projection: { long_string: false } } planSummary: IDHACK keysExamined:1 docsExamined:1 cursorExhausted:1 numYields:0 nreturned:1 reslen:459 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 1ms 2017-05-28T23:16:39.359+0000 I COMMAND [conn19993] command ptp_as_demo.foo command: find { find: "foo", filter: { _id: 26384539 }, projection: { long_string: false } } planSummary: IDHACK keysExamined:1 docsExamined:1 cursorExhausted:1 numYields:1 nreturned:1 reslen:459 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { r: 2 } } } protocol:op_query 12ms
  • 17. #MDBW17 THE STORY IN WIREDTIGER CACHE ACTIVITY N.b. to make decompression run slower than is typical, in this test I artificially constrained the CPU to just 2 cores. (Chart: WiredTiger cache read activity at 100, 200 and 300 MB/s, annotated for 2 cores vs. 8 cores.)
  • 18. So far this test is rigged to be pure RAM; disk was avoided. • Default WiredTiger cache size is 60% of RAM. • That leaves 40% for the OS and filesystem cache (say 35% for the filesystem page cache). Even with mildly compressible document data, more than twice the cache's size in document data will be in RAM; it just needs to be decompressed on the fly. This test's active data set size was kept within that size.
  • 19. #MDBW17 DEMO #4: ACTIVE DATA SET GROWS INTO DISK RANGES Continue the same test, gradually increasing the range of data being queried to ~10x the WiredTiger cache size. Rough calculation: by the end > 70% of queries will need to wait for disk.
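That rough calculation can be sketched as follows, assuming uniform random reads over the active range; the filesystem-cache size and ~3x compression ratio here are assumptions, not measured values:

```javascript
// Rough fraction of uniform random reads that must go to disk:
// data resident in RAM is the WT cache plus the filesystem page cache,
// which holds compressed blocks (so it covers compressionRatio times its size).
function diskHitFraction(activeBytes, wtCacheBytes, fsCacheBytes, compressionRatio) {
  const inRamBytes = wtCacheBytes + fsCacheBytes * compressionRatio;
  return Math.max(0, 1 - inRamBytes / activeBytes);
}

console.log(diskHitFraction(100e9, 10e9, 5e9, 3)); // 0.75 -> ~70%+ wait for disk
```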
  • 20. #MDBW17 WHAT YOU SEE IN MONGODB OP COUNTERS Linear increase in active data set size Welcome to disk-land!
  • 21. #MDBW17 THE STORY IN WIREDTIGER CACHE ACTIVITY RAM RAM ↓ Decompress ↓ RAM Disk WT Cache Activity (before) Where now?
  • 22. #MDBW17 AGGREGATE SUMS OF DIFFERENT LATENCIES Classic latency comparison numbers: main memory reference 0.1 μs; compress 1K bytes with Snappy 3 μs; read 4K randomly from SSD 150 μs; read 1 MB sequentially from memory 250 μs; read 1 MB sequentially from a 300 MB/s SSD 3,300 μs; disk seek 10,000 μs; read 1 MB sequentially from disk 20,000 μs. (A spread of roughly five powers of ten between the fastest and slowest.)
  • 23. #MDBW17 THEORETICAL 100 MB READ LATENCIES (magnetic disk) RAM vs Disk % | Latency | What the users say: 100/0 | 25 ms | (normal); 99/1 | 42 ms | "Everything's really slow"; 90/10 | 200 ms | "Everything's broken"; 50/50 | 1000 ms | "What do you mean ETA ..."; 0/100 | 2000 ms | "... is next week?!"
  • 24. #MDBW17 THEORETICAL 100 MB READ LATENCIES (low-end SSD, 300 MB/s) RAM vs SSD % | Latency | What the users say: 100/0 | 25 ms | (normal); 99/1 | 28 ms | (normal); 90/10 | 50 ms | "Everything's really slow"; 50/50 | 175 ms | "Everything's broken"; 0/100 | 330 ms | "ETA within today?"
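Both tables above come from the same weighted-average arithmetic; a sketch reproducing it (tier latencies taken from the slides' 100 MB figures):

```javascript
// Weighted read latency for a workload split between RAM and a slower tier.
function mixedLatencyMs(ramFraction, ramMs, slowMs) {
  return ramFraction * ramMs + (1 - ramFraction) * slowMs;
}

console.log(mixedLatencyMs(0.99, 25, 2000)); // ~45 ms: 1% spilling to magnetic disk
console.log(mixedLatencyMs(0.99, 25, 330));  // ~28 ms: 1% spilling to low-end SSD
```

The lesson of the tables in one line: even a 1% miss rate to magnetic disk nearly doubles the aggregate latency, because the slow tier's constant dwarfs the fast tier's.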
  • 25. #MDBW17 WRITE LOADS The previous demonstrations focused on read-only cases alone. Writes are more I/O bound than reads. Every write involves disk access at two points. • First, all writes go to the journal. (Commits ~10 times per second.) • Asynchronously, WiredTiger cache blocks are marked 'dirty' -> compressed -> fdatasync'ed to disk (once per minute). Key point: focus even more on disk util% and WiredTiger cache activity than we did in the previous demonstrations.
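The checkpoint half of that write path can be sized with rough arithmetic; a sketch (the write rate, document size and ~3x compression ratio are hypothetical, chosen only to illustrate the calculation):

```javascript
// Rough MB written to data files at each once-per-minute checkpoint:
// dirty document bytes accumulated over a minute, after block compression.
function checkpointFlushMB(writesPerSec, avgDocBytes, compressionRatio) {
  return (writesPerSec * avgDocBytes * 60) / compressionRatio / 1e6;
}

console.log(checkpointFlushMB(5000, 1024, 3)); // ~102 MB per checkpoint
```

If that figure approaches what your disk can absorb alongside journal commits and cache-miss reads, writes become the tipping point.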
  • 27. #MDBW17 CPU, NETWORK BOTTLENECKS • It's unlikely you're suffering from these • But on the other hand it's not hard to check them • Check it, forget it, move on to storage I/O
  • 28. #MDBW17 STORAGE • On a logarithmic scale the difference between disk latency and RAM latency doesn't look so bad ... but here in the real, linear-time universe it is. • Increased read MB/s into the WiredTiger cache is not a problem if it's being read from the filesystem page cache, but: • That metric growing from near-zero to hundreds of MB/s warns you that the active data set is getting closer to 'disk-land'.

Editor's notes

  1. Hello all. I'm happy to be here, to have the opportunity to discuss performance topics with you. My name is Akira and I work at MongoDB as a Technical Service Engineer; that is, I'm a member of the support team. In the support team we have a diverse set of skills, but mine lean towards server development and Linux performance. As we go through this presentation you're going to see four demonstrations. These demonstrations are drawn from lessons I've learnt from supporting MongoDB in the field. Although they have been simplified, all still reflect real-world situations that have caught someone out. Let's begin.
  2. ◉◉◉ In this presentation I'm only going to show perfectly-running software hosted on perfectly good hardware. So what are you going to learn? What is the problem I will address? It's that we, the users and administrators of the software, are not very good at predicting when performance will change. Well, I think the majority of us do have good gut feelings about how much more load we can place on our database servers, but in truth, if we were challenged to answer "When?" and "How much?", we know our estimations are too rough. Some of us have had the experience of discovering our estimations were wrong - very wrong - and even if it hasn't happened to you yet, don't kid yourself. It could.
  3. In the process of trying to predict how your database server will perform in the future, you're probably going to ask the following sorts of 'what if' questions. ◉◉◉◉ These are as simple as you can make them: you're just changing one (◉) variable according to DB metrics in these questions. The problem is each of those single DB metrics relies on multiple variables at the hardware level.
  4. The first reason it's a complex equation: the wildly different response times of these different parts of the modern computer are unintuitive. The human mind isn't naturally suited to process polynomials that have constants of nanoseconds in one place and tenths of a second in another. And to get into semantics, it's not even an equation: it's an algorithm. An algorithm with these polynomials on the inside.
  5. The second reason it's a complex equation: there are two different top-level mechanisms influencing database performance. ◉ ◉ The first and simpler rule is a generic one: when a server process saturates the capacity of any channel, that will be a bottleneck. "This is mainly influenced by: the rate of db ops multiplied by their average execution cost." The first two demonstrations will show examples of these. ◉ ◉ The second is that when the ACTIVE DATA SET grows larger, the fraction of data that has to be accessed from the lower, slower storage levels increases. Data reads are slower by 1 to 2 powers of 10 for each level lower. "This is mainly influenced by: how much data is 'active'." The third and fourth demonstrations will show this mechanism.
  6. I used the term "Active dataset" in the last slide and I will use it again. Is anyone concerned that I am not defining it clearly enough? (Raise hand) Good. You should be pestering me to explain this, it's an important concept. ◉ It is not derived simply from total data size, or server specs. ◉ It is a subjective measure. It depends on what your goals for latency are (personally, or those set by a SLA) ◉ To give one example, to give the general idea, let's imagine I have an ecommerce company called PerformanceShopper. Here's what I might say about my active data set. (Read the example)
  7. Now I begin the demonstrations. The first two will be classic bottleneck cases. The third and fourth ones will show what happens when the active data set overflows available RAM.
  8. This first demonstration will compare these two aggregations. The second one adds a sort on an unindexed field to force high CPU load. Requesting a sort in a query or aggregation doesn't necessarily cause CPU load. If an applicable index exists then the collection data can be iterated in that order. (In this case that would be a compound index on city and first_name, in that order.) But when there is no applicable index the query engine must perform a sort for every query. Once you have hundreds of documents per query that will become a significant cost. These aggregations select 20,000 small documents each. What can the high CPU usage needed to sort them do to your performance? The answer is here (next slide).
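The per-query cost of that missing index can be ballparked: a comparison sort of n documents does on the order of n·log2(n) comparisons, repeated for every request. A sketch (a simple cost model, not a measurement):

```javascript
// Approximate comparison count for sorting n documents in memory.
function sortComparisons(n) {
  return Math.round(n * Math.log2(n));
}

console.log(sortComparisons(20000)); // ~286,000 comparisons per aggregation
```

With the compound index on city and first_name mentioned above, the query engine iterates in index order and this per-query cost disappears.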
  9. ◉ ◉ In these two graphs we can see a blue line representing queries per second in the opcounters graph, and a matching latency graph on the right. As you can see, when the problem begins the latency increases and the rate of database ops per second decreases. You know what I've done here, but I'd like you to imagine this came upon you out of nowhere. You suspect CPU-greedy operations somewhere. What can you do to prove or disprove that? ◉ Simply look at the CPU usage. If the system-wide CPU usage has gone to 100% or close to it, that's it. If it hasn't ... then it isn't a CPU bottleneck. If it has, then of course you want to look for CPU-intensive operations. Use the mongod logs to look for lots of slow commands, or use the profiler. Look for sorts or anything that you could suspect of being relatively compute-intensive. Some examples of other CPU-intensive operations: aggregations that work on large result sets, and authentication functions, which are CPU-intensive by design - so if you're needlessly opening and closing scores or hundreds of new connections each second, that can give you a CPU bottleneck too.
  10. In my time at MongoDB I have only seen one clear, undeniable case of network bottleneck, and it was achieved exactly like the mistake above. For those who are unfamiliar with the positional $ projection operator (shown in orange here) it lets you keep only the nested_array item that matches the query clause, and discard all the others in the same nested array. You can see that the latter query will return several thousand times as much data over the network compared to the first one.
  11. I think we can show everything with this single set of graphs. In the top left graph (opcounters) please pay the most attention to the blue line. It shows the number of find commands drops dramatically. The top right is a graph of database operation latency. Can you see what is strange? The latency, measured in the query engine, has only changed a small amount, but the client is receiving far fewer results per minute. Q: What explains the difference? A: Saturation in the network. ◉ What goes up is network bytes out / sec. To a very high value. In this test case I managed to get a fairly flat ceiling, which I suspect is deliberate rate limitation in AWS. In a normal self-owned data center I'd expect the traffic between other servers on the same LAN to be more variable. Still very high though - LANs these days are very fast.
  12. The previous test was so simple I was afraid you might not believe it was a meaningful demonstration. So I've added this supplementary example to show you the effects of the previous test diluted by effects of another benchmark test running at the same time. Despite the mixed load the key points in a network bottleneck remain the same - ◉ a really high transmission rate, ◉ a drop in operations per second, ◉ but at the same time little difference to the average latency measured server-side.
  13. Intermission topics: By the way, I'm based in the Sydney office of MongoDB. Anyone here from Australia? Akira Kurogane == Michael Castley. Before we go on I want to impress that the issues in the following slides are going to happen to you. The previous demonstrations are problems that can spring upon you suddenly, but they only happen when people don't take care to test what the performance effects of queries etc. are. But the following problems will come upon you without you making any mistake. No mistake other than being insufficiently vigilant about the increasing size of your database.
  14. In the last two demonstrations we looked at the simpler mechanism of running into a single bottleneck. Now I'll bring the second factor into play: changes in active data set size. If you have ample RAM then the database will simply use more and more of that RAM, but when that is exhausted an increasing amount of data will be accessed from the lower, slower storage layers. Databases are, in the great majority of use cases, bound by the latency of the I/O on the server they are running on, rather than CPU. That I/O layer is not going to be the CPU caches - they're used of course, but they have capacities of just MB rather than GB. Instead the majority of reads will be from RAM and/or disk. For this demonstration I created a single 160 GB collection. The test server has 15 GB of RAM, and to be more specific the WiredTiger cache size is 10 GB. In the initial stage of this demonstration I picked a range of documents that only needs 2 GB of space. I warmed up that small data set, then I ran the tests you will see on the following slides.
  15. For those unfamiliar with the term opcounters, I'm referring to those that are sourced from the MongoDB serverStatus output. These are the counts of insert, update, query, getmore and delete etc. commands. In this test I started with a range of ids that needed only 2 GB of cache space. I increased the range of ids being queried by a constant amount. The size of data needed to hold those documents grew from 6, to 7, to 8, 9, 10, 11 GB etc. It's obvious there's an effect as the cache size is exceeded. OK, let's stop and step back for a moment. I'd like you to imagine you're seeing this graph when looking at the performance of the database at your workplace. You're disturbed about the drop in performance, as it is happening to you in real life and not just in some test case you saw at a conference. Something appears to have changed, so you ask the application developers to find out what happened. But then they tell you "The application didn't change; the queries didn't change". With that information alone this is an unexplainable incident. If you're a DBA, having an unexplainable incident like this is exactly where you don't want to be. Of course you need to investigate more deeply. Let's start with the mongod logs.
  16. Although the mongod logs are the number one diagnostic resource, generally they aren't going to give us the insight we need here. But let's pass through for completeness. The slow command log lines above are examples of the query I am running in this test. The top one is from the higher-performance time, the bottom is from a lower-performance time. Let's highlight the differences as best we can. ◉ There aren't many. White: unimportant. Green: important, but identical. Red: different. The green parts show that the query engine is using the same plan type and scanned the same number of index entries and documents - so you can see the query engine is doing the same thing all through. It's just the latencies that have changed. So if the query engine is doing the same thing, let's look at the storage engine.
17. To observe what's happening in the storage engine I'm going to look at the WiredTiger cache activity metrics. As a first picture, here's the OpCounters graph from before. Let's fade that out to show the graph of WiredTiger cache activity from the same time. ◉◉ Here we see the reason: the increasingly slow performance starting at that elbow lines up precisely with the increase in WiredTiger cache activity. ◉ In this test, on this server, you can see that having to read 200MB into the cache every second has caused the performance to drop 10%. When it gets close to 400MB/s the performance has dropped by about 20%. Because there wasn't enough RAM to keep everything in cache, the storage engine had to do more work every single second re-reading out-of-cache data back in, and so the queries became slower on average. ◉ I have to confess I cheated a bit just to make a nice graph for this presentation. I limited the mongod, this mongod running 40,000 queries per second, to just 2 cores, to magnify the effect that decompressing the out-of-cache data has on aggregate performance. With a two-core limit I was able to produce a mild CPU bottleneck. I had to do this because the degradation effect was basically invisible when I used all 8 cores on the server. (At least 8 cores is typical for a 40k/s load.) ◉ So the little side lesson here is: yes, WiredTiger uses compression, and compression and decompression must consume CPU, but it hardly taxes the overall multi-core power of a typical server.
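The "cache activity" plotted here comes from a cumulative serverStatus counter, `db.serverStatus().wiredTiger.cache["bytes read into cache"]`; turning it into a MB/s rate is the same delta trick as with opcounters. A small sketch, with sample values invented to mirror the demo's ~200MB/s phase:

```javascript
// Two samples of the cumulative WiredTiger counter, 60 seconds apart.
// (Field values are invented; the real counter comes from serverStatus.)
const earlier = { ts: 0,  bytesReadIntoCache: 5 * 1024 * 1024 * 1024 };
const later   = { ts: 60, bytesReadIntoCache: 5 * 1024 * 1024 * 1024 + 60 * 200 * 1024 * 1024 };

// MB/s of data being (re-)read into the WiredTiger cache over the interval.
function cacheReadMBps(a, b) {
  return (b.bytesReadIntoCache - a.bytesReadIntoCache) / (b.ts - a.ts) / (1024 * 1024);
}

console.log(cacheReadMBps(earlier, later)); // => 200 (MB/s re-read into cache, every second)
```

A sustained non-trivial value here means the active data set no longer fits in the cache, which is exactly the elbow in the graph.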
18. And now I have to make a second confession. The previous test was rigged in a more fundamental way: it avoided disk, and thus avoided the worst latencies you can get in a server. How is explained on the slide here. The WiredTiger cache should not be configured to use all the RAM. By default it's 60%; that's deliberate, so there'll be a fair amount of RAM left for the kernel to use as filesystem cache. When MongoDB is busy on the server, its data files are the ones the kernel will buffer there. So the filesystem cache becomes, in effect, an in-memory store of compressed document data. Even with mildly compressible data it can hold 2 or 3 times as many documents as there are in the WiredTiger cache. ◉◉ So, at just the cost of decompression, you can have an active data set that is 2 or 3 times as large as the WiredTiger cache, and reads of that data don't need to touch the disk at all.
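A rough back-of-envelope version of that claim (the figures are illustrative, not a sizing formula; it ignores indexes and other processes competing for the filesystem cache):

```javascript
// How much *uncompressed* document data can be served without disk reads when
// the filesystem cache holds compressed blocks alongside the WiredTiger cache.
function inMemoryCapacityGB(ramGB, wtCacheFraction, compressionRatio) {
  const wtCache = ramGB * wtCacheFraction;     // holds uncompressed data
  const fsCache = ramGB - wtCache;             // roughly what's left for the kernel
  return wtCache + fsCache * compressionRatio; // compressed blocks pack in more documents
}

// e.g. 16GB RAM, 60% WiredTiger cache, data that compresses 3:1
console.log(inMemoryCapacityGB(16, 0.6, 3)); // => 28.8 GB of documents, no disk needed
```

So an active data set well beyond the WiredTiger cache size can still stay out of disk-land, for the CPU cost of decompressing on each cache miss.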
19. For the fourth demo let's do the whole thing again, but keep going: make the active data set even larger. For comparison I will include the previous demonstration's test stages again; they'll be on the left. Then on the right we will see the performance decrease caused by using disk. Are you ready? Here it comes, in one hit:
  20. ◉ (Welcome to disk-land) ◉ I have to remind you at this point that the active data set size was only being increased in a steady, linear fashion. You too could go those few extra percent and FALL OFF this cliff. ◉
  21. "How? Why?" you're probably asking. To start with I'm going to show the WiredTiger cache activity again. ◉◉ Can anyone guess what happens to the cache activity when disk reads are introduced? (Can I see a show of hands for those who will think it goes up? Stays even? Down?) ◉◉◉ As you can see the WiredTiger cache activity goes down at this time. The new collection data being read from disk still needs to be decompressed and brought into the cache, but it's queued. It's queued waiting for blocks of file from disk to be delivered. Time to step back and see a new angle. This graph is the story of two-and-a-half storage layers ◉ Everything in RAM ◉ The 'psuedo-layer' of compressed data in RAM ◉ And disk By the way disk utilization at this time, where the performance plummets, jumps straight from near-zero figures to 100%, and then sticks there.
22. So, what is the best explanation, the real explanation, for this behaviour? It's all ABOUT HOW the time cost of running 10 thousand or 10 million database operations is really the aggregate sum of different hardware latencies. Here I show a table I'm sure nearly everyone here is familiar with. It shows that the latencies of the different technologies in our servers are basically powers of ten apart. ◉ The difference in seek times between RAM and magnetic disk is dramatic: 10^5 apart. ◉ It makes the mere 10^2 difference in throughput rate look mild. But remember, 100 times slower is not mild if that's a factor hitting your database throughput. 100x slower is a disaster.
23. I'll give you a moment to read this. I'd like the lesson of this slide to be: even a little disk quickly poisons the average latency. Just one percent of reads going to spinning magnetic disk can halve your performance. And if 1/10th of the data is being sourced from disk instead of RAM, you've lost 90% of your performance. In the next slide I will show the same thing, but for SSD instead of spinning magnetic disk.
  24. You can see that even a low-spec SSD buys you a power-of-ten delay before the issue manifests itself.
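The arithmetic behind these last two slides can be worked through directly. The latency figures below are order-of-magnitude illustrations (an all-in-RAM op at ~0.1ms, one spinning-disk seek at ~10ms, one low-spec SSD random read at ~0.1ms), not measurements from the demo:

```javascript
// Illustrative per-operation latencies in milliseconds.
const IN_MEMORY_MS = 0.1;  // a fully cached database op
const HDD_SEEK_MS  = 10;   // one spinning-disk seek
const SSD_READ_MS  = 0.1;  // one low-spec SSD random read

// Average op time when some fraction of ops must also touch storage.
function avgOpMs(diskFraction, storageMs) {
  return (1 - diskFraction) * IN_MEMORY_MS
       + diskFraction * (IN_MEMORY_MS + storageMs);
}

// 1% of ops hitting spinning disk doubles the average op time,
// i.e. throughput is halved:
console.log(avgOpMs(0.01, HDD_SEEK_MS)); // => 0.2 ms, vs 0.1 ms all-in-RAM
// 10% hitting spinning disk: ~11x slower, i.e. ~90% of throughput gone:
console.log(avgOpMs(0.1, HDD_SEEK_MS));  // => 1.1 ms
// The same 1% on SSD is barely visible, hence the power-of-ten grace period:
console.log(avgOpMs(0.01, SSD_READ_MS)); // => ~0.101 ms
```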
  25. READ MESSAGE ON SLIDE
  26. Let's summarize. Please start thinking of any questions you would like to ask.
27. First: the CPU and network bottlenecks. Even if you have strong suspicions, it's unlikely you're suffering from these. On the other hand, it's not too hard to check them. For your peace of mind just check them, and move on. (For reference, when I work on support cases I don't look at either of these things first; they're that uncommon.)
28. ◉ I would say the second most important thing I was sharing today is that when you get into disk-land, performance drops quickly and hard. The no. 1 important thing was the concept of ACTIVE DATA SET SIZE. It's not your total data size that matters, it's the subset of it that is being actively accessed every minute. That's what needs to be kept in RAM. ◉◉ Lastly: you don't have to be blind to an upcoming "Everything in production is slow" disaster day. Watch your disk metrics over the long term, that's obvious, but also, as demonstrated here today, even though you may currently be keeping 99.99% of reads serviced by RAM, watch your WiredTiger cache activity. If it has been a low value for a long time, but recently it's increasing ... and increasing ... then you've pushed above the uncompressed WiredTiger cache size and you're using an increasing amount of filesystem cache. How far that will go before you run out depends on how compressible your data is. But obviously what you need to do when you see this is get some more RAM before you run out.
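As a minimal sketch of that closing advice: track the cache-read rate (MB/s of "bytes read into cache") day over day and flag a sustained upward trend. The thresholds and sample data below are invented for illustration; pick values that suit your own baseline.

```javascript
// Warn when the WiredTiger cache-read rate is both non-trivial and growing,
// i.e. before the filesystem cache runs out and reads fall through to disk.
function trendWarning(dailyMBps, lowWatermark = 50, growthFactor = 2) {
  const baseline = dailyMBps[0];
  const recent = dailyMBps[dailyMBps.length - 1];
  return recent > lowWatermark && recent >= baseline * growthFactor;
}

console.log(trendWarning([5, 6, 5, 40, 120, 210])); // => true: time to add RAM
console.log(trendWarning([5, 6, 5, 6, 5, 6]));      // => false: steady and low
```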
29. That’s it – but as we have some time left, did anyone want to ask a question? To clarify something, or to ask what would happen if we had X or Y instead of A or B in these demonstrations, or anything else like that?