SlideShare a Scribd company logo
1 of 20
Download to read offline
Operational Best
   Practices
Tales from the field
The Plan
● Review support cases
  ○ Taken from real issues
  ○ Names/ips/dates changed to protect identities
● Analyze reported issues
● Distill best practices
● Summarize takeaways

● Repeat...
Scenario 1
● Fire, it is on fire!

● Users notice response time takes 1-3 sec
● App logs show timeouts
● Server log show socket exceptions
Scenario 1 - Diagnostics
● Logs

● Understanding the timeouts
   ○ Client read timeout set
   ○ Connection closed/discarded
   ○ Symptom not cause


● Server connection exceptions
  ○ Match timing of client timeouts
   ○ Symptom not cause
Scenario 1 - Monitoring
Graphs speak a thousand words
Scenario 1 - Takeaways
● Monitor Logs
  ○ Alert, escalate
  ○ Correlate
● Disk
   ○ Monitor
   ○ Moved to RAID (10)
● Instrument/Monitor App
● Know your application and application (write)
  characteristics
Scenario 2
● Alerts warn that server is running hot
● Random (small) slowdowns
● Increased traffic/queries
Scenario 2 - Symptoms
High use cpu




Similar query
pattern
Scenario 2 - Diagnostics
● Turn on DB Profiling
● Look at logs


Identify query patterns taking longest or with
highest frequency and run explain
Scenario 2 - Explain
db.scenario2.find({...}).sort({...}).explain() {
     "cursor" : "BtreeCursor ABC",
     "nscanned" : 160677,
     "nscannedObjects" : 12015,
     "n" : 55,
     "millis" : 99,
     "scanAndOrder" : true,
     "indexBounds" : {...} }
Scenario 2 - Diagnostics
● Create a compound index
  ○ Used for criteria and sort
  ○ Reduced CPU dramatically
Scenario 2 - Takeaways
●   Performance test/analyze system behavior
●   Load test before deployment
●   Alert on abnormal states
●   High CPU is a sign of poorly indexed
●   Rolling upgrade for indexes
Scenario 3
● General slowdown on login
● High disk utilization
Scenario 3 - Diagnostics
iostat
Device:   rrqm/s wrqm/s r/s   w/s rsec/s wsec/s avgrq-sz avgqu-sz await   svctm %util
sdp       0.00 0.00 0.50      0.00 27.86 0.00 56.00 149.58        20320.00 2010.00 100.00
Scenario 3
$ blockdev --report
RO RA SSZ BSZ StartSec       Size    Device
rw 8096 512 4048    0 1099494850560 /dev/sdp


Huge read-ahead of 4MB
Scenario 3 - Takeaways
●   Pay attention to disk configurations
●   Load testing would have found this early
●   MongoDB depends on the OS a lot
●   Connect the dots from disportionate effects
Best Practices Learned
● System provisioning
  ○   Capacity
  ○   Performance
  ○   Scale
  ○   Configuration
● Logs
  ○ Review
  ○ Alert
  ○ Rotate and collect (per cluster)
Best Practices Learned
● Query/Index Analysis
  ○ Database Profiler
  ○ Run explain periodically (sampled)
  ○ Instrument code, generate metrics
● Plan/test rollouts
  ○ Rolling upgrade for Replica Set
  ○ Generate indexes on secondaries first
  ○ Name services, use redirection
Thanks, more refs
Please take a look at http://mongodb.org (docs)

● Ask on mongodb-user group
● Use MMS or historic monitoring
   ○ Watch for trends
   ○ Create alerts
   ○ Forecast capacity for provisioning
● logrotate unix command
● monitor disk - munin or the like
● iostat, dstat, vmstat, free, netstat
Questions

More Related Content

Viewers also liked

Java Development with MongoDB
Java Development with MongoDBJava Development with MongoDB
Java Development with MongoDBScott Hernandez
 
Petty Cash Management - How To Manage Logs and Transactions
Petty Cash Management - How To Manage Logs and TransactionsPetty Cash Management - How To Manage Logs and Transactions
Petty Cash Management - How To Manage Logs and TransactionsDavid Olson
 
Managing the logs of your (Rails) applications - RailsWayCon 2011
Managing the logs of your (Rails) applications - RailsWayCon 2011Managing the logs of your (Rails) applications - RailsWayCon 2011
Managing the logs of your (Rails) applications - RailsWayCon 2011lennartkoopmann
 
"Grand Challenges" of Log Management
"Grand Challenges" of Log Management"Grand Challenges" of Log Management
"Grand Challenges" of Log ManagementAnton Chuvakin
 
Elastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & KibanaElastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & KibanaSpringPeople
 
Gaining Operational Insights out of Your Logs
Gaining Operational Insights out of Your LogsGaining Operational Insights out of Your Logs
Gaining Operational Insights out of Your LogsAmazon Web Services
 

Viewers also liked (8)

Java Development with MongoDB
Java Development with MongoDBJava Development with MongoDB
Java Development with MongoDB
 
Petty Cash Management - How To Manage Logs and Transactions
Petty Cash Management - How To Manage Logs and TransactionsPetty Cash Management - How To Manage Logs and Transactions
Petty Cash Management - How To Manage Logs and Transactions
 
Managing the logs of your (Rails) applications - RailsWayCon 2011
Managing the logs of your (Rails) applications - RailsWayCon 2011Managing the logs of your (Rails) applications - RailsWayCon 2011
Managing the logs of your (Rails) applications - RailsWayCon 2011
 
"Grand Challenges" of Log Management
"Grand Challenges" of Log Management"Grand Challenges" of Log Management
"Grand Challenges" of Log Management
 
Log Files
Log FilesLog Files
Log Files
 
Elastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & KibanaElastic - ELK, Logstash & Kibana
Elastic - ELK, Logstash & Kibana
 
Gaining Operational Insights out of Your Logs
Gaining Operational Insights out of Your LogsGaining Operational Insights out of Your Logs
Gaining Operational Insights out of Your Logs
 
Logs management
Logs managementLogs management
Logs management
 

Similar to MongoDB Operational Best Practices (mongosf2012)

How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...Red Hat Developers
 
Elasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningElasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningPetar Djekic
 
Security Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetSecurity Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetJuan Berner
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Guglielmo Iozzia
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Hernan Costante
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
 
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache KafkaStrata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafkaconfluent
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and MetricsRicardo Lourenço
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia DatabasesJaime Crespo
 
Oksana Safronova - Will you detect it or not? How to check if security team i...
Oksana Safronova - Will you detect it or not? How to check if security team i...Oksana Safronova - Will you detect it or not? How to check if security team i...
Oksana Safronova - Will you detect it or not? How to check if security team i...NoNameCon
 
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...Redis Labs
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...javier ramirez
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseBig Data Spain
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
 
Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...Umair Shahid
 

Similar to MongoDB Operational Best Practices (mongosf2012) (20)

How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
 
Elasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuningElasticsearch 101 - Cluster setup and tuning
Elasticsearch 101 - Cluster setup and tuning
 
Performance Whackamole (short version)
Performance Whackamole (short version)Performance Whackamole (short version)
Performance Whackamole (short version)
 
Security Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetSecurity Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budget
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
 
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache KafkaStrata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and Metrics
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
Oksana Safronova - Will you detect it or not? How to check if security team i...
Oksana Safronova - Will you detect it or not? How to check if security team i...Oksana Safronova - Will you detect it or not? How to check if security team i...
Oksana Safronova - Will you detect it or not? How to check if security team i...
 
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
Dynomite @ Redis Conference 2016
Dynomite @ Redis Conference 2016Dynomite @ Redis Conference 2016
Dynomite @ Redis Conference 2016
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...
 
Tuning Java Servers
Tuning Java Servers Tuning Java Servers
Tuning Java Servers
 

More from Scott Hernandez

MongoDB 2.8 Replication Internals: Fitting it all together
MongoDB 2.8 Replication Internals: Fitting it all togetherMongoDB 2.8 Replication Internals: Fitting it all together
MongoDB 2.8 Replication Internals: Fitting it all togetherScott Hernandez
 
Advanced Replication Internals
Advanced Replication InternalsAdvanced Replication Internals
Advanced Replication InternalsScott Hernandez
 
Realtime Analytics with MongoDB Counters (mongonyc 2012)
Realtime Analytics with MongoDB Counters (mongonyc 2012)Realtime Analytics with MongoDB Counters (mongonyc 2012)
Realtime Analytics with MongoDB Counters (mongonyc 2012)Scott Hernandez
 
MongoDB Datacenter Awareness (mongosf2012)
MongoDB Datacenter Awareness (mongosf2012)MongoDB Datacenter Awareness (mongosf2012)
MongoDB Datacenter Awareness (mongosf2012)Scott Hernandez
 
Mongo sf easy java persistence
Mongo sf   easy java persistenceMongo sf   easy java persistence
Mongo sf easy java persistenceScott Hernandez
 
MongoDB: Easy Java Persistence with Morphia
MongoDB: Easy Java Persistence with MorphiaMongoDB: Easy Java Persistence with Morphia
MongoDB: Easy Java Persistence with MorphiaScott Hernandez
 
MongoDB: Mastering the shell
MongoDB: Mastering the shellMongoDB: Mastering the shell
MongoDB: Mastering the shellScott Hernandez
 
MongoDB: Backup, Restore, and DR
MongoDB: Backup, Restore, and DRMongoDB: Backup, Restore, and DR
MongoDB: Backup, Restore, and DRScott Hernandez
 
What's new in the MongoDB Java Driver (2.5)?
What's new in the MongoDB Java Driver (2.5)?What's new in the MongoDB Java Driver (2.5)?
What's new in the MongoDB Java Driver (2.5)?Scott Hernandez
 
MongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF MeetupMongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF MeetupScott Hernandez
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksScott Hernandez
 
Mastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellMastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellScott Hernandez
 

More from Scott Hernandez (13)

MongoDB 2.8 Replication Internals: Fitting it all together
MongoDB 2.8 Replication Internals: Fitting it all togetherMongoDB 2.8 Replication Internals: Fitting it all together
MongoDB 2.8 Replication Internals: Fitting it all together
 
Advanced Replication Internals
Advanced Replication InternalsAdvanced Replication Internals
Advanced Replication Internals
 
Realtime Analytics with MongoDB Counters (mongonyc 2012)
Realtime Analytics with MongoDB Counters (mongonyc 2012)Realtime Analytics with MongoDB Counters (mongonyc 2012)
Realtime Analytics with MongoDB Counters (mongonyc 2012)
 
MongoDB Datacenter Awareness (mongosf2012)
MongoDB Datacenter Awareness (mongosf2012)MongoDB Datacenter Awareness (mongosf2012)
MongoDB Datacenter Awareness (mongosf2012)
 
Mongo sf easy java persistence
Mongo sf   easy java persistenceMongo sf   easy java persistence
Mongo sf easy java persistence
 
MongoDB: Easy Java Persistence with Morphia
MongoDB: Easy Java Persistence with MorphiaMongoDB: Easy Java Persistence with Morphia
MongoDB: Easy Java Persistence with Morphia
 
MongoDB: Mastering the shell
MongoDB: Mastering the shellMongoDB: Mastering the shell
MongoDB: Mastering the shell
 
MongoDB: Backup, Restore, and DR
MongoDB: Backup, Restore, and DRMongoDB: Backup, Restore, and DR
MongoDB: Backup, Restore, and DR
 
A Brief MongoDB Intro
A Brief MongoDB IntroA Brief MongoDB Intro
A Brief MongoDB Intro
 
What's new in the MongoDB Java Driver (2.5)?
What's new in the MongoDB Java Driver (2.5)?What's new in the MongoDB Java Driver (2.5)?
What's new in the MongoDB Java Driver (2.5)?
 
MongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF MeetupMongoDB Aug2010 SF Meetup
MongoDB Aug2010 SF Meetup
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacks
 
Mastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellMastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript Shell
 

Recently uploaded

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Recently uploaded (20)

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

MongoDB Operational Best Practices (mongosf2012)

  • 1. Operational Best Practices Tales from the field
  • 2. The Plan ● Review support cases ○ Taken from real issues ○ Names/ips/dates changed to protect identities ● Analyze reported issues ● Distill best practices ● Summarize takeaways ● Repeat...
  • 3. Scenario 1 ● Fire, it is on fire! ● Users notice response time takes 1-3 sec ● App logs show timeouts ● Server log show socket exceptions
  • 4. Scenario 1 - Diagnostics ● Logs ● Understanding the timeouts ○ Client read timeout set ○ Connection closed/discarded ○ Symptom not cause ● Server connection exceptions ○ Match timing of client timeouts ○ Symptom not cause
  • 5. Scenario 1 - Monitoring Graphs speak a thousand words
  • 6. Scenario 1 - Takeaways ● Monitor Logs ○ Alert, escalate ○ Correlate ● Disk ○ Monitor ○ Moved to RAID (10) ● Instrument/Monitor App ● Know your application and application (write) characteristics
  • 7. Scenario 2 ● Alerts warn that server is running hot ● Random (small) slowdowns ● Increased traffic/queries
  • 8. Scenario 2 - Symptoms High use cpu Similar query pattern
  • 9. Scenario 2 - Diagnostics ● Turn on DB Profiling ● Look at logs Identify query patterns taking longest or with highest frequency and run explain
  • 10. Scenario 2 - Explain db.scenario2.find({...}).sort({...}).explain() { "cursor" : "BtreeCursor ABC", "nscanned" : 160677, "nscannedObjects" : 12015, "n" : 55, "millis" : 99, "scanAndOrder" : true, "indexBounds" : {...} }
  • 11. Scenario 2 - Diagnostics ● Create a compound index ○ Used for criteria and sort ○ Reduced CPU dramatically
  • 12. Scenario 2 - Takeaways ● Performance test/analyze system behavior ● Load test before deployment ● Alert on abnormal states ● High CPU is a sign of poorly indexed ● Rolling upgrade for indexes
  • 13. Scenario 3 ● General slowdown on login ● High disk utilization
  • 14. Scenario 3 - Diagnostics iostat Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sdp 0.00 0.00 0.50 0.00 27.86 0.00 56.00 149.58 20320.00 2010.00 100.00
  • 15. Scenario 3 $ blockdev --report RO RA SSZ BSZ StartSec Size Device rw 8096 512 4048 0 1099494850560 /dev/sdp Huge read-ahead of 4MB
  • 16. Scenario 3 - Takeaways ● Pay attention to disk configurations ● Load testing would have found this early ● MongoDB depends on the OS a lot ● Connect the dots from disportionate effects
  • 17. Best Practices Learned ● System provisioning ○ Capacity ○ Performance ○ Scale ○ Configuration ● Logs ○ Review ○ Alert ○ Rotate and collect (per cluster)
  • 18. Best Practices Learned ● Query/Index Analysis ○ Database Profiler ○ Run explain periodically (sampled) ○ Instrument code, generate metrics ● Plan/test rollouts ○ Rolling upgrade for Replica Set ○ Generate indexes on secondaries first ○ Name services, use redirection
  • 19. Thanks, more refs Please take a look at http://mongodb.org (docs) ● Ask on mongodb-user group ● Use MMS or historic monitoring ○ Watch for trends ○ Create alerts ○ Forecast capacity for provisioning ● logrotate unix command ● monitor disk - munin or the like ● iostat, dstat, vmstat, free, netstat