MongoDB Management Service (MMS) is the application for managing MongoDB, created by the engineers who develop MongoDB. MMS provides visibility into the performance of your cluster, alerting when key metrics are out of range, and backup and recovery of your mission-critical data. This session will provide you with an overview of MMS, including installation and setup, and a walkthrough of metrics and alerts. Then we'll compare and contrast the various backup strategies, with a deep dive on using MMS to back up your MongoDB data.
5. MMS User Interface
Navigation tabs take you to the different functional areas of MongoDB Management
Service. Through this interface, you can monitor your deployment, configure alerts
via email or SMS, back up your data and automate your deployments.
7. Monitoring
MMS monitors deployments through a monitoring agent
installed on a host server. One agent can:
• Identify the members of the deployment and server
configuration, MongoDB Version, query profile, logs, etc.
• Dynamically create a graphical representation of the
deployment topology.
• Visualize performance indicators like lock percentage,
reads/writes, queues, etc.
• Enable you to configure alerts for when these numbers
aren’t “normal”
11. What is “Normal”?
• Set a baseline for normal by seeing how your
production environment responds to regular traffic
• Check for spikes in operations – peak or
unexpected load?
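As a rough illustration of the baseline idea above, the following sketch flags readings that sit far above a baseline taken from regular traffic. The sample numbers, the `find_spikes` helper and the 3-standard-deviation threshold are all assumptions made for the example; MMS has its own alerting rules, which this does not reproduce.

```python
from statistics import mean, stdev


def find_spikes(samples, baseline, threshold=3.0):
    """Return (index, value) pairs more than `threshold` standard
    deviations above the mean of the baseline readings."""
    limit = mean(baseline) + threshold * stdev(baseline)
    return [(i, v) for i, v in enumerate(samples) if v > limit]


# Hypothetical ops/sec readings under regular traffic (the baseline).
baseline_ops = [980, 1010, 995, 1020, 1005, 990, 1000]
# Hypothetical readings from today; one of them is far outside the baseline.
today_ops = [1002, 998, 2400, 1015]

spikes = find_spikes(today_ops, baseline_ops)
print(spikes)  # the reading of 2400 at index 2 is flagged
```

Whether a flagged reading is a normal peak or an unexpected load still needs a human look at what the application was doing at the time.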
12. Key Performance Indicators
• Page faults, queues and lock % may be indicators
that you need to scale up or out
• Oplog window indicates how far a secondary can fall
behind the primary before it needs a full resync
• Background average flush indicates if your disks are
struggling
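The oplog window is the time span between the oldest and newest entries still held in the oplog. A minimal sketch of that arithmetic, using made-up timestamps and hypothetical helper names (`oplog_window`, `can_catch_up`) rather than a live query against a server:

```python
from datetime import datetime, timedelta


def oplog_window(first_entry_time, last_entry_time):
    """Time span covered by the entries currently in the oplog."""
    return last_entry_time - first_entry_time


def can_catch_up(secondary_lag, window):
    """A secondary can resume replication only while the operations
    it is missing are still present in the primary's oplog."""
    return secondary_lag < window


# Hypothetical timestamps of the oldest and newest oplog entries.
first = datetime(2014, 6, 1, 0, 0)
last = datetime(2014, 6, 2, 6, 0)
window = oplog_window(first, last)

print(window)                                       # 30-hour window
print(can_catch_up(timedelta(hours=2), window))     # lag fits inside the window
print(can_catch_up(timedelta(hours=36), window))    # lag exceeds the window
```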
13. Proactive Support
Additionally, MongoDB offers Proactive Support for Subscription Customers,
where our engineers are able to monitor your deployment and make
suggestions in order to tweak for better performance or avoid doom.
24. Risks Are Everywhere
• Storage failures
• Power outages
• Programmer error
• Hardware failures
• DC failures
• Cyber attacks
• Weather
25. Analyzing Risk Tolerance
• Relative to any particular risk
– How much data can you afford to lose? (RPO)
– How long can you afford to be offline? (RTO)
– What price are you willing to pay to reduce risk?
• MongoDB solutions
– Replication
– Application/Infrastructure Engineering
– Backups!
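One way to make the RPO and RTO questions concrete is to compare each candidate backup plan against the targets. The sketch below is illustrative only: the `BackupPlan` type, the plan names and all durations are hypothetical, and it treats the gap between recoverable points as the worst-case data loss.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class BackupPlan:
    name: str
    backup_interval: timedelta  # worst-case gap between recoverable points
    restore_time: timedelta     # estimated time to restore and come back online


def meets_targets(plan, rpo, rto):
    """True if the plan loses no more data than the RPO allows and
    restores within the RTO."""
    return plan.backup_interval <= rpo and plan.restore_time <= rto


# Hypothetical targets and plans.
rpo = timedelta(hours=1)
rto = timedelta(hours=8)
plans = [
    BackupPlan("nightly dump", timedelta(hours=24), timedelta(hours=4)),
    BackupPlan("hourly snapshots", timedelta(hours=1), timedelta(hours=2)),
]

acceptable = [p.name for p in plans if meets_targets(p, rpo, rto)]
print(acceptable)
```

The third question on the slide, price, is left out here; in practice it decides which of the acceptable plans is chosen.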
27. Replication
• Built into MongoDB, affects ops and infrastructure
cost
• Tunable durability minimizes risk in case of failure
• Automatic failover completes within a very short
interval
• However…programmer errors will replicate almost
immediately!
28. App and Infrastructure Engineering
There are many potential solutions for ensuring redundancy in
applications and infrastructure, such as:
• Multiple racks
• Multiple data centers
29. Backups
Backing up data is one way to ensure availability and
lower risk. Backups require active engagement,
otherwise:
• Backups can be outdated
• Process can be slow (backup or restore)
• Isolated
• …but they are relatively inexpensive and do well at
minimizing risk
31. mongodump/mongorestore
• Run online or offline
• Oplog aware for point-in-time restores
• Filter in, filter out
• Considerations
– Data size
– Sharding
– Working set
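The options above map onto command-line flags. The sketch below only builds argument lists for `mongodump` and `mongorestore` and does not run them; the helper names and paths are hypothetical. It assumes the standard flags `--out`, `--host`, `--db`, `--collection`, `--oplog` and `--oplogReplay`, and it encodes the rule that `--oplog` applies to full dumps only.

```python
def mongodump_args(out_dir, host=None, db=None, collection=None, oplog=False):
    """Build an argument list for mongodump (hypothetical helper)."""
    if oplog and (db or collection):
        # --oplog captures writes made during a full dump; it cannot be
        # combined with a database or collection filter.
        raise ValueError("--oplog only works for a full dump")
    if collection and not db:
        raise ValueError("--collection requires --db")
    args = ["mongodump", "--out", out_dir]
    if host:
        args += ["--host", host]
    if db:
        args += ["--db", db]
    if collection:
        args += ["--collection", collection]
    if oplog:
        args.append("--oplog")
    return args


def mongorestore_args(dump_dir, oplog_replay=False):
    """Build an argument list for mongorestore (hypothetical helper)."""
    args = ["mongorestore"]
    if oplog_replay:
        # Replays the oplog captured by `mongodump --oplog`.
        args.append("--oplogReplay")
    args.append(dump_dir)
    return args


# Full dump with oplog capture, then a matching restore.
print(mongodump_args("/backups/full", oplog=True))
print(mongorestore_args("/backups/full", oplog_replay=True))
# Filtered dump of a single (hypothetical) collection.
print(mongodump_args("/backups/orders", db="shop", collection="orders"))
```

The resulting lists could be passed to `subprocess.run` on a host where the MongoDB tools are installed.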
32. Storage Level Backup
• Copy files from data directory (e.g. /data)
• File system or block level snapshots
• Fastest way to back up/restore
• Considerations
– Journaling
– Consistency
– Granularity
– Ops coordination, point-in-time
– Snapshot storage
33. MMS Backup
• Cloud or On-Premise
• Support for a variety of deployment systems
– RHEL, Ubuntu, CentOS, Mac, Windows
• Support for a variety of deployment types
– Single, Replica, Sharded
34. MongoDB Backup Approaches
                                        mongodump    File system    MMS Backup
Initial complexity                      Medium       High           Low
Confidence in backups                   Medium       Medium         High
Point-in-time recovery of replica set   Sort of ☺    No             Yes
System overhead                         High         Can be low     Low
Scalable                                No           With work      Yes
Consistent snapshot of sharded system   Difficult    Difficult      Yes
36. Getting Started with MMS Backup
• Sign into MMS
• Install the MMS Backup agent onto one node
• Select the replica sets or sharded cluster to back up
• Start
39. Configurable Backups
• Include/exclude Replica Sets
• Include/exclude Databases and/or Collections
• Control snapshot frequency (as low as 15 minutes)
• Control data retention (up to 1 year)
40. Snapshot Process
• Starting with the initial sync, we rebuild your data in
our data centers and start snapshotting
• By default, snapshots are taken every 6 hours
• Oplog is stored for 24 hours
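With snapshots every 6 hours and 24 hours of stored oplog, a point-in-time restore amounts to picking the latest snapshot at or before the target time and replaying the oplog from there. The sketch below is a simplified model of that selection, not the MMS implementation; `plan_restore`, the timestamps and the rule that the target must fall inside the retention window are assumptions for illustration.

```python
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(hours=6)   # default from the slide
OPLOG_RETENTION = timedelta(hours=24)    # default from the slide


def plan_restore(target, snapshots, now):
    """Return (base_snapshot, oplog_replay_span) for a restore to `target`."""
    if target > now:
        raise ValueError("target time is in the future")
    earlier = [s for s in snapshots if s <= target]
    if not earlier:
        raise ValueError("no snapshot at or before the target time")
    base = max(earlier)
    if target == base:
        return base, timedelta(0)  # restoring a snapshot as-is
    if now - target > OPLOG_RETENTION:
        raise ValueError("target is older than the stored oplog")
    return base, target - base


# Hypothetical timeline: seven snapshots, one every 6 hours.
start = datetime(2014, 6, 1, 0, 0)
snapshots = [start + i * SNAPSHOT_INTERVAL for i in range(7)]
now = datetime(2014, 6, 2, 13, 0)

base, replay = plan_restore(datetime(2014, 6, 2, 9, 30), snapshots, now)
print(base, replay)  # base snapshot at 06:00 on June 2, 3h30m of oplog to replay
```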
44. Sharded Cluster Backups
• Balancer paused every 6 hours (default,
configurable)
• A “no-op” token inserted into oplog, mongos, config
• Oplog applied to backup shards up to token point
Provides consistent state of the cluster across shards
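The token idea can be shown with a toy model: each shard's oplog is a plain list, and the backup copy of every shard applies entries only up to the shared token. The entry names, the `CHECKPOINT-1` token and the `apply_up_to_token` helper are invented for this sketch; it does not reflect how MMS stores or applies the oplog internally.

```python
def apply_up_to_token(oplog, token):
    """Return the entries that precede the checkpoint token, in order."""
    applied = []
    for entry in oplog:
        if entry == token:
            return applied
        applied.append(entry)
    raise ValueError("checkpoint token not found in oplog")


# Hypothetical per-shard oplogs; the same token appears in each one.
shard_oplogs = {
    "shard0": ["a1", "a2", "CHECKPOINT-1", "a3"],
    "shard1": ["b1", "CHECKPOINT-1", "b2", "b3"],
}

# Every backup shard stops at the same logical point in the cluster's history.
state = {
    name: apply_up_to_token(log, "CHECKPOINT-1")
    for name, log in shard_oplogs.items()
}
print(state)
```

Because every shard stops at the same token, writes made after the checkpoint (`a3`, `b2`, `b3` here) are excluded from all shards alike.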
45. Restoring Sharded Cluster
• Select cluster in MMS interface
• Restore from pre-built snapshot or request
checkpoint restore (15 minute window)
• Download one data file per shard and one for config