For the first time this year, 10gen will be offering a track completely dedicated to Operations at MongoSV, 10gen's annual MongoDB user conference on December 4. Learn more at MongoSV.com
5. Summary/Agenda
• Why does operating in the cloud matter?
• How MongoDB is suited for the cloud
• MongoDB Basics
– Replica Sets, Sharding
• Deploying MongoDB in the cloud
• Amazon - EC2, AWS
• Conclusion
6. What is the cloud?
MongoDB “Cloud Ready”
What does it mean to be ready for the cloud?
Why should you care?
7. What is “the Cloud”?
• Abstract Concept, buzzword, marketing
• Properties in this context:
– Computing as a resource
– Quick deployment (horizontal)
– Flexible deployment (vertical)
– Platform Agnostic
– Distributed
8. Working Well in the Cloud
• Horizontally Scalable
– For Reads (from secondaries) and Writes (sharding)
– Therefore being able to spin up new instances quickly (a defining feature of the cloud) can be leveraged.
• Multi-platform Support
– Windows (EC2, Azure)
– Linux (EC2, Eucalyptus and more)
– Solaris (Joyent)
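As a rough illustration of scaling reads through secondaries, the sketch below builds a MongoDB connection string that names a replica set and asks the driver to prefer secondaries for reads. The host names, the set name `rs0` and the helper `replica_set_uri` are hypothetical; `replicaSet` and `readPreference` are standard connection string options, but check that your driver version honours them. Reads from secondaries can return slightly stale data.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a connection string that lets a driver read from secondaries.

    hosts and replica_set are placeholders supplied by the caller; nothing
    here contacts a server.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    options = urlencode({
        "replicaSet": replica_set,
        "readPreference": read_preference,
    })
    return "mongodb://" + ",".join(hosts) + "/?" + options


if __name__ == "__main__":
    # Hypothetical three-member replica set.
    members = [
        "db1.example.com:27017",
        "db2.example.com:27017",
        "db3.example.com:27017",
    ]
    print(replica_set_uri(members, "rs0"))
```

Writes still go to the single primary of each replica set; spreading writes across machines is what sharding is for.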
9. Why should you care about the Cloud?
• Time to spin up new instances
– Vastly reduced versus hardware deployment
• Deploy close to users
– Office, users, decoupled geographically
• Cost of deployment
– Very cheap to test, scale up as needed
– Pay As You Go options
• No need to be a Datacenter expert
– Focus on what you do well instead
12. Sharding - Typical Set Up
[Diagram: three config DB servers and three mongos routers in front of three shards; each shard is a replica set with one Primary and two Secondaries]
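To make the role of the routing metadata concrete, here is a toy model of what a mongos does with the chunk ranges it caches from the config DBs: given a shard key value, find the range that contains it and send the operation to the shard that owns that range. This is a simplified sketch, not MongoDB's implementation. The class `ToyRouter`, the shard names and the split points are hypothetical, and it assumes one chunk per shard, whereas a real cluster has many chunks per shard and moves them between shards over time.

```python
import bisect


class ToyRouter:
    """Toy stand-in for a mongos: maps a shard key value to a shard.

    split_points are the sorted lower bounds of every chunk after the
    first one, so N split points describe N + 1 chunks.
    """

    def __init__(self, split_points, shards):
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one shard per chunk")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        # Chunk ranges include their lower bound and exclude their upper
        # bound, so a key equal to a split point belongs to the chunk that
        # starts there; bisect_right gives exactly that index.
        return self.shards[bisect.bisect_right(self.split_points, key)]


if __name__ == "__main__":
    router = ToyRouter(["h", "p"], ["shard0", "shard1", "shard2"])
    for user in ("alice", "mongo", "zed"):
        print(user, "->", router.shard_for(user))
```

The point of the sketch is that routing is a pure lookup against cached metadata, which is why the mongos processes themselves hold no data and can be added or removed freely.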
18. General Guidelines
• Use 64-bit only; 32-bit is not recommended
• Primary/Secondary should usually be equal
• High CPU is usually not necessary
• High Memory for large mongod instances
• Disk IO Capacity and Latency are usually a limitation
– RAID 10 or similar usually a good idea
• http://www.mongodb.org/display/DOCS/Amazon+EC2
• http://www.mongodb.org/display/DOCS/Production+Notes
19. EC2 Specific Notes - Instance Sizes
Instance Type    API Name     Available RAM (GB)  Network (Gbps)  Cores  EC2 Units
Standard         m1.small     1.71                0.25            1      1
                 m1.medium    3.75                0.25            1      2
                 m1.large     7.5                 0.5**           2      2
                 m1.xlarge    15                  1**             4      8
Hi-Mem           m2.xlarge    17.1                0.25            2      6.5
                 m2.2xlarge   34.2                0.5             4      13
                 m2.4xlarge   68.4                1**             8      26
Hi-CPU           c1.medium    1.7                 0.25            2      5
                 c1.xlarge    7                   1.0             8      20
Hi-IO            hi1.4xlarge  60.5                10*             8      35
Cluster Compute  cc1.4xlarge  23                  10*             8      33.5
                 cc1.8xlarge  60                  10*             16     88
Micro            t1.micro     ~0.613              ~0.1            ~1     ~2
* CC and high IO nodes have 10Gbps dedicated, but there is a 2Gbps rate limit between the instances and EBS,
** Provisioned IOPS is available on these instances, adds dedicated IO bandwidth (0.5 or 1 Gbps) to normal
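One way to use the table, in line with the "High Memory for large mongod instances" guideline, is to pick the smallest instance whose RAM covers the working set. The sketch below does that for the Standard and Hi-Mem families only, with RAM figures copied from the table. The helper name `smallest_fit` and the 25% headroom factor are assumptions for illustration, not a sizing rule from the talk, and RAM is only one input: network and disk IO limits matter too.

```python
# (API name, RAM in GB) for the Standard and Hi-Mem families, as listed
# in the instance table.
INSTANCES = [
    ("m1.small", 1.71),
    ("m1.medium", 3.75),
    ("m1.large", 7.5),
    ("m1.xlarge", 15),
    ("m2.xlarge", 17.1),
    ("m2.2xlarge", 34.2),
    ("m2.4xlarge", 68.4),
]


def smallest_fit(working_set_gb, headroom=1.25):
    """Return the API name of the smallest listed instance that fits.

    headroom is a hypothetical safety factor applied to the working set;
    returns None when nothing in the list is large enough.
    """
    needed = working_set_gb * headroom
    candidates = [(ram, name) for name, ram in INSTANCES if ram >= needed]
    if not candidates:
        return None
    return min(candidates)[1]


if __name__ == "__main__":
    for size in (5, 10, 20, 60):
        print(size, "GB working set ->", smallest_fit(size))
```

When the function returns None, the working set no longer fits on one listed instance, which is the usual cue to consider sharding rather than a bigger machine.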
22. System Configuration Notes - Linux
• Set file descriptor limits (20,000 or above)
• Turn off atime on filesystem (pre-2.6.30 especially)
• Use ext4/XFS as the filesystem (not ext3)
– kernel >= 2.6.23/2.6.25 respectively
• RAID 10 is recommended everywhere
– mitigates slow volumes (fail the bad volume)
• Do not use large VM pages
• Configure swap to prevent OOM Killer
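Two of the items above are easy to check from a script. The sketch below compares a file descriptor limit against the 20,000 figure from the slide and lists ext4/XFS mounts that lack the `noatime` option, reading the standard Linux `/proc/mounts` format. The function names are hypothetical. When run directly it inspects the limit of the current process, which is not necessarily the limit mongod runs under.

```python
import os
import resource

MIN_FILE_DESCRIPTORS = 20000  # threshold taken from the slide


def fd_limit_ok(soft_limit, minimum=MIN_FILE_DESCRIPTORS):
    """True when the soft open-file limit is unlimited or at least minimum."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= minimum


def mounts_missing_noatime(mounts_text, filesystems=("ext4", "xfs")):
    """Return mount points of the given filesystem types without noatime.

    mounts_text uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in filesystems and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("file descriptor limit ok:", fd_limit_ok(soft))
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as mounts:
            print("without noatime:", mounts_missing_noatime(mounts.read()))
```

Keeping the checks as pure functions over text and numbers makes them easy to run against output collected from many instances at once.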
25. Deploying in the Cloud
• Although there are different challenges when deploying in the cloud, the benefits generally outweigh the difficulties
• MongoDB can be and has been deployed at scale with great success
• Allows developers and DBAs to do what they do best and not have to be datacenter experts (though operational best practices are always a good idea).
26. Getting Started
• Test a deployment in EC2, see the Quickstart Guide:
– http://www.mongodb.org/display/DOCS/Amazon+EC2+Quickstart
• Whitepaper (soon to be updated):
– http://media.amazonwebservices.com/AWS_NoSQL_MongoDB.pdf
Who am I
Experience
How this applies outside the cloud

There is always a Stack Overflow question

And now, Australia has AWS locally too!

Stick to a summary, don’t go too deep (on next slides anyway)
Speak generally about:
* Why would you use the cloud?
* Brief talk about the basic pieces of a MongoDB set up (Ops Workshop tomorrow for more)

This is the definition for this presentation, at least

In order to work well in the cloud, you need to:
Be horizontally scalable
Have multi-platform support
Be smart about deploying in the cloud

Also, everything we go through here applies to non-cloud deployments

Ops Workshop for details

Auto-failover
One primary
Talk about pro/con of configs

Take “typical” config from previous slide, move it to be a shard
What does that mean?
ADDING META DATA (config DB holds it, mongos caches it)

Network capacity is IO capacity, except for PIOPS
Trade-off: use m1.large --> m2.2xlarge as an example

Show PIOPS impact: random reads and writes on the left are both higher and more consistent

Readahead: a lot of bad advice and misunderstandings out there; for random access and small docs, set it low
RAID: already mentioned, but worth mentioning again
Latency: can be a factor. MongoDB lets you deploy across regions, countries etc., but there can be drawbacks
Networking: required for the “health” of a cluster, but also needs to be solid for data extraction and insertion
Contention in busy cloud regions like US-East in Amazon
Shared resources, noisy neighbors

Bring it back to the core proposition: why use the cloud?
MongoDB does work in the cloud, and works well, when configured and used properly
If you are a dev, be a dev, not an ops person or data center engineer, but still, be aware