2. My Background
• For the past year, I’ve looked at MongoDB
logs at least once every day.
• We routinely answer the question “how
can I improve performance?”
Thursday, June 27, 13
3. Who’s this talk for?
• New to MongoDB
• Seeing some slow operations, and need
help debugging
• Running database operations on a sizeable
deploy
• I have a MongoDB deployment, and I’ve hit
a performance wall
Thursday, June 27, 13
4. What should you learn?
Know where to look on a running MongoDB
to uncover slowness, and discuss solutions.
MongoDB has performance “patterns”.
How to think about improving performance.
And . . .
Thursday, June 27, 13
6. MongoDB Indexing
Constraints
• One index per query *
• One range operator per query ($)
• Range operator must be last field in index
• Using RAM well
* except $or, but the sin with $or is appending a sort to the query.
Thursday, June 27, 13
7. The Tools
• `mongostat` for MongoDB Behavior
• `tail` the logs for current options
• `iostat` for disk util
• `top -c` for CPU usage
Thursday, June 27, 13
8. First, a Simple One
query getmore command res faults locked db ar|aw netIn netOut conn time
129 4 7 126m 2 my_db:0.0% 3|0 27k 445k 42 15:36:54
64 4 3 126m 0 my_db:0.0% 5|0 12k 379k 42 15:36:55
65 7 8 126m 0 my_db:0.1% 3|0 15k 230k 42 15:36:56
65 3 3 126m 1 my_db:0.0% 3|0 13k 170k 42 15:36:57
66 1 6 126m 1 my_db:0.0% 0|0 14k 262k 42 15:36:58
32 8 5 126m 0 my_db:0.0% 5|0 5k 445k 42 15:36:59
a truncated mongostat
Alerted due to high CPU
Thursday, June 27, 13
10. Solution
{ $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } }
db.my_collection.ensureIndex({“publisher”: 1, publishedAt: -1}, {background: true})
We are fixing this query
With this index
I would show you the logs, but now they are silent.
Thursday, June 27, 13
11. The Pattern
Inefficient Read Queries from in-memory
table scans cause high CPU load
Caused by not matching indexes to queries.
Thursday, June 27, 13
22. Example 4
A map reduce has gradually run
slower and slower.
Thursday, June 27, 13
23. Finding Offenders
Find the time of the slowest query of the day:
grep '[0-9]{3,100}ms$' $MONGODB_LOG | awk '{print $NF}' | sort -n
Thursday, June 27, 13
25. Solution
Query is slow because it has multiple multi-value operators: $or, $gte, and $lte
Problem
Solution
Change schema to use an “hour_created” attribute:
hour_created: “%Y-%m-%d %H”
Create an index on “hour_created” with followed by “$or” values. Query
using the new “hour_created.”
Thursday, June 27, 13
26. Words of caution
2 / 4 solutions were to add an index.
New indexes as a solution scales poorly.
Thursday, June 27, 13
27. Sometimes . . .
It is best to do nothing, except add shards / add hardware.
Go back to the drawing board on the design.
Thursday, June 27, 13
28. Bad things happen to
good databases?
• ORMs
• Manage your indexes and queries.
• Constraints will set you free.
Thursday, June 27, 13
29. Road Map for
Refactoring
• Measure, measure, measure.
• Find your slowest queries and determine if
they can be indexed
• Rephrase the problem you are solving by
asking “How do I want to query my data?”
Thursday, June 27, 13