1. Challenges with MongoDB
Stone Gao
MongoDB Beijing 2012
2. About Me
Tech Lead at Umeng.com
3. MongoDB is Awesome
• Document-oriented storage
• Full Index Support
• Replication & High Availability
• Auto-Sharding
• Querying
• Fast In-Place Updates
• Map/Reduce
• GridFS
4. But...
This talk is not Yet Another Talk about its Awesomeness,
but about the challenges with MongoDB
5. Outline
1. Global Write Lock Sucks
2. Auto-Sharding is not that Reliable
3. Schema-less is Overrated
4. Community Contribution is Quite Low
5. Attitude Matters
6. 1. Global Write Lock Sucks
7. 1. Global Write Lock Sucks
single global write lock for the entire server (process)

mongod                              VS.  mysqld
  db-1 ... db-n                            db-1 ... db-n
    collection1: doc1, doc2                  table1: doc1, doc2
    collection2: doc1, doc2                  table2: doc1, doc2

DB-Process-Level Lock VS. Row-Level Lock
8. 1. Global Write Lock Sucks
Intel SSD 320 RAID10 & mongostat
39.5K Read IOPS / 23K Write IOPS
Even with nearly all data in RAM, the lock ratio is pretty high and
there is a bunch of Queued Writes (qw)
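Those mongostat numbers (lock %, qr|qw) come straight from serverStatus(); a minimal shell sketch for spot-checking them, assuming the ~2.0-era field layout this talk targets (later releases moved some of these counters):

// Spot-check the global lock from the mongo shell (MongoDB ~2.0 field layout).
var s = db.serverStatus();

// Fraction of time since startup that the global write lock was held.
var lockRatio = s.globalLock.lockTime / s.globalLock.totalTime;

// Operations currently parked waiting on the lock: mongostat's qr|qw columns.
print("lock ratio:     " + (lockRatio * 100).toFixed(1) + "%");
print("queued readers: " + s.globalLock.currentQueue.readers);
print("queued writers: " + s.globalLock.currentQueue.writers);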
12. Possible Solutions/Workarounds #1
Wait for the lock-related issues on JIRA
• SERVER-2563 : When hitting disk, yield lock - phase 1
  https://jira.mongodb.org/browse/SERVER-2563 Fixed in 1.9.1 Vote (25)
  "any time we actually have to hit disk. so if a memory mapped page is not in ram, then we should yield"
  (update by _id, remove, long cursor iteration)
• SERVER-1240 : Collection level locking
  https://jira.mongodb.org/browse/SERVER-1240 Planning Bucket A Vote (154)
• SERVER-1241 : Intra collection locking (maybe extent)
  https://jira.mongodb.org/browse/SERVER-1241 Planning Bucket A Vote (25)
• SERVER-1169 : Record level locking
  https://jira.mongodb.org/browse/SERVER-1169 Rejected Vote (1)
and more ...
13. Possible Solutions/Workarounds #2
One Collection per DB to Reduce the Lock Ratio
But you can go no further than that; a sketch of the trick follows.
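A minimal sketch of the per-database split, with hypothetical database and collection names. Note this pays off once locking is per database (MongoDB 2.2+); on an older server the practical version was one mongod process per database:

// Hypothetical layout: each hot collection lives in its own database, so
// writes to one no longer contend with writes to the other once the server
// locks per database (before that, run one mongod per database).
var conn = new Mongo("localhost:27017");

// Instead of app.events and app.profiles inside one database...
var events   = conn.getDB("app_events").data;    // one DB per collection
var profiles = conn.getDB("app_profiles").data;

events.insert({ uid: 42, type: "click", ts: new Date() });
profiles.update({ uid: 42 }, { $set: { lastSeen: new Date() } });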
Auto-Sharding to the rescue?
14. 2. Auto-Sharding is not that Reliable
16. Problems with Auto-Sharding
• MongoDB can’t figure out how many docs are in a collection after sharding
• Balancer deadlock
  [Balancer] skipping balancing round during ongoing split or move activity.
  [Balancer] dist_lock lock failed because taken by....
  [Balancer] Assertion failure cm s/balance.cpp...
• Uneven shard load distribution
• ...
(Note: I ran these experiments before 2.0, so some of the issues might be fixed
or improved in newer versions of MongoDB because it's evolving very fast)
17. Possible Solutions/Workarounds #1
Manual Chunk Pre-Splitting
http://www.mongodb.org/display/DOCS/Splitting+Shard+Chunks
https://groups.google.com/d/msg/mongodb-user/tYBFKSMM3cU/TiYtoOiNMgEJ
http://blog.zawodny.com/2011/03/06/mongodb-pre-splitting-for-faster-data-loading-and-importing/
0) Turn off the balancer (balancing won't understand your locations, but it shouldn't matter b/c
you're using hashed shard keys)
1) Shard the empty collection over the shard key { location : 1, hash : 1 }
2) run db.runCommand({ split : "<coll>", middle : { "location":"DEN", "hash": "8000...0" }})
3) run db.runCommand({ split : "<coll>", middle : { "location":"SC", "hash": "0000...0" }})
4) move those empty chunks to whatever shards you want
- Greg Studer
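A hedged mongo-shell version of those steps: the database name, shard names, and the written-out hash boundary values below are placeholders for illustration, standing in for the elided "8000...0" / "0000...0" above.

// 0) Turn off the balancer (the documented mechanism circa MongoDB 2.0).
db.getSiblingDB("config").settings.update(
    { _id: "balancer" }, { $set: { stopped: true } }, /* upsert */ true);

// 1) Shard the still-empty collection on { location: 1, hash: 1 }.
db.adminCommand({ enableSharding: "mydb" });
db.adminCommand({ shardCollection: "mydb.coll", key: { location: 1, hash: 1 } });

// 2) + 3) Split the key space into empty chunks at chosen middle points.
db.adminCommand({ split: "mydb.coll", middle: { location: "DEN", hash: "80000000" } });
db.adminCommand({ split: "mydb.coll", middle: { location: "SC",  hash: "00000000" } });

// 4) Move the empty chunks to whatever shards you want.
db.adminCommand({ moveChunk: "mydb.coll",
                  find: { location: "DEN", hash: "80000000" },
                  to:   "shard0001" });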
18. Possible Solutions/Workarounds #2
SERVER-2001 : Option to hash shard key
https://jira.mongodb.org/browse/SERVER-2001 Unresolved Fix Version/s: 2.1.1 Vote (27)
“The lack of hashing based read/write distribution
amongst available shards is a huge issue for us now.
We're actually considering implementing an app-side
layer to do this but that obviously has a number of
serious drawbacks.”
- Remon van Vliet
“Seems like a good idea: we implemented hashed
shard keys on the client side: operation rate skyrocketed
(x3 and less variability). Balancing is moreover
quicker and done during our very heavy insertion
process: perfect!”
- Grégoire Seux
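A sketch of the client-side workaround both quotes describe: compute the hash yourself and shard on it. hex_md5() is built into the mongo shell; the collection and field names here are illustrative.

// The collection would be sharded on the precomputed hash field, e.g.:
//   db.adminCommand({ shardCollection: "mydb.events", key: { hash: 1 } })
function insertEvent(col, userId, payload) {
    col.insert({
        uid:  userId,
        hash: hex_md5(String(userId)),  // client-side hash used as shard key
        data: payload
    });
}

insertEvent(db.getSiblingDB("mydb").events, 42, { type: "click" });

With the hash leading the shard key, inserts scatter across the pre-split chunks instead of hammering the single hot chunk at the top of a monotonically increasing key range.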
19. Possible Solutions/Workarounds #3
Plain-old Application Level Sharding
(figure: Twitter Gizzard's forwarding table - https://github.com/twitter/gizzard/raw/master/doc/forwarding_table.png)
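A minimal sketch of the idea, assuming hypothetical hosts and a naive modulo mapping (a production setup would use a forwarding table like Gizzard's, so shards can be added without rehashing everything):

// The application, not mongos, routes each key to one of N independent mongods.
var shards = [
    new Mongo("mongo-0.example.com:27017"),
    new Mongo("mongo-1.example.com:27017"),
    new Mongo("mongo-2.example.com:27017")
];

function shardFor(key) {
    // Stable hash of the key -> shard index (hex_md5 is built into the shell).
    var h = parseInt(hex_md5(String(key)).substring(0, 8), 16);
    return shards[h % shards.length].getDB("app").events;
}

shardFor(42).insert({ uid: 42, type: "click" });
shardFor(42).findOne({ uid: 42 });  // the same key always routes to the same shard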
20. 3. Schema-less is Overrated
21. Schema-less is Overrated
Schema-Free (schema-less) is not free:
it means repeating the schema in every doc (record)!
22. Possible Solutions/Workarounds #1
Use Short Key Names
1.6 billion documents

Long key names (243 GB):
{"sequence":"AHAHSPGPGSAVKLPAPHSVGKSALR",
 "location":{
   "chromosome":"19",
   "strand":"-",
   "begin":"51067007",
   "end":"51067085"
 }}

Short key names (183 GB):
{"s":"AHAHSPGPGSAVKLPAPHSVGKSALR",
 "l":{
   "c":"19",
   "s":"-",
   "b":"51067007",
   "e":"51067085"
 }}

60 GB saved!
ref : http://christophermaier.name/blog/2011/05/22/MongoDB-key-names
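One way to get the savings without unreadable application code is a small renaming layer; a sketch with an illustrative map and helper (not a MongoDB feature), where short names only have to be unique within one nesting level:

// Long application-level names -> short on-disk names.
var SHORT = { sequence: "s", location: "l",
              chromosome: "c", strand: "s", begin: "b", end: "e" };

// Recursively rename keys of plain sub-documents (arrays omitted for brevity).
function shorten(doc, map) {
    var out = {};
    for (var k in doc) {
        var v = doc[k];
        out[map[k] || k] = (v !== null && typeof v === "object") ? shorten(v, map) : v;
    }
    return out;
}

db.proteins.insert(shorten({
    sequence: "AHAHSPGPGSAVKLPAPHSVGKSALR",
    location: { chromosome: "19", strand: "-",
                begin: "51067007", end: "51067085" }
}, SHORT));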
23. Possible Solutions/Workarounds #2
SERVER-863 : Tokenize the field names
https://jira.mongodb.org/browse/SERVER-863 planned but not scheduled Vote (66)
“Most collections, even if they don’t contain the same
structure, contain similar structures. So it would make a
lot of sense and save a lot of space to tokenize the field
names.”
“The overall benefit as mentioned by other users is that
you reduce the amount of storage/RAM taken up by
redundant data in each document (so you can use
fewer resources per request, hence gain more throughput
and capacity), while importantly also freeing the
developer from having to pick short, hard-to-read
field names as a workaround for a technical limitation.”
- Andrew Armstrong
24. Possible Solutions/Workarounds #3
SERVER-164 : Option to store data compressed
https://jira.mongodb.org/browse/SERVER-164 planned but not scheduled Vote (126)
“The way Oracle handles this is transparent to the
database server at the block engine level. They
compress the blocks similar to how SAN stores handle
it, rather than at a record level. They use zlib-type
compression and the overhead is less than 5 percent.
Due to the IO access reduction in both the number of
blocks touched and the amount of data transferred, the
overall effect is a cumulative speed increase.
Should MongoDB do it this way? Maybe. But at the end
of the day, the architecture must make Mongo more
scalable, as well as increase the ability to limit the
storage footprint.”
- Michael D. Joy
25. 4. Community Contribution is Quite Low
26. Community Contribution is Quite Low
https://github.com/mongodb/mongo/graphs/impact
https://github.com/mongodb/mongo/contributors
28. 5. Attitude Matters
http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart
MongoDB already has the sweetest API in the
NoSQL world.
I wish more effort were invested in fixing the Hard
Problems: locking, sharding, storage engine...
29. We are hiring
We are doing big-data analytics
• Backend Engineer (MongoDB, Hadoop,
HBase, Storm, Scala, Java, Ruby, Clojure)
• Data Mining Engineer
• DevOps Engineer
• Front End Engineer
hr@umeng.com