10. Mitch Pirtle
• Recovering Joomla! founder
• Mongo Master
• Starting companies since 1995
• Musician, skate punk, football coach
• American idiot living in Turin
11. Important Mitch Facts
• I am not cool. However, I have been called perky.
• I am not Rich. My name is Mitch. Such is life.
• I am internet famous. Just to be clear:
Internet Famous + $1.50 = $1.50
16. All About MongoDB
• Brief introduction to MongoDB
• CONSOLE!
• Really cool discoveries and surprises
• Shameful admissions and painful stories
17. In The Beginning
• We had relational databases. Back then they
were called “databases” and that’s where you
stored your data.
• Primary focus: atomicity, consistency, reliability.
• Was normal to spend 6 hours. ON ONE QUERY.
• I love vacuum tubes; they keep you warm in winter.
• Life was good.
18. What Happened
• Hello, Internet!
• Databases became an immediate source of pain for
scale and performance
• Traffic grew, along with it came bigger
expectations, infinitely more complexity, a slew
of new platforms, and Big Data™
20. Troubled Relations
• Web languages gravitated toward objects, not
3NF entities/relations
• Data grew too large to live on one
physical machine
• Performance needed to be far
better
21. Along came sharding
• Can split your data across multiple machines
• Also splits your query load across multiple
machines
• Like RAID for your data, right?
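The idea on this slide can be sketched in a few lines of Python. This is only an illustration of hash-based routing under simple assumptions: the "machines" are in-memory dicts, and the names `shard_for`, `insert`, `find_by_id`, and `shards` are hypothetical. It is not how MongoDB itself assigns documents to shards.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Hypothetical cluster: three in-memory "machines".
shards = [dict() for _ in range(3)]

def insert(doc: dict) -> int:
    """Store the document on the shard chosen by its _id; return that shard index."""
    idx = shard_for(doc["_id"], len(shards))
    shards[idx][doc["_id"]] = doc
    return idx

def find_by_id(doc_id: str):
    """A lookup by shard key touches exactly one shard."""
    return shards[shard_for(doc_id, len(shards))].get(doc_id)

for i in range(9):
    insert({"_id": f"user{i}", "n": i})
```

Writes and key lookups each land on a single machine, which is where the split in data and query load comes from.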
22. What sharding brought along for the ride
• How do you back this stuff up?
• How do you spread a group query across N machines again?
• How do you run a join query that spans a sharded table?
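One common answer to the group-query question is scatter-gather: each machine computes a partial result, and a coordinator merges them. A minimal sketch, assuming made-up data and the hypothetical names `shard_data`, `partial_group`, and `scatter_gather`:

```python
from collections import Counter

# Hypothetical order documents, already split across three shards.
shard_data = [
    [{"country": "IT", "total": 10}, {"country": "US", "total": 5}],
    [{"country": "US", "total": 7}],
    [{"country": "IT", "total": 1}, {"country": "DE", "total": 4}],
]

def partial_group(docs):
    """Runs on one shard: sum totals per country for local documents only."""
    sums = Counter()
    for d in docs:
        sums[d["country"]] += d["total"]
    return sums

def scatter_gather(all_shards):
    """Runs on the coordinator: merge the per-shard partial sums."""
    merged = Counter()
    for docs in all_shards:
        merged.update(partial_group(docs))  # Counter.update adds values
    return dict(merged)

result = scatter_gather(shard_data)
```

Sums merge cleanly this way; joins across shards have no equally simple recipe, which is the pain the slide points at.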
29. The Promises of MongoDB
• Speed - crazy whack-daddy fast
• Simplicity - JSON documents FTW
• Embedded documents
• 16MB limit
• Scale - sharding, replication out of the box
• Yes, I said whack-daddy.
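To make "embedded documents" and the 16MB limit concrete, here is a small sketch of an order stored as one document instead of several joined tables. The field names, the helper functions, and the sample order are all invented for illustration. The size check uses JSON length as a rough stand-in, since the real limit applies to the BSON encoding.

```python
import json

# Hypothetical order: customer and line items are embedded, not joined.
order = {
    "_id": "order-1001",
    "customer": {"name": "Example Mom", "email": "mom@example.com"},
    "items": [
        {"sku": "TOY-1", "qty": 2, "price_cents": 999},
        {"sku": "BIB-7", "qty": 1, "price_cents": 450},
    ],
}

MAX_DOC_BYTES = 16 * 1024 * 1024  # the 16MB per-document limit

def approx_size(doc) -> int:
    """Approximate document size via JSON; BSON size will differ somewhat."""
    return len(json.dumps(doc).encode("utf-8"))

def order_total_cents(doc) -> int:
    """Everything needed is already in the one document."""
    return sum(item["qty"] * item["price_cents"] for item in doc["items"])

fits = approx_size(order) < MAX_DOC_BYTES
```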
31. Wait, there’s more
• Fulltext: Allows for compound indexes, supports
many languages
• Sharding: You can scale collections across N
machines
• GridFS: Simple interface to store files in your
database (CONSOLE!)
• Replication: Replica Sets make it possible for
read secondaries, failover, redundancy
33. Mini Case Study: Totsy
• First ecommerce site to rely on MongoDB for all
data. Everything. Even product images and
associated media.
• I suspected it would be fast.
• I suspected we could develop quickly.
(This was important, as they only let me hire one
guy.)
35. Launch story
• Went live with MongoDB on a quad-core
consumer-grade el-cheapo machine with only 2GB
RAM.
• I was terrified.
• Over a million moms waiting for the launch.
• Upon launch, load was 0.05. Highest it ever got
was around 0.5.
37. Development impact
• Simple models make for less code. There were
no sixteen-table joins and no ORM; one result had
all the data needed from a single query.
• Less code makes for fewer bugs. No more six-hour
query debugging marathons. No more
learning why UNION was faster than JOIN…
• Fewer bugs leaves time for more code. Did I
mention they only let me hire one guy?
38. Even moar impact
• Used GridFS for all media storage.
• Allowed free MD5 checking for duplicates.
• Allowed storage of metadata per file (views,
comments, rates, whatever else we wanted).
• No need for NFS, clumsy rsync cronjobs, high
costs of NAS or iSCSI.
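The duplicate check and per-file metadata can be illustrated without a database. This is a sketch of the idea only, not the GridFS API: `FileStore` and its methods are hypothetical, and the store is a plain in-memory dict keyed by MD5 digest.

```python
import hashlib

class FileStore:
    """Toy content-addressed store: one entry per distinct file body."""

    def __init__(self):
        self._by_md5 = {}

    def put(self, data: bytes, **metadata):
        """Store data unless an identical body exists; return (digest, was_new)."""
        digest = hashlib.md5(data).hexdigest()
        if digest in self._by_md5:
            return digest, False  # duplicate upload, nothing stored
        self._by_md5[digest] = {
            "data": data,
            "length": len(data),
            "metadata": dict(metadata),  # views, comments, ratings, ...
        }
        return digest, True

    def get(self, digest: str):
        """Return the stored entry for a digest, or None."""
        return self._by_md5.get(digest)

    def __len__(self):
        return len(self._by_md5)
```

Uploading the same bytes twice returns the same digest and stores one copy, and the metadata lives next to the file instead of in a separate table.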
40. The perils of schemaless
• Started prototyping quickly enough
• Made a couple changes to user model
• Made some more changes…
• WHUPS WHY FIFTEEN KINDS OF USER?!?!
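One cheap way to see this drift is to count the distinct "shapes" (sets of field names) in a collection. The sketch below runs on a made-up list of user documents; `shape`, `shape_report`, and the sample data are hypothetical, and a real check would read the documents from the database instead.

```python
from collections import Counter

# Hypothetical user documents after a few rounds of "just one more change".
users = [
    {"_id": 1, "email": "a@example.com", "name": "A"},
    {"_id": 2, "email": "b@example.com", "fullname": "B"},
    {"_id": 3, "email": "c@example.com", "name": "C"},
    {"_id": 4, "mail": "d@example.com", "name": "D", "admin": True},
]

def shape(doc) -> tuple:
    """A document's shape is its sorted tuple of top-level field names."""
    return tuple(sorted(doc.keys()))

def shape_report(docs) -> Counter:
    """Count how many documents share each shape."""
    return Counter(shape(d) for d in docs)

report = shape_report(users)
for fields, count in report.most_common():
    print(count, fields)
```

More than one shape in the report is the early warning that the user model has forked.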
42. Everything in the database!
• Backups were brutal
• Forgot to separate GridFS data from main
database
• Totally unprepared for the operational impact