MongoDB is used in production at Sailthru for user profiles and analytics. Sailthru migrated from SQL to MongoDB because JSON was already part of its infrastructure, making MongoDB a natural fit. Flexible user profiles keep all of a user's data in a single document rather than spread across multiple tables, which enables personalization and conditional content based on interests and location. MongoDB gives Sailthru high performance, flexible development, strong analytics, and freedom from downtime during schema migrations, but it must be carefully configured, monitored, and scaled in production.
2. Sailthru
• API-based transactional email led to...
• Mass campaign email led to...
• Intelligence and user behavior
• Three engineers built the ESP we always wanted to use
• Clients: Huffpo, AOL, Swirl, Thrillist
3. How We Got To MongoDB from SQL
• JSON was part of Sailthru infrastructure
from start (SQL columns and S3)
• Kept a close eye on CouchDB project
• MongoDB felt like natural fit
• Used for user profiles and analytics
• Migrated one table at a time (very, very
carefully)
4. Our Cloud (roughly)
• 2 load balancers
• Web servers: ui, api, horizon, link, failover
• Database servers: db1–db9
• Queue/processing: q1, q2, proc
• Mailers: jmailer1, jmailer2
5. Sailthru Architecture
• User interface to display stats, build
campaigns and templates, etc (PHP/EC2)
• API, link rewriting, and onsite endpoints
(PHP/EC2)
• Core mailer engine (Java/EC2 and colo)
• Modified-postfix SMTP servers (colo)
• 9 database servers
6. MongoDB Overview
• Nine instances on EC2 (4 two-member
replica sets, 1 backup server)
• About 40 collections
• Largest collection 300mil records, 60GB
• 1000 writes/sec, 2000 reads/sec
7. Users are Documents
• Users aren’t records split among multiple
tables
• End user's lists, clickstream interests, geolocation, browser, time of day, and purchase history become one ever-growing document
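A single-document profile of this shape might look like the sketch below. The field names are illustrative stand-ins suggested by the bullets above, not Sailthru's actual schema:

```python
# Sketch of a one-document user profile (field names are illustrative,
# not Sailthru's real schema).
user_profile = {
    "_id": "user@example.com",
    "lists": ["daily-digest", "promotions"],      # subscription lists
    "interests": {"black": 12, "green": 3},       # clickstream interest scores
    "geo": {"city": {"New York, NY US": 5}},      # activity counts per city
    "browser": "firefox",
    "hour_of_day": {"9": 4, "18": 11},            # time-of-day activity
    "purchase_incomplete": {                      # abandoned shopping cart
        "items": [{"qty": 1, "url": "http://example.com/dress",
                   "title": "Dress"}]
    },
}

# Everything needed to personalize a message is one document lookup away:
cart = user_profile.get("purchase_incomplete")
```

Every template feature on the following slides reads from a single fetched document like this one; no joins are involved.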
9. Profiles Accessible Everywhere
• Put abandoned shopping cart notifications
within a mass email
{if profile.purchase_incomplete}
<p>This is what's in your cart:</p>
{foreach profile.purchase_incomplete.items as item}
{item.qty} <a href="{item.url}">{item.title}</a><br/>
{/foreach}
{/if}
10. Profiles Accessible Everywhere
• Show a section of content conditional on
the user’s location
{if profile.geo.city['New York, NY US'] > 0}
<div>Come to the New York Meetup on the 27th!</div>
{/if}
11. Profiles Accessible Everywhere
• Show different content depending on user
interests as measured by on-site behavior
{select}
{case horizon_interest('black,dark')}
<img src="http://example.com/dress-image-black.jpg" />
{/case}
{case horizon_interest('green')}
<img src="http://example.com/dress-image-green.jpg" />
{/case}
{case horizon_interest('purple,polka_dot,pattern')}
<img src="http://example.com/dress-image-polkadot.jpg" />
{/case}
{/select}
12. Profiles Accessible Everywhere
• Pick top content from a data feed based on
tags
{set('myheadlines', horizon_select(allheadlines, 10))}
{foreach myheadlines as h}
<a href="{h.url}">{h.title}</a><br/>
{/foreach}
13. Other Advantages of MongoDB
• High performance
• Take any parameters from our clients
• Really flexible development
• Great for analytics (internal and external)
• No more downtime for schema migrations
or reindexing
14. How We Run mongod
• mongod --dbpath /path/to/db --logpath /path/to/log/mongodb.log --logappend --fork --rest --replSet main1
• Don’t ever run without replication
• Don’t ever kill -9
• Don’t run without writing to a log
• Run behind a firewall
• Take frequent mongodump backups
• Use --rest, it’s handy
15. Separate DBs By Collections
• Lower-effort than auto-sharding
• Separate databases for different usage
patterns
• Consider consequences of database failure/unavailability
• But make sure your backup and monitoring
strategy is prepared for multiple DBs
16. main DB
• core database functionality, aggregate stats,
editing, low overall usage
• smaller instances than the other databases
• all collections that don’t have scaling challenges go
in here
• will probably never have to shard this
17. email DB
• holds every message ever sent, plus link rewriting
• contains our largest collections (half billion docs)
• high write demands at peak send times
• will probably be the first thing we have to look at
sharding
18. horizon DB
• browsing data for onsite usage
• high number of reads from a very small collection
(aggregate site data) - this may get cached soon
• not that many writes now, will get higher
• logically separated so that a failure caused by
traffic spike will not affect other operations
19. profile DB
• contains only user profiles - around 30 million
• separated out because access is much more
random and much more of the total dataset must
be in memory
• lots of big expensive queries that must happen on
slave/secondary
20. Monitoring
• Some stuff to monitor: faults/sec, index
misses, % locked, queue size, load average
• We check basic status once per minute on all database servers (SMS alerts if down), with email warnings on thresholds every 10 minutes
• Some Cacti graphs (looking to improve)
21. Migrating From MySQL
• Take it one collection at a time (not table)
• Change code to write to both MySQL and
MongoDB
• Write and run script to backfill old data
• Remove code that writes to MySQL
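The dual-write and backfill steps above can be sketched as follows. The `write_mysql`/`write_mongo` functions and the in-memory lists are hypothetical stand-ins for the real persistence calls:

```python
# Sketch of the migration steps; the lists stand in for the two stores.
migrated = []   # stands in for MongoDB
legacy = []     # stands in for MySQL

def write_mysql(doc):
    legacy.append(doc)

def write_mongo(doc):
    migrated.append(doc)

def save_user(doc):
    # Step 1: write to BOTH stores, so MongoDB fills with live data
    # while MySQL remains the source of truth.
    write_mysql(doc)
    write_mongo(doc)

def backfill(old_rows):
    # Step 2: one-off script copies historical rows that predate
    # the dual-write code.
    for row in old_rows:
        if row not in migrated:
            write_mongo(row)

save_user({"email": "a@example.com"})
backfill([{"email": "old@example.com"}])
# Step 3 (not shown): once backfill completes and MongoDB is verified,
# delete write_mysql from save_user.
```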
22. Thoughts On Migrating
• Take advantage of MongoDB’s flexibility
• Rethink your schema
• Reduce the number of tables/collections
24. Develop Your Mental Model of MongoDB
• You don’t need to look at the internals
• But try to gain a working understanding of
how MongoDB operates, especially RAM
and indexes
25. Disk Access Will Kill You
• (on EC2 anyway)
• ... so working set RAM is crucial
• Watch faults/sec in mongostat verrrry
closely... it is the sign of impending doom
• With SSD maybe this isn’t quite as much of
an issue
26. Some Design Questions To Ask
• What is the most common read scenario?
• How common are reads vs writes?
• Embed vs top-level collection?
• Denormalize (double-store data)?
• How many/which indexes?
• Arrays vs hashes for embedding?
• Optimize in favor of your major use cases
27. “But premature optimization is evil”
• Knuth said that about code, which is
flexible and easy to optimize later
• Data is not as flexible as code
• So doing some planning for performance is
usually good when it comes to your data
28. Questions To Ask
• How big will this collection get?
• The bigger the collection, the more
planning it needs
29. Favor Human-Readable Foreign Keys
• DBRefs are a bit cumbersome
• Referencing by MongoId often means doing
extra lookups
• Build human-readable references to save
you doing lookups and manual joins
30. Example
• Store the Template and the Email as strings
on the message object
• { template: "Internal - Blast Notify", email: "support-alerts@sailthru.com" }
• No external reference lookups required
• The tradeoff is basically just disk space
31. Embed vs Top-Level Collections?
• The great question of MongoDB schema
design
• If you can ask the question at all, you might
want to err on the side of embedding
• Don’t embed if the embedding could get
huge
• Don’t feel too bad about denormalizing by
embedding AND storing in a top-level
collection
32. Embedding Pros
• Super-fast retrieval of document with
related data
• Atomic updates
• “Ownership” of embedded document is
obvious
• Usually maps well to code structures
33. Embedding Cons
• Harder to get at, do mass queries
• Does not size up infinitely, will hit 4MB limit
• Hard to create references to embedded
object
• Can’t index within the embedded objects
34. Indexes
• Index all highly frequent queries
• Do less-indexed queries only on slaves
• Reduce the size of indexes wherever you can on big collections
• Don’t sweat the medium-sized collections,
focus on the big wins
35. Take Advantage of Multikey Indexes
• Order matters
• If you have an index on { client_id: 1, email: 1 }
• Then you also have the { client_id: 1 } index "for free"
• but not { email: 1 }
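The prefix rule above (the compound-index prefix rule, in current MongoDB terminology) can be sketched as a tiny predicate. `covered_by_prefix` is an illustrative helper, not a MongoDB API:

```python
def covered_by_prefix(index_keys, query_keys):
    """Return True if query_keys form a leading prefix of index_keys.

    Models MongoDB's compound-index prefix rule: an index on
    (client_id, email) also serves queries on client_id alone,
    but not on email alone.
    """
    return list(query_keys) == index_keys[: len(query_keys)]

index = ["client_id", "email"]
```

So a separate { client_id: 1 } index would be pure overhead, while queries on email alone still need their own index.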
36. Use your _id
• You must use an _id for every collection,
which will cost you index size
• So do something useful with _id
37. Take advantage of fast ^indexes
• Messages have _ids like: 32423.00000341
• Need all messages in blast 32423:
• db.message.blast.find({ _id: /^32423\./ });
• (Yeah, I know the . is ugly. Don't use a dot as a separator if you do this -- unescaped, it matches any character.)
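The idea, sketched client-side in Python: because the regex is anchored with `^`, MongoDB can walk the `_id` index over just the matching prefix range instead of scanning. The `_id` format is the one from the slide:

```python
import re

# _ids that encode the blast id as a prefix
# (format from the slide: "<blast_id>.<zero-padded message number>")
ids = ["32423.00000341", "32423.00000342", "99999.00000001"]

# Anchored (^) prefix pattern, with the dot escaped so it matches a
# literal "." rather than any character.
blast_32423 = [i for i in ids if re.match(r"^32423\.", i)]
```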
38. Organize Indexes To Minimize Working RAM
• Finding the most recent messages sent to a
user:
• The obvious index is { client_id: 1, email: 1, send_time: 1 }
• A more efficient index: { month: 1, client_id: 1, email: 1, send_time: 1 }
40. Minimize Documents Moving On Disk
• Documents get moved when they exceed
their initial size + padding factor
• You will see “moved” in the log
• So if there are fields that are likely to get
populated later, pre-populate them with
empty data on insert (to get Mongo to
preallocate more space)
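A sketch of that pre-population, with hypothetical field names; the actual space reserved depends on BSON sizes and the server's padding factor:

```python
# Pre-populate fields that will be filled in later, so the document is
# allocated closer to its eventual size and is less likely to be moved
# on disk when those fields get written.
def new_message(to_addr):
    return {
        "to": to_addr,
        "opens": [],            # filled in as the user opens the message
        "clicks": [],           # filled in on link clicks
        "delivery_status": "",  # set after the SMTP conversation
    }

msg = new_message("user@example.com")
```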
41. Autoincrement in MongoDB?
• Can be safely emulated with no race
conditions with findAndModify
• Generally best avoided, especially with any
collection with high numbers of inserts
• But useful for human-readable ids!
• 521 vs 4b2d368aed948543a5fca4b4
• We use it for clients and blasts (which is
what our Support team needs)
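A pure-Python stand-in for the pattern: in MongoDB this is one findAndModify with $inc (and upsert) against a counters collection, which is race-free server-side; the lock below plays that atomicity role in the sketch:

```python
import threading

# One counter per named sequence, incremented atomically --
# a local model of the findAndModify + $inc counter pattern.
counters = {}
_lock = threading.Lock()

def next_id(sequence):
    # findAndModify returns the updated counter in a single atomic
    # server-side step; the lock gives the sketch the same guarantee.
    with _lock:
        counters[sequence] = counters.get(sequence, 0) + 1
        return counters[sequence]
```

Each sequence ("client", "blast", ...) advances independently, giving the short human-readable ids the slide mentions.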
42. Notes on Types
• Currency: beware of floating-point
problems, store in pennies
• Dates: BSON Dates are better for
timestamps, not abstract days; store as
YYYYMMDD ints or strings
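Both type rules above can be shown with two small helpers (illustrative, not a Sailthru API):

```python
from datetime import date

def to_cents(amount_str):
    # Store currency as integer pennies to dodge binary floating point
    # (0.1 + 0.2 != 0.3). Parse the string directly; never round floats.
    dollars, _, cents = amount_str.partition(".")
    return int(dollars) * 100 + int(cents.ljust(2, "0")[:2])

def to_yyyymmdd(d):
    # Abstract days (no time-of-day, no timezone) as YYYYMMDD ints;
    # they sort and compare correctly as plain integers.
    return d.year * 10000 + d.month * 100 + d.day
```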
43. Consider Before You Use A Mapper
• ODMs are less necessary than ORMs since
there is much much less mapping to do
• If it helps you, cool -- just make sure you’re
not using one out of relational reflex
• If you are building a scalable system you
don’t want to abstract away performance
44. No Silver Bullet: Use The Right Tool
• We store a fair amount of archival data
(TB) in flatfiles on S3
• Big data that does not need random or
frequent access and would be unwieldy in
the database
• Could this data be in MongoDB? Maybe in
GridFS? Yes. But ultimately cheaper on S3.
45. Queues Are Your Friend
• Sailthru had them when we needed them
(MySQL). We still have them because
they’re so useful
• Allows you to “spread out” the updates
from peak request load
• Allows you to shut off writes in
emergencies or during database upgrades
without losing data
46. Have An Upgrade Plan
• Create a mode for your site where writes
queue instead of going to MongoDB
• Turn off writes
• Point reads to slave (if not using repl sets)
• Do what you have to do (upgrade, etc)
• Point reads back to master
• Turn on writes again
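The queue-instead-of-write mode above can be sketched as a toggle on the write path; all names here are hypothetical stand-ins:

```python
from collections import deque

database = []          # stands in for the MongoDB write path
write_queue = deque()  # holds writes while the database is down
writes_enabled = True

def save(doc):
    if writes_enabled:
        database.append(doc)
    else:
        # Upgrade mode: nothing is lost, writes just wait.
        write_queue.append(doc)

def drain_queue():
    # After the upgrade, replay queued writes in their original order.
    while write_queue:
        database.append(write_queue.popleft())

save({"a": 1})
writes_enabled = False   # upgrade begins
save({"b": 2})           # queued, not lost
writes_enabled = True    # upgrade done
drain_queue()
```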
47. When Things Go Wrong
• mongostat is your friend
• So is the REST interface
• So is the log, grep it for slow queries
• iostat -x 2 to see if disks are saturated
• Always be monitoring to be warned of
coming problems
48. Oh Yeah, By The Way
• We’re hiring developers and sysadmins
• jobs@sailthru.com