Eric Lubow's presentation describes how SimpleReach fixed problems with its MongoDB deployment. The team runs sharded replica sets distributed across availability zones for availability and speed, separates databases to work around database-level locking, and enforces consistent access patterns. SimpleReach also controls its data flow with NSQ, which routes data to consumers that batch and write it to MongoDB, Cassandra, Vertica, and other tools for analytics and real-time use. The architecture aims for redundancy and minimal-downtime changes, and is monitored with tools such as Nagios, StatsD, and CloudWatch.
2. How We Fixed Our MongoDB
Problems
Eric Lubow @elubow #MongoDBDays
Overview
The Secret
SimpleReach
Usage Patterns
Tools
Architecture Implementation
Questions
The 2 Truths
The Real Truth
Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data.
SimpleReach
Millions of URLs per day
Over 1.25 billion page views per month
500m events per day (~6k events/second)
Auto-scale 125-160 machines depending on traffic
Built a predictive measurement algorithm for the social web
And It Goes Like This...
C*
Vertica
Why Mongo?
Fast and easy prototyping
Low barrier to entry
B-Tree indexes and range queries
Aggregation
Everything is JSON
TTLs
MongoID
Goals
Highly available
Speed
Repeatability
Data accuracy (across storage engines)
Clients should have minimal architecture knowledge
Controlled Data Flow Patterns
Control data set size
Restore capabilities for non-ephemeral data
Availability and Speed
Internal service architecture
Mongos on every server that talks to Mongo
Server distribution across data centers
Latest version isn’t always the greatest version
Understand how usage patterns affect Mongo
Repeatability - Sharded Replica Set
[Diagram labels: SHARD0000A, SHARD0000B, PRIMARY, SECONDARY, MONGOS, MONGOD, MONGOD-ARBITER, APPLICATION. Base image layout: BASE AMI (AMAZON LINUX), ORGANIZATIONAL BASE (MONITORING, USERS), APPLICATION GROUP.]
Availability - Architecture Distribution
US-EAST-1a: MONGO-SHARD-0000-A, MONGO-SHARD-0001-B, CASSANDRA-0001, CASSANDRA-0010, REDIS-0001A, VERTICA-0001, iAPI-0001
US-EAST-1b: MONGO-SHARD-0001-A, MONGO-SHARD-0002-B, CASSANDRA-0002, CASSANDRA-0011, REDIS-0001B, iAPI-0002
US-EAST-1e: MONGO-SHARD-0000-B, MONGO-SHARD-0002-A, CASSANDRA-0003, CASSANDRA-0012, VERTICA-0003, iAPI-0003
VERTICA-0002 (zone not recoverable from the extracted slide text)
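The Mongo placement on this slide puts the A and B members of each shard in different availability zones. A minimal sketch of how that property could be checked mechanically follows; the dictionary is transcribed from the slide labels, while the function names are hypothetical and arbiters are left out because the slide does not show where they run.

```python
# Placement transcribed from the slide: shard -> {member: availability zone}.
PLACEMENT = {
    "0000": {"A": "us-east-1a", "B": "us-east-1e"},
    "0001": {"A": "us-east-1b", "B": "us-east-1a"},
    "0002": {"A": "us-east-1e", "B": "us-east-1b"},
}


def survivors(placement, lost_zone):
    """Return, per shard, the members still running if lost_zone disappears."""
    return {
        shard: sorted(m for m, zone in members.items() if zone != lost_zone)
        for shard, members in placement.items()
    }


def tolerates_zone_loss(placement):
    """True if every shard keeps at least one data-bearing member
    no matter which single zone is lost."""
    zones = {zone for members in placement.values() for zone in members.values()}
    return all(
        all(left for left in survivors(placement, zone).values())
        for zone in zones
    )
```

Calling `survivors(PLACEMENT, "us-east-1a")` shows which member of each shard remains after losing that zone, and `tolerates_zone_loss(PLACEMENT)` summarizes the same check over all three zones.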
The Schrute of the Problem
Releases
Reasons why I update software:
Because I want the latest version
To get rid of the reminder
Usage Patterns
Mongos uses TCP-based flow control
Separate DBs to deal with DB level locking
Consistent access patterns
Schema design
Proper indexing
Avoid scatter/gather queries and aim for targeted ones
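The difference between a targeted and a scatter/gather query can be illustrated with a toy router. This is a simplified sketch, not how mongos is implemented: a real router maps the shard key to chunk ranges, whereas this one hashes the key and takes a modulus. The shard names and the `url_id` key used in the usage note are hypothetical.

```python
import hashlib

SHARDS = ["shard0000", "shard0001", "shard0002"]  # hypothetical shard names


def shard_for(value, shards=SHARDS):
    """Map a shard-key value to one shard by hashing it (simplified model)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def route(query, shard_key, shards=SHARDS):
    """Return the shards that must be contacted for an equality query."""
    if shard_key in query:
        # The query names the shard key, so exactly one shard can answer it.
        return [shard_for(query[shard_key], shards)]
    # Without the shard key, every shard has to be asked (scatter/gather).
    return list(shards)
```

For example, `route({"url_id": "abc"}, "url_id")` touches a single shard, while `route({"title": "x"}, "url_id")` fans out to all three.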
Consistent Access Patterns
realtime_score
(‘score’, ‘realtime’)
score.realtime
srt
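The four spellings above all name the same value. One way to keep access consistent is to resolve every spelling through a single lookup, sketched below. The choice of `score.realtime` as the canonical form and the helper name are assumptions made for illustration; the slides do not say which spelling SimpleReach settled on.

```python
CANONICAL = "score.realtime"  # assumption: any one spelling could be chosen

# Every spelling from the slide resolves to the same canonical name.
ALIASES = {
    "realtime_score": CANONICAL,
    ("score", "realtime"): CANONICAL,
    "score.realtime": CANONICAL,
    "srt": CANONICAL,
}


def canonical_name(name):
    """Resolve a known spelling to the canonical field name; reject unknown ones."""
    key = tuple(name) if isinstance(name, list) else name
    try:
        return ALIASES[key]
    except KeyError:
        raise KeyError(f"unknown field spelling: {name!r}") from None
```

Rejecting unknown spellings, rather than passing them through, is the design choice that keeps a fifth variant from creeping in unnoticed.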
Schema Design
Randomly pre-populate consistent document structures
Use $setOnInsert to pre-populate
Shard keys
Separate DBs to deal with DB level locking (volume based)
TTL
Hashed shard keys
$inc when possible, $set is expensive
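The $setOnInsert and $inc bullets can be combined in a single upsert: the first writes the full document shape only when the document is created, the second bumps one counter on every call. The sketch below only builds the filter and update documents; the hourly layout and the counter names are hypothetical, not SimpleReach's schema.

```python
COUNTERS = ("pageviews", "shares")  # hypothetical counter names


def build_counter_upsert(doc_id, hour, counter, amount=1):
    """Return (filter, update) for an upsert that increments one hourly counter."""
    if counter not in COUNTERS or not 0 <= hour < 24:
        raise ValueError("unknown counter or hour out of range")
    target = f"hours.{hour}.{counter}"
    # Zeroed skeleton for all 24 hours, written only if the upsert inserts.
    skeleton = {f"hours.{h}.{c}": 0 for h in range(24) for c in COUNTERS}
    # MongoDB rejects an update that names the same path in two operators,
    # so the incremented field is left out of the skeleton.
    del skeleton[target]
    return {"_id": doc_id}, {"$inc": {target: amount}, "$setOnInsert": skeleton}
```

With PyMongo, the returned pair would typically be passed as `collection.update_one(flt, update, upsert=True)`, so every document ends up with the same structure regardless of which counter was touched first.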
Path of a Packet
[Diagram labels: INTERNET, EC, SC, API, FIREHOSE, Queue, Consumers, InternalAPI, Solr, C*, Mongo, Redis, Vertica.]
NSQ by Bit.ly
Distributed and de-centralized topology
At least once delivery guaranteed
Multicast style message routing
Runtime discovery for consumers to find producers
Allow for maintenance windows with no downtime
Ephemeral channels for testing
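Two of these properties, multicast-style routing and at-least-once delivery, can be shown with a toy in-memory model. This is not the NSQ client API: the `Topic` class is hypothetical, it has no timeouts or network, and channels must exist before publishing. It only mirrors the idea that each channel receives its own copy of a message and that an unacknowledged message is delivered again.

```python
from collections import deque


class Topic:
    """Toy model of one topic: every channel gets its own copy of each message."""

    def __init__(self):
        self.channels = {}

    def channel(self, name):
        """Create the channel if needed and return its queue."""
        return self.channels.setdefault(name, deque())

    def publish(self, message):
        """Copy the message onto every existing channel (multicast)."""
        for queue in self.channels.values():
            queue.append(message)

    def consume(self, name, handler):
        """Deliver the next message on a channel; requeue it if the handler fails."""
        queue = self.channels[name]
        if not queue:
            return False
        message = queue.popleft()
        try:
            handler(message)
        except Exception:
            # Not acknowledged, so the message will be delivered again.
            queue.append(message)
            return False
        return True
```

Because a failed message is redelivered, handlers may see the same message more than once, which is why consumers in an at-least-once system are usually written to be idempotent.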
Controlled Data Flow
[Diagram labels: Social Event, Collector, Social Data, Raw Data, Processed Data, Calculate Score, Batch & Write (twice), Write, NSQ Multicast, NSQ (twice).]
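The "Batch & Write" stages in this flow suggest consumers that accumulate messages and write them in groups instead of one at a time. A minimal sketch of such a batcher follows; the class, its thresholds, and the injected `clock` are assumptions for illustration, and the age limit is only checked when an item is added, where a production consumer would likely also flush on a timer.

```python
import time


class Batcher:
    """Collect items and hand them to write() in batches."""

    def __init__(self, write, max_items=500, max_age=1.0, clock=time.monotonic):
        self.write = write          # callable that receives a list of items
        self.max_items = max_items  # flush once this many items are buffered
        self.max_age = max_age      # or once the oldest item is this old (seconds)
        self.clock = clock
        self.items = []
        self.started = None

    def add(self, item):
        if not self.items:
            self.started = self.clock()  # age is measured from the first item
        self.items.append(item)
        too_many = len(self.items) >= self.max_items
        too_old = self.clock() - self.started >= self.max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Write whatever is buffered; do nothing if the buffer is empty."""
        if self.items:
            batch, self.items = self.items, []
            self.write(batch)
```

Calling `flush()` on shutdown drains any partial batch so that buffered items are not lost.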
Problems?
27. Big Architectures for Big Data Eric Lubow @elubow #Cassandra13
Service Architecture
[Diagram labels: Internal API, Solr, Real-time, C*, Vertica.]
Anatomy of an Endpoint
[Diagram: HOURLY CONTENT and TENMINUTE CONTENT endpoints, each shown with MONGO, VERTICA, and C* stores; QUERYING MACHINES with HELENUS, PYMONGO, and PYVERTICA clients.]
Endpoint Breakout Advantages
Availability
Consistent Access Patterns
Minimal downtime changes
Smaller code deploys
Non-monolithic code base
No async necessary
DevOps
Monitor: Nagios, StatsD, and CloudWatch
Manage: Chef, OpsWorks, csshX, Vagrant
Know failure cases
Turn off balancer on backups
Restart EVERYTHING on upgrade
Extensive use of AWS
Cloud Specifics
blockdev --setra 256
Use ephemeral storage, not EBS volumes
Use MMS
CloudWatch metrics are important and easily scriptable
Don’t use spot instances, but always expect instance loss
Kernel tuning
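For reference, `blockdev --setra` takes its value in 512-byte sectors, so the readahead setting on this slide can be converted to a size with a one-line calculation. The helper name below is hypothetical.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_kib(sectors):
    """Convert a --setra value (in sectors) to KiB."""
    return sectors * SECTOR_BYTES // 1024
```

With the value from the slide, `readahead_kib(256)` gives 128, i.e. a 128 KiB readahead per device.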
Summary
Understand your usage patterns
Know the common failure cases
Architecture distribution
Homogeneous Distribution
Monitoring & Automation
We’re Hiring (Ask about Food Coma Fridays)
Questions are guaranteed in life.
Answers aren’t.
Eric Lubow
@elubow
elubow@simplereach.com
#Cassandra13
Thank you.