SlideShare une entreprise Scribd logo
1  sur  18
mongo @ ex.fm
 Lucas Hrabovsky
      CTO
   #MongoPGH
ex.fm turns websites into CD’s
browser extensions
_id and indexes
• Bad Ideas
  – ObjectId("4fb284…")
  – Big Compound Indexes
  – Long,VariableWidthStringsMissIndexes
• Good Ideas
  – Make _id mean something
  – Fixed Width Hashes
  – Use _id as a compound index
activity feeds: first attempt
{“_id”: “201109122304-lucas-dan-c7dede43…”,
"username”: “lucas”, "created”: 201109122304,
"actor”: “dan”, “verb”: “love”}


db.user.feed.find({„username‟: „lucas‟, „verb‟: „love‟})
.sort({„created‟: -1})



Working just fine for 4MM documents, but getting slow…
new version of activity feeds
{“_id”: “201109122304-lucas-dan-
c7dede43…”, ”uid”: “lucas-201109122304”, ”vid”:
lucas-love-201109122304, "actor”: “dan”}


db.user.feed.find({„vid‟: /^lucas-/})
.sort({„vid‟: -1})

Fast for all 3 use cases!
removing indexes pays off




Don‟t need to buy more/bigger machines!
sites! sites! sites!
padding factor
•   Variable document size
•   Allocate for the latest and fattest
•   Document moves
•   Can be very inefficient
•   More RAM!
•   Pre-allocate to prevent moves
unbounded embedded lists
•   Useful for followers, favorites
•   Good for a few things, bad for lots
•   Constantly bumping up padding factor
•   Lots of document moves
a metaphor
     • You run a coffee shop and can buy only
       one size of cup. Which size do you buy?
     • On average, each customer has only one
       cup
     • Heavy drinkers have hundreds of cups




credit: Macintex macintex.deviantart.com
bucketing!
•   Split list across multiple documents
•   Median number of items = bucket size
•   Pre-allocate
•   Easy seeking and traversal
•   Much faster
hey charts!
site.meta 1                         site.meta 2

site.songs 1                             site.songs 2




  Allocated and unused

  Allocated and full of data
same charts when using
                bucketing
site.meta 1                               site.meta 2

site.songs 1 - 1               site.songs 2 - 1    site.songs 2 - 2



site.songs 1 -2                site.songs 2 - 3    site.songs 2 - 4



                               site.songs 2 - 5    site.songs 2 -6




  Allocated and unused

  Allocated and full of data
doesn’t work for everything…
• Picking right bucket size
• Defragging
• Random insertion
  – Easy for things you don‟t much care about the
    order of
  – More difficult is you‟re going to insert and
    change the order later
micro documents
db.site.songs.find({_id:
/^bfc25de08d964a8a41226c6016dd7753-
/}).sort({_id:-1})

{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", ”s" :
18436532 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", ”s" :
18804590 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", ”s" :
18804591 }
paying it back
• Bent mongoengine to make this easy
• Follow github.com/exfm
• Also added tooling for
  – Trace all queries
  – Aggregate tracing by request middleware
  – Raise exceptions when queries miss an index
thanks!

  lucas@ex.fm
github.com/exfm

Contenu connexe

Similaire à Optimize MongoDB performance with micro documents and bucketing

MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_WilkinsMongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkinskiwilkins
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDBlehresman
 
A Practical Look at the NOSQL and Big Data Hullabaloo
A Practical Look at the NOSQL and Big Data HullabalooA Practical Look at the NOSQL and Big Data Hullabaloo
A Practical Look at the NOSQL and Big Data HullabalooAndrew Brust
 
NoSQL and The Big Data Hullabaloo
NoSQL and The Big Data HullabalooNoSQL and The Big Data Hullabaloo
NoSQL and The Big Data HullabalooAndrew Brust
 
Real-time Location Based Social Discovery using MongoDB
Real-time Location Based Social Discovery using MongoDBReal-time Location Based Social Discovery using MongoDB
Real-time Location Based Social Discovery using MongoDBFredrik Björk
 
The Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'tsThe Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'tsMatias Cascallares
 
Postgres Vision 2018: Five Sharding Data Models
Postgres Vision 2018: Five Sharding Data ModelsPostgres Vision 2018: Five Sharding Data Models
Postgres Vision 2018: Five Sharding Data ModelsEDB
 
Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?Saltmarch Media
 
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...MongoDB
 
Learn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBLearn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBMarakana Inc.
 
10 Ways to Scale Your Website Silicon Valley Code Camp 2019
10 Ways to Scale Your Website Silicon Valley Code Camp 201910 Ways to Scale Your Website Silicon Valley Code Camp 2019
10 Ways to Scale Your Website Silicon Valley Code Camp 2019Dave Nielsen
 
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)Uwe Printz
 
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDBMongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status FeedMongoDB
 
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB
 
Using Aggregation for analytics
Using Aggregation for analyticsUsing Aggregation for analytics
Using Aggregation for analyticsMongoDB
 
Using Aggregation for Analytics
Using Aggregation for Analytics Using Aggregation for Analytics
Using Aggregation for Analytics MongoDB
 

Similaire à Optimize MongoDB performance with micro documents and bucketing (20)

MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_WilkinsMongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
MongoDB Revised Sharding Guidelines MongoDB 3.x_Kimberly_Wilkins
 
SQL vs NoSQL
SQL vs NoSQLSQL vs NoSQL
SQL vs NoSQL
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDB
 
Scalable web architecture
Scalable web architectureScalable web architecture
Scalable web architecture
 
A Practical Look at the NOSQL and Big Data Hullabaloo
A Practical Look at the NOSQL and Big Data HullabalooA Practical Look at the NOSQL and Big Data Hullabaloo
A Practical Look at the NOSQL and Big Data Hullabaloo
 
NoSQL and The Big Data Hullabaloo
NoSQL and The Big Data HullabalooNoSQL and The Big Data Hullabaloo
NoSQL and The Big Data Hullabaloo
 
Real-time Location Based Social Discovery using MongoDB
Real-time Location Based Social Discovery using MongoDBReal-time Location Based Social Discovery using MongoDB
Real-time Location Based Social Discovery using MongoDB
 
The Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'tsThe Fine Art of Schema Design in MongoDB: Dos and Don'ts
The Fine Art of Schema Design in MongoDB: Dos and Don'ts
 
Postgres Vision 2018: Five Sharding Data Models
Postgres Vision 2018: Five Sharding Data ModelsPostgres Vision 2018: Five Sharding Data Models
Postgres Vision 2018: Five Sharding Data Models
 
Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?Is NoSQL The Future of Data Storage?
Is NoSQL The Future of Data Storage?
 
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
MongoDB .local Bengaluru 2019: A Complete Methodology to Data Modeling for Mo...
 
Learn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBLearn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDB
 
10 Ways to Scale Your Website Silicon Valley Code Camp 2019
10 Ways to Scale Your Website Silicon Valley Code Camp 201910 Ways to Scale Your Website Silicon Valley Code Camp 2019
10 Ways to Scale Your Website Silicon Valley Code Camp 2019
 
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Munich 2019: A Complete Methodology to Data Modeling for MongoDB
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)
 
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDBMongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
MongoDB .local Toronto 2019: A Complete Methodology of Data Modeling for MongoDB
 
Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status Feed
 
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDBMongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
MongoDB .local Chicago 2019: A Complete Methodology to Data Modeling for MongoDB
 
Using Aggregation for analytics
Using Aggregation for analyticsUsing Aggregation for analytics
Using Aggregation for analytics
 
Using Aggregation for Analytics
Using Aggregation for Analytics Using Aggregation for Analytics
Using Aggregation for Analytics
 

Dernier

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Dernier (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Optimize MongoDB performance with micro documents and bucketing

  • 1. mongo @ ex.fm Lucas Hrabovsky CTO #MongoPGH
  • 2. ex.fm turns websites into CD’s
  • 4. _id and indexes • Bad Ideas – ObjectId("4fb284…") – Big Compound Indexes – Long,VariableWidthStringsMissIndexes • Good Ideas – Make _id mean something – Fixed Width Hashes – Use _id as a compound index
  • 5. activity feeds: first attempt {“_id”: “201109122304-lucas-dan-c7dede43…”, "username”: “lucas”, "created”: 201109122304, "actor”: “dan”, “verb”: “love”} db.user.feed.find({„username‟: „lucas‟, „verb‟: „love‟}) .sort({„created‟: -1}) Working just fine for 4MM documents, but getting slow…
  • 6. new version of activity feeds {“_id”: “201109122304-lucas-dan- c7dede43…”, ”uid”: “lucas-201109122304”, ”vid”: lucas-love-201109122304, "actor”: “dan”} db.user.feed.find({„vid‟: /^lucas-/}) .sort({„vid‟: -1}) Fast for all 3 use cases!
  • 7. removing indexes pays off Don‟t need to buy more/bigger machines!
  • 9. padding factor • Variable document size • Allocate for the latest and fattest • Document moves • Can be very inefficient • More RAM! • Pre-allocate to prevent moves
  • 10. unbounded embedded lists • Useful for followers, favorites • Good for a few things, bad for lots • Constantly bumping up padding factor • Lots of document moves
  • 11. a metaphor • You run a coffee shop and can buy only one size of cup. Which size do you buy? • On average, each customer has only one cup • Heavy drinkers have hundreds of cups credit: Macintex macintex.deviantart.com
  • 12. bucketing! • Split list across multiple documents • Median number of items = bucket size • Pre-allocate • Easy seeking and traversal • Much faster
  • 13. hey charts! site.meta 1 site.meta 2 site.songs 1 site.songs 2 Allocated and unused Allocated and full of data
  • 14. same charts when using bucketing site.meta 1 site.meta 2 site.songs 1 - 1 site.songs 2 - 1 site.songs 2 - 2 site.songs 1 -2 site.songs 2 - 3 site.songs 2 - 4 site.songs 2 - 5 site.songs 2 -6 Allocated and unused Allocated and full of data
  • 15. doesn’t work for everything… • Picking right bucket size • Defragging • Random insertion – Easy for things you don‟t much care about the order of – More difficult is you‟re going to insert and change the order later
  • 16. micro documents db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753- /}).sort({_id:-1}) { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", ”s" : 18436532 } { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", ”s" : 18804590 } { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", ”s" : 18804591 }
  • 17. paying it back • Bent mongoengine to make this easy • Follow github.com/exfm • Also added tooling for – Trace all queries – Aggregate tracing by request middleware – Raise exceptions when queries miss an index