Building Social Features with MongoDB
Nathan Smith
BranchOut.com
Jan. 22, 2013




Tuesday, January 22, 13
BranchOut
A more social professional network

• Connect with your colleagues (follow)
• Activity feed of their professional activity
• Timeline of an individual’s posts


• 30 million installed users
• 750 million total user records
• Average of 300 connections per installed user


MongoDB @ BranchOut

• 100% MySQL until ~July 2012
• Much of our data fits well into a document model
• Our data design avoids RDBMS features


Follow System
Business logic

• Limit of 2000 followees (people you follow)
• Unlimited followers
• Both lists reflect updates in near real time


Follow System
Traditional RDBMS (i.e. MySQL)

follower_uid   followee_uid   follow_time
123            456            2013-01-22 15:43:00
456            123            2013-01-22 15:52:00

Advantage: Easy inserts, deletes
Disadvantages: Poor data locality, large index size



Follow System
MongoDB (first pass)

followee: {
  _id: 123,
  uids: [456, 567, 678]
}

Advantage: Compact data, read locality
Disadvantage: Can’t display a user’s followers


Follow System
Can’t display a user’s followers (easily)

followee: {
  _id: 123,
  uids: [456, 567, 678]
}
...with multi-key index on uids

db.follow.find({uids: 456}, {_id: 1});

Expensive! Also, no guarantee of order.
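
To make the cost of that reverse lookup concrete, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB: the `follow` array stands in for the collection, and `followersOf` and `multiKeyIndexEntries` are hypothetical helpers introduced only for illustration. The point it illustrates is that a multi-key index stores one entry per array element, so it grows with the number of follow relationships rather than the number of users.

```javascript
// In-memory stand-in for the first-pass "follow" collection:
// one document per user, listing the uids that user follows.
const follow = [
  { _id: 123, uids: [456, 567, 678] },
  { _id: 234, uids: [456] },
  { _id: 345, uids: [567] },
];

// Hypothetical helper mirroring db.follow.find({uids: uid}, {_id: 1}).
// Without an index this touches every document in the collection.
function followersOf(uid) {
  return follow.filter((doc) => doc.uids.includes(uid)).map((doc) => doc._id);
}

// A multi-key index holds one entry per array element, so its size tracks
// the total number of follow relationships, not the number of users.
function multiKeyIndexEntries() {
  return follow.reduce((total, doc) => total + doc.uids.length, 0);
}

console.log(followersOf(456)); // [ 123, 234 ]
console.log(multiKeyIndexEntries()); // 5
```

The result order here is simply document order, which echoes the slide's second complaint: nothing in this model records when each follow happened.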

Follow System
MongoDB (second pass)

followee: {
  _id: 1,
  uids: [2, 3]
},
followee: {
  _id: 2,
  uids: [1, 3]
}

follower: {
  _id: 1,
  uids: [2]
},
follower: {
  _id: 2,
  uids: [1]
},
follower: {
  _id: 3,
  uids: [1, 2]
}

Advantages: Local data, fast selects
Disadvantage: Follower doc size
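
A rough sketch of how the two collections could be kept in step, assuming each follow action writes both sides of the relationship. This is an in-memory illustration, not the deck's implementation: the Maps stand in for the `followee` and `follower` collections, and `addToSet` and `follow` are hypothetical helpers. Against a real MongoDB deployment the two writes would probably be upserts using the `$addToSet` operator, but the slides do not say so.

```javascript
// In-memory stand-ins for the two collections on the slide.
// Map key = document _id, value = Set standing in for the uids array.
const followee = new Map(); // user -> people they follow
const follower = new Map(); // user -> people who follow them

const MAX_FOLLOWEES = 2000; // limit stated in the business logic

// Hypothetical helper: add uid to the document's set, creating the
// document if it does not exist yet.
function addToSet(collection, id, uid) {
  if (!collection.has(id)) collection.set(id, new Set());
  collection.get(id).add(uid);
}

// Record that `src` follows `dst` by writing both sides of the edge.
function follow(src, dst) {
  const current = followee.get(src);
  if (current && !current.has(dst) && current.size >= MAX_FOLLOWEES) {
    throw new Error("followee limit reached");
  }
  addToSet(followee, src, dst);
  addToSet(follower, dst, src);
}

follow(1, 2);
follow(1, 3);
follow(2, 1);
follow(2, 3);

console.log([...followee.get(1)]); // [ 2, 3 ]
console.log([...follower.get(3)]); // [ 1, 2 ]
```

Only the followee side needs a cap, because the business logic allows unlimited followers. That asymmetry is what makes the follower document the one that can outgrow a single document.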

Follow System
Follower document size

• Max Mongo doc size: 16 MB
• Number of people who follow our community manager: 30 million
• 30 million uids × 8 bytes/uid = 240 MB
• Max followers per doc: ~2 million
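
The arithmetic on this slide can be checked with a few lines of JavaScript. The sketch uses the slide's figure of 8 bytes per uid and ignores BSON's per-element array overhead, so the real ceiling is somewhat lower than the number computed here. The constant names are illustrative only.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // MongoDB's 16 MB document limit
const BYTES_PER_UID = 8; // per-uid size used on the slide
const FOLLOWERS = 30e6; // followers of the community manager

// Raw size of the uid payload, ignoring BSON array overhead.
const payloadBytes = FOLLOWERS * BYTES_PER_UID;

// Upper bound on uids that fit in one document under the same assumption.
const maxUidsPerDoc = Math.floor(MAX_DOC_BYTES / BYTES_PER_UID);

console.log(payloadBytes / 1e6); // 240 (MB)
console.log(maxUidsPerDoc); // 2097152 (~2 million)
```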

Follow System
MongoDB (final pass)

followee: {
  _id: 1,
  uids: [2, 3]
},
followee: {
  _id: 2,
  uids: [1, 3]
}

follower: {
  _id: "1",
  uids: [2, 3, 4, ...],
  count: 10001,    // slide animates 20001 -> 10001
  next_page: 3     // slide animates 2 -> 3
},
follower: {
  _id: "1_p2",
  uids: [23, 24, 25, ...],
  count: 10000
}

Asynchronous thread manages follower documents
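
One way the asynchronous job might split an oversized follower document is sketched below. Everything beyond the field names (`uids`, `count`, `next_page`) and the `"1_p2"` page id format is an assumption: the thresholds are read off the slide's counts (20001 becoming 10001 plus a page of 10000), the choice to move the oldest uids is a guess, and `splitIfNeeded` is a hypothetical name. The Map stands in for the collection.

```javascript
// Thresholds are assumptions inferred from the slide's counts.
const SPLIT_THRESHOLD = 20000;
const PAGE_SIZE = 10000;

// In-memory stand-in for the follower collection, keyed by _id.
const follower = new Map();
follower.set("1", {
  _id: "1",
  uids: Array.from({ length: 20001 }, (_, i) => i + 2),
  count: 20001,
  next_page: 2,
});

// Hypothetical job step: when the head document grows past the threshold,
// move its oldest PAGE_SIZE uids into a new page document "<id>_p<n>".
function splitIfNeeded(id) {
  const head = follower.get(id);
  if (!head || head.count <= SPLIT_THRESHOLD) return false;
  const moved = head.uids.splice(0, PAGE_SIZE);
  const pageId = `${id}_p${head.next_page}`;
  follower.set(pageId, { _id: pageId, uids: moved, count: moved.length });
  head.count = head.uids.length;
  head.next_page += 1;
  return true;
}

splitIfNeeded("1");
console.log(follower.get("1").count, follower.get("1").next_page); // 10001 3
console.log(follower.get("1_p2").count); // 10000
```

Running the split off the request path keeps follow writes cheap: a write only appends to the head document, and the page bookkeeping happens later.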


Activity Feed
Push vs Pull architecture




Activity Feed
Business logic

• All connections and followees appear in your feed
• Reverse chron sort order (but should support other rankings)
• Support for evolving set of feed event types
• Tagging creates multiple feed events for the same underlying object
• Feed events are not ephemeral (they also populate the Timeline)


Activity Feed
Traditional RDBMS (i.e. MySQL)

activity_id   uid   event_time            type     oid1     oid2
1             123   2013-01-22 15:43:00   photo    123abc   789ghi
2             345   2013-01-22 15:52:00   status   456def   foobar

Advantage: Easy inserts
Disadvantages: Rigid schema adapts poorly to new activity types, doesn’t scale


Activity Feed
MongoDB

user_feed_card

ufc: {
  _id: 123, // UID
  total_events: 18,
  2013_01_total: 4,
  2012_12_total: 8,
  2012_11_total: 6,
  ...other counts...
}

user_feed_month

ufm: {
  _id: "123_2013_01",
  events: [
    {
      uid: 123,
      type: "photo_upload",
      content_id: "abcd9876",
      timestamp: 1358824502,
      ...more metadata...
    },
    ...more events...
  ]
}
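
The write path implied by these two documents could look roughly like the sketch below: append the event to the user's month document and bump the matching counters on the user's card. The Maps stand in for the two collections, `monthKey` and `recordEvent` are hypothetical helpers, and bucketing months in UTC is an assumption. In MongoDB itself this would presumably be a `$push` on the month document and an `$inc` on the card, which the slides do not show.

```javascript
// In-memory stand-ins for the two collections, keyed by _id.
const userFeedCard = new Map();
const userFeedMonth = new Map();

// Build "YYYY_MM" from a Unix timestamp in seconds (UTC is an assumption).
function monthKey(timestamp) {
  const d = new Date(timestamp * 1000);
  const month = String(d.getUTCMonth() + 1).padStart(2, "0");
  return `${d.getUTCFullYear()}_${month}`;
}

// Hypothetical write path: append the event to the user's month document
// and increment the matching counters on the user's card.
function recordEvent(event) {
  const key = monthKey(event.timestamp);
  const monthId = `${event.uid}_${key}`;
  if (!userFeedMonth.has(monthId)) {
    userFeedMonth.set(monthId, { _id: monthId, events: [] });
  }
  userFeedMonth.get(monthId).events.push(event);

  if (!userFeedCard.has(event.uid)) {
    userFeedCard.set(event.uid, { _id: event.uid, total_events: 0 });
  }
  const card = userFeedCard.get(event.uid);
  card.total_events += 1;
  card[`${key}_total`] = (card[`${key}_total`] || 0) + 1;
}

recordEvent({
  uid: 123,
  type: "photo_upload",
  content_id: "abcd9876",
  timestamp: 1358824502,
});

console.log([...userFeedMonth.keys()]); // [ '123_2013_01' ]
console.log(userFeedCard.get(123).total_events); // 1
```

Because the event is stored as a free-form object, a new event type only needs new fields on the event, which is the flexibility the RDBMS table lacked.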




Activity Feed
Algorithm

1. Load user_feed_cards for all connections
2. Calculate which user_feed_months to load
3. Load user_feed_months
4. Aggregate events that refer to the same story
5. Sort (reverse chron)
6. Load content, comments, etc. and build stories
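
Steps 1 through 5 can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions, not the production code: the Maps stand in for the two collections, the sample data is invented, the month-selection policy in `monthsToLoad` (walk months newest first until the counters promise enough events) is a guess at what step 2 means, and events are treated as the same story when they share a `content_id`. Step 6 is left out.

```javascript
// Invented sample data in the shape of user_feed_card / user_feed_month.
const cards = new Map([
  [1, { _id: 1, total_events: 3, "2013_01_total": 1, "2012_12_total": 2 }],
  [2, { _id: 2, total_events: 2, "2013_01_total": 2 }],
]);
const months = new Map([
  ["1_2013_01", { events: [{ uid: 1, type: "tag", content_id: "a", timestamp: 300 }] }],
  ["1_2012_12", { events: [
    { uid: 1, type: "status", content_id: "b", timestamp: 100 },
    { uid: 1, type: "status", content_id: "c", timestamp: 90 },
  ] }],
  ["2_2013_01", { events: [
    { uid: 2, type: "photo_upload", content_id: "a", timestamp: 250 },
    { uid: 2, type: "status", content_id: "d", timestamp: 200 },
  ] }],
]);

// Step 2 (assumed policy): walk months newest first, collecting month ids
// until their counters promise at least `wanted` events.
function monthsToLoad(loadedCards, wanted) {
  const keys = new Set();
  for (const card of loadedCards) {
    for (const field of Object.keys(card)) {
      if (field.endsWith("_total")) keys.add(field.slice(0, -"_total".length));
    }
  }
  const ids = [];
  let promised = 0;
  for (const key of [...keys].sort().reverse()) {
    if (promised >= wanted) break;
    for (const card of loadedCards) {
      const n = card[`${key}_total`] || 0;
      if (n > 0) {
        ids.push(`${card._id}_${key}`);
        promised += n;
      }
    }
  }
  return ids;
}

function buildFeed(connectionIds, wanted) {
  // 1. Load user_feed_cards for all connections.
  const loadedCards = connectionIds.map((id) => cards.get(id)).filter(Boolean);
  // 2-3. Decide which user_feed_months to load, then load them.
  const events = monthsToLoad(loadedCards, wanted)
    .flatMap((id) => (months.get(id) || { events: [] }).events);
  // 4. Aggregate events that refer to the same story (same content_id here).
  const stories = new Map();
  for (const e of events) {
    if (!stories.has(e.content_id)) {
      stories.set(e.content_id, { content_id: e.content_id, latest: 0, events: [] });
    }
    const story = stories.get(e.content_id);
    story.events.push(e);
    story.latest = Math.max(story.latest, e.timestamp);
  }
  // 5. Sort in reverse chronological order.
  return [...stories.values()].sort((a, b) => b.latest - a.latest).slice(0, wanted);
}

console.log(buildFeed([1, 2], 4).map((s) => s.content_id)); // [ 'a', 'd', 'b', 'c' ]
```

The counters on the card are what make step 2 possible without touching any event data: the reader can decide how far back to go before loading a single month document. Note that aggregation can shrink the result, since the two events on content `a` collapse into one story.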



Activity Feed
Performance

• Response times average under 500 ms (98th percentile under 1 sec)
• Design expected to scale well horizontally
• Need to continue to optimize


Building Social Features with MongoDB

Nathan Smith
BrO: http://branchout.com/nate
FB: http://facebook.com/neocortica
Twitter: @nate510
Email: nate@branchout.com

Aditya Agarwal on Facebook’s architecture: http://www.infoq.com/presentations/Facebook-Software-Stack
Dan McKinley on Etsy’s activity feed: http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture
Good Quora questions on activity feeds:
http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed
http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed



Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationErica Santiago
 

En vedette (20)

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 

Creating social features at BranchOut using MongoDB

  • 1. Building Social Features with MongoDB Nathan Smith BranchOut.com Jan. 22, 2013 Tuesday, January 22, 13
  • 2. BranchOut: A more social professional network
      • Connect with your colleagues (follow)
      • Activity feed of their professional activity
      • Timeline of an individual’s posts
  • 3. BranchOut: A more social professional network
      • 30 million installed users
      • 750 million total user records
      • Average 300 connections per installed user
  • 7. MongoDB @ BranchOut
      • 100% MySQL until ~July 2012
      • Much of our data fits well into a document model
      • Our data design avoids RDBMS features
  • 12. Follow System: Business logic
      • Limit of 2000 followees (people you follow)
      • Unlimited followers
      • Both lists reflect updates in near-real time
  • 15. Follow System: Traditional RDBMS (e.g. MySQL)
      follower_uid  followee_uid  follow_time
      123           456           2013-01-22 15:43:00
      456           123           2013-01-22 15:52:00
      Advantage: Easy inserts, deletes
      Disadvantage: Data locality, index size
  • 18. Follow System: MongoDB (first pass)
      followee: { _id: 123, uids: [456, 567, 678] }
      Advantage: Compact data, read locality
      Disadvantage: Can’t display a user’s followers
  • 20. Follow System: Can’t display a user’s followers (easily)
      followee: { _id: 123, uids: [456, 567, 678] }
      ...with multi-key index on uids:
      db.follow.find({uids: 456}, {_id: 1});
      Expensive! Also, no guarantee of order.
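A minimal in-memory sketch of why the first-pass layout makes the follower lookup awkward. The `follow` array stands in for the collection; the documents for users 456 and 789 and the helper names `followeesOf` and `followersOf` are made up for illustration and are not from the talk.

```javascript
// In-memory stand-in for the first-pass `follow` collection:
// one document per user, `uids` holds the people that user follows.
const follow = [
  { _id: 123, uids: [456, 567, 678] },
  { _id: 456, uids: [123] }, // hypothetical extra document
  { _id: 789, uids: [456] }, // hypothetical extra document
];

// Who does `uid` follow? A single-document read by _id.
function followeesOf(uid) {
  const doc = follow.find((d) => d._id === uid);
  return doc ? doc.uids : [];
}

// Who follows `uid`? Every document has to be tested for membership,
// which is the work a multi-key index on `uids` would take over.
// The result comes back in collection order, not in follow order.
function followersOf(uid) {
  return follow.filter((d) => d.uids.includes(uid)).map((d) => d._id);
}

console.log(followeesOf(123)); // [ 456, 567, 678 ]
console.log(followersOf(456)); // [ 123, 789 ]
```

The asymmetry is the point: one direction is a key lookup, while the other touches many documents and returns them in no meaningful order.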
  • 23. Follow System: MongoDB (second pass). Two collections, shown side by side on the slide.
      follower (the users who follow _id):
        { _id: 1, uids: [2] }
        { _id: 2, uids: [1] }
        { _id: 3, uids: [1, 2] }
      followee (the users that _id follows):
        { _id: 1, uids: [2, 3] }
        { _id: 2, uids: [1, 3] }
      Advantages: Local data, fast selects
      Disadvantages: Follower doc size
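One way the second-pass layout could be kept consistent is to write both directions whenever a follow happens. The sketch below uses two in-memory maps in place of the collections; the helper names `pushUid` and `follow` are illustrative assumptions, and the 2000 limit comes from the business-logic slide.

```javascript
// Hypothetical in-memory stand-ins for the two collections on the slide.
const followee = new Map(); // _id -> uids this user follows
const follower = new Map(); // _id -> uids that follow this user

const MAX_FOLLOWEES = 2000; // limit from the business-logic slide

// Append uid to the list stored under id, skipping duplicates.
function pushUid(collection, id, uid) {
  const uids = collection.get(id) ?? [];
  if (!uids.includes(uid)) uids.push(uid);
  collection.set(id, uids);
}

// Record "a follows b" in both directions.
function follow(a, b) {
  const current = followee.get(a) ?? [];
  if (current.length >= MAX_FOLLOWEES && !current.includes(b)) {
    throw new Error(`user ${a} already follows ${MAX_FOLLOWEES} people`);
  }
  pushUid(followee, a, b);
  pushUid(follower, b, a);
}

// Reproduce the slide's data: 1 follows 2 and 3; 2 follows 1 and 3.
follow(1, 2);
follow(1, 3);
follow(2, 1);
follow(2, 3);

console.log(followee.get(1)); // [ 2, 3 ]
console.log(follower.get(3)); // [ 1, 2 ]
```

Both reads are now single-key lookups, at the cost of two writes per follow and a follower list that is unbounded for popular users.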
  • 28. Follow System: Follower document size
      • Max Mongo doc size: 16MB
      • Number of people who follow our community manager: 30MM
      • 30MM uids × 8 bytes/uid = 240MB
      • Max followers per doc: ~2MM (16MB ÷ 8 bytes/uid)
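The arithmetic on this slide can be checked in a few lines. This is a back-of-the-envelope sketch that counts only the 8 bytes per uid used on the slide; BSON also stores a type byte and an index key for every array element, so the real ceiling per document is lower than the figure computed here.

```javascript
const BYTES_PER_UID = 8;                // 64-bit uid, as on the slide
const MAX_DOC_BYTES = 16 * 1024 * 1024; // MongoDB's 16 MB document limit
const followers = 30 * 1000 * 1000;     // 30MM followers

const rawBytes = followers * BYTES_PER_UID;
const maxUidsPerDoc = Math.floor(MAX_DOC_BYTES / BYTES_PER_UID);
const docsNeeded = Math.ceil(rawBytes / MAX_DOC_BYTES);

console.log(rawBytes / 1e6); // 240 (decimal MB)
console.log(maxUidsPerDoc);  // 2097152, roughly 2MM
console.log(docsNeeded);     // 15
```

Even under this optimistic estimate, a single follower list of this size needs to be spread over many documents.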
  • 31. Follow System: MongoDB (final pass). Two collections, shown side by side on the slide; follower documents are split into pages.
      follower:
        { _id: "1", uids: [2,3,4,...], count: 20001, next_page: 2 }
        { _id: "1_p2", uids: [23,24,25,...], count: 10000 }
      followee:
        { _id: 1, uids: [2, 3] }
        { _id: 2, uids: [1, 3] }
      In the slide build, count on the first follower document changes from 20001 to 10001 and next_page from 2 to 3.
      Asynchronous thread manages follower documents
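The slide does not spell out how the asynchronous thread splits follower documents, so the following is only one possible reading of the `count` and `next_page` fields: new followers are appended to the head document, and a background step moves the oldest uids into a fresh overflow page and relinks the chain. The names `addFollower`, `compact` and `allFollowers`, and the tiny `PAGE_SIZE`, are assumptions for the sketch.

```javascript
const PAGE_SIZE = 3; // tiny for the demo; the slide's pages hold 10000 uids
const follower = new Map(); // _id (string) -> { uids, count, next_page }

// "1" for the head document, "1_p2", "1_p3", ... for overflow pages.
const pageId = (uid, page) => (page ? `${uid}_p${page}` : String(uid));

// Fast path: append the new follower to the head document.
function addFollower(uid, followerUid) {
  const head = follower.get(pageId(uid)) ?? { uids: [], count: 0, next_page: null };
  head.uids.push(followerUid);
  head.count = head.uids.length;
  follower.set(pageId(uid), head);
}

// Stand-in for the asynchronous thread: while the head holds more than
// one page of uids, move the oldest PAGE_SIZE into a new overflow page.
function compact(uid) {
  const head = follower.get(pageId(uid));
  while (head && head.count > PAGE_SIZE) {
    const page = (head.next_page ?? 1) + 1;
    const moved = head.uids.splice(0, PAGE_SIZE);
    follower.set(pageId(uid, page), {
      uids: moved,
      count: moved.length,
      next_page: head.next_page,
    });
    head.next_page = page;
    head.count = head.uids.length;
  }
}

// Read every follower by walking head -> next_page -> ... -> null.
function allFollowers(uid) {
  const out = [];
  let doc = follower.get(pageId(uid));
  while (doc) {
    out.push(...doc.uids);
    doc = doc.next_page ? follower.get(pageId(uid, doc.next_page)) : undefined;
  }
  return out;
}

for (let f = 2; f <= 9; f++) addFollower(1, f);
compact(1);
console.log(follower.get("1")); // { uids: [ 8, 9 ], count: 2, next_page: 3 }
```

Under this reading the write path stays a cheap append, and the head document never grows much beyond one page once the background step has run.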
  • 35. Activity Feed: Push vs Pull architecture
  • 41. Activity Feed: Business logic
      • All connections and followees appear in your feed
      • Reverse chronological sort order (but should support other rankings)
      • Support for an evolving set of feed event types
      • Tagging creates multiple feed events for the same underlying object
      • Feed events are not ephemeral (Timeline)
  • 44. Activity Feed: Traditional RDBMS (e.g. MySQL)
      activity_id  uid  event_time           type    oid1    oid2
      1            123  2013-01-22 15:43:00  photo   123abc  789ghi
      2            345  2013-01-22 15:52:00  status  456def  foobar
      Advantage: Easy inserts
      Disadvantages: Rigid schema adapts poorly to new activity types, doesn’t scale
  • 45. Activity Feed: MongoDB. Two document types, shown side by side on the slide.
      user_feed_card
        ufc: {
          _id: 123, // UID
          total_events: 18,
          2013_01_total: 4,
          2012_12_total: 8,
          2012_11_total: 6,
          ...other counts...
        }
      user_feed_month
        ufm: {
          _id: "123_2013_01",
          events: [
            {
              uid: 123,
              type: "photo_upload",
              content_id: "abcd9876",
              timestamp: 1358824502,
              ...more metadata...
            },
            ...more events...
          ]
        }
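A sketch of how a new event might be written under this layout: the event is appended to the author's month document and the matching counters on the card are incremented. The maps stand in for the two collections; `monthKey` and `recordEvent` are illustrative names, and deriving the month from the timestamp in UTC is an assumption.

```javascript
const userFeedCard = new Map();  // _id: uid -> { total_events, "YYYY_MM_total": n }
const userFeedMonth = new Map(); // _id: "uid_YYYY_MM" -> { events: [...] }

// "2013_01" for a Unix timestamp in seconds (UTC is an assumption).
function monthKey(timestamp) {
  const d = new Date(timestamp * 1000);
  const mm = String(d.getUTCMonth() + 1).padStart(2, "0");
  return `${d.getUTCFullYear()}_${mm}`;
}

function recordEvent(event) {
  const key = monthKey(event.timestamp);

  // Append the event to the author's document for that month.
  const monthId = `${event.uid}_${key}`;
  const month = userFeedMonth.get(monthId) ?? { _id: monthId, events: [] };
  month.events.push(event);
  userFeedMonth.set(monthId, month);

  // Keep the per-month and overall counters on the card in step.
  const card = userFeedCard.get(event.uid) ?? { _id: event.uid, total_events: 0 };
  card.total_events += 1;
  card[`${key}_total`] = (card[`${key}_total`] ?? 0) + 1;
  userFeedCard.set(event.uid, card);
}

recordEvent({
  uid: 123,
  type: "photo_upload",
  content_id: "abcd9876",
  timestamp: 1358824502,
});
console.log(userFeedCard.get(123)); // { _id: 123, total_events: 1, '2013_01_total': 1 }
```

The card stays small because it holds only counters, so it is cheap to load for every connection before deciding which month documents are worth fetching.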
  • 52. Activity Feed: Algorithm
      1. Load user_feed_cards for all connections
      2. Calculate which user_feed_months to load
      3. Load user_feed_months
      4. Aggregate events that refer to the same story
      5. Sort (reverse chronological)
      6. Load content, comments, etc. and build stories
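The steps above can be sketched against in-memory data. Everything specific here is an assumption made for illustration: the sample documents, the `buildFeed` name, the rule of taking months newest-first until the card counters promise enough events, and the use of `content_id` to decide that two events belong to the same story. Step 6 is left out.

```javascript
// Hypothetical sample data shaped like the user_feed_card and
// user_feed_month documents from the talk.
const cards = new Map([
  [123, { _id: 123, total_events: 2, "2013_01_total": 1, "2012_12_total": 1 }],
  [345, { _id: 345, total_events: 2, "2013_01_total": 2 }],
]);
const months = new Map([
  ["123_2013_01", { events: [
    { uid: 123, type: "photo_upload", content_id: "abcd9876", timestamp: 1358824502 },
  ] }],
  ["123_2012_12", { events: [
    { uid: 123, type: "status", content_id: "s1", timestamp: 1355000000 },
  ] }],
  ["345_2013_01", { events: [
    { uid: 345, type: "photo_tag", content_id: "abcd9876", timestamp: 1358824600 },
    { uid: 345, type: "status", content_id: "s2", timestamp: 1358000000 },
  ] }],
]);

function buildFeed(connectionUids, wanted) {
  // 1. Load user_feed_cards for all connections.
  const loaded = connectionUids.map((uid) => cards.get(uid)).filter(Boolean);

  // 2. Pick months newest-first until the counters promise enough events.
  const monthKeys = [...new Set(loaded.flatMap((c) =>
    Object.keys(c)
      .filter((k) => k.endsWith("_total"))
      .map((k) => k.slice(0, -"_total".length))))].sort().reverse();
  const monthIds = [];
  let promised = 0;
  for (const key of monthKeys) {
    if (promised >= wanted) break;
    for (const c of loaded) {
      if (c[`${key}_total`]) {
        monthIds.push(`${c._id}_${key}`);
        promised += c[`${key}_total`];
      }
    }
  }

  // 3. Load user_feed_months.
  const events = monthIds.flatMap((id) => months.get(id)?.events ?? []);

  // 4. Aggregate events that refer to the same story (same content_id here).
  const stories = new Map();
  for (const e of events) {
    const s = stories.get(e.content_id) ??
      { content_id: e.content_id, timestamp: 0, events: [] };
    s.events.push(e);
    s.timestamp = Math.max(s.timestamp, e.timestamp);
    stories.set(e.content_id, s);
  }

  // 5. Sort in reverse chronological order and keep the requested number.
  return [...stories.values()]
    .sort((a, b) => b.timestamp - a.timestamp)
    .slice(0, wanted);
}

const feed = buildFeed([123, 345], 2);
console.log(feed.map((s) => s.content_id)); // [ 'abcd9876', 's2' ]
```

In this run only the two January documents are loaded, and the upload and the tag on the same photo collapse into one story with two events.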
  • 56. Activity Feed: Performance
      • Response times average under 500 ms (98th percentile under 1 sec)
      • Design expected to scale well horizontally
      • Need to continue to optimize
  • 57. Building Social Features with MongoDB
      Nathan Smith
      BrO: http://branchout.com/nate
      FB: http://facebook.com/neocortica
      Twitter: @nate510
      Email: nate@branchout.com
      • Aditya Agarwal on Facebook’s architecture: http://www.infoq.com/presentations/Facebook-Software-Stack
      • Dan McKinley on Etsy’s activity feed: http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture
      • Good Quora questions on activity feeds:
          http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed
          http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed