Building a Social Platform
with MongoDB
MongoDB Inc
Darren Wood & Asya Kamsky
#MongoDBWorld
Building a Social Platform
Part 2:
Managing the Social Graph
Socialite
• Open Source
• Reference Implementation
– Various Fanout Feed Models
– User Graph Implementation
– Content storage
• Configurable models and options
• REST API in Dropwizard (Yammer)
– https://dropwizard.github.io/dropwizard/
• Built-in benchmarking
https://github.com/10gen-labs/socialite
Architecture
GraphServiceProxy
ContentProxy
Graph Data - Social
[Diagram: John, Kate, Bob and Pete linked by "follows" edges, with a candidate edge marked "Recommendation ?"]
Graph Data - Promotional
[Diagram: John, Kate, Bob and Pete linked by "follows" edges, plus an Acme Soda node, a "Mention" edge and a candidate edge marked "Recommendation ?"]
Graph Data - Everywhere
• Retail
• Complex product catalogues
• Product recommendation engines
• Manufacturing and Logistics
• Tracing failures to faulty component batches
• Determining fallout from supply interruption
• Healthcare
• Patient/Physician interactions
Design Considerations
The Tale of Two Biebers
VS
Follower Churn
• Tempting to focus on scaling content
• Follow requests rival message send rates
• Twitter enforces per day follow limits
Edge Metadata
• Models – friends/followers
• Requirements typically start simple
• Add Groups, Favorites, Relationships
Storing Graphs in MongoDB
Option One – Embedding Edges
Embedded Edge Arrays
• Storing connections with user (popular choice)
– Most compact form
– Efficient for reads
• However...
– User documents grow
– Upper limit on degree (document size)
– Difficult to annotate (and index) edge
{
"_id" : "djw",
"fullname" : "Darren Wood",
"country" : "Australia",
"followers" : [ "jsr", "ian"],
"following" : [ "jsr", "pete"]
}
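A minimal sketch of what a follow costs under this model, assuming a hypothetical users collection shaped like the document above. The helper embeddedFollowOps is illustrative and not part of Socialite; it only builds the two update specs, each of which could then be applied with db.users.updateOne(filter, update).

```javascript
// Hypothetical helper: builds the two updates needed when `follower`
// starts following `followee` and edges are embedded in user documents.
function embeddedFollowOps(follower, followee) {
  return [
    // Add the followee to the follower's "following" array.
    { filter: { _id: follower }, update: { $addToSet: { following: followee } } },
    // Add the follower to the followee's "followers" array.
    { filter: { _id: followee }, update: { $addToSet: { followers: follower } } },
  ];
}

const embeddedOps = embeddedFollowOps("djw", "pete");
console.log(JSON.stringify(embeddedOps, null, 2));
```

Every follow rewrites two user documents, which is where the document growth and degree limit listed above come from.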
Embedded Edge Arrays
• Creating Rich Graph Information
– Can become cumbersome
{
"_id" : "djw",
"fullname" : "Darren Wood",
"country" : "Australia",
"friends" : [
{"uid" : "jsr", "grp" : "school"},
{"uid" : "ian", "grp" : "work"} ]
}
{
"_id" : "djw",
"fullname" : "Darren Wood",
"country" : "Australia",
"friends" : [ "jsr", "ian"],
"group" : [ "school", "work"]
}
Option Two – Edge Collection
Edge Collections
• Document per edge
• Very flexible for adding edge data
> db.followers.findOne()
{
"_id" : ObjectId(…),
"from" : "djw",
"to" : "jsr"
}
> db.friends.findOne()
{
"_id" : ObjectId(…),
"from" : "djw",
"to" : "jsr",
"grp" : "work",
"ts" : ISODate("2013-07-10T00:00:00Z")
}
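As a sketch of why this shape is flexible: an edge is just a small document, so annotations are extra fields. The helper makeEdge below is hypothetical; the result could be stored with db.friends.insertOne(edge).

```javascript
// Hypothetical helper: one document per edge, with optional annotations.
function makeEdge(from, to, meta = {}) {
  // Spread meta first so it can never overwrite the two endpoints.
  return { ...meta, from, to };
}

const plainEdge = makeEdge("djw", "jsr");
const richEdge = makeEdge("djw", "jsr", { grp: "work", ts: new Date("2013-07-10") });
console.log(plainEdge, richEdge);
```

Adding a new kind of annotation later needs no change to existing edges or to the user documents.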
Operational issues
• Updates of embedded arrays
– grow non-linearly with number of indexed array elements
• Updating edge collection => inserts
– grows close to linearly with existing number of edges/user
Edge Insert Rate
Edge Collection
Indexing Strategies
Finding Followers
Consider our single followers collection :
> db.followers.find({from : "djw"}, {_id:0, to:1})
{
"to" : "jsr"
}
Using index :
{
"v" : 1,
"key" : { "from" : 1, "to" : 1 },
"unique" : true,
"ns" : "socialite.followers",
"name" : "from_1_to_1"
}
Note (key): covered index when searching on "from" for all followers
Note (unique): specify only if multiple edges cannot exist
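A rough way to reason about whether a query is covered, written as a deliberately simplified check: every filtered and returned field must be in the index, and _id must be projected out unless the index contains it. The function isCovered is hypothetical and ignores the other conditions MongoDB applies (array fields, for example).

```javascript
// Simplified model of the covered-query rule. Not MongoDB's planner:
// it only compares field names against the index key.
function isCovered(indexKey, filter, projection) {
  const indexed = new Set(Object.keys(indexKey));
  const filterOk = Object.keys(filter).every((f) => indexed.has(f));
  // _id is returned by default unless the projection sets it to 0.
  const returnsId = projection._id !== 0;
  const returned = Object.keys(projection).filter((f) => f !== "_id" && projection[f]);
  if (returnsId) returned.push("_id");
  return filterOk && returned.length > 0 && returned.every((f) => indexed.has(f));
}

const followersIndex = { from: 1, to: 1 };
console.log(isCovered(followersIndex, { from: "djw" }, { _id: 0, to: 1 })); // query from the slide
console.log(isCovered(followersIndex, { from: "djw" }, { to: 1 }));         // _id not excluded
```

This is why the query on the slide projects {_id:0, to:1}: leaving _id in the result would force a fetch of each document.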
Finding Following
What about who a user is following?
Can use a reverse covered index :
{
"v" : 1,
"key" : { "from" : 1, "to" : 1 },
"unique" : true,
"ns" : "socialite.followers",
"name" : "from_1_to_1"
}
{
"v" : 1,
"key" : { "to" : 1, "from" : 1 },
"unique" : true,
"ns" : "socialite.followers",
"name" : "to_1_from_1"
}
Notice the flipped field order here
Finding Following
Wait! There is an issue with the reverse index: sharding.
If we shard this collection by "from", looking up followers for a specific user is "targeted" to a shard.
To find who the user is following, however, it must scatter-gather the query to all shards.
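The routing difference can be sketched with a simplified model: a query is targeted when its filter constrains the first shard key field, and otherwise goes to every shard. The function shardsQueried is hypothetical and much cruder than the real router.

```javascript
// Simplified routing model, not the real mongos logic: returns how many
// shards a query touches for a given shard key and filter.
function shardsQueried(shardKey, filter, shardCount) {
  const firstKeyField = Object.keys(shardKey)[0];
  const targeted = Object.prototype.hasOwnProperty.call(filter, firstKeyField);
  return targeted ? 1 : shardCount;
}

const shardKey = { from: 1 };
console.log(shardsQueried(shardKey, { from: "djw" }, 6)); // followers lookup
console.log(shardsQueried(shardKey, { to: "djw" }, 6));   // following lookup via reverse index
```

The reverse index still helps on each shard, but it cannot tell the router which shard holds the matching edges.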
Dual Edge Collections
• When "following" queries are common
– Not always the case
– Consider overhead carefully
• Can use dual collections
– One for each direction
– Edges are duplicated, reversed
– Can be sharded independently
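A minimal sketch of the duplicated write, assuming (as the queries elsewhere in these slides suggest) that in each collection "from" holds the user whose list is being read. The helper dualEdgeWrites is hypothetical; each returned doc could be inserted into its collection with insertOne.

```javascript
// Hypothetical helper: the two edge documents written for a single follow
// when followers and following are kept in separate collections.
function dualEdgeWrites(follower, followee) {
  return [
    // followee's list of followers gains the follower
    { collection: "followers", doc: { from: followee, to: follower } },
    // follower's list of followed users gains the followee
    { collection: "following", doc: { from: follower, to: followee } },
  ];
}

console.log(dualEdgeWrites("djw", "jsr"));
```

With both collections sharded by "from", either lookup can be targeted to one shard, at the price of two writes per follow.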
Edge Query Rate Comparison
Number of shards vs number of queries

Shards   Followers collection with      Two collections (followers,
         forward and reverse indexes    following), one index each
1        10,000                         10,000
3        90,000                         30,000
6        360,000                        60,000
12       1,440,000                      120,000
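The slides do not say how these figures were derived. They are consistent with a simple model, assumed here, in which load grows with the shard count (10,000 requests per shard) and every scatter-gather request is executed on all shards, while a targeted request is executed on one.

```javascript
// Assumed model, not stated in the slides: 10,000 requests per shard.
const BASE_QUERIES = 10000;

// Single collection: n x BASE requests, each fanned out to all n shards.
function queriesSingleCollection(shards) {
  return BASE_QUERIES * shards * shards;
}

// Dual collections: n x BASE requests, each targeted to one shard.
function queriesDualCollections(shards) {
  return BASE_QUERIES * shards;
}

for (const n of [1, 3, 6, 12]) {
  console.log(n, queriesSingleCollection(n), queriesDualCollections(n));
}
```

Under that assumption the single-collection design grows quadratically with the number of shards and the dual-collection design grows linearly.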
Follower Counts
How to determine these counts?
Can use the edge indexes:
> db.followers.find({_f : "djw"}).count()
> db.following.find({_f : "djw"}).count()
However, this can be heavyweight
- Especially for rendering a landing page
- Consider maintaining counts on the user document
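One way to maintain the counts is to bump two counters whenever an edge is added or removed. In this sketch the field names followerCount and followingCount and the helper followCountOps are assumptions; each spec could be applied with db.users.updateOne(filter, update).

```javascript
// Hypothetical helper: counter updates for one follow (delta = 1)
// or one unfollow (delta = -1).
function followCountOps(follower, followee, delta = 1) {
  return [
    { filter: { _id: followee }, update: { $inc: { followerCount: delta } } },
    { filter: { _id: follower }, update: { $inc: { followingCount: delta } } },
  ];
}

console.log(JSON.stringify(followCountOps("djw", "jsr"), null, 2));
```

Reading a profile then needs only the user document, at the cost of two extra updates per follow and counters that can drift if an update fails partway.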
Socialite User Service
• Manages user profiles and the follower graph
• Supports arbitrary user data passthrough
• Options for graph storage
– Uses edge collections (can shard by _f)
– Options for maintaining separate follower/ing graphs
– Storing counts vs counting
{
"_id" : ObjectId("52cd1d32a0ee9a1a76d369bb"),
"_f" : "jsr",
"_t" : "djw"
}
{
"v" : 1,
"key" : {"_f" : 1, "_t" : 1},
"unique" : true
}
Next up @ 11:50am :
Scaling the Data Feed
• Delivering user content to followers
• Comparing fanout models
• Caching user timelines for fast retrieval
• Embedding vs Linking Content

Contenu connexe

Tendances

3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향
3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향
3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향SANGHEE SHIN
 
DSpace standard Data model and DSpace-CRIS
DSpace standard Data model and DSpace-CRISDSpace standard Data model and DSpace-CRIS
DSpace standard Data model and DSpace-CRISAndrea Bollini
 
nginx 입문 공부자료
nginx 입문 공부자료nginx 입문 공부자료
nginx 입문 공부자료choi sungwook
 
MOBILITY X DATA : 모빌리티 산업의 도전 과제
MOBILITY X DATA : 모빌리티 산업의 도전 과제MOBILITY X DATA : 모빌리티 산업의 도전 과제
MOBILITY X DATA : 모빌리티 산업의 도전 과제Seongyun Byeon
 
An Introduction to MongoDB Compass
An Introduction to MongoDB CompassAn Introduction to MongoDB Compass
An Introduction to MongoDB CompassMongoDB
 
[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)
[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)
[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)Seongyun Byeon
 
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013Fabien Gandon
 
[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기
[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기
[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기NAVER D2
 
Converting Relational to Graph Databases
Converting Relational to Graph DatabasesConverting Relational to Graph Databases
Converting Relational to Graph DatabasesAntonio Maccioni
 
[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf
[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf
[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdfDeukJin Jeon
 
RESTful API 설계
RESTful API 설계RESTful API 설계
RESTful API 설계Jinho Yoo
 
Or2019 DSpace 7 Enhanced submission & workflow
Or2019 DSpace 7 Enhanced submission & workflowOr2019 DSpace 7 Enhanced submission & workflow
Or2019 DSpace 7 Enhanced submission & workflow4Science
 
Don’t like RDF Reification? Making Statements about Statements Using Singleto...
Don’t like RDF Reification? Making Statements about Statements Using Singleto...Don’t like RDF Reification? Making Statements about Statements Using Singleto...
Don’t like RDF Reification? Making Statements about Statements Using Singleto...Vinh Nguyen
 
DSpace implementation of the COAR Notify Project - status update
DSpace implementation of the COAR Notify Project - status updateDSpace implementation of the COAR Notify Project - status update
DSpace implementation of the COAR Notify Project - status update4Science
 
Model Your Application Domain, Not Your JSON Structures
Model Your Application Domain, Not Your JSON StructuresModel Your Application Domain, Not Your JSON Structures
Model Your Application Domain, Not Your JSON StructuresMarkus Lanthaler
 
Switching from relational to the graph model
Switching from relational to the graph modelSwitching from relational to the graph model
Switching from relational to the graph modelLuca Garulli
 
Working With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and ModelingWorking With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and ModelingNeo4j
 
Rails DB migrations
Rails DB migrationsRails DB migrations
Rails DB migrationsDenys Kurets
 
Using MongoDB as a high performance graph database
Using MongoDB as a high performance graph databaseUsing MongoDB as a high performance graph database
Using MongoDB as a high performance graph databaseChris Clarke
 

Tendances (20)

3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향
3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향
3차원 위치 기반의 CAD/BIM/GIS 융합 활용 방향
 
DSpace standard Data model and DSpace-CRIS
DSpace standard Data model and DSpace-CRISDSpace standard Data model and DSpace-CRIS
DSpace standard Data model and DSpace-CRIS
 
nginx 입문 공부자료
nginx 입문 공부자료nginx 입문 공부자료
nginx 입문 공부자료
 
MOBILITY X DATA : 모빌리티 산업의 도전 과제
MOBILITY X DATA : 모빌리티 산업의 도전 과제MOBILITY X DATA : 모빌리티 산업의 도전 과제
MOBILITY X DATA : 모빌리티 산업의 도전 과제
 
An Introduction to MongoDB Compass
An Introduction to MongoDB CompassAn Introduction to MongoDB Compass
An Introduction to MongoDB Compass
 
[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)
[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)
[MLOps KR 행사] MLOps 춘추 전국 시대 정리(210605)
 
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
 
[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기
[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기
[134]병리 AI Product 개발을 위한 데이터 관리 및 좌충우돌 삽질기
 
Neo4J 사용
Neo4J 사용Neo4J 사용
Neo4J 사용
 
Converting Relational to Graph Databases
Converting Relational to Graph DatabasesConverting Relational to Graph Databases
Converting Relational to Graph Databases
 
[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf
[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf
[전득진_22년4월] AI_ML담당_Tech_seminar-emart.pdf
 
RESTful API 설계
RESTful API 설계RESTful API 설계
RESTful API 설계
 
Or2019 DSpace 7 Enhanced submission & workflow
Or2019 DSpace 7 Enhanced submission & workflowOr2019 DSpace 7 Enhanced submission & workflow
Or2019 DSpace 7 Enhanced submission & workflow
 
Don’t like RDF Reification? Making Statements about Statements Using Singleto...
Don’t like RDF Reification? Making Statements about Statements Using Singleto...Don’t like RDF Reification? Making Statements about Statements Using Singleto...
Don’t like RDF Reification? Making Statements about Statements Using Singleto...
 
DSpace implementation of the COAR Notify Project - status update
DSpace implementation of the COAR Notify Project - status updateDSpace implementation of the COAR Notify Project - status update
DSpace implementation of the COAR Notify Project - status update
 
Model Your Application Domain, Not Your JSON Structures
Model Your Application Domain, Not Your JSON StructuresModel Your Application Domain, Not Your JSON Structures
Model Your Application Domain, Not Your JSON Structures
 
Switching from relational to the graph model
Switching from relational to the graph modelSwitching from relational to the graph model
Switching from relational to the graph model
 
Working With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and ModelingWorking With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and Modeling
 
Rails DB migrations
Rails DB migrationsRails DB migrations
Rails DB migrations
 
Using MongoDB as a high performance graph database
Using MongoDB as a high performance graph databaseUsing MongoDB as a high performance graph database
Using MongoDB as a high performance graph database
 

Similaire à Socialite, the Open Source Status Feed Part 2: Managing the Social Graph

Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status FeedMongoDB
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.GeeksLab Odessa
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRaghunath A
 
Building a Cross Channel Content Delivery Platform with MongoDB
Building a Cross Channel Content Delivery Platform with MongoDBBuilding a Cross Channel Content Delivery Platform with MongoDB
Building a Cross Channel Content Delivery Platform with MongoDBMongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBSean Laurent
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDBlehresman
 
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB SchemasRemaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB SchemasMongoDB
 
Data_Modeling_MongoDB.pdf
Data_Modeling_MongoDB.pdfData_Modeling_MongoDB.pdf
Data_Modeling_MongoDB.pdfjill734733
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Javaantoinegirbal
 
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...MongoDB
 
FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014:  Social Network Benchmark (SNB) Graph GeneratorFOSDEM 2014:  Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014: Social Network Benchmark (SNB) Graph GeneratorLDBC council
 
An Evening with MongoDB - Orlando: Welcome and Keynote
An Evening with MongoDB - Orlando: Welcome and KeynoteAn Evening with MongoDB - Orlando: Welcome and Keynote
An Evening with MongoDB - Orlando: Welcome and KeynoteMongoDB
 
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data FeedSocialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data FeedMongoDB
 
Jornadas gvSIG 2009 WSS English
Jornadas gvSIG 2009 WSS EnglishJornadas gvSIG 2009 WSS English
Jornadas gvSIG 2009 WSS Englishsabueso81
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB
 
managing big data
managing big datamanaging big data
managing big dataSuveeksha
 
Black friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchBlack friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchSylvain Wallez
 
The Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use CasesThe Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use CasesInfiniteGraph
 
Solr 6.0 Graph Query Overview
Solr 6.0 Graph Query OverviewSolr 6.0 Graph Query Overview
Solr 6.0 Graph Query OverviewKevin Watters
 

Similaire à Socialite, the Open Source Status Feed Part 2: Managing the Social Graph (20)

Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status Feed
 
MediaGlu and Mongo DB
MediaGlu and Mongo DBMediaGlu and Mongo DB
MediaGlu and Mongo DB
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Building a Cross Channel Content Delivery Platform with MongoDB
Building a Cross Channel Content Delivery Platform with MongoDBBuilding a Cross Channel Content Delivery Platform with MongoDB
Building a Cross Channel Content Delivery Platform with MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDB
 
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB SchemasRemaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
 
Data_Modeling_MongoDB.pdf
Data_Modeling_MongoDB.pdfData_Modeling_MongoDB.pdf
Data_Modeling_MongoDB.pdf
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Java
 
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...
MongoDB .local Houston 2019: Best Practices for Working with IoT and Time-ser...
 
FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014:  Social Network Benchmark (SNB) Graph GeneratorFOSDEM 2014:  Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
 
An Evening with MongoDB - Orlando: Welcome and Keynote
An Evening with MongoDB - Orlando: Welcome and KeynoteAn Evening with MongoDB - Orlando: Welcome and Keynote
An Evening with MongoDB - Orlando: Welcome and Keynote
 
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data FeedSocialite, the Open Source Status Feed Part 3: Scaling the Data Feed
Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
 
Jornadas gvSIG 2009 WSS English
Jornadas gvSIG 2009 WSS EnglishJornadas gvSIG 2009 WSS English
Jornadas gvSIG 2009 WSS English
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
 
managing big data
managing big datamanaging big data
managing big data
 
Black friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchBlack friday logs - Scaling Elasticsearch
Black friday logs - Scaling Elasticsearch
 
The Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use CasesThe Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use Cases
 
Solr 6.0 Graph Query Overview
Solr 6.0 Graph Query OverviewSolr 6.0 Graph Query Overview
Solr 6.0 Graph Query Overview
 

Plus de MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Plus de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Socialite, the Open Source Status Feed Part 2: Managing the Social Graph

  • 1. Building a Social Platform with MongoDB MongoDB Inc Darren Wood & Asya Kamsky #MongoDBWorld
  • 2. Building a Social Platform Part 2: Managing the Social Graph
  • 3. Socialite
      • Open Source
      • Reference Implementation
          – Various Fanout Feed Models
          – User Graph Implementation
          – Content storage
      • Configurable models and options
      • REST API in Dropwizard (Yammer)
          – https://dropwizard.github.io/dropwizard/
      • Built-in benchmarking
      https://github.com/10gen-labs/socialite
  • 5. Graph Data - Social [diagram: John, Kate, Bob and Pete connected by "follows" edges]
  • 6. Graph Data - Social [same diagram, with the question "Recommendation?"]
  • 7. Graph Data - Promotional [same diagram, adding Acme Soda, a "Mention" edge and the question "Recommendation?"]
  • 8. Graph Data - Everywhere
      • Retail
          – Complex product catalogues
          – Product recommendation engines
      • Manufacturing and Logistics
          – Tracing failures to faulty component batches
          – Determining fallout from supply interruption
      • Healthcare
          – Patient/Physician interactions
  • 10. The Tale of Two Biebers VS
  • 11. The Tale of Two Biebers VS
  • 12. Follower Churn
      • Tempting to focus on scaling content
      • Follow requests rival message send rates
      • Twitter enforces per-day follow limits
  • 13. Edge Metadata
      • Models – friends/followers
      • Requirements typically start simple
      • Add Groups, Favorites, Relationships
  • 14. Storing Graphs in MongoDB
  • 15. Option One – Embedding Edges
  • 16. Embedded Edge Arrays
      • Storing connections with user (popular choice)
          – Most compact form
          – Efficient for reads
      • However...
          – User documents grow
          – Upper limit on degree (document size)
          – Difficult to annotate (and index) edge

        {
          "_id" : "djw",
          "fullname" : "Darren Wood",
          "country" : "Australia",
          "followers" : [ "jsr", "ian" ],
          "following" : [ "jsr", "pete" ]
        }
  • 17. Embedded Edge Arrays
      • Creating Rich Graph Information
          – Can become cumbersome

        {
          "_id" : "djw",
          "fullname" : "Darren Wood",
          "country" : "Australia",
          "friends" : [
            { "uid" : "jsr", "grp" : "school" },
            { "uid" : "ian", "grp" : "work" } ]
        }

        {
          "_id" : "djw",
          "fullname" : "Darren Wood",
          "country" : "Australia",
          "friends" : [ "jsr", "ian" ],
          "group" : [ "school", "work" ]
        }
  • 18. Option Two – Edge Collection
  • 19. Edge Collections
      • Document per edge
      • Very flexible for adding edge data

        > db.followers.findOne()
        {
          "_id" : ObjectId(…),
          "from" : "djw",
          "to" : "jsr"
        }
        > db.friends.findOne()
        {
          "_id" : ObjectId(…),
          "from" : "djw",
          "to" : "jsr",
          "grp" : "work",
          "ts" : Date("2013-07-10")
        }
  • 20. Operational issues
      • Updates of embedded arrays
          – grow non-linearly with number of indexed array elements
      • Updating edge collection => inserts
          – grows close to linearly with existing number of edges/user
  • 23. Finding Followers
      Consider our single follower collection:

        > db.followers.find({from : "djw"}, {_id:0, to:1})
        { "to" : "jsr" }

      Using index:

        { "v" : 1, "key" : { "from" : 1, "to" : 1 }, "unique" : true,
          "ns" : "socialite.followers", "name" : "from_1_to_1" }

      – Covered index when searching on "from" for all followers
      – Specify "unique" only if multiple edges cannot exist
  • 24. Finding Following
      What about who a user is following? Can use a reverse covered index:

        { "v" : 1, "key" : { "from" : 1, "to" : 1 }, "unique" : true,
          "ns" : "socialite.followers", "name" : "from_1_to_1" }
        { "v" : 1, "key" : { "to" : 1, "from" : 1 }, "unique" : true,
          "ns" : "socialite.followers", "name" : "to_1_from_1" }

      – Notice the flipped field order in the second index
  • 25. Finding Following
      Wait! There is an issue with the reverse index... SHARDING!

        { "v" : 1, "key" : { "from" : 1, "to" : 1 }, "unique" : true,
          "ns" : "socialite.followers", "name" : "from_1_to_1" }
        { "v" : 1, "key" : { "to" : 1, "from" : 1 }, "unique" : true,
          "ns" : "socialite.followers", "name" : "to_1_from_1" }

      – If we shard this collection by "from", looking up followers for a specific user is "targeted" to a shard
      – To find who the user is following, however, the query must be scatter-gathered to all shards
  • 27. Dual Edge Collections
      When "following" queries are common
          – Not always the case
          – Consider overhead carefully
      Can use dual collections storing
          – One for each direction
          – Edges are duplicated reversed
          – Can be sharded independently
  • 28. Edge Query Rate Comparison: number of shards vs number of queries

        Shards | Followers collection with     | Two collections (followers,
               | forward and reverse indexes   | following), one index each
        -------+-------------------------------+----------------------------
             1 |                        10,000 |                      10,000
             3 |                        90,000 |                      30,000
             6 |                       360,000 |                      60,000
            12 |                     1,440,000 |                     120,000
  • 29. Follower Counts
      How to determine these counts? Can use the edge indexes:

        > db.followers.find({_f : "djw"}).count()
        > db.following.find({_f : "djw"}).count()

      However, this can be heavyweight
          – Especially for rendering landing page
          – Consider maintaining counts on user document
  • 30. Socialite User Service
      • Manages user profiles and the follower graph
      • Supports arbitrary user data passthrough
      • Options for graph storage
          – Uses edge collections (can shard by _f)
          – Options for maintaining separate follower/ing graphs
          – Storing counts vs counting

        { "_id" : ObjectId("52cd1d32a0ee9a1a76d369bb"), "_f" : "jsr", "_t" : "djw" }
        { "v" : 1, "key" : {"_f" : 1, "_t" : 1}, "unique" : true, }
  • 31. Next up @ 11:50am: Scaling the Data Feed
      • Delivering user content to followers
      • Comparing fanout models
      • Caching user timelines for fast retrieval
      • Embedding vs Linking Content
  • 32. Building a Social Platform with MongoDB MongoDB Inc Darren Wood & Asya Kamsky #MongoDBWorld
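
The graph-storage options listed in the slides (one document per edge keyed on `_f`/`_t`, separate follower and following collections, and counts stored rather than computed) can be illustrated with a small in-memory sketch. This is not Socialite's code: `EdgeCollection` and `UserGraph` are hypothetical names, and a plain `Map` stands in for a MongoDB collection with a unique `{_f: 1, _t: 1}` index. The edge direction follows slide 29, where `_f` in the followers collection is the user being followed.

```javascript
// Hypothetical in-memory stand-in for one edge collection
// with a unique index on {_f: 1, _t: 1}.
class EdgeCollection {
  constructor() {
    this.edges = new Map(); // _f -> Set of _t values
  }

  // Returns true if the edge was new, false if it already existed
  // (the case a unique index would reject).
  insert(f, t) {
    if (!this.edges.has(f)) this.edges.set(f, new Set());
    const targets = this.edges.get(f);
    if (targets.has(t)) return false;
    targets.add(t);
    return true;
  }

  // Returns true if an edge was actually removed.
  remove(f, t) {
    const targets = this.edges.get(f);
    return targets ? targets.delete(t) : false;
  }

  // Similar in spirit to find({_f: f}, {_id: 0, _t: 1}).
  find(f) {
    return [...(this.edges.get(f) || [])];
  }
}

// Assumed layout: two edge collections (one per direction)
// plus counters kept alongside each user.
class UserGraph {
  constructor() {
    this.followers = new EdgeCollection(); // _f = followed user, _t = follower
    this.following = new EdgeCollection(); // _f = follower, _t = followed user
    this.counts = new Map(); // user id -> { followers, following }
  }

  countsFor(user) {
    if (!this.counts.has(user)) {
      this.counts.set(user, { followers: 0, following: 0 });
    }
    return this.counts.get(user);
  }

  follow(follower, target) {
    if (!this.followers.insert(target, follower)) return false; // duplicate edge
    this.following.insert(follower, target); // reversed copy of the edge
    this.countsFor(target).followers += 1;
    this.countsFor(follower).following += 1;
    return true;
  }

  unfollow(follower, target) {
    if (!this.followers.remove(target, follower)) return false; // no such edge
    this.following.remove(follower, target);
    this.countsFor(target).followers -= 1;
    this.countsFor(follower).following -= 1;
    return true;
  }
}

const graph = new UserGraph();
graph.follow("jsr", "djw");
graph.follow("ian", "djw");
graph.follow("jsr", "djw"); // ignored: the edge already exists
graph.follow("djw", "pete");

console.log(graph.followers.find("djw")); // who follows djw
console.log(graph.following.find("djw")); // whom djw follows
console.log(graph.countsFor("djw"));      // stored counts, no edge scan needed
```

Each follow costs two edge writes and two counter updates in this layout, which is the overhead slide 27 asks the reader to weigh against cheaper "following" reads.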

Editor's Notes

  1. Scaling the delivery of posts and content to the follower networks of millions of users has many challenges. In this section we look at the various approaches to fanning out posts and look at a performance comparison between them. We will highlight some tricks for caching the recent timeline of active users to drive down read latency.
  2. image at https://dropwizard.github.io/dropwizard of the hat 
  3. Tempting to focus on scaling content Follow requests rival message send rates Twitter enforces per day follow limits
  4. Single Collection
  5. How to test, show how growing documents are very painful to update. Add the MTV or appmetrics mtools plot showing what happens to outliers.
  6. actual performance – show how inserting million users was easy – no point even trying to update embedded documents...
  7. side-point of
  8. Need to generate numbers for broadcast (scatter-gather) queries for following versus direct queries for followers: total number of queries by number of shards, to get whom the user is following.
  9. talk about real life trade-offs
  10. hidden in original
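
Note 8 asks for the total number of queries by number of shards, and the figures on slide 28 fit a simple pattern: the single-collection column equals 10,000 × shards², and the dual-collection column equals 10,000 × shards. The sketch below reproduces those figures. The explanation in the comments (load growing with the cluster, and each scatter-gather lookup touching every shard) is one possible reading of the numbers, not something the deck states, and `BASE_QUERIES` and the function names are illustrative.

```javascript
// Assumption: baseline of 10,000 queries, read off the one-shard row of slide 28.
const BASE_QUERIES = 10000;

// Single collection sharded by "from": "following" lookups cannot be
// targeted, so each logical query is sent to every shard.
function scatterGatherQueries(shards) {
  const logicalQueries = BASE_QUERIES * shards; // assumed: load scales with the cluster
  return logicalQueries * shards;               // each one fans out to all shards
}

// Dual collections, each sharded on its own lookup key:
// every logical query is targeted to exactly one shard.
function targetedQueries(shards) {
  return BASE_QUERIES * shards;
}

const rows = [1, 3, 6, 12].map((shards) => ({
  shards,
  singleCollection: scatterGatherQueries(shards),
  dualCollections: targetedQueries(shards),
}));

console.table(rows);
```

Under this reading, the gap between the two layouts widens by a factor equal to the shard count, which is why the dual-collection option becomes more attractive as the cluster grows.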