Scaling Online Social Networks (OSNs)

Presented by: Maria Stylianou     Coworker: Anis Uddin
           Supervisor: Šarūnas Girdzijauskas

              KTH - Royal Institute of Technology
            Implementation of Distributed Systems

                   December 6th, 2012
Outline
●   Motivation
●   Current Algorithms
    –   SPAR
    –   JA-BE-JA
●   Contributions
    –   Challenges
    –   Solution
●   Evaluation & Conclusions

“Pandora's box”
             Online Social Networks




Source: http://technorati.com/social-media/article/social-networks-theyre-what-every-local/
                      Motivation-Algorithms-Contribution-Evaluation                     4
Easy to maintain...
             Online Social Networks




Source: http://mastersofmedia.hum.uva.nl/2009/09/14/a-review-of-taken-out-of-context/
...or not!
             Online Social Networks




Source: http://mastersofmedia.hum.uva.nl/2009/09/14/a-review-of-taken-out-of-context/
Scaling Approaches
    Vertical Scaling
●   Full Replication
●   Data Locality
●   But:
    –   Expensive
    –   Saturation
    Horizontal Scaling
●   Adding servers
●   Clean & Disjoint Partitions
●   But:
    –   Not applicable in OSNs
    Inefficient
Existing 'Solutions' for OSNs
●   Relational Databases
●   Key-Value Stores
    Inefficient
SPAR
    Social Partitioning & Replication middleware
●   Transparent OSN scalability
●   Data Locality
●   Load Balancing
    → avoids performance bottlenecks
●   Fault Tolerance
●   Stability
●   Replication Overhead Minimization
SPAR
    Events
●   Nodes – Add/Remove
●   Edges – Add/Remove
●   Servers – Add/Remove




SPAR Algorithm
    Event: Create Edge (1,6)
    [Figure: servers M1, M2, M3 holding Master Nodes and Replica Nodes; a replica is marked with a prime, e.g. 1']
●   Initial state
    –   M1: masters 1, 2, 3, 4; replica 5'
    –   M2: master 5; replicas 1', 6'
    –   M3: master 6; replica 5'
●   C1: Create 6' in M1, create 1' in M3
    –   M1: masters 1, 2, 3, 4; replicas 5', 6'
    –   M2: master 5; replicas 1', 6'
    –   M3: master 6; replicas 1', 5'
●   C2: Move 1 to M3
    –   M1: masters 2, 3, 4; replica 1'
    –   M2: master 5; replicas 1', 6'
    –   M3: masters 1, 6; replicas 2', 3', 4', 5'
●   C3: Move 6 to M1
    –   M1: masters 1, 2, 3, 4, 6; replica 5'
    –   M2: master 5; replicas 1', 6'
    –   M3: empty
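The three configurations for Create Edge (1,6) can be compared by counting the replicas each one needs. The sketch below is illustrative only: the edge list is an assumption inferred from the replica placement in the figure (the slides do not list the edges), the function names are hypothetical, and load balancing and the K fault-tolerance replicas are deliberately ignored.

```python
def replicas_needed(edges, master):
    """Return the (node, server) replicas needed so that every node
    finds all of its neighbours on the server holding its master."""
    needed = set()
    for u, v in edges:
        if master[u] != master[v]:
            needed.add((v, master[u]))  # replica of v next to u's master
            needed.add((u, master[v]))  # replica of u next to v's master
    return needed


def evaluate_edge_creation(edges, master, new_edge):
    """Count replicas for the three configurations C1, C2, C3."""
    u, v = new_edge
    all_edges = list(edges) + [new_edge]
    configurations = {
        "C1": dict(master),              # keep masters, only add replicas
        "C2": {**master, u: master[v]},  # move u's master to v's server
        "C3": {**master, v: master[u]},  # move v's master to u's server
    }
    return {name: len(replicas_needed(all_edges, m))
            for name, m in configurations.items()}


# Assumed edge list (inferred from the figure, not given on the slides).
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (5, 6)]
master = {1: "M1", 2: "M1", 3: "M1", 4: "M1", 5: "M2", 6: "M3"}

costs = evaluate_edge_creation(edges, master, (1, 6))
best = min(costs, key=costs.get)
print(costs, best)
```

With these assumed edges the counts are 6, 7 and 3 replicas for C1, C2 and C3, which matches the number of replica nodes drawn in each configuration. C3 has the fewest replicas but leaves M3 empty, so a real implementation would also have to weigh load balance before choosing it.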
JA-BE-JA
●   Distributed Partitioning Algorithm
●   K-way Partitioning
●   Load Balancing
●   Gossip Learning




JA-BE-JA - Policies
●   Sampling
     –   Local
         ●   Select neighbors
     –   Random
         ●   Select from random walk
     –   Hybrid
         ●   Local & Random
●   Swapping
     –   Energy Function
         ●   Reach minimum
     –   Simulated Annealing
         ●   Escape from local optima
Source: http://socialnetworking.lovetoknow.com/Growth_of_Online_Social_Networking_in_Business
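A minimal sketch of the swapping policy, assuming each node carries a color (its partition) and two nodes exchange colors when that increases the number of same-color neighbors. The acceptance rule, the `alpha` exponent and the `temperature` factor are assumptions modelled on the published JA-BE-JA description and are not taken from these slides; all function names are hypothetical.

```python
def same_color_degree(graph, color, node, c):
    """Number of neighbours of `node` that currently have colour `c`."""
    return sum(1 for n in graph[node] if color[n] == c)


def try_swap(graph, color, p, q, temperature=1.0, alpha=2.0):
    """Swap the colours of p and q if that improves locality.

    A temperature above 1 also accepts some non-improving swaps, which is
    how simulated annealing can escape local optima.
    """
    if color[p] == color[q]:
        return False
    old = (same_color_degree(graph, color, p, color[p]) ** alpha
           + same_color_degree(graph, color, q, color[q]) ** alpha)
    new = (same_color_degree(graph, color, p, color[q]) ** alpha
           + same_color_degree(graph, color, q, color[p]) ** alpha)
    if new * temperature > old:
        color[p], color[q] = color[q], color[p]
        return True
    return False


def edge_cut(graph, color):
    """Energy: number of edges whose endpoints have different colours."""
    return sum(1 for u in graph for v in graph[u]
               if u < v and color[u] != color[v])


# Toy graph: two edges, each initially split across the two colours.
graph = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}
color = {"a": 0, "b": 1, "c": 0, "d": 1}

before = edge_cut(graph, color)
swapped = try_swap(graph, color, "b", "c")
after = edge_cut(graph, color)
print(before, swapped, after)
```

Because nodes only exchange colors, the number of nodes per color never changes, which is how the load stays balanced. In this toy graph the swap between b and c is accepted and the edge cut drops from 2 to 0.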
Challenges
    SPAR
●   Partition Manager → Single Point of Failure
●   Global View requirement
●   Replication Overhead
Our Solution
    SPAR & JA-BE-JA
●   Partition Manager → Single Point of Failure
    –   Distributed Partition Manager
●   Global View requirement
    –   Local View
●   Replication Overhead
Our Solution
    [Figure: Client Requests, SPAR, Data Store Servers]
    [Figure: Client Requests, SPAR & JA-BE-JA, Data Store Servers; JA-BE-JA shown alongside the servers]
Implementation

●   SPAR
●   SPAR-JA

           This is SPARJA!




Datasets
●   Facebook Graphs
    by Stanford Network Analysis Project

    –   #nodes: 150 #edges: ~3000
    –   #nodes: 224 #edges: ~6000
    –   #nodes: 786 #edges: ~60000

    Source: http://snap.stanford.edu/




Datasets
●   Synthesized Graphs
    –   using our own Graph Generator
    –   #nodes: 1000, #degree: 10
    –   Randomized, Clustered, Highly Clustered
●   Graph Visualization Tool
    –   https://gephi.org/
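The authors' Graph Generator is not shown on the slides. As a rough illustration of how randomized and clustered graphs of this size could be produced, the sketch below lets every node draw `degree` partners, each from its own cluster with probability `p_intra`. The function names, the number of clusters and the `p_intra` values are all assumptions, not the authors' settings.

```python
import random


def generate_graph(n_nodes=1000, degree=10, n_clusters=4, p_intra=0.0, seed=0):
    """Return (edges, cluster) for an undirected synthetic graph.

    Each node picks `degree` partners; a partner comes from the node's own
    cluster with probability `p_intra`, otherwise from the whole graph.
    """
    rng = random.Random(seed)
    cluster = {v: v % n_clusters for v in range(n_nodes)}
    members = [[v for v in range(n_nodes) if cluster[v] == c]
               for c in range(n_clusters)]
    edges = set()
    for u in range(n_nodes):
        for _ in range(degree):
            pool = members[cluster[u]] if rng.random() < p_intra else range(n_nodes)
            v = rng.choice(pool)
            if v != u:  # skip self-loops; duplicate edges collapse in the set
                edges.add((min(u, v), max(u, v)))
    return edges, cluster


def intra_fraction(edges, cluster):
    """Share of edges whose endpoints fall in the same cluster."""
    return sum(cluster[u] == cluster[v] for u, v in edges) / len(edges)


synth_r = generate_graph(p_intra=0.0)    # randomized (assumed setting)
synth_hc = generate_graph(p_intra=0.95)  # highly clustered (assumed setting)
print(len(synth_r[0]), intra_fraction(*synth_r))
print(len(synth_hc[0]), intra_fraction(*synth_hc))
```

With 1000 nodes drawing 10 partners each, the edge count stays at or just below 10000, since self-loops and duplicates are dropped.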
Experiments
Replication Overhead on Different Datasets
#k-replicas: 0 (fault tolerance)         #Servers: 4
[Figure: replication overhead per dataset]
●   Synthesized Graphs (10000 edges)
    –   synth-r: Randomized
    –   synth-c: Clustered
    –   synth-hc: Highly Clustered
●   Facebook Graphs
    –   fcbk-1: ~3000 edges
    –   fcbk-2: ~6000 edges
    –   fcbk-3: ~60000 edges
Experiments
Replication Overhead vs Replication Factor
[Figure: replication overhead for K=0 and K=2]
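The slides do not define the replication overhead metric. One plausible reading, assumed here, is the average number of replicas per node, where replicas created for data locality also count toward the K replicas required for fault tolerance. The function name and the six-node example graph are hypothetical.

```python
def replication_overhead(edges, master, k=0):
    """Average replicas per node under the assumed definition.

    A node needs one replica on every remote server that hosts one of its
    neighbours, and at least `k` replicas in total for fault tolerance.
    Assumes k is smaller than the number of servers.
    """
    remote = {node: set() for node in master}
    for u, v in edges:
        if master[u] != master[v]:
            remote[u].add(master[v])
            remote[v].add(master[u])
    total = sum(max(k, len(servers)) for servers in remote.values())
    return total / len(master)


# Hypothetical six-node graph spread over three servers.
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (5, 6)]
master = {1: "M1", 2: "M1", 3: "M1", 4: "M1", 5: "M2", 6: "M3"}

overhead_k0 = replication_overhead(edges, master, k=0)
overhead_k2 = replication_overhead(edges, master, k=2)
print(overhead_k0, overhead_k2)
```

Under this reading, raising K from 0 to 2 lifts every node to at least two replicas, so the overhead can never fall below K.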
Experiments
Replication Overhead on both algorithms
●   Fault Tolerance: K=2
●   synth-hc: Highly Clustered Synthesized Graph, 10000 edges
[Figure: replication overhead of both algorithms on synth-hc]
Experiments
Replication Overhead on both algorithms
●   Fault Tolerance: K=2
●   fcbk-3: 3rd Facebook graph, ~60000 edges
[Figure: replication overhead of both algorithms on fcbk-3]
Conclusions
●   SPAR + JA-BE-JA = SPAR-JA
    –   Highly clustered nodes
    –   Achieves fault tolerance 'by-default'
    –   Better than SPAR on highly clustered graphs
●   Future Work
    –   More datasets
    –   Bigger datasets
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Scaling Online Social Networks (OSNs)

  • 1. Scaling Online Social Networks (OSNs). Presented by: Maria Stylianou. Coworker: Anis Uddin. Supervisor: Šarūnas Girdzijauskas. KTH - Royal Institute of Technology, Implementation of Distributed Systems. December 6th, 2012.
  • 2. Outline: Motivation; Current Algorithms (SPAR, JA-BE-JA); Contributions (Challenges, Solution); Evaluation & Conclusions.
  • 3. Outline: Motivation; Current Algorithms (SPAR, JA-BE-JA); Contributions (Challenges, Solution); Evaluation & Conclusions.
  • 4. “Pandora's box”: Online Social Networks. Source: http://technorati.com/social-media/article/social-networks-theyre-what-every-local/
  • 5. Easy to maintain... Online Social Networks. Source: http://mastersofmedia.hum.uva.nl/2009/09/14/a-review-of-taken-out-of-context/
  • 6. ...or not! Online Social Networks. Source: http://mastersofmedia.hum.uva.nl/2009/09/14/a-review-of-taken-out-of-context/
  • 7. Scaling Approaches. Vertical Scaling: full replication; data locality; but: expensive, saturation. Horizontal Scaling: adding servers; clean & disjoint partitions; but: not applicable in OSNs.
  • 8. Scaling Approaches. Vertical Scaling: full replication; data locality; but: expensive, saturation. Horizontal Scaling: adding servers; clean & disjoint partitions; but: not applicable in OSNs. Inefficient.
  • 9. Existing 'Solutions' for OSNs: Relational Databases; Key-Value Stores.
  • 10. Existing 'Solutions' for OSNs: Relational Databases; Key-Value Stores. Inefficient.
  • 11. Outline: Motivation; Current Algorithms (SPAR, JA-BE-JA); Contributions (Challenges, Solution); Evaluation & Conclusions.
  • 12. SPAR: Social Partitioning & Replication middleware. Properties: transparent OSN scalability; data locality; load balancing; fault tolerance; stability; replication overhead minimization. Side note on the slide: avoids performance bottlenecks.
  • 13. SPAR Events: nodes (add/remove); edges (add/remove); servers (add/remove).
  • 14. SPAR Algorithm. Create Edge (1,6). (Figure: master and replica nodes placed on servers M1, M2 and M3.)
  • 15. SPAR Algorithm. Create Edge (1,6). C1: create 6' in M1, create 1' in M3. (Figure: master and replica nodes placed on servers M1, M2 and M3.)
  • 16. SPAR Algorithm. Create Edge (1,6). C2: move 1 to M3. (Figure: master and replica nodes placed on servers M1, M2 and M3.)
  • 17. SPAR Algorithm. Create Edge (1,6). C3: move 6 to M1. (Figure: master and replica nodes placed on servers M1, M2 and M3.)
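The three configurations on the Create Edge (1,6) slides (C1, C2, C3) can be compared by counting how many replicas each one would require. The following is a minimal sketch under stated assumptions, not the SPAR implementation: it recomputes the replica count from scratch for a toy graph, ignores load balancing and fault-tolerance replicas, and the names `replica_count` and `choose_configuration` are hypothetical.

```python
from collections import defaultdict


def replica_count(edges, placement):
    """Count replicas needed so each master finds all its neighbours locally."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    total = 0
    for node, adjacent in neighbours.items():
        # A replica of `node` is needed on every other server
        # that hosts the master of one of its neighbours.
        remote_servers = {placement[n] for n in adjacent} - {placement[node]}
        total += len(remote_servers)
    return total


def choose_configuration(edges, placement, u, v):
    """Evaluate C1, C2 and C3 for a new edge (u, v); return the cheapest and all costs."""
    new_edges = list(edges) + [(u, v)]
    candidates = {
        "C1": dict(placement),                 # keep both masters, add replicas
        "C2": {**placement, u: placement[v]},  # move u's master to v's server
        "C3": {**placement, v: placement[u]},  # move v's master to u's server
    }
    costs = {name: replica_count(new_edges, p) for name, p in candidates.items()}
    # min() keeps the first minimum it sees, so C1 wins ties.
    best = min(costs, key=costs.get)
    return best, costs


# Toy data (assumed, not the graph from the slides): node 1 and its friends
# live on M1, node 6 lives alone on M3.
edges = [(1, 2), (1, 3)]
placement = {1: "M1", 2: "M1", 3: "M1", 6: "M3"}
best, costs = choose_configuration(edges, placement, 1, 6)
print(best, costs)
```

In this toy case moving node 6 next to node 1 needs no replicas at all, while moving node 1 away from its existing friends is the most expensive option.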
  • 18. JA-BE-JA: distributed partitioning algorithm; k-way partitioning; load balancing; gossip learning.
  • 19. JA-BE-JA - Policies. Sampling: Local (select neighbors); Random (select from random walk); Hybrid (local & random). Swapping: Energy Function (reach minimum); Simulated Annealing (escape from local optima). Source: http://socialnetworking.lovetoknow.com/Growth_of_Online_Social_Networking_in_Business
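The swapping policy can be illustrated with a small decision function. This is a sketch that loosely follows the swap rule in the published JA-BE-JA description: two nodes exchange partitions (colours) when doing so increases the number of same-colour neighbours, and a temperature above 1 lets some non-improving swaps through so the search can escape local optima. The function names, the `alpha` exponent default and the toy graph are assumptions for illustration, not taken from the slides.

```python
def same_colour_degree(node, colour, neighbours, colours):
    """Number of neighbours of `node` that currently hold `colour`."""
    return sum(1 for n in neighbours[node] if colours[n] == colour)


def should_swap(p, q, neighbours, colours, temperature=1.0, alpha=1.0):
    """Decide whether p and q should exchange colours.

    The swap is accepted when the (temperature-scaled) energy after the swap
    exceeds the energy before it. temperature == 1.0 means pure hill climbing.
    """
    colour_p, colour_q = colours[p], colours[q]
    if colour_p == colour_q:
        return False  # swapping identical colours changes nothing
    before = (same_colour_degree(p, colour_p, neighbours, colours) ** alpha
              + same_colour_degree(q, colour_q, neighbours, colours) ** alpha)
    after = (same_colour_degree(p, colour_q, neighbours, colours) ** alpha
             + same_colour_degree(q, colour_p, neighbours, colours) ** alpha)
    return after * temperature > before


# Toy graph: "a" is red but both its neighbours are blue; "d" is the mirror case.
neighbours = {"a": ["b", "c"], "b": ["a"], "c": ["a"],
              "d": ["e", "f"], "e": ["d"], "f": ["d"]}
colours = {"a": "red", "b": "blue", "c": "blue",
           "d": "blue", "e": "red", "f": "red"}
print(should_swap("a", "d", neighbours, colours))
```

Swapping colours rather than moving nodes keeps the number of nodes per partition constant, which is how the load-balancing property on the previous slide is preserved.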
  • 20. Outline: Motivation; Current Algorithms (SPAR, JA-BE-JA); Contributions (Challenges, Solution); Evaluation & Conclusions.
  • 21. Challenges with SPAR: global view requirement; Partition Manager → single point of failure; replication overhead.
  • 22. Our Solution. SPAR: global view requirement; Partition Manager → single point of failure; replication overhead. SPAR & JA-BE-JA: local view; distributed Partition Manager.
  • 23. Our Solution (wait for it...). Diagram layers: Client Requests; SPAR; Data Store Servers.
  • 24. Our Solution. Diagram layers: Client Requests; SPAR & JA-BE-JA; Data Store Servers.
  • 25. Outline: Motivation; Current Algorithms (SPAR, JA-BE-JA); Contributions (Challenges, Solution); Evaluation & Conclusions.
  • 26. Implementation: SPAR; SPAR-JA. This is SPARJA!
  • 27. Datasets: Facebook graphs from the Stanford Network Analysis Project: 150 nodes, ~3000 edges; 224 nodes, ~6000 edges; 786 nodes, ~60000 edges. Source: http://snap.stanford.edu/
  • 28. Datasets: synthesized graphs, produced with our own graph generator; 1000 nodes, degree 10; three variants: randomized, clustered, highly clustered. Graph visualization tool: https://gephi.org/
  • 29. Experiments: Replication Overhead on Different Datasets. #k-replicas: 0 (fault tolerance); #servers: 4. Synthesized graphs (10000 edges): synth-r (randomized), synth-c (clustered), synth-hc (highly clustered). Facebook graphs: fcbk-1 (~3000 edges), fcbk-2 (~6000 edges), fcbk-3 (~60000 edges).
  • 30. Experiments: Replication Overhead vs Replication Factor (K=0 and K=2).
  • 31. Experiments: Replication Overhead on both algorithms. Fault tolerance K=2. Dataset synth-hc: highly clustered synthesized graph, 10000 edges.
  • 32. Experiments: Replication Overhead on both algorithms. Fault tolerance K=2. Dataset fcbk-3: 3rd Facebook graph, 60,000 edges.
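The slides do not define how replication overhead is measured. One plausible reading, assumed here, is the average number of replicas per node, where each node gets at least K replicas for fault tolerance and replicas already created for data locality count toward K. The sketch below uses that assumed definition; the function name and toy graph are hypothetical.

```python
def average_replication_overhead(edges, placement, k=0):
    """Average replicas per node under the assumed definition.

    Each node needs one replica on every other server that hosts a neighbour's
    master (data locality) and at least `k` replicas overall (fault tolerance).
    Assumes there are enough servers to hold `k` replicas of every node.
    """
    neighbours = {node: set() for node in placement}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    total = 0
    for node, adjacent in neighbours.items():
        locality_replicas = len({placement[n] for n in adjacent} - {placement[node]})
        total += max(k, locality_replicas)
    return total / len(placement)


# Toy chain 1-2-3-4 split across two servers (illustrative data only).
edges = [(1, 2), (2, 3), (3, 4)]
placement = {1: "A", 2: "A", 3: "B", 4: "B"}
print(average_replication_overhead(edges, placement, k=0))
print(average_replication_overhead(edges, placement, k=2))
```

Under this reading, a partitioning that already needs K or more locality replicas per node pays nothing extra for fault tolerance.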
  • 33. Conclusions. SPAR + JA-BE-JA = SPAR-JA: suited to highly clustered nodes; achieves fault tolerance 'by default'; better than SPAR when the graph is highly clustered. Future work: more datasets; bigger datasets.
  • 34. Scaling Online Social Networks (OSNs). Presented by: Maria Stylianou. Coworker: Anis Uddin. Supervisor: Šarūnas Girdzijauskas. KTH - Royal Institute of Technology, Implementation of Distributed Systems. December 6th, 2012.