SlideShare une entreprise Scribd logo
1  sur  11
From “Think Like a
Vertex” to “Think Like
a Graph”
Yuanyuan Tian, Andrey Balmin, Severin Andreas
Corsten, Shirish Tatikonda, John McPherson
Large Scale Graph Processing
 Graph data is everywhere and growing rapidly!
 126 million blogs
 50 billion Web pages
 90 trillion emails
 Analyzing graph data is increasingly important.
 Retail: customer segmentation and recommendation
 Consumer Insights: key influencer analysis
 Telco: churn prediction
 Biology & Health Care: disease propagation analysis
 Government: Terrorists detection
 What is the right programming model for large-scale graph
processing?
Existing Graph Processing Systems
 Divide input graphs into partitions
 Employ vertex-centric programming model
 Programmers “think like a vertex”
 Operate on a vertex and its edges
 Communication to other vertices
 Message passing (e.g Pregel/Giraph)
 Scheduling of updates (e.g GraphLab)
“Think like a vertex”  “Think like a graph”
“Think like a vertex”  “Think like a graph”
 Partition: A collection of vertices  A proper subgraph
 Computation: A vertex and its edges  A subgraph
 Communication: 1-hop at a time
e.g ABD
 Multiple-hops at a time
e.g. AD
Graph-Centric Programming Model
 Expose subgraphs to programmers
 Internal vertices vs boundary vertices
 Information exchange between internal vertices of a partition is
immediate
 Messages are only sent from boundary vertices of a partition to
internal vertices of a different partition
internal vertex
external vertex
(primary copy)
(local copy)
message
Advantages of graph-centric model
 Any algorithm expressed in the vertex-centric model can
be expressed in the graph-centric model
 Allow lower-level access for algorithm-specific
optimizations
 Use of existing off-the-shell graph algorithms on subgraphs
 Local asynchronous computation
 Natural translation of existing graph-centric parallel algorithms
 Provide sufficiently high-level abstraction for easy of use
Example: Weakly Connected Component (WCC)
compute() on each subgraph
 If 0th
superstep
 Sequential WCC on subgraph
 Send labels of boundary vertices
to their corresponding internal
vertices
 Else
 Use the received messages to
update the labels of vertices in the
subgraph
 Merge connected components
 For boundary vertices with label
change, send labels to their
corresponding internal vertices
vertex-centricmodelg
Hybrid Execution Model
 Can we keep the simple vertex-centric programming
model but still improve performance?
 Key: Differentiate internal messages and external
messages
 What messages can be used in local computation?
 Vertex-centric model
 Only messages (external and internal) from previous superstep
 Hybrid model
 External messages from previous superstep
 Internal messages from previous + current superstep (local
asynchronous computation)
 Only apply to a limited set of graph algorithms
Giraph++: A New Graph Processing System
 Built on top of Apache Giraph
 Support vertex-centric model (VM), graph-centric model
(GM) & hybrid model (HM) in the same Giraph++ system
 Contributed to Apache Giraph Project
 Planed in future Giraph release
A Peek of Performance Results
 10-node cluster
 Per node: 32GB RAM, 8 cores, 7 workers
 Network: 1Gbit
 Dataset: 4 web graphs
 # nodes: 19million ~ 428million
 # edges: 298million ~ 1.0billion
 Significant improvement in execution time
and network messages, especially with
better graph partitioning strategy
 Up to 62X speedup in execution time!
 Up to 200X reduction in network messages!
Data Set
Time(sec)
Networkmsgs
VM: vertex-centric model
GM: graph-centric model
HM: hybrid model
HP: hash partitioning
GP: graph partitioningWCC Algorithm
Conclusion
 “Think like a vertex”  “Think like a graph”
 Take advantage of local graph structure in a partition
 Enable complex and flexible graph algorithms
 Can be exploited to various graph applications
 Bring significant performance improvement, especially with
better graph partitioning strategy
 A valuable complement to existing vertex-centric model

Contenu connexe

Tendances

Delta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the HoodDelta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the Hood
Databricks
 
Reflection in java
Reflection in javaReflection in java
Reflection in java
upen.rockin
 
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison DowdneySetting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Weaveworks
 
GitOps for Helm Users by Scott Rigby
GitOps for Helm Users by Scott RigbyGitOps for Helm Users by Scott Rigby
GitOps for Helm Users by Scott Rigby
Weaveworks
 

Tendances (20)

Delta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the HoodDelta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the Hood
 
Build an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBMBuild an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBM
 
Scope - Static and Dynamic
Scope - Static and DynamicScope - Static and Dynamic
Scope - Static and Dynamic
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...
 
Reflection in java
Reflection in javaReflection in java
Reflection in java
 
Topdown parsing
Topdown parsingTopdown parsing
Topdown parsing
 
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
 
CNN Quantization
CNN QuantizationCNN Quantization
CNN Quantization
 
GUI programming
GUI programmingGUI programming
GUI programming
 
Data saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewData saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overview
 
Simplifies and normal forms - Theory of Computation
Simplifies and normal forms - Theory of ComputationSimplifies and normal forms - Theory of Computation
Simplifies and normal forms - Theory of Computation
 
Java Garbage Collection - How it works
Java Garbage Collection - How it worksJava Garbage Collection - How it works
Java Garbage Collection - How it works
 
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
 
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison DowdneySetting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
 
Scylla Summit 2022: ScyllaDB Embraces Wasm
Scylla Summit 2022: ScyllaDB Embraces WasmScylla Summit 2022: ScyllaDB Embraces Wasm
Scylla Summit 2022: ScyllaDB Embraces Wasm
 
Lecture 05 syntax analysis 2
Lecture 05 syntax analysis 2Lecture 05 syntax analysis 2
Lecture 05 syntax analysis 2
 
Java I/O
Java I/OJava I/O
Java I/O
 
Program control statements in c#
Program control statements in c#Program control statements in c#
Program control statements in c#
 
Java-java virtual machine
Java-java virtual machineJava-java virtual machine
Java-java virtual machine
 
GitOps for Helm Users by Scott Rigby
GitOps for Helm Users by Scott RigbyGitOps for Helm Users by Scott Rigby
GitOps for Helm Users by Scott Rigby
 

En vedette

Large Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache GiraphLarge Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache Giraph
sscdotopen
 

En vedette (6)

Large Graph Processing
Large Graph ProcessingLarge Graph Processing
Large Graph Processing
 
Comparing pregel related systems
Comparing pregel related systemsComparing pregel related systems
Comparing pregel related systems
 
Pregel and giraph
Pregel and giraphPregel and giraph
Pregel and giraph
 
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
 
Giraph at Hadoop Summit 2014
Giraph at Hadoop Summit 2014Giraph at Hadoop Summit 2014
Giraph at Hadoop Summit 2014
 
Large Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache GiraphLarge Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache Giraph
 

Similaire à Giraph++: From "Think Like a Vertex" to "Think Like a Graph"

Data-Centric Parallel Programming
Data-Centric Parallel ProgrammingData-Centric Parallel Programming
Data-Centric Parallel Programming
inside-BigData.com
 
Distributed computing poli
Distributed computing poliDistributed computing poli
Distributed computing poli
ivascucristian
 
EuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big Computing
EuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big ComputingEuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big Computing
EuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big Computing
Jonathan Dursi
 

Similaire à Giraph++: From "Think Like a Vertex" to "Think Like a Graph" (20)

BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONSBIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
 
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
 
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizardPhily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
 
Xia Zhu – Intel at MLconf ATL
Xia Zhu – Intel at MLconf ATLXia Zhu – Intel at MLconf ATL
Xia Zhu – Intel at MLconf ATL
 
Ling liu part 02:big graph processing
Ling liu part 02:big graph processingLing liu part 02:big graph processing
Ling liu part 02:big graph processing
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is Distributed
 
SAMOA: A Platform for Mining Big Data Streams (Apache BigData Europe 2015)
SAMOA: A Platform for Mining Big Data Streams (Apache BigData Europe 2015)SAMOA: A Platform for Mining Big Data Streams (Apache BigData Europe 2015)
SAMOA: A Platform for Mining Big Data Streams (Apache BigData Europe 2015)
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
 
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
 
Data-Centric Parallel Programming
Data-Centric Parallel ProgrammingData-Centric Parallel Programming
Data-Centric Parallel Programming
 
Integration Patterns for Big Data Applications
Integration Patterns for Big Data ApplicationsIntegration Patterns for Big Data Applications
Integration Patterns for Big Data Applications
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
 
Distributed computing poli
Distributed computing poliDistributed computing poli
Distributed computing poli
 
Python in big data world
Python in big data worldPython in big data world
Python in big data world
 
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
Floating Point Operations , Memory Chip Organization , Serial Bus Architectur...
 
EuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big Computing
EuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big ComputingEuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big Computing
EuroMPI 2016 Keynote: How Can MPI Fit Into Today's Big Computing
 
Pregel
PregelPregel
Pregel
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
 

Dernier

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 

Dernier (20)

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
hybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptxhybrid Seed Production In Chilli & Capsicum.pptx
hybrid Seed Production In Chilli & Capsicum.pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 

Giraph++: From "Think Like a Vertex" to "Think Like a Graph"

  • 1. From “Think Like a Vertex” to “Think Like a Graph” Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, John McPherson
  • 2. Large Scale Graph Processing  Graph data is everywhere and growing rapidly!  126 million blogs  50 billion Web pages  90 trillion emails  Analyzing graph data is increasingly important.  Retail: customer segmentation and recommendation  Consumer Insights: key influencer analysis  Telco: churn prediction  Biology & Health Care: disease propagation analysis  Government: Terrorists detection  What is the right programming model for large-scale graph processing?
  • 3. Existing Graph Processing Systems  Divide input graphs into partitions  Employ vertex-centric programming model  Programmers “think like a vertex”  Operate on a vertex and its edges  Communication to other vertices  Message passing (e.g Pregel/Giraph)  Scheduling of updates (e.g GraphLab)
  • 4. “Think like a vertex”  “Think like a graph” “Think like a vertex”  “Think like a graph”  Partition: A collection of vertices  A proper subgraph  Computation: A vertex and its edges  A subgraph  Communication: 1-hop at a time e.g ABD  Multiple-hops at a time e.g. AD
  • 5. Graph-Centric Programming Model  Expose subgraphs to programmers  Internal vertices vs boundary vertices  Information exchange between internal vertices of a partition is immediate  Messages are only sent from boundary vertices of a partition to internal vertices of a different partition internal vertex external vertex (primary copy) (local copy) message
  • 6. Advantages of graph-centric model  Any algorithm expressed in the vertex-centric model can be expressed in the graph-centric model  Allow lower-level access for algorithm-specific optimizations  Use of existing off-the-shell graph algorithms on subgraphs  Local asynchronous computation  Natural translation of existing graph-centric parallel algorithms  Provide sufficiently high-level abstraction for easy of use
  • 7. Example: Weakly Connected Component (WCC) compute() on each subgraph  If 0th superstep  Sequential WCC on subgraph  Send labels of boundary vertices to their corresponding internal vertices  Else  Use the received messages to update the labels of vertices in the subgraph  Merge connected components  For boundary vertices with label change, send labels to their corresponding internal vertices vertex-centricmodelg
  • 8. Hybrid Execution Model  Can we keep the simple vertex-centric programming model but still improve performance?  Key: Differentiate internal messages and external messages  What messages can be used in local computation?  Vertex-centric model  Only messages (external and internal) from previous superstep  Hybrid model  External messages from previous superstep  Internal messages from previous + current superstep (local asynchronous computation)  Only apply to a limited set of graph algorithms
  • 9. Giraph++: A New Graph Processing System  Built on top of Apache Giraph  Support vertex-centric model (VM), graph-centric model (GM) & hybrid model (HM) in the same Giraph++ system  Contributed to Apache Giraph Project  Planed in future Giraph release
  • 10. A Peek of Performance Results  10-node cluster  Per node: 32GB RAM, 8 cores, 7 workers  Network: 1Gbit  Dataset: 4 web graphs  # nodes: 19million ~ 428million  # edges: 298million ~ 1.0billion  Significant improvement in execution time and network messages, especially with better graph partitioning strategy  Up to 62X speedup in execution time!  Up to 200X reduction in network messages! Data Set Time(sec) Networkmsgs VM: vertex-centric model GM: graph-centric model HM: hybrid model HP: hash partitioning GP: graph partitioningWCC Algorithm
  • 11. Conclusion  “Think like a vertex”  “Think like a graph”  Take advantage of local graph structure in a partition  Enable complex and flexible graph algorithms  Can be exploited to various graph applications  Bring significant performance improvement, especially with better graph partitioning strategy  A valuable complement to existing vertex-centric model

Notes de l'éditeur

  1. The industry of large scale grpah processing is mainly dirven by the open source communities, of which Giraph is one of the most important projects. By contributing to Giraph, we influence the direction of the industry.