4. Introduction
• Today, many practical computing problems concern large
graphs
• Applied algorithms
- Shortest paths computations
- PageRank
- Clustering techniques
• MapReduce is ill-suited for graph processing
- Many iterations are needed for parallel graph processing
- Materializations of intermediate results at every MapReduce
iteration harm performance
5. Introduction
• Hadoop is well-suited for non-iterative, data-parallel
processing
Smith-Waterman is a non-iterative case and of course runs fine
6. Introduction
[Figure: iterative K-means on MapReduce. map tasks: compute the distance from each data point to each cluster center and assign points to cluster centers. reduce task / user program: compute new cluster centers.]
7. Iterative?
• The system should handle iterative processing, such as solving PDEs
(partial differential equations)
• http://www.iterativemapreduce.org/
8. Graph-based Computation
• Pregel
– Google’s large-scale graph processing system
• GoldenOrb
• Giraph
– Yahoo’s platform
• Hama
– An Apache project
• Pegasus
– Carnegie Mellon University
9. Single Source Shortest Path (SSSP)
Problem
– Find the shortest path from a source node to every other
node
Solution
– MapReduce
– Pregel
10. Example: SSSP—using MapReduce
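• A Map task receives
– Key: node n
– Value: D (distance from start), points-to (list of nodes
reachable from n)
– For each node m in points-to, it emits the candidate distance D(n) + w(n, m),
where w(n, m) is the weight of edge n -> m, so that
D(m) = min over all n pointing to m of (D(n) + w(n, m))
• The Reduce task gathers the candidate distances for each node and selects
the minimum one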
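The map and reduce steps above can be sketched in plain Java, without Hadoop, to see how many passes the iteration needs. This is only an illustration: the class and method names (`SsspMapReduceSketch`, `mapPhase`, `reducePhase`) are made up for this sketch, the graph is the A to E example from the adjacency list slide, and every node is assumed to appear as a key in the graph map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, in-memory illustration of iterated map/reduce for SSSP.
final class SsspMapReduceSketch {
    static final int INF = Integer.MAX_VALUE; // "not reached yet"

    // Adds a directed edge and makes sure both endpoints exist as keys.
    static void addEdge(Map<String, Map<String, Integer>> g, String from, String to, int w) {
        g.computeIfAbsent(from, k -> new HashMap<>()).put(to, w);
        g.computeIfAbsent(to, k -> new HashMap<>());
    }

    // The example graph from the adjacency list slide.
    static Map<String, Map<String, Integer>> slideGraph() {
        Map<String, Map<String, Integer>> g = new HashMap<>();
        addEdge(g, "A", "B", 10); addEdge(g, "A", "D", 5);
        addEdge(g, "B", "C", 1);  addEdge(g, "B", "D", 2);
        addEdge(g, "C", "E", 4);
        addEdge(g, "D", "B", 3);  addEdge(g, "D", "C", 9); addEdge(g, "D", "E", 2);
        addEdge(g, "E", "A", 7);  addEdge(g, "E", "C", 6);
        return g;
    }

    // Map: node n with distance d emits (n, d) and, for each edge n -> m, (m, d + w).
    static Map<String, List<Integer>> mapPhase(Map<String, Map<String, Integer>> graph,
                                               Map<String, Integer> dist) {
        Map<String, List<Integer>> emitted = new HashMap<>();
        for (String n : graph.keySet()) {
            int d = dist.get(n);
            emitted.computeIfAbsent(n, k -> new ArrayList<>()).add(d);
            if (d == INF) continue; // unreached nodes have nothing to propagate
            for (Map.Entry<String, Integer> e : graph.get(n).entrySet()) {
                emitted.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(d + e.getValue());
            }
        }
        return emitted;
    }

    // Reduce: keep the minimum candidate distance per node.
    static Map<String, Integer> reducePhase(Map<String, List<Integer>> emitted) {
        Map<String, Integer> dist = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : emitted.entrySet()) {
            dist.put(e.getKey(), Collections.min(e.getValue()));
        }
        return dist;
    }

    // Repeats map + reduce until no distance changes.
    static Map<String, Integer> shortestPaths(Map<String, Map<String, Integer>> graph, String source) {
        Map<String, Integer> dist = new HashMap<>();
        for (String n : graph.keySet()) dist.put(n, INF);
        dist.put(source, 0);
        while (true) {
            Map<String, Integer> next = reducePhase(mapPhase(graph, dist));
            if (next.equals(dist)) return dist;
            dist = next;
        }
    }

    public static void main(String[] args) {
        System.out.println(shortestPaths(slideGraph(), "A"));
    }
}
```

Note that every pass re-emits the whole distance table, which mirrors the point made earlier about materializing intermediate results at every MapReduce iteration.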
11. Example: SSSP—using MapReduce
Adjacency list
A: (B, 10), (D, 5)
B: (C, 1), (D, 2)
C: (E, 4)
D: (B, 3), (C, 9), (E, 2)
E: (A, 7), (C, 6)
Adjacency matrix (row = source, column = target, 0 = no edge)
   A  B  C  D  E
A  0 10  0  5  0
B  0  0  1  2  0
C  0  0  0  0  4
D  0  3  9  0  2
E  7  0  6  0  0
[Figure: weighted directed graph on nodes A to E with the edge weights listed above]
20. • MapReduce uses key/value pairs to store the
nodes and their adjacent distances. It is better suited to
processing huge datasets than large-scale
graphs
Here, we introduce a new system: Pregel!
22. Model of Pregel Computation
Supersteps:
• A sequence of iterations
• Vertices compute in parallel
Input: a directed graph
• Vertex: a vertex ID and
a modifiable value
• Edge: a target vertex and
a modifiable value,
associated with its source vertex
Output: a directed graph
• The set of values explicitly output
by the vertices
• Vertices and edges can be added
and removed during the computation
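To make the superstep model concrete, here is a minimal single-process sketch of vertex-centric SSSP. It is not Pregel's API: the class `PregelSsspSketch`, its `run` method, and the trick of seeding the computation with a single message to the source are all assumptions made for illustration. A vertex is only computed when it has incoming messages, which stands in for "halted until a message arrives", and the loop ends when no messages are in flight.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-process simulation of Pregel-style supersteps.
final class PregelSsspSketch {
    // Adds a directed edge and makes sure both endpoints exist as keys.
    static void addEdge(Map<String, Map<String, Integer>> g, String from, String to, int w) {
        g.computeIfAbsent(from, k -> new HashMap<>()).put(to, w);
        g.computeIfAbsent(to, k -> new HashMap<>());
    }

    // The A to E example graph used in the SSSP slides.
    static Map<String, Map<String, Integer>> slideGraph() {
        Map<String, Map<String, Integer>> g = new HashMap<>();
        addEdge(g, "A", "B", 10); addEdge(g, "A", "D", 5);
        addEdge(g, "B", "C", 1);  addEdge(g, "B", "D", 2);
        addEdge(g, "C", "E", 4);
        addEdge(g, "D", "B", 3);  addEdge(g, "D", "C", 9); addEdge(g, "D", "E", 2);
        addEdge(g, "E", "A", 7);  addEdge(g, "E", "C", 6);
        return g;
    }

    // Assumes the source and every edge target appear as keys in the graph.
    static Map<String, Integer> run(Map<String, Map<String, Integer>> graph, String source) {
        Map<String, Integer> value = new HashMap<>(); // modifiable vertex values
        for (String v : graph.keySet()) value.put(v, Integer.MAX_VALUE);

        // Messages to be delivered at the start of the current superstep.
        Map<String, List<Integer>> inbox = new HashMap<>();
        inbox.computeIfAbsent(source, k -> new ArrayList<>()).add(0);

        while (!inbox.isEmpty()) { // no messages in flight: every vertex has halted
            Map<String, List<Integer>> outbox = new HashMap<>();
            for (Map.Entry<String, List<Integer>> in : inbox.entrySet()) {
                String v = in.getKey();
                int best = Collections.min(in.getValue());
                if (best < value.get(v)) {
                    // compute(): improve the own value and notify neighbours.
                    value.put(v, best);
                    for (Map.Entry<String, Integer> edge : graph.get(v).entrySet()) {
                        outbox.computeIfAbsent(edge.getKey(), k -> new ArrayList<>())
                              .add(best + edge.getValue());
                    }
                }
                // Otherwise the vertex stays halted until another message arrives.
            }
            inbox = outbox; // messages sent in superstep s arrive in superstep s+1
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(run(slideGraph(), "A"));
    }
}
```

Because each vertex reads only its own value and its own messages, the sequential loop over the inbox gives the same result as running the vertices in parallel.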
34. Pregel vs MapReduce
Pregel
– Keeps vertices & edges on
the machine that performs
computation
– Uses network transfers only
for messages
– Sufficiently expressive, no
need for remote reads
MapReduce
– Requires much more
communication and
associated overhead
– Coordinating the
steps of chained
MapReduce jobs adds
programming complexity
36. System Architecture
The Pregel system uses the master/worker model
– Master
Coordinates worker activity
Determines the number of partitions and assigns them to workers
Detects and recovers from worker failures (“ping” messages)
Maintains statistics about the progress of computation
and the state of the graph
– Worker
Maintains the state of its portion of the graph in memory
Executes the Compute() method
Communicates with the other workers
37.
• Assign a portion of the input
• Instruct each worker to
perform a superstep
• Call Compute() for each
vertex
• Update the data structure
• Receive/send messages
• Respond to the master when
finished
• Control the number of
partitions in the graph
• Notify the master to
start the processing
38. Fault Tolerance
Checkpointing
– The master periodically instructs the workers to save the state
of their partitions to persistent storage
e.g., Vertex values, edge values, incoming messages
Failure detection
– Using regular “ping” messages
Recovery
– The master reassigns graph partitions to the currently available
workers
– The workers all reload their partition state from most recent
available checkpoint
40. GoldenOrb
• Open-source version of Google’s Pregel
• Implemented in Java
• Version 0.1.1
• Requirements
- Hadoop file system
- ZooKeeper for communication
42. Message Exchange
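• Messages are exchanged between supersteps
• The outbound messages of superstep [s-1] are the inbound
messages of superstep [s]
• When the outbound queue is full, its messages are sent and queuing starts again
• A partition that receives messages in the middle of a superstep stores them in its inbound queue
and keeps them until the next superstep
• The messages to be used in the current superstep are copied to the current message queue
• At this point, an overflow occurs if the inbound queue exceeds the memory size of the system or the JVM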
43. Memory management
• Outbound Message Queue
- Fixed size; messages are sent as soon as it is full
• Inbound Message Queue
- Used in the next superstep
- May overflow when the volume of messages grows
• Current Message Queue
- Same size as the inbound queue
- In the current superstep the inbound queue is copied into the current queue, so
memory holds currentQueue + inboundQueue, which can overflow
The inbound queue needs to be implemented on file-based local storage
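The "fixed size, send when full" behaviour of the outbound queue can be sketched as a small bounded buffer. This is not GoldenOrb's code: `OutboundQueueSketch`, its `capacity` parameter, and the `sender` callback (standing in for the network transfer to another partition) are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical fixed-size outbound queue that sends a batch as soon as it is full.
final class OutboundQueueSketch<M> {
    private final int capacity;
    private final Consumer<List<M>> sender; // stand-in for the real network send
    private final List<M> buffer;

    OutboundQueueSketch(int capacity, Consumer<List<M>> sender) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
        this.sender = sender;
        this.buffer = new ArrayList<>(capacity);
    }

    // Queues a message; a full buffer is sent immediately and queuing starts again.
    void enqueue(M message) {
        buffer.add(message);
        if (buffer.size() >= capacity) flush();
    }

    // Sends whatever is pending, e.g. at the end of a superstep.
    void flush() {
        if (buffer.isEmpty()) return;
        sender.accept(new ArrayList<>(buffer)); // copy so the batch survives clear()
        buffer.clear();
    }

    int pending() {
        return buffer.size();
    }

    public static void main(String[] args) {
        List<List<Integer>> sent = new ArrayList<>();
        OutboundQueueSketch<Integer> queue = new OutboundQueueSketch<Integer>(3, sent::add);
        for (int i = 0; i < 7; i++) queue.enqueue(i);
        queue.flush();
        System.out.println(sent);
    }
}
```

The sender side stays bounded this way; the receiving inbound queue has no such limit, which is why the slide proposes backing it with local file storage.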
44. API
• Sub-classing the predefined classes
– Reader/writer/vertex/message
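// API summary of the vertex base class (signatures only, bodies omitted)
abstract class Vertex<VV, EV, MV> {
    public Vertex(Class<VV> vertexValue, Class<EV> edgeValue, Class<MV> messageValue);
    String vertexID();                               // ID of this vertex
    abstract void compute(Collection<MV> messages);  // user logic, run in each superstep
    long superStep();                                // current superstep number
    void setValue(VV value);
    VV getValue();
    Collection<Edge<EV>> getEdges();                 // outgoing edges
    void sendMessage(MV message);
    void voteToHalt();                               // deactivate until a message arrives
}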
45. Not yet implemented
• Aggregator
– A mechanism for global communication, monitoring, and data
• Combiner
– Reduces the number of messages
– e.g., if compute() sums the message values, a combiner can calculate
the sum and transmit a single message
• Topology mutation
– Remove or add vertices/edges
• Fault recovery
49. PageRank
• PageRank is Google’s way of deciding a page’s
importance
• An important page is linked to by many pages with
high PageRank
• PR(A) = PR(v1)/L(v1) + ... + PR(vn)/L(vn), where v1, ..., vn are the pages linking to A and L(v) is the number of outgoing links of v
• Add a damping factor d
• PR(A) = (1-d) + d ∑ PR(v)/L(v), summed over the pages v linking to A
• Repeat until converged
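The damped formula above can be iterated directly. The sketch below is a plain-Java illustration, not Pregel or Giraph code; the names `PageRankSketch` and `pageRank`, the starting value of 1.0 per page, the tolerance, and the three-page example graph are all assumptions. It also assumes every page appears as a key in the link map, and pages without outgoing links simply contribute nothing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical iteration of PR(A) = (1-d) + d * sum(PR(v)/L(v)).
final class PageRankSketch {
    static Map<String, Double> pageRank(Map<String, List<String>> outLinks,
                                        double d, double tolerance, int maxIterations) {
        Map<String, Double> pr = new HashMap<>();
        for (String page : outLinks.keySet()) pr.put(page, 1.0); // assumed start value

        for (int iter = 0; iter < maxIterations; iter++) {
            Map<String, Double> next = new HashMap<>();
            for (String page : outLinks.keySet()) next.put(page, 1.0 - d);

            // Each page v hands d * PR(v) / L(v) to every page it links to.
            for (Map.Entry<String, List<String>> e : outLinks.entrySet()) {
                List<String> targets = e.getValue();
                if (targets.isEmpty()) continue; // dangling page: ignored in this sketch
                double share = d * pr.get(e.getKey()) / targets.size();
                for (String target : targets) next.merge(target, share, Double::sum);
            }

            // "Repeat until converged": stop when the total change is tiny.
            double change = 0.0;
            for (String page : pr.keySet()) change += Math.abs(next.get(page) - pr.get(page));
            pr = next;
            if (change < tolerance) break;
        }
        return pr;
    }

    // Small made-up example: A -> B, A -> C, B -> C, C -> A.
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("A", Arrays.asList("B", "C"));
        g.put("B", Arrays.asList("C"));
        g.put("C", Arrays.asList("A"));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(pageRank(exampleGraph(), 0.85, 1e-9, 200));
    }
}
```

With this unnormalized form of the formula and no dangling pages, the ranks sum to the number of pages rather than to 1.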
52. K-means
• N observations are partitioned into k clusters
• Each observation belongs to the cluster with the
nearest mean
[Flowchart: start -> choose the number of clusters K -> calculate centroids -> compute distances from objects to centroids -> group by minimum distance -> did no object move group? NO: calculate centroids again; YES: end]
53. K-means
• Each message includes a cluster ID and a value
• In every superstep, each vertex sends a message to all
vertices
[Figure: six vertices A to F with values 1, 2, 3, 100, 101, 102; two of them are the initial seeds (seed1, seed2)]
Step  A   B   C   D   E   F
S0    C1  C2  -   -   -   -
55. K-means
Step  A   B   C   D   E   F
S0    C1  C2  -   -   -   -
S1    C1  C2  C1  C1  C1  C1
S2    C2  C2  C2  C1  C1  C1
Centroid1 = Mean(A,C,D,E,F)
Centroid2 = Mean(B)
[Figure: vertices A to F (values 1, 2, 3, 100, 101, 102), shown before and after the reassignment]
56. K-means
Step  A   B   C   D   E   F
S0    C1  C2  -   -   -   -
S1    C1  C2  C1  C1  C1  C1
S2    C2  C2  C2  C1  C1  C1
S3    C2  C2  C2  C1  C1  C1
Centroid1 = Mean(D,E,F)
Centroid2 = Mean(A,B,C)
[Figure: vertices A to F (values 1, 2, 3, 100, 101, 102) with their final clusters]
If the centroids have converged, quit the process!
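The assignment table can be reproduced with a small one-dimensional K-means loop. The slides do not say which value belongs to which vertex, so the mapping used here is an assumption: A = 2, B = 1, C = 3, D = 100, E = 101, F = 102, with A and B as the two seeds. Under that assumption the sketch goes through the same S1 and S2 assignments as the table. The class `KMeans1DSketch` and its methods are hypothetical names, and the loop runs in a single process rather than exchanging messages between vertices.

```java
import java.util.Arrays;

// Hypothetical 1-D K-means; cluster index 0 plays the role of C1, index 1 of C2.
final class KMeans1DSketch {
    static final class Result {
        final int[] assignment;   // cluster index per point
        final double[] centroids; // final centroid per cluster
        Result(int[] assignment, double[] centroids) {
            this.assignment = assignment;
            this.centroids = centroids;
        }
    }

    // Grouping based on minimum distance (ties go to the lower cluster index).
    static int[] assign(double[] points, double[] centroids) {
        int[] assignment = new int[points.length];
        for (int i = 0; i < points.length; i++) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (Math.abs(points[i] - centroids[c]) < Math.abs(points[i] - centroids[best])) {
                    best = c;
                }
            }
            assignment[i] = best;
        }
        return assignment;
    }

    // Calculate centroids as the mean of each group; an empty group keeps its old centroid.
    static double[] update(double[] points, int[] assignment, double[] old) {
        double[] sum = new double[old.length];
        int[] count = new int[old.length];
        for (int i = 0; i < points.length; i++) {
            sum[assignment[i]] += points[i];
            count[assignment[i]]++;
        }
        double[] next = new double[old.length];
        for (int c = 0; c < old.length; c++) {
            next[c] = count[c] == 0 ? old[c] : sum[c] / count[c];
        }
        return next;
    }

    // Repeats assign + update until no point moves group (maxIterations must be >= 1).
    static Result run(double[] points, double[] seeds, int maxIterations) {
        double[] centroids = seeds.clone();
        int[] previous = null;
        for (int iter = 0; iter < maxIterations; iter++) {
            int[] current = assign(points, centroids);
            if (Arrays.equals(current, previous)) break; // converged
            centroids = update(points, current, centroids);
            previous = current;
        }
        return new Result(previous, centroids);
    }

    public static void main(String[] args) {
        double[] points = {2, 1, 3, 100, 101, 102}; // assumed values of A..F
        double[] seeds = {2, 1};                    // assumed seeds: A and B
        Result r = run(points, seeds, 10);
        System.out.println(Arrays.toString(r.assignment) + " " + Arrays.toString(r.centroids));
    }
}
```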
60. Giraph
• ASF (Apache Software Foundation)’s open-source
version of Google’s Pregel
• Implemented in Java
• Apache Incubator project
• Requirements
- Hadoop 0.20.203 or higher
: runs as a map-only job in Hadoop
- ZooKeeper
: if none exists, the Hadoop file system is used instead of ZooKeeper
62. Giraph - usages
• Users can set the checkpoint frequency
– GiraphJob.getConfiguration().setInt("giraph.checkpointFrequency", 0)
// 0 means no checkpoints
• Users should set the ZooKeeper configuration
– GiraphJob.setZookeeperConfiguration("zk-server-list");
63. Giraph - Characteristics
• Fault tolerance
– If the master dies, a new one will automatically take over
– If a worker dies, the app is rolled back to a previously checkpointed
superstep
– If a ZooKeeper server dies, as long as a quorum remains, the app can
proceed
– However, the Hadoop single point of failure (SPOF) still exists
• Combiner/Aggregator
• JSON in/out format
• Easy job status monitoring (HTTP)
71. Install Goldenorb(2)
• Set configuration
① ORB_HOME environment variable
> export ORB_HOME=/usr/local/goldenorb
② Conf/orbServers
> localhost:/usr/local/goldenorb
③ Conf/orb-site.xml
> cp orb-site.sample.xml orb-site.xml
> vi orb-site.xml
④ In distributed mode,
copy the configuration to all servers
<property>
<name>goldenOrb.zookeeper.quorum</name>
<!-- target ZooKeeper server IP -->
<value>localhost</value>
<description>The server running zookeeper</description>
</property>
……
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
72. Install Goldenorb(3)
• Set running environment
① Start Hadoop
> $HADOOP_HOME/bin/start-dfs.sh
② Start ZooKeeper
> $ZK_HOME/bin/zkServer.sh start
③ Start orb-tracker
> $ORB_HOME/bin/orb-tracker.sh start
④ Check the log
> cat $ORB_HOME/logs/xxx.log
73. Install Goldenorb(4)
• Make input
- e.g., maximum value
<vertex-id> <value> <outgoing-edge-list>
[Figure: directed graph with vertices A to E and the values and edges listed below]
A 0 B D
B 8 C D
C 11 E
D 5 B C E
E 7 A C
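To show what a "maximum value" job would compute on this input, here is a single-process sketch that parses the five lines above and propagates the largest value in supersteps. It does not use GoldenOrb: `MaxValueSketch` and `run` are hypothetical names, messages to the same vertex are merged with max (which is what a combiner would do), and every edge target is assumed to have its own input line.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of maximum-value propagation over the sample input.
final class MaxValueSketch {
    // Each line has the form: <vertex-id> <value> <outgoing-edge-list>
    static Map<String, Integer> run(List<String> lines) {
        Map<String, Integer> value = new LinkedHashMap<>();
        Map<String, List<String>> edges = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            value.put(fields[0], Integer.parseInt(fields[1]));
            edges.put(fields[0], Arrays.asList(fields).subList(2, fields.length));
        }

        // Superstep 0: every vertex sends its value along its outgoing edges.
        Map<String, Integer> inbox = new HashMap<>();
        for (String v : value.keySet()) {
            for (String target : edges.get(v)) inbox.merge(target, value.get(v), Integer::max);
        }

        // Later supersteps: a vertex that learns a larger value adopts and forwards it;
        // otherwise it stays halted. The loop ends when no messages are in flight.
        while (!inbox.isEmpty()) {
            Map<String, Integer> outbox = new HashMap<>();
            for (Map.Entry<String, Integer> in : inbox.entrySet()) {
                String v = in.getKey();
                int incoming = in.getValue();
                if (incoming > value.get(v)) {
                    value.put(v, incoming);
                    for (String target : edges.get(v)) outbox.merge(target, incoming, Integer::max);
                }
            }
            inbox = outbox;
        }
        return value;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "A 0 B D", "B 8 C D", "C 11 E", "D 5 B C E", "E 7 A C");
        System.out.println(run(input));
    }
}
```

Since every vertex in this graph can reach every other vertex, all five vertices end up holding the largest input value.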