STINGER: Multi-threaded Graph
Streaming
Jason Riedy and David Bader
Georgia Institute of Technology
Graph Algorithms Build...
(insert current prefix here)scale data analysis
Health care Finding outbreaks, population epidemiology
Social networks Adve...
Outline
Motivation: Graph Algorithms for Analysis
Graphs and Streaming Data
STING/STINGER Analysis Framework
Building Bloc...
General approaches
High-performance static graph analysis
Develop techniques that apply to unchanging massive
graphs.
Prov...
Streaming graph data
Data Rates
From www.statisticsbrain.com:
58M posts per day on Twitter (671 / sec)
1M links shared per...
Streaming Queries
Different kinds of questions
How are individual graph metrics (e.g. clustering
coefficients) changing?
Wh...
On to STING...
Motivation: Graph Algorithms for Analysis
Graphs and Streaming Data
STING/STINGER Analysis Framework
Buildi...
STING’s focus
Source data
predictionaction
summary
Control
VizSimulation / query
STING: Spatio-Temporal Interaction Networ...
STING: High-level architecture
Slide credit: Rob McColl and David Ediger
OpenMP + sufficiently POSIX-ish
Multiple processes...
STING Extensible Representation: Core data structure
source
vertex
edge
type
adj. vtx weight mod. time first time
edge type...
Building Blocks for Streaming Graph Data
Motivation: Graph Algorithms for Analysis
Graphs and Streaming Data
STING/STINGER...
Incremental PageRank
PageRank: Well-understood method for ranking vertices
based on random walks (related to minimizing
co...
Incremental PageRank: Iteration
Iterative solver
Step k → k + 1:
∆x(k+1)
= α(A + ∆A)T
(D + ∆D)−1
∆x(k)
+
α (A + ∆A)T
(D + ...
Incremental PageRank: Limiting Expansion
Iterative solver
Step k → k + 1:
∆ˆx(k+1)
= α(A + ∆A)T
(D + ∆D)−1
∆ˆx
(k)
|ex + α...
Incremental PageRank: Test Cases
Initial, high-level implementation via sparse matrices in Julia.
Test graphs from the 10t...
Incremental PageRank: Throughput v. latency
Percent of edge traversals relative to re-started iteration:
1000 100 10
q q q...
Triangle Counting
Current version
Count all the triangles around each graph vertex.
Used in clustering coefficients (numera...
Triangle Counting: Small Batches
Low-latency case
In generall, diag(A3) is a silly option.
But A3∆x, a BFS, to count aroun...
Community Detection
What do we mean?
Partition a graph’s vertices
into disjoint communities.
Locally optimize some metric,...
Parallel Agglomerative Method
A
B
C
D
E
F
G
Use a matching to avoid a serializing
queue.
Simple greedy algorithm.
Would re...
Parallel Agglomerative Method
A
B
C
D
E
F
G
C
D
G
Use a matching to avoid a serializing
queue.
Simple greedy algorithm.
Wo...
Parallel Agglomerative Method
A
B
C
D
E
F
G
C
D
G
E
B
C
Use a matching to avoid a serializing
queue.
Simple greedy algorit...
Seed Set Expansion
Problem
Given a small number of vertices, and
find a region of interest around them.
Start with a subset...
Observations
Throughput / latency trade offs:
Different levels of parallelism and optimizations
Larger batches ⇒ higher th...
STINGER: Where do you get it?
www.cc.gatech.edu/stinger/
Gateway to
code,
development,
documentation,
presentations...
(wo...
Acknowledgment of support
Riedy, Bader— STINGER: Building Blocks 19 May 2014 24 / 24
Prochain SlideShare
Chargement dans…5
×

STINGER: Multi-threaded Graph Streaming

381 vues

Publié le

Publié dans : Logiciels
0 commentaire
0 j’aime
Statistiques
Remarques
  • Soyez le premier à commenter

  • Soyez le premier à aimer ceci

Aucun téléchargement
Vues
Nombre de vues
381
Sur SlideShare
0
Issues des intégrations
0
Intégrations
6
Actions
Partages
0
Téléchargements
5
Commentaires
0
J’aime
0
Intégrations 0
Aucune incorporation

Aucune remarque pour cette diapositive

STINGER: Multi-threaded Graph Streaming

  1. 1. STINGER: Multi-threaded Graph Streaming Jason Riedy and David Bader Georgia Institute of Technology Graph Algorithms Building Blocks (GABB), 19 May 2014
  2. 2. (insert current prefix here)scale data analysis Health care Finding outbreaks, population epidemiology Social networks Advertising, searching, grouping Intelligence Decisions at scale, regulating algorithms Systems biology Understanding interactions, drug design Power grid Disruptions, conservation Simulation Discrete events, cracking meshes Data analysis runs throughout. Many problems can be phrased through graphs. Any many are changing and dynamic. Riedy, Bader— STINGER: Building Blocks 19 May 2014 2 / 24
  3. 3. Outline Motivation: Graph Algorithms for Analysis Graphs and Streaming Data STING/STINGER Analysis Framework Building Blocks for Streaming Graph Data PageRank Triangle Counting Agglomerative Communities Observations Riedy, Bader— STINGER: Building Blocks 19 May 2014 3 / 24
  4. 4. General approaches High-performance static graph analysis Develop techniques that apply to unchanging massive graphs. Provides useful after-the-fact information, starting points. Serves many existing applications well: market research, much bioinformatics, ... Needs to be O(|E|). High-performance streaming graph analysis Focus on the dynamic changes within massive graphs. Find trends or new information as they appear. Serves upcoming applications: fault or threat detection, trend analysis, online prediction... Can be O(|∆E|)? O(Vol(∆V))? Less data ⇒ faster, efficient Riedy, Bader— STINGER: Building Blocks 19 May 2014 4 / 24
  5. 5. Streaming graph data Data Rates From www.statisticsbrain.com: 58M posts per day on Twitter (671 / sec) 1M links shared per 20 minutes on Facebook Other applications (e.g. network security) need to respond nearly at line rate, 81k-1.5M pps on gigabit ethernet. Opportunities Do not need to analyze the entire graph. Different domains: Throughput & latency Expose different levels of concurrency Can achieve ridiculous “speed ups.” Riedy, Bader— STINGER: Building Blocks 19 May 2014 5 / 24
  6. 6. Streaming Queries Different kinds of questions How are individual graph metrics (e.g. clustering coefficients) changing? What are the patterns in the changes? Are there seasonal variations? What are responses to events? What are temporal anomalies in the graph? Do key members in clusters / communities change? Are there indicators of event responses before they are obvious? Riedy, Bader— STINGER: Building Blocks 19 May 2014 6 / 24
  7. 7. On to STING... Motivation: Graph Algorithms for Analysis Graphs and Streaming Data STING/STINGER Analysis Framework Building Blocks for Streaming Graph Data Observations Riedy, Bader— STINGER: Building Blocks 19 May 2014 7 / 24
  8. 8. STING’s focus Source data predictionaction summary Control VizSimulation / query STING: Spatio-Temporal Interaction Networks and Graphs STING manages queries against changing graph data. Visualization and control often are application specific. Ideal: Maintain many persistent graph analysis kernels. One current graph snapshot, kernels keep smaller histories. Also (a harder goal), coordinate the kernels’ cooperation. Riedy, Bader— STINGER: Building Blocks 19 May 2014 8 / 24
  9. 9. STING: High-level architecture Slide credit: Rob McColl and David Ediger OpenMP + sufficiently POSIX-ish Multiple processes for resilience Riedy, Bader— STINGER: Building Blocks 19 May 2014 9 / 24
  10. 10. STING Extensible Representation: Core data structure source vertex edge type adj. vtx weight mod. time first time edge type index vertex index Initial considerations [Bader, et al.] Be useful for the entire “large graph” community Permit good performance: No single structure is optimal for all. Assume globally addressable memory access and atomic operations Not a graph database, but supports types, subsets Large graph ⇒ rare conflicts Riedy, Bader— STINGER: Building Blocks 19 May 2014 10 / 24
  11. 11. Building Blocks for Streaming Graph Data Motivation: Graph Algorithms for Analysis Graphs and Streaming Data STING/STINGER Analysis Framework Building Blocks for Streaming Graph Data PageRank Triangle Counting Agglomerative Communities Observations Riedy, Bader— STINGER: Building Blocks 19 May 2014 11 / 24
  12. 12. Incremental PageRank PageRank: Well-understood method for ranking vertices based on random walks (related to minimizing conductance). Equivalent problem, solve (I − αATD−1)x = (1 − α)v given initial weights v. Goal: Use for seed set expansion, sparse v. State-of-the-art for updating x when the graph represented by A changes? Re-start iteration with the previous x. Can do significantly better for low-latency needs. Compute the change ∆x instead of the entire new x. Riedy, Bader— STINGER: Building Blocks 19 May 2014 12 / 24
  13. 13. Incremental PageRank: Iteration Iterative solver Step k → k + 1: ∆x(k+1) = α(A + ∆A)T (D + ∆D)−1 ∆x(k) + α (A + ∆A)T (D + ∆D)−1 − AT D−1 x Additive part: Non-zero only at changes. Operator: Pushes changes outward. Riedy, Bader— STINGER: Building Blocks 19 May 2014 13 / 24
  14. 14. Incremental PageRank: Limiting Expansion Iterative solver Step k → k + 1: ∆ˆx(k+1) = α(A + ∆A)T (D + ∆D)−1 ∆ˆx (k) |ex + α∆ˆx|held α (A + ∆A)T (D + ∆D)−1 − AT D−1 ˆx Additive part: Non-zero only at changes. Operator: Pushes sufficiently large changes outward. Riedy, Bader— STINGER: Building Blocks 19 May 2014 14 / 24
  15. 15. Incremental PageRank: Test Cases Initial, high-level implementation via sparse matrices in Julia. Test graphs from the 10th DIMACS Implementation Challenge. Add uniformly random edges... worst case. Up to 100k in different batch sizes. One sequence of edge actions per graph shared across experiments. Conv. & hold base threshold: 10−12 Graph |V| |E| Avg. Deg. Size (MiB) caidaRouterLevel 192 244 609 066 3.17 5.38 coPapersCiteseer 434 102 1 603 6720 36.94 124.01 coPapersDBLP 540 486 15 245 729 28.21 118.38 great-britain.osm 7 733 822 8 156 517 1.05 91.73 PGPgiantcompo 10 680 24 316 2.28 0.23 power 4 941 6 594 1.33 0.07 Riedy, Bader— STINGER: Building Blocks 19 May 2014 15 / 24
  16. 16. Incremental PageRank: Throughput v. latency Percent of edge traversals relative to re-started iteration: 1000 100 10 q q q q q q q q q qq q q q q q q q q qq q q q q q q q q qq q q q q q q q q q qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq q qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq q qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq 0% 20% 40% 60% 80% 0% 20% 40% 60% 80% 0% 20% 40% 60% 80% 0% 20% 40% 60% 80% 0% 20% 40% 60% 80% 0% 20% 40% 60% 80% caidaRouterLevelcoPapersCiteseercoPapersDBLPgreat_britain.osmPGPgiantcompopower 0 2500 5000 7500 10000 0 2500 5000 7500 10000 0 2500 5000 7500 10000 "Time" Fractionofedgesv.restartedPR Graph name q caidaRouterLevel coPapersCiteseer coPapersDBLP great_britain.osm PGPgiantcompo power Held vertex threshold / 10**−12 q q q q10000.0 1000.0 100.0 10.0 Riedy, Bader— STINGER: Building Blocks 19 May 2014 16 / 24
  17. 17. Triangle Counting Current version Count all the triangles around each graph vertex. Used in clustering coefficients (numerator), etc. Up to 130 000 graph updates per second on X5570 (Nehalem-EP, 2.93GHz) 2000× speed-up over static recomputation Main algorithm, for each vertex v: Sort its adjacency list. For each neighbor w, search for w’s neighbors in the sorted list. Could compute diag(A3), more or less... Riedy, Bader— STINGER: Building Blocks 19 May 2014 17 / 24
  18. 18. Triangle Counting: Small Batches Low-latency case In generall, diag(A3) is a silly option. But A3∆x, a BFS, to count around a few vertices... Brute force (MTAAP10) Roughly 4× slower with moderate batches, and less than 2× slower with small batches. Could be reasonable for a quick hack. Small changes (low latency) may find more applications of linear algebra-like primitives. Riedy, Bader— STINGER: Building Blocks 19 May 2014 18 / 24
  19. 19. Community Detection What do we mean? Partition a graph’s vertices into disjoint communities. Locally optimize some metric, e.g. modularity, conductance Try to capture that vertices are more similar within one community than between communities. Modularity: More internal edges than expected. Jason’s network via LinkedIn Labs Riedy, Bader— STINGER: Building Blocks 19 May 2014 19 / 24
  20. 20. Parallel Agglomerative Method A B C D E F G Use a matching to avoid a serializing queue. Simple greedy algorithm. Would require smuggling data and communication in through element operators... Highly scalable, 5.8 × 106 – 7 × 106 edges/sec, 8× – 16× speed-up on 80-thread E7-8870 (thanks Intel!) Extends to dynamic community maintenance Extract vertices from communities, re-agglomerate Matrix triple product-ish Riedy, Bader— STINGER: Building Blocks 19 May 2014 20 / 24
  21. 21. Parallel Agglomerative Method A B C D E F G C D G Use a matching to avoid a serializing queue. Simple greedy algorithm. Would require smuggling data and communication in through element operators... Highly scalable, 5.8 × 106 – 7 × 106 edges/sec, 8× – 16× speed-up on 80-thread E7-8870 (thanks Intel!) Extends to dynamic community maintenance Extract vertices from communities, re-agglomerate Matrix triple product-ish Riedy, Bader— STINGER: Building Blocks 19 May 2014 20 / 24
  22. 22. Parallel Agglomerative Method A B C D E F G C D G E B C Use a matching to avoid a serializing queue. Simple greedy algorithm. Would require smuggling data and communication in through element operators... Highly scalable, 5.8 × 106 – 7 × 106 edges/sec, 8× – 16× speed-up on 80-thread E7-8870 (thanks Intel!) Extends to dynamic community maintenance Extract vertices from communities, re-agglomerate Matrix triple product-ish Riedy, Bader— STINGER: Building Blocks 19 May 2014 20 / 24
  23. 23. Seed Set Expansion Problem Given a small number of vertices, and find a region of interest around them. Start with a subset consisting of the selection. Evaluate the change in modularity around the current subset. Absorb all vertices that... may increase modularity by a significant amount, or are within the top 10% of changes, or... Repeat until the set is large enough. Step-wise guided expansion doesn’t fit current primitives. Riedy, Bader— STINGER: Building Blocks 19 May 2014 21 / 24
  24. 24. Observations Throughput / latency trade offs: Different levels of parallelism and optimizations Larger batches ⇒ higher throughput, more collisions Small batches ⇒ lower latency, more scattered Impact optimizations similarly to direction-optimized BFS Can build proposed building blocks against STINGER Many algorithms are not naturally expressed: Matching Guided set expansion by changing criteria Streaming versions of these... Targets for version 2? Riedy, Bader— STINGER: Building Blocks 19 May 2014 22 / 24
  25. 25. STINGER: Where do you get it? www.cc.gatech.edu/stinger/ Gateway to code, development, documentation, presentations... (working on usage and development screencasts) Remember: Still academic code, but maturing. Users / contributors / questioners: Georgia Tech, PNNL, CMU, Berkeley, Intel, Cray, NVIDIA, IBM, Federal Government, Ionic Security, Citi Riedy, Bader— STINGER: Building Blocks 19 May 2014 23 / 24
  26. 26. Acknowledgment of support Riedy, Bader— STINGER: Building Blocks 19 May 2014 24 / 24

×