SlideShare une entreprise Scribd logo
1  sur  30
http://bigdata.com 
http://mapgraph.io 
SYSTAP, LLC 
Graphs 
Graph Databases 
Graph Analytics on GPUs 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
1 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Graphs 
• This talk is about recent advances in large scale graph processing on 
GPUs. 
– The motivation is extreme performance. 
– Everything we do (as a company) is focused on graphs. 
• Graph Database 
• Graph processing 
• Common characteristics: 
– irregular data shape, irregular access patterns, and irregular parallelism. 
• A lot of data can be mapped onto graphs 
– Sparse matrices and graphs are very close data structures 
– Graphs, as we deal with them, have attributes on vertices and edges. 
• A lot of algorithms can be mapped onto graphs 
– Including many machine learning algorithms. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
2 
http://www.bigdata.com/blog
Small Business, Founded 2006 100% Employee Owned 
http://bigdata.com 
http://mapgraph.io 
SYSTAP, LLC 
Graph Database 
• High performance, Scalable 
– 50B edges/node 
– High level query language 
– Efficient Graph Traversal 
– High 9s solution 
• Open Source 
– Subscriptions 
GPU Analytics 
• Extreme Performance 
– 5-100x faster than graphlab 
– 10,000x faster than graphdbs 
• DARPA funding 
• Disruptive technology 
– Early adopters 
– Huge ROIs 
• Open Source 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
3 
9/19/2014
Related “Graph” Technologies 
Redpoint repositions existing technology, 
adding interoperability for blueprints and 
gremlin. 
http://bigdata.com 
http://mapgraph.io 
STTR 
Pair up bigdata 
and MapGraph 
MapGraph compares favorably with high end 
hardware solutions from YARC, Oracle, and 
SAP, but is open source and uses commodity 
hardware. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
4 
9/19/2014
Embedded, Single Server, HA, Scale-out 
• RDF/SPARQL 
• Property graphs 
– Blueprints, gremlin, rexter 
• REST API (NSS) 
• Extension points 
– Stored queries for custom 
application logic on the 
server. 
– Custom services & indices 
– Custom functions 
– Vertex-centric programs 
http://bigdata.com 
http://mapgraph.io 
• Embedded Server 
Journal 
JVM 
• Standalone Server 
Journal 
WAR 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
5 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
High Availability 
• Shared nothing architecture 
– Same data on each node 
– Coordinate only at commit 
– Transparent load balancing 
• Scaling 
– 50 billion triples or quads 
– Query throughput scales linearly 
• Self healing 
– Automatic failover 
– Automatic resync after disconnect 
– Online single node disaster recovery 
• Online Backup 
– Online snapshots (full backups) 
– HA Logs (incremental backups) 
• Point in time recovery (offline) 
HAService 
Quorum 
k=3 
size=3 
leader 
follower 
HAService 
HAService 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
6 
9/19/2014
Embedded, Single Server, HA, Scale-out 
http://bigdata.com 
http://mapgraph.io 
RDF Data and SPARQL Query 
Client Service 
Distributed Index Management and Query 
Management Functions 
Client Service 
Registrar 
Data Service 
Client Service 
Data Service Data Service Data Service 
Data Service Data Service Data Service 
Zookeeper 
Shard Locator 
Transaction Mgr 
Load Balancer 
Unified API 
Application 
Client 
Application 
Client 
Application 
Client 
Application 
Client 
Application 
Client 
Client Service 
SPARQL XML 
SPARQL JSON 
RDF/XML 
N-Triples 
N-Quads 
Turtle 
TriG 
RDF/JSON 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
7 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
And now on GPUs 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
8 
9/19/2014
Similar models, different problems 
• Graph query and graph analytics (traversal/mining) 
– Related data models 
– Very different computational requirements 
• Many technologies are a bad match or limited solution 
– Key-value stores (bigtable, Accumulo, Cassandra, HBase) 
– Map-reduce 
• Anti-pattern 
– Dump all data into “big bucket” 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
9 
9/19/2014
Similar models, different problems 
• Graph query and graph analytics (traversal/mining) 
– Related data models 
– Very different computational requirements 
• Many technologies are a bad match or limited solution 
– Key-value stores (bigtable, Accumulo, Cassandra, HBase) 
– Map-reduce 
• Anti-pattern 
– Dump all data into “big bucket” 
Storage and computation patterns must be 
correctly matched for high performance. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
10 
9/19/2014
Optimize for the right problem 
• Graph analytics 
– Parallelism – work must be distributed and balanced. 
– Memory bandwidth – memory, not disk, is the bottleneck 
– 2D partitioning – O(log(N)) communications pattern (versus O(N*N)) 
• 1D design looses locality when updating link weights for reverse indices. 
BFS PR 
• Storage and computation patterns must be correctly matched 
for high performance. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
11 
9/19/2014
GPUs – A Game Changer for Graph Analytics 
• Graphs are a hard problem 
• Non-locality 
• Data dependent parallelism 
• Memory, PCIe bus and network are 
bottlenecks 
• Recent performance gains driven by 
innovations in bottom-up search, 
data layout, and partitioning. 
• GPUs deliver effective parallelism 
• 10x CPU FLOPS 
• 10x CPU/RAM bandwidth 
• Significant speeds up over CPU 
• 3 GTEPS on one GPU 
• 32 GTEPS on 64 GPU cluster 
http://bigdata.com 
http://mapgraph.io 
3500 
3000 
2500 
2000 
1500 
1000 
500 
0 
Breadth-First Search on Graphs 
NVIDIA Tesla C2050 
Multicore per socket 
Sequential 
0 
2 
1 10 100 1000 10000 100000 
Million Traversed Edges per 
Second 
Average Traversal Depth 
1 
1 
2 
1 
1 
2 
2 
2 
1 
3 
2 
3 
2 
1 2 
2 
10x Speedup on GPUs
http://bigdata.com 
http://mapgraph.io 
GPU Hardware Trends 
• K40 GPU (today) 
• 12G RAM/GPU 
• 288 GB/s bandwidth 
• PCIe Gen 3 
• Pascal GPU (Q1 2016) 
• 24G RAM/GPU 
• 1 TB/s bandwidth 
• Unified memory 
across CPU, GPUs 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
13 
9/19/2014
Full Bandwidth Access to CPU RAM 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
14 
9/19/2014
Architecture shapes performance 
• The data was a scale-free graph with 2.7M vertices and 5.6M 
– MapGraph used a larger version of the graph (24M vertices, 25M edges) 
• The query was a 5-degree subgraph (depth-limited BFS) 
• Two main takeaways 
– Horizontal scaling for titan is very expensive – wrong abstraction. 
– GPUs are ridiculously fast. 
platform load (s) 
http://bigdata.com 
http://mapgraph.io 
query 
(ms) comments 
titan 497.00 935 4 node cluster using Cassandra 
neo4j 608.00 668 single node community edition 
bigdata 396.00 281 single node (open source) 
MapGraph 0.08 27 NVIDIA K20 GPU 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
15 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
MapGraph 
Graph Processing on GPUs 
http://MapGraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
16 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Think Like a Vertex 
• Simple APIs 
pageRank(Message m) { 
total = m.value(); 
vertex.val = .15 * .85 + total; 
for(nbr : out_neighbors) { 
SendMsg(nbr, vertex.val/num_out_nbrs); 
} 
} 
• Lots of algorithms 
– BFS, SSSP, Page Rank, Connected Components, Louvain Modularity, 
Jaccard Distance, k-means clustering, Betweenness-Centrality, 
Personalized Page Rank, Loopy Belief Propagation, Graph search (crisp 
and approximate), etc. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
17 
9/19/2014
GAS – a Graph-Parallel Abstraction 
• Graph-Parallel Vertex-Centric API ala GraphLab 
• “Think like a vertex” 
• Gather: collect information 
about my neighborhood 
• Apply: update my value 
• Scatter: signal adjacent vertices 
• Can write all sorts of graph algorithms this way 
– BFS, PageRank, Connected Component, Triangle Counting, Max Flow, 
etc. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
18 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
MapGraph 
• High-level graph processing framework 
• High programmability 
GPU architecture 
Optimization techniques 
CUDA 
• High performance 
Comparable to low-level approach 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
19 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
MapGraph 
• High-level graph processing framework 
• High programmability 
GPU architecture 
Optimization techniques 
CUDA 
• High performance 
Comparable to low-level approach 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
20 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Single GPU MapGraph (BFS) 
Dataset #vertices #edges Max Degree Milliseconds 
Webbase 1,000,005 3,105,536 23 1.2 
Delaunay 2,097,152 6,291,408 4,700 24.5 
Bitcoin 6,297,539 28,143,065 4,075,472 345.3 
Wiki 3,566,907 45,030,389 7,061 51.0 
Kron 1,048,576 89,239,674 131,505 47.7 
154.0 
513.6 
74.8 
821.3 
1870.9 
2,000 
1,800 
1,600 
1,400 
1,200 
1,000 
800 
600 
400 
200 
0 
Webbase Delaunay Bitcoin Wiki Kron 
MTEPS 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
21 
9/19/2014
BFS Results : MapGraph vs GraphLab 
1,000.00 
100.00 
10.00 
http://bigdata.com 
http://mapgraph.io 
1.00 
0.10 
Webbase Delaunay Bitcoin Wiki Kron 
Speedup 
MapGraph Speedup vs GraphLab (BFS) 
GL-2 
GL-4 
GL-8 
GL-12 
MPG 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
22 
9/19/2014
PageRank : MapGraph vs GraphLab 
100.00 
10.00 
http://bigdata.com 
http://mapgraph.io 
1.00 
0.10 
Webbase Delaunay Bitcoin Wiki Kron 
Speedup 
MapGraph Speedup vs GraphLab (Page Rank) 
GL-2 
GL-4 
GL-8 
GL-12 
MPG 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
23 
9/19/2014
Graph Mining on GPU Clusters 
• 2D partitioning (aka vertex cuts) 
• Minimizes the communication volume. 
• Batch parallel Gather in row, Scatter in 
column. 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
24 
9/19/2014
Accelerated Graph Analytics 
http://bigdata.com 
http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
25 
9/19/2014
• Work spans multiple orders of magnitude. 
10,000,000 
1,000,000 
100,000 
10,000 
1,000 
100 
http://bigdata.com 
http://mapgraph.io 
10 
1 
Scale 25 Traversal 
0 1 2 3 4 5 6 
Frontier Size 
Iteration 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
26 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Strong Scaling 
• Speedup on a constant problem size with more GPUs 
• Problem scale 25 
– 2^25 vertices (33,554,432) 
– 2^26 directed edges (1,073,741,824) 
23 
22 
21 
20 
19 
18 
17 
16 
15 
14 
13 
10 20 30 40 50 60 70 
GTEPS 
#GPUs 
Strong scaling 
GPUs GTEPS Time (s) 
16 14.3 0.075 
25 16.4 0.066 
36 18.1 0.059 
64 22.7 0.047 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
27 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Weak Scaling 
• Scaling the problem size with more GPUs 
35 
30 
25 
20 
15 
10 
5 
0 
1 4 16 64 
GTEPS 
#GPUs 
Weak scaling 
GPUs Scale Vertices Edges Time (s) GTEPS 
1 21 2,097,152 67,108,864 0.0254 3 
4 23 8,388,608 268,435,456 0.0429 6 
16 25 33,554,432 1,073,741,824 0.0715 15 
64 27 134,217,728 4,294,967,296 0.1478 29 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
28 
9/19/2014
http://bigdata.com 
http://mapgraph.io 
Highlights 
• For algorithms on large graphs 
– Memory is the bottleneck 
• CPUs quickly saturate the memory bus. 
• CPU cache thrashing limits scaling for graph traversal. 
• Continued performance gains for CPUs focus on reducing the #of visited edges to reduce 
bandwidth. 
• Hybrid CPU/GPU architectures offload either small degree vertices (reduce cache 
thrashing) or high degree vertices (if the algorithm is FLOPS bound on the CPU, e.g., BC) 
– Many core is the future. 
– GPUs are primarily known for their FLOPS, but they have high memory bandwidth 
and can deliver effective parallelism on parallel graph problems (with sophisticated 
kernels). 
• Scaling to very large graphs on large compute clusters 
– Communications bound. 
• Communication must be constant for perfect scaling 
– Hybrid partitioning seeks to reduce #of messages, size of messages, and optimize 
for asynchronous communications and degree-aware layouts for bottom-up search 
to reduce memory bandwidth. 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
29 
http://www.bigdata.com/blog
http://bigdata.com 
http://mapgraph.io 
Bryan Thompson 
SYSTAP, LLC 
bryan@systap.com 
http://bigdata.com http://mapgraph.io 
SYSTAP™, LLC 
© 2006-2014 All Rights Reserved 
30 
9/19/2014

Contenu connexe

Tendances

Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 

Tendances (20)

MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A machine learning and data science pipeline for real companies
A machine learning and data science pipeline for real companiesA machine learning and data science pipeline for real companies
A machine learning and data science pipeline for real companies
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at Scale
 
Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1
 
Near Data Computing Architectures: Opportunities and Challenges for Apache Spark
Near Data Computing Architectures: Opportunities and Challenges for Apache SparkNear Data Computing Architectures: Opportunities and Challenges for Apache Spark
Near Data Computing Architectures: Opportunities and Challenges for Apache Spark
 
Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedIn
 
MySql to HBase in 5 Steps
MySql to HBase in 5 StepsMySql to HBase in 5 Steps
MySql to HBase in 5 Steps
 
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UNMariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
MariaDB 10.2 & MariaDB 10.1 by Michael Monty Widenius at Database Camp 2016 @ UN
 
Geospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache SparkGeospatial Analytics at Scale with Deep Learning and Apache Spark
Geospatial Analytics at Scale with Deep Learning and Apache Spark
 
Data Warehouse Offload
Data Warehouse OffloadData Warehouse Offload
Data Warehouse Offload
 
Cost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceCost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark Service
 
Performing Simulation-Based, Real-time Decision Making with Cloud HPC
Performing Simulation-Based, Real-time Decision Making with Cloud HPCPerforming Simulation-Based, Real-time Decision Making with Cloud HPC
Performing Simulation-Based, Real-time Decision Making with Cloud HPC
 
Distributed Heterogeneous Mixture Learning On Spark
Distributed Heterogeneous Mixture Learning On SparkDistributed Heterogeneous Mixture Learning On Spark
Distributed Heterogeneous Mixture Learning On Spark
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
 
Preventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryPreventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive Industry
 
Improving MySQL performance with Hadoop
Improving MySQL performance with HadoopImproving MySQL performance with Hadoop
Improving MySQL performance with Hadoop
 
Harnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsHarnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data Payloads
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsCisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
 

Similaire à Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL

RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
Connected Data World
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
boorad
 
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
AAMIR FAROOQUI
 

Similaire à Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL (20)

Deploying Cassandra Multi-cloud
Deploying Cassandra Multi-cloudDeploying Cassandra Multi-cloud
Deploying Cassandra Multi-cloud
 
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, BlazegraphDatabase Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
 
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community Edition
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
 
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep Learning
 
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
A Glass Half Full: Using Programmable Hardware Accelerators in Analytical Dat...
 
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
 
El Punto Neutro de Internet en Cataluña
El Punto Neutro de Internet en CataluñaEl Punto Neutro de Internet en Cataluña
El Punto Neutro de Internet en Cataluña
 
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
HC28.22.430-Vision-Neural-Net-GregEfland-Cadence-v02-57
 
Network Engineering for High Speed Data Sharing
Network Engineering for High Speed Data SharingNetwork Engineering for High Speed Data Sharing
Network Engineering for High Speed Data Sharing
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
True Reusable Code - DevSum2016
True Reusable Code - DevSum2016True Reusable Code - DevSum2016
True Reusable Code - DevSum2016
 
Apache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CIT
 
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
Stinger Initiative: Leveraging Hive & Yarn for High-Performance/Interactive Q...
 

Plus de MLconf

Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
MLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
MLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
MLconf
 

Plus de MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Bryan Thompson, Chief Scientist and Founder, SYSTAP, LLC at MLconf ATL

  • 1. http://bigdata.com http://mapgraph.io SYSTAP, LLC Graphs Graph Databases Graph Analytics on GPUs SYSTAP™, LLC © 2006-2014 All Rights Reserved 1 9/19/2014
  • 2. http://bigdata.com http://mapgraph.io Graphs • This talk is about recent advances in large scale graph processing on GPUs. – The motivation is extreme performance. – Everything we do (as a company) is focused on graphs. • Graph Database • Graph processing • Common characteristics: – irregular data shape, irregular access patterns, and irregular parallelism. • A lot of data can be mapped onto graphs – Sparse matrices and graphs are very close data structures – Graphs, as we deal with them, have attributes on vertices and edges. • A lot of algorithms can be mapped onto graphs – Including many machine learning algorithms. SYSTAP™, LLC © 2006-2014 All Rights Reserved 2 http://www.bigdata.com/blog
  • 3. Small Business, Founded 2006 100% Employee Owned http://bigdata.com http://mapgraph.io SYSTAP, LLC Graph Database • High performance, Scalable – 50B edges/node – High level query language – Efficient Graph Traversal – High 9s solution • Open Source – Subscriptions GPU Analytics • Extreme Performance – 5-100x faster than graphlab – 10,000x faster than graphdbs • DARPA funding • Disruptive technology – Early adopters – Huge ROIs • Open Source SYSTAP™, LLC © 2006-2014 All Rights Reserved 3 9/19/2014
  • 4. Related “Graph” Technologies Redpoint repositions existing technology, adding interoperability for blueprints and gremlin. http://bigdata.com http://mapgraph.io STTR Pair up bigdata and MapGraph MapGraph compares favorably with high end hardware solutions from YARC, Oracle, and SAP, but is open source and uses commodity hardware. SYSTAP™, LLC © 2006-2014 All Rights Reserved 4 9/19/2014
  • 5. Embedded, Single Server, HA, Scale-out • RDF/SPARQL • Property graphs – Blueprints, gremlin, rexter • REST API (NSS) • Extension points – Stored queries for custom application logic on the server. – Custom services & indices – Custom functions – Vertex-centric programs http://bigdata.com http://mapgraph.io • Embedded Server Journal JVM • Standalone Server Journal WAR SYSTAP™, LLC © 2006-2014 All Rights Reserved 5 9/19/2014
  • 6. http://bigdata.com http://mapgraph.io High Availability • Shared nothing architecture – Same data on each node – Coordinate only at commit – Transparent load balancing • Scaling – 50 billion triples or quads – Query throughput scales linearly • Self healing – Automatic failover – Automatic resync after disconnect – Online single node disaster recovery • Online Backup – Online snapshots (full backups) – HA Logs (incremental backups) • Point in time recovery (offline) HAService Quorum k=3 size=3 leader follower HAService HAService SYSTAP™, LLC © 2006-2014 All Rights Reserved 6 9/19/2014
  • 7. Embedded, Single Server, HA, Scale-out http://bigdata.com http://mapgraph.io RDF Data and SPARQL Query Client Service Distributed Index Management and Query Management Functions Client Service Registrar Data Service Client Service Data Service Data Service Data Service Data Service Data Service Data Service Zookeeper Shard Locator Transaction Mgr Load Balancer Unified API Application Client Application Client Application Client Application Client Application Client Client Service SPARQL XML SPARQL JSON RDF/XML N-Triples N-Quads Turtle TriG RDF/JSON SYSTAP™, LLC © 2006-2014 All Rights Reserved 7 9/19/2014
  • 8. http://bigdata.com http://mapgraph.io And now on GPUs SYSTAP™, LLC © 2006-2014 All Rights Reserved 8 9/19/2014
  • 9. Similar models, different problems • Graph query and graph analytics (traversal/mining) – Related data models – Very different computational requirements • Many technologies are a bad match or limited solution – Key-value stores (bigtable, Accumulo, Cassandra, HBase) – Map-reduce • Anti-pattern – Dump all data into “big bucket” http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 9 9/19/2014
  • 10. Similar models, different problems • Graph query and graph analytics (traversal/mining) – Related data models – Very different computational requirements • Many technologies are a bad match or limited solution – Key-value stores (bigtable, Accumulo, Cassandra, HBase) – Map-reduce • Anti-pattern – Dump all data into “big bucket” Storage and computation patterns must be correctly matched for high performance. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 10 9/19/2014
  • 11. Optimize for the right problem • Graph analytics – Parallelism – work must be distributed and balanced. – Memory bandwidth – memory, not disk, is the bottleneck – 2D partitioning – O(log(N)) communications pattern (versus O(N*N)) • 1D design looses locality when updating link weights for reverse indices. BFS PR • Storage and computation patterns must be correctly matched for high performance. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 11 9/19/2014
  • 12. GPUs – A Game Changer for Graph Analytics • Graphs are a hard problem • Non-locality • Data dependent parallelism • Memory, PCIe bus and network are bottlenecks • Recent performance gains driven by innovations in bottom-up search, data layout, and partitioning. • GPUs deliver effective parallelism • 10x CPU FLOPS • 10x CPU/RAM bandwidth • Significant speeds up over CPU • 3 GTEPS on one GPU • 32 GTEPS on 64 GPU cluster http://bigdata.com http://mapgraph.io 3500 3000 2500 2000 1500 1000 500 0 Breadth-First Search on Graphs NVIDIA Tesla C2050 Multicore per socket Sequential 0 2 1 10 100 1000 10000 100000 Million Traversed Edges per Second Average Traversal Depth 1 1 2 1 1 2 2 2 1 3 2 3 2 1 2 2 10x Speedup on GPUs
  • 13. http://bigdata.com http://mapgraph.io GPU Hardware Trends • K40 GPU (today) • 12G RAM/GPU • 288 GB/s bandwidth • PCIe Gen 3 • Pascal GPU (Q1 2016) • 24G RAM/GPU • 1 TB/s bandwidth • Unified memory across CPU, GPUs SYSTAP™, LLC © 2006-2014 All Rights Reserved 13 9/19/2014
  • 14. Full Bandwidth Access to CPU RAM http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 14 9/19/2014
  • 15. Architecture shapes performance • The data was a scale-free graph with 2.7M vertices and 5.6M – MapGraph used a larger version of the graph (24M vertices, 25M edges) • The query was a 5-degree subgraph (depth-limited BFS) • Two main takeaways – Horizontal scaling for titan is very expensive – wrong abstraction. – GPUs are ridiculously fast. platform load (s) http://bigdata.com http://mapgraph.io query (ms) comments titan 497.00 935 4 node cluster using Cassandra neo4j 608.00 668 single node community edition bigdata 396.00 281 single node (open source) MapGraph 0.08 27 NVIDIA K20 GPU SYSTAP™, LLC © 2006-2014 All Rights Reserved 15 9/19/2014
  • 16. http://bigdata.com http://mapgraph.io MapGraph Graph Processing on GPUs http://MapGraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 16 9/19/2014
  • 17. http://bigdata.com http://mapgraph.io Think Like a Vertex • Simple APIs pageRank(Message m) { total = m.value(); vertex.val = .15 * .85 + total; for(nbr : out_neighbors) { SendMsg(nbr, vertex.val/num_out_nbrs); } } • Lots of algorithms – BFS, SSSP, Page Rank, Connected Components, Louvain Modularity, Jaccard Distance, k-means clustering, Betweenness-Centrality, Personalized Page Rank, Loopy Belief Propagation, Graph search (crisp and approximate), etc. SYSTAP™, LLC © 2006-2014 All Rights Reserved 17 9/19/2014
  • 18. GAS – a Graph-Parallel Abstraction • Graph-Parallel Vertex-Centric API ala GraphLab • “Think like a vertex” • Gather: collect information about my neighborhood • Apply: update my value • Scatter: signal adjacent vertices • Can write all sorts of graph algorithms this way – BFS, PageRank, Connected Component, Triangle Counting, Max Flow, etc. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 18 9/19/2014
  • 19. http://bigdata.com http://mapgraph.io MapGraph • High-level graph processing framework • High programmability GPU architecture Optimization techniques CUDA • High performance Comparable to low-level approach SYSTAP™, LLC © 2006-2014 All Rights Reserved 19 9/19/2014
  • 20. http://bigdata.com http://mapgraph.io MapGraph • High-level graph processing framework • High programmability GPU architecture Optimization techniques CUDA • High performance Comparable to low-level approach SYSTAP™, LLC © 2006-2014 All Rights Reserved 20 9/19/2014
  • 21. http://bigdata.com http://mapgraph.io Single GPU MapGraph (BFS) Dataset #vertices #edges Max Degree Milliseconds Webbase 1,000,005 3,105,536 23 1.2 Delaunay 2,097,152 6,291,408 4,700 24.5 Bitcoin 6,297,539 28,143,065 4,075,472 345.3 Wiki 3,566,907 45,030,389 7,061 51.0 Kron 1,048,576 89,239,674 131,505 47.7 154.0 513.6 74.8 821.3 1870.9 2,000 1,800 1,600 1,400 1,200 1,000 800 600 400 200 0 Webbase Delaunay Bitcoin Wiki Kron MTEPS SYSTAP™, LLC © 2006-2014 All Rights Reserved 21 9/19/2014
  • 22. BFS Results : MapGraph vs GraphLab 1,000.00 100.00 10.00 http://bigdata.com http://mapgraph.io 1.00 0.10 Webbase Delaunay Bitcoin Wiki Kron Speedup MapGraph Speedup vs GraphLab (BFS) GL-2 GL-4 GL-8 GL-12 MPG SYSTAP™, LLC © 2006-2014 All Rights Reserved 22 9/19/2014
  • 23. PageRank : MapGraph vs GraphLab 100.00 10.00 http://bigdata.com http://mapgraph.io 1.00 0.10 Webbase Delaunay Bitcoin Wiki Kron Speedup MapGraph Speedup vs GraphLab (Page Rank) GL-2 GL-4 GL-8 GL-12 MPG SYSTAP™, LLC © 2006-2014 All Rights Reserved 23 9/19/2014
  • 24. Graph Mining on GPU Clusters • 2D partitioning (aka vertex cuts) • Minimizes the communication volume. • Batch parallel Gather in row, Scatter in column. http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 24 9/19/2014
  • 25. Accelerated Graph Analytics http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 25 9/19/2014
  • 26. • Work spans multiple orders of magnitude. 10,000,000 1,000,000 100,000 10,000 1,000 100 http://bigdata.com http://mapgraph.io 10 1 Scale 25 Traversal 0 1 2 3 4 5 6 Frontier Size Iteration SYSTAP™, LLC © 2006-2014 All Rights Reserved 26 9/19/2014
  • 27. http://bigdata.com http://mapgraph.io Strong Scaling • Speedup on a constant problem size with more GPUs • Problem scale 25 – 2^25 vertices (33,554,432) – 2^26 directed edges (1,073,741,824) 23 22 21 20 19 18 17 16 15 14 13 10 20 30 40 50 60 70 GTEPS #GPUs Strong scaling GPUs GTEPS Time (s) 16 14.3 0.075 25 16.4 0.066 36 18.1 0.059 64 22.7 0.047 SYSTAP™, LLC © 2006-2014 All Rights Reserved 27 9/19/2014
  • 28. http://bigdata.com http://mapgraph.io Weak Scaling • Scaling the problem size with more GPUs 35 30 25 20 15 10 5 0 1 4 16 64 GTEPS #GPUs Weak scaling GPUs Scale Vertices Edges Time (s) GTEPS 1 21 2,097,152 67,108,864 0.0254 3 4 23 8,388,608 268,435,456 0.0429 6 16 25 33,554,432 1,073,741,824 0.0715 15 64 27 134,217,728 4,294,967,296 0.1478 29 SYSTAP™, LLC © 2006-2014 All Rights Reserved 28 9/19/2014
  • 29. http://bigdata.com http://mapgraph.io Highlights • For algorithms on large graphs – Memory is the bottleneck • CPUs quickly saturate the memory bus. • CPU cache thrashing limits scaling for graph traversal. • Continued performance gains for CPUs focus on reducing the #of visited edges to reduce bandwidth. • Hybrid CPU/GPU architectures offload either small degree vertices (reduce cache thrashing) or high degree vertices (if the algorithm is FLOPS bound on the CPU, e.g., BC) – Many core is the future. – GPUs are primarily known for their FLOPS, but they have high memory bandwidth and can deliver effective parallelism on parallel graph problems (with sophisticated kernels). • Scaling to very large graphs on large compute clusters – Communications bound. • Communication must be constant for perfect scaling – Hybrid partitioning seeks to reduce #of messages, size of messages, and optimize for asynchronous communications and degree-aware layouts for bottom-up search to reduce memory bandwidth. SYSTAP™, LLC © 2006-2014 All Rights Reserved 29 http://www.bigdata.com/blog
  • 30. http://bigdata.com http://mapgraph.io Bryan Thompson SYSTAP, LLC bryan@systap.com http://bigdata.com http://mapgraph.io SYSTAP™, LLC © 2006-2014 All Rights Reserved 30 9/19/2014