© 2016 MapR Technologies 9-1© 2017 MapR Technologies
Spark GraphX
Learning Goals
•  Describe GraphX
•  Define Regular, Directed, and Property Graphs
•  Create a Property Graph
•  Perform Operations on Graphs
What is a Graph?
A graph is a set of vertices connected by edges.
[Figure: two airports, DFW and ATL, drawn as vertices joined by an edge; the relationship on the edge is distance.]
Graphs are Essential to Data Mining and Machine Learning
•  Identify influential entities (people, information…)
•  Find communities
•  Understand people’s shared interests
•  Model complex data dependencies
Real World Graphs
•  Web Pages
Reference: Spark GraphX in Action
•  Recommendations (users, items, and ratings)
•  Credit Card Application Fraud
Reference: Spark Summit
•  Credit Card Fraud
Finding Communities
Count the triangles passing through each vertex; the count measures the "cohesiveness" of the local community.
•  More triangles: stronger community
•  Fewer triangles: weaker community
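The triangle count can be sketched in plain Scala without Spark. The edge list and the names `TriangleCountSketch`, `neighbors`, and `triangleCounts` below are illustrative assumptions, not part of the deck; GraphX offers its own `triangleCount()` operator for real graphs.

```scala
object TriangleCountSketch {
  // Undirected edges as pairs of vertex ids (hypothetical sample data).
  val edges: Set[(Int, Int)] = Set((1, 2), (2, 3), (1, 3), (3, 4))

  // Neighbor sets, treating every edge as undirected.
  def neighbors(es: Set[(Int, Int)]): Map[Int, Set[Int]] =
    es.toSeq
      .flatMap { case (a, b) => Seq(a -> b, b -> a) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

  // For each vertex, count the pairs of its neighbors that are themselves connected.
  def triangleCounts(es: Set[(Int, Int)]): Map[Int, Int] = {
    val nbrs = neighbors(es)
    nbrs.map { case (v, ns) =>
      val closedPairs = for {
        a <- ns.toSeq
        b <- ns.toSeq
        if a < b && nbrs(a).contains(b)
      } yield (a, b)
      v -> closedPairs.size
    }
  }

  def main(args: Array[String]): Unit =
    triangleCounts(edges).toSeq.sortBy(_._1).foreach(println)
}
```

In this sample, vertices 1, 2, and 3 each sit on one triangle, while vertex 4 sits on none, so it belongs to the weaker part of the community.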
Real World Graphs
Healthcare
Predicting User Behavior
[Figure: a graph of posts; some are labeled Liberal or Conservative, the rest are unknown (?). Techniques named on the slide: Conditional Random Field, Belief Propagation.]
Enable Joining Tables and Graphs
[Figure: Tables (User Data, Product Ratings) pass through ETL into Graphs (Friend Graph, Product Rec. Graph), followed by Join, Inf., and Prod. Rec. steps.]
Table and Graph Analytics
What is GraphX?
Spark SQL
•  Structured data
•  Querying with SQL/HQL
•  DataFrames
Spark Streaming
•  Processing of live streams
•  Micro-batching
MLlib
•  Machine learning
•  Multiple types of ML algorithms
GraphX
•  Graph processing
•  Graph-parallel computations
Spark Core
RDD Transformations and Actions
•  Task scheduling
•  Memory management
•  Fault recovery
•  Interacting with storage systems
Apache Spark GraphX
•  Spark component for graphs and graph-parallel computations
•  Combines data-parallel and graph-parallel processing in a single API
•  View data as graphs and as collections (RDD)
–  no duplication or movement of data
•  Operations for graph computation
–  includes an optimized version of Pregel
•  Provides graph algorithms and builders
Regular Graphs vs Directed Graphs
•  Regular graph: each vertex has the same number of edges; in this example the edges also have no direction, so the relationship holds both ways
•  Example: Facebook friends (relationship: friends)
–  Bob is a friend of Carol
–  Carol is a friend of Bob
•  Directed graph: edges have a direction
•  Example: Twitter followers (relationship: follows)
–  Carol follows Oprah
–  Oprah does not follow Carol
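The difference between the two examples can be sketched in plain Scala. This is only an illustration under assumed names (`DirectionSketch`, `Edge`, `isSymmetric`); a mutual relationship is modeled here as a pair of directed edges, which is also how it would be stored in GraphX, whose graphs are directed.

```scala
object DirectionSketch {
  // A directed edge: src -> dst.
  final case class Edge(src: String, dst: String)

  // Friendship is mutual, so it is stored as a pair of directed edges.
  val friends: Set[Edge] = Set(Edge("Bob", "Carol"), Edge("Carol", "Bob"))

  // Following is one-way.
  val follows: Set[Edge] = Set(Edge("Carol", "Oprah"))

  // True when every edge has a matching edge in the opposite direction.
  def isSymmetric(es: Set[Edge]): Boolean =
    es.forall(e => es.contains(Edge(e.dst, e.src)))

  def main(args: Array[String]): Unit = {
    println(isSymmetric(friends)) // true
    println(isSymmetric(follows)) // false
  }
}
```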
Property Graph
[Figure: airports LAX and SJC as vertices, joined by edges Flight 123 and Flight 1002; both vertices and edges are labeled with properties.]
Flight Example with GraphX
[Figure: airports SFO, ORD, and DFW as vertices; routes as edges labeled with their distance.]
Originating Airport    Destination Airport    Distance
SFO                    ORD                    1800 miles
ORD                    DFW                    800 miles
DFW                    SFO                    1400 miles
Vertex Table
Id    Property
1     SFO
2     ORD
3     DFW
Edge Table
SrcId    DestId    Property
1        2         1800
2        3         800
3        1         1400
Spark Property Graph class
class Graph[VD, ED] {
val vertices: VertexRDD[VD]
val edges: EdgeRDD[ED]
}
Create a Property Graph
1. Import required classes
2. Create vertex RDD
3. Create edge RDD
4. Create graph
1. Import required classes
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
Create a Property Graph: Data Set
Vertices: Airports
Vertex ID    Property (V)
Id (Long)    Name (String)
Edges: Routes
Source ID    Dest ID    Property (E)
Id           Id         Distance (Integer)
2. Create vertex RDD
// create vertices RDD with ID and Name
val vertices = Array((1L, "SFO"), (2L, "ORD"), (3L, "DFW"))
val vRDD = sc.parallelize(vertices)
vRDD.take(1)
// Array((1,SFO))
3. Create edge RDD
// create routes RDD with srcid, destid, distance
val edges = Array(Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400))
val eRDD = sc.parallelize(edges)
eRDD.take(2)
// Array(Edge(1,2,1800), Edge(2,3,800))
4. Create graph
// define default vertex nowhere
val nowhere = "nowhere"
// build initial graph from the vertex and edge RDDs
val graph = Graph(vRDD, eRDD, nowhere)
graph.vertices.take(3).foreach(print)
// (2,ORD)(1,SFO)(3,DFW)
graph.edges.take(3).foreach(print)
// Edge(1,2,1800)Edge(2,3,800)Edge(3,1,1400)
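The role of the default vertex can be sketched in plain Scala. In GraphX, the default attribute is given to vertices that appear in an edge but not in the vertex collection; the code below only models that rule. The route to vertex 4 and the names `DefaultVertexSketch` and `buildVertices` are hypothetical, added for illustration.

```scala
object DefaultVertexSketch {
  final case class Edge(src: Long, dst: Long, attr: Int)

  // Vertices listed explicitly, as in the slides.
  val listed: Map[Long, String] = Map(1L -> "SFO", 2L -> "ORD", 3L -> "DFW")

  // The three routes from the slides plus one hypothetical route to an unlisted vertex 4.
  val edges: Seq[Edge] =
    Seq(Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400), Edge(3L, 4L, 500))

  // Every id mentioned by an edge gets the default attribute unless it is listed.
  def buildVertices(vs: Map[Long, String], es: Seq[Edge], default: String): Map[Long, String] = {
    val mentioned = es.flatMap(e => Seq(e.src, e.dst)).toSet
    mentioned.map(id => id -> default).toMap ++ vs
  }

  def main(args: Array[String]): Unit =
    buildVertices(listed, edges, "nowhere").toSeq.sortBy(_._1).foreach(println)
}
```

Listed vertices keep their names; only vertex 4 falls back to the default value.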
Graph Operators
To answer questions such as:
•  How many airports are there?
•  How many flight routes are there?
•  What are the longest distance routes?
•  Which airport has the most incoming flights?
•  What are the top 10 flights?
Graph Operators
To find information about the graph:
Operator       Description
numEdges       Number of edges (Long)
numVertices    Number of vertices (Long)
inDegrees      The in-degree of each vertex (VertexRDD[Int])
outDegrees     The out-degree of each vertex (VertexRDD[Int])
degrees        The degree of each vertex (VertexRDD[Int])
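What these operators compute can be sketched in plain Scala on the three-route example. This is a model of the definitions, not the GraphX implementation; the names `DegreeSketch`, `inDegrees`, `outDegrees`, and `degrees` are assumptions chosen to mirror the operator names.

```scala
object DegreeSketch {
  final case class Edge(src: Long, dst: Long, distance: Int)

  // The three routes from the slides: SFO(1) -> ORD(2) -> DFW(3) -> SFO(1).
  val edges: Seq[Edge] = Seq(Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400))

  def numEdges(es: Seq[Edge]): Long = es.size.toLong

  // In-degree: number of edges arriving at each vertex.
  def inDegrees(es: Seq[Edge]): Map[Long, Int] =
    es.groupBy(_.dst).map { case (v, in) => v -> in.size }

  // Out-degree: number of edges leaving each vertex.
  def outDegrees(es: Seq[Edge]): Map[Long, Int] =
    es.groupBy(_.src).map { case (v, out) => v -> out.size }

  // Degree: in-degree plus out-degree.
  def degrees(es: Seq[Edge]): Map[Long, Int] = {
    val in = inDegrees(es)
    val out = outDegrees(es)
    (in.keySet ++ out.keySet).map(v => v -> (in.getOrElse(v, 0) + out.getOrElse(v, 0))).toMap
  }

  def main(args: Array[String]): Unit = {
    println(numEdges(edges))
    degrees(edges).toSeq.sortBy(_._1).foreach(println)
  }
}
```

A vertex with no incoming edges has no entry in the `inDegrees` map here, so a lookup should supply a default of 0.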
// How many airports?
val numairports = graph.numVertices
// Long = 3
// How many routes?
val numroutes = graph.numEdges
// Long = 3
// Which routes are longer than 1000 miles?
graph.edges.filter {
  case Edge(org_id, dest_id, distance) => distance > 1000
}.take(3)
// Array(Edge(1,2,1800), Edge(3,1,1400))
Triplets
// Triplets add source and destination properties to Edges
graph.triplets.take(3).foreach(println)
// ((1,SFO),(2,ORD),1800)
// ((2,ORD),(3,DFW),800)
// ((3,DFW),(1,SFO),1400)
Triplets: What are the longest routes?
// print out longest routes
graph.triplets.sortBy(_.attr, ascending = false)
  .map(triplet => "Distance " + triplet.attr.toString + " from " +
    triplet.srcAttr + " to " + triplet.dstAttr)
  .collect.foreach(println)
// Distance 1800 from SFO to ORD
// Distance 1400 from DFW to SFO
// Distance 800 from ORD to DFW
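A triplet can be thought of as an edge joined with the properties of its two endpoints. The plain-Scala sketch below models that join on the slide data; `TripletSketch`, `Triplet`, `triplets`, and `longestFirst` are illustrative names, not GraphX classes.

```scala
object TripletSketch {
  final case class Edge(src: Long, dst: Long, attr: Int)
  final case class Triplet(srcId: Long, srcAttr: String, dstId: Long, dstAttr: String, attr: Int)

  val vertices: Map[Long, String] = Map(1L -> "SFO", 2L -> "ORD", 3L -> "DFW")
  val edges: Seq[Edge] = Seq(Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400))

  // Join each edge with the properties of its source and destination vertices.
  def triplets(vs: Map[Long, String], es: Seq[Edge]): Seq[Triplet] =
    es.map(e => Triplet(e.src, vs(e.src), e.dst, vs(e.dst), e.attr))

  // Sort by edge attribute, longest first, and format one line per route.
  def longestFirst(ts: Seq[Triplet]): Seq[String] =
    ts.sortBy(t => -t.attr)
      .map(t => s"Distance ${t.attr} from ${t.srcAttr} to ${t.dstAttr}")

  def main(args: Array[String]): Unit =
    longestFirst(triplets(vertices, edges)).foreach(println)
}
```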
Graph Operators
Which airport has the most incoming flights? (real dataset)
// Define a function to compute the highest degree vertex
def max(a:(VertexId,Int),b:(VertexId, Int)):(VertexId, Int) =
{
if (a._2 > b._2) a else b
}
// Which Airport has the most incoming flights?
val maxInDegree:(VertexId, Int)= graph.inDegrees.reduce(max)
// (10397,152) ATL
Graph Operators
Which 3 airports have the most incoming flights? (real dataset)
// get top 3
val maxIncoming = graph.inDegrees.collect
.sortWith(_._2 > _._2)
.map(x => (airportMap(x._1), x._2)).take(3)
maxIncoming.foreach(println)
(ATL,152)
(ORD,145)
(DFW,143)
Graph Operators
Caching Graphs
Operator                       Description
cache()                        Caches the vertices and edges; default level is MEMORY_ONLY
persist(newLevel)              Caches the vertices and edges at the specified storage level; returns a reference to this graph
unpersist(blocking)            Uncaches both vertices and edges of this graph
unpersistVertices(blocking)    Uncaches only the vertices, leaving edges alone
Class Discussion
1.  How many airports are there?
•  In our graph, what represents airports?
•  Which operator could you use to find the number of airports?
2.  How many routes are there?
•  In our graph, what represents routes?
•  Which operator could you use to find the number of routes?
How Many Airports are There?
How many airports are there?
•  In our graph, what represents airports?
Vertices
•  Which operator could you use to find the number of airports?
graph.numVertices
Pregel API
•  GraphX exposes a variant of the Pregel API
•  Iterative graph processing
–  Iterations of message passing between vertices
The Graph-Parallel Abstraction
A user-defined vertex program runs on each graph vertex
•  Vertex programs interact using messages (e.g. Pregel)
•  Parallelism: run multiple vertex programs simultaneously
Pregel Operator
•  Super step 1: initial message received at each vertex; message computed at each vertex
•  Super step 2: sum of messages received at each vertex; message computed at each vertex
•  Super step n: sum of messages received at each vertex; message computed at each vertex
Loop until no messages are left OR max iterations is reached
Pregel Operator: Example
Use Pregel to find the cheapest airfare:
// starting vertex
val sourceId: VertexId = 13024
// a graph with edges containing airfare cost calculation
val gg = graph.mapEdges(e => 50.toDouble + e.attr.toDouble / 20)
// initialize graph, all vertices except source have distance infinity
val initialGraph = gg.mapVertices((id, _) =>
  if (id == sourceId) 0.0 else Double.PositiveInfinity)
// call pregel on graph
val sssp = initialGraph.pregel(Double.PositiveInfinity)(
// Vertex Program
(id, distCost, newDistCost) => math.min(distCost, newDistCost),
triplet => {
// Send Message
if (triplet.srcAttr + triplet.attr < triplet.dstAttr) {
Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
} else {
Iterator.empty
}
},
// Merge Message
(a,b) => math.min(a,b)
)
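The superstep loop behind this call can be sketched in plain Scala on the three-airport graph. This is a simplified model, not the GraphX implementation: it rescans every edge in each superstep and runs on local collections. The names `PregelSketch`, `airfare`, and `shortestCosts` are assumptions; the cost formula (50 plus distance divided by 20) is the one used in the example.

```scala
object PregelSketch {
  final case class Edge(src: Long, dst: Long, cost: Double)

  // Cost model from the example: a flat 50 plus the distance divided by 20.
  def airfare(distanceMiles: Int): Double = 50.toDouble + distanceMiles.toDouble / 20

  // SFO(1) -> ORD(2) -> DFW(3) -> SFO(1), priced from the slide distances.
  val edges: Seq[Edge] =
    Seq(Edge(1L, 2L, airfare(1800)), Edge(2L, 3L, airfare(800)), Edge(3L, 1L, airfare(1400)))

  def shortestCosts(vertexIds: Set[Long], es: Seq[Edge], source: Long,
                    maxIterations: Int = 20): Map[Long, Double] = {
    // Vertex program: keep the smaller of the current value and the incoming message.
    def vprog(current: Double, msg: Double): Double = math.min(current, msg)

    // Initial superstep: every vertex receives the initial message (infinity).
    var state: Map[Long, Double] = vertexIds.map { id =>
      val init = if (id == source) 0.0 else Double.PositiveInfinity
      id -> vprog(init, Double.PositiveInfinity)
    }.toMap

    // Send step: an edge emits a message only if it would lower the destination's cost.
    def send(s: Map[Long, Double]): Seq[(Long, Double)] =
      es.collect { case Edge(src, dst, cost) if s(src) + cost < s(dst) => dst -> (s(src) + cost) }

    var messages = send(state)
    var iteration = 0
    // Loop until no messages are left or the iteration limit is reached.
    while (messages.nonEmpty && iteration < maxIterations) {
      // Merge step: combine messages addressed to the same vertex.
      val merged = messages.groupBy(_._1).map { case (id, ms) => id -> ms.map(_._2).min }
      // Run the vertex program on every vertex that received a message.
      state = state ++ merged.map { case (id, m) => id -> vprog(state(id), m) }
      messages = send(state)
      iteration += 1
    }
    state
  }

  def main(args: Array[String]): Unit =
    shortestCosts(Set(1L, 2L, 3L), edges, source = 1L).toSeq.sortBy(_._1).foreach(println)
}
```

Starting from SFO, the cheapest cost is 140.0 to ORD and 230.0 to DFW (140.0 plus the 90.0 ORD to DFW leg).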
// routes, lowest flight cost
println(sssp.edges.take(4).mkString("\n"))
// Edge(10135,10397,84.6)
// Edge(10135,13930,82.7)
// Edge(10140,10397,113.45)
// Edge(10140,10821,133.5)
PageRank
•  Measures the importance of vertices in a graph
•  Incoming links count as votes
•  Incoming links from important vertices carry more weight
•  Returns a graph whose vertex attributes are the PageRank values
graph.pageRank(tolerance).vertices
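The idea can be sketched as a plain-Scala iteration that stops once no rank changes by more than the tolerance. This is a simplified textbook model, not the GraphX implementation; the reset probability of 0.15, the sample edges, and the names `PageRankSketch` and `pageRank` are assumptions made for illustration.

```scala
object PageRankSketch {
  // Hypothetical sample links: vertices 2 and 3 both link to 1; vertex 1 links to 2.
  val sampleVertices: Set[Long] = Set(1L, 2L, 3L)
  val sampleEdges: Seq[(Long, Long)] = Seq((2L, 1L), (3L, 1L), (1L, 2L))

  def pageRank(vertexIds: Set[Long], edges: Seq[(Long, Long)],
               tolerance: Double, resetProb: Double = 0.15): Map[Long, Double] = {
    val outDegree: Map[Long, Int] = edges.groupBy(_._1).map { case (v, es) => v -> es.size }
    var ranks: Map[Long, Double] = vertexIds.map(v => v -> 1.0).toMap
    var delta = Double.PositiveInfinity
    while (delta > tolerance) {
      // Each vertex splits its current rank evenly across its outgoing links.
      val contributions: Map[Long, Double] = edges
        .map { case (src, dst) => dst -> ranks(src) / outDegree(src) }
        .groupBy(_._1)
        .map { case (v, cs) => v -> cs.map(_._2).sum }
      // New rank: a fixed share plus the damped sum of incoming contributions.
      val updated: Map[Long, Double] = vertexIds.map { v =>
        v -> (resetProb + (1 - resetProb) * contributions.getOrElse(v, 0.0))
      }.toMap
      // Largest change across all vertices in this iteration.
      delta = vertexIds.map(v => math.abs(updated(v) - ranks(v))).max
      ranks = updated
    }
    ranks
  }

  def main(args: Array[String]): Unit =
    pageRank(sampleVertices, sampleEdges, tolerance = 0.001).toSeq.sortBy(-_._2).foreach(println)
}
```

In the sample, vertex 1 ranks highest because it has two incoming links, vertex 2 follows because its one incoming link comes from the top-ranked vertex, and vertex 3 has no incoming links.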
Page Rank: Example
Use Page Rank:
// use pageRank
val ranks = graph.pageRank(0.1).vertices
// join the ranks with the map of airport id to name
val temp= ranks.join(airports)
temp.take(1)
// Array((15370,(0.5365013694244737,TUL)))
// sort by ranking
val temp2 = temp.sortBy(_._2._1, false)
temp2.take(2)
//Array((10397,(5.431032677813346,ATL)), (13930,(5.4148119418905765,ORD)))
// get just the airport names
val impAirports =temp2.map(_._2._2)
impAirports.take(4)
//res6: Array[String] = Array(ATL, ORD, DFW, DEN)
Use Case
•  Monitor air traffic at airports
•  Monitor delays
•  Analyze airport and routes overall
•  Analyze airport and routes by airline
Learn More
•  https://www.mapr.com/blog/how-get-started-using-apache-spark-graphx-scala
•  GraphX Programming Guide: http://spark.apache.org/docs/latest/graphx-programming-guide.html
•  MapR announces Free Complete Apache Spark Training and Developer Certification: https://www.mapr.com/company/press-releases/mapr-unveils-free-complete-apache-spark-training-and-developer-certification
•  Free Spark On Demand Training: http://learn.mapr.com/?q=spark#-l
•  Get Certified on Spark with MapR Spark Certification: http://learn.mapr.com/?q=spark#certification-1,-l
[Figure: MapR Converged Data Platform. Open Source Engines & Tools and Commercial Engines & Applications sit on Enterprise-Grade Platform Services: Web-Scale Storage (MapR-FS), Database (MapR-DB), and Event Streaming (MapR Streams), accessed through the HDFS, POSIX, NFS, Kafka, HBase, and OJAI APIs.]

Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBaseCarol McDonald
 
Apache Spark streaming and HBase
Apache Spark streaming and HBaseApache Spark streaming and HBase
Apache Spark streaming and HBaseCarol McDonald
 

Plus de Carol McDonald (20)

Introduction to machine learning with GPUs
Introduction to machine learning with GPUsIntroduction to machine learning with GPUs
Introduction to machine learning with GPUs
 
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
Streaming healthcare Data pipeline using Apache APIs: Kafka and Spark with Ma...
 
Analysis of Popular Uber Locations using Apache APIs: Spark Machine Learning...
Analysis of Popular Uber Locations using Apache APIs:  Spark Machine Learning...Analysis of Popular Uber Locations using Apache APIs:  Spark Machine Learning...
Analysis of Popular Uber Locations using Apache APIs: Spark Machine Learning...
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine Learning
 
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
 
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
 
How Big Data is Reducing Costs and Improving Outcomes in Health Care
How Big Data is Reducing Costs and Improving Outcomes in Health CareHow Big Data is Reducing Costs and Improving Outcomes in Health Care
How Big Data is Reducing Costs and Improving Outcomes in Health Care
 
Demystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningDemystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep Learning
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
 
Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures
 
Spark machine learning predicting customer churn
Spark machine learning predicting customer churnSpark machine learning predicting customer churn
Spark machine learning predicting customer churn
 
Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1
 
Applying Machine Learning to Live Patient Data
Applying Machine Learning to  Live Patient DataApplying Machine Learning to  Live Patient Data
Applying Machine Learning to Live Patient Data
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka API
 
Advanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming DataAdvanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming Data
 
Apache Spark Machine Learning
Apache Spark Machine LearningApache Spark Machine Learning
Apache Spark Machine Learning
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBase
 
Apache Spark streaming and HBase
Apache Spark streaming and HBaseApache Spark streaming and HBase
Apache Spark streaming and HBase
 

Dernier

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 

Dernier (20)

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 

Spark GraphX

  • 1. ® © 2016 MapR Technologies 9-1© 2017 MapR Technologies ® Spark GraphX
  • 2. Learning Goals
      • Describe GraphX
      • Define Regular, Directed, and Property Graphs
      • Create a Property Graph
      • Perform Operations on Graphs
  • 3. Learning Goals
      • Describe GraphX
      • Define Regular, Directed, and Property Graphs
      • Create a Property Graph
      • Perform Operations on Graphs
  • 4. What is a Graph?
      Graph: vertices connected by edges.
      [Figure labels: vertex, edge, 5, 1]
  • 5. What is a Graph?
      A set of vertices connected by edges.
      [Figure: vertices DFW and ATL joined by an edge; relationship: distance]
  • 6. Graphs are Essential to Data Mining and Machine Learning
      • Identify influential entities (people, information…)
      • Find communities
      • Understand people’s shared interests
      • Model complex data dependencies
  • 7. Real World Graphs
      • Web Pages
      Reference: Spark GraphX in Action
  • 8. Real World Graphs
      • Web Pages
      Reference: Spark GraphX in Action
  • 9. Real World Graphs
      • Web Pages
      Reference: Spark GraphX in Action
  • 10. Real World Graphs
      Reference: Spark GraphX in Action
  • 11. Real World Graphs
      Reference: Spark GraphX in Action
  • 12. Real World Graphs
      Reference: Spark GraphX in Action
  • 13. Real World Graphs
      Reference: Spark GraphX in Action
  • 14. Real World Graphs
      Reference: Spark GraphX in Action
  • 15. Real World Graphs
      • Recommendations
      [Figure labels: Users, Ratings, Items]
  • 16. Real World Graphs
      • Credit Card Application Fraud
      Reference: Spark Summit
  • 17. Real World Graphs
      • Credit Card Fraud
  • 18. Finding Communities
      Count the triangles passing through each vertex. The count measures the “cohesiveness” of the local community.
      • More triangles: stronger community
      • Fewer triangles: weaker community
      [Figure labels: 1, 2, 3, 4]
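The triangle-counting idea on the slide above can be tried with GraphX's built-in `triangleCount` operator. The following is a minimal sketch, assuming Spark and GraphX are on the classpath (for example in `spark-shell`); the toy graph, the names `toyEdges`, `toyGraph`, and `triangles`, and the application name are hypothetical and not part of the original deck.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

// Reuse the shell's context if one exists, otherwise start a local one
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("triangle-count-sketch").setMaster("local[*]"))

// Hypothetical toy graph: vertices 1, 2, 3 form a triangle; vertex 4 hangs off vertex 3.
// Edges are written with srcId < dstId and the graph is partitioned,
// which older GraphX releases require before calling triangleCount.
val toyEdges = sc.parallelize(Seq(
  Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(1L, 3L, 1), Edge(3L, 4L, 1)))

val toyGraph = Graph.fromEdges(toyEdges, 0)
  .partitionBy(PartitionStrategy.RandomVertexCut)

// The result graph's vertex attribute is the number of triangles through that vertex
val triangles: Map[Long, Int] = toyGraph.triangleCount().vertices.collect().toMap

triangles.toSeq.sortBy(_._1).foreach(println)
// (1,1)
// (2,1)
// (3,1)
// (4,0)
```

Vertices 1, 2, and 3 each sit on one triangle, while vertex 4 sits on none, so it belongs to the weaker part of the community in the sense used on the slide.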
  • 19. Real World Graphs
      • Healthcare
  • 20. Predicting User Behavior
      [Figure: posts labeled Liberal or Conservative, with unlabeled posts marked “?”; techniques named: Conditional Random Field, Belief Propagation]
  • 21. Enable Joining Tables and Graphs
      [Figure: pipeline from tables to graphs; labels: User Data, Product Ratings, Friend Graph, ETL, Product Rec. Graph, Join, Inf., Prod. Rec.]
  • 22. Table and Graph Analytics
  • 23. What is GraphX?
      Spark SQL
      • Structured Data
      • Querying with SQL/HQL
      • DataFrames
      Spark Streaming
      • Processing of live streams
      • Micro-batching
      MLlib
      • Machine Learning
      • Multiple types of ML algorithms
      GraphX
      • Graph processing
      • Graph parallel computations
      Spark Core (RDD Transformations and Actions)
      • Task scheduling
      • Memory management
      • Fault recovery
      • Interacting with storage systems
  • 24. Apache Spark GraphX
      • Spark component for graphs and graph-parallel computations
      • Combines data-parallel and graph-parallel processing in a single API
      • View data as graphs and as collections (RDD)
        – no duplication or movement of data
      • Operations for graph computation
        – includes an optimized version of Pregel
      • Provides graph algorithms and builders
  • 25. Learning Goals
      • Describe GraphX
      • Define Regular, Directed, and Property Graphs
      • Create a Property Graph
      • Perform Operations on Graphs
  • 26. Regular Graphs vs Directed Graphs
      • Regular graph: each vertex has the same number of edges
      • Example: Facebook friends
        – Bob is a friend of Carol
        – Carol is a friend of Bob
      In this example the relationship is mutual, so the edge has no direction.
      [Figure: vertices Carol and Bob joined by an edge; relationship: Friends]
  • 27. Regular Graphs vs Directed Graphs
      • Directed graph: edges have a direction
      • Example: Twitter followers
        – Carol follows Oprah
        – Oprah does not follow Carol
      [Figure: vertices including Carol and Oprah; relationship: follows; other labels: 1, 2, 3, 6]
  • 28. Property Graph
      [Figure: vertices LAX and SJC with properties; edges Flight 123 and Flight 1002 with properties]
  • 29. Flight Example with GraphX
      Originating Airport   Destination Airport   Distance
      SFO                   ORD                   1800 miles
      ORD                   DFW                   800 miles
      DFW                   SFO                   1400 miles
      [Figure: vertices SFO, ORD, and DFW joined by edges of 1800, 800, and 1400 miles]
  • 30. Flight Example with GraphX
      Vertex Table
      Id   Property
      1    SFO
      2    ORD
      3    DFW
      Edge Table
      SrcId   DestId   Property
      1       2        1800
      2       3        800
      3       1        1400
  • 31. Spark Property Graph class
      class Graph[VD, ED] {
        val vertices: VertexRDD[VD]
        val edges: EdgeRDD[ED]
      }
  • 32. Learning Goals
      • Define GraphX
      • Define Regular, Directed, and Property Graphs
      • Create a Property Graph
      • Perform Operations on Graphs
  • 33. Create a Property Graph
      1. Import required classes
      2. Create vertex RDD
      3. Create edge RDD
      4. Create graph
  • 34. Create a Property Graph: Step 1, import required classes
      import org.apache.spark._
      import org.apache.spark.graphx._
      import org.apache.spark.rdd.RDD
  • 35. Create a Property Graph: Data Set
      Vertices: Airports
      Vertex ID    Property (V)
      Id (Long)    Name (String)
      Edges: Routes
      Source ID   Dest ID   Property (E)
      Id          Id        Distance (Integer)
  • 36. Create a Property Graph: Step 2, create vertex RDD
      // create vertices RDD with ID and Name
      val vertices = Array((1L, "SFO"), (2L, "ORD"), (3L, "DFW"))
      val vRDD = sc.parallelize(vertices)
      vRDD.take(1)
      // Array((1,SFO))
      Id   Property
      1    SFO
      2    ORD
      3    DFW
  • 37. Create a Property Graph: Step 3, create edge RDD
      // create routes RDD with srcid, destid, distance
      val edges = Array(Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400))
      val eRDD = sc.parallelize(edges)
      eRDD.take(2)
      // Array(Edge(1,2,1800), Edge(2,3,800))
      SrcId   DestId   Property
      1       2        1800
      2       3        800
      3       1        1400
  • 38. Create a Property Graph: Step 4, create graph
      // define default vertex nowhere
      val nowhere = "nowhere"
      // build initial graph from the vertex and edge RDDs (not the local arrays)
      val graph = Graph(vRDD, eRDD, nowhere)
      graph.vertices.take(3).foreach(print)
      // (2,ORD)(1,SFO)(3,DFW)
      graph.edges.take(3).foreach(print)
      // Edge(1,2,1800) Edge(2,3,800) Edge(3,1,1400)
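The third argument to `Graph(...)` is the default vertex attribute: GraphX assigns it to any vertex ID that appears in an edge but has no entry in the vertex RDD. The sketch below illustrates this under stated assumptions: Spark and GraphX are on the classpath, and the extra route to vertex 4, its distance of 500, and the names `airports`, `routes`, `g`, and `names` are hypothetical additions for illustration only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Reuse the shell's context if one exists, otherwise start a local one
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("default-vertex-sketch").setMaster("local[*]"))

// The three airports from the slides
val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))

// The three routes from the slides, plus a hypothetical route to vertex 4,
// which has no entry in the vertex RDD
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400), Edge(3L, 4L, 500)))

// "nowhere" is used for every vertex that only appears in an edge
val g = Graph(airports, routes, "nowhere")

val names: Map[Long, String] = g.vertices.collect().toMap
println(names(4L)) // nowhere
```

This is why the default attribute is worth choosing deliberately: it shows up in query results whenever the edge data references an airport that is missing from the vertex data.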
  • 39. Learning Goals
      • Define GraphX
      • Define Regular, Directed, and Property Graphs
      • Create a Property Graph
      • Perform Operations on Graphs
  • 40. Graph Operators
      To answer questions such as:
      • How many airports are there?
      • How many flight routes are there?
      • What are the longest distance routes?
      • Which airport has the most incoming flights?
      • What are the top 10 flights?
  • 41. Graph Class
  • 42. Graph Operators
      To find information about the graph:
      Operator      Description
      numEdges      Number of edges (Long)
      numVertices   Number of vertices (Long)
      inDegrees     The in-degree of each vertex (VertexRDD[Int])
      outDegrees    The out-degree of each vertex (VertexRDD[Int])
      degrees       The degree of each vertex (VertexRDD[Int])
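The degree operators in the table above can be compared side by side on a small graph. This is a sketch only, assuming Spark and GraphX are on the classpath; the route directions below are a hypothetical variant of the deck's three airports (SFO to DFW instead of DFW to SFO), chosen so that one vertex has no incoming edges and one has no outgoing edges.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Reuse the shell's context if one exists, otherwise start a local one
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("degree-sketch").setMaster("local[*]"))

val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))

// Hypothetical routes: SFO -> ORD, SFO -> DFW, ORD -> DFW
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(1L, 3L, 1400), Edge(2L, 3L, 800)))

val g = Graph(airports, routes, "nowhere")

// Each operator returns a VertexRDD[Int]; collect to a local Map for inspection
val inDeg: Map[Long, Int] = g.inDegrees.collect().toMap
val outDeg: Map[Long, Int] = g.outDegrees.collect().toMap
val deg: Map[Long, Int] = g.degrees.collect().toMap

println(inDeg.toSeq.sortBy(_._1))  // SFO (1) is absent: it has no incoming routes
println(outDeg.toSeq.sortBy(_._1)) // DFW (3) is absent: it has no outgoing routes
println(deg.toSeq.sortBy(_._1))    // every airport touches two routes
```

Note that `inDegrees` and `outDegrees` omit vertices whose degree in that direction is zero, so a lookup for such a vertex needs a fallback such as `inDeg.getOrElse(1L, 0)`.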
  • 43. Graph Operators
      // How many airports?
      val numairports = graph.numVertices
      // Long = 3
      // How many routes?
      val numroutes = graph.numEdges
      // Long = 3
      // routes > 1000 miles distance?
      graph.edges.filter { case Edge(org_id, dest_id, distance) => distance > 1000 }.take(3)
      // Array(Edge(1,2,1800), Edge(3,1,1400))
  • 44. Triplets
      // Triplets add source and destination properties to Edges
      graph.triplets.take(3).foreach(println)
      // ((1,SFO),(2,ORD),1800)
      // ((2,ORD),(3,DFW),800)
      // ((3,DFW),(1,SFO),1400)
  • 45. Triplets
      What are the longest routes?
      // print out longest routes
      graph.triplets.sortBy(_.attr, ascending = false)
        .map(triplet => "Distance " + triplet.attr.toString + " from " + triplet.srcAttr + " to " + triplet.dstAttr)
        .collect.foreach(println)
      // Distance 1800 from SFO to ORD
      // Distance 1400 from DFW to SFO
      // Distance 800 from ORD to DFW
  • 46. Graph Operators
      Which airport has the most incoming flights? (real dataset)
      // Define a function to compute the highest degree vertex
      def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
        if (a._2 > b._2) a else b
      }
      // Which airport has the most incoming flights?
      val maxInDegree: (VertexId, Int) = graph.inDegrees.reduce(max)
      // (10397,152) ATL
  • 47. Graph Operators
      Which 3 airports have the most incoming flights? (real dataset)
      // get top 3
      val maxIncoming = graph.inDegrees.collect
        .sortWith(_._2 > _._2)
        .map(x => (airportMap(x._1), x._2)).take(3)
      maxIncoming.foreach(println)
      // (ATL,152)
      // (ORD,145)
      // (DFW,143)
  • 48. Graph Operators: Caching Graphs
      Operator                      Description
      cache()                       Caches the vertices and edges; default level is MEMORY_ONLY
      persist(newLevel)             Caches the vertices and edges at the specified storage level; returns a reference to this graph
      unpersist(blocking)           Uncaches both vertices and edges of this graph
      unpersistVertices(blocking)   Uncaches only the vertices, leaving edges alone
  • 49. Graph Class
  • 50. Class Discussion
      1. How many airports are there?
         • In our graph, what represents airports?
         • Which operator could you use to find the number of airports?
      2. How many routes are there?
         • In our graph, what represents routes?
         • Which operator could you use to find the number of routes?
  • 51. How Many Airports are There?
      • In our graph, what represents airports? Vertices
      • Which operator could you use to find the number of airports? graph.numVertices
  • 52. Pregel API
      • GraphX exposes a variant of the Pregel API
      • Iterative graph processing
        – Iterations of message passing between vertices
  • 53. The Graph-Parallel Abstraction
      A user-defined vertex program runs on each graph vertex
      • Using messages (e.g. Pregel)
      • Parallelism: run multiple vertex programs simultaneously
  • 54. Pregel Operator
      Superstep 1: initial message received at each vertex; message computed at each vertex
      Superstep 2: sum of messages received at each vertex; message computed at each vertex
      Superstep n: sum of messages received at each vertex; message computed at each vertex
      Loop until no messages are left or the maximum number of iterations is reached
  • 55. Pregel Operator: Example
      Use Pregel to find the cheapest airfare:
      // starting vertex
      val sourceId: VertexId = 13024
      // a graph with edges containing airfare cost calculation
      val gg = graph.mapEdges(e => 50.toDouble + e.attr.toDouble / 20)
      // initialize graph, all vertices except source have distance infinity
      val initialGraph = gg.mapVertices((id, _) =>
        if (id == sourceId) 0.0 else Double.PositiveInfinity)
  • 56. Graph Class: Pregel
  • 57. Pregel Operator: Example
      Use Pregel to find the cheapest airfare:
      // call pregel on graph
      val sssp = initialGraph.pregel(Double.PositiveInfinity)(
        // Vertex Program
        (id, distCost, newDistCost) => math.min(distCost, newDistCost),
        // Send Message
        triplet => {
          if (triplet.srcAttr + triplet.attr < triplet.dstAttr) {
            Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
          } else {
            Iterator.empty
          }
        },
        // Merge Message
        (a, b) => math.min(a, b)
      )
  • 58. Pregel Operator: Example
      Use Pregel to find the cheapest airfare:
      // routes, lowest flight cost
      println(sssp.edges.take(4).mkString("\n"))
      // Edge(10135,10397,84.6)
      // Edge(10135,13930,82.7)
      // Edge(10140,10397,113.45)
      // Edge(10140,10821,133.5)
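The Pregel slides above use a real flight dataset that is not included in the transcript. As a minimal end-to-end sketch, the same vertex program, send-message function, and merge function can be run on the three-airport graph from the earlier slides, with SFO as a hypothetical starting vertex. It assumes Spark and GraphX are on the classpath; the names `fares` and `cheapest` are illustrative. The cheapest cost to reach each airport ends up in the vertex attributes of the result, so the sketch reads `sssp.vertices`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

// Reuse the shell's context if one exists, otherwise start a local one
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("pregel-sssp-sketch").setMaster("local[*]"))

val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400)))
val graph = Graph(airports, routes, "nowhere")

// Hypothetical starting vertex: SFO
val sourceId: VertexId = 1L

// Same fare formula as the slides: 50 + distance / 20
val fares = graph.mapEdges(e => 50.0 + e.attr.toDouble / 20)

// Every vertex except the source starts at infinite cost
val initialGraph = fares.mapVertices((id, _) =>
  if (id == sourceId) 0.0 else Double.PositiveInfinity)

val sssp = initialGraph.pregel(Double.PositiveInfinity)(
  // Vertex program: keep the cheaper of the current and the incoming cost
  (id, cost, newCost) => math.min(cost, newCost),
  // Send message: offer the destination a cheaper cost if this edge provides one
  triplet =>
    if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
      Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
    else Iterator.empty,
  // Merge message: of several offers, keep the cheapest
  (a, b) => math.min(a, b))

val cheapest: Map[Long, Double] = sssp.vertices.collect().toMap
cheapest.toSeq.sortBy(_._1).foreach(println)
// (1,0.0)
// (2,140.0)
// (3,230.0)
```

The fares are 140 for SFO to ORD and 90 for ORD to DFW, so DFW is reached at 230 after two rounds of messages; the DFW to SFO edge never improves on the source's cost of 0, and the loop stops when no vertex sends a message.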
  • 59. PageRank
      • Measures the importance of vertices in a graph
      • Incoming links are votes
      • Incoming links from important vertices count for more
      • Returns a graph with the rank as the vertex attribute
      graph.pageRank(tolerance).vertices
  • 60. PageRank: Example
      Use PageRank:
      // use pageRank
      val ranks = graph.pageRank(0.1).vertices
      // join the ranks with the map of airport id to name
      val temp = ranks.join(airports)
      temp.take(1)
      // Array((15370,(0.5365013694244737,TUL)))
      // sort by ranking
      val temp2 = temp.sortBy(_._2._1, false)
      temp2.take(2)
      // Array((10397,(5.431032677813346,ATL)), (13930,(5.4148119418905765,ORD)))
      // get just the airport names
      val impAirports = temp2.map(_._2._2)
      impAirports.take(4)
      // res6: Array[String] = Array(ATL, ORD, DFW, DEN)
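The PageRank example above depends on the real flight dataset and an `airports` RDD that the transcript does not define. A small sketch on a hypothetical star-shaped graph shows the "incoming links are votes" idea in isolation. It assumes Spark and GraphX are on the classpath; the vertex IDs, the tolerance of 0.001, and the names `links`, `starGraph`, and `ranked` are illustrative. Exact rank values depend on the Spark version, so none are shown.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Reuse the shell's context if one exists, otherwise start a local one
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("pagerank-sketch").setMaster("local[*]"))

// Hypothetical star graph: vertices 2, 3, and 4 all link to vertex 1
val links = sc.parallelize(Seq(
  Edge(2L, 1L, 1), Edge(3L, 1L, 1), Edge(4L, 1L, 1)))
val starGraph = Graph.fromEdges(links, 0)

// Run PageRank until the ranks change by less than the tolerance
val ranks = starGraph.pageRank(0.001).vertices

// Sort locally by descending rank
val ranked: Array[(Long, Double)] =
  ranks.collect().sortBy { case (_, rank) => -rank }

// Vertex 1 receives all the votes, so it comes first
println(ranked.head._1)
```

For a large graph, sorting on the cluster as the slide does with `sortBy(_._2._1, false)` is preferable to collecting every rank to the driver.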
  • 61. Use Case
      • Monitor air traffic at airports
      • Monitor delays
      • Analyze airport and routes overall
      • Analyze airport and routes by airline
  • 62. Learn More
      • https://www.mapr.com/blog/how-get-started-using-apache-spark-graphx-scala
      • GraphX Programming Guide: http://spark.apache.org/docs/latest/graphx-programming-guide.html
      • MapR announces Free Complete Apache Spark Training and Developer Certification: https://www.mapr.com/company/press-releases/mapr-unveils-free-complete-apache-spark-training-and-developer-certification
      • Free Spark On Demand Training: http://learn.mapr.com/?q=spark#-l
      • Get Certified on Spark with MapR Spark Certification: http://learn.mapr.com/?q=spark#certification-1,-l
  • 63. MapR Converged Data Platform
      [Figure labels: Open Source Engines & Tools; Commercial Engines & Applications; Cloud and Managed Services; Custom Apps; Search and Others; Data Processing; Enterprise-Grade Platform Services; Web-Scale Storage; Database; Event Streaming; MapR-FS; MapR-DB; MapR Streams; HDFS API; POSIX, NFS; HBase API; OJAI API; Kafka API; Unified Management and Monitoring; Real Time; Unified Security; Multi-tenancy; Disaster Recovery; Global Namespace; High Availability]