This document provides an overview of two graph data models: RDF and Property Graphs. It describes the key components of each model: triples for RDF, and nodes, edges and properties for Property Graphs. It also discusses Apache projects that work with each model, such as Apache Jena for RDF and Apache TinkerPop, Spark, Giraph and Flink for Property Graphs. Finally, it notes that although the models have different focuses, they could share technology such as storage and query capabilities.
Two graph data models: RDF and Property Graphs
1. Two graph data models
RDF and Property Graphs
Andy Seaborne
Paolo Castagna
andy@a.o, castagna@a.o
2. Introduction
This talk is about two graph data models
(RDF and Property Graphs), examples of a
couple of Apache projects that use these data
models, and a few lessons learned along the
way.
4. RDF
➢ IRIs (=URIs), literals (strings, numbers, …),
blank nodes
➢ Triple => subject-predicate-object
● Predicate (or property) is the link name : an IRI
➢ Graph => set of triples
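The definitions above can be sketched in a few lines of Python. This is only an illustration of the data model, not how Apache Jena or any other RDF library represents terms; the class names (IRI, Literal, BlankNode, Triple) are hypothetical, and the example IRIs reuse the `http://example/myData/` and FOAF namespaces.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical term classes, for illustration only.
@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: Union[str, int, float]


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Triple:
    subject: Union[IRI, BlankNode]
    predicate: IRI                      # the link name is always an IRI
    obj: Union[IRI, BlankNode, Literal]


EX = "http://example/myData/"
FOAF = "http://xmlns.com/foaf/0.1/"

# A graph is a set of triples.
graph = set()
graph.add(Triple(IRI(EX + "alice"), IRI(FOAF + "name"), Literal("Alice Smith")))
graph.add(Triple(IRI(EX + "alice"), IRI(FOAF + "knows"), IRI(EX + "bob")))
# Adding an identical triple again has no effect, because a graph is a set.
graph.add(Triple(IRI(EX + "alice"), IRI(FOAF + "knows"), IRI(EX + "bob")))

print(len(graph))  # 2
```

Frozen dataclasses are used so that terms are hashable and so that an IRI and a literal with the same text never compare equal.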
5.
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

# foaf:name is a short form of <http://xmlns.com/foaf/0.1/name>

:alice rdf:type foaf:Person ;
       foaf:name "Alice Smith" ;   # ; means "same subject"
       foaf:knows :bob .
[Figure: the same three triples drawn as a graph, with arcs from :alice labelled rdf:type (to foaf:Person), foaf:name (to "Alice Smith") and foaf:knows (to :bob)]
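To show how such a graph can be queried, here is a minimal Python sketch of triple-pattern matching over the three example triples. It is an assumption-laden illustration, not SPARQL and not the Jena API: the `match` function is hypothetical, and terms are plain strings, so IRIs and literals are not distinguished.

```python
EX = "http://example/myData/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# The three triples from the example, with prefixes expanded to full IRIs.
# Simplification: every term is a plain string.
graph = {
    (EX + "alice", RDF + "type", FOAF + "Person"),
    (EX + "alice", FOAF + "name", "Alice Smith"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Whom does :alice know?
known = [o for (_, _, o) in match(graph, s=EX + "alice", p=FOAF + "knows")]
print(known)  # ['http://example/myData/bob']
```

A pattern with wildcards in some positions is the basic building block of graph queries; a real query language combines several such patterns with shared variables.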
12. Apache Jena
Apache Top-Level Project (TLP): April 2012
➢ Involvement in standards
➢ RDF 1.1, SPARQL 1.1
➢ RDF database
➢ SPARQL server
Other RDF@ASF:
➢ Any23, Marmotta, Clerezza, Stanbol, Rya
13. Property Graph Data Model
A property graph is a set of vertexes and edges with
respective properties (i.e. key / values):
➢ each vertex or edge has a unique identifier
➢ each vertex has a set of outgoing edges and a set of incoming edges
➢ edges are directed: each edge has a start vertex and an end vertex
➢ each edge has a label which denotes the type of relationship
➢ vertexes and edges can have properties (i.e. key / value pairs)
Directed multigraph with properties
attached to vertexes and edges
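The definition above maps fairly directly onto a small data structure. The following Python sketch is one possible in-memory layout, written only to make the definition concrete; the class and method names (PropertyGraph, add_vertex, add_edge) are hypothetical and do not correspond to the TinkerPop or GraphX APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Edge:
    id: int
    label: str        # denotes the type of relationship
    start: int        # id of the start vertex
    end: int          # id of the end vertex
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Vertex:
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)
    outgoing: List[Edge] = field(default_factory=list)
    incoming: List[Edge] = field(default_factory=list)


class PropertyGraph:
    """Directed multigraph with properties on vertexes and edges."""

    def __init__(self):
        self.vertexes: Dict[int, Vertex] = {}
        self.edges: Dict[int, Edge] = {}

    def _check_new_id(self, id_):
        # Every vertex or edge has a unique identifier.
        if id_ in self.vertexes or id_ in self.edges:
            raise ValueError(f"identifier {id_} already in use")

    def add_vertex(self, id_, **properties):
        self._check_new_id(id_)
        self.vertexes[id_] = Vertex(id_, dict(properties))
        return self.vertexes[id_]

    def add_edge(self, id_, start, label, end, **properties):
        self._check_new_id(id_)
        start_vertex, end_vertex = self.vertexes[start], self.vertexes[end]
        edge = Edge(id_, label, start, end, dict(properties))
        start_vertex.outgoing.append(edge)
        end_vertex.incoming.append(edge)
        self.edges[id_] = edge
        return edge


g = PropertyGraph()
g.add_vertex(1, name="Alice")
g.add_vertex(2, name="Bob")
g.add_edge(3, 1, "knows", 2)
g.add_edge(4, 1, "knows", 2)   # multigraph: a parallel edge with its own id
print([e.id for e in g.vertexes[1].outgoing])  # [3, 4]
```

Because each edge has its own identifier, two edges with the same label between the same pair of vertexes remain distinct, which is what makes this a multigraph.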
14. Property Graph: Example
Vertex, id = 1
  name = “Alice”
  surname = “Smith”
  age = 32
  email = alice@example.com
  ...
Vertex, id = 2
  name = “Bob”
  surname = “Brown”
  age = 45
  email = bob@example.com
  ...
Edge, id = 3, label = knows, from vertex 1 to vertex 2
  since = 01/01/1970
  ...
15. Apache Spark: GraphX*
// Creating a Graph (assumes an existing SparkContext named sc, as in spark-shell)
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

val vertexes: RDD[(VertexId, (String, String))] =
  sc.parallelize(Array((1L, ("Alice", "alice@example.com")), (2L, ("Bob", "bob@example.com"))))
val edges: RDD[Edge[String]] =
  sc.parallelize(Array(Edge(1L, 2L, "knows")))
val graph = Graph(vertexes, edges)
...
Examples of parallel graph algorithms available:
// Find the triangle count for each vertex
val triCounts = graph.triangleCount().vertices
// Find the connected components
val cc = graph.connectedComponents().vertices
// Run PageRank
val ranks = graph.pageRank(0.0001).vertices
* GraphX is in the alpha stage
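To give a feel for what one of these algorithms computes, here is a single-machine Python sketch of connected components by label propagation. It is not the GraphX implementation and makes no attempt at parallelism; it only mirrors the result convention of labelling each vertex with the lowest vertex id in its component. The function name and the sample vertexes 3, 4 and 5 are made up for illustration.

```python
def connected_components(vertex_ids, edges):
    """Label every vertex with the smallest vertex id in its component.

    Edge direction is ignored, as is usual for connected components.
    """
    label = {v: v for v in vertex_ids}
    changed = True
    while changed:                      # repeat until no label changes
        changed = False
        for src, dst in edges:
            low = min(label[src], label[dst])
            if label[src] != low or label[dst] != low:
                label[src] = label[dst] = low
                changed = True
    return label


# Hypothetical sample data: vertexes 1 and 2 joined by one edge,
# plus made-up vertexes 3, 4 and 5 forming a second component.
vertex_ids = [1, 2, 3, 4, 5]
edges = [(1, 2), (4, 3), (5, 4)]
print(connected_components(vertex_ids, edges))
# {1: 1, 2: 1, 3: 3, 4: 3, 5: 3}
```

Each pass over the edges pushes the smallest known label one step further along every path, so the loop stops once every component agrees on its minimum id.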
17. Use Case for Graphs
➢ Analytics
● Social networks and recommendation engines
● Data center infrastructure management
➢ Knowledge Graphs
● Happenings: people, places, events
● Customer databases / products catalogues
18. Some Conclusions
➢ Data Graphs are (still) new to many people
➢ RDF emphasizes information modelling
→ Knowledge graphs
→ SQL-like query
➢ Property Graph emphasizes data processing
→ Data capture
→ Graph analytic algorithms
➢ Naive layering of data models leads to dissatisfaction
→ Can only mix toolsets by knowing it’s layered
➢ Could share technology
→ Storage, data access, query algebra
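One way to picture the layering point is to encode a property graph as triples. The Python sketch below is an assumption made for illustration, not an encoding proposed in the talk: the function name and the `v:`, `e:`, `pg:` and `prop:` vocabulary are invented. Because an edge carries properties, it becomes a node of its own, and a tool looking for a direct "knows" link will not find one unless it knows about the layering.

```python
def property_graph_to_triples(vertexes, edges):
    """One possible naive encoding; the vocabulary used here is made up."""
    triples = set()
    for vid, props in vertexes.items():
        for key, value in props.items():
            triples.add((f"v:{vid}", f"prop:{key}", value))
    for eid, (start, label, end, props) in edges.items():
        # The edge becomes a node of its own so that it can carry properties.
        triples.add((f"e:{eid}", "pg:start", f"v:{start}"))
        triples.add((f"e:{eid}", "pg:label", label))
        triples.add((f"e:{eid}", "pg:end", f"v:{end}"))
        for key, value in props.items():
            triples.add((f"e:{eid}", f"prop:{key}", value))
    return triples


vertexes = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
edges = {3: (1, "knows", 2, {"since": "01/01/1970"})}
triples = property_graph_to_triples(vertexes, edges)

# A reader expecting the direct triple (v:1, knows, v:2) will not find it.
print(("v:1", "knows", "v:2") in triples)  # False
print(len(triples))  # 6
```

Other encodings are possible and make different trade-offs; the sketch only shows why a tool working at the triple level needs to know which encoding was used.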