Database Pro Power Days 2010 - Graph data in the cloud using .NET
1. Photo: Large Magellanic Cloud, ESO
October 19-20, 2010
Nürnberg
www.databasepro-powerdays.de
sones
Graph Data in the cloud
using .NET
Achim Friedland <achim@sones.de> Database Pro Power Days , 10/20/2010 1
2. Photo: funky64, flickr
For 35 years information has been
well-defined data within some tables
jailed in closed database silos.
3. Photo: shamballah, flickr
The relational model and SQL have
become much too limited for open
linked data (== graph data) and cloud requirements.
4. Photo: shamballah, flickr
Applications cannot easily access, understand and process unknown relational data.
[Diagram: Application 1 and Application 2 above DB 1 and DB 2, with question marks in between]
5. Photo: Gephi, flickr
1. Graph-Databases
6. Photo: Gephi, flickr
The Property-Graph Model
Vertex: Alice (ID = 1, Age = 21)
Edge: Alice Friends Bob (since = 2009/09/21, reason = classmates)
Vertex: Bob (ID = 2, Age = 23)
Vertex-Properties: ID, Age
Edge-Properties: since, reason
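The property-graph model on this slide can be sketched in a few lines. This is an illustrative Python sketch, not the sones API: the class names `Vertex` and `Edge` and the helper `link` are assumptions made for the example; only the Alice/Bob data comes from the slide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(eq=False)  # identity-based equality avoids recursing through cyclic links
class Vertex:
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)
    out_edges: List["Edge"] = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    label: str
    source: Vertex
    target: Vertex
    properties: Dict[str, Any] = field(default_factory=dict)


def link(source: Vertex, label: str, target: Vertex, **properties: Any) -> Edge:
    """Create a directed, labelled edge and register it on the source vertex."""
    edge = Edge(label, source, target, dict(properties))
    source.out_edges.append(edge)
    return edge


# Data from the slide: two vertices and one edge, each with its own properties.
alice = Vertex(1, {"Name": "Alice", "Age": 21})
bob = Vertex(2, {"Name": "Bob", "Age": 23})
friendship = link(alice, "Friends", bob, since="2009/09/21", reason="classmates")
```

Both vertices and edges carry key/value properties, and the edge is reachable directly from its source vertex.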
7. Photo: Gephi, flickr
The Property-Graph Model
Direct linking without external indices

Vertex Alice:
ID 1
TYPE Person
REVISION 20101014…
Name Alice
Age 21
Boyfriend (link)
Friends (links)
FavColors Red, Green
Address.Street 1 Infinite Loop
Address.Town Cupertino

Linked vertices: Bob (ID = 2, Age = 23), Carol (ID = 3, Age = 20)

Close to Object- and Document-Databases
8. Photo: Gephi, flickr
The Property-Graph Model using RDF-like semantics
+ Unambiguous identifiers
+ Named relations
+ Close to RDF molecules

Vertex Alice:
RDF-like name / type   Property         Value
ID                     ID               http://test.com/vertices/1
rdf:type               TYPE             http://test.com/#person
sones:revId            REVISION         20101014…
foaf:name              Name             Alice
foaf:age               Age              21
person                 Boyfriend        (link)
Set<person>            Friends          (links)
List<string>           FavColors        Red, Green
XML                    Address
gn:street              Address.Street   1 Infinite Loop
gn:town                Address.Town     Cupertino

Linked vertices: Bob (ID = 2, Age = 23), Carol (ID = 3, Age = 20)
9. Photo: Large Magellanic Cloud, ESO
Graph-Databases in a cloud
GraphDB + REST
• Vertices and edges are resources
• Access via e.g. http://test.com/vertices/[$id]
• Common CRUD operations (GET, POST, PUT…)
+ Atomicity
+ Statelessness
+ Idempotence
+ Parallelism
GraphDB + Hypermedia
• Representation must be "link-aware", e.g. XML+XLINK, ATOM, RDFa…
• Representation should be self-describing
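A "link-aware" representation could look roughly like the following. This is a minimal Python sketch under stated assumptions: the slide names XML+XLINK, ATOM and RDFa as formats, whereas JSON is used here only for brevity, and the field names `self`, `properties`, `links` and `href` are hypothetical. Only the URL pattern `http://test.com/vertices/[$id]` is taken from the slide.

```python
import json

BASE_URL = "http://test.com"  # host from the slide's example URL


def vertex_url(vertex_id):
    """Every vertex is a resource with its own URL."""
    return f"{BASE_URL}/vertices/{vertex_id}"


def represent_vertex(vertex_id, properties, edges):
    """Build a link-aware document; `edges` maps an edge name to target vertex ids."""
    return {
        "self": vertex_url(vertex_id),
        "properties": dict(properties),
        # Edges are emitted as followable links instead of bare ids.
        "links": {
            name: [{"href": vertex_url(target)} for target in targets]
            for name, targets in edges.items()
        },
    }


alice_doc = represent_vertex(1, {"Name": "Alice", "Age": 21}, {"Friends": [2, 3]})
print(json.dumps(alice_doc, indent=2))
```

A client that receives this document can continue the traversal by issuing a GET on any `href`, without knowing the server's URL scheme in advance.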
10. Photo: bombeador, flickr
2. The Object-relational (Graph-relational)
Impedance Mismatch
11. Photo: bombeador, flickr
Inflexible relational schemata
• Expensive ALTER TABLE operations
• Entity-Attribute-Value Model ↔ RDF
• No semi-/unstructured data
XML, JSON, … hierarchies, graphs, … binary data
• No Multi-Attribute Values
List<String>, Set<Integer>, Set<Person>
• No simple way to handle versioned data
12. Photo: bombeador, flickr
Relational Anti-Patterns:
• Relations via foreign key constraints
No explicit concept for relations
No index-free adjacency
• Querying relational data via JOINs is hard
Just storing a graph was never a challenge ;)
• No recursive JOINs
Inefficient query processing
(Except: Oracle's "CONNECT BY")
13. Photo: shamballah, flickr
SQL and Cloud-Readiness?
• No explicit scaling or partitioning within
the relational model
• No JOINs between different databases
and/or vendors
• Poor interaction with state-of-the-art
web technologies
e.g. HTTP/REST, Hypermedia, Semantic Web
14. Photo: Gephi, flickr
3. Benefits of Graph-Databases
15. Photo: Gephi, flickr
The explicit graph data model
provides a higher level of abstraction
and a better understanding of the
domain model.
16. Photo: Gephi, flickr
Index-free adjacency provides
improved scalability, data locality
and superior graph-traversal
performance
(independent of the size of the graph).
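The difference between a shared lookup structure and index-free adjacency can be illustrated as follows. This is a simplified Python sketch, not a benchmark and not the sones implementation; the table contents, the `Vertex` class and both function names are assumptions for the example.

```python
# Relational style: all friendships live in one shared table of (person_id, friend_id) rows.
friend_rows = [(1, 2), (1, 3), (2, 1), (3, 1)]


def friends_by_scan(person_id):
    """Without an index, each lookup scans the whole table, so cost grows with the table."""
    return [friend for (person, friend) in friend_rows if person == person_id]


# Graph style: each vertex holds direct references to its neighbours.
class Vertex:
    def __init__(self, vertex_id):
        self.id = vertex_id
        self.friends = []  # direct references, no shared lookup structure


alice, bob, carol = Vertex(1), Vertex(2), Vertex(3)
alice.friends += [bob, carol]
bob.friends.append(alice)
carol.friends.append(alice)


def friends_by_adjacency(vertex):
    """Cost depends only on this vertex's neighbour count, not on the graph's size."""
    return [friend.id for friend in vertex.friends]
```

Both functions return the same neighbours for Alice; only the second one does so without touching data that belongs to other vertices.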
17. Photo: squacco, flickr
Consistency criteria and indices for
simple attributes up to complex
subgraph structures.
18. Photo: Khem, flickr
Traversing linked information, finding
shortest paths, performing semantic
partitioning.
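As one example of such an operation, a shortest path in an unweighted graph can be found with a breadth-first search. The Python sketch below is a generic textbook version, not sones code; the adjacency dictionary and the vertex "Dave" are made up for illustration.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency dict; returns a vertex list or None."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the predecessor chain back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in graph.get(current, ()):
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None


friends = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice"],
    "Dave": ["Bob"],
}
```

Calling `shortest_path(friends, "Carol", "Dave")` walks Carol, Alice, Bob, Dave; an unreachable goal yields `None`.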
19. Photo: NASA, flickr
Personal, social and item-related
recommendation and discovery
of potentially interesting linked
information.
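A simple social recommendation of this kind is "friends of friends, ranked by mutual friends". The Python sketch below shows one possible scoring under that assumption; the slide does not prescribe an algorithm, and the sample graph (including "Dave" and "Erin") is invented.

```python
from collections import Counter


def recommend_friends(graph, person):
    """Rank friends-of-friends by the number of mutual friends with `person`."""
    direct = set(graph.get(person, ()))
    mutual_counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, ()):
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties are broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))


friends = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave", "Erin"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol"],
    "Erin": ["Bob"],
}
```

For Alice this ranks Dave (two mutual friends) ahead of Erin (one); each candidate is found by following two hops from the start vertex.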
20. Photo: Birger Hoppe, flickr
Good integration into state-of-the-art
programming concepts and web
technologies.
21. Photo: Large Magellanic Cloud, ESO
Graph-Databases, REST and RDF
semantics are a solid foundation for
cloud services.
22. Photo: Gephi, flickr
4. Graph data in the cloud
using .NET / Mono
23. Photo: sones
sones GraphDB
• URL http://www.sones.de
• License AGPLv3
• Language C# 4.0
• Goals Management of linked data
• Concurrency MVCC
• Repl./Scaling p2p (alpha)
• Persistency Proprietary file system
• Cloud Connector for Microsoft Azure
24. Photo: sones
sones Architecture
GraphDS: REST, WebShell, C# API
GraphDB: GQL, Graph Traversals, Indices
GraphFS: Object Management, (De-)Serialization
Host File System / Microsoft Azure
25. Photo: sones
Shared nothing
Instance 1: GraphDS 1, GraphDB 1, GraphFS 1
Instance 2: GraphDS 2, GraphDB 2, GraphFS 2
[Diagram: a User and Azure shown between the two instances, labelled "it depends…"]
26. Photo: sones
sones Property-Hypergraph
Vertex: Alice (ID = 1, Age = 21)
Hyperedge: SET<User> Friends
Hyperedge-Properties: SetMaxNumber = 12
Edge: User Friend (since = 2009/09/21) to Bob (ID = 2, Age = 23)
Edge: User Friend (since = 2010/04/11) to Carol (ID = 3, Age = 20)
[Diagram label: Virtual-Edge]
27. Photo: sones
sones Property-Hypergraph
• Properties may include code as data
Think of stored procedures; C#: Func<…>, ExpressionTrees
• Allows hyperedge calculations to be performed over
the set of their edges
(GetMinWeight, SetMaxNumber, …)
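The idea of "code as data" on a hyperedge can be sketched as follows. This is a hypothetical Python model, not the sones C# API: the `HyperEdge` class, its methods and the `weight` values are assumptions; only the names `SetMaxNumber` and `GetMinWeight` and the `since` dates appear on the slides.

```python
class HyperEdge:
    """A named set of edges plus properties; a property is a plain value or a function."""

    def __init__(self, name, **properties):
        self.name = name
        self.edges = []  # each entry: (target name, dict of edge properties)
        self.properties = properties

    def add(self, target, **edge_properties):
        self.edges.append((target, edge_properties))

    def evaluate(self, property_name):
        """Return a plain property as-is; call a function-valued one with the edge list."""
        value = self.properties[property_name]
        return value(self.edges) if callable(value) else value


friends = HyperEdge(
    "Friends",
    SetMaxNumber=12,  # plain value, as on the slide
    # Function stored as a property and computed over the hyperedge's own edges.
    GetMinWeight=lambda edges: min(props["weight"] for _, props in edges),
)
friends.add("Bob", since="2009/09/21", weight=0.8)    # weight is a made-up property
friends.add("Carol", since="2010/04/11", weight=0.5)  # weight is a made-up property
```

Here `friends.evaluate("GetMinWeight")` runs the stored function over both edges, while `friends.evaluate("SetMaxNumber")` simply returns the stored value.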
28. Photo: Shayne Kaye, flickr
sones Graph Query Language
FROM User SELECT User.Friends.Friends.Name
• "SQL for graphs" providing a user-friendly DSL for
ad-hoc graph queries and graph discovery
• Functions and aggregates are type-safe and can be
extended by your own plug-ins, e.g.
• SELECT COUNT(User.Friends)
• SELECT User.Friends.Random(2)
• SELECT User.Friends.Name.Substring(2,5)
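What a path expression such as `User.Friends.Friends.Name` computes can be approximated in plain Python. This sketch only mirrors the idea of following the same edge twice; it is not how GQL is implemented, and whether GQL removes duplicates or groups results per user is not stated on the slide. The helper `follow` and the sample graph are assumptions.

```python
def follow(vertices, edge_name, graph):
    """Follow one edge name from every given vertex; keeps order, drops duplicates."""
    seen, result = set(), []
    for vertex in vertices:
        for target in graph[vertex].get(edge_name, ()):
            if target not in seen:
                seen.add(target)
                result.append(target)
    return result


graph = {
    "Alice": {"Friends": ["Bob"]},
    "Bob": {"Friends": ["Alice", "Carol"]},
    "Carol": {"Friends": []},
}


def friends_of_friends_names(user):
    """Two hops along the Friends edge, analogous to User.Friends.Friends.Name."""
    return follow(follow([user], "Friends", graph), "Friends", graph)


def count_friends(user):
    """Analogous to SELECT COUNT(User.Friends) for a single user."""
    return len(graph[user]["Friends"])
```

Each additional `.Friends` step in the query corresponds to one more `follow` call.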
29. Photo: Gephi, flickr
sones Graph Query Language
// sones gql example
CREATE VERTEX User
    ADD ATTRIBUTES (String Name, SET<User> Friends)
    INDICES (Name)
    MANDATORY (Name)
INSERT INTO User VALUES (Name = "Alice", Age = 21)
INSERT INTO User VALUES (Name = "Bob", Age = 23)
LINK User(Name = 'Alice') VIA Friends TO User(Name = 'Bob')
LINK User(Name = 'Bob') VIA Friends TO User(Name = 'Alice')
30. Photo: Gephi, flickr
C# API
// C# API type creation
var _Person = _GraphDB.TypeManager.
    CreateVertex("Person").
    AddString("Name", mandatory: true, indexed: true).
    AddLoop("Friends", hyperEdge: true).
    execute();
Type _PersonT = _GraphDB.TypeManager.
    GenerateType(_Person);
31. Photo: Gephi, flickr
C# API
// C# API vertex initialization
Person _Alice = _GraphDB.TypeManager.ActivateVertex(
    _Person, new VertexUUID(1));
_Alice.Name = "Alice";
// Age and bdayparty are not declared on Person, so they are set via the dynamic reference
dynamic _Alice2 = _Alice;
_Alice2.Age = 21;
_Alice2.bdayparty = (Action) (() => { _Alice2.Age++; });
_Alice2.bdayparty();
32. Photo: Gephi, flickr
C# API
// sones C# API example
var _Friends = new GraphAttribute("Friends", Type: "foaf:knows");
var _Bob = _GraphDB.TypeManager.ActivateVertex(
    _Person, new VertexUUID(2));
_Alice.Link(_Friends, _Bob);
_Bob.Link(_Friends, _Alice);
33. Photo: Gephi, flickr
C# API
// Graph Traversals
public T TraverseVertex<T> (
    IVertex myStartVertex,
    TraversalOperation TraversalOperation =
        TraversalOperation.BreadthFirst,
    Func<IVertex, IEdge, Boolean> myFollowThisEdge = null,
    Func<IVertex, Boolean> myMatchEvaluator = null,
    Action<IVertex> myMatchAction = null,
    Func<TraversalState, Boolean> myStopEvaluator = null,
    Func<IEnumerable<IVertex>, T> myWhenFinished = null)
{
    // Traverse the graph
}
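How the callbacks of such a traversal signature could interact is sketched below in Python. This is an assumption-laden illustration, not the sones implementation: the graph encoding, the function name `traverse_vertex` and the order in which callbacks fire are choices made for this example. Edges are reduced to their labels, and the stop evaluator receives the list of matches so far rather than a `TraversalState` object.

```python
from collections import deque


def traverse_vertex(graph, start, follow_this_edge=None, match_evaluator=None,
                    match_action=None, stop_evaluator=None, when_finished=None):
    """Breadth-first traversal driven by optional callbacks.

    `graph` maps a vertex to a list of (edge_label, target_vertex) pairs.
    """
    matches, visited, queue = [], {start}, deque([start])
    while queue:
        vertex = queue.popleft()
        # Decide whether the current vertex counts as a match.
        if match_evaluator is None or match_evaluator(vertex):
            matches.append(vertex)
            if match_action is not None:
                match_action(vertex)
        # Allow the caller to end the traversal early.
        if stop_evaluator is not None and stop_evaluator(matches):
            break
        for edge_label, target in graph.get(vertex, ()):
            if target in visited:
                continue
            if follow_this_edge is None or follow_this_edge(vertex, edge_label):
                visited.add(target)
                queue.append(target)
    # Let the caller turn the matches into the final result type.
    return when_finished(matches) if when_finished is not None else matches


graph = {
    "Alice": [("Friends", "Bob"), ("Boyfriend", "Bob"), ("Friends", "Carol")],
    "Bob": [("Friends", "Alice")],
    "Carol": [],
}

friend_count = traverse_vertex(
    graph, "Alice",
    follow_this_edge=lambda vertex, label: label == "Friends",
    match_evaluator=lambda vertex: vertex != "Alice",
    when_finished=len,
)
```

The final callback plays the role of `myWhenFinished`: it converts the collected vertices into whatever result type the caller wants, here a simple count.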
34. sones
For more information…
achim@sones.de
http://www.twitter.com/ahzf
http://www.twitter.com/graphdbs