Graphs, Edges & Nodes - Untangling the Social Web


Graph

[Figure: example graph with numbered nodes]

Graph

[Figure: example graph with numbered nodes and additional numeric labels]

Simple

At most one edge between any pair of nodes.

Multigraph

Multiple edges between vertices allowed.

Pseudograph

Self-loops are permitted.
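
The three definitions above can be checked mechanically. This is a minimal sketch, assuming an undirected graph given as a list of (u, v) pairs; the function name classify_graph is hypothetical, and a graph with any self-loop is reported as a pseudograph whether or not it also has parallel edges.

```python
from collections import Counter

def classify_graph(edges):
    """Classify an undirected edge list as simple, multigraph or pseudograph."""
    # A self-loop is an edge that joins a node to itself.
    if any(u == v for u, v in edges):
        return "pseudograph"
    # Undirected: (u, v) and (v, u) are the same pair, so normalise with frozenset.
    pair_counts = Counter(frozenset((u, v)) for u, v in edges)
    if any(n > 1 for n in pair_counts.values()):
        return "multigraph"
    return "simple"

print(classify_graph([(1, 2), (2, 3)]))          # simple
print(classify_graph([(1, 2), (2, 1), (2, 3)]))  # multigraph
print(classify_graph([(1, 2), (3, 3)]))          # pseudograph
```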

What’s a node?
vertex
point
junction
0-simplex

What’s an edge?
arc
branch
line
link
1-simplex

(Graph does not include Justin Bieber)

Find the band that is most often co-listened with the given one.

[Figure: bipartite graph linking People to Bands]

Basically, most kinds of simple
content/co-occurrence similarity.

That’s a 2-step path on a bipartite graph.
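
A minimal sketch of that 2-step path, assuming the bipartite graph is stored as a mapping from each person to the set of bands they listen to. The listening data, the band names and the function name most_co_listened are made up for illustration.

```python
from collections import Counter

# Hypothetical listening data: person -> bands (the edges of the bipartite graph).
listens = {
    "alice": {"Radiohead", "Portishead", "Massive Attack"},
    "bob": {"Radiohead", "Portishead"},
    "carol": {"Radiohead", "Muse"},
    "dave": {"Portishead", "Massive Attack"},
}

def most_co_listened(band, listens):
    """Return the band that shares the most listeners with `band`, or None."""
    counts = Counter()
    for bands in listens.values():
        if band in bands:                  # step 1: band -> person
            for other in bands - {band}:   # step 2: person -> other band
                counts[other] += 1
    if not counts:
        return None
    best = max(counts.values())
    # Break ties alphabetically so the result is deterministic.
    return min(b for b, c in counts.items() if c == best)

print(most_co_listened("Radiohead", listens))  # Portishead
```

Counting the 2-step paths that end at each other band is all the "similarity" amounts to here; weighting or normalising the counts would be a separate design choice.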

There are many of these ‘fundamental’
graph units:

- tripartite
- folksonomies (tripartite 3-graph + 2-step path)
- multicolor-multiparity graph
- etc.

Neo4j
“An embedded, disk-based, fully transactional Java persistence engine
that stores data structured in graphs rather than in tables.”

http://neo4j.org

HyperGraphDB
“A general purpose, extensible, portable, distributed, embeddable,
open-source data storage mechanism. It is a graph database designed
specifically for artificial intelligence and semantic web projects.”

http://kobrix.org/hgdb.jsp

Special-Purpose Storage Engines

FlockDB
“FlockDB is a database that stores graph data, but it isn't a database optimized for
graph-traversal operations. Instead, it's optimized for very large adjacency lists,
fast reads and writes, and page-able set arithmetic queries.”

http://engineering.twitter.com/2010/05/introducing-flockdb.html

Redis
“Redis is an advanced key-value store. [...] the dataset is not volatile, and values can be strings,
exactly like in memcached, but also lists, sets, and ordered sets. All this data types can be
manipulated with atomic operations to push/pop elements, add/remove elements, perform
server side union, intersection, difference between sets, etc.”

http://code.google.com/p/redis

A Redis Friends/Followers Example

Redis makes you think in terms of data structures,
and operations on those structures.

Set:
Finite (for our cases) collection of objects in which
order has no significance and multiplicity is generally
ignored.
S = { Alice, Bob, Carol }

List:
Finite (for our cases) collection of objects in which
order *is* significant and multiplicity is allowed.
L = [ X, Y, X, Z, Q ]
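
To make the difference concrete, here is a small sketch using Python's built-in set and list as stand-ins for the two structures. The variable names are illustrative only, and this is plain Python rather than Redis.

```python
# A set ignores order and collapses duplicates.
s = {"Alice", "Bob", "Carol", "Alice"}
print(len(s))                             # 3
print(s == {"Carol", "Bob", "Alice"})     # True: order does not matter

# A list keeps both order and duplicates.
l = ["X", "Y", "X", "Z", "Q"]
print(l.count("X"))                       # 2
print(l == ["Y", "X", "X", "Z", "Q"])     # False: order matters
```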

Store a user's fields as string keys

SET uid:1000:username jperras
SET uid:1000:password bazinga!

Use sets to denote my followers and the people I follow.

uid:1000:followers => Set of uids of all users who follow user 1000
uid:1000:following => Set of uids of all users that user 1000 follows

Adding a new follower

SADD uid:1000:following 1001
SADD uid:1001:followers 1000
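
The two SADD calls always go together: one updates the "following" set of the follower, the other the "followers" set of the person being followed. A minimal sketch of that pairing, assuming an in-memory dict of Python sets as a stand-in for Redis; the helper names sadd and follow are hypothetical and no Redis client is involved.

```python
from collections import defaultdict

# key -> set of members, standing in for Redis sets
store = defaultdict(set)

def sadd(key, member):
    """Mimic SADD: add a member to the set stored at key."""
    store[key].add(member)

def follow(follower_id, followee_id):
    """Record that follower_id now follows followee_id, on both sides."""
    sadd(f"uid:{follower_id}:following", followee_id)
    sadd(f"uid:{followee_id}:followers", follower_id)

follow(1000, 1001)
print(store["uid:1000:following"])  # {1001}
print(store["uid:1001:followers"])  # {1000}
```

Keeping both directions means "who follows me?" and "whom do I follow?" are each a single set lookup, at the cost of two writes per follow.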

Posting Updates

$r = new Redis(); /* assumes the client exposes a Redis class; connection setup omitted */
$postid = $r->incr("global:nextPostId");
$post = $User['id'] ."|". time() ."|". $status;
$r->set("post:$postid", $post);
$followers = $r->smembers("uid:".$User['id'].":followers");

if ($followers === false) $followers = array();
$followers[] = $User['id']; /* Add the post to our own posts too */

/* Fan out: push the post id onto each follower's list of posts. */
foreach ($followers as $fid) {
    $r->push("uid:$fid:posts", $postid, false);
}
# Push the post on the timeline, and trim the timeline to the
# newest 1000 elements.
$r->push("global:timeline", $postid, false);
$r->ltrim("global:timeline", 0, 999); /* LTRIM indices are inclusive: 0..999 keeps 1000 */
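
The same fan-out-on-write idea can be sketched without a Redis server. This is an illustrative in-memory version, assuming new posts are pushed at the head of each list (newest first); the names post_update and TIMELINE_LIMIT are hypothetical, and plain Python containers stand in for the Redis keys.

```python
from collections import defaultdict
from itertools import count

TIMELINE_LIMIT = 1000            # keep only the newest 1000 post ids

next_post_id = count(1)          # stands in for INCR global:nextPostId
posts = {}                       # post id -> "uid|time|status"
followers = defaultdict(set)     # uid -> set of follower uids
user_posts = defaultdict(list)   # uid -> post ids, newest first
global_timeline = []             # post ids, newest first

def post_update(user_id, status, now):
    """Store a post and fan its id out to the author and every follower."""
    post_id = next(next_post_id)
    posts[post_id] = f"{user_id}|{now}|{status}"
    for fid in followers[user_id] | {user_id}:
        user_posts[fid].insert(0, post_id)    # push at the head
    global_timeline.insert(0, post_id)
    del global_timeline[TIMELINE_LIMIT:]      # trim to the newest entries
    return post_id

followers[1001].add(1000)
pid = post_update(1001, "hello world", now=1700000000)
print(user_posts[1000], user_posts[1001])
```

Writing the post id into every follower's list at post time makes reading a timeline cheap; the trade-off is more work per post for users with many followers.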

Common followers? - Set intersections!

SINTER uid:1000:followers uid:1001:followers
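
SINTER returns the members common to all the sets it is given. A minimal sketch of the same query, using Python sets as a stand-in for Redis; the follower data and the function name common_followers are made up for illustration.

```python
# Hypothetical follower sets: uid -> uids of that user's followers.
followers_of = {
    1000: {1, 2, 3, 4},
    1001: {3, 4, 5},
}

def common_followers(uid_a, uid_b):
    """Return the users who follow both uid_a and uid_b."""
    return followers_of.get(uid_a, set()) & followers_of.get(uid_b, set())

print(sorted(common_followers(1000, 1001)))  # [3, 4]
```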

Let’s compare that to MySQL

Relational databases can handle the simplest cases, but they
cope poorly with most graph operations and algorithms: each
extra hop in a traversal typically costs another self-join.

Graphs and graph databases are only
going to become more useful.

However, graph algorithms are hard.

So don’t write your own.

And make sure you use a persistent storage engine
that is best suited for the type of queries
you will be performing.

Resources
The Algorithm Design Manual, Steven S. Skiena
Programming Collective Intelligence, Toby Segaran
Introduction to Algorithms, Cormen, Leiserson, Rivest

Photo Credits

Graph of the internet, circa 2003: http://www.duniacyber.com/freebies/education/what-is-internet-lookslike/
(built from partial troll of public servers using traceroute)

My real friends for letting me use their Facebook profile images.

References

Large Scale Graph Algorithms (class lectures), Yuri Lifshits, Steklov Institute of Mathematics at St. Petersburg

http://mathworld.wolfram.com/Set.html

Programming Collective Intelligence, Toby Segaran

The Algorithm Design Manual, Steven S. Skiena
