A talk presented at WWW 2015 on how to label blank nodes in an RDF graph in a deterministic way. Applications include Skolemisation (mapping blank nodes to IRIs); detecting RDF graph isomorphism; as well as RDF graph hashing, signing, canonicalisation etc.
(The slides make heavy use of animation and will not make much sense when flattened, so I recommend opening them as a slideshow in .ppt format.)
6. Blank nodes are common in real-world data …
Aidan Hogan, Marcelo Arenas, Alejandro Mallea and Axel Polleres
"Everything You Always Wanted to Know About Blank Nodes".
Journal of Web Semantics 27: pp. 42–69, 2014
7. BLANK NODES ENABLE SYNTAX SHORTCUTS
They represent implicit nodes in the graph
They help specify order, higher-arity relations, reification, etc., succinctly
They are common in real-world data
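As an illustration of the "specify order" shortcut, the sketch below expands an ordered list into the `rdf:first`/`rdf:rest` triples that an RDF collection stands for, with one implicit node per list cell. The function name `expand_collection`, the `_:b0`-style labels and the `ex:` terms are assumptions made for this example; they do not come from the slides.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand_collection(items):
    """Expand an ordered list into rdf:first/rdf:rest triples.

    Returns (head, triples), where head is the node standing for the list.
    """
    if not items:
        # The empty list is represented by rdf:nil and needs no triples.
        return RDF + "nil", []
    # One implicit (blank) node per list cell; the labels are arbitrary.
    cells = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = cells[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((cells[i], RDF + "first", item))
        triples.append((cells[i], RDF + "rest", rest))
    return cells[0], triples


head, triples = expand_collection(["ex:a", "ex:b", "ex:c"])
for triple in triples:
    print(triple)
```

A three-element list already needs three blank nodes and six triples, which is why a syntax that hides them is attractive.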
22. An old question that won’t go away …
Jeremy J. Carroll. “Signing RDF Graphs.” ISWC 2003.
Edzard Höfig, Ina Schieferdecker. “Hashing of RDF Graphs and a Solution to the Blank Node Problem.” URSW 2014.
23. NO EXISTING APPROACH IS GENERAL
• Hard cases seem unlikely in practice
• Let’s build a general (and thus worst-case exponential) algorithm that’s efficient for practical cases
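To make the worst-case cost concrete, here is a naive baseline that decides RDF graph isomorphism by trying every bijection between the blank nodes of two graphs. It is not the algorithm from the talk; it is a brute-force sketch whose running time grows factorially with the number of blank nodes. Graphs are assumed to be sets of `(subject, predicate, object)` string triples, with blank nodes written as `_:label`, and all function names are hypothetical.

```python
from itertools import permutations


def is_blank(term):
    return term.startswith("_:")


def blank_nodes(graph):
    # Blank nodes may appear in subject or object position only.
    return sorted({t for s, _, o in graph for t in (s, o) if is_blank(t)})


def relabel(graph, mapping):
    # Rename blank nodes per the mapping; other terms are left untouched.
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}


def isomorphic(g, h):
    """Brute force: try all bijections from blank nodes of g to those of h."""
    g, h = set(g), set(h)
    bg, bh = blank_nodes(g), blank_nodes(h)
    if len(g) != len(h) or len(bg) != len(bh):
        return False
    return any(relabel(g, dict(zip(bg, perm))) == h for perm in permutations(bh))


g = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "ex:c")}
h = {("_:y", "ex:p", "_:x"), ("_:x", "ex:p", "ex:c")}
print(isomorphic(g, h))
```

With ten blank nodes this already means up to 3,628,800 candidate mappings, which is why a practical algorithm has to avoid enumerating them in the common case.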
62. What about hash collisions?
128 bit: MD5, Murmur3_128
160 bit: SHA1
63. HASHING MAY LEAD TO COLLISIONS
• The algorithm does not care which hash function you use
• A 128-bit hash is the shortest with an acceptable collision probability
• For cryptographic use-cases, SHA-256 or better might be needed
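The effect of hash length on accidental collisions can be estimated with the standard birthday-bound approximation. The sketch below assumes an ideal, uniformly distributed hash and a workload of one billion hashed values; both are assumptions chosen for illustration, not figures from the slides. It also says nothing about deliberate attacks, which is the separate concern behind the cryptographic bullet.

```python
import math


def collision_probability(n_items, hash_bits):
    """Birthday-bound approximation: p ≈ 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_items * (n_items - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


for bits in (32, 64, 128, 160, 256):
    print(bits, collision_probability(10**9, bits))
```

Under these assumptions a 64-bit hash gives a collision probability of a few percent, while a 128-bit hash brings it down to the order of 10^-21.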
82. Trim the search tree using “found” automorphisms
Found Automorphisms …
83. PRUNING PER AUTOMORPHISMS AVOIDS SYMMETRIC REPETITIONS
• Automorphisms are found naturally
• Makes very “regular” structures (like cliques) a lot easier
• Need to be careful how to manage the automorphism group
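To see why "regular" structures are the hard case without pruning, the sketch below counts automorphisms by brute force: blank-node permutations that map a graph onto itself. A clique of k blank nodes has k! of them, so a search that ignores symmetry can reach the same labelling k! times. This illustrates the symmetry only and does not reproduce the pruning from the talk; the helper names and the `ex:p` predicate are hypothetical.

```python
from itertools import permutations


def automorphisms(graph):
    """Return every blank-node relabelling that maps the graph onto itself."""
    graph = set(graph)
    bnodes = sorted(
        {t for s, _, o in graph for t in (s, o) if t.startswith("_:")}
    )
    found = []
    for perm in permutations(bnodes):
        mapping = dict(zip(bnodes, perm))
        image = {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}
        if image == graph:
            found.append(mapping)
    return found


def clique(k):
    # Every blank node is linked to every other one by the same predicate.
    nodes = [f"_:n{i}" for i in range(k)]
    return {(a, "ex:p", b) for a in nodes for b in nodes if a != b}


path = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c")}
print(len(automorphisms(clique(4))))  # 24, i.e. 4!
print(len(automorphisms(path)))       # 1, only the identity
```

The directed path has no symmetry to exploit, whereas every permutation of the clique's nodes is an automorphism, so recording automorphisms as they are found lets whole symmetric branches be skipped.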