This was a presentation for the 2009 Semantic Technologies meeting in San Jose, California. I was not able to attend the meeting, but hopefully the slides will be self-explanatory.
RDF Redux
1. RDF Redux
or
What RDF could have been
Pat Hayes, IHMC. Slides Prepared For SemTech 2009, San Jose
2. RDF Redux
✤ With hindsight comes wisdom
✤ RDF ought to have been very easy to grasp, but some things have
resisted simple explanations and caused a lot of confusion.
✤ Thinking about this reveals a basic gap in the RDF conceptual model,
one that we simply missed.
✤ Filling that gap properly makes RDF simpler, more rational, more
useful and vastly more expressive.
✤ This talk is about that important, basic idea.
3. RDF Redux
✤ There are a lot of things about RDF that, it is now clear, could have
been done better.
✤ allowing literals in subject position
✤ treating literals uniformly
✤ providing names for RDF graphs
✤ etc.
✤ But what has given the most grief is RDF blank nodes.
4. Why are blank nodes so hard to get right?
✤ RDF abstract syntax is a
node-arc diagram
✤ and blank nodes are just
nodes that have no label.
✤ That seems pretty
obvious
5. Why are blank nodes so hard to get right?
✤ blank nodes are just nodes that have no label.
✤ That seems pretty obvious.
But it's not so obvious how to say this mathematically.
The RDF spec uses set language: it says that an RDF graph
is a set of triples, and that blank nodes are elements of a set
of items disjoint from URIs and literals.
But there is something fundamentally wrong with this 'set'
approach.
6. Why are blank nodes so hard to get right?
✤ Mathematical sets aren't the right kind of thing to make syntax out of.
✤ Sets exist in a Platonic universe of abstractions. There is no
type/token distinction. You can't copy a set. You can't write or transmit a set.
You can't put a set on a Web server.
✤ There are unresolved puzzles. Is any set of triples an RDF graph?
What defines the 'boundary' of an RDF graph? Etc.
✤ Elements of sets have very tight identity conditions, even blank
nodes. Merging/union train wreck.
7. A blank node is a mark on a surface.
✤ What is missing in RDF concepts is something to capture the intuition
that an RDF graph is like a diagram. (Not a 'mathematical' graph!)
✤ RDF graphs are drawn on surfaces. Blank nodes are marks on the
surface. Intuitively, think of a surface as a piece of paper, or a screen,
or a document.
✤ Surfaces provide the missing type/token distinction. Putting the same
graph onto a new surface is like making a copy. But copying a graph
onto a new surface always gets you new blank nodes, because a mark
can only be on one surface. Aha!
8. A blank node is a mark, on a surface.
✤ Formally. Take the RDF concepts as published, add a set of surfaces,
disjoint from all the others, and a functional property of 'being on'
between blank nodes (call them marks for emphasis) and surfaces.
Call the set of marks on a surface the graffiti of the surface. Define a
graph to be a pair of an RDF graph G and a surface S such that the
blank nodes of G are a subset of the graffiti of S. The triples of a graph
are the triples of the RDF graph. We will say that the triples of the
graph, and the URIs and literals which occur in the RDF graph are on
the surface.
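The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names Surface, Mark and Graph, the method names, and the use of plain strings for URIs are all hypothetical and not part of any RDF specification or library.

```python
class Surface:
    """Something a graph can be drawn on: a page, a screen, a document."""

    def __init__(self, name):
        self.name = name
        self.graffiti = set()  # all the marks on this surface

    def new_mark(self):
        mark = Mark(self)
        self.graffiti.add(mark)
        return mark


class Mark:
    """A blank node. 'Being on' is functional: one mark, one surface."""

    def __init__(self, surface):
        self.surface = surface


class Graph:
    """An RDF graph (a set of triples) paired with a surface."""

    def __init__(self, triples, surface):
        self.triples = frozenset(triples)
        self.surface = surface
        # The blank nodes of G must be a subset of the graffiti of S.
        if not self.blank_nodes() <= surface.graffiti:
            raise ValueError("every blank node must be a mark on the surface")

    def blank_nodes(self):
        return {t for triple in self.triples for t in triple if isinstance(t, Mark)}


page = Surface("page-1")
x = page.new_mark()
g = Graph({(x, "rdf:type", "ex:City")}, page)
```

Trying to build a Graph from the same triple on a different Surface raises ValueError, because the mark x is on page-1 only.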
9. A blank node is a mark, on a surface.
✤ (From now on, 'graph' means RDF-graph-plus-surface.)
✤ A graph can have extra marks, but they don't mean anything so are
harmless (technically, they say that something exists.)
✤ A surface can have more than one graph on it, but a graph cannot be
split over multiple surfaces. (Contrast RDF graph.)
✤ Even with no blank nodes, each graph is on a single surface.
✤ A copy of a graph <G, S> is a graph <G', S'> such that there is a 1:1
map m from the marks of S to those of S' and G'=m(G)
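As a rough sketch of the copy operation, assuming hypothetical Surface and Mark classes and plain strings for URIs and literals, the map m can be built by minting one fresh mark on the target surface for each mark in the original. For brevity this sketch maps only the marks that actually occur in the triples; the example data is made up.

```python
class Surface:
    def __init__(self, name):
        self.name = name


class Mark:
    """A blank node; it remembers the one surface it is on."""

    def __init__(self, surface):
        self.surface = surface


def marks_of(triples):
    return {t for triple in triples for t in triple if isinstance(t, Mark)}


def copy_graph(triples, target):
    """Return (G', m): the copied triples and the 1:1 map from old to new marks."""
    m = {old: Mark(target) for old in marks_of(triples)}
    copied = {tuple(m.get(term, term) for term in triple) for triple in triples}
    return copied, m


s1, s2 = Surface("original"), Surface("copy")
x = Mark(s1)
g = {(x, "rdf:type", "ex:City"), (x, "ex:name", "Pensacola")}
g_copy, m = copy_graph(g, s2)
```

The copy has the same shape as the original, but its blank node is a new mark on the new surface, so the two graphs share no marks.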
10. Surfaces are a good idea.
✤ Surfaces make sense of RDF syntax, while keeping it abstract. They also provide a neat
abstraction for some Webbish notions.
✤ Surfaces provide the missing type/token distinction, and make sense of the ideas of copying
and transmitting (= copying onto a distant surface) RDF graphs.
✤ Surfaces get rid of the merge/union distinction. A conjunction of two graphs is a graph got by
copying them both onto a single surface. (No need to "standardize apart")
✤ Surfaces provide a way to define syntactic scope in RDF. Graphs have a natural 'boundary'.
✤ The URI of a named graph identifies a graph. (Not an RDF graph!)
✤ Surfaces provide a way to track 'dynamic' RDF graphs. The surface retains its identity through
RDF graph changes.
✤ Surfaces handily resolve tricky bnode-scoping issues e.g. in SPARQL. The query, the reference
graph and the answers are all on distinct surfaces: end of story.
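The claim that conjunction needs no "standardizing apart" can be illustrated with a small sketch. Here a mark is modelled, purely as an assumption for this example, as a tuple ("mark", surface, id); the function names copy_onto and conjunction are hypothetical.

```python
from itertools import count

_fresh = count()  # source of new mark identifiers


def is_mark(term):
    return isinstance(term, tuple) and len(term) == 3 and term[0] == "mark"


def copy_onto(triples, surface):
    """Copy triples onto `surface`, giving every old mark a fresh counterpart."""
    mapping = {}
    copied = set()
    for triple in triples:
        new_triple = []
        for term in triple:
            if is_mark(term):
                if term not in mapping:
                    mapping[term] = ("mark", surface, next(_fresh))
                term = mapping[term]
            new_triple.append(term)
        copied.add(tuple(new_triple))
    return copied


def conjunction(graph_a, graph_b, surface):
    """Conjoin two graphs by copying both onto one surface."""
    return copy_onto(graph_a, surface) | copy_onto(graph_b, surface)


a = {(("mark", "doc-a", "x"), "rdf:type", "ex:City")}
b = {(("mark", "doc-b", "x"), "rdf:type", "ex:HumanCommunity")}
both = conjunction(a, b, "doc-c")
```

Each call to copy_onto uses its own mapping, so marks from the two source graphs can never be accidentally identified, whatever labels they carried.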
11. Surfaces are a very good idea.
✤ By allowing different kinds of surface, we can encode different assertional
modes. For example, the surface can assert the graph or deny the graph or just
display the graph without making claims about its truth either way. None of this
changes the RDF semantics of RDF graphs!
✤ Once we have denial and scoping, we have negation. RDF already has conjunction
and the existential quantifier (blank nodes). This gives a graphical syntax for full
first-order logic, if we have the freedom to combine them properly.
✤ Using a graph syntax for logic is one of the oldest ideas (C.S.Peirce, 1885) and very
well understood. See http://www.flickr.com/photos/lilitupili/260552781/
✤ ((p => a) & (q => b)) => ((p & q) => (a & b))
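The formula on this slide is a propositional tautology, which can be confirmed by brute force over all sixteen truth assignments. A minimal check, with hypothetical helper names:

```python
from itertools import product


def implies(x, y):
    return (not x) or y


def formula(p, q, a, b):
    # ((p => a) & (q => b)) => ((p & q) => (a & b))
    return implies(implies(p, a) and implies(q, b), implies(p and q, a and b))


is_tautology = all(formula(*values) for values in product([False, True], repeat=4))
```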
12. Kinds of surface.
✤ Think colored paper.
✤ Positive surfaces claim that an RDF graph on them is true. This is the
current RDF default assumption. (If we only allow positive surfaces, this
is just current RDF but with a cleaner conceptual model.)
✤ Negative surfaces claim that an RDF graph on them is false.
✤ Neutral surfaces simply make no claims at all about their graphs.
(Good place to put e.g. RDF collection triples in OWL/RDF.)
✤ We can imagine others (deprecating surfaces?) but this will do for
now.
13. Kinds of surface.
✤ Because RDF graphs retain their current RDF semantics, marks on a
negative surface are more like universally quantified variables.
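✤ De Morgan's law: (not (exists x ...)) = (forall x (not ...))
✤ [Figure: the triple _:x rdf:type ex:oddities drawn on a positive and on a negative surface]
On a positive surface: oddities exist.
On a negative surface: not(oddities exist), i.e. nothing is an oddity, i.e. everything is not an oddity.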
14. Surfaces on surfaces: RDF codices.
✤ In order to get the full power of logic, we need a way to include
surfaces inside other surfaces.
✤ Extend the abstract RDF-surface model to allow surfaces, as well as
nodes and triples, to be on a surface.
✤ A finite set of surfaces tree-ordered by 'on' is a codex. Extending RDF to
allow graphs on codices instead of (simple) surfaces makes it into
Peirce conceptual graph notation, giving it the power of full FOL (in
fact, of ISO Common Logic.)
15. Surfaces on surfaces: RDF codices.
✤ Putting RDF graphs on a codex requires that we are precise about
exactly which surface each node of each triple in the graph is on.
✤ Every city is a human community.
[Figure: diagram with two rdf:type arcs, pointing to ex:City and ex:HumanCommunity]
✤ Some non-city is a human community.
[Figure: diagram with two rdf:type arcs, pointing to ex:HumanCommunity and ex:City]
16. Surfaces on surfaces: RDF codices.
✤ Putting RDF graphs on a codex requires that we are precise about
exactly which surface each node of each triple in the graph is on. This
is easy to do graphically:
[Figure: diagram with two rdf:type arcs, pointing to ex:City and ex:HumanCommunity]
✤ Not( exists something which is a City and Not(a HumanCommunity))
✤ Every City is a HumanCommunity
✤ ex:City rdfs:subClassOf ex:HumanCommunity .
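The reading Not(exists a City which is Not(a HumanCommunity)) can be tested against a small finite interpretation. The following is a toy evaluator, not the RDF semantics: the dictionary layout for surfaces, the function names, and the example facts are all assumptions made for illustration. A surface holds if its marks can be assigned values so that all its triples are facts and all its inner surfaces make true claims; a negative surface claims the opposite.

```python
from itertools import product


def holds(surface, facts, domain, env=None):
    """True if some assignment of the surface's marks satisfies its content."""
    env = env or {}
    for values in product(domain, repeat=len(surface["marks"])):
        local = {**env, **dict(zip(surface["marks"], values))}
        triples_ok = all(
            tuple(local.get(term, term) for term in triple) in facts
            for triple in surface["triples"]
        )
        if triples_ok and all(claims(s, facts, domain, local) for s in surface["inner"]):
            return True
    return False


def claims(surface, facts, domain, env=None):
    """A negative surface denies its content; a positive one asserts it."""
    result = holds(surface, facts, domain, env)
    return not result if surface["kind"] == "negative" else result


every_city_is_a_community = {
    "kind": "negative", "marks": ["_:x"],
    "triples": [("_:x", "rdf:type", "ex:City")],
    "inner": [{
        "kind": "negative", "marks": [],
        "triples": [("_:x", "rdf:type", "ex:HumanCommunity")],
        "inner": [],
    }],
}

domain = ["ex:Pensacola", "ex:Fido"]
good_facts = {
    ("ex:Pensacola", "rdf:type", "ex:City"),
    ("ex:Pensacola", "rdf:type", "ex:HumanCommunity"),
}
bad_facts = {("ex:Pensacola", "rdf:type", "ex:City")}

ok = claims(every_city_is_a_community, good_facts, domain)
violated = claims(every_city_is_a_community, bad_facts, domain)
```

With good_facts the nested denial comes out true; with bad_facts there is a city that is not a human community, so it comes out false.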
17. Surfaces on surfaces: RDF codices.
[Figure: diagram with two rdf:type arcs, pointing to ex:City and ex:HumanCommunity]
✤ ex:City rdfs:subClassOf ex:HumanCommunity .
✤ This graph now has its RDFS meaning in RDF already. The RDF
semantics defines the RDFS meaning. It is not a "semantic extension".
✤ We can do this for all of OWL and RIF. With just this much extra
apparatus, RDF can be a complete semantic framework for all Web logics.
18. Surfaces on surfaces: RDF codices.
✤ Graphical convention (used already): an RDF triple is attached to a
surface by its property arc label. The subject and object nodes might
be on other surfaces.
[Figure: diagram with two rdf:type arcs, pointing to ex:City and ex:HumanCommunity]
19. Surfaces on surfaces: RDF codices.
✤ Text convention: add 'surface parentheses' and explicit bnode binding
syntax to Ntriples or Turtle.
[Figure: diagram with a mark _:x and the triples _:x rdf:type ex:City and _:x rdf:type ex:HumanCommunity]
✤ In text form:
%not[ _:x
  _:x rdf:type ex:City .
  %not[
    _:x rdf:type ex:HumanCommunity .
  %]
%]
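To show that the surface-parenthesis notation is easy to process, here is a minimal line-oriented reader for it. The slides only show examples, so the grammar assumed here is a guess: one construct per line, "%not[" followed by the marks it binds, "%]" (possibly repeated, as in "%]]") to close surfaces, and whitespace-free terms in triples. The function name and the dictionary layout are hypothetical.

```python
def parse_surfaces(text):
    """Read the '%not[ ... %]' notation into nested dictionaries."""
    root = {"kind": "positive", "marks": [], "triples": [], "inner": []}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("%not["):
            surface = {
                "kind": "negative",
                "marks": line[len("%not["):].split(),  # marks bound here
                "triples": [],
                "inner": [],
            }
            stack[-1]["inner"].append(surface)
            stack.append(surface)
        elif line.startswith("%]"):
            for _ in range(line.count("]")):  # "%]]" closes two surfaces
                if len(stack) == 1:
                    raise ValueError("unbalanced %]")
                stack.pop()
        elif line.endswith("."):
            subject, predicate, obj = line[:-1].split()
            stack[-1]["triples"].append((subject, predicate, obj))
        else:
            raise ValueError(f"cannot read line: {line!r}")
    if len(stack) != 1:
        raise ValueError("unclosed %not[")
    return root


example = """
%not[ _:x
  _:x rdf:type ex:City .
  %not[
    _:x rdf:type ex:HumanCommunity .
  %]
%]
"""
codex = parse_surfaces(example)
```

The result is a positive outer surface containing one negative surface that binds _:x, which in turn contains one inner negative surface.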
21. Abbreviations may not be very easy to read, but they do work.
aaa rdfs:range bbb .
==>>
%not[ _:x _:y
  _:x aaa _:y .
  %not[
    _:y rdf:type bbb .
%]]
aaa is bbb owl:allValuesFrom ccc .
==>>
%not[ _:x _:y
  _:x rdf:type aaa .
  _:x bbb _:y .
  %not[
    _:y rdf:type ccc .
%]]
%not[ _:x
  %not[ _:y
    _:x bbb _:y .
    %not[
      _:y rdf:type ccc .
  %]]
  %not[
    _:x rdf:type aaa .
%]]
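Treating an RDFS or OWL construct as an abbreviation amounts to a simple macro expansion. As a sketch, the rdfs:range case above could be generated as follows; the function name expand_range and the indentation of the output are assumptions for illustration only.

```python
def expand_range(prop, cls):
    """Expand 'prop rdfs:range cls .' into nested negative surfaces."""
    return "\n".join([
        "%not[ _:x _:y",
        f"  _:x {prop} _:y .",
        "  %not[",
        f"    _:y rdf:type {cls} .",
        "%]]",
    ])


expansion = expand_range("aaa", "bbb")
```

Read from the outside in, the expansion says that nothing stands in the aaa relation to something that is not a bbb, which is the usual meaning of rdfs:range.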
22. Semantic OWL/RDF
✤ Currently, OWL has its semantics and so does RDF and this is a
problem. Getting them to align properly is difficult and fiddly.
OWL/RDF is essentially a surface syntactic mapping.
✤ When RDF is fully expressive, we can simply encode OWL meanings
directly in RDF, using the RDF semantics rather than ignoring it. Then,
OWL (and much of RIF) are simply organized collections of RDF
abbreviations and restrictions. There is no extra semantics, and one
engine can process all the semantic notations uniformly.
✤ The RDF+surfaces conceptual model provides a single, universal
interchange format for (nearly) all SWeb languages, with a single,
uniform semantic model.
23. A bigger base for the layer cake.
Some of RIF is outside normal logic. SPARQL is a law unto itself.
The rest is (revised) RDF with syntactic sugar and restrictions.
24. Resources
✤ Peircean graphical logic has been widely used, see
http://conceptualgraphs.org/, and even standardized (ISO 24707
App. B).
✤ John Sowa has been very active in this area, and I have used his ideas
at key places. See http://www.jfsowa.com/cg/index.htm