Talk by Sören Auer: Potentials and benefits of linked open data - given at the (Linked) Open Data MeetUp at The Waag in Amsterdam on 24 March 2013. See also: http://bit.ly/146hu1B
12. Publishing Data about Kindergartens in XML (1)
Creating Knowledge out of Interlinked Data
<kindergarten>
<name>Seven Dwarfs</name>
<location>...</location>
<description>...</description>
</kindergarten>
13. Publishing Data about Kindergartens in XML (2)
<child_care name="Seven Dwarfs">
<address>
<street>...</street>
<zip>...</zip>
</address>
<text>...</text>
</child_care>
14. Publishing Data about Kindergartens in XML (3)
<daycare id="Seven Dwarfs"
address="...">
. . .
</daycare>
15. [Slide shows the three XML documents from slides 12 to 14 side by side.]
Syntactic heterogeneity: different trees
Semantic heterogeneity: different tags and attributes (e.g. kindergarten, child_care, daycare)
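To make the cost of this heterogeneity concrete, here is a minimal Python sketch (not part of the talk) that reads the name of the kindergarten from each of the three layouts. The constant and function names are illustrative assumptions, and the elided values ("...") are kept as they appear on the slides.

```python
import xml.etree.ElementTree as ET

# The three layouts from slides 12 to 14; "..." placeholders are kept as on the slides.
DOC_A = ("<kindergarten><name>Seven Dwarfs</name><location>...</location>"
         "<description>...</description></kindergarten>")
DOC_B = ('<child_care name="Seven Dwarfs"><address><street>...</street>'
         "<zip>...</zip></address><text>...</text></child_care>")
DOC_C = '<daycare id="Seven Dwarfs" address="..."></daycare>'


def extract_name(xml_text):
    """Return the facility name; every schema needs its own branch."""
    root = ET.fromstring(xml_text)
    if root.tag == "kindergarten":
        return root.findtext("name")   # name is a child element
    if root.tag == "child_care":
        return root.get("name")        # name is an attribute
    if root.tag == "daycare":
        return root.get("id")          # name is stored as an id attribute
    raise ValueError(f"unknown schema: {root.tag}")


names = [extract_name(doc) for doc in (DOC_A, DOC_B, DOC_C)]
```

Each new publisher adds another branch to `extract_name`, which is the scaling problem the following slides describe.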
16. Maybe CSV helps?

Kindergarten   Location                 Description   …
Seven Dwarfs   Rosentalgasse 9, 04105   …             …
…              …                        …             …

Child_care     street          Zip     text
Seven Dwarfs   Rosentalgasse   04105   …
…              …               …       …

Type      Name           Location              Features
Daycare   Seven Dwarfs   42.052384|13.273679   …
…         …              …                     …
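The same problem can be sketched for CSV. The snippet below is an illustration only: the semicolon delimiter, the `NAME_COLUMN` lookup and the file keys are assumptions, not something the slides specify. It shows that a consumer must know, per file, which column holds the name.

```python
import csv
import io

# Three hypothetical files modelled on the tables above (semicolon-delimited by assumption).
FILES = {
    "file1": "Kindergarten;Location;Description\nSeven Dwarfs;Rosentalgasse 9, 04105;...\n",
    "file2": "Child_care;street;Zip;text\nSeven Dwarfs;Rosentalgasse;04105;...\n",
    "file3": "Type;Name;Location;Features\nDaycare;Seven Dwarfs;42.052384|13.273679;...\n",
}

# Hand-written knowledge about each file: which column carries the facility name.
NAME_COLUMN = {"file1": "Kindergarten", "file2": "Child_care", "file3": "Name"}


def read_names(text, name_column):
    """Read one CSV text and return the values of the given name column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [row[name_column] for row in reader]


csv_names = {key: read_names(text, NAME_COLUMN[key]) for key, text in FILES.items()}
```

Note that "Location" is a street address in the first file and a coordinate pair in the third, so even identical column names do not guarantee identical meaning.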
17. A nightmare …
Imagine you have 10,000 open data files describing child care from communities all over Europe, all in different XML, CSV, Excel, JSON, … formats
And then you want to look into pollution, road congestion, health care, …
18. Distribution of file formats at PublicData.eu
19. How can we fix open data?
20. How can we fix Open Data?
• Increasing data literacy???
• Organizing hackdays, hackathons???
• Publish more data???
Yes, but this won't scale
We also need:
• Standard formats which preserve semantics: RDF
• Reuse of vocabularies
• Visualization widgets, mashups, apps which can make sense of those vocabularies
21. Linked Data in a Nutshell
1. Uses the RDF data model: Subject, Predicate, Object
   [Graph on slide: OKFN organizes LOD-MeetUp; LOD-MeetUp starts 24.3.2013; LOD-MeetUp takesPlaceIn Amsterdam]
2. Is serialised in triples:
OKFN organizes LOD-MeetUp .
LOD-MeetUp starts "2013-03-24"^^xsd:date .
LOD-MeetUp takesPlaceAt Amsterdam .
3. Uses content negotiation
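As a rough illustration of the triple model (a sketch in plain Python, not an RDF library), the three statements can be held as (subject, predicate, object) tuples and queried with a simple pattern match. The `match` helper is a hypothetical name introduced here, and the date is written in ISO form.

```python
# The three statements from the slide as (subject, predicate, object) tuples.
TRIPLES = [
    ("OKFN", "organizes", "LOD-MeetUp"),
    ("LOD-MeetUp", "starts", "2013-03-24"),
    ("LOD-MeetUp", "takesPlaceAt", "Amsterdam"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples that agree with the given pattern; None is a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything known about the meetup, and who organizes it.
about_meetup = match(TRIPLES, s="LOD-MeetUp")
organizers = [t[0] for t in match(TRIPLES, p="organizes", o="LOD-MeetUp")]
```

Because every statement has the same three-part shape, one generic `match` function serves any dataset, in contrast to the per-schema code that XML and CSV required.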
22. 7 Dwarfs in RDF
Seven_Dwarfs rdf:type Kindergarten
Seven_Dwarfs rdfs:label "Seven Dwarfs"
Seven_Dwarfs foaf:location "Rosentalgasse 9"
Seven_Dwarfs rdfs:description "..."
...
Different Kindergarten descriptions might also look different, but there will definitely be less variety than with XML or CSV
You can mix and match different vocabularies (RDF, RDFS, FOAF)
More information can be added without destroying the data structure
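The last point can be sketched in a few lines of Python, treating each source as a set of triples. This is an illustration under assumptions: the property names are copied from the slide as written, while `ex:openingHours` and its value are hypothetical additions.

```python
# Triples from one publisher (as on the slide).
source_a = {
    ("Seven_Dwarfs", "rdf:type", "Kindergarten"),
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
}

# Triples from a second publisher; ex:openingHours is a hypothetical extra property.
source_b = {
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
    ("Seven_Dwarfs", "foaf:location", "Rosentalgasse 9"),
    ("Seven_Dwarfs", "ex:openingHours", "hypothetical value"),
}

# Merging is a plain set union: duplicates collapse, nothing has to be restructured.
merged = source_a | source_b


def describe(triples, subject):
    """List all (predicate, object) pairs known about one subject, sorted for stable output."""
    return sorted((p, o) for s, p, o in triples if s == subject)


description = describe(merged, "Seven_Dwarfs")
```

Adding the new property required no schema change and did not disturb the existing statements, which is harder to achieve with a fixed XML tree or CSV column layout.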
23. What has to be done?
• Publish Open Data in RDF, reusing vocabularies which can be understood and combined by apps in unforeseen ways (e.g. visualization widgets)
Staircase on the slide, from the top step down (annotated "Where we should be" near the top and "Where we are now" further down):
• link your data
• use URIs to denote things
• use non-proprietary formats (e.g., CSV instead of Excel)
• make it available as structured data (e.g., Excel instead of image scan of a table)
• make your stuff available on the Web (whatever format) under an open license
24. How can we lift Open Data to Linked Open Data?
25. All CSV on PublicData.eu is transformed into RDF
27. Mapping Wiki
• Automatic CSV to RDF transformation won't yield good results
• Mappings Wiki enables the crowdsourcing of mappings
28. CSV2RDF Mapping Syntax
{{CSV2RDFHeader}}

...

{{RelCSV2RDF
| name = default-mapping
| header = 1
| omitRows = -1
| omitCols = -1
| delimiter =
| col1 = Department Family
| col2 = Entity
| col3 = Payment Date^^xsd:date
| col4 = rdf:type
| col5 = Cost Centre Name
| col6 = Supplier
| col7 = Transaction No.
| col8 = Line Amount
| col9 = Invoice Total^^xsd:decimal
}}
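How such a column mapping might be applied can be sketched as follows. This is not the CSV2RDF tool's implementation: the function names, the `row/` subject scheme, the comma delimiter and the sample CSV row are all assumptions made for illustration. Only the `colN = property^^datatype` idea is taken from the mapping above.

```python
import csv
import io

# A subset of the mapping above: column number -> property, optionally with a datatype.
MAPPING = {1: "Department Family", 2: "Entity", 3: "Payment Date^^xsd:date"}

# Hypothetical input data; the first line is the header row ("header = 1").
SAMPLE_CSV = (
    "Department Family,Entity,Payment Date\n"
    "Example Department,Example Entity,2013-03-24\n"
)


def parse_column_spec(spec):
    """Split 'Payment Date^^xsd:date' into ('Payment Date', 'xsd:date')."""
    name, _, datatype = spec.partition("^^")
    return name.strip(), (datatype.strip() or None)


def rows_to_triples(csv_text, mapping, base="row/"):
    """Turn every data row into one triple per mapped column."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the single header row
    triples = []
    for row_number, row in enumerate(reader, start=1):
        subject = f"{base}{row_number}"  # hypothetical subject naming scheme
        for col_index, spec in sorted(mapping.items()):
            prop, datatype = parse_column_spec(spec)
            value = row[col_index - 1]  # mapping columns are 1-based
            literal = f'"{value}"^^{datatype}' if datatype else f'"{value}"'
            triples.append((subject, prop, literal))
    return triples


mapped = rows_to_triples(SAMPLE_CSV, MAPPING)
```

The mapping carries the knowledge that a purely automatic conversion lacks, such as which column is a date, which is why crowdsourcing it through a wiki is attractive.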
29. How can we make this happen?
[Layered architecture diagram, from top to bottom:]
Exploration: SemMap, OntoWiki
Widgets: domain-specific visualizations, spatial faceted browsing, faceted browsing, statistical visualization, entity-/faceted-based browsing, …
Data Portal
• Dataset analysis (size, vocabularies, properties)
• Selection of suitable visualization widgets
Open Datasets
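The two Data Portal steps could look roughly like the sketch below. The talk does not describe how the portal implements them, so the rule table, the function names and the sample triples are assumptions. `geo:lat` and `qb:dataSet` are used as typical markers of spatial and statistical data.

```python
from collections import Counter

# Hypothetical rules: a property that, if present, suggests a kind of widget.
WIDGET_RULES = {
    "geo:lat": "map widget",
    "qb:dataSet": "chart widget",
}

# Hypothetical dataset with prefixed property names.
DATASET = [
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
    ("Seven_Dwarfs", "geo:lat", "42.052384"),
    ("Seven_Dwarfs", "geo:long", "13.273679"),
]


def analyse(triples):
    """Report size, vocabularies (prefixes) and property usage of a dataset."""
    properties = Counter(p for _, p, _ in triples)
    vocabularies = sorted({p.split(":", 1)[0] for p in properties})
    return {
        "size": len(triples),
        "vocabularies": vocabularies,
        "properties": dict(properties),
    }


def suggest_widgets(analysis):
    """Pick every widget whose trigger property occurs in the dataset."""
    return sorted({w for p, w in WIDGET_RULES.items() if p in analysis["properties"]})


analysis = analyse(DATASET)
widgets = suggest_widgets(analysis)
```

Selection works only because the dataset reuses known vocabularies. With ad-hoc column names there would be nothing for the rules to match.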
31. Browsing Spatial Data with SemMap
32. LOD Lifecycle supported by Debian-based LOD2 Stack
http://stack.lod2.eu
Lifecycle stages (shown as a cycle on the slide):
• Extraction
• Storage/Querying
• Manual revision/authoring
• Interlinking/Fusing
• Classification/Enrichment
• Quality Analysis
• Evolution/Repair
• Search/Browsing/Exploration
33. Take home
• Open Data will only scale when it is Linked Open Data
• The RDF data model helps to reduce syntactic and semantic heterogeneity
• When Open Data is published as LOD adhering to standard vocabularies, visualization widgets, mashups, apps etc. can be applied to the data at runtime and in possibly unforeseen ways
• By ultimately lowering the barriers to entry and usage, LOD will facilitate long-tail applications
34. Thank You!!!
http://lod2.eu
http://aksw.org
35. The emerging Web of Data
[Figure: snapshots of the Linking Open Data cloud diagram from 2007 to 2010, by Richard Cyganiak and Anja Jentzsch.]
36. Why do we need Linked Open Data?
Problem: Try to search for these things on the current Web:
• Apartments near German-English bilingual childcare in Leipzig
• ERP service providers with offices in Vienna and London
• Researchers working on multimedia topics in Eastern Europe
Information is available on the Web, but opaque to current search.
Solution: complement text on Web pages with structured linked
open data & intelligently combine/integrate/join such structured
information from different sources:
[Diagram: a search engine reads HTML and RDF from two Web servers, each backed by a DB:]
• leipzig.de: has everything about childcare in Leipzig
• Immobilienscout.de: knows all about real estate offers in Germany
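The first example query can be sketched as a join over triples from both sources. All data below is invented for illustration, the `ex:` names are hypothetical, and "near" is approximated by "in the same district", which is an assumption of this sketch rather than anything the talk specifies.

```python
# Hypothetical triples as they might be published by a city portal.
CHILDCARE = {
    ("ex:Seven_Dwarfs", "ex:bilingual", "German-English"),
    ("ex:Seven_Dwarfs", "ex:district", "ex:District_A"),
}

# Hypothetical triples as they might be published by a real estate site.
REAL_ESTATE = {
    ("ex:Apartment_1", "ex:district", "ex:District_A"),
    ("ex:Apartment_2", "ex:district", "ex:District_B"),
}


def apartments_near_bilingual_childcare(childcare, real_estate):
    """Join both sources on the shared district identifier."""
    bilingual = {
        s for s, p, o in childcare
        if p == "ex:bilingual" and o == "German-English"
    }
    districts = {
        o for s, p, o in childcare
        if p == "ex:district" and s in bilingual
    }
    return sorted(
        s for s, p, o in real_estate
        if p == "ex:district" and o in districts
    )


result = apartments_near_bilingual_childcare(CHILDCARE, REAL_ESTATE)
```

The join is possible only because both publishers refer to the district by the same identifier, which is the role URIs and links play in Linked Open Data.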