Potentials and Benefits of Linked Open Data

Dr. Sören Auer
What's wrong with Open Data?
Creating Knowledge
out of Interlinked Data
                          1st Resource Download Attempt
                          2nd Resource Download Attempt
                          After installing 7zip for opening .gz files
                          3rd Resource Download Attempt
                          5th Resource Download Attempt

               Giving up
                          Publishing Data about Kindergartens in XML (1)

      <kindergarten>
             <name>Seven Dwarfs</name>
             <location>...</location>
             <description>...</description>
      </kindergarten>

                          Publishing Data about Kindergartens in XML (2)

      <child_care name="Seven Dwarfs">
             <address>
                <street>...</street>
                <zip>...</zip>
             </address>
             <text>...</text>
      </child_care>

                          Publishing Data about Kindergartens in XML (3)

      <daycare id="Seven Dwarfs"
             address="...">
            . . .
      </daycare>
      The same kindergarten, described in three ways:
      <kindergarten> with <name>, <location> and <description> child elements
      <child_care name="Seven Dwarfs"> with a nested <address> (<street>, <zip>) and a <text> element
      <daycare id="Seven Dwarfs" address="..."> with name and address as attributes

      Syntactic heterogeneity – different trees
      Semantic heterogeneity – different tags and
      attributes (e.g. kindergarten, child_care,
      daycare)
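
To make the cost of this heterogeneity concrete, here is a minimal Python sketch that reads the kindergarten name out of the three XML variants shown on the slides. The function names and the dispatch table are illustrative assumptions, not part of the talk; the point is only that every new layout needs its own extraction rule.

```python
import xml.etree.ElementTree as ET

# The three XML variants from the slides, reduced to the parts needed here.
DOC_1 = "<kindergarten><name>Seven Dwarfs</name><location>...</location></kindergarten>"
DOC_2 = '<child_care name="Seven Dwarfs"><address><street>...</street></address></child_care>'
DOC_3 = '<daycare id="Seven Dwarfs" address="..."></daycare>'


def name_from_kindergarten(root):
    # Variant 1 stores the name in a child element.
    return root.findtext("name")


def name_from_child_care(root):
    # Variant 2 stores the name in a "name" attribute.
    return root.get("name")


def name_from_daycare(root):
    # Variant 3 stores the name in an "id" attribute.
    return root.get("id")


# One hand-written rule per publisher layout (hypothetical helper table).
EXTRACTORS = {
    "kindergarten": name_from_kindergarten,
    "child_care": name_from_child_care,
    "daycare": name_from_daycare,
}


def extract_name(xml_text):
    root = ET.fromstring(xml_text)
    return EXTRACTORS[root.tag](root)


names = [extract_name(doc) for doc in (DOC_1, DOC_2, DOC_3)]
print(names)
```

With thousands of publishers, a table like `EXTRACTORS` would need one entry per layout, which is the scaling problem the following slides describe.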
                          Maybe CSV helps?

       Kindergarten    Location                  Description    …
       Seven Dwarfs    Rosentalgasse 9, 04105    …              …
       …               …                         …              …

       Child_care      street           Zip      text
       Seven Dwarfs    Rosentalgasse    04105    …
       …               …                …        …

       Type       Name            Location               Features
       Daycare    Seven Dwarfs    42.052384|13.273679    …
       …          …               …                      …
                          A nightmare …

      Imagine you have 10,000 open data files
      describing child care from communities all
      over Europe, all in different XML, CSV, Excel,
      JSON, … formats

      And then you want to look into pollution, road
      congestion, health care, …
                          Distribution of file formats at PublicData.eu

               How can we fix open data?



      •     Increasing data literacy???
      •     Organizing hackdays, hackathons???
      •     Publishing more data???
      Yes, but this won't scale


      We also need:
      •     Standard formats that preserve semantics: RDF
      •     Reused vocabularies
      •     Visualization widgets, mashups, apps that can make
            sense of those vocabularies
Linked Data in a Nutshell

      1. Uses the RDF data model (subject, predicate, object):
            OKFN          organizes       LOD-MeetUp
            LOD-MeetUp    starts          24.3.2013
            LOD-MeetUp    takesPlaceIn    Amsterdam
      2. Is serialised in triples:
            OKFN          organizes       LOD-MeetUp .
            LOD-MeetUp    starts          "2013-03-24"^^xsd:date .
            LOD-MeetUp    takesPlaceIn    Amsterdam .
      3. Uses content negotiation
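
As a rough illustration of point 2, the following Python sketch writes the three statements about the meetup in an N-Triples style, one statement per line. The namespace `http://example.org/` is a placeholder assumption (the slide gives no URIs), and the date is written in the ISO form `2013-03-24` that `xsd:date` expects.

```python
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
BASE = "http://example.org/"  # placeholder namespace, not taken from the slides


def iri(local_name):
    """Wrap a local name from the assumed namespace in angle brackets."""
    return f"<{BASE}{local_name}>"


def typed_literal(value, datatype):
    """Render a typed literal, e.g. "2013-03-24"^^<...#date>."""
    return f'"{value}"^^<{datatype}>'


triples = [
    (iri("OKFN"), iri("organizes"), iri("LOD-MeetUp")),
    (iri("LOD-MeetUp"), iri("starts"), typed_literal("2013-03-24", XSD_DATE)),
    (iri("LOD-MeetUp"), iri("takesPlaceIn"), iri("Amsterdam")),
]


def to_ntriples(rows):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


print(to_ntriples(triples))
```

Because every statement is self-contained, statements can be appended, filtered, or reordered without touching the rest of the data.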
                          7 Dwarfs in RDF

      Seven_Dwarfs          rdf:type               Kindergarten
      Seven_Dwarfs          rdfs:label             "Seven Dwarfs"
      Seven_Dwarfs          foaf:location          "Rosentalgasse 9"
      Seven_Dwarfs          rdfs:description       "..."
      ...


      Kindergarten descriptions from different publishers might still differ, but
      there will definitely be less variety than with XML or CSV
      You can mix and match different vocabularies (RDF, RDFS, FOAF)
      More information can be added without destroying the data structure
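
The last point can be sketched in a few lines of Python, treating a dataset as a set of (subject, predicate, object) tuples. The second source and its `ex:capacity` statement are invented for illustration only; the sketch merely shows that merging is a set union and that existing statements survive unchanged.

```python
# Triples from the slide, kept as plain tuples of prefixed names and literals.
source_a = {
    ("Seven_Dwarfs", "rdf:type", "Kindergarten"),
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
}

# Hypothetical second publisher: repeats one statement and adds a new one.
source_b = {
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
    ("Seven_Dwarfs", "ex:capacity", "40"),  # invented property and value
}

# Merging is a set union: duplicates collapse, nothing has to be restructured.
merged = source_a | source_b


def values(graph, subject, predicate):
    """Return all objects stated for a subject/predicate pair, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


print(len(merged))
print(values(merged, "Seven_Dwarfs", "rdfs:label"))
```

A consumer that only knows `rdfs:label` keeps working after the merge, since the statements it relies on are still present in the same form.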
                             What has to be done?

       • Publish Open Data in RDF, reusing vocabularies that can
           be understood and combined by apps in unforeseen ways
           (e.g. visualization widgets)

       Steps, from the bottom up:
       1. make your stuff available on the Web (whatever format) under an open license
       2. make it available as structured data (e.g., Excel instead of image scan of a table)
       3. use non-proprietary formats (e.g., CSV instead of Excel)
       4. use URIs to denote things
       5. link your data

       The diagram marks "Where we are now" on the lower steps and "Where we should be" further up.
               How can we lift Open Data to Linked
               Open Data?

                          All CSV on PublicData.eu is transformed into RDF

                          Mapping Wiki

 • Automatic CSV to RDF transformation won't render good results
 • The Mapping Wiki enables the crowdsourcing of mappings
                          CSV2RDF Mapping Syntax

                                      {{CSV2RDFHeader}}

                                      ...

                                      {{RelCSV2RDF
                                      | name = default-mapping
                                      | header = 1
                                      | omitRows = -1
                                      | omitCols = -1
                                      | delimiter =
                                      | col1 = Department Family
                                      | col2 = Entity
                                      | col3 = Payment Date^^xsd:date
                                      | col4 = rdf:type
                                      | col5 = Cost Centre Name
                                      | col6 = Supplier
                                      | col7 = Transaction No.
                                      | col8 = Line Amount
                                      | col9 = Invoice Total^^xsd:decimal
                                      }}
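
How such a mapping might be applied to a CSV row can be sketched in Python as follows. This is not the CSV2RDF implementation: it assumes that `colN` names the property for the N-th column and that a `^^xsd:…` suffix declares the datatype of the cell value. The `row1` subject naming and the sample row are invented for illustration.

```python
import csv
import io

# A subset of the mapping from the slide: 1-based column number -> target.
MAPPING = {
    1: "Department Family",
    3: "Payment Date^^xsd:date",
    9: "Invoice Total^^xsd:decimal",
}


def parse_target(target):
    """Split 'Label^^xsd:type' into (label, datatype); datatype may be None."""
    label, _, datatype = target.partition("^^")
    return label, datatype or None


def row_to_statements(row_id, row, mapping):
    """Yield (subject, property, value, datatype) for each mapped, non-empty cell."""
    for col, target in sorted(mapping.items()):
        value = row[col - 1].strip()  # mapping columns are 1-based
        if value:
            label, datatype = parse_target(target)
            yield (f"row{row_id}", label, value, datatype)


# Invented sample row with nine columns, matching the column count of the mapping.
SAMPLE = "Dept A,X,2013-03-24,t,cc,s,1,10.00,99.50\n"

rows = list(csv.reader(io.StringIO(SAMPLE)))
statements = [
    st
    for i, row in enumerate(rows, start=1)
    for st in row_to_statements(i, row, MAPPING)
]
for st in statements:
    print(st)
```

Keeping the mapping as data rather than code is what makes it possible to edit it in a wiki and crowdsource corrections.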
                                How can we make this happen?

     Exploration:      SemMap, OntoWiki
     Widgets:          … domain-specific visualizations, spatial faceted browsing,
                       faceted browsing, statistical visualization,
                       entity-/faceted-based browsing, …
     Data Portal:      • Dataset analysis (size, vocabularies, properties)
                       • Selection of suitable visualization widgets
     Open Datasets
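
The two data portal tasks can be sketched in Python as below. The selection rules are assumptions made for illustration (coordinates suggest a map-style widget, Data Cube observations suggest a statistical one) and are not taken from the LOD2 tools; the sample dataset reuses the kindergarten example, and which coordinate is latitude and which is longitude is also assumed.

```python
from collections import Counter


def analyse(triples):
    """Report dataset size, property usage, and vocabulary (prefix) usage."""
    properties = Counter(p for _, p, _ in triples)
    vocabularies = Counter(p.split(":", 1)[0] for p in properties.elements())
    return {"size": len(triples), "properties": properties, "vocabularies": vocabularies}


def suggest_widgets(triples):
    """Pick widget kinds from the vocabulary terms a dataset uses (assumed rules)."""
    props = {p for _, p, _ in triples}
    types = {o for _, p, o in triples if p == "rdf:type"}
    widgets = []
    if {"geo:lat", "geo:long"} <= props:
        widgets.append("spatial faceted browsing")
    if "qb:Observation" in types:
        widgets.append("statistical visualization")
    widgets.append("faceted browsing")  # generic fallback for any RDF dataset
    return widgets


# Illustrative dataset; the assignment of the two coordinates is an assumption.
dataset = [
    ("Seven_Dwarfs", "rdf:type", "Kindergarten"),
    ("Seven_Dwarfs", "rdfs:label", "Seven Dwarfs"),
    ("Seven_Dwarfs", "geo:lat", "42.052384"),
    ("Seven_Dwarfs", "geo:long", "13.273679"),
]

report = analyse(dataset)
print(report["size"], dict(report["vocabularies"]))
print(suggest_widgets(dataset))
```

The selection only works because the dataset reuses shared vocabulary terms; with ad hoc column names there would be nothing for the rules to match on.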
                          Browsing Statistical Data with CubeViz

                          Browsing Spatial Data with SemMap
                              LOD Lifecycle
                              supported by the Debian-based LOD2 Stack
                                       http://stack.lod2.eu

       Lifecycle stages (arranged in a circle):
       •     Extraction
       •     Storage/Querying
       •     Manual revision/authoring
       •     Interlinking/Fusing
       •     Classification/Enrichment
       •     Quality Analysis
       •     Evolution/Repair
       •     Search/Browsing/Exploration
                          Take home

      •     Open Data will only scale when it is Linked Open Data
      •     The RDF data model helps to reduce syntactic and
            semantic heterogeneity
      •     When Open Data is published as LOD adhering to
            standard vocabularies, visualization widgets, mashups,
            apps etc. can be applied to the data at runtime and in
            possibly unforeseen ways
      •     By ultimately lowering the barriers to entry and use,
            LOD will facilitate long-tail applications




         Thank You!!!

         http://lod2.eu
         http://aksw.org
The emerging Web of Data

[Figure: Linking Open Data cloud diagram, with snapshots labelled 2007, 2008, 2009 and 2010]

Linking Open Data cloud diagram, by
Richard Cyganiak and Anja Jentzsch.
                          Why do we need Linked Open Data?

 Problem: Try to search for these things on the current Web:
 •   Apartments near German-English bilingual childcare in Leipzig
 •   ERP service providers with offices in Vienna and London
 •   Researchers working on multimedia topics in Eastern Europe
 Information is available on the Web, but opaque to current search.

 Solution: complement text on Web pages with structured linked
 open data & intelligently combine/integrate/join such structured
 information from different sources:

 [Diagram: a search engine draws on two Web servers, each backed by a database (DB) and serving both HTML and RDF]
 •   leipzig.de: has everything about childcare in Leipzig.
 •   Immobilienscout.de: knows all about real estate offers in Germany.
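
A toy version of the first query shows what joining structured data from two sources could look like. All triples, the `ex:` property names, and the district identifiers below are invented for illustration; neither site is claimed to publish data in this form, and "near" is simplified to "in the same district".

```python
# Invented triples standing in for RDF published by a childcare source ...
childcare = [
    ("ex:Seven_Dwarfs", "rdf:type", "ex:Kindergarten"),
    ("ex:Seven_Dwarfs", "ex:language", "de"),
    ("ex:Seven_Dwarfs", "ex:language", "en"),
    ("ex:Seven_Dwarfs", "ex:district", "ex:DistrictA"),
    ("ex:Little_Bears", "rdf:type", "ex:Kindergarten"),
    ("ex:Little_Bears", "ex:language", "de"),
    ("ex:Little_Bears", "ex:district", "ex:DistrictB"),
]

# ... and by a real estate source.
real_estate = [
    ("ex:flat1", "rdf:type", "ex:Apartment"),
    ("ex:flat1", "ex:district", "ex:DistrictA"),
    ("ex:flat2", "rdf:type", "ex:Apartment"),
    ("ex:flat2", "ex:district", "ex:DistrictB"),
]


def objects(graph, subject, predicate):
    """All objects stated for a subject/predicate pair."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def subjects_of_type(graph, rdf_type):
    """All subjects declared to be of the given type."""
    return {s for s, p, o in graph if p == "rdf:type" and o == rdf_type}


def apartments_near_bilingual_childcare(childcare_graph, estate_graph):
    """Join the two sources on the shared district identifier."""
    districts = set()
    for facility in subjects_of_type(childcare_graph, "ex:Kindergarten"):
        if {"de", "en"} <= objects(childcare_graph, facility, "ex:language"):
            districts |= objects(childcare_graph, facility, "ex:district")
    return sorted(
        flat
        for flat in subjects_of_type(estate_graph, "ex:Apartment")
        if objects(estate_graph, flat, "ex:district") & districts
    )


print(apartments_near_bilingual_childcare(childcare, real_estate))
```

The join is only possible because both sources use the same identifier for the district, which is the role URIs and links play in Linked Open Data.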

