
Uplift – Generating RDF datasets from non-RDF data with R2RML


R2RML tutorial organized together with the Ordnance Survey Ireland, and with funding from the Irish Open Data Engagement Fund.


  1. 1. Uplift – Generating RDF datasets from non-RDF data with R2RML
     http://bit.ly/uplift-r2rml
     Dr. Christophe Debruyne
  2. 2. BACKGROUND
     •  Resource Description Framework (RDF)
     •  RDF Schema
     •  SPARQL
     •  Linked Data
     04/11/17 christophe.debruyne@adaptcentre.ie
  3. 3. Context
     •  The Open Data Engagement Fund, developed by the Department of Public Expenditure and the Open Data Governance Board, aims to support improving the availability and usage of data on data.gov.ie.
     •  The vast majority of the datasets on the portal are provided as CSV (or XML). These formats are structured, but not necessarily as meaningful as RDF datasets that use recognized vocabularies.
     •  The goal of this seminar is to show how one can generate RDF datasets from the non-RDF datasets available on the open data portal, a process called uplift, using a W3C Recommendation called R2RML (RDB to RDF Mapping Language).
     •  Generating RDF and adopting appropriate vocabularies improves not only the availability of the data, but also its use and accessibility, both at a technical level (querying) and at the level of understanding (semantics).
  4. 4. Background – RDF
     •  RDF stands for Resource Description Framework
     •  RDF is not a language but a model (!!!)
     •  RDF is a W3C Recommendation
     •  RDF is designed to be read by computers
     •  RDF describes resources on the Web in terms of triples (subject, predicate, object), which together form graphs
     •  RDF uses URIs to identify and refer to resources on the Web
     •  RDF/XML is just one way of serializing RDF
        –  We will use the Terse RDF Triple Language (Turtle), which does not support named graphs
        –  N-Quads is a serialization format that does support named graphs
  5. 5. Background – RDF
     Three triples, written out in full:

         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix voc: <http://example.org/example#> .

         <http://christophedebruyne.be> voc:title "Christophe’s Page" .
         <http://christophedebruyne.be> voc:topic "Cats" .
         <http://christophedebruyne.be> voc:owned <http://example.org/#xof> .

     The same triples, sharing one subject with ";":

         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix voc: <http://example.org/example#> .

         <http://christophedebruyne.be>
             voc:title "Christophe’s Page" ;
             voc:topic "Cats" ;
             voc:owned <http://example.org/#xof> ;
             .
  6. 6. Background – RDFS (and OWL)
     •  RDF provides little support for describing a domain: we can declare properties, state that things are of a certain type, and declare collections and containers, but that’s about it…
     •  RDF Schema (RDFS) is an extension of RDF and also a W3C Recommendation
     •  RDFS provides a framework for describing vocabularies
     •  RDFS describes resources with classes, properties and values
     •  RDFS allows some “lightweight” reasoning: inferring implicit information from explicit information
     •  The Web Ontology Language (OWL), not covered here, is an even more expressive ontology language suitable for more complex reasoning tasks
  7. 7. Background – RDFS

         @base <http://foo.bar/> .
         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
         @prefix foaf: <http://xmlns.com/foaf/0.1/> .

         <#Cat> a rdfs:Class ; rdfs:label "Cat" .
         <#owns> a rdf:Property ; rdfs:label "owns" ;
             rdfs:domain foaf:Person ; rdfs:range <#Cat> .

         <#victor> a <#Cat> ; foaf:name "Victor" .
         <#louis> a <#Cat> ; foaf:name "Louis" .
         <#bettina> a <#Cat> ; foaf:name "Bettina" .

         <#chrdebru> a foaf:Person ; foaf:name "Christophe Debruyne" ;
             <#owns> <#victor> ; <#owns> <#louis> ; <#owns> <#bettina> .

     Exercise: What could we infer from <#kevin> <#owns> <#gomez> ?
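One way to approach the exercise is to apply the two RDFS entailment rules for rdfs:domain and rdfs:range by hand. The sketch below is a minimal illustration in plain Python, not a real reasoner: triples are plain tuples of strings, and the `schema` dictionary and `infer_types` helper are hypothetical names introduced only for this example.

```python
# Minimal sketch of RDFS domain/range inference over triples stored as tuples.
RDF_TYPE = "rdf:type"

# property -> (rdfs:domain, rdfs:range), copied from the vocabulary on the slide
schema = {
    "<#owns>": ("foaf:Person", "<#Cat>"),
}

def infer_types(triples, schema):
    """Return the rdf:type triples entailed by rdfs:domain and rdfs:range."""
    inferred = set()
    for s, p, o in triples:
        if p in schema:
            domain, range_ = schema[p]
            inferred.add((s, RDF_TYPE, domain))  # subject belongs to the domain
            inferred.add((o, RDF_TYPE, range_))  # object belongs to the range
    return inferred

facts = [("<#kevin>", "<#owns>", "<#gomez>")]
entailed = infer_types(facts, schema)
for triple in sorted(entailed):
    print(*triple, ".")
```

Under these assumptions the single triple entails that <#kevin> is a foaf:Person and <#gomez> is a <#Cat>, even though neither type was stated explicitly.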
  8. 8. Background – SPARQL
     •  SPARQL stands for SPARQL Protocol and RDF Query Language
        –  A recursive acronym
     •  SPARQL 1.1 has been a W3C Recommendation since March 2013 (http://www.w3.org/TR/sparql11-overview/)
     •  SPARQL allows us to:
        –  Pull values from structured and semi-structured data
        –  Explore data by querying unknown relationships
        –  Perform complex joins of disparate databases in a single, simple query
        –  Transform RDF data from one vocabulary to another
        –  …
  9. 9. Background – SPARQL
     Structure of a SPARQL SELECT query. A SPARQL query comprises, in order:
     •  Prefix declarations, for URI shorthands
     •  A result clause, identifying what information to return from the query
     •  A dataset definition, stating which RDF graph(s) are being queried
     •  The query pattern, specifying what to query for in the underlying dataset
     •  Query modifiers, for slicing, ordering, and otherwise rearranging query results

         # prefix declarations
         PREFIX foaf: <http://.../0.1/>
         ...
         # result clause
         SELECT ...
         # dataset definition
         FROM ...
         # query pattern
         WHERE { ... }
         # query modifiers
         ORDER BY ...

     Namespace shortened for brevity. Adapted from “SPARQL by Example” by Feigenbaum and Prud’hommeaux.
  10. 10. Background – Simple SPARQL Queries
      •  Variables start with a question mark and can match any node (resource or literal).
      •  Triple patterns are just like triples, except that any part of a triple can be replaced with a variable.
      •  The SELECT result clause returns a table of variables and values that satisfy the query.
      •  “a” is syntactic sugar for “rdf:type”.

      Finding all cats and their names:

          PREFIX ex: <http://foo.bar/#>
          PREFIX foaf: <http://.../0.1/>
          SELECT ?cat ?name
          WHERE {
              ?cat a ex:Cat .
              ?cat foaf:name ?name .
          }
  11. 11. Background – Simple SPARQL Queries

          PREFIX ex: <http://foo.bar/#>
          PREFIX foaf: <http://xmlns.com/foaf/0.1/>
          SELECT ?cat ?name
          WHERE {
              ?cat a ex:Cat .
              ?cat foaf:name ?name .
          }

          -----------------------------------------
          | cat                       | name      |
          =========================================
          | <http://foo.bar/#bettina> | "Bettina" |
          | <http://foo.bar/#louis>   | "Louis"   |
          | <http://foo.bar/#victor>  | "Victor"  |
          -----------------------------------------
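To make the idea of triple patterns and variable bindings concrete, here is a small sketch of basic graph pattern matching in plain Python. It is only an illustration of the principle, not how a SPARQL engine is implemented; the `match` and `select` helpers and the abbreviated `ex:` names are assumptions made for this example.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):            # a variable: bind it or check the old binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                     # a constant: must be identical
            return None
    return new

def select(patterns, triples):
    """Return every binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings

triples = [
    ("ex:victor", "rdf:type", "ex:Cat"), ("ex:victor", "foaf:name", '"Victor"'),
    ("ex:louis", "rdf:type", "ex:Cat"), ("ex:louis", "foaf:name", '"Louis"'),
    ("ex:bettina", "rdf:type", "ex:Cat"), ("ex:bettina", "foaf:name", '"Bettina"'),
    ("ex:chrdebru", "rdf:type", "foaf:Person"),
    ("ex:chrdebru", "foaf:name", '"Christophe Debruyne"'),
]

rows = select([("?cat", "rdf:type", "ex:Cat"), ("?cat", "foaf:name", "?name")], triples)
result = sorted((row["?cat"], row["?name"]) for row in rows)
```

Because ?cat appears in both patterns, only resources that are both typed as ex:Cat and named are returned, which is why the person does not show up in the result.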
  12. 12. Background – GeoSPARQL
      Give me a list of counties whose borders touch County Dublin:

          SELECT ?county ?w2
          WHERE {
              ?dublin a osi:County .
              ?dublin rdfs:label "DUBLIN" .
              ?county a osi:County .
              ?dublin geo:hasGeometry ?g1 .
              ?g1 geo:asWKT ?w1 .
              ?county geo:hasGeometry ?g2 .
              ?g2 geo:asWKT ?w2 .
              FILTER (geof:sfTouches(?w1, ?w2))
          }

      RDF allows one to engage with data in a declarative manner, not bound to a particular structure.
  13. 13. Linked Data
      •  Linked Data started off as an initiative called the Linking Open Data (LOD) project.
      •  Linked Data is a global initiative to publish and interlink structured data on the Web using a clever combination of simple, standardized technologies:
         –  Uniform Resource Identifiers, to name things;
         –  Resource Description Framework, to represent things;
         –  HTTP infrastructure, to obtain those representations.
  14. 14. (Non-)Information Resources
      •  Information resources (IRs) are documents, referred to by a URI, that describe non-information resources (NIRs). NIRs are also named with a URI and represent things such as cars, people, etc.
      •  The NIR http://dbpedia.org/resource/James_Joyce is described by the following IRs:
         •  The web page http://dbpedia.org/page/James_Joyce
         •  The RDF document http://dbpedia.org/data/James_Joyce
      •  Either is returned depending on what you need. How?
  15. 15. Content Negotiation
      [Image from http://www.w3.org/TR/swbp-vocab-pub/]
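Content negotiation means that the client states which media types it prefers in the HTTP Accept header and the server answers with (or redirects to) the best matching representation. The sketch below only illustrates the selection step under simplifying assumptions: it ignores wildcards such as `*/*`, and the `REPRESENTATIONS` table pairing media types with the two DBpedia document URIs is a hypothetical configuration, not DBpedia's actual server setup.

```python
def negotiate(accept_header, available):
    """Pick the available media type with the highest q-value in the Accept header."""
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip()
        q = 1.0                                   # default quality when none is given
        for field in fields[1:]:
            name, _, value = field.strip().partition("=")
            if name == "q":
                q = float(value)
        if media_type in available and q > best_q:
            best, best_q = media_type, q
    return best

# Hypothetical mapping from media types to the information resources of one NIR.
REPRESENTATIONS = {
    "text/html": "http://dbpedia.org/page/James_Joyce",
    "application/rdf+xml": "http://dbpedia.org/data/James_Joyce",
}

def redirect_target(accept_header):
    """Return the document URI to redirect to, or None if nothing acceptable exists."""
    return REPRESENTATIONS.get(negotiate(accept_header, REPRESENTATIONS))

print(redirect_target("text/html,application/xhtml+xml;q=0.9"))
print(redirect_target("application/rdf+xml"))
```

A browser asking for HTML would be sent to the web page, while an RDF client asking for application/rdf+xml would be sent to the RDF document.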
  16. 16. Content negotiation in data.geohive.ie
      [Architecture diagram with the components: Proxy Server, TPF Server, TPF Web Client, Linked Data Frontend, Website (dumps), Ontologies, SPARQL Endpoint, Triplestore, TPF Client]
  17. 17. UPLIFT WITH R2RML
  18. 18. Different types of mappings…
      •  Correspondences vs. mappings
         •  Correspondences capture how entities are related
         •  Mappings describe how to relate entities
         •  Correspondences are 'symmetrical', mappings have a direction
      •  Matching vs. mapping
      •  Different types of mappings
         •  Between ontologies
         •  Between datasets or instances
         •  Uplift: from non-RDF to RDF
         •  Downlift: from RDF to non-RDF
  19. 19. RDB2RDF
      [Figure: two relational tables and the RDF graph generated from them. In the graph, <person/1> and <person/2> are instances of foaf:Person (rdf:type) with foaf:firstName and foaf:lastName values, and each is linked with foaf:based_near to <city/1> or <city/2>, which carry a foaf:name.]

      Person
      ID | First      | Last     | CityID
      1  | Christophe | Debruyne | 1
      2  | Kevin      | NULL     | 2

      City
      ID | Name
      1  | Dublin
      2  | Ghent
  20. 20. RDB2RDF: W3C Recommendations
      •  It all started with Tim Berners-Lee proposing a direct mapping from relational databases to RDF.
      •  Over the years, two W3C Recommendations (standards) for mapping relational data to RDF emerged:
         –  A Direct Mapping of Relational Data to RDF
         –  R2RML: RDB to RDF Mapping Language, a highly customizable language for annotating relational data with ontologies to generate RDF.
  21. 21. RDB2RDF
      Tim Berners-Lee described a mapping between relational databases and RDF that can be automated; see “Relational Databases on the Semantic Web” via http://www.w3.org/DesignIssues/RDB-RDF.html

      “The semantic web data model is very directly connected with the model of relational databases. A relational database consists of tables, which consists of rows, or records. Each record consists of a set of fields. The record is nothing but the content of its fields, just as an RDF node is nothing but the connections: the property values. The mapping is very direct:
      •  a record is an RDF node;
      •  the field (column) name is RDF propertyType; and
      •  the record field (table cell) is a value.”

      But can it be that simple? Can you give examples?
  22. 22. Direct Mappings
      Tim Berners-Lee (TBL) proposed a direct mapping. Direct mappings immediately reflect the structure of the database → “The target RDF vocabulary directly reflects the names of database schema elements, and neither structure nor target vocabulary can be changed.” [R2RML]

      Process:
      •  Existing table and column names are encoded into URIs.
      •  Data is (i) extracted, (ii) transformed into RDF and then (iii) loaded into a triplestore. This is thus an ETL process.

      Over time, this proposal was refined into a W3C Recommendation, published in fall 2012, called “A Direct Mapping of Relational Data to RDF”: http://www.w3.org/TR/rdb-direct-mapping/
  23. 23. Direct Mappings
      •  The database (both schema and data), primary keys and foreign keys are given to a direct mapping engine to produce an RDF graph.
         –  Fields are mapped to literals;
         –  Primary keys are used to construct URIs for resources;
         –  Foreign keys are used to construct properties and relate resources.
      •  Example…
  24. 24. Direct Mapping Example (from W3C)
      Given a base URI http://foo.example/DB/ and the following tables:

      Person
      ID | First      | Last     | CityID
      1  | Christophe | Debruyne | 1
      2  | Kevin      | NULL     | 2

      City
      ID | Name
      1  | Dublin
      2  | Ghent

          @base <http://foo.example/DB/> .
          @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
          @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

          <Person/ID=1> rdf:type <Person> .
          <Person/ID=1> <Person#ID> 1 .
          <Person/ID=1> <Person#First> "Christophe" .
          <Person/ID=1> <Person#Last> "Debruyne" .
          <Person/ID=1> <Person#CityID> 1 .
          <Person/ID=1> <Person#ref-CityID> <City/ID=1> .

          <Person/ID=2> rdf:type <Person> .
          <Person/ID=2> <Person#ID> 2 .
          <Person/ID=2> <Person#First> "Kevin" .
          <Person/ID=2> <Person#CityID> 2 .
          <Person/ID=2> <Person#ref-CityID> <City/ID=2> .

          <City/ID=1> rdf:type <City> .
          <City/ID=1> <City#ID> 1 .
          <City/ID=1> <City#Name> "Dublin" .

          <City/ID=2> rdf:type <City> .
          <City/ID=2> <City#ID> 2 .
          <City/ID=2> <City#Name> "Ghent" .

      Discuss: why would a base URI be important?
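The direct mapping is simple enough to sketch as an algorithm. The following Python function is a loose illustration of the idea for tables with a single-column primary key, not an implementation of the full W3C Recommendation; the `direct_map` name and its parameters are hypothetical, and relative IRIs are left unresolved against the base URI.

```python
def direct_map(table, pk, rows, foreign_keys=None):
    """Yield (subject, predicate, object) triples for one table.

    `foreign_keys` maps a column name to (referenced table, referenced key column).
    """
    foreign_keys = foreign_keys or {}
    for row in rows:
        subject = f"<{table}/{pk}={row[pk]}>"          # primary key builds the URI
        yield (subject, "rdf:type", f"<{table}>")
        for column, value in row.items():
            if value is None:
                continue                                # NULL values produce no triple
            literal = str(value) if isinstance(value, int) else f'"{value}"'
            yield (subject, f"<{table}#{column}>", literal)
            if column in foreign_keys:                  # foreign key relates resources
                ref_table, ref_pk = foreign_keys[column]
                yield (subject, f"<{table}#ref-{column}>",
                       f"<{ref_table}/{ref_pk}={value}>")

persons = [
    {"ID": 1, "First": "Christophe", "Last": "Debruyne", "CityID": 1},
    {"ID": 2, "First": "Kevin", "Last": None, "CityID": 2},
]
triples = list(direct_map("Person", "ID", persons, {"CityID": ("City", "ID")}))
for s, p, o in triples:
    print(s, p, o, ".")
```

Note how the NULL in Kevin's Last column simply yields no triple, which matches the Person/ID=2 block in the example above.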
  25. 25. Direct mappings for tabular data
      •  Generating RDF from Tabular Data on the Web
      •  A W3C Recommendation since 2015: https://www.w3.org/TR/csv2rdf/
      •  A direct mapping approach for CSV and tabular data
      •  Note: the Recommendation has a section comparing this approach to the direct mapping for relational databases. The two are “semantically equivalent” when dealing with one table, but not guaranteed to be “semantically equivalent” when multiple interrelated tables are involved.
  26. 26. Direct Mappings
      Discussion: Are direct mappings meaningful? Can you identify potential problems?
  27. 27. R2RML
      •  R2RML: RDB to RDF Mapping Language
         –  A W3C Recommendation since fall 2012
         –  http://www.w3.org/TR/r2rml/
      •  One creates an R2RML file that annotates a relational database with existing vocabularies and/or ontologies (RDFS or OWL).
      •  That R2RML file is processed by an R2RML mapping engine to produce RDF.
      •  R2RML specifies:
         –  An ontology for expressing those mappings;
         –  How those mappings should be interpreted to produce RDF.
         –  R2RML files are thus themselves stored as RDF.
  28. 28. Example

      City
      ID | Name
      1  | Dublin
      2  | Ghent

          @prefix rr: <http://www.w3.org/ns/r2rml#> .
          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
          @prefix dbpedia: <http://dbpedia.org/ontology/> .

          <#CityTriplesMap>
              a rr:TriplesMap ;
              # What is being mapped? A logical table/view or an SQL query.
              rr:logicalTable [ rr:tableName "City" ] ;
              # How to generate and state something about the subject of those triples.
              rr:subjectMap [
                  rr:template "http://foo.example/City/{ID}" ;
                  rr:class dbpedia:Place ;
              ] ;
              # How to generate predicates and objects.
              rr:predicateObjectMap [
                  rr:predicate foaf:name ;
                  rr:objectMap [ rr:column "Name" ] ;
              ] ;
              .
  29. 29. Example

      City
      ID | Name
      1  | Dublin
      2  | Ghent

          @prefix rr: <http://www.w3.org/ns/r2rml#> .
          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
          @prefix dbpedia: <http://dbpedia.org/ontology/> .

          <#CityTriplesMap>
              a rr:TriplesMap ;
              rr:logicalTable [
                  rr:sqlQuery """SELECT ID, Name FROM City WHERE 1""" ;
              ] ;
              rr:subjectMap [
                  rr:template "http://foo.example/City/{ID}" ;
                  rr:class dbpedia:Place ;
              ] ;
              rr:predicateObjectMap [
                  rr:predicate foaf:name ;
                  rr:objectMap [ rr:column "Name" ] ;
              ] ;
              .
  30. 30. Example
      Relating people to cities.

      Person
      ID | First      | Last     | CityID
      1  | Christophe | Debruyne | 1
      2  | Kevin      | NULL     | 2

      City
      ID | Name
      1  | Dublin
      2  | Ghent

          <#PersonTriplesMap>
              a rr:TriplesMap ;
              rr:logicalTable [ rr:tableName "Person" ] ;
              rr:subjectMap [
                  rr:template "http://foo.example/Person/{ID}" ;
                  rr:class foaf:Person ;
              ] ;
              rr:predicateObjectMap [
                  rr:predicate foaf:name ;
                  rr:objectMap [ rr:column "First" ] ;
              ] ;
              rr:predicateObjectMap [
                  rr:predicate foaf:based_near ;
                  rr:objectMap [
                      rr:parentTriplesMap <#CityTriplesMap> ;
                      rr:joinCondition [
                          rr:child "CityID" ;
                          rr:parent "ID" ;
                      ]
                  ]
              ] ;
              .
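A referencing object map with a join condition behaves much like an SQL join between the child logical table (Person) and the parent logical table (City). The sketch below uses Python's built-in sqlite3 module with an in-memory database to illustrate that reading; it is an approximation for teaching purposes, not what any particular R2RML engine executes, and the abbreviated `foaf:based_near` predicate is kept as a plain string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, First TEXT, Last TEXT, CityID INTEGER);
    INSERT INTO City VALUES (1, 'Dublin'), (2, 'Ghent');
    INSERT INTO Person VALUES (1, 'Christophe', 'Debruyne', 1), (2, 'Kevin', NULL, 2);
""")

# rr:child "CityID" and rr:parent "ID" become the join predicate.
rows = conn.execute("""
    SELECT child.ID, parent.ID
    FROM Person AS child, City AS parent
    WHERE child.CityID = parent.ID
    ORDER BY child.ID
""").fetchall()

# Subjects come from the child's subject map, objects from the parent's subject map.
triples = [
    (f"<http://foo.example/Person/{person_id}>",
     "foaf:based_near",
     f"<http://foo.example/City/{city_id}>")
    for person_id, city_id in rows
]
for triple in triples:
    print(*triple, ".")
```

The object of each triple is the subject that <#CityTriplesMap> would generate for the joined City row, which is why the parent's subject template is reused here.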
  31. 31. In short… Direct vs. Declarative
      •  A Direct Mapping of Relational Data to RDF (W3C)
      •  R2RML: RDB to RDF Mapping Language (W3C)

      Direct mappings
      •  Reflect the structure of the data source, including its vocabulary
      •  The mapping is implemented in an algorithm
      •  Easy to generate, but not (necessarily) meaningful
      •  Use rules (SWRL or SPARQL CONSTRUCT) to switch to different vocabularies

      Declarative mappings
      •  Relate your source dataset to a target RDF dataset with a vocabulary
      •  Require more effort, but yield more meaningful RDF
  32. 32. R2RML
      [Figure from https://www.w3.org/TR/r2rml/]
  33. 33. R2RML
      [Figure from https://www.w3.org/TR/r2rml/]
  34. 34. R2RML: Running Example
      Using our implementation of R2RML: https://opengogs.adaptcentre.ie/debruync/r2rml
      1.  The implementation is available in the folder
      2.  Go inside example-1 and execute

          $ java -jar ../r2rml.jar config.properties

      Note: CSV files are turned into relational tables. Column names that contain no spaces are CAPITALIZED.
  35. 35. R2RML: Running Example
  36. 36. R2RML Example 1
      config.properties

          CSVFiles = person.csv;city.csv
          mappingFile = ./mapping.ttl
          outputFile = ./output.ttl
          format = TURTLE

      Directory listing of example-1
      •  city.csv
      •  config.properties
      •  mapping.ttl
      •  output.ttl [will be generated]
      •  person.csv
  37. 37. R2RML
      •  R2RML supports generating values from constants, column values, or column values applied to a template.
      •  If an object map is not column-based and has neither a language tag nor a datatype, its term type defaults to rr:IRI unless you explicitly set it to rr:Literal.

          …
          rr:predicateObjectMap [
              rr:predicateMap [ rr:constant foaf:name ] ;
              rr:objectMap [
                  rr:template "{first} {last}" ;
                  rr:termType rr:Literal ;
              ]
          ]
          …
  38. 38. R2RML
      •  Term maps with a term type of rr:Literal can have a language tag.

          …
          rr:predicateObjectMap [
              rr:predicateMap [ rr:constant rdfs:label ] ;
              rr:objectMap [
                  rr:column "title" ;
                  rr:language "en" ;
              ]
          ]
          …

      Why is the term type above rr:Literal? Because the object map uses a column.
  39. 39. R2RML: Multiple Languages
      •  What if you have a table with multiple languages and the values need to be separated by language?
      •  Option 1: assuming you have a discriminator, create a logical table with an SQL query for each language. How?
      •  Option 2: create one logical table and use a language column, so that a single mapping covers all languages. Unfortunately, this is not part of the Recommendation and support depends on the implementation…

          rr:objectMap [
              rr:column "TITLE" ;
              rrx:languageColumn "TITLE_LANG" ;
          ] .
  40. 40. R2RML: Datatypes
      Datatypes can only be set on term maps that are of type rr:Literal and have no language tag.

          rr:objectMap [
              rr:column "EMPNO" ;
              rr:datatype xsd:positiveInteger ;
          ] .
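The rule that a literal carries either a language tag or a datatype, never both, can be made explicit with a small helper. This is a sketch in plain Python of how a generated literal could be rendered in Turtle/N-Triples syntax; the `literal` function and the sample row values ("Dubliners", 7369) are made up for illustration and do not come from any dataset in this tutorial.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def literal(value, language=None, datatype=None):
    """Render an RDF literal; a language tag and a datatype are mutually exclusive."""
    if language and datatype:
        raise ValueError("a literal has either a language tag or a datatype, not both")
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    if language:
        return f'"{escaped}"@{language}'
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'

# Hypothetical row: the language tag comes from a column, as with rrx:languageColumn.
row = {"TITLE": "Dubliners", "TITLE_LANG": "en", "EMPNO": 7369}

label = literal(row["TITLE"], language=row["TITLE_LANG"])
empno = literal(row["EMPNO"], datatype=XSD + "positiveInteger")
print(label)
print(empno)
```

Reading the language from the row rather than from a constant is what distinguishes the language-column extension from the standard rr:language property.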
  41. 41. R2RML
      •  R2RML provides a highly customizable language for mapping relational databases to triples.
      •  Unlike a direct mapping, which reflects the database’s structure, the author of the mapping decides on the structure and ontology.
         –  Mapping projections, e.g., a Triples Map for π_{gender}(Person)
         –  Mapping selections, e.g., a Triples Map for σ_{gender='F'}(Person) to create instances of ex:Woman
         –  …

      Person
      ID | name       | gender
      1  | Christophe | M
      2  | Kevin      | M
      3  | Aoife      | F
  42. 42. RDB2RDF
      •  How to annotate the existing database?
         –  (A) Translate the data to RDF: this generates an RDF dump for immediate consumption, but the dump is harder to maintain.
            •  The dump is loaded into a triplestore such as Jena TDB (with Fuseki for a SPARQL endpoint) or Virtuoso.
         –  (B) Keep a mapping from the RDB to RDF and translate SPARQL queries into (intermediate) SQL queries: this results in longer query times.
  43. 43. Implementations
      •  Stardog: http://stardog.com/
         –  Implements both R2RML and its own mapping language.
      •  Virtuoso: http://virtuoso.openlinksw.com/
      •  Oracle Spatial and Graph: https://www.oracle.com/database/spatial/index.html
         –  Supports both transforming relational databases into RDF using R2RML and creating R2RML views using those mappings.
      •  OnTop: http://ontop.inf.unibz.it/
         –  Access relational databases as virtual graphs using mappings.
      •  D2RQ: http://d2rq.org/
         –  Direct mapping and the D2RQ mapping language. No complete support for R2RML yet.
         –  Comes with a SPARQL endpoint and a Linked Data frontend based on Pubby.
  44. 44. Other benefits
      R2RML is a vocabulary and mappings are stored as RDF. This means:
      •  One can add provenance information to the mapping, which can be queried.
      •  One can query over mappings, which can facilitate knowledge discovery, reuse, etc.
         –  Give me all mappings that use these predicates, these namespaces, etc.
         –  Give me all mappings created by a particular person
         –  ...
      •  It provides a basis for four-star and even five-star Linked Data, if appropriate choices are made (e.g., in terms of URIs and how they resolve).
  45. 45. OTHER INITIATIVES
      •  R2RML-F: extending R2RML with functions
      •  RML: an R2RML superset for multiple data sources
  46. 46. Extending R2RML – with Functions
      •  R2RML allows one to describe mappings and assumes the database conforms to the Core SQL 2008 specification.
      •  But what if the underlying technology does not support certain data manipulations?
         –  The underlying technology is not expressive enough.
         –  Procedural domain knowledge is part of the application.
      •  More complex data processing “pipelines” have a negative impact on the transparency and traceability of the RDF dataset generation process.
      •  Proposal: include functions written in ECMAScript (JavaScript) in the mapping language, provided tractability is not a problem.
  47. 47. R2RML-F – R2RML with functions
      •  Namespace rrf: http://kdeg.scss.tcd.ie/ns/rrf#
      •  Functions have a function name and a body.
      •  Functions are written in ECMAScript.

          <#Multiply>
              rrf:functionName "multiply" ;
              rrf:functionBody """
                  function multiply(var1, var2) {
                      return var1 * var2 ;
                  }
              """ ;
              .
  48. 48. R2RML-F – R2RML with functions
      A “function valued” term map calls a function, and the parameters are themselves term maps.

          <#TriplesMap1>
              rr:logicalTable [ rr:tableName "Employee" ; ] ;
              rr:subjectMap [ rr:template "http://org.com/employee/{ID}" ; ] ;
              rr:predicateObjectMap [
                  rr:predicate ex:salary ;
                  rr:objectMap [
                      rr:datatype xsd:double ;
                      rrf:functionCall [
                          rrf:function <#Multiply> ;
                          # Parameter bindings as an RDF Collection; term maps as parameters.
                          rrf:parameterBindings (
                              [ rr:constant "12"^^xsd:integer ]
                              [ rr:column "monthly_salary" ]
                          ) ;
                      ] ;
                  ] ;
              ] ;
              .

      Parameter bindings can be empty.
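The evaluation of a function-valued term map can be pictured as a small recursive procedure: each parameter is itself a term map that is evaluated against the current row before the function is called. The sketch below mimics this in Python with dictionaries standing in for the RDF structures; R2RML-F itself runs ECMAScript, and the `FUNCTIONS` registry, the dictionary keys and the sample row are hypothetical stand-ins chosen for this illustration.

```python
def multiply(var1, var2):
    """Python stand-in for the ECMAScript function body of <#Multiply>."""
    return var1 * var2

FUNCTIONS = {"multiply": multiply}

def evaluate_term_map(term_map, row):
    """Evaluate a constant-, column- or function-valued term map for one row."""
    if "constant" in term_map:
        return term_map["constant"]
    if "column" in term_map:
        return row[term_map["column"]]
    if "function" in term_map:
        # Parameters are term maps themselves, so evaluate them first.
        args = [evaluate_term_map(p, row) for p in term_map.get("parameters", [])]
        return FUNCTIONS[term_map["function"]](*args)
    raise ValueError("unsupported term map")

salary_map = {
    "function": "multiply",
    "parameters": [{"constant": 12}, {"column": "monthly_salary"}],
}

row = {"ID": 1, "monthly_salary": 2500.0}   # made-up employee record
yearly = evaluate_term_map(salary_map, row)
print(yearly)
```

Because parameters are ordinary term maps, a function call could in principle be nested inside another function call without changing the evaluation procedure.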
  49. 49. R2RML-F
      •  Introduces a Function Valued Term Map that allows for user-defined functions, at the cost of tractability.
      •  https://opengogs.adaptcentre.ie/debruync/r2rml
      •  C. Debruyne and D. O'Sullivan. R2RML-F: Towards Sharing and Executing Domain Logic in R2RML Mappings. In Proceedings of the Workshop on Linked Data on the Web, LDOW 2016, co-located with the 25th International World Wide Web Conference (WWW 2016), Montreal, Canada, April 12th, 2016.
  50. 50. RML
      •  Developed by iMinds (now imec) at UGent
         –  http://semweb.mmlab.be/rml/spec.html
         –  https://github.com/mmlab/RMLProcessor
      •  An extension of R2RML to support XML, HTML, CSV, JSON…
         –  Basically a superset of R2RML.
      •  See also: A. Dimou, M. Vander Sande, P. Colpaert, R. Verborgh, E. Mannens, R. Van de Walle. “RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data”. In Proceedings of the 7th Workshop on Linked Data on the Web, WWW14, 2014, Seoul, Korea.
  51. 51. TARQL
      •  SPARQL for Tables: https://tarql.github.io/
      •  Uses SPARQL CONSTRUCT queries to transform data in CSV/TSV into RDF
      •  CONSTRUCT queries can be regarded as rules declaring how to transform data
      •  It adopts a W3C standard (SPARQL), though it relies on some bespoke data transformation functions (e.g., string explosion)
      •  But the mappings are not captured as RDF

          CONSTRUCT {
              ?URI a ex:Organization ;
                  ex:name ?NameWithLang ;
                  ex:CIK ?CIK ;
                  ex:LEI ?LEI ;
                  ex:ticker ?Stock_ticker ;
          }
          FROM <file:companies.csv>
          WHERE {
              BIND(URI(CONCAT('org/', ?Stock_ticker)) AS ?URI)
              BIND(STRLANG(?Name, "en") AS ?NameWithLang)
          }

      Example from https://tarql.github.io/
  52. 52. References
      •  S. S. Sahoo et al. A Survey of Current Approaches for Mapping of Relational Databases to RDF. W3C RDB2RDF XG Report, W3C, 2009. http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
      •  T. Berners-Lee. Relational Databases on the Semantic Web, 1998. http://www.w3.org/DesignIssues/RDB-RDF.html
      •  R. Cyganiak, C. Bizer, J. Garbers, O. Maresch, and C. Becker. The D2RQ Mapping Language, March 2012. http://d2rq.org/d2rq-language
  53. 53. TUTORIAL 1 – WEATHER STATION DATA
  54. 54. Fingal County Council Weather Stations
      •  Example from the http://data.gov.ie/ portal
      •  Create instances of geo:Feature and geo:Geometry from GeoSPARQL
      •  Use RDFS for labels
      •  Assume http://data.fingalcoco.ie/ as the base URI for the resources and http://www.fingalcoco.ie/ont# as the namespace for predicates specific to this dataset

      Records in fccweatherstationsp20110829-2221.csv

      Name               | Weather Reading | Agency                   | LAT         | LONG
      M50 Blanchardstown | http://…        | National Roads Authority | 53.37046603 | -6.380851447
      M50 Dublin Airport | http://…        | National Roads Authority | 53.40964111 | -6.227597428
      Dublin Airport     | http://…        | Met Éireann              | 53.42150608 | -6.29784754
  55. 55. Creating a config.properties file

          CSVFiles = weatherstations.csv
          mappingFile = ./mapping.ttl
          outputFile = ./output.ttl
          format = TURTLE

      We will later extend the example with named graphs and change the format accordingly.
  56. 56. Tutorial 1
      •  Add the following namespaces
         –  rr: http://www.w3.org/ns/r2rml#
         –  fcc: http://www.fingalcoco.ie/ont#
         –  geo: http://www.opengis.net/ont/geosparql#
      •  Create a Triples Map for weather stations, and make the subjects instances of geo:Feature and fcc:WeatherStation
  57. 57. Tutorial 1

          @prefix rr: <http://www.w3.org/ns/r2rml#> .
          @prefix fcc: <http://www.fingalcoco.ie/ont#> .
          @prefix geo: <http://www.opengis.net/ont/geosparql#> .

          <#WeatherStation>
              a rr:TriplesMap ;
              rr:logicalTable [ rr:tableName "WEATHERSTATIONS" ] ;
              rr:subjectMap [
                  rr:template "http://data.fingalcoco.ie/ws/{NAME}" ;
                  rr:class geo:Feature ;
                  rr:class fcc:WeatherStation ;
              ] ;
              .

      We generate URIs with the name. Discuss.
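One point for the discussion is that station names contain spaces, so the values have to be made IRI-safe when they are substituted into the template. The sketch below approximates this with `urllib.parse.quote` from the Python standard library; it is an approximation because `quote` also percent-encodes non-ASCII characters, and it ignores escaped braces in templates. The `expand_template` helper is a hypothetical name used only here.

```python
import re
from urllib.parse import quote

PLACEHOLDER = re.compile(r"\{([^{}]+)\}")

def expand_template(template, row):
    """Expand an R2RML-style template; return None if a referenced column is NULL."""
    values = {}
    for column in PLACEHOLDER.findall(template):
        if row.get(column) is None:
            return None                      # no term is generated for NULL values
        # Percent-encode everything outside unreserved ASCII characters.
        values[column] = quote(str(row[column]), safe="")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)

template = "http://data.fingalcoco.ie/ws/{NAME}"
iri = expand_template(template, {"NAME": "M50 Blanchardstown"})
missing = expand_template(template, {"NAME": None})
print(iri)
```

A name-based URI stays readable, but it also changes whenever the name in the source data changes, which is worth weighing against using a stable key.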
  58. 58. Tutorial 1
  59. 59. Tutorial 1
      •  Give the features both a default and an English label (use rdfs:label)
      •  Provide information on where the weather readings can be found (use both fcc:withWeatherReading and rdfs:seeAlso)
         –  Make sure you add the rdfs namespace
      •  Relate the features to the agencies with fcc:withAgency
  60. 60. Tutorial 1

          rr:predicateObjectMap [
              rr:predicate rdfs:label ;
              rr:objectMap [ rr:column "NAME" ; rr:language "en" ; ] ;
          ] ;
          rr:predicateObjectMap [
              rr:predicate rdfs:label ;
              rr:objectMap [ rr:column "NAME" ; ] ;
          ] ;

      More terse…

          rr:predicateObjectMap [
              rr:predicate rdfs:label ;
              rr:objectMap [ rr:column "NAME" ; rr:language "en" ; ] ;
              rr:objectMap [ rr:column "NAME" ; ] ;
          ] ;
  61. 61. Tutorial 1

          rr:predicateObjectMap [
              rr:predicate rdfs:seeAlso ;
              rr:predicate fcc:withWeatherReading ;
              rr:objectMap [ rr:column "WEATHER_READING" ; ] ;
          ] ;

      Or, given that these are URIs:

          rr:predicateObjectMap [
              rr:predicate rdfs:seeAlso ;
              rr:predicate fcc:withWeatherReading ;
              rr:objectMap [
                  rr:column "WEATHER_READING" ;
                  rr:termType rr:IRI ;
              ] ;
          ] ;
  62. 62. Tutorial 1
  63. 63. Tutorial 1
      •  Create instances of geo:Geometry for the location of these weather stations. We will use Well-Known Text (WKT) to create points.
         –  https://en.wikipedia.org/wiki/GeoSPARQL (for an example with polygons)
         –  https://en.wikipedia.org/wiki/Well-known_text (for examples of points)
         –  Hint: first longitude, then latitude. You can test whether points “make sense” with Wicket. Just plot the point without the datatype: https://arthur-e.github.io/Wicket/sandbox-gmaps3.html
      •  Some considerations:
         –  What could the URIs look like? Discuss.
         –  Could we use blank nodes? Discuss.
  64. 64. Tutorial 1
  65. 65. Tutorial 1

          <#Geometries>
              a rr:TriplesMap ;
              rr:logicalTable [ rr:tableName "WEATHERSTATIONS" ] ;
              rr:subjectMap [
                  rr:template "http://data.fingalcoco.ie/geom/{LONG}/{LAT}" ;
                  rr:class geo:Geometry ;
              ] ;
              rr:predicateObjectMap [
                  rr:predicate geo:asWKT ;
                  rr:objectMap [
                      rr:template "POINT({LONG} {LAT})" ;
                      rr:datatype geo:wktLiteral ;
                  ] ;
              ] ;
              .
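Since swapping longitude and latitude is the most common mistake when writing WKT points, a quick sanity check before generating the literal can help. The following Python sketch builds the same "POINT(long lat)" string as the template above from CSV text values; the `wkt_point` helper is a hypothetical name, and the range check only catches values that are impossible, not coordinates that are merely swapped but still valid.

```python
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"

def wkt_point(longitude, latitude):
    """Build a WKT point from CSV text values: longitude first, then latitude."""
    if not -180 <= float(longitude) <= 180:
        raise ValueError(f"longitude out of range: {longitude}")
    if not -90 <= float(latitude) <= 90:
        raise ValueError(f"latitude out of range: {latitude}")
    return f"POINT({longitude} {latitude})"

# First record of the weather station CSV, kept as text like a CSV reader would.
station = {"NAME": "M50 Blanchardstown", "LAT": "53.37046603", "LONG": "-6.380851447"}

wkt = wkt_point(station["LONG"], station["LAT"])
typed_literal = f'"{wkt}"^^<{WKT_LITERAL}>'
print(typed_literal)
```

The untyped `wkt` string is what one would paste into a WKT viewer, while `typed_literal` corresponds to the object generated with rr:datatype geo:wktLiteral.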
  66. 66. Tutorial 1
  67. 67. Tutorial 1

          rr:subjectMap [
              rr:template "http://data.fingalcoco.ie/geom/{LONG}/{LAT}" ;
              rr:termType rr:BlankNode ;
              rr:class geo:Geometry ;
          ] ;

          # OR...

          rr:subjectMap [
              rr:template "{LONG}/{LAT}" ;
              rr:termType rr:BlankNode ;
              rr:class geo:Geometry ;
          ] ;

      •  Values (from a template, column or constant) are needed to differentiate between different blank nodes.
      •  The template is not used to create an IRI, hence we can eliminate some redundant information.
  69. 69. A note on blank nodes…
      •  Blank nodes are resources without an identifier. They are generally to be avoided, unless you know what you are doing.
      •  Blank nodes are given a blank node identifier by triplestores and SPARQL engines. These identifiers are not guaranteed to be uniquely identifying! In fact, the specifications even state that 1) one should not rely on consistent blank node identifiers, and 2) the same blank node identifier in different (named) graphs does not refer to the same thing.
  70. 70. Blank nodes at the OSi
      data.geohive.ie is an ongoing collaboration between ADAPT and Ordnance Survey Ireland to publish OSi’s authoritative geospatial information as Linked Data. It starts from publicly available boundary data and supports two use cases: the provision of different geometries for features, and the provenance and evolution of features and their geometries.
  71. 71. Blank nodes at the OSi

          @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
          @prefix geo: <http://www.opengis.net/ont/geosparql#> .
          @prefix osi: <http://ontologies.geohive.ie/osi#> .

          <http://data.geohive.ie/resource/county/2AE19629144F13A3E055000000000001>
              a osi:County ;
              a geo:Feature ;
              rdfs:label "Baile Átha Cliath"@ga ;
              rdfs:label "DUBLIN"@en ;
              rdfs:label "DUBLIN" ;
              geo:hasGeometry [
                  a geo:Geometry ;
                  geo:asWKT "MULTIPOLYGON (((-6.1 53.4, ..."^^geo:wktLiteral ;
              ] ;
              .

      This encourages one to link to features rather than to their geometries, which are “merely” an attribute.
  72. 72. Tutorial 1
      •  Relate the geometries to their features (hint: parent triples map)
         –  Who’s the parent, who’s the child? Or: what is the direction of this relation?
  73. 73. Tutorial 1

          rr:predicateObjectMap [
              rr:predicate geo:hasGeometry ;
              rr:objectMap [
                  rr:parentTriplesMap <#Geometries> ;
                  rr:joinCondition [
                      rr:child "NAME" ;
                      rr:parent "NAME" ;
                  ] ;
              ] ;
          ] ;
  75. 75. Tutorial 1 – (Named) Graphs
      •  Named graphs are a convenient way to organize information.
      •  Unless otherwise specified, triples are stored in the default (or unnamed) graph, which is also referred to with rr:defaultGraph.
      •  Turtle does not support named graphs, but N-Quads, for instance, does.
  76. 76. Named graphs in data.geohive.ie
      [Diagram with the components: Prime2 Database, Ontologies, R2RML Mapping, R2RML Processor, TRIPLESTORE]

      Graphs, Use Case 1
      •  default: Types, Labels, Links, 100m resolution
      •  50 meters: 50m resolution
      •  20 meters: 20m resolution
      •  Also listed: Default geometry links, With LOD cloud

      Graphs, Use Case 2
      •  default: Activities [PROV-O], Entities [PROV-O], History of 100m resolution
      •  50 meters: History of 50m resolution
      •  20 meters: History of 20m resolution
  77. 77. Tutorial 1 – (Named) Graphs
      •  Change config.properties to generate an RDF dataset
         –  format = NQUADS
         –  outputFile = ./output.nq
      •  Using http://www.w3.org/2003/01/geo/wgs84_pos# (prefix geo2)
         –  Give the features a geo2:lat and a geo2:long in a different graph. You may use http://data.fingalcoco.ie/graph/geo
  78. 78. Tutorial 1 – (Named) Graphs

          # TO BE ADDED TO THE WEATHER STATION TRIPLES MAP
          rr:predicateObjectMap [
              rr:graph <http://data.fingalcoco.ie/graph/geo> ;
              rr:predicate geo2:lat ;
              rr:objectMap [ rr:column "LAT" ; ] ;
          ] ;
          rr:predicateObjectMap [
              rr:graph <http://data.fingalcoco.ie/graph/geo> ;
              rr:predicate geo2:long ;
              rr:objectMap [ rr:column "LONG" ; ] ;
          ] ;
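In N-Quads, the graph name is simply a fourth term on the line, and statements in the default graph omit it. The sketch below shows roughly what the output for one weather station could look like; the `nquad` helper is a hypothetical name, the subject IRI assumes the name-based template used earlier with the space percent-encoded, and the coordinates are left as plain literals because the mapping above does not set a datatype.

```python
GEO2 = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
GRAPH = "http://data.fingalcoco.ie/graph/geo"

def nquad(subject, predicate, obj, graph=None):
    """Serialize one statement as an N-Quads line; omit the graph for the default graph."""
    terms = [f"<{subject}>", f"<{predicate}>", obj]
    if graph is not None:
        terms.append(f"<{graph}>")
    return " ".join(terms) + " ."

station = "http://data.fingalcoco.ie/ws/Dublin%20Airport"  # assumed subject IRI

lines = [
    nquad(station, RDFS + "label", '"Dublin Airport"@en'),        # default graph
    nquad(station, GEO2 + "lat", '"53.42150608"', GRAPH),          # named graph
    nquad(station, GEO2 + "long", '"-6.29784754"', GRAPH),         # named graph
]
print("\n".join(lines))
```

Loading such a file into a triplestore keeps the coordinates queryable separately from the labels, for instance with a SPARQL GRAPH clause.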
  79. 79. Tutorial 1 – (Named) Graphs
