SlideShare une entreprise Scribd logo
1  sur  23
Validation of Europeana data:
application profile, OWL ontology, or else?
Antoine Isaac
Application Profiles as an alternative to OWL Ontologies
Dublin Core Conference
5 September 2013
EDM basic pattern
 A data provider submits to Europeana a “bundle” of an object
and its digital representation(s)
For a general presentation on Europeana and EDM rationale see http://pro.europeana.eu/edm-documentation
Europeana Data Model: an example
Provided Cultural Heritage Object (CHO)
and descriptive metadata
Web Resources – digital representations
Aggregations – Bundling it all together
EDM Specs
http://pro.europeana.eu/edm-documentation
- EDM Definition:
- Mapping Guidelines and templates
- XML Schema
- OWL ontology
EDM Definitions
High level definition of classes and properties
edm:aggregatedCHO
- Definition: This property associates an ORE aggregation with the cultural
heritage object(s) (CHO for short) it is about.
- Subproperty of: ore:aggregates, dc:subject, P129_is_about
- Domain: ore:Aggregation
- Range: edm:ProvidedCHO
EDM Definitions
 Avoids adding semantics to re-used classes and properties
Except for mapping purposes, hierarchies of classes and properties for inference
dc:contributor rdfs:subPropertyOf edm:hasMet .
 Borderline case of axioms not in formal version of original specs
ore:proxyIn
- Obligation & Occurrence: A proxy may be in 1 to many aggregations,
and an aggregation may have 0 to many proxies in it
EDM Definitions
First hints at data constraints
edm:dataProvider
- Obligation & Occurrence: Mandatory for Europeana (Minimum: 1,
Maximum: 1)
edm:currentLocation
- Domain: The set of cultural heritage objects that Europeana collects
descriptions about, represented in the EDM by ProvidedCHOs and ORE
proxies for these CHOs.
edm:aggregatedCHO
- Obligation & Occurrence: In Europeana, an aggregation aggregates
exactly one CHO
Data validation: Europeana requirements
EDM is RDF-oriented: unbounded web of information, etc.
But Europeana needs to enforce constraints on the data it receives
 Data that meets basic Europeana function requirements
• An Aggregation should always have an edm:aggregatedCHO
• There must be exactly one edm:type -- the value must be TEXT, VIDEO,
SOUND, IMAGE or 3D
 Data quality criteria
• A ProvidedCHO should have at least a dc:title or a dc:description
We need specs for validation that are easily shareable, both for
humans and machines
EDM Mapping Guidelines
 Document written after the EDM Definitions
 Tries to formulate clearer instructions for Europeana providers
 Template-based, e.g. for provider’s Aggregation:
property value type cardinality
Machine-readable specs as OWL ontologies?
OWL is good for writing constraints, but not for validation!
Quite OK
 “Value types” via owl:ObjectProperty owl:DatatypeProperty in
OWL(DL)
 Data ranges (TEXT-VIDEO-SOUND-IMAGE-3D)
Less ok:
 Object domain and ranges
 (qualified) cardinality axioms
Including combinations: (either isShownBy OR isShownAt is mandatory)
We’ve created an OWL version of EDM
[…]
<owl:equivalentClass>
<owl:Restriction>
<owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality>
<owl:onProperty rdf:resource="&edm;aggregatedCHO"/>
</owl:Restriction>
</owl:equivalentClass>
[…]
OWL?
https://github.com/europeana/corelib/blob/master/c
orelib-solr-definitions/src/main/resources/eu/rdf/
But these are not really validation axioms
And it’s bad practice to add semantics to classes and properties that
already exist, such as ore:Aggregation
(let’s be honest: we were not ready for full RDF/OWL compatibility
anyway…)
EDM is implemented by an XML Schema (for RDF data!)
[…]
<sequence>
<element ref="edm:aggregatedCHO" maxOccurs="1" minOccurs="1"/>
<element ref="edm:dataProvider" maxOccurs="1" minOccurs="1"/>
<element ref="edm:isShownAt" maxOccurs="1" minOccurs="0"/>
<element ref="edm:isShownBy" maxOccurs="1" minOccurs="0"/>
<element ref="edm:object" maxOccurs="1" minOccurs="0"/>
<element ref="edm:provider" maxOccurs="1" minOccurs="1"/>
<element ref="dc:rights" maxOccurs="unbounded" minOccurs="0"/>
<element ref="edm:rights" maxOccurs="1" minOccurs="1"/>
</sequence>
[…]
XML Schema
And Schematron rules:
[…]
<sch:pattern name="Either Is shownby or is shownat should be present">
<sch:rule context="ore:Aggregation">
<sch:assert test="edm:isShownAt or edm:isShownBy">
[Error message]
</sch:assert>
</sch:rule>
</sch:pattern>
XML Schema
XML Schema: not ideal!
 Specific to a syntax
 Document-centric approach to validation
Back to square one: records!
 Forces us to enumerate the attributes with extra constraints, especially
order of elements
It’s really a super-closed world
 Schematron does slightly better, but then we have two constraint
languages co-existing in a same implementation
EDM as a “real” application profile?
 It is an application profile, already: mixing several vocabularies,
adding specific constraints
 Documentation includes definitions with constraints and
examples
 Interpretation of constraint in APs fit quite well
• AP constraints are expressed on the data
• Europeana needs dataset-level validation, mostly
EDM as a real application profile?
A fragment in DSP XML
<DescriptionTemplate ID=”aggregation" standalone=”yes">
<ResourceClass>ore:Aggregation</ResourceClass>
<StatementTemplate minOccurs="1" maxOccurs="1”>
<Property>edm:aggregatedCHO</Property>
</StatementTemplate>
<StatementTemplate minOccurs="1”>
<Property>edm:isShownBy</Property>
<Property>edm:isShownAt</Property>
</StatementTemplate>
</DescriptionTemplate>
http://dublincore.org/documents/dc-dsp/
Could be converted to other constraint
checking formalisms (1/2): SPIN
SPARQL Inferencing Notation http://spinrdf.org
ore:Aggregation
spin:constraint
[ a sp:Ask ;
sp:text """
# either isShownBy or isShownAt must be present
ASK WHERE {
{?this isShownBy ?image } UNION {?this isShownBy ?page }
}"""
] .
Issue: still looks like adding semantics to ore:Aggregation in general…
Could be converted to other constraint
checking formalisms (1/2): Stardog ICV
Integrity Constraint Validation http://stardog.com/docs/sdp/
Class: ore:Aggregation
SubClassOf: exactly 1 edm:aggregatedCHO
Class: ore:Aggregation
SubClassOf: min 1 edm:isShownBy or min 1 edm:isShownAt
Note: this is OWL2’s ‘Manchester Syntax’
Stardog accepts OWL, SWRL and SPARQL, uses SPARQL as back-end
Issue: still looks like adding semantics to ore:Aggregation in general…
Conclusions
Europeana requirements seem to be met by the AP approach, if this AP
approach is matched with SPARQL constraints
Much better than to try to partially catch constraints in OWL and
XML+Schematron as isolated machine-readable specs
Needs further testing
incl. trying to express all constraints with DSP and SPARQL queries
(with or without the help of a higher-level language)
An area that needs maturation
Maintainers (like us) may have validation specs in various forms
DSP is not flying, RDF validation still worthy of workshops!
https://www.w3.org/2012/12/rdf-val/
Thank you!
aisaac@few.vu.nl

Contenu connexe

Tendances

Europeana bergen may2010_dovwiner
Europeana bergen may2010_dovwinerEuropeana bergen may2010_dovwiner
Europeana bergen may2010_dovwiner
Dov Winer
 
EuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital HeritageEuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital Heritage
Max Kaiser
 
LIBER and its EU projects
LIBER and its EU projectsLIBER and its EU projects
LIBER and its EU projects
LIBER Europe
 
Linked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenLinked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the Citizen
Stefan Gradmann
 
Calames :: CERL seminar (Paris, 2008)
Calames :: CERL seminar (Paris, 2008)Calames :: CERL seminar (Paris, 2008)
Calames :: CERL seminar (Paris, 2008)
Y. Nicolas
 

Tendances (17)

W3C Library Linked Data Incubator Group - 2011
W3C Library Linked Data Incubator Group  - 2011W3C Library Linked Data Incubator Group  - 2011
W3C Library Linked Data Incubator Group - 2011
 
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
 
Eun lre brussels_winer20100616
Eun lre brussels_winer20100616Eun lre brussels_winer20100616
Eun lre brussels_winer20100616
 
Entity Management at Europeana - DCMI 2021
Entity Management at Europeana - DCMI 2021Entity Management at Europeana - DCMI 2021
Entity Management at Europeana - DCMI 2021
 
Multilingual challenges and ongoing work to tackle them at Europeana
Multilingual challenges and ongoing work to tackle them at EuropeanaMultilingual challenges and ongoing work to tackle them at Europeana
Multilingual challenges and ongoing work to tackle them at Europeana
 
The Europeana Data Model Principles, community and innovation
The Europeana Data Model  Principles, community and innovationThe Europeana Data Model  Principles, community and innovation
The Europeana Data Model Principles, community and innovation
 
Europeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) caseEuropeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) case
 
Exploiting vocabularies and Linked Data: in practice
Exploiting vocabularies and Linked Data: in practiceExploiting vocabularies and Linked Data: in practice
Exploiting vocabularies and Linked Data: in practice
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001
 
The Europeana Data Model - TPDL2018
The Europeana Data Model - TPDL2018The Europeana Data Model - TPDL2018
The Europeana Data Model - TPDL2018
 
Europeana bergen may2010_dovwiner
Europeana bergen may2010_dovwinerEuropeana bergen may2010_dovwiner
Europeana bergen may2010_dovwiner
 
EuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital HeritageEuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital Heritage
 
LIBER and its EU projects
LIBER and its EU projectsLIBER and its EU projects
LIBER and its EU projects
 
Linked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenLinked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the Citizen
 
Calames :: CERL seminar (Paris, 2008)
Calames :: CERL seminar (Paris, 2008)Calames :: CERL seminar (Paris, 2008)
Calames :: CERL seminar (Paris, 2008)
 
Uniting Digitization & Heritage Metadata : Calames Plus & other tracks
Uniting  Digitization & Heritage Metadata : Calames Plus & other tracksUniting  Digitization & Heritage Metadata : Calames Plus & other tracks
Uniting Digitization & Heritage Metadata : Calames Plus & other tracks
 
The Hellenic Aggregator - Overview, procedures & the cooperation with Europeana
The Hellenic Aggregator - Overview, procedures & the cooperation with EuropeanaThe Hellenic Aggregator - Overview, procedures & the cooperation with Europeana
The Hellenic Aggregator - Overview, procedures & the cooperation with Europeana
 

Similaire à Validation of Europeana data: application profile, OWL ontology, or else?

Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
Igor Moochnick
 
Building social and RESTful frameworks
Building social and RESTful frameworksBuilding social and RESTful frameworks
Building social and RESTful frameworks
brendonschwartz
 
Runtime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreRuntime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya Rathore
Esha Yadav
 
Eclipse Modeling Framework
Eclipse Modeling FrameworkEclipse Modeling Framework
Eclipse Modeling Framework
Ajay K
 

Similaire à Validation of Europeana data: application profile, OWL ontology, or else? (20)

Report on Work of Joint DCMI/IEEE LTSC Task Force
Report on Work of Joint DCMI/IEEE LTSC Task ForceReport on Work of Joint DCMI/IEEE LTSC Task Force
Report on Work of Joint DCMI/IEEE LTSC Task Force
 
Innovate2011 Keys to Building OSLC Integrations
Innovate2011 Keys to Building OSLC IntegrationsInnovate2011 Keys to Building OSLC Integrations
Innovate2011 Keys to Building OSLC Integrations
 
Entity Framework 4
Entity Framework 4Entity Framework 4
Entity Framework 4
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
 
EDM for Europeana Sounds
EDM for Europeana SoundsEDM for Europeana Sounds
EDM for Europeana Sounds
 
RESTful Services
RESTful ServicesRESTful Services
RESTful Services
 
Facilitating Busines Interoperability from the Semantic Web
Facilitating Busines Interoperability from the Semantic WebFacilitating Busines Interoperability from the Semantic Web
Facilitating Busines Interoperability from the Semantic Web
 
Modern Database Development Oow2008 Lucas Jellema
Modern Database Development Oow2008 Lucas JellemaModern Database Development Oow2008 Lucas Jellema
Modern Database Development Oow2008 Lucas Jellema
 
EDM for sounds
EDM for soundsEDM for sounds
EDM for sounds
 
oodb.ppt
oodb.pptoodb.ppt
oodb.ppt
 
Euclid Data Model 101 - Episode 01: Overview
Euclid Data Model 101 - Episode 01: OverviewEuclid Data Model 101 - Episode 01: Overview
Euclid Data Model 101 - Episode 01: Overview
 
The JISC DC Application Profiles: Some thoughts on requirements and scope
The JISC DC Application Profiles: Some thoughts on requirements and scopeThe JISC DC Application Profiles: Some thoughts on requirements and scope
The JISC DC Application Profiles: Some thoughts on requirements and scope
 
Building social and RESTful frameworks
Building social and RESTful frameworksBuilding social and RESTful frameworks
Building social and RESTful frameworks
 
Oodb
OodbOodb
Oodb
 
Metadata Cloud
Metadata CloudMetadata Cloud
Metadata Cloud
 
Runtime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreRuntime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya Rathore
 
Eclipse Modeling Framework
Eclipse Modeling FrameworkEclipse Modeling Framework
Eclipse Modeling Framework
 
DC-2008 Architecture Forum Open session
DC-2008 Architecture Forum Open sessionDC-2008 Architecture Forum Open session
DC-2008 Architecture Forum Open session
 
Description Set profiles
Description Set profilesDescription Set profiles
Description Set profiles
 
Entity framework 4.0
Entity framework 4.0Entity framework 4.0
Entity framework 4.0
 

Plus de Antoine Isaac

Plus de Antoine Isaac (20)

Addressing multilingual challenges at Europeana: An update - DCMI 2021
Addressing multilingual challenges at Europeana: An update - DCMI 2021Addressing multilingual challenges at Europeana: An update - DCMI 2021
Addressing multilingual challenges at Europeana: An update - DCMI 2021
 
Le Cadre de publication d'Europeana
Le Cadre de publication d'EuropeanaLe Cadre de publication d'Europeana
Le Cadre de publication d'Europeana
 
Metadata aggregation of IIIF Resources at Europeana: status and plans
Metadata aggregation of IIIF Resources at Europeana: status and plansMetadata aggregation of IIIF Resources at Europeana: status and plans
Metadata aggregation of IIIF Resources at Europeana: status and plans
 
IIIF and the Europeana mission
IIIF and the Europeana missionIIIF and the Europeana mission
IIIF and the Europeana mission
 
Semantic Interoperability at Europeana - MultilingualDSIs2018
Semantic Interoperability at Europeana - MultilingualDSIs2018Semantic Interoperability at Europeana - MultilingualDSIs2018
Semantic Interoperability at Europeana - MultilingualDSIs2018
 
Lightweight rights modeling and linked data publication for online cultural h...
Lightweight rights modeling and linked data publication for online cultural h...Lightweight rights modeling and linked data publication for online cultural h...
Lightweight rights modeling and linked data publication for online cultural h...
 
Designing a multilingual knowledge graph - DCMI2018
Designing a multilingual knowledge graph - DCMI2018Designing a multilingual knowledge graph - DCMI2018
Designing a multilingual knowledge graph - DCMI2018
 
Europeana et IIIF
Europeana et IIIFEuropeana et IIIF
Europeana et IIIF
 
Data scale and diversity issues at Europeana
Data scale and diversity issues at EuropeanaData scale and diversity issues at Europeana
Data scale and diversity issues at Europeana
 
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data VocabulariesIsaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
 
Europeana APIs
Europeana APIsEuropeana APIs
Europeana APIs
 
Enriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpediaEnriching Cultural Heritage Data with DBpedia
Enriching Cultural Heritage Data with DBpedia
 
Modelling and exchanging annotations
Modelling and exchanging annotationsModelling and exchanging annotations
Modelling and exchanging annotations
 
EuropeanaTech update - Europeana AGM 2015
EuropeanaTech update - Europeana AGM 2015EuropeanaTech update - Europeana AGM 2015
EuropeanaTech update - Europeana AGM 2015
 
Modelling annotations for Europeana and related projects - DARIAH-EU WS
Modelling annotations for Europeana and related projects - DARIAH-EU WSModelling annotations for Europeana and related projects - DARIAH-EU WS
Modelling annotations for Europeana and related projects - DARIAH-EU WS
 
Classification schemes, thesauri and other Knowledge Organization Systems - a...
Classification schemes, thesauri and other Knowledge Organization Systems - a...Classification schemes, thesauri and other Knowledge Organization Systems - a...
Classification schemes, thesauri and other Knowledge Organization Systems - a...
 
Multilingual challenges for accessing digitized culture online - Riga Summit 15
Multilingual challenges for accessing digitized culture online - Riga Summit 15Multilingual challenges for accessing digitized culture online - Riga Summit 15
Multilingual challenges for accessing digitized culture online - Riga Summit 15
 
Wikidata, a target for Europeana's semantic strategy - GLAM-WIKI 2015
Wikidata, a target for Europeana's semantic strategy - GLAM-WIKI 2015Wikidata, a target for Europeana's semantic strategy - GLAM-WIKI 2015
Wikidata, a target for Europeana's semantic strategy - GLAM-WIKI 2015
 
AAC Education Session
AAC Education Session AAC Education Session
AAC Education Session
 
Europeana and the relevance of the DM2E results
Europeana and the relevance of the DM2E resultsEuropeana and the relevance of the DM2E results
Europeana and the relevance of the DM2E results
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

Validation of Europeana data: application profile, OWL ontology, or else?

  • 1. Validation of Europeana data: application profile, OWL ontology, or else? Antoine Isaac Application Profiles as an alternative to OWL Ontologies Dublin Core Conference 5 September 2013
  • 2. EDM basic pattern  A data provider submits to Europeana a “bundle” of an object and its digital representation(s) For a general presentation on Europeana and EDM rationale see http://pro.europeana.eu/edm-documentation
  • 4. Provided Cultural Heritage Object (CHO) and descriptive metadata
  • 5. Web Resources – digital representations
  • 6. Aggregations – Bundling it all together
  • 7. EDM Specs http://pro.europeana.eu/edm-documentation - EDM Definition: - Mapping Guidelines and templates - XML Schema - OWL ontology
  • 8. EDM Definitions High level definition of classes and properties edm:aggregatedCHO - Definition: This property associates an ORE aggregation with the cultural heritage object(s) (CHO for short) it is about. - Subproperty of: ore:aggregates, dc:subject, P129_is_about - Domain: ore:Aggregation - Range: edm:ProvidedCHO
  • 9. EDM Definitions  Avoids adding semantics to re-used classes and properties Except for mapping purposes, hierarchies of classes and properties for inference dc:contributor rdfs:subPropertyOf edm:hasMet .  Borderline case of axioms not in formal version of original specs ore:proxyIn - Obligation & Occurrence: A proxy may be in 1 to many aggregations, and an aggregation may have 0 to many proxies in it
  • 10. EDM Definitions First hints at data constraints edm:dataProvider - Obligation & Occurrence: Mandatory for Europeana (Minimum: 1, Maximum: 1) edm:currentLocation - Domain: The set of cultural heritage objects that Europeana collects descriptions about, represented in the EDM by ProvidedCHOs and ORE proxies for these CHOs. edm:aggregatedCHO - Obligation & Occurrence: In Europeana, an aggregation aggregates exactly one CHO
  • 11. Data validation: Europeana requirements EDM is RDF-oriented: unbounded web of information, etc. But Europeana needs to enforce constraints on the data it receives  Data that meets basic Europeana function requirements • An Aggregation should always have an edm:aggregatedCHO • There must be exactly one edm:type -- the value must be TEXT, VIDEO, SOUND, IMAGE or 3D  Data quality criteria • A ProvidedCHO should have at least a dc:title or a dc:description We need specs for validation that are easily shareable, both for humans and machines
  • 12. EDM Mapping Guidelines  Document written after the EDM Definitions  Tries to formulate clearer instructions for Europeana providers  Template-based, e.g. for provider’s Aggregation: property value type cardinality
  • 13. Machine-readable specs as OWL ontologies? OWL is good for writing constraints, but not for validation! Quite OK  “Value types” via owl:ObjectProperty owl:DatatypeProperty in OWL(DL)  Data ranges (TEXT-VIDEO-SOUND-IMAGE-3D) Less ok:  Object domain and ranges  (qualified) cardinality axioms Including combinations: (either isShownBy OR isShownAt is mandatory)
  • 14. We’ve created an OWL version of EDM […] <owl:equivalentClass> <owl:Restriction> <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality> <owl:onProperty rdf:resource="&edm;aggregatedCHO"/> </owl:Restriction> </owl:equivalentClass> […] OWL? https://github.com/europeana/corelib/blob/master/c orelib-solr-definitions/src/main/resources/eu/rdf/ But these are not really validation axioms And it’s bad practice to add semantics to classes and properties that already exist, such as ore:Aggregation (let’s be honest: we were not ready for full RDF/OWL compatibility anyway…)
  • 15. EDM is implemented by an XML Schema (for RDF data!) […] <sequence> <element ref="edm:aggregatedCHO" maxOccurs="1" minOccurs="1"/> <element ref="edm:dataProvider" maxOccurs="1" minOccurs="1"/> <element ref="edm:isShownAt" maxOccurs="1" minOccurs="0"/> <element ref="edm:isShownBy" maxOccurs="1" minOccurs="0"/> <element ref="edm:object" maxOccurs="1" minOccurs="0"/> <element ref="edm:provider" maxOccurs="1" minOccurs="1"/> <element ref="dc:rights" maxOccurs="unbounded" minOccurs="0"/> <element ref="edm:rights" maxOccurs="1" minOccurs="1"/> </sequence> […] XML Schema
  • 16. And Schematron rules: […] <sch:pattern name="Either Is shownby or is shownat should be present"> <sch:rule context="ore:Aggregation"> <sch:assert test="edm:isShownAt or edm:isShownBy"> [Error message] </sch:assert> </sch:rule> </sch:pattern> XML Schema
  • 17. XML Schema: not ideal!  Specific to a syntax  Document-centric approach to validation Back to square one: records!  Forces us to enumerate the attributes with extra constraints, especially order of elements It’s really a super-closed world  Schematron does slightly better, but then we have two constraint languages co-existing in a same implementation
  • 18. EDM as a “real” application profile?  It is an application profile, already: mixing several vocabularies, adding specific constraints  Documentation includes definitions with constraints and examples  Interpretation of constraint in APs fit quite well • AP constraints are expressed on the data • Europeana needs dataset-level validation, mostly
  • 19. EDM as a real application profile? A fragment in DSP XML <DescriptionTemplate ID=”aggregation" standalone=”yes"> <ResourceClass>ore:Aggregation</ResourceClass> <StatementTemplate minOccurs="1" maxOccurs="1”> <Property>edm:aggregatedCHO</Property> </StatementTemplate> <StatementTemplate minOccurs="1”> <Property>edm:isShownBy</Property> <Property>edm:isShownAt</Property> </StatementTemplate> </DescriptionTemplate> http://dublincore.org/documents/dc-dsp/
  • 20. Could be converted to other constraint checking formalisms (1/2): SPIN SPARQL Inferencing Notation http://spinrdf.org ore:Aggregation spin:constraint [ a sp:Ask ; sp:text """ # either isShownBy or isShownAt must be present ASK WHERE { {?this isShownBy ?image } UNION {?this isShownBy ?page } }""" ] . Issue: still looks like adding semantics to ore:Aggregation in general…
  • 21. Could be converted to other constraint checking formalisms (1/2): Stardog ICV Integrity Constraint Validation http://stardog.com/docs/sdp/ Class: ore:Aggregation SubClassOf: exactly 1 edm:aggregatedCHO Class: ore:Aggregation SubClassOf: min 1 edm:isShownBy or min 1 edm:isShownAt Note: this is OWL2’s ‘Manchester Syntax’ Stardog accepts OWL, SWRL and SPARQL, uses SPARQL as back-end Issue: still looks like adding semantics to ore:Aggregation in general…
  • 22. Conclusions Europeana requirements seem to be met by the AP approach, if this AP approach is matched with SPARQL constraints Much better than to try to partially catch constraints in OWL and XML+Schematron as isolated machine-readable specs Needs further testing incl. trying to express all constraints with DSP and SPARQL queries (with or without the help of a higher-level language) An area that needs maturation Maintainers (like us) may have validation specs in various forms DSP is not flying, RDF validation still worthy of workshops! https://www.w3.org/2012/12/rdf-val/

Notes de l'éditeur

  1. Special session page: http://dcevents.dublincore.org/IntConf/index/pages/view/APaltOO
  2. View the object at: http://www.europeana.eu/portal/record/09102/_CM_0161930.html
  3. &quot;application profile” seen a bit as a binary relation between two vocabularies, ignoring what may happen beside the vocabulary that is being re-used (i.e. whether the AP uses other elements and whether these new elements are re-used from third-party contexts or newly coined).