SlideShare une entreprise Scribd logo
1  sur  13
Validata: A tool for testing
profile conformance
Alasdair J G Gray
Heriot-Watt University
www.macs.hw.ac.uk/~ajg33
A.J.G.Gray@hw.ac.uk
@gray_alasdair
Andrew Beveridge
Jacob Baungard Hansen
Johnny Val
Leif Gehrmann
Roisin Farmer
Sunil Khutan
Tomas Robertson
HCLS Dataset Descriptions
https://www.w3.org/TR/hcls-dataset/
Dumontier M, Gray AJG, Marshall MS, et al. (2016) The health care
and life sciences community profile for dataset descriptions.
PeerJ 4:e2331 https://doi.org/10.7717/peerj.2331
1 December 2016
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
2
Requirements
• Online tool
– Deployable on W3C
server
– GUI
– API
• Support multiple
constraints
– Properties
– Data values
– …
• Requirement levels
– Different levels of
user messages:
Error, Warning,
Information
• Configurable
– HCLS (Required)
– DCAT, Open
PHACTS, etc
(Optional)
1 December 2016
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
3
Example Constraint
1 December 2016 4
• Shape
• A Dataset
– MUST be declared to be of type dctype:Dataset
– MUST have a dcterms:title as a language typed
string
– MUST NOT have dcterms:created date
<Dataset> rdf:langString
.
✗
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
Dates are associated
with versions in HCLS
Example Validation
1 December 2016 5
<Dataset> rdf:langString
.
✗
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
• Shape
• Data
Example Validation
• Shape
• Data
1 December 2016 6
<Dataset> rdf:langString
.
✗
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
Example Validation
1 December 2016 7
<Dataset> rdf:langString
.
✗
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
• Shape
• Data
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
Shape
1 December 2016 8
<Dataset> rdf:langString
.
✗
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
Shape Expressions (ShEx)
1 December 2016 9
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
ShEx: Validation
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
<Dataset> {
rdf:type (dctypes:Dataset),
dct:title rdf:langString,
dct:alternative rdf:langString+,
!dct:created .
}
Validator can’t warn of
missing property
Example data
<Dataset> {
`MUST` rdf:type (dctypes:Dataset),
`MUST` dct:title rdf:langString,
`MAY` dct:alternative rdf:langString+,
`MUST` !dct:created .
}
Shape
1 December 2016 10
<Dataset> rdf:langString
.
✗
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
Requirement Levels
Validator can warn of
missing property
Implementation
Validata
• Web app front end
• Javascript + HTML
• Relies on ShEx-validator
– Validates documents
– Returns report
https://github.com/HW-
SWeL/Validata
ShEx-validator
• Validation system
• Validation API
• Javascript
– nodejs engine
• Reuses
– n3: RDF Library
– ShExParser
https://github.com/HW-
SWeL/ShEx-validator
1 December 2016
@gray_alasdair
www.macs.hw.ac.uk/~ajg33
11
http://hw-swel.github.io/Validata/
VALIDATA DEMO
Validata
https://github.com/HW-SWeL/Validata
• RDF constraint validation tool
– Configurable to any profile
• Shape Expression (ShEx) constraints
• Open source javascript implementation
www.macs.hw.ac.uk/~ajg33/
A.J.G.Gray@hw.ac.uk
@gray_alasdair

Contenu connexe

Tendances

Data curation at Dryad Digital Repository: A former curator's perspective
Data curation at Dryad Digital Repository: A former curator's perspectiveData curation at Dryad Digital Repository: A former curator's perspective
Data curation at Dryad Digital Repository: A former curator's perspectiveJane Frazier
 
Citations needed for the sum of all human knowledge: Wikidata as the missing ...
Citations needed for the sum of all human knowledge: Wikidata as the missing ...Citations needed for the sum of all human knowledge: Wikidata as the missing ...
Citations needed for the sum of all human knowledge: Wikidata as the missing ...Dario Taraborelli
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories ImpactMerce Crosas
 
Verifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can editVerifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can editDario Taraborelli
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Merce Crosas
 
Using Funding Data
Using Funding DataUsing Funding Data
Using Funding DataCrossref
 
Yosemite part-4 webinar-final
Yosemite part-4 webinar-finalYosemite part-4 webinar-final
Yosemite part-4 webinar-finalDATAVERSITY
 
Introduction-and-RDF-Representation-of-FHIR-for-Clinical-Data
Introduction-and-RDF-Representation-of-FHIR-for-Clinical-DataIntroduction-and-RDF-Representation-of-FHIR-for-Clinical-Data
Introduction-and-RDF-Representation-of-FHIR-for-Clinical-DataDATAVERSITY
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleMerce Crosas
 
DataTags, The Tags Toolset, and Dataverse Integration
DataTags, The Tags Toolset, and Dataverse IntegrationDataTags, The Tags Toolset, and Dataverse Integration
DataTags, The Tags Toolset, and Dataverse IntegrationMichael Bar-Sinai
 
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...DATAVERSITY
 
ORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice MeadowsORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice MeadowsCrossref
 
FundRef Webinar
FundRef WebinarFundRef Webinar
FundRef WebinarCrossref
 
BibBase Linked Data Triplification Challenge 2010 Presentation
BibBase Linked Data Triplification Challenge 2010 PresentationBibBase Linked Data Triplification Challenge 2010 Presentation
BibBase Linked Data Triplification Challenge 2010 PresentationReynold Xin
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of FoodBenjamin Good
 
Who is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny HendricksWho is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny HendricksCrossref
 
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can EditWikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can EditDario Taraborelli
 
ReVeaLD: A user-driven domain-specific interactive search platform for biomed...
ReVeaLD: A user-driven domain-specific interactive search platform for biomed...ReVeaLD: A user-driven domain-specific interactive search platform for biomed...
ReVeaLD: A user-driven domain-specific interactive search platform for biomed...Maulik Kamdar
 
Creating Incentives
Creating IncentivesCreating Incentives
Creating Incentivesdatacite
 
Open semantic chemical structures
Open semantic chemical structuresOpen semantic chemical structures
Open semantic chemical structuresStuart Chalk
 

Tendances (20)

Data curation at Dryad Digital Repository: A former curator's perspective
Data curation at Dryad Digital Repository: A former curator's perspectiveData curation at Dryad Digital Repository: A former curator's perspective
Data curation at Dryad Digital Repository: A former curator's perspective
 
Citations needed for the sum of all human knowledge: Wikidata as the missing ...
Citations needed for the sum of all human knowledge: Wikidata as the missing ...Citations needed for the sum of all human knowledge: Wikidata as the missing ...
Citations needed for the sum of all human knowledge: Wikidata as the missing ...
 
Data Repositories Impact
Data Repositories ImpactData Repositories Impact
Data Repositories Impact
 
Verifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can editVerifiable, linked open knowledge that anyone can edit
Verifiable, linked open knowledge that anyone can edit
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
 
Using Funding Data
Using Funding DataUsing Funding Data
Using Funding Data
 
Yosemite part-4 webinar-final
Yosemite part-4 webinar-finalYosemite part-4 webinar-final
Yosemite part-4 webinar-final
 
Introduction-and-RDF-Representation-of-FHIR-for-Clinical-Data
Introduction-and-RDF-Representation-of-FHIR-for-Clinical-DataIntroduction-and-RDF-Representation-of-FHIR-for-Clinical-Data
Introduction-and-RDF-Representation-of-FHIR-for-Clinical-Data
 
Connecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life CycleConnecting Dataverse with the Research Life Cycle
Connecting Dataverse with the Research Life Cycle
 
DataTags, The Tags Toolset, and Dataverse Integration
DataTags, The Tags Toolset, and Dataverse IntegrationDataTags, The Tags Toolset, and Dataverse Integration
DataTags, The Tags Toolset, and Dataverse Integration
 
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...
Yosemite Project - Part 3 - Transformations for Integrating VA data with FHIR...
 
ORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice MeadowsORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice Meadows
 
FundRef Webinar
FundRef WebinarFundRef Webinar
FundRef Webinar
 
BibBase Linked Data Triplification Challenge 2010 Presentation
BibBase Linked Data Triplification Challenge 2010 PresentationBibBase Linked Data Triplification Challenge 2010 Presentation
BibBase Linked Data Triplification Challenge 2010 Presentation
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of Food
 
Who is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny HendricksWho is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny Hendricks
 
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can EditWikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
 
ReVeaLD: A user-driven domain-specific interactive search platform for biomed...
ReVeaLD: A user-driven domain-specific interactive search platform for biomed...ReVeaLD: A user-driven domain-specific interactive search platform for biomed...
ReVeaLD: A user-driven domain-specific interactive search platform for biomed...
 
Creating Incentives
Creating IncentivesCreating Incentives
Creating Incentives
 
Open semantic chemical structures
Open semantic chemical structuresOpen semantic chemical structures
Open semantic chemical structures
 

Similaire à Validata: A tool for testing profile conformance

Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceCarole Goble
 
Howe et al. - 2015 - BioAssay Research Database (BARD) chemical biolog
Howe et al. - 2015 - BioAssay Research Database (BARD) chemical biologHowe et al. - 2015 - BioAssay Research Database (BARD) chemical biolog
Howe et al. - 2015 - BioAssay Research Database (BARD) chemical biologEleanor Howe
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...GigaScience, BGI Hong Kong
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOCMerce Crosas
 
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...Scott Edmunds
 
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...GigaScience, BGI Hong Kong
 
National Data Archive (NADA) 3.0
National Data Archive (NADA) 3.0National Data Archive (NADA) 3.0
National Data Archive (NADA) 3.0mehmood78
 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon SummaryIan Foster
 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasIan Foster
 
Google Scholar's citation graph: comprehensive, global... and inaccessible
Google Scholar's citation graph: comprehensive, global... and inaccessibleGoogle Scholar's citation graph: comprehensive, global... and inaccessible
Google Scholar's citation graph: comprehensive, global... and inaccessiblealbertomartin101
 
Data Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumData Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumMerce Crosas
 
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataCEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataAhmad C. Bukhari
 
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataCEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataSyed Ahmad Chan Bukhari, PhD
 
Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...
Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...
Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...Mary Bass
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCarole Goble
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
Data Standards & Best Practices for the Stratigraphic Record
Data Standards & Best Practices for the Stratigraphic RecordData Standards & Best Practices for the Stratigraphic Record
Data Standards & Best Practices for the Stratigraphic RecordKerstin Lehnert
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryAnita de Waard
 

Similaire à Validata: A tool for testing profile conformance (20)

Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 
Howe et al. - 2015 - BioAssay Research Database (BARD) chemical biolog
Howe et al. - 2015 - BioAssay Research Database (BARD) chemical biologHowe et al. - 2015 - BioAssay Research Database (BARD) chemical biolog
Howe et al. - 2015 - BioAssay Research Database (BARD) chemical biolog
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
 
Dataverse on the MOC
Dataverse on the MOCDataverse on the MOC
Dataverse on the MOC
 
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
 
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
 
National Data Archive (NADA) 3.0
National Data Archive (NADA) 3.0National Data Archive (NADA) 3.0
National Data Archive (NADA) 3.0
 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon Summary
 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture Ideas
 
Google Scholar's citation graph: comprehensive, global... and inaccessible
Google Scholar's citation graph: comprehensive, global... and inaccessibleGoogle Scholar's citation graph: comprehensive, global... and inaccessible
Google Scholar's citation graph: comprehensive, global... and inaccessible
 
Data Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access SymposiumData Publishing at Harvard's Research Data Access Symposium
Data Publishing at Harvard's Research Data Access Symposium
 
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataCEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
 
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataCEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
 
Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...
Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...
Building Protected Data Sharing Networks to Advance Cancer Risk Assessment an...
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Data Standards & Best Practices for the Stratigraphic Record
Data Standards & Best Practices for the Stratigraphic RecordData Standards & Best Practices for the Stratigraphic Record
Data Standards & Best Practices for the Stratigraphic Record
 
Data Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost RecoveryData Repositories: Recommendation, Certification and Models for Cost Recovery
Data Repositories: Recommendation, Certification and Models for Cost Recovery
 

Plus de Alasdair Gray

Using a Jupyter Notebook to perform a reproducible scientific analysis over s...
Using a Jupyter Notebook to perform a reproducible scientific analysis over s...Using a Jupyter Notebook to perform a reproducible scientific analysis over s...
Using a Jupyter Notebook to perform a reproducible scientific analysis over s...Alasdair Gray
 
Bioschemas Community: Developing profiles over Schema.org to make life scienc...
Bioschemas Community: Developing profiles over Schema.org to make life scienc...Bioschemas Community: Developing profiles over Schema.org to make life scienc...
Bioschemas Community: Developing profiles over Schema.org to make life scienc...Alasdair Gray
 
Open PHACTS: The Data Today
Open PHACTS: The Data TodayOpen PHACTS: The Data Today
Open PHACTS: The Data TodayAlasdair Gray
 
Data Integration in a Big Data Context: An Open PHACTS Case Study
Data Integration in a Big Data Context: An Open PHACTS Case StudyData Integration in a Big Data Context: An Open PHACTS Case Study
Data Integration in a Big Data Context: An Open PHACTS Case StudyAlasdair Gray
 
Data Integration in a Big Data Context
Data Integration in a Big Data ContextData Integration in a Big Data Context
Data Integration in a Big Data ContextAlasdair Gray
 
Scientific lenses to support multiple views over linked chemistry data
Scientific lenses to support multiple views over linked chemistry dataScientific lenses to support multiple views over linked chemistry data
Scientific lenses to support multiple views over linked chemistry dataAlasdair Gray
 
Scientific Lenses over Linked Data An approach to support multiple integrate...
Scientific Lenses over Linked Data An approach to support multiple integrate...Scientific Lenses over Linked Data An approach to support multiple integrate...
Scientific Lenses over Linked Data An approach to support multiple integrate...Alasdair Gray
 
Describing Scientific Datasets: The HCLS Community Profile
Describing Scientific Datasets: The HCLS Community ProfileDescribing Scientific Datasets: The HCLS Community Profile
Describing Scientific Datasets: The HCLS Community ProfileAlasdair Gray
 
Data Science meets Linked Data
Data Science meets Linked DataData Science meets Linked Data
Data Science meets Linked DataAlasdair Gray
 
Sensors and Big Data for Health and Well-being
Sensors and Big Data for Health and Well-beingSensors and Big Data for Health and Well-being
Sensors and Big Data for Health and Well-beingAlasdair Gray
 
Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...
Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...
Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...Alasdair Gray
 
Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSAlasdair Gray
 
Computing Identity Co-Reference Across Drug Discovery Datasets
Computing Identity Co-Reference Across Drug Discovery DatasetsComputing Identity Co-Reference Across Drug Discovery Datasets
Computing Identity Co-Reference Across Drug Discovery DatasetsAlasdair Gray
 
Incorporating Commercial and Private Data into an Open Linked Data Platform f...
Incorporating Commercial and Private Data into an Open Linked Data Platform f...Incorporating Commercial and Private Data into an Open Linked Data Platform f...
Incorporating Commercial and Private Data into an Open Linked Data Platform f...Alasdair Gray
 
Including Co-Referent URIs in a SPARQL Query
Including Co-Referent URIs in a SPARQL QueryIncluding Co-Referent URIs in a SPARQL Query
Including Co-Referent URIs in a SPARQL QueryAlasdair Gray
 
2013 01-14 ops-dataset_descriptions
2013 01-14 ops-dataset_descriptions2013 01-14 ops-dataset_descriptions
2013 01-14 ops-dataset_descriptionsAlasdair Gray
 

Plus de Alasdair Gray (19)

Using a Jupyter Notebook to perform a reproducible scientific analysis over s...
Using a Jupyter Notebook to perform a reproducible scientific analysis over s...Using a Jupyter Notebook to perform a reproducible scientific analysis over s...
Using a Jupyter Notebook to perform a reproducible scientific analysis over s...
 
Bioschemas Community: Developing profiles over Schema.org to make life scienc...
Bioschemas Community: Developing profiles over Schema.org to make life scienc...Bioschemas Community: Developing profiles over Schema.org to make life scienc...
Bioschemas Community: Developing profiles over Schema.org to make life scienc...
 
Open PHACTS: The Data Today
Open PHACTS: The Data TodayOpen PHACTS: The Data Today
Open PHACTS: The Data Today
 
Project X
Project XProject X
Project X
 
Data Integration in a Big Data Context: An Open PHACTS Case Study
Data Integration in a Big Data Context: An Open PHACTS Case StudyData Integration in a Big Data Context: An Open PHACTS Case Study
Data Integration in a Big Data Context: An Open PHACTS Case Study
 
Data Integration in a Big Data Context
Data Integration in a Big Data ContextData Integration in a Big Data Context
Data Integration in a Big Data Context
 
Data Linkage
Data LinkageData Linkage
Data Linkage
 
Scientific lenses to support multiple views over linked chemistry data
Scientific lenses to support multiple views over linked chemistry dataScientific lenses to support multiple views over linked chemistry data
Scientific lenses to support multiple views over linked chemistry data
 
Scientific Lenses over Linked Data An approach to support multiple integrate...
Scientific Lenses over Linked Data An approach to support multiple integrate...Scientific Lenses over Linked Data An approach to support multiple integrate...
Scientific Lenses over Linked Data An approach to support multiple integrate...
 
Describing Scientific Datasets: The HCLS Community Profile
Describing Scientific Datasets: The HCLS Community ProfileDescribing Scientific Datasets: The HCLS Community Profile
Describing Scientific Datasets: The HCLS Community Profile
 
SensorBench
SensorBenchSensorBench
SensorBench
 
Data Science meets Linked Data
Data Science meets Linked DataData Science meets Linked Data
Data Science meets Linked Data
 
Sensors and Big Data for Health and Well-being
Sensors and Big Data for Health and Well-beingSensors and Big Data for Health and Well-being
Sensors and Big Data for Health and Well-being
 
Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...
Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...
Scientific Lenses over Linked Data: Identity Management in the Open PHACTS p...
 
Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLS
 
Computing Identity Co-Reference Across Drug Discovery Datasets
Computing Identity Co-Reference Across Drug Discovery DatasetsComputing Identity Co-Reference Across Drug Discovery Datasets
Computing Identity Co-Reference Across Drug Discovery Datasets
 
Incorporating Commercial and Private Data into an Open Linked Data Platform f...
Incorporating Commercial and Private Data into an Open Linked Data Platform f...Incorporating Commercial and Private Data into an Open Linked Data Platform f...
Incorporating Commercial and Private Data into an Open Linked Data Platform f...
 
Including Co-Referent URIs in a SPARQL Query
Including Co-Referent URIs in a SPARQL QueryIncluding Co-Referent URIs in a SPARQL Query
Including Co-Referent URIs in a SPARQL Query
 
2013 01-14 ops-dataset_descriptions
2013 01-14 ops-dataset_descriptions2013 01-14 ops-dataset_descriptions
2013 01-14 ops-dataset_descriptions
 

Dernier

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Dernier (20)

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Validata: A tool for testing profile conformance

  • 1. Validata: A tool for testing profile conformance Alasdair J G Gray Heriot-Watt University www.macs.hw.ac.uk/~ajg33 A.J.G.Gray@hw.ac.uk @gray_alasdair Andrew Beveridge Jacob Baungard Hansen Johnny Val Leif Gehrmann Roisin Farmer Sunil Khutan Tomas Robertson
  • 2. HCLS Dataset Descriptions https://www.w3.org/TR/hcls-dataset/ Dumontier M, Gray AJG, Marshall MS, et al. (2016) The health care and life sciences community profile for dataset descriptions. PeerJ 4:e2331 https://doi.org/10.7717/peerj.2331 1 December 2016 @gray_alasdair www.macs.hw.ac.uk/~ajg33 2
  • 3. Requirements • Online tool – Deployable on W3C server – GUI – API • Support multiple constraints – Properties – Data values – … • Requirement levels – Different levels of user messages: Error, Warning, Information • Configurable – HCLS (Required) – DCAT, Open PHACTS, etc (Optional) 1 December 2016 @gray_alasdair www.macs.hw.ac.uk/~ajg33 3
  • 4. Example Constraint 1 December 2016 4 • Shape • A Dataset – MUST be declared to be of type dctype:Dataset – MUST have a dcterms:title as a language typed string – MUST NOT have dcterms:created date <Dataset> rdf:langString . ✗ @gray_alasdair www.macs.hw.ac.uk/~ajg33 Dates are associated with versions in HCLS
  • 5. Example Validation 1 December 2016 5 <Dataset> rdf:langString . ✗ @gray_alasdair www.macs.hw.ac.uk/~ajg33 • Shape • Data
  • 6. Example Validation • Shape • Data 1 December 2016 6 <Dataset> rdf:langString . ✗ @gray_alasdair www.macs.hw.ac.uk/~ajg33
  • 7. Example Validation 1 December 2016 7 <Dataset> rdf:langString . ✗ @gray_alasdair www.macs.hw.ac.uk/~ajg33 • Shape • Data
  • 8. <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } Shape 1 December 2016 8 <Dataset> rdf:langString . ✗ @gray_alasdair www.macs.hw.ac.uk/~ajg33 Shape Expressions (ShEx)
  • 9. 1 December 2016 9 @gray_alasdair www.macs.hw.ac.uk/~ajg33 ShEx: Validation <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } <Dataset> { rdf:type (dctypes:Dataset), dct:title rdf:langString, dct:alternative rdf:langString+, !dct:created . } Validator can’t warn of missing property Example data
  • 10. <Dataset> { `MUST` rdf:type (dctypes:Dataset), `MUST` dct:title rdf:langString, `MAY` dct:alternative rdf:langString+, `MUST` !dct:created . } Shape 1 December 2016 10 <Dataset> rdf:langString . ✗ @gray_alasdair www.macs.hw.ac.uk/~ajg33 Requirement Levels Validator can warn of missing property
  • 11. Implementation Validata • Web app front end • Javascript + HTML • Relies on ShEx-validator – Validates documents – Returns report https://github.com/HW- SWeL/Validata ShEx-validator • Validation system • Validation API • Javascript – nodejs engine • Reuses – n3: RDF Library – ShExParser https://github.com/HW- SWeL/ShEx-validator 1 December 2016 @gray_alasdair www.macs.hw.ac.uk/~ajg33 11
  • 13. Validata https://github.com/HW-SWeL/Validata • RDF constraint validation tool – Configurable to any profile • Shape Expression (ShEx) constraints • Open source javascript implementation www.macs.hw.ac.uk/~ajg33/ A.J.G.Gray@hw.ac.uk @gray_alasdair

Notes de l'éditeur

  1. Motivation: how do we check descriptions conform? Summary level: time unchanging information, e.g. name, description, publisher Version level: version specific information, e.g. version number, creator, etc Distribution level: file specific information, e.g. file location and format, number of triples 18 vocabularies: DCTerms, DCAT, VoID, FOAF, … 61 prescribed properties: MUST, SHOULD, MAY, MUST NOT for each level
  2. Link into data publishing pipeline via API Not tied to HCLS, only a motivation No existing tool meets these needs
  3. Constraints form a graph pattern that data must comply with
  4. How do we validate that our example data conforms to a certain shape Express expected shape as ShEx Toy example, what about for real
  5. How do we validate that our example data conforms to a certain shape Express expected shape as ShEx Toy example, what about for real
  6. How do we validate that our example data conforms to a certain shape Express expected shape as ShEx Toy example, what about for real
  7. ShEx: Concise notation regex based W3C SHACL not stable when work done ShEx is an implementation of SHACL with extra features
  8. Step through validation process
  9. Extended ShEx to allow arbitrary hierarchies Toy example, what about for real
  10. ShEx-validator has other dependencies too Minimist: arguments parser Promise: call backs Pegjs: parser generator Mocha: test driven development