Bioschemas: Datasets and Data Catalogs
1. Alasdair J G Gray ELIXIR-UK
Heriot-Watt University
Carole Goble
University of Manchester
Rafael C Jimenez ELIXIR-Hub
Bioschemas
Datasets and
Data Catalogs
2. <div itemscope itemtype="http://schema.org/Recipe">
<h1 itemprop="name">Classic potato salad</h1>
<div itemprop="nutrition" itemscope
itemtype="http://schema.org/NutritionInformation">
Nutrition facts:
<span itemprop="calories">144 kcal</span>,
</div>
Ingredients:
- <span itemprop="recipeIngredient">800g small new potato</span>
- <span itemprop="recipeIngredient">3 shallot</span>
Markup for web pages
RDFa
JSON-LD
Microdata
With markup
4 Oct 2017 @bioschemas 2
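The slide lists JSON-LD as an alternative to Microdata. As a rough sketch of how the same recipe facts might look in that syntax, the Python below builds a JSON-LD document and wraps it in the script element pages commonly use for embedding. The helper name `to_script_tag` is made up for illustration; the property names come from the Microdata snippet above.

```python
import json

# The same facts as the Microdata snippet, expressed as a JSON-LD document.
recipe = {
    "@context": "http://schema.org",
    "@type": "Recipe",
    "name": "Classic potato salad",
    "nutrition": {
        "@type": "NutritionInformation",
        "calories": "144 kcal",
    },
    "recipeIngredient": ["800g small new potato", "3 shallot"],
}


def to_script_tag(doc: dict) -> str:
    """Wrap a JSON-LD document in a script element (hypothetical helper)."""
    body = json.dumps(doc, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(recipe))
```

Unlike Microdata, the JSON-LD block sits apart from the visible HTML, so the markup can be generated without touching the page layout.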
4. Schema for Datasets and Catalogs
Schema definitions:
• Dataset: A body of structured
information describing some topic(s)
of interest
http://schema.org/Dataset
– 91 properties
• DataCatalog: A collection of datasets
http://schema.org/DataCatalog
– 91 properties
5. Schema for Datasets and Catalogs
Google Profile
• Dataset: 9 basic properties
• DataCatalog: 1 property
• DataDownload: 2 properties
• Many more
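To make the relationship between the three types concrete, here is a minimal sketch of a Dataset description that points to its DataCatalog and to a DataDownload. The choice of properties is an assumption for illustration and is not taken from the Google profile; the function name and all example.org values are hypothetical.

```python
import json


def make_dataset(name, description, url, catalog_name, download_url, encoding_format):
    """Build a schema.org Dataset as a JSON-LD dict (illustrative property choice)."""
    return {
        "@context": "http://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Link from the dataset to the catalog that lists it.
        "includedInDataCatalog": {"@type": "DataCatalog", "name": catalog_name},
        # Link from the dataset to a downloadable file.
        "distribution": {
            "@type": "DataDownload",
            "contentUrl": download_url,
            "encodingFormat": encoding_format,
        },
    }


# Hypothetical values, for illustration only.
example = make_dataset(
    name="Example pathway dataset",
    description="A placeholder dataset used to illustrate the markup.",
    url="https://example.org/datasets/1",
    catalog_name="Example catalog",
    download_url="https://example.org/datasets/1.tsv",
    encoding_format="text/tab-separated-values",
)
print(json.dumps(example, indent=2))
```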
6. Bioschemas
• Schema.org for life sciences
–Introduce life sciences types
• Use case driven
–Finding data
–Presenting search results
–Metadata exchange
• Minimum properties – 6
• Link to domain ontologies
Specification on top of schema.org
Layer of constraints + documentation + extensions
Specification
Data model
Minimum information
Controlled vocabularies
Cardinality
Documentation
Examples
New (properties | types)
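The idea of a specification as a layer of constraints (minimum information plus cardinality) can be sketched as a small checker. The profile below is invented for illustration: it is not the actual Bioschemas minimum property list, and `EXAMPLE_PROFILE` and `check_profile` are hypothetical names.

```python
# Hypothetical profile: property -> (minimum count, maximum count or None for unbounded).
EXAMPLE_PROFILE = {
    "name": (1, 1),
    "description": (1, 1),
    "url": (1, 1),
    "keywords": (1, None),
}


def count_values(value):
    """Count how many values a JSON-LD property carries."""
    if value is None:
        return 0
    if isinstance(value, list):
        return len(value)
    return 1


def check_profile(doc, profile):
    """Return a list of human-readable cardinality violations."""
    problems = []
    for prop, (lowest, highest) in profile.items():
        found = count_values(doc.get(prop))
        if found < lowest:
            problems.append(f"{prop}: expected at least {lowest}, found {found}")
        elif highest is not None and found > highest:
            problems.append(f"{prop}: expected at most {highest}, found {found}")
    return problems


doc = {"@type": "Dataset", "name": "Example", "url": ["https://example.org/a", "https://example.org/b"]}
for problem in check_profile(doc, EXAMPLE_PROFILE):
    print(problem)
```

A real profile would also check controlled vocabularies and expected types; this sketch covers only presence and cardinality.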
10. Bioschemas DataCatalog
http://bioschemas.org/specifications/
A collection of datasets, e.g. catalogs, repositories, registries, …
11. Bioschemas Dataset
http://bioschemas.org/specifications/
A body of structured information describing some topic(s) of interest
12. Bioschemas Dataset Deployment
Reactome dataset
• Status: in production
• Available from: view-source:
http://reactome.org/content/detail/R-HSA-74160
• Use case: discovery
• Documentation:
http://reactome.org/ContentService/#!/discover/eventDiscoveryUsingGET
15. bioschemas.org
Acknowledgements
Haydee Artaza
Terri Atwood
Phil Barker
Dominique Batista
Niall Beard
Raoul Bonnal
Cath Brooksbank
Tony Burdett
Guillermo Calderon Mantilla
Ethy Cannon
Justin Clark-Casey
Martin Cook
Manuel Corpas
Michael R Crusoe
Pavel Dallakian
Luc Deltombe
Stephen Ficklin
Leyla Garcia
Carole Goble
Alejandra Gonzalez-Beltran
Alasdair Gray
Jeffrey Grethe
Henning Hermjakob
Richard Holland
Carlos Horro
Jon Ison
Christa Janko
Andy Jenkinson
Rafael C Jimenez
Claire Johnson
Simon Jupp
Nick Juty
Lee Larcombe
Nicolas Le Novère
Mikael Linden
Audald Lloret
Federico López Gómez
Ronald Margolis
Maria Martin
Michaela Th. Mayrhofer
Peter McQuilton
Sarah Morgan
Chris Mungall
Aleksandra Nenadic
Helen Parkinson
Roberto Preste
Giuseppe Profiti
Philippe Rocca-Serra
Gabriella Rustici
Susanna A Sansone
Vicky Schneider
Serena Scollen
Chris Taylor
Milo Thurston
Dan Timmons
John Van Horn
Susheel Varma
Sameer Velankar
Premysl Velek
Andra Waagmeester
Liz Williams
Sarala Wimalaratne
Anil Wipat
Olga Ximena Giraldo
Anita de Waard
Peter van Heusden
+ others to be added
Editor's notes
Adoption meeting: 12 catalogs marked up
Define use case
Metadata crosswalk and mapping to schema.org
Metadata providers
Metadata registries
Standards defining metadata
Bioschemas specification
Define minimum properties based on “finding” use cases
Define cardinality and suggested controlled vocabularies
Test with existing entries
Adoption by data repositories and registries
Applications
Beacon
Data Catalog
Dataset
Event
Laboratory Protocol
Organization
Person
Phenotype
Protein
Protein Annotation
Protein Structure
Sample
Standard
Tool
Training Material
For example, let's look at the landing page of the UniProt data repository. It contains a description of the repository, a link to the latest release, citation information, licensing information, etc. A human reader can understand the text, but it is not easy for a software agent to extract this information automatically. So in our data repository specification, we aim to encode this information as metadata that a software agent can extract automatically.
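As a rough illustration of what such a software agent might do, the sketch below pulls embedded JSON-LD blocks out of an HTML page using only the Python standard library. It assumes the page embeds its metadata as JSON-LD in script elements; the sample page, its values, and the names `JsonLdExtractor` and `extract_jsonld` are hypothetical and do not describe the actual UniProt markup.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip blocks that are not valid JSON.


def extract_jsonld(html):
    """Return every JSON-LD document embedded in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical landing page, for illustration only.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "DataCatalog",
 "name": "Example repository", "license": "https://example.org/license"}
</script>
</head><body><h1>Example repository</h1></body></html>
"""

for document in extract_jsonld(SAMPLE_PAGE):
    print(document["@type"], "-", document["name"])
```

Pages marked up with Microdata or RDFa would need a different parser; this sketch handles the JSON-LD case only.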