Supported by the NIH grant 1U24 AI117966-01 to UCSD
PI, Co-Investigators at:
The DATS model annotated with schema.org
Susanna-Assunta Sansone,
Alejandra Gonzalez-Beltran, Philippe Rocca-Serra
Oxford e-Research Centre, University of Oxford, UK
bioCADDIE - DATS Workshop, Bethesda, 7 May 2017
The DATS model
What is DATS?
Just as JATS (Journal Article Tag Suite) is used by PubMed to index literature,
DATS (DatA Tag Suite) is needed as a scalable way to index data sources in
the DataMed prototype
Where do I find the documentation?
bioCADDIE team’s work on DATS and our community engagement: input, feedback and links
Timeline: Mar15, Jun15, Aug15, Dec15, May16, Jun16, Sep16, Mar17, May17
Phase 1: Design and development
• Use cases collection: George, ExcTeam
• Method development (SOP); competency questions; metadata mapping and definition: Susanna, Alejandra, Philippe
Phase 2: Evaluation & iterative refinement
• DATS specification, serializations, refinement: (Susanna), Alejandra, Philippe and CoreDevTeam, also based on community feedback
Phase 3: Continued evaluation & consolidation
• Alignment with other community efforts; documentation and curation guidelines: Susanna, Alejandra, Philippe, Jared, ExcTeam and CoreDevTeam
bioCADDIE team milestones
• SOP and metadata strawman
• Metadata specification V1.0 with JSON schema
• DATS v1.1
• DATS v2.0 (with access metadata, WG7)
• DATS v2.1 (schema.org JSON-LD)
• DATS v2.2
Community engagement milestones
• Use cases workshop
• WG3 formed; telecons start; dissemination via [logos]
• 1st DATS workshop
• WG7 formed; telecons start
• WG12 formed; telecons start
• 2nd DATS workshop
Participants: primarily metadata modelers; primarily implementers
What is DATS supposed to do and be?
❖ Enabling discoverability: find and access datasets
❖ Focusing on surfacing key metadata descriptors, such as
✧ information on, and relations between, authors, datasets, publications,
funding sources, nature of biological signal and perturbation, etc.
✧ It is not the perfect model to represent the experimental details
✧ the level of detail and metadata needed to ensure interoperability
and reusability is left to the indexed databases
❖ Better than just having keywords
✧ we have aimed for maximum coverage of use cases with a
minimal number of data elements and relations
The development process in a nutshell (v1.0, v1.1, v2.0, v2.1, v2.2)
Metadata elements identified by combining two complementary approaches:
USE CASES: top-down approach
SCHEMAS: bottom-up approach
Extracting requirements from use cases
❖ Selected competency questions
✧ representative set collected from the use cases workshop, the white paper,
community submissions, NIH and Phil Bourne’s ADDS office
✧ key metadata elements processed: abstracted, color-coded and binned as
Material, Process, Information, Properties; relations identified
top-down approach
bottom-up approach
Standing on the shoulders of giants
❖ schema.org
❖ DataCite
❖ RIF-CS
❖ W3C HCLS dataset descriptions (mapping of many models including DCAT, PROV, VoID, Dublin Core)
❖ Project Open Metadata (used by HealthData.gov); being added in this new iteration
❖ ……(full list in the DATS specification)
❖ ISA
❖ BioProject
❖ BioSample
❖ ……(full list in the DATS specification)
❖ MINiML
❖ PRIDE-ml
❖ MAGE-TAB
❖ GA4GH metadata schema
❖ SRA XML
❖ CDISC SDM / element of BRIDG model
❖ ……(full list in the DATS specification)
Convergence of elements extracted from competency questions and existing (generic and biomedical) data models (incl. DataCite, DCAT, schema.org, HCLS dataset, RIF-CS, ISA-Tab, SRA-xml etc.)
DATS model for scalable indexing: core entities and extended entities
Adoption of elements extracted from [logo] and from [logo]
Key features of DATS
❖ The descriptors for each metadata element (Entity) include
✧ Property (describing the Entity), Definition (of each Entity and Property),
Value(s) (allowed for each Property)
❖ We have defined a set of core and extended entities
✧ Core elements are generic and applicable to any type of dataset, just as
JATS can describe any type of publication
✧ Extended elements include additional elements, some of which are
specific to the life, environmental and biomedical science domains
✧ this set can be further extended as needed
❖ Entities are not mandatory, in both the core and the extended set
✧ An entity is used only when applicable to the dataset to be described
✧ In that case, only a few of its properties are defined as mandatory
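The rules above (each entity has properties, definitions and allowed values; entities are optional; a used entity has only a few mandatory properties) can be sketched as a small check. This is a minimal illustration, not the DATS JSON schema: the entity names, property names, and the choice of which properties are mandatory are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class PropertySpec:
    """Descriptor of one property: definition, allowed values, mandatory flag."""
    definition: str
    allowed_values: tuple = ()  # empty tuple means "any value is accepted"
    mandatory: bool = False


@dataclass
class EntitySpec:
    """Descriptor of one entity and the properties that describe it."""
    definition: str
    properties: dict = field(default_factory=dict)


# Hypothetical specs for illustration only (not copied from the DATS specification).
SPECS = {
    "Dataset": EntitySpec(
        definition="Any unit of information",
        properties={
            "title": PropertySpec("Name of the dataset", mandatory=True),
            "description": PropertySpec("Free-text summary"),
        },
    ),
    "Date": EntitySpec(
        definition="An event and its timestamp",
        properties={
            "type": PropertySpec(
                "Nature of the event",
                allowed_values=("create", "modify", "start", "end"),
                mandatory=True,
            ),
            "date": PropertySpec("Timestamp of the event", mandatory=True),
        },
    ),
}


def validate(record: dict) -> list:
    """Return a list of problems; entities absent from the record are skipped."""
    problems = []
    for entity_name, values in record.items():
        spec = SPECS.get(entity_name)
        if spec is None:
            problems.append(f"unknown entity: {entity_name}")
            continue
        for prop_name, prop in spec.properties.items():
            if prop.mandatory and prop_name not in values:
                problems.append(f"{entity_name}.{prop_name} is mandatory")
            value = values.get(prop_name)
            if value is not None and prop.allowed_values and value not in prop.allowed_values:
                problems.append(f"{entity_name}.{prop_name} has disallowed value {value!r}")
    return problems
```

A record that omits the Date entity altogether passes, while a record that uses Date but leaves out its type is reported; that mirrors "entities are optional, a few properties are mandatory".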
❖ Dataset, a core entity catering for any unit of information
✧ archived experimental datasets, which do not change after deposition to the
repository => examples available for dbGaP, GEO, ClinicalTrials.gov
✧ datasets in reference knowledge bases, describing dynamic concepts, such
as “genes”, whose definition morphs over time => examples available for
UniProt
❖ The Dataset entity is also linked to other digital research objects
✧ Software and Data Standard, which are also part of the NIH Commons but
are the focus of other discovery indexes, and therefore are not described in
detail in this model
General design of the DATS core and extended elements
Of the 20 core elements, none is mandatory
Only a few properties of the 20 core elements are mandatory
Core elements provide the basic info
❖ What is the dataset about?
✧ Material
❖ How was the dataset produced? Which information does it hold?
✧ Dataset / Data Type with its Information, Method, Platform,
Instrument
❖ Where can a dataset be found?
✧ Dataset, Distribution, Access objects (links to License)
❖ When was the dataset produced, released etc.?
✧ Dates to specify the nature of an event {create, modify, start, end...}
and its timestamp
❖ Who did the work, funded the research, hosts the resources etc.?
✧ Person, Organization and their roles, Grant
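As a rough sketch of how the five questions map onto a record, the snippet below groups the parts of a dataset description by the question they answer. Everything in it is an assumption made for illustration: the key names only echo the core elements listed above (they are not the property names of the DATS JSON schema), and all values are made up.

```python
# Hypothetical record layout; keys echo the core elements, values are invented.
record = {
    "material": ["Homo sapiens"],
    "data_type": {"information": "gene expression", "method": "RNA-seq"},
    "distribution": {
        "access": {"landing_page": "https://example.org/dataset/1"},
        "license": "CC0",
    },
    "dates": [{"type": "create", "date": "2017-05-07"}],
    "creators": [
        {"person": "A. Researcher", "organization": "Example University", "role": "author"}
    ],
    "grants": ["EXAMPLE-GRANT-0001"],
}

# Which (hypothetical) keys answer which question.
QUESTIONS = {
    "what": ("material",),
    "how": ("data_type",),
    "where": ("distribution",),
    "when": ("dates",),
    "who": ("creators", "grants"),
}


def answer(rec: dict, question: str) -> dict:
    """Return the parts of the record that address one of the five questions."""
    return {key: rec[key] for key in QUESTIONS[question] if key in rec}


for q in QUESTIONS:
    print(q, "->", answer(record, q))
```

Because no entity is mandatory, `answer` returns an empty dict when a record does not carry the relevant elements, rather than raising an error.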
Interlinking to other indexes
DATS also follows the W3C Data on the Web Best Practices (https://www.w3.org/TR/dwbp), which also recommend DatasetDistribution
Serializations and use of schema.org
❖ DATS model in JSON schema, serialized as:
✧ JSON* format, and
✧ JSON-LD** with vocabulary from schema.org
✧ serializations in other formats can also be done, as / if needed
❖ Benefits for DataMed and databases indexed by DataMed
✧ Increased visibility (to popular search engines), accessibility
(via common query interfaces) and possibly improved ranking
❖ Extending schema.org
✧ Submitted the missing DATS core elements to the schema.org tracker
✧ Coordinating the extension of schema.org for the life sciences via the
bioschemas.org initiative (of which ELIXIR is also part)
* JavaScript Object Notation
** JavaScript Object Notation for Linked Data
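To make the difference between the two serializations concrete, here is a minimal sketch that takes a plain JSON record and emits a JSON-LD document using schema.org vocabulary. `Dataset`, `name`, `description`, `license` and `url` are schema.org terms; the plain-JSON keys on the left of the mapping and the sample values are hypothetical and are not taken from the DATS schemas.

```python
import json

# Hypothetical mapping from plain-JSON keys to schema.org terms.
KEY_TO_SCHEMA_ORG = {
    "title": "name",
    "description": "description",
    "license": "license",
    "landing_page": "url",
}


def to_json_ld(record: dict) -> dict:
    """Build a JSON-LD document typed as a schema.org Dataset."""
    doc = {"@context": "https://schema.org", "@type": "Dataset"}
    for key, value in record.items():
        term = KEY_TO_SCHEMA_ORG.get(key)
        if term is not None:  # keys without a schema.org mapping are left out
            doc[term] = value
    return doc


# Made-up record, for illustration only.
plain = {
    "title": "Example dataset",
    "description": "A made-up record",
    "landing_page": "https://example.org/dataset/1",
    "internal_id": 42,
}

print(json.dumps(plain, indent=2))              # plain JSON serialization
print(json.dumps(to_json_ld(plain), indent=2))  # JSON-LD with schema.org terms
```

Keys with no schema.org counterpart are dropped in this sketch, which is one way to see why elements missing from schema.org would need to be proposed to its tracker.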

👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 

Introduction to DATS v2.2 - NIH May 2017

  • 1. The DATS model annotated with schema.org. Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Oxford e-Research Centre, University of Oxford, UK. bioCADDIE - DATS Workshop, Bethesda, 7 May 2017. Supported by the NIH grant 1U24 AI117966-01 to UCSD. PI, Co-Investigators at:
  • 2.
  • 4. What is DATS? Just as JATS (Journal Article Tag Suite) is used by PubMed to index literature, DATS (DatA Tag Suite) is needed as a scalable way to index data sources in the DataMed prototype
  • 5. Where do I find the documentation?
  • 6-11. Timeline (Mar15, Jun15, Aug15, Dec15, May16, Jun16, Sep16, Mar17, May17)
      ❖ Phase 1: Design and development
        ✧ Use cases collection = George, ExcTeam
        ✧ Method development (SOP); competency questions; metadata mapping and definition = Susanna, Alejandra, Philippe
      ❖ Phase 2: Evaluation & iterative refinement
        ✧ DATS specification, serializations, refinement = (Susanna), Alejandra, Philippe and CoreDevTeam, also based on community feedback
      ❖ Phase 3: Continued evaluation & consolidation
        ✧ Alignment with other community efforts; documentation and curation guidelines = Susanna, Alejandra, Philippe, Jared, ExcTeam and CoreDevTeam
      ❖ bioCADDIE team’s work on DATS
        ✧ SOP and metadata strawman
        ✧ <DATS> name
        ✧ Metadata specification V1.0 with JSON schema
        ✧ DATS v1.1
        ✧ DATS v2.0 (with access metadata, WG7)
        ✧ DATS v2.1 (schema.org JSON-LD)
        ✧ DATS v2.2
      ❖ Our community engagement: input, feedback and links
        ✧ Use cases workshop
        ✧ WG3 formed; telecons start; dissemination via
        ✧ 1st DATS workshop
        ✧ WG7 formed; telecons start
        ✧ 2nd DATS workshop
        ✧ WG12 formed; telecons start
        ✧ primarily metadata modelers; primarily implementers
  • 12. What is DATS supposed to do and be?
      ❖ Enabling discoverability: find and access datasets
      ❖ Focusing on surfacing key metadata descriptors, such as
        ✧ information on, and relations between, authors, datasets, publications, funding sources, nature of biological signal and perturbation etc.
        ✧ Not the perfect model to represent the experimental details
        ✧ the level of detail and the metadata needed to ensure interoperability and reusability are left to the indexed databases
      ❖ Better than just having keywords
        ✧ we have aimed for maximum coverage of use cases with a minimal number of data elements and relations
  • 13. The development process in a nutshell (v1.0, v1.1, v2.0, v2.1, v2.2). Metadata elements identified by combining the two complementary approaches: USE CASES (top-down approach) and SCHEMAS (bottom-up approach)
  • 14. Extracting requirements from use cases (top-down approach)
      ❖ Selected competency questions
        ✧ representative set collected from: use cases workshop, white paper, submissions by the community, and from NIH and Phil Bourne’s ADDS office
        ✧ key metadata elements processed: abstracted, color-coded, and terms binned as Material, Process, Information, Properties; relations identified
  • 15. Standing on the shoulders of giants (bottom-up approach)
      ❖ schema.org ❖ DataCite ❖ RIF-CS ❖ W3C HCLS dataset descriptions (mapping of many models including DCAT, PROV, VOID, Dublin Core) ❖ Project Open Metadata (used by HealthData.gov; being added in this new iteration) ❖ ……(full list in the DATS specification)
      ❖ ISA ❖ BioProject ❖ BioSample ❖ ……(full list in the DATS specification)
      ❖ MiNIML ❖ PRIDE-ml ❖ MAGE-tab ❖ GA4GH metadata schema ❖ SRA xml ❖ CDISC SDM / element of BRIDGE model ❖ ……(full list in the DATS specification)
  • 16. DATS model for scalable indexing. Convergence of elements extracted from competency questions and existing (generic and biomedical) data models (incl. DataCite, DCAT, schema.org, HCLS dataset, RIF-CS, ISA-Tab, SRA-xml etc.). Adoption of elements extracted from and from. Core entities; extended entities
  • 17-19. Key features of DATS
      ❖ The descriptors for each metadata element (Entity) include
        ✧ Property (describing the Entity), Definition (of each Entity and Property), Value(s) (allowed for each Property)
      ❖ We have defined a set of core and extended entities
        ✧ Core elements are generic and applicable to any type of dataset, just as JATS can describe any type of publication
        ✧ Extended elements include additional elements, some of which are specific to the life, environmental and biomedical science domains
        ✧ this set can be further extended as needed
      ❖ Entities are not mandatory, in both the core and the extended set
        ✧ An entity is used only when applicable to the dataset to be described
        ✧ In that case, only a few of its properties are defined as mandatory
  • 20. General design of DATS
      ❖ Dataset, a core entity catering for any unit of information
        ✧ archived experimental datasets, which do not change after deposition to the repository => examples available for dbGaP, GEO, ClinicalTrials.gov
        ✧ datasets in reference knowledge bases, describing dynamic concepts, such as “genes”, whose definition morphs over time => examples available for UniProt
      ❖ Dataset entity is also linked to other digital research objects
        ✧ Software and Data Standard, which are also part of the NIH Commons, but are the focus of other discovery indexes and therefore are not described in detail in this model
  • 21. core and extended elements
  • 22. Of the 20 core elements, none is mandatory
  • 23. Only a few properties of the 20 core elements are mandatory
  • 24. Core elements provide the basic info
      ❖ What is the dataset about?
        ✧ Material
      ❖ How was the dataset produced? Which information does it hold?
        ✧ Dataset / Data Type with its Information, Method, Platform, Instrument
      ❖ Where can a dataset be found?
        ✧ Dataset, Distribution, Access objects (links to License)
      ❖ When was the dataset produced, released etc.?
        ✧ Dates to specify the nature of an event {create, modify, start, end...} and its timestamp
      ❖ Who did the work, funded the research, hosts the resources etc.?
        ✧ Person, Organization and their roles, Grant
  • 26. DATS also follows the W3C Data on the Web Best Practices, which also recommend Dataset and Distribution: https://www.w3.org/TR/dwbp
  • 27. Serializations and use of schema.org
      ❖ DATS model in JSON schema, serialized as:
        ✧ JSON (JavaScript Object Notation) format, and
        ✧ JSON-LD (JavaScript Object Notation for Linked Data) with vocabulary from schema.org
        ✧ serializations in other formats can also be done, as / if needed
      ❖ Benefits for DataMed and the databases indexed by DataMed
        ✧ Increased visibility (by popular search engines), accessibility (via common query interfaces) and possibly improved ranking
      ❖ Extending schema.org
        ✧ Submitted the missing DATS core elements to their tracker
        ✧ Coordinating the extension of schema.org for life science via the bioschemas.org initiative (which ELIXIR is also part of)
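
The two serializations on the last slide can be illustrated with a minimal sketch. The record below, its field names (`title`, `creators`, `distributions`, `accessURL`) and the single mandatory property are assumptions made for illustration; they are not copied from the DATS JSON schema. Only the JSON-LD output uses real schema.org terms (`Dataset`, `name`, `identifier`, `creator`, `distribution`, `DataDownload`, `contentUrl`).

```python
import json

# Hypothetical, simplified dataset record. The field names are illustrative
# and are NOT taken from the DATS specification.
record = {
    "title": "Example expression dataset",
    "identifier": "EX-0001",
    "creators": ["Example Lab"],
    "distributions": [{"accessURL": "https://example.org/data/EX-0001"}],
}

# Assumption for this sketch: only a few properties are mandatory.
REQUIRED = ("title",)


def missing_required(rec):
    """Return the mandatory properties that are absent or empty in rec."""
    return [key for key in REQUIRED if not rec.get(key)]


def to_jsonld(rec):
    """Map the record onto schema.org terms and return a JSON-LD dict."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": rec["title"],
    }
    if "identifier" in rec:
        doc["identifier"] = rec["identifier"]
    if rec.get("creators"):
        doc["creator"] = [
            {"@type": "Organization", "name": name} for name in rec["creators"]
        ]
    if rec.get("distributions"):
        doc["distribution"] = [
            {"@type": "DataDownload", "contentUrl": dist["accessURL"]}
            for dist in rec["distributions"]
        ]
    return doc


if __name__ == "__main__":
    # Plain JSON serialization of the record as-is.
    print(json.dumps(record, indent=2))
    # JSON-LD serialization with vocabulary from schema.org.
    print(json.dumps(to_jsonld(record), indent=2))
```

Optional properties are simply omitted when absent, which mirrors the point made earlier in the deck that entities are used only when applicable and that only a few properties are mandatory.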