Daniel Jacob – INRA - 2018
How to ensure that open data
works for research
Make your data great again
Daniel Jacob
INRA UMR 1332 BFP – Metabolism Group
Bordeaux Metabolomics Facility
Oct 2018
Follow-up to https://fr.slideshare.net/danieljacob771282/make-your-data-great-now
Give open access to your data
and make it ready to be mined
Open Data for Access and Mining
ODAM Framework
Develop lightweight tools if needed
- R scripts, lightweight GUI (R Shiny)
Minimal effort, maximal efficiency
…
Use existing tools
- Spreadsheets, RStudio, BioStatFlow, Galaxy, Cytoscape, …
Data
Format
TSV
EDTMS
ODAM
FAIR (Findable, Accessible, INTEROPERABLE, Reusable)
Experiment
Data Tables
2 metadata files
+
Research question → Project → Experiment → Experimental set-up
Data emancipation regarding tools
Data API → Tools
Data → Tools
Multi-species Data Integration
Data integration
Towards Linked Data
Phenotype Information System
« Plant Physiology and Metabolism»
https://www.quora.com/What-is-plant-physiology-and-metabolism
« Plant Growth»
http://cgi.di.uoa.gr/~pms509/past_lectures/introduction-to-rdf.pdf
Resource Description Framework (RDF)
Creation of the metadata files
s_subsets.tsv: this metadata file associates a key concept with each data subset file.
- Key concepts: Plants, Harvests, Samples, Compounds, …
- Optional: an ontology-based annotation (CV term)
a_attributes.tsv: this metadata file annotates each attribute (variable) with minimal but relevant metadata.
- Optional: an ontology-based annotation (CV term)
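The role of the two metadata files can be illustrated with a minimal sketch. The column names and rows below are assumptions made for illustration only (they are not taken from the slides and may differ from the actual ODAM layout); the sketch simply shows how each attribute can be tied back to the data subset it describes.

```python
import csv
import io

# Hypothetical file contents. Column names are assumptions for illustration,
# not the documented ODAM layout. The cv_term column is left empty because
# the ontology annotation is optional.
S_SUBSETS = (
    "subset\tidentifier\tfile\tdescription\tcv_term\n"
    "plants\tPlanteID\tplants.tsv\tPlant features\t\n"
    "samples\tSampleID\tsamples.tsv\tSample features\t\n"
)
A_ATTRIBUTES = (
    "subset\tattribute\tcategory\tdescription\tcv_term\n"
    "plants\tPlanteID\tidentifier\tPlant identifier\t\n"
    "plants\tTreatment\tfactor\tWatering regime\t\n"
    "samples\tSampleID\tidentifier\tSample identifier\t\n"
    "samples\tFruitWeight\tquantitative\tFruit fresh weight\t\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attributes_by_subset(subset_rows, attribute_rows):
    """Group (attribute, category) pairs under the subset they belong to."""
    grouped = {row["subset"]: [] for row in subset_rows}
    for row in attribute_rows:
        if row["subset"] not in grouped:
            raise ValueError(f"unknown subset: {row['subset']}")
        grouped[row["subset"]].append((row["attribute"], row["category"]))
    return grouped


subsets = read_tsv(S_SUBSETS)
attributes = read_tsv(A_ATTRIBUTES)
grouped = attributes_by_subset(subsets, attributes)
```

Keeping the annotation at the level of each variable, rather than only at the level of the dataset, is what lets generic tools discover factors, identifiers and quantitative variables without knowing the experiment in advance.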
Resource Description Framework (RDF)
Data / Metadata
Entities
Attributes
categories
subsets CV Term
s_subsets.tsv
a_attributes.tsv
CV Term ?
attributes CV Term
Resource Description Framework (RDF)
Data / Metadata
Entities
Attributes
attributes CV Term
subsets CV Term
s_subsets.tsv
a_attributes.tsv
CV Term
Entity + Attribute = Trait
Trait (characteristic / feature)
categories
Resource Description Framework (RDF)
Reference ontologies:
- TO: Plant Trait Ontology
- EO: Plant Environment Ontology
- PO: Plant Structure & Development Stage Ontology
- CHEBI Ontology
- GO Ontology
- …
Entity + Attribute = Trait
Trait (characteristic / feature)
Plant Trait Ontology as the core / kernel of all ontologies
http://agroportal.lirmm.fr/ontologies
Resource Description Framework (RDF)
« Plant Physiology and Metabolism»
« Plant Growth»
Attribute categories: factor, quantitative, qualitative, identifier
Entities and their data subset files (identifier):
- Plants: plants.tsv (PlanteID)
- Harvests: harvests.tsv (Lot)
- Samples: samples.tsv (SampleID)
- Compounds: compounds.tsv (SampleID)
- Enzymes: enzymes.tsv (SampleID)
Entities and attributes are annotated with CV terms from reference ontologies:
- TO: Plant Trait Ontology
- EO: Plant Environment Ontology
- PO: Plant Structure & Development Stage Ontology
- GO Ontology
- CHEBI Ontology
- …
http://agroportal.lirmm.fr/ontologies
TBox: reference ontologies
A TBox is a "terminological component": a conceptualization associated with a set of facts.
Resource Description Framework (RDF)
Data / Metadata
Category CV Term
Entities
Attributes
Typical queries:
Search for a particular Trait
Entity + Attribute = Trait
CV Term
Attribute Subset
CV Term
Category Species
ABox: application ontologies
An ABox is an "assertion component": a fact associated with a conceptual model or ontologies within a knowledge base.
Resource Description Framework (RDF)
[Diagram: RDF schema generated for each dataset]
- Nodes: attribute node, subset node (rdf:type), rdf:Bag
- Literals (xsd:string): rdfs:label, #description
- Properties: #hasEntity, #hasAttribute, #hasCategory, #hasSpecies
- Attribute categories (rdfs:range): factor, quantitative, qualitative, identifier
- CV terms (rdf:resource): TO, EO, PO, CHEBI, GO, …, Taxon
- ABox: application ontologies
- TBox: reference ontologies
https://schema.org/Dataset: measurementTechnique, variableMeasured
Resource Description Framework (RDF)
Category CV Term
Entities
Attributes
Data / Metadata
Traits
Values
Phenotype (observed)
=
Traits + Values
Towards a Phenotype Information System
Automatic population of the knowledge base
from the metadata files
defined within ODAM data subsets
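A minimal sketch of how such automatic population might look, assuming each annotated attribute row becomes one node with a handful of triples. The namespace, the row fields and the choice to attach every property to the attribute node are assumptions for illustration; only the property names (#hasEntity, #hasAttribute, #hasCategory, #hasSpecies) come from the slides.

```python
# Hypothetical namespace: an assumption, not an ODAM URL.
BASE = "http://example.org/odam#"

# Hypothetical rows, as they might be read from a_attributes.tsv after
# annotation. Field names are assumptions for illustration.
ATTRIBUTE_ROWS = [
    {"subset": "samples", "attribute": "FruitWeight", "category": "quantitative",
     "entity": "Fruit", "species": "Tomato"},
    {"subset": "plants", "attribute": "Treatment", "category": "factor",
     "entity": "Plant", "species": "Tomato"},
]


def triples_for_attribute(row):
    """Turn one annotated attribute row into (subject, predicate, object) triples."""
    node = f"{BASE}{row['subset']}/{row['attribute']}"
    return [
        (node, "rdf:type", BASE + "Attribute"),
        (node, BASE + "hasEntity", row["entity"]),
        (node, BASE + "hasAttribute", row["attribute"]),
        (node, BASE + "hasCategory", row["category"]),
        (node, BASE + "hasSpecies", row["species"]),
    ]


def populate(rows):
    """Accumulate the triples of every row into a set acting as the knowledge base."""
    knowledge_base = set()
    for row in rows:
        knowledge_base.update(triples_for_attribute(row))
    return knowledge_base


kb = populate(ATTRIBUTE_ROWS)
```

Because the triples are derived only from the metadata files, adding a new dataset to the knowledge base would require no extra curation step beyond the annotation already done.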
[Diagram: attribute node and subset node (rdf:type), linked through #hasEntity, #hasAttribute, #hasCategory and #hasSpecies to Entity, Attribute, Category and Species]
Typical queries: search for a particular trait, with or without constraints
Trait: Fruit + weight = Fruit weight
and
Constraint: Species = Tomato (hasSynonym Tomato)
Towards a Phenotype Information System
Attributes
Entities
Phenotype (observed)
=
(Entity + Attribute) + Values
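The trait search with an optional constraint can be sketched over a small in-memory triple list. The node names and triples below are made up for illustration, and a real system would presumably run a SPARQL query against an RDF store instead; this is only a sketch of the matching logic.

```python
# Hypothetical triples: (node, property, value). All names are assumptions.
TRIPLES = [
    ("attr:fruit_weight", "hasEntity", "Fruit"),
    ("attr:fruit_weight", "hasAttribute", "weight"),
    ("attr:fruit_weight", "hasSpecies", "Tomato"),
    ("attr:leaf_area", "hasEntity", "Leaf"),
    ("attr:leaf_area", "hasAttribute", "area"),
    ("attr:leaf_area", "hasSpecies", "Tomato"),
    ("attr:berry_weight", "hasEntity", "Fruit"),
    ("attr:berry_weight", "hasAttribute", "weight"),
    ("attr:berry_weight", "hasSpecies", "Grapevine"),
]


def find_trait(triples, entity, attribute, **constraints):
    """Return nodes whose entity and attribute match, plus any extra constraints.

    A keyword such as species="Tomato" is mapped to the property "hasSpecies".
    """
    wanted = {"hasEntity": entity, "hasAttribute": attribute}
    for key, value in constraints.items():
        wanted["has" + key.capitalize()] = value

    # Index the (property, value) pairs carried by each node.
    facts = {}
    for node, prop, value in triples:
        facts.setdefault(node, set()).add((prop, value))

    return sorted(
        node for node, pairs in facts.items()
        if all((prop, value) in pairs for prop, value in wanted.items())
    )


without_constraint = find_trait(TRIPLES, "Fruit", "weight")
with_constraint = find_trait(TRIPLES, "Fruit", "weight", species="Tomato")
```

Without the constraint the search returns the "Fruit weight" trait for every species; adding Species = Tomato narrows it to the tomato datasets only.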
Towards a Phenotype Information System
Category CV Term
Entities
Attributes
Data mapping
Values
Data capture
EDTMS
Entity + Attribute = Trait
Trait (characteristic / feature)
Attributes Subsets
attribute
node
subset
node
rdf:type rdf:type
Attribute Entity
#hasEntity
#hasAttribute
Category Species
#hasCategory #hasSpecies
Data linking
Develop lightweight tools if needed
- R scripts (Galaxy), lightweight GUI (R Shiny)
Category CV Term
Entities
Attributes
Data mapping
Values
Data capture
EDTMS
Phenotype
(observed)
=
Traits + Values
Data Exploration
Entity + Attribute = Trait
Trait (characteristic / feature)
Towards a Phenotype
Information System
Attributes Subsets
attribute
node
subset
node
rdf:type rdf:type
Attribute Entity
#hasEntity
#hasAttribute
Category Species
#hasCategory #hasSpecies
Data linking
Data = Phenotypic data + Molecular data + Environment data
Phenotypic metadata = Descriptors of Traits (Entity-Attribute) + Environment Factors
Data accumulation → Knowledge Base
Bayes' theorem, the general formula:
$y$: data, $\theta$: parameters
$[y, \theta] = [y \mid \theta]\,[\theta] = [\theta \mid y]\,[y]$
where $[\cdot]$ denotes a density or a probability:
- $[\theta \mid y]$: posterior density, or simply the so-called "posterior"
- $[\theta]$: prior density of $\theta$, or simply the so-called "prior"
- $[y \mid \theta]$: likelihood (a function of $\theta$)
- $[y]$: marginal density (data, model)
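As a worked illustration of the formula, the posterior can be approximated on a grid of candidate parameter values: multiply likelihood by prior, then divide by their sum, which plays the role of the marginal density. The normal likelihood, the flat prior, the grid and the data values are all assumptions chosen for the sketch, not the model used by the author.

```python
import math


def grid_posterior(data, grid, prior, sigma=1.0):
    """Approximate the posterior [theta | y] on a grid of theta values.

    Assumes (for illustration) that each observation is normally distributed
    around theta with known standard deviation sigma. Constant factors of the
    normal density are omitted because they cancel in the normalization.
    """
    unnormalized = []
    for theta, prior_weight in zip(grid, prior):
        log_likelihood = sum(-0.5 * ((y - theta) / sigma) ** 2 for y in data)
        unnormalized.append(math.exp(log_likelihood) * prior_weight)
    marginal = sum(unnormalized)  # discrete stand-in for the marginal [y]
    return [value / marginal for value in unnormalized]


# Hypothetical inputs: candidate values 0.0 .. 10.0, flat prior, four observations.
grid = [i / 10 for i in range(101)]
prior = [1 / len(grid)] * len(grid)
data = [4.8, 5.1, 5.3, 4.9]

posterior = grid_posterior(data, grid, prior)
best_theta = grid[max(range(len(grid)), key=posterior.__getitem__)]
```

With a flat prior the posterior peaks at the grid point closest to the sample mean; a more informative prior, for example one derived from previously accumulated datasets, would shift that peak.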
Model-Based Bayesian Inference:
Data mining
Phenotype
Information
System
Example: a model for phenotypic variance and biomass prediction (Y) based on environmental parameters ($\theta$)
Machine
Learning
« Plant Growth»
Make your data great again
Making open data work for research
- Metadata: not just at the "top" level, linked to datasets, but linked more deeply to the variables.
- The data management system becomes completely independent of data usage: one dataset → several applications, and one application → several datasets.
- Keep data "alive" in the data processing loop, in a similar way to DNA/protein sequences, which can be integrated into annotation pipelines.
Data accumulation → Knowledge Base
Machine Learning
Model-Based Bayesian Inference
Make your data great again - Ver 2

  • 1. Daniel Jacob – INRA - 2018 How to ensure that open data works for research. Make your data great again. Daniel Jacob, INRA UMR 1332 BFP – Metabolism Group, Bordeaux Metabolomics Facility, Oct 2018. https://fr.slideshare.net/danieljacob771282/make-your-data-great-now following. Give open access to your data and make it ready to be mined. Open Data for Access and Mining: ODAM Framework
  • 2. Develop if needed, lightweight tools - R scripts, lightweight GUI (R Shiny) Minimal effort, Maximal efficiency … Use existing tools - Spreadsheets, RStudio, BioStatFlow, Galaxy, Cytoscape, … Data Format TSV EDTMS ODAM F A INTEROPERABLE R Experiment Data Tables + 2 metadata files Research question → Project → Experiment → Experimental set-up → Data emancipation regarding Tools Data → API → Tools Data ↔ Tools https://fr.slideshare.net/danieljacob771282/make-your-data-great-now following
  • 3. Develop if needed, lightweight tools - R scripts, lightweight GUI (R Shiny) … Use existing tools - Spreadsheets, RStudio, BioStatFlow, Galaxy, Cytoscape, … Data Format TSV Multi-species Data Integration Data integration Towards Linked Data Phenotype Information System EDTMS ODAM F A INTEROPERABLE R « Plant Physiology and Metabolism » https://www.quora.com/What-is-plant-physiology-and-metabolism « Plant Growth »
  • 4. Resource Description Framework (RDF) http://cgi.di.uoa.gr/~pms509/past_lectures/introduction-to-rdf.pdf EDTMS ODAM
  • 5. Creation of the metadata files - Subsets. s_subsets.tsv: this metadata file associates a key concept with each data subset file (Plants, Harvests, Samples, Compounds, …). Optional: an annotation based on ontology (CV Term). a_attributes.tsv: this metadata file allows each attribute (variable) to be annotated with some minimal but relevant metadata. Optional: an annotation based on ontology (CV Term). EDTMS ODAM Resource Description Framework (RDF)
  • 6. Data / Metadata Entities Attributes categories subsets CV Term s_subsets.tsv a_attributes.tsv CV Term ? attributes CV Term EDTMS ODAM Resource Description Framework (RDF)
  • 7. Data / Metadata Entities Attributes attributes CV Term subsets CV Term s_subsets.tsv a_attributes.tsv CV Term Entity + Attribute = Trait (characteristic / feature) categories EDTMS ODAM Resource Description Framework (RDF)
  • 8. TO: Plant Trait Ontology; EO: Plant Env. Ontology; PO: Plant Structure & Dev. Stage Ontology; CHEBI Ontology; GO Ontology; … Entity + Attribute = Trait (characteristic / feature). Plant Trait Ontology as the core / kernel of all ontologies. http://agroportal.lirmm.fr/ontologies EDTMS ODAM Resource Description Framework (RDF) « Plant Physiology and Metabolism » « Plant Growth »
  • 9. Categories: factor, quantitative, qualitative, identifier. Entities: Plants (plants.tsv, PlanteID), Harvests (harvests.tsv, Lot), Samples (samples.tsv, SampleID), Compounds (compounds.tsv, SampleID), Enzymes (enzymes.tsv, SampleID). Attributes: CV Term. TO: Plant Trait Ontology; EO: Plant Env. Ontology; PO: Plant Structure & Dev. Stage Ontology; GO Ontology; CHEBI Ontology; … http://agroportal.lirmm.fr/ontologies EDTMS ODAM A TBox is a "terminological component": a conceptualization associated with a set of facts. TBox: Reference ontologies. Resource Description Framework (RDF)
  • 10. Data / Metadata Category CV Term Entities Attributes Typical queries: Search for a particular Trait. Entity + Attribute = Trait. CV Term Attribute Subset CV Term Category Species EDTMS ODAM An ABox is an "assertion component": a fact associated with a conceptual model or ontologies within a knowledge base. ABox: Application ontologies. Resource Description Framework (RDF)
  • 11. factor quantitative qualitative identifier rdfs:range categories For each Dataset RDF Schema rdfs:label <description> rdfs:label <description> #description Attributes Subsets attribute node subset node rdf:type rdf:type rdf:Bag xsd:string xsd:string Attribute Entity #hasEntity #hasAttribute Category Species #hasCategory #hasSpecies #description #hasCategory xsd:string TO EO PO CHEBI GO … Taxon rdf:resource rdf:resource … xsd:string rdf:resource CV Term ABox - Application ontologies TBox - Reference ontologies EDTMS ODAM https://schema.org/Dataset measurementTechnique variableMeasured Resource Description Framework (RDF)
  • 12. Towards a Phenotype Information System. Category CV Term Entities Attributes Data / Metadata Traits Values. Phenotype (observed) = Traits + Values. Automatic population of the knowledge base from the metadata files defined within ODAM data subsets. Attributes Subsets attribute node subset node rdf:type rdf:type Attribute Entity #hasEntity #hasAttribute Category Species #hasCategory #hasSpecies EDTMS ODAM
  • 13. Towards a Phenotype Information System. Typical queries: Search for a particular Trait with or without Constraints. Fruit + weight = Fruit weight (Trait) and Species = Tomato (Constraint). hasSynonym Tomato. Attributes Entities EDTMS ODAM
  • 14. Towards a Phenotype Information System. Typical queries: Search for a particular Trait with or without Constraints. Fruit + weight = Fruit weight (Trait) and Species = Tomato (Constraint). Phenotype (observed) = (Entity + Attribute) + Values. EDTMS ODAM
  • 15. Category CV Term Entities Attributes Data mapping Values Data capture EDTMS Entity + Attribute = Trait (characteristic / feature) Attributes Subsets attribute node subset node rdf:type rdf:type Attribute Entity #hasEntity #hasAttribute Category Species #hasCategory #hasSpecies Data linking Develop if needed, lightweight tools - R scripts (Galaxy), lightweight GUI (R Shiny) EDTMS ODAM
  • 16. Towards a Phenotype Information System. Category CV Term Entities Attributes Data mapping Values Data capture EDTMS Phenotype (observed) = Traits + Values. Data Exploration. Entity + Attribute = Trait (characteristic / feature). Attributes Subsets attribute node subset node rdf:type rdf:type Attribute Entity #hasEntity #hasAttribute Category Species #hasCategory #hasSpecies Data linking. Data = Phenotypic data + Molecular data + Environment data. Phenotypic metadata = Descriptors of Traits (Entity-Attribute) + Environment Factors. Data accumulation → Knowledge Base. EDTMS ODAM
  • 17. Model-Based Bayesian Inference. Bayes' theorem, the general formula: $y$: data; $\theta$: parameters; $[y, \theta] = [y \mid \theta]\,[\theta] = [\theta \mid y]\,[y]$, where $[\cdot]$ means a density or a probability. $[\theta \mid y]$: posterior density, or simply the so-called "posterior"; $[\theta]$: prior density of $\theta$, or simply the so-called "prior"; $[y \mid \theta]$: likelihood (function of $\theta$); $[y]$: marginal density (data, model). Data mining, Phenotype Information System. Ex: model for phenotypic variance and biomass prediction (Y) based on environmental parameters ($\theta$). Machine Learning « Plant Growth »
  • 18. Make your data great again. Making open data work for research: Metadata are not just on the "top", linked to datasets, but more deeply linked to the variables. The data management system becomes completely independent of data usage: one dataset → several applications, and one application → several datasets. Data accumulation → Knowledge Base. Keep data "alive" in the data processing loop, in a similar way as for DNA/protein sequences, where sequences can be integrated into annotation pipelines. Machine Learning; Model-Based Bayesian Inference.
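
The Bayes' theorem slide (17) can be illustrated with a small grid approximation. This is only a sketch under stated assumptions, not the model used in the talk: the fruit weights, the Normal likelihood with known sigma, the flat prior and the function name grid_posterior are all hypothetical and chosen for illustration.

```python
import math

def grid_posterior(y, thetas, prior, sigma=1.0):
    """Return the posterior [theta | y] on a discrete grid of theta values.

    Assumes (for illustration only) that each observation follows
    Normal(theta, sigma) with a known sigma.
    """
    def likelihood(theta):
        # [y | theta]: product of Normal densities over all observations.
        return math.prod(
            math.exp(-0.5 * ((obs - theta) / sigma) ** 2)
            / (sigma * math.sqrt(2 * math.pi))
            for obs in y
        )

    # Joint density [y, theta] = [y | theta] . [theta]
    joint = [likelihood(t) * p for t, p in zip(thetas, prior)]
    # Marginal density [y], obtained by summing the joint over the grid.
    marginal = sum(joint)
    # Posterior [theta | y] = [y, theta] / [y]
    return [j / marginal for j in joint]

# Hypothetical fruit weights (g) and a flat prior over candidate mean weights.
y = [98.0, 102.0, 101.0, 99.0]
thetas = [90.0 + 0.5 * i for i in range(41)]  # grid from 90.0 to 110.0
prior = [1.0 / len(thetas)] * len(thetas)

posterior = grid_posterior(y, thetas, prior, sigma=2.0)
best = thetas[posterior.index(max(posterior))]
print("most probable theta on the grid:", best)
```

With a flat prior the posterior is proportional to the likelihood, so the most probable grid value coincides with the sample mean; an informative prior would pull it away from the mean.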

Editor's notes

  1. Trait vs Phenotype: Entity + Attribute = Trait (observable); Entity + (Attribute + Value) = Phenotype (observed)
  2. An ABox is an "assertion component": a fact associated with a terminological vocabulary within a knowledge base
  3. TBox statements describe a system in terms of controlled vocabularies, for example, a set of classes and properties. ABox statements are TBox-compliant statements about that vocabulary.
  4. Question types: What is the set of "Traits" (quantitative/qualitative) for a given sample (identifier)? What is the set of "Traits" (quantitative/qualitative) for one or more given CV terms {subset type, e.g. CV subset in (metabolite, enzyme) (CHEBI); attribute type, e.g. CV attribute == tissue == "fruit pericarp" (PO)}, with or without an additional constraint, e.g. factor type?
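
The notes above (trait vs phenotype, and the question types) can be sketched as a small in-memory query. This is a hedged illustration, not the ODAM API: the AttributeRow fields, the sample rows and the helper names traits and phenotype are hypothetical stand-ins for what the attribute metadata file might carry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttributeRow:
    """One annotated variable; the field names are hypothetical."""
    subset: str     # data subset the variable belongs to
    attribute: str  # variable name
    category: str   # identifier, factor, quantitative or qualitative
    entity: str     # entity the variable describes
    species: str

# Hypothetical metadata rows, for illustration only.
ROWS = [
    AttributeRow("samples", "SampleID", "identifier", "sample", "tomato"),
    AttributeRow("plants", "FruitWeight", "quantitative", "fruit", "tomato"),
    AttributeRow("plants", "Treatment", "factor", "plant", "tomato"),
    AttributeRow("compounds", "Glucose", "quantitative", "fruit pericarp", "tomato"),
]

def traits(rows, categories=("quantitative", "qualitative"), **constraints):
    """List traits as (entity, attribute) pairs, with optional constraints.

    Entity + Attribute = Trait; constraints are exact matches on row fields.
    """
    return [
        (r.entity, r.attribute)
        for r in rows
        if r.category in categories
        and all(getattr(r, key) == value for key, value in constraints.items())
    ]

def phenotype(trait, value):
    """Phenotype (observed) = Trait + Value."""
    entity, attribute = trait
    return {"entity": entity, "attribute": attribute, "value": value}

# Traits restricted to one subset type, then to one species.
print(traits(ROWS, subset="compounds"))
print(traits(ROWS, species="tomato"))
print(phenotype(("fruit", "FruitWeight"), 101.5))
```

Keeping the constraints as keyword arguments means the same function answers the query with or without constraints; a real knowledge base would express this as a query over the RDF graph instead.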