From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences
Alejandra González-Beltrán
Oxford e-Research Centre, University of Oxford
-ontology.org
Bioinformatics Open Source Conference (BOSC), Dublin, Ireland
July 10-11 2015
"AGBell Notebook" by Alexander Graham Bell. (d. 1922) -
page 40-41 of Alexander Graham Bell Family Papers in the Library of Congress' Manuscript Division.
Licensed under Public Domain via Wikimedia Commons
- http://commons.wikimedia.org/wiki/File:AGBell_Notebook.jpg#/media/File:AGBell_Notebook.jpg
http://petcaretips.net/bonding-rabbit-to-pets.html
Much has been said about the challenges of scientific reproducibility and how it can go wrong…
Difficulties arise when the description of the experimental steps is only available in lab notebooks and scientific articles, and when the data or the software tools required for analysis are lacking
Can data models and computational workflows help in capturing the experimental processes and reproducing findings?
How?
• experimental description (design & steps)
• conclusions
• computational workflows
• aggregation & workflow preservation
• open peer-review
• availability of
  • data
  • analysis scripts
  • documentation
Evaluation of the SOAPdenovo2 tool for the de novo assembly of genomes from the short DNA reads produced by next-generation sequencing; it implements improvements over the SOAPdenovo1 assembler.
pre-publication history
https://github.com/aquaskyline/SOAPdenovo2
http://sourceforge.net/projects/soapdenovo2/
Experimental Description
EXCELERATE interoperability component
http://www.ncbi.nlm.nih.gov/books/NBK279831/
http://elixir-uk.org/interoperability-infrastructure
The experimental plan - computational case
Predictor Variables (Factor Name, Factor Type)
• genome assembly algorithm: SOAPdenovo2, SOAPdenovo1, ALL-PATHS-LG
• genome size: bacterial genome, insect genome, human genome
3x3 factorial design: 9 study groups
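The 3x3 factorial design can be sketched in a few lines of Python: crossing every level of one factor with every level of the other yields the 9 study groups. The factor names and levels are taken from the slide; the `study_groups` helper and the dictionary layout are illustrative assumptions, not part of the ISA tooling.

```python
from itertools import product

# Factor names and levels as listed on the slide.
factors = {
    "genome assembly algorithm": ["SOAPdenovo2", "SOAPdenovo1", "ALL-PATHS-LG"],
    "genome size": ["bacterial genome", "insect genome", "human genome"],
}


def study_groups(factors):
    """Return one dict per combination of factor levels (full factorial design)."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


groups = study_groups(factors)
for number, group in enumerate(groups, start=1):
    print(number, group)
```

Adding a third factor to the dictionary would extend the design without changing the helper, which is the main reason for enumerating the groups from the factor table rather than listing them by hand.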
S. aureus
R. sphaeroides
B. impatiens
Chinese Han genome (or YH genome)
Response Variables
(with units)
genome coverage (%)
computation run time (h)
peak memory consumption (GB)
contig N50 (kb or bp)
scaffold N50 (kb or bp)
number of errors
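Of the response variables above, contig N50 and scaffold N50 are the least self-explanatory. As a minimal sketch, assuming the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length), N50 can be computed as follows. The function name and the example lengths are made up for illustration and are not results from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length


# Made-up lengths for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

The same function applies to contigs and to scaffolds; only the input lengths differ.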
The experimental steps
Unambiguous identification of resources (e.g. records from public repositories), with persistent identifiers if available (ORCIDs, DOIs); we suggest a dedicated article section
Experimental workflows - identification of processes, their inputs and outputs
Experimental design: identify experimental goal, independent and response variables
Reproducing SOAPdenovo2 results
with Galaxy workflows
S. aureus pipeline
Publishing findings as nanopublications
nanopublication: assertion, provenance, publication info
A nanopublication (NP) represents structured data along with its provenance in a single publishable and citable entity
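The three-part structure of a nanopublication can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: real nanopublications are expressed as RDF named graphs, and the `Nanopublication` class, the `ex:` identifiers and the predicate names below are hypothetical placeholders, not the vocabulary used in the project.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A statement as a (subject, predicate, object) triple of identifiers.
Triple = Tuple[str, str, str]


@dataclass
class Nanopublication:
    """Toy model: one list of triples per nanopublication part."""
    assertion: List[Triple]
    provenance: List[Triple]
    publication_info: List[Triple]

    def is_complete(self) -> bool:
        """A nanopublication needs all three parts to be non-empty."""
        return all([self.assertion, self.provenance, self.publication_info])


# All identifiers below are hypothetical placeholders.
np_example = Nanopublication(
    assertion=[
        ("ex:SOAPdenovo2", "ex:increasesGenomeCoverageComparedTo", "ex:SOAPdenovo1"),
    ],
    provenance=[
        ("ex:assertion", "ex:derivedFrom", "ex:workflow-run"),
    ],
    publication_info=[
        ("ex:nanopublication", "ex:createdBy", "ex:placeholder-author"),
    ],
)
print(np_example.is_complete())
```

Keeping the assertion separate from its provenance and publication info is what lets the finding be cited and checked on its own.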
Abstract & Conclusions
assertion provenance
Generation of nanopublications for all the results of the response variables
NanoMaton
templates for nanopublications
Prevent priming; report all findings corresponding to the identified response variables
Remain neutral and report all findings of similar importance with the same weight
“genome coverage increased over the human data when comparing SOAPdenovo2 against SOAPdenovo1”
Link conclusions to experimental description
Aggregation and workflow preservation as Research Objects
http://www.researchobject.org/
Research Object: enables the aggregation of the digital resources contributing to findings of computational research, including results, data and software, as citable compound digital objects
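The idea of aggregation can be sketched as a manifest that lists every contributing resource under one identifier. The following Python sketch is an assumption-laden illustration: the `aggregate` function, the manifest keys, the file names and the `ex:` identifier are all hypothetical and do not reproduce the Research Object model or its tooling.

```python
import hashlib
import json


def aggregate(identifier, resources):
    """Bundle resources into one manifest.

    `resources` maps a resource name to a (kind, content-bytes) pair.
    """
    entries = []
    for name, (kind, content) in sorted(resources.items()):
        entries.append({
            "name": name,
            "kind": kind,
            # A checksum lets a reader verify the aggregated resource later.
            "sha256": hashlib.sha256(content).hexdigest(),
        })
    return {"id": identifier, "aggregates": entries}


# Placeholder names and contents, for illustration only.
manifest = aggregate(
    "ex:soapdenovo2-research-object",
    {
        "workflow.ga": ("software", b"placeholder workflow definition"),
        "reads.fastq": ("data", b"placeholder input data"),
        "assembly-stats.tsv": ("results", b"placeholder results"),
    },
)
print(json.dumps(manifest, indent=2))
```

A single identifier for the whole manifest is what makes the bundle citable as one compound object, while the per-resource entries keep results, data and software individually addressable.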
http://isa-tools.github.io/soapdenovo2
From narrative to self-described structured data
Model & workflow assisted experimental description and review process
Depth and breadth of semantic resources, clear meaning of experimental elements
Team
Ruibang Luo, University of Hong Kong
Tin-Lap Lee, Chinese University of Hong Kong
Tak-wah Lam, University of Hong Kong
SOAPdenovo2
Scott Edmunds, GigaScience
Peter Li, GigaScience
Marco Roos, Leiden University
Mark Thompson, Leiden University
Rajaram Kaliyaperumal, Leiden University
Eelke van der Horst, Leiden University
Jun Zhao, Lancaster University
María Susana Avila García, Oxford University
Philippe Rocca-Serra, Oxford University
Susanna-Assunta Sansone, Oxford University
Alejandra Gonzalez-Beltran, Oxford University
Questions?
You can email us...
isatools@googlegroups.com
View our blog
http://isatools.wordpress.com
Follow us on Twitter
@isatools
View our websites
View our Git repo & contribute
http://github.com/ISA-tools
Thanks for your attention!

 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
 

From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences

  • 1. From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences. Alejandra González-Beltrán, Oxford e-Research Centre, University of Oxford. -ontology.org. Bioinformatics Open Source Conference (BOSC), Dublin, Ireland, July 10-11 2015.
  • 3. Many things have been said about the challenges of science reproducibility and how it can go wrong… Difficulties arise when the description of the experimental steps is only available in lab notebooks and scientific articles, and when the data or the software tools required for analysis are lacking. Image credits: "AGBell Notebook" by Alexander Graham Bell (d. 1922), pages 40-41 of the Alexander Graham Bell Family Papers in the Library of Congress' Manuscript Division, licensed under Public Domain via Wikimedia Commons, http://commons.wikimedia.org/wiki/File:AGBell_Notebook.jpg#/media/File:AGBell_Notebook.jpg; http://petcaretips.net/bonding-rabbit-to-pets.html
  • 4-7. Can data models and computational workflows help capture the experimental processes and reproduce findings? How? Experimental description (design & steps); conclusions; computational workflows; aggregation & workflow preservation.
  • 8. Evaluation of the SOAPdenovo2 tool for the de novo assembly of genomes from small DNA segments (reads) produced by next-generation sequencing, implementing improvements over the SOAPdenovo1 assembler. Open peer review; pre-publication history; availability of data, analysis scripts and documentation. https://github.com/aquaskyline/SOAPdenovo2 http://sourceforge.net/projects/soapdenovo2/
  • 10. Experimental description. EXCELERATE interoperability component: http://www.ncbi.nlm.nih.gov/books/NBK279831/ http://elixir-uk.org/interoperability-infrastructure
  • 11-15. The experimental plan - computational case. Predictor variables (factor name, factor type): genome assembly algorithm (SOAPdenovo2, SOAPdenovo1, ALL-PATHS-LG) and genome size (bacterial genome, insect genome, human genome), giving a 3x3 factorial design with 9 study groups. Genomes: S. aureus, R. sphaeroides, B. impatiens, Chinese Han genome (or YH genome). Response variables (with units): genome coverage (%), computation run time (h), peak memory consumption (Gb), contig N50 (kb or bp), scaffold N50 (kb or bp), number of errors.
  • 16-17. The experimental steps. Experimental design: identify the experimental goal, the independent variables and the response variables. Experimental workflows: identify the processes, their inputs and their outputs. Unambiguous identification of resources (e.g. records from public repositories), with persistent identifiers if available (ORCIDs, DOIs); we suggest a dedicated article section.
  • 18-19. Reproducing SOAPdenovo2 results with Galaxy workflows: S. aureus pipeline
  • 20. Reproducing SOAPdenovo2 results with Galaxy workflows (figure; unlabeled values from the slide: 2241, 400, 30, 119.0, 11, 106, 24, 68, 0)
  • 21-22. Publishing findings as nanopublications. A nanopublication (NP) consists of an assertion, its provenance and publication info; it represents structured data along with its provenance in a single publishable and citable entity. The assertion and provenance are drawn from the abstract and conclusions. Nanopublications are generated for all the results of the response variables, using NanoMaton templates for nanopublications. Prevent priming: report all findings corresponding to the identified response variables. Remain neutral: report all findings of similar importance with the same weight.
  • 23. Link conclusions to experimental description, e.g. “genome coverage increased over the human data when comparing SOAPdenovo2 against SOAPdenovo1”.
  • 24. Aggregation and workflow preservation as a Research Object (http://www.researchobject.org/): a Research Object enables the aggregation of the digital resources contributing to the findings of computational research, including results, data and software, as citable compound digital objects.
  • 25. Aggregation and workflow preservation as a Research Object (http://www.researchobject.org/): http://isa-tools.github.io/soapdenovo2
  • 26. From narrative to self-described structured data. Model- and workflow-assisted experimental description and review process. Depth and breadth of semantic resources; clear meaning of experimental elements.
  • 27. Team. SOAPdenovo2: Ruibang Luo, University of Hong Kong; Tin-Lap Lee, Chinese University of Hong Kong; Tak-wah Lam, University of Hong Kong. Scott Edmunds, GigaScience; Peter Li, GigaScience; Marco Roos, Leiden University; Mark Thompson, Leiden University; Rajaram Kaliyaperumal, Leiden University; Eelke van der Horst, Leiden University; Jun Zhao, Lancaster University; María Susana Avila García, Oxford University; Philippe Rocca-Serra, Oxford University; Susanna-Assunta Sansone, Oxford University; Alejandra González-Beltrán, Oxford University.
  • 28. Questions? Email us: isatools@googlegroups.com. View our blog: http://isatools.wordpress.com. Follow us on Twitter: @isatools. View our websites. View our Git repo & contribute: http://github.com/ISA-tools. Thanks for your attention!
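
The experimental plan in the slides (two factors with three levels each, 9 study groups, and N50 among the response variables) can be illustrated with a short sketch. This is not the authors' ISA-based description or their Galaxy workflow; the function names `study_groups` and `n50` and the toy contig lengths are hypothetical, chosen only for illustration. The N50 definition used here is the common one: the length of the contig at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total assembly length.

```python
from itertools import product

# Factor levels as named on the slides.
ASSEMBLERS = ["SOAPdenovo2", "SOAPdenovo1", "ALL-PATHS-LG"]
GENOME_SIZES = ["bacterial genome", "insect genome", "human genome"]


def study_groups(assemblers, genome_sizes):
    """Return every (assembler, genome size) pair of a full factorial design."""
    return list(product(assemblers, genome_sizes))


def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Hypothetical helper: walks the lengths from longest to shortest and
    returns the first length at which the running sum covers at least
    half of the total.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 requires at least one contig length")


groups = study_groups(ASSEMBLERS, GENOME_SIZES)
print(len(groups))  # 3 levels x 3 levels = 9 study groups

# Toy contig lengths, invented for illustration (not results from the talk).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

Each study group would then be paired with one measurement per response variable, which is what makes it possible to report all findings with the same weight rather than only the favourable ones.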