SlideShare une entreprise Scribd logo
1  sur  14
FAIR
Computational
Workflows
Professor Carole Goble
The University of Manchester UK
EU Research Infrastructures ELIXIR, IBISBA, EOSC-Life
BioExcel Centre of Excellence
Software Sustainability Institute UK
FAIRDOM Consortium
carole.goble@manchester.ac.uk
FAIReScience, IEEE eScience, 20th September 2021
Computational Workflows for Data intensive Bioscience
prepare, analyze, and share increasing volumes of complex data
CryoEM Image Analysis
Metagenomic Pipelines
Protein Ligand
Simulation
[Adam Hospital]
[Rob Finn]
[Carlos Oscar Sorzano Sanchez]
Nature 573, 149-150 (2019)
https://doi.org/10.1038/d41586-019-02619-z
Multi-step processes to
coordinate and execute multiple
codes and handle data and
processing dependencies
Typically Data flows
Benefit from FAIR data with
machine processable metadata
A precise description
A special kind of software
Workflow Management Systems FAIR bits
Abstraction: Separation of the workflow specification from its execution & tools
FAIR stratification, FAIR all the way down
FAIR Software
FAIR Data
FAIR Data FAIR Services
Image credit: BioExcel Centre of Excellence
Composition & Portability
different
components,
codes,
languages,
third parties
Workflow Management Systems FAIR bits
Composition: modularisation, FAIR parts & dependencies, propagation of FAIR properties
FAIR all the way down, versions, parts recycled, repurposed, remixed, citable credit
Workflow System Landscape
Inter-twingled, mix and matching
Scripting
environments
Interactive Electronic
Research Notebooks
Repositories Registries
Workflow
Management
Systems & execution
platforms
https://s.apache.org/existing-workflow-systems
298 Systems
General and Specialised
General Repositories
Identifiable
Community
FAIR Principles for Workflows
Hybrid Processual Digital Objects
Method “Data” Objects
Workflows as
FAIR Software
FAIR+R and FAIR++
Quality, maturity, maintainability
The principles revised
Workflows as
FAIR Digital Objects
Data-like method objects
Associated objects
The principles adapted
Workflows as
FAIR Data Instruments
FAIRification of the dataflow
The data principles supported
C. Goble, S. Cohen-Boulakia, S.
Soiland-Reyes, D. Garijo, Y. Gil, M.R.
Crusoe, K. Peters & D. Schober. FAIR
computational workflows. Data
Intelligence 2(2020), 108–121.
doi: 10.1162/dint_a_000
Workflow Objects
Software Objects
Data FAIRification
Efforts: Workflow Findability and Accessibility
Registries: lifecycle support for living workflows and associated objects
Identifiers: DOIs, ORCID, ROR etc
Licensing, credit, attribution
Support versions, reuse & remix
Workflow libraries
Access workflows at source, Github support
Auto / manual harvested metadata
Registry – execution integration
Execution monitoring services
Onboard WfMS platforms
Metadata standards framework
Metadata by stealth
https://workflowhub.eu
Publishing Services
Journals
scripts
Repos
Containers Deploys
Tools
https://dockstore.org/
Registries
Efforts: Workflow Metadata Frameworks
Metadata for machines & people, for WfMS, Registries & Services
Common metadata
about the workflow,
tools & parameters
Canonical workflow
description of the
steps of the workflow
Type the input and outputs
of the steps
Run Provenance /
Histories / Tests
RO-Crate format for
packaging a workflow, its
metadata and companion
objects (links to containers,
data etc) for exchange,
archiving, reporting, citing.
FAIR Digital Object
Open
Communities
Efforts: Workflow Interoperability
1. Workflow spec & WfMS
interoperability: describe workflows
independently of WfMS. Platform
independent pipeline exchange and
comparison.
2. Workflow Composability: Software interoperates
through APIs and metadata standards (FAIR4RS*).
Workflow-ready tools.
Recycle tested & validated canonical workflow blocks.
https://openwdl.org/
https://www.commonwl.org
Design for FAIR Data
& FAIR Workflow Reuse
Review
Curation
Certification
Governance
Licence combinations
Access permissions
Local -> Global identifiers
Best
Practice
* FAIR4RS First Draft of FAIR4RS principles
Efforts: Workflow Reusability and Usability
FAIR+R, FAIR++, FAIR4RS
Reusable – “can be understood, modified, built upon or incorporated into other software
workflows” Composability + Associated Objects + Metadata
Usable – “can be executed”
Containers & Packaging Testing & monitoring Execution standards APIs
Tool Registry Service API
checker workflows
test data
A2. metadata are accessible, even when the workflow is no longer available
Enough metadata that a workflow is read-reproducible as a method description if it no longer runs
Effort: Workflows as functions for FAIR Data
Data FAIRification of Workflows, assisted by WfMS & reporting
Challenge of diverse API & AAI landscape, formats and packaging
Review
Curation
Certification
Governance
Best Practice
Golden Examples
Canonical
workflows
Manage
AAI, format,
packaging
choices
Design for
FAIR Data
and Reuse
FAIR Computational Workflows
Hybrid Processual Digital Objects
Data + Software FAIR Principles
Data FAIRification methods
WfMS support
FAIR takes a village
Community of projects, WfMS, platforms &
environments, stakeholders.
Long tail pattern. Collective action by a few
WfMS and services nails 80:20.
FAIR by stealth.
Borgman, C. L., & Bourne, P. E. (2021). Why it takes a village to manage and share data. Harvard Data Science Review (under Review), arXiv:2109.01694v1.
EOSC-Life https://www.eosc-life.eu/
RO-Crate https://www.researchobject.org/ro-crate/
WorkflowHub https://workflowhub.eu/
Galaxy Europe https://galaxyproject.eu/
Bioschemas https://bioschemas.org/
Common Workflow Language https://www.commonwl.org/
Dockstore https://dockstore.org/
WorkflowsRI https://workflowsri.org/
Acknowledgements
Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo,
Yolanda Gil, Michael Crusoe, Kristian Peters, Daniel
Schober
FAIR Computational Workflows

Contenu connexe

Tendances

FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
Carole Goble
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)
Carole Goble
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
Carole Goble
 

Tendances (20)

Scientific Workflows: what do we have, what do we miss?
Scientific Workflows: what do we have, what do we miss?Scientific Workflows: what do we have, what do we miss?
Scientific Workflows: what do we have, what do we miss?
 
Reproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsReproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformatics
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
 
Data management, data sharing: the SysMO-SEEK Story
Data management, data sharing: the SysMO-SEEK StoryData management, data sharing: the SysMO-SEEK Story
Data management, data sharing: the SysMO-SEEK Story
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
Building collaborative workflows for scientific data
Building collaborative workflows for scientific dataBuilding collaborative workflows for scientific data
Building collaborative workflows for scientific data
 
Open Science: how to serve the needs of the researcher?
Open Science: how to serve the needs of the researcher? Open Science: how to serve the needs of the researcher?
Open Science: how to serve the needs of the researcher?
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data Management
 
Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction) Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction)
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the Future
 
Building Federated FAIR Data Spaces, Yann Le Franc, EOSC-Pillar
Building Federated FAIR Data Spaces, Yann Le Franc, EOSC-PillarBuilding Federated FAIR Data Spaces, Yann Le Franc, EOSC-Pillar
Building Federated FAIR Data Spaces, Yann Le Franc, EOSC-Pillar
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
 
Building the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of ScientistsBuilding the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of Scientists
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
 
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
 

Similaire à FAIR Computational Workflows

Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
Sanjay Padhi, Ph.D
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
Carole Goble
 
Pricing and business model Fusepool
Pricing and business model FusepoolPricing and business model Fusepool
Pricing and business model Fusepool
Fusepool SME project
 

Similaire à FAIR Computational Workflows (20)

FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Archonnex at ICPSR
Archonnex at ICPSRArchonnex at ICPSR
Archonnex at ICPSR
 
070416 Egu Vienna Husar
070416 Egu Vienna Husar070416 Egu Vienna Husar
070416 Egu Vienna Husar
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015
 
OSFair2017 Workshop | EGI applications database
OSFair2017 Workshop | EGI applications databaseOSFair2017 Workshop | EGI applications database
OSFair2017 Workshop | EGI applications database
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
Bigdata
BigdataBigdata
Bigdata
 
061206 Ua Huntsville Seminar
061206 Ua Huntsville Seminar061206 Ua Huntsville Seminar
061206 Ua Huntsville Seminar
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 
Bay Area Azure Meetup - Ignite update session
Bay Area Azure Meetup - Ignite update sessionBay Area Azure Meetup - Ignite update session
Bay Area Azure Meetup - Ignite update session
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
 
Fedora
FedoraFedora
Fedora
 
Introduction to Persistent Identifiers| www.eudat.eu |
Introduction to Persistent Identifiers| www.eudat.eu | Introduction to Persistent Identifiers| www.eudat.eu |
Introduction to Persistent Identifiers| www.eudat.eu |
 
AIOps: Anomalous Span Detection in Distributed Traces Using Deep Learning
AIOps: Anomalous Span Detection in Distributed Traces Using Deep LearningAIOps: Anomalous Span Detection in Distributed Traces Using Deep Learning
AIOps: Anomalous Span Detection in Distributed Traces Using Deep Learning
 
BigData
BigDataBigData
BigData
 
Build and Modernize Intelligent Apps​
Build and Modernize Intelligent Apps​Build and Modernize Intelligent Apps​
Build and Modernize Intelligent Apps​
 
Building big data solutions on azure
Building big data solutions on azureBuilding big data solutions on azure
Building big data solutions on azure
 
OGCE SciDAC2010 Tutorial
OGCE SciDAC2010 TutorialOGCE SciDAC2010 Tutorial
OGCE SciDAC2010 Tutorial
 
Pricing and business model Fusepool
Pricing and business model FusepoolPricing and business model Fusepool
Pricing and business model Fusepool
 

Plus de Carole Goble

Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Carole Goble
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
Carole Goble
 

Plus de Carole Goble (13)

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a Village
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learning
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
What is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpWhat is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can help
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research Commons
 
Reproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpReproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects help
 
Reflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerReflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic career
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better Research
 
Reproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trends
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 

Dernier

Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
MohamedFarag457087
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Silpa
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
Silpa
 
CYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptxCYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptx
Silpa
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
seri bangash
 

Dernier (20)

Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsTransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
 
Genome sequencing,shotgun sequencing.pptx
Genome sequencing,shotgun sequencing.pptxGenome sequencing,shotgun sequencing.pptx
Genome sequencing,shotgun sequencing.pptx
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLGwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
CYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptxCYTOGENETIC MAP................ ppt.pptx
CYTOGENETIC MAP................ ppt.pptx
 
Role of AI in seed science Predictive modelling and Beyond.pptx
Role of AI in seed science  Predictive modelling and  Beyond.pptxRole of AI in seed science  Predictive modelling and  Beyond.pptx
Role of AI in seed science Predictive modelling and Beyond.pptx
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 

FAIR Computational Workflows

  • 1. FAIR Computational Workflows Professor Carole Goble The University of Manchester UK EU Research Infrastructures ELIXIR, IBISBA, EOSC-Life BioExcel Centre of Excellence Software Sustainability Institute UK FAIRDOM Consortium carole.goble@manchester.ac.uk FAIReScience, IEEE eScience, 20th September 2021
  • 2. Computational Workflows for Data intensive Bioscience prepare, analyze, and share increasing volumes of complex data CryoEM Image Analysis Metagenomic Pipelines Protein Ligand Simulation [Adam Hospital] [Rob Finn] [Carlos Oscar Sorzano Sanchez] Nature 573, 149-150 (2019) https://doi.org/10.1038/d41586-019-02619-z Multi-step processes to coordinate and execute multiple codes and handle data and processing dependencies Typically Data flows Benefit from FAIR data with machine processable metadata A precise description A special kind of software
  • 3. Workflow Management Systems FAIR bits Abstraction: Separation of the workflow specification from its execution & tools FAIR stratification, FAIR all the way down FAIR Software FAIR Data FAIR Data FAIR Services
  • 4. Image credit: BioExcel Centre of Excellence Composition & Portability different components, codes, languages, third parties Workflow Management Systems FAIR bits Composition: modularisation, FAIR parts & dependencies, propagation of FAIR properties FAIR all the way down, versions, parts recycled, repurposed, remixed, citable credit
  • 5. Workflow System Landscape Inter-twingled, mix and matching Scripting environments Interactive Electronic Research Notebooks Repositories Registries Workflow Management Systems & execution platforms https://s.apache.org/existing-workflow-systems 298 Systems General and Specialised General Repositories Identifiable Community
  • 6. FAIR Principles for Workflows Hybrid Processual Digital Objects Method “Data” Objects Workflows as FAIR Software FAIR+R and FAIR++ Quality, maturity, maintainability The principles revised Workflows as FAIR Digital Objects Data-like method objects Associated objects The principles adapted Workflows as FAIR Data Instruments FAIRification of the dataflow The data principles supported C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes, D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D. Schober. FAIR computational workflows. Data Intelligence 2(2020), 108–121. doi: 10.1162/dint_a_000 Workflow Objects Software Objects Data FAIRification
  • 7. Efforts: Workflow Findability and Accessibility Registries: lifecycle support for living workflows and associated objects Identifiers: DOIs, ORCID, ROR etc Licensing, credit, attribution Support versions, reuse & remix Workflow libraries Access workflows at source, Github support Auto / manual harvested metadata Registry – execution integration Execution monitoring services Onboard WfMS platforms Metadata standards framework Metadata by stealth https://workflowhub.eu Publishing Services Journals scripts Repos Containers Deploys Tools https://dockstore.org/ Registries
  • 8. Efforts: Workflow Metadata Frameworks Metadata for machines & people, for WfMS, Registries & Services Common metadata about the workflow, tools & parameters Canonical workflow description of the steps of the workflow Type the input and outputs of the steps Run Provenance / Histories / Tests RO-Crate format for packaging a workflow, its metadata and companion objects (links to containers, data etc) for exchange, archiving, reporting, citing. FAIR Digital Object Open Communities
  • 9. Efforts: Workflow Interoperability 1. Workflow spec & WfMS interoperability: describe workflows independently of WfMS. Platform independent pipeline exchange and comparison. 2. Workflow Composability: Software interoperates through APIs and metadata standards (FAIR4RS*). Workflow-ready tools. Recycle tested & validated canonical workflow blocks. https://openwdl.org/ https://www.commonwl.org Design for FAIR Data & FAIR Workflow Reuse Review Curation Certification Governance Licence combinations Access permissions Local -> Global identifiers Best Practice * FAIR4RS First Draft of FAIR4RS principles
  • 10. Efforts: Workflow Reusability and Usability FAIR+R, FAIR++, FAIR4RS Reusable – “can be understood, modified, built upon or incorporated into other software workflows” Composability + Associated Objects + Metadata Usable – “can be executed” Containers & Packaging Testing & monitoring Execution standards APIs Tool Registry Service API checker workflows test data A2. metadata are accessible, even when the workflow is no longer available Enough metadata that a workflow is read-reproducible as a method description if it no longer runs
  • 11. Effort: Workflows as functions for FAIR Data Data FAIRification of Workflows, assisted by WfMS & reporting Challenge of diverse API & AAI landscape, formats and packaging Review Curation Certification Governance Best Practice Golden Examples Canonical workflows Manage AAI, format, packaging choices Design for FAIR Data and Reuse
  • 12. FAIR Computational Workflows Hybrid Processual Digital Objects Data + Software FAIR Principles Data FAIRification methods WfMS support FAIR takes a village Community of projects, WfMS, platforms & environments, stakeholders. Long tail pattern. Collective action by a few WfMS and services nails 80:20. FAIR by stealth. Borgman, C. L., & Bourne, P. E. (2021). Why it takes a village to manage and share data. Harvard Data Science Review (under Review), arXiv:2109.01694v1.
  • 13. EOSC-Life https://www.eosc-life.eu/ RO-Crate https://www.researchobject.org/ro-crate/ WorkflowHub https://workflowhub.eu/ Galaxy Europe https://galaxyproject.eu/ Bioschemas https://bioschemas.org/ Common Workflow Language https://www.commonwl.org/ Dockstore https://dockstore.org/ WorkflowsRI https://workflowsri.org/ Acknowledgements Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael Crusoe, Kristian Peters, Daniel Schober