The document discusses workflows and their importance in preserving scientific methodology. It provides an overview of the Wf4Ever project which aims to develop tools and infrastructure for preserving, sharing, and reusing scientific workflows. Key points include:
1) Workflows capture experiments in an executable way and allow results to be reproduced, repeated, and reused by others.
2) Wf4Ever is working to develop semantic repositories, advanced search/retrieval, and communities for sharing workflows across disciplines like astronomy.
3) Preserving complete "research objects" containing related data, code, and publications better supports reproducibility and new knowledge discovery.
Grant agreement no.: 27092 Workflow Preservation
1. Grant agreement no.: 27092
Workflows for Methodology and Science Preservation
Juan de Dios Santander Vela
On behalf of the AMIGA group and the Wf4Ever collaboration
Instituto de Astrofísica de Andalucía-CSIC, AMIGA Group
2. AMIGA
█
AMIGA: Analysis of the Interstellar Medium of isolated GAlaxies
‣ Multi-wavelength, multi-object study on isolated galaxies with
strict isolation criteria
‣ Careful curation of data
‣ Very careful processing of new parameters from
• Group’s own observation programs and data reduction
• Literature table scanning
• Virtual Observatory table harvesting and parsing
‣ Emphasis on marrying astronomy and computer science, and buy-in of the VO
e-Science believers!
3. What is Wf4Ever?
EU-funded FP7 STREP Project
December 2010 – December 2013
1. Intelligent Software Components (ISOCO, Spain)
2. University of Manchester (UNIMAN, UK)
3. Universidad Politécnica de Madrid (UPM, Spain)
4. Poznan Supercomputing and Networking Centre (PSNC, Poland)
5. University of Oxford (OXF, UK)
6. Instituto de Astrofísica de Andalucía (IAA, Spain)
7. Leiden University Medical Centre (LUMC, NL)
4. What is Wf4Ever?
Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines
Partners
• One SME
• Six public organisations
Core Competencies (Tech)
• Digital Libraries
• Workflow Management
• Semantic Web
• Integrity & Authenticity
• Provenance
• Information Quality
Case Studies
• Astronomy (IAA)
• Genome-wide Analysis and Biobanking
Goals
• Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities
• Creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge
5. What are workflows?
Combination of data and processes into a
configurable and structured set of steps that
implement semi-automated, problem solving,
computational solutions
█
Types of workflows in Astronomy
‣ Personal script-based recipes
‣ Internal group developments✱
‣ Multi-archive VO experiments
‣ The classical processing pipeline✱
‣ Driving pipelines from VO services (TBD)
✱ Scientifically exploitable results vs. scientific insight
Easily accessible and reproducible
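The definition on this slide (data and processes combined into a configurable, structured set of steps) can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Step` and `Workflow` classes, the two toy steps, and the `mag_limit` parameter are invented for illustration and do not come from Wf4Ever or any workflow system.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One named processing step; `func` maps the running state to new values."""
    name: str
    func: Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass
class Workflow:
    """An ordered, configurable list of steps applied to input data."""
    steps: List[Step] = field(default_factory=list)

    def run(self, data: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
        state = {**data, **config}          # data and configuration travel together
        for step in self.steps:
            state.update(step.func(state))  # each step adds or overwrites values
        return state


# Hypothetical two-step recipe: select bright sources, then count them.
def select_bright(state):
    return {"selected": [m for m in state["magnitudes"] if m < state["mag_limit"]]}


def count_selected(state):
    return {"n_selected": len(state["selected"])}


workflow = Workflow([Step("select", select_bright), Step("count", count_selected)])
result = workflow.run({"magnitudes": [12.1, 15.7, 13.4]}, {"mag_limit": 14.0})
print(result["n_selected"])
```

Because the steps and the configuration are explicit objects rather than lines in a personal script, the same recipe can be re-run with a different `mag_limit` or shared with its structure intact.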
7. What tools are available?
Combination of data and processes into a
configurable and structured set of steps that
implement semi-automated, problem solving,
computational solutions
8. The importance of workflow preservation
Astronomy research is entirely digital:
time to go “beyond the PDF”
█
Preserved experiments
‣ Methodology “in action”
‣ All data are exposed
‣ Reproducible
‣ Repeatable
‣ Re-usable
‣ Re-purposeable
‣ Participatory
‣ Collaborative
‣ Formative
9. The importance of workflow preservation
Callouts added to the preceding slide over successive builds:
‣ Trust assessment
‣ Social aspect of science
‣ New kind of publication?
‣ Discoverable!
13. Workflow preservation considerations
Workflow, not data preservation
█
Workflows are interpreted through their execution
‣ Complex models are required to describe them
█
Severely vulnerable to obsolescence
‣ Applications
‣ Libraries
‣ Operating environment
█
Provenance is a complex issue in a cloud of services
█
Resources are often beyond control of scientists
█
Alleviate decay of external resources via alternates
█
Ensure trustworthiness and authenticity
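One way to make the obsolescence risk above concrete is to record the applications, libraries, and operating environment alongside the workflow. The sketch below is an illustration only, using Python's standard library; the `environment_snapshot` helper and the package names passed to it are assumptions, not part of any Wf4Ever tool.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Return a JSON-serialisable record of interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed; recorded rather than guessed
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


# The package names are placeholders; list whatever the workflow actually imports.
snapshot = environment_snapshot(["pip", "a-package-that-does-not-exist"])
print(json.dumps(snapshot, indent=2))
```

A snapshot like this does not stop decay, but it tells a later user which versions the workflow was last known to run against.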
14. Workflow preservation considerations
Workflow, not data preservation
█
Versioning of the whole workflow, or its components
█
Access control policies on data and processes
█
Permissions, licenses, platform, costs, etc.
█
Semantic discovery (WFs, processes, web services)
█
QA: usage, logs, uptime…
Workflows and Processes should benefit from the same privileges acquired by Data
15. First Approach to Workflow Preservation
Preserve, Retrieve, Reconstruct, Replay
█
Retrieve
‣ Functionality of the WF and/or its modules
‣ What are the inputs and outputs
‣ Metadata: Authority, Complexity, Keywords…
█
Reconstruct
‣ Understand dependencies and components
‣ Technical specificities
█
Replay
‣ Check the success of the preservation method
█
Referenced and acknowledged
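The Replay step (checking that preservation succeeded) can be sketched as re-running the workflow and comparing its output against a fingerprint stored at preservation time. This is a minimal illustration under stated assumptions: `toy_workflow`, `digest`, and `replay_matches` are hypothetical names, and SHA-256 comparison is one possible check, not the method the project prescribes.

```python
import hashlib


def digest(payload: bytes) -> str:
    """SHA-256 hex digest of a result, used as its fingerprint."""
    return hashlib.sha256(payload).hexdigest()


def replay_matches(run_workflow, inputs, preserved_digest: str) -> bool:
    """Re-run the workflow and compare the new output with the preserved one."""
    return digest(run_workflow(inputs)) == preserved_digest


# Hypothetical workflow: writes the sorted inputs as a comma-separated line.
def toy_workflow(values):
    return ",".join(str(v) for v in sorted(values)).encode("utf-8")


preserved = digest(toy_workflow([3, 1, 2]))  # stored at preservation time
print(replay_matches(toy_workflow, [3, 1, 2], preserved))
print(replay_matches(toy_workflow, [3, 1, 4], preserved))
```

A byte-for-byte comparison is deliberately strict; workflows with floating-point or time-dependent output would likely need a tolerance-based comparison instead.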
16. First Approach to Workflow Preservation
Preserve, Retrieve, Reconstruct, Replay
Callouts added to the preceding slide:
‣ Retrieve: Characterisation
‣ Reconstruct: Tools, Semantics & Modelling
‣ Referenced and acknowledged: Long term IDs
17. More than a WF: The Research Object (RO)
█
All components related to the research lifecycle of an
experiment should be available.
█
Preserved and easily retrievable
‣ Proposals
‣ Data
‣ Processes
‣ Workflows
‣ Publications
All linked by persistent IDs
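The Research Object idea (all lifecycle components, linked by persistent IDs) can be pictured as a small data structure. The sketch below is not the Wf4Ever RO model; the class names, the `example:` identifiers, and the `missing_kinds` helper are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five component kinds listed on the slide.
KINDS = ("proposal", "data", "process", "workflow", "publication")


@dataclass(frozen=True)
class Resource:
    pid: str    # persistent identifier (the values below are made-up placeholders)
    kind: str   # one of KINDS
    title: str


@dataclass
class ResearchObject:
    pid: str
    resources: Dict[str, Resource] = field(default_factory=dict)

    def add(self, resource: Resource) -> None:
        if resource.kind not in KINDS:
            raise ValueError(f"unknown kind: {resource.kind}")
        self.resources[resource.pid] = resource

    def missing_kinds(self) -> List[str]:
        """Kinds of component the lifecycle expects but that are not yet present."""
        present = {r.kind for r in self.resources.values()}
        return [k for k in KINDS if k not in present]


ro = ResearchObject(pid="example:ro/1")
ro.add(Resource("example:data/42", "data", "Catalogue of isolated galaxies"))
ro.add(Resource("example:wf/7", "workflow", "Cone search and column extraction"))
print(ro.missing_kinds())
```

Keying every component by its identifier is what lets the pieces be retrieved together later; the completeness check is one simple thing such a structure makes possible.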
18. Wf4Ever Update
█
User Requirements
‣ Functional requirements for Wf4Ever “working” platform
‣ Focused on improving collaboration and reuse
‣ Interoperability in exchanging scientific methodology
‣ Expose experiment in a structured way to be understood by others
We need to build what we want to preserve!
█
RO Modeling
‣ Model for interlinked components in a Research Object
‣ Strategies for assessing integrity and authenticity
‣ Attempts at metrics for Information Quality
19. Wf4Ever Update
‣ Architecture
• Search & Retrieval Service
• Recommender Service
• I & A Evaluation Service
• Notification Service
‣ User-Tools Prototypes
• RO Command Line Tool
• RO Annotator
• RO Box
20. New Workflows in myExperiment
Search results for "virtual observatory" (showing 5 results: 3 workflows, 1 group, 1 user)
‣ AMIGA ConeSearch (v3), Taverna 2
• Created: 11/07/11 @ 22:08:06 | Last updated: 11/07/11 @ 23:34:13
• License: BSD License
• This workflow provides a VOTable response from the AMIGA ConeSearch service and extract values from VOTable columns.
• Tags (3): astronomy | virtual observatory | votable
‣ AMIGA ConeSearch from a file of targets/positions (v1), Taverna 2
• Created: 12/07/11 @ 17:34:33 | Last updated: 12/07/11 @ 17:36:37
• License: BSD License
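What the ConeSearch workflow does (query a cone search service, then pull values out of VOTable columns) can be approximated in a few lines. This is a sketch, not the Taverna workflow itself: `BASE_URL` is a placeholder rather than the AMIGA service address, the sample VOTable is made up, and no network request is made. The `RA`, `DEC`, and `SR` parameter names follow the IVOA Simple Cone Search convention.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Placeholder endpoint, not the real AMIGA service address.
BASE_URL = "http://example.org/conesearch"


def cone_search_url(ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search query URL (RA, DEC, SR in decimal degrees)."""
    return BASE_URL + "?" + urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})


def local(tag):
    """Strip any XML namespace so the parser works across VOTable versions."""
    return tag.rsplit("}", 1)[-1]


def extract_column(votable_xml, column):
    """Return the values of one named FIELD; assumes a single-table document."""
    root = ET.fromstring(votable_xml)
    names = [el.get("name") for el in root.iter() if local(el.tag) == "FIELD"]
    index = names.index(column)
    rows = [el for el in root.iter() if local(el.tag) == "TR"]
    return [[td.text for td in row if local(td.tag) == "TD"][index] for row in rows]


# Made-up response standing in for what a cone search service would return.
SAMPLE = """<VOTABLE version="1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
 <RESOURCE><TABLE>
  <FIELD name="name" datatype="char" arraysize="*"/>
  <FIELD name="ra" datatype="double"/>
  <DATA><TABLEDATA>
   <TR><TD>OBJ-A</TD><TD>10.5</TD></TR>
   <TR><TD>OBJ-B</TD><TD>10.7</TD></TR>
  </TABLEDATA></DATA>
 </TABLE></RESOURCE>
</VOTABLE>"""

print(cone_search_url(10.6, 41.2, 0.5))
print(extract_column(SAMPLE, "ra"))
```

Values come back as strings, exactly as they appear in the table; converting them according to each FIELD's `datatype` is left out to keep the sketch short.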
22. Wf4Ever Update
‣ Structure in Dropbox
‣ Metadata for selected item
‣ Unstructured, rich-text metadata editor
23. Wf4Ever Update
Notification Service for Authors
█
What should be notified?
‣ Fails
‣ Downloads
‣ Annotations
‣ Linked/Similarity
‣ Modifications on Working RO
‣ Acknowledgements
█
Notification Management Tool
‣ Avoid spam
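One way to picture the notification types above, and the aim of avoiding spam, is to collapse individual events into a per-author digest with optional muting. The sketch below is hypothetical throughout: the event names mirror the slide's list, but `Event`, `digest_for`, and the muting mechanism are illustrative assumptions rather than the design of the Wf4Ever Notification Service.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List

# Event kinds taken from the list on this slide.
EVENT_TYPES = {"fail", "download", "annotation", "similarity",
               "modification", "acknowledgement"}


@dataclass(frozen=True)
class Event:
    author: str   # who should be notified
    ro_id: str    # research object the event refers to
    kind: str     # one of EVENT_TYPES


def digest_for(author: str, events: List[Event],
               muted=frozenset()) -> Dict[str, Dict[str, int]]:
    """Group one author's events per research object and count them by kind."""
    summary: Dict[str, Counter] = {}
    for ev in events:
        if ev.kind not in EVENT_TYPES:
            raise ValueError(f"unknown event kind: {ev.kind}")
        if ev.author != author or ev.kind in muted:
            continue  # other authors' events and muted kinds are skipped
        summary.setdefault(ev.ro_id, Counter())[ev.kind] += 1
    return {ro: dict(counts) for ro, counts in summary.items()}


events = [
    Event("alice", "ro-1", "download"),
    Event("alice", "ro-1", "download"),
    Event("alice", "ro-1", "fail"),
    Event("alice", "ro-1", "annotation"),
    Event("bob", "ro-2", "download"),
]
print(digest_for("alice", events, muted=frozenset({"annotation"})))
```

Sending one summary per author instead of one message per event is a common way to keep notification volume down; which kinds are muted by default would be a policy decision for the tool.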
24. Conclusions
█
Workflows are a powerful, semantically rich way of
describing astronomical knowledge discovery methods
‣ Provide both glue and structure to the method
‣ Also allow for metadata encapsulation
█
Preserving workflows allows for method reuse,
experiment replay, dissemination, attribution, trust
building
█
Wf4Ever is providing a framework for allowing
astronomers to start using workflows without leaving
their tools
‣ But with the idea of nudging them toward more structured
workflow descriptions