Grant agreement no.: 27092




     Workflows for Methodology
      and Science Preservation
               Juan de Dios Santander Vela
On behalf of the AMIGA group and the Wf4Ever collaboration

 Instituto de Astrofísica de Andalucía-CSIC, AMIGA Group
AMIGA


█
    AMIGA: Analysis of the Interstellar Medium of isolated
    GAlaxies
    ‣ Multi-wavelength, multi-object study on isolated galaxies with
      strict isolation criteria
    ‣ Careful curation of data
    ‣ Very careful processing of new parameters from
     • Group’s own observation programs and data reduction
     • Literature table scanning
     • Virtual Observatory table harvesting and parsing

    ‣ Emphasis on marrying astronomy and computer science, and
      buy-in of the VO
    e-Science believers!
What is Wf4Ever?

EU-funded FP7 STREP Project
December 2010 – December 2013

1. Intelligent Software Components (ISOCO, Spain)
2. University of Manchester (UNIMAN, UK)
3. Universidad Politécnica de Madrid (UPM, Spain)
4. Poznan Supercomputing and Networking Centre (PSNC, Poland)
5. University of Oxford (OXF, UK)
6. Instituto de Astrofísica de Andalucía (IAA, Spain)
7. Leiden University Medical Centre (LUMC, NL)

What is Wf4Ever?
Technological infrastructure for the preservation and efficient retrieval
and reuse of scientific workflows in a range of disciplines

Partners
• One SME
• Six public organisations

Core Competencies (Tech)
• Digital Libraries
• Workflow Management
• Semantic Web
• Integrity & Authenticity
• Provenance
• Information Quality

Case Studies
• Astronomy (IAA)
• Genome-wide Analysis and Biobanking

Goals
• Archival, classification, and indexing of scientific workflows and their
  associated materials in scalable semantic repositories, providing
  advanced access and recommendation capabilities
• Creation of scientific communities to collaboratively share, reuse, and
  evolve workflows and their parts, stimulating the development of new
  scientific knowledge

What are workflows?

     Combination of data and processes into a
    configurable and structured set of steps that
    implement semi-automated, problem-solving
              computational solutions


█
    Types of workflows in Astronomy
    ‣   Personal script-based recipes
    ‣   Internal group developments✱
    ‣   Multi-archive VO experiments
    ‣   The classical processing pipeline✱
    ‣   Driving pipelines from VO services (TBD)
        ✱   Scientifically exploitable results vs. scientific insight

    Easily accessible and reproducible
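
The definition of a workflow given above can be illustrated with a minimal sketch. It assumes Python, and every name in it (`Workflow`, `load_targets`, `select_isolated`, `summarise`, the `min_isolation` setting and the sample values) is hypothetical. It is not a Taverna or Wf4Ever API, only one way to show data and processes combined into a configurable, ordered set of steps.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A step is a named process: it reads the shared state and the configuration,
# and returns the new values it contributes to the state.
Step = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]


@dataclass
class Workflow:
    """Ordered set of steps plus the configuration they share (illustrative only)."""
    config: Dict[str, Any]
    steps: List[Step] = field(default_factory=list)

    def run(self, data: Dict[str, Any]) -> Dict[str, Any]:
        state = dict(data)  # work on a copy, never mutate the caller's input
        for step in self.steps:
            state.update(step(state, self.config))
        return state


def load_targets(state, config):
    # Hypothetical input: (name, isolation score) pairs supplied by the caller.
    return {"targets": list(state["catalogue"])}


def select_isolated(state, config):
    # Keep targets whose made-up isolation score passes the configured threshold.
    kept = [t for t in state["targets"] if t[1] >= config["min_isolation"]]
    return {"selected": kept}


def summarise(state, config):
    return {"n_selected": len(state["selected"])}


workflow = Workflow(config={"min_isolation": 0.8},
                    steps=[load_targets, select_isolated, summarise])
result = workflow.run({"catalogue": [("G1", 0.9), ("G2", 0.5), ("G3", 0.85)]})
print(result["n_selected"])
```

Changing `min_isolation` or reordering `steps` reconfigures the experiment without touching the step code, which is the property that makes such a description worth preserving.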
What tools are available?




     Combination of data and processes into a
    configurable and structured set of steps that
    implement semi-automated, problem-solving
              computational solutions

The importance of workflow preservation


                Astronomy research is entirely digital:
                    time to go “beyond the PDF”

█
    Preserved experiments                       Discoverable!
    ‣   Methodology “in action”                 New kind of publication?
    ‣   All data are exposed
    ‣   Reproducible                            Trust assessment
    ‣   Repeatable
    ‣   Re-usable
    ‣   Re-purposeable
    ‣   Participatory                           Social aspect of science
    ‣   Collaborative
    ‣   Formative

Workflow preservation considerations

                Workflow, not data preservation

█
    Workflows are interpreted through their execution
    ‣ Complex models are required to describe them
█
    Severely vulnerable to obsolescence
    ‣ Applications
    ‣ Libraries
    ‣ Operating environment
█
    Provenance is a complex issue in a cloud of services
█
    Resources are often beyond control of scientists
█
    Alleviate decay of external resources via alternates
█
    Ensure trustworthiness and authenticity

Workflow preservation considerations

              Workflow, not data preservation

█
    Versioning of the whole workflow, or its components
█
    Access control policies on data and processes
█
    Permissions, licenses, platform, costs, etc.
█
    Semantic discovery (WFs, processes, web services)
█
    QA: usage, logs, uptime…

          Workflows and Processes should benefit
          from the same privileges acquired by Data

First Approach to Workflow Preservation

Preserve, Retrieve, Reconstruct, Replay
█
    Retrieve
    ‣ Functionality of the WF and/or its modules
    ‣ What are the inputs and outputs                  Characterisation
    ‣ Metadata: Authority, Complexity, Keywords…
█
    Reconstruct                                                     Tools
    ‣ Understand dependencies and components           Semantics & Modelling
    ‣ Technical specificities
█
    Replay
    ‣ Check the success of the preservation method
█
    Referenced and acknowledged                        Long-term IDs

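
One possible reading of the Replay step above, checking the success of the preservation method, is to re-run the preserved workflow and compare its outputs with a digest recorded at preservation time. The sketch below assumes Python and uses hypothetical names (`fingerprint`, `replay_matches`, `mean_velocity`) and made-up input values; the slides do not say how Wf4Ever performs this check.

```python
import hashlib
import json


def fingerprint(outputs):
    """Stable SHA-256 digest of a JSON-serialisable outputs dictionary."""
    canonical = json.dumps(outputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_matches(workflow, inputs, recorded_digest):
    """Re-run the preserved workflow and compare against the recorded digest."""
    return fingerprint(workflow(inputs)) == recorded_digest


def mean_velocity(inputs):
    # Hypothetical stand-in for a preserved workflow.
    values = inputs["velocities"]
    return {"mean": sum(values) / len(values), "n": len(values)}


inputs = {"velocities": [1200.0, 1350.0, 1275.0]}
recorded = fingerprint(mean_velocity(inputs))  # stored at preservation time
print(replay_matches(mean_velocity, inputs, recorded))
```

An exact digest comparison is deliberately strict; a real check might need tolerances for floating-point results or for outputs that legitimately change when an external service is updated.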
More than a WF: The Research Object (RO)




█
    All components related to the research lifecycle of an
    experiment should be available.

█
    Preserved and easily retrievable, all linked by persistent IDs
    ‣   Proposals
    ‣   Data
    ‣   Processes
    ‣   Workflows
    ‣   Publications

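
The idea of a Research Object that aggregates proposals, data, processes, workflows and publications under persistent identifiers can be sketched as a small data structure. This is an illustration in Python under stated assumptions: the class and field names are hypothetical, the identifiers are `example.org` placeholders, and the layout is not the Wf4Ever RO model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Component kinds named on the slide.
KINDS = ("proposal", "data", "process", "workflow", "publication")


@dataclass(frozen=True)
class Resource:
    pid: str    # persistent identifier (placeholder values below)
    kind: str   # one of KINDS
    title: str


@dataclass
class ResearchObject:
    pid: str
    resources: Dict[str, Resource] = field(default_factory=dict)
    # Each link is (source pid, relation, target pid).
    links: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        if resource.kind not in KINDS:
            raise ValueError(f"unknown kind: {resource.kind}")
        self.resources[resource.pid] = resource

    def link(self, source: str, relation: str, target: str) -> None:
        # Only allow links between resources already aggregated by this object.
        for pid in (source, target):
            if pid not in self.resources:
                raise KeyError(f"unknown identifier: {pid}")
        self.links.append((source, relation, target))

    def missing_kinds(self) -> List[str]:
        present = {r.kind for r in self.resources.values()}
        return [k for k in KINDS if k not in present]


ro = ResearchObject(pid="https://example.org/ro/1")
ro.add(Resource("https://example.org/id/wf-1", "workflow", "Cone search workflow"))
ro.add(Resource("https://example.org/id/data-1", "data", "Target list"))
ro.link("https://example.org/id/wf-1", "uses", "https://example.org/id/data-1")
print(ro.missing_kinds())
```

Here `missing_kinds` reports which parts of the research lifecycle are not yet attached, which is one simple way to check that "all components" are available.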
Wf4Ever Update

█
    User Requirements
    ‣   Functional requirements for Wf4Ever “working” platform
    ‣   Focused on improving collaboration and reuse
    ‣   Interoperability in exchanging scientific methodology
    ‣   Expose the experiment in a structured way so that others can
        understand it

          We need to build what we want to preserve!
█
    RO Modeling
    ‣ Model for interlinked components in a Research Object
    ‣ Strategies for assessing integrity and authenticity
    ‣ First attempts at metrics for Information Quality

Wf4Ever Update



‣ Architecture
 •   Search & Retrieval Service
 •   Recommender Service
 •   Integrity & Authenticity (I & A) Evaluation Service
 •   Notification Service

‣ User-Tools Prototypes
 • RO Command Line Tool
 • RO Annotator
 • RO Box

New Workflows in myExperiment

Search results for "virtual observatory" (showing 5 results: 3 workflows, 1 group, 1 user)

█
    Workflow (Taverna 2): AMIGA ConeSearch (v3)
    ‣ Original uploader: Pique
    ‣ Created: 11/07/11 @ 22:08:06 | Last updated: 11/07/11 @ 23:34:13
    ‣ License: BSD License
    ‣ Description: This workflow provides a VOTable response from the
      AMIGA ConeSearch service and extract values from VOTable columns.
    ‣ Rating: 0.0 / 5 (0 ratings) | Versions: 3 | Reviews: 0 | Comments: 0 | Citations: 0
    ‣ Viewed: 4 times | Downloaded: 1 time
    ‣ Tags (3): astronomy | virtual observatory | votable
█
    Workflow (Taverna 2): AMIGA ConeSearch from a file of targets/positions (v1)
    ‣ Created: 12/07/11 @ 17:34:33 | Last updated: 12/07/11 @ 17:36:37
    ‣ License: BSD License
█
    Group: AstroGrid and the VO
    ‣ Administrator: Nicholas Walton
    ‣ Unique name: astrogrid.org
    ‣ Created: Tuesday 05 February 2008 @ 19:44:08 (GMT)
    ‣ Description: This group will enable astronomers and astrophysicists who
      use the AstroGrid-Taverna workflow system to share their workflows. For
      more information see the AstroGrid website http://www.astrogrid.org. In
      addition emerging International Virtual Observatory Alliance (IVOA - see
      http://www.ivoa.net) efforts in the 'workflow' arena will be referenced.
    ‣ 0 shared items | 0 announcements
    ‣ Members (2): Nicholas Walton, Dugan
    ‣ Tags: astrogrid-taverna | astrophysics | virtual observatory | workflows
█
    Member: Pique
    ‣ Joined: Tuesday 08 March 2011 @ 00:23:14 (GMT)
    ‣ No description
    ‣ Last active: Wednesday 02 November 2011 @ 12:06:31 (GMT)
    ‣ Website: http://www.iaa.es/~jer | Email (public): jer [at] iaa.es

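
The AMIGA ConeSearch workflow listed above is described as returning a VOTable response and extracting values from its columns. The sketch below shows those two steps in Python using only the standard library. Several things in it are assumptions: the endpoint is a placeholder (the real AMIGA service URL does not appear in the slides), the sample VOTable is made up so the example runs offline, and the function names are hypothetical. `RA`, `DEC` and `SR` are the query parameters of the IVOA Simple Cone Search protocol, in decimal degrees.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Placeholder endpoint: the real AMIGA ConeSearch URL is not given in the slides.
BASE_URL = "https://example.org/conesearch"


def cone_search_url(ra_deg, dec_deg, radius_deg, base_url=BASE_URL):
    """Build a Simple Cone Search query (RA, DEC, SR in decimal degrees)."""
    return base_url + "?" + urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})


def local_name(tag):
    """Strip an XML namespace such as '{http://...}FIELD' down to 'FIELD'."""
    return tag.rsplit("}", 1)[-1]


def votable_column(votable_xml, column_name):
    """Return the values of one named column from a single-table VOTable."""
    root = ET.fromstring(votable_xml)
    fields = [e.get("name") for e in root.iter() if local_name(e.tag) == "FIELD"]
    index = fields.index(column_name)  # raises ValueError if the column is absent
    values = []
    for row in (e for e in root.iter() if local_name(e.tag) == "TR"):
        cells = [c for c in row if local_name(c.tag) == "TD"]
        values.append(cells[index].text)
    return values


# Made-up response used instead of a network call, so the example runs offline.
SAMPLE = """<VOTABLE version="1.2"><RESOURCE><TABLE>
<FIELD name="name" datatype="char" arraysize="*"/>
<FIELD name="ra" datatype="double"/>
<DATA><TABLEDATA>
<TR><TD>Galaxy A</TD><TD>10.5</TD></TR>
<TR><TD>Galaxy B</TD><TD>11.0</TD></TR>
</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>"""

print(cone_search_url(10.68, 41.27, 0.5))
print(votable_column(SAMPLE, "name"))
```

In practice the URL would be fetched (for example with `urllib.request.urlopen`) and the response body passed to `votable_column`; a dedicated VOTable parser would be the more robust choice for binary or multi-table responses.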
Wf4Ever Update




    ‣ Structure in Dropbox
    ‣ Metadata for selected item
    ‣ Unstructured, rich-text metadata editor

Wf4Ever Update


Notification Service for Authors

█
    What should be notified?
    ‣   Failures
    ‣   Downloads
    ‣   Annotations
    ‣   Linked/Similarity
    ‣   Modifications on Working RO
    ‣   Acknowledgements

█
    Notification Management Tool
    ‣ Avoid spam


Conclusions

█
    Workflows are a powerful, semantically rich way of
    describing astronomical knowledge discovery methods
    ‣ Provide both glue and structure to the method
    ‣ Also allow for metadata encapsulation

█
    Preserving workflows allows for method reuse,
    experiment replay, dissemination, attribution, trust
    building
█
    Wf4Ever is providing a framework for allowing
    astronomers to start using workflows without leaving
    their tools
    ‣ But with the idea of nudging them toward more structured
      workflow descriptions

                                                                       18

More Related Content

Similar to Grant agreement no.: 27092 Workflow Preservation

2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objectsStian Soiland-Reyes
 
Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...
Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...
Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...Neil Chue Hong
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overviewdgarijo
 
Collaborative Digital Experiments
Collaborative Digital ExperimentsCollaborative Digital Experiments
Collaborative Digital ExperimentsJose Enrique Ruiz
 
Paper talk: Idcc 11
Paper talk: Idcc 11  Paper talk: Idcc 11
Paper talk: Idcc 11 Paolo Missier
 
Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Ola Spjuth
 
Data-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudData-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudOla Spjuth
 
Open Science and Executable Papers
Open Science and Executable PapersOpen Science and Executable Papers
Open Science and Executable PapersJose Enrique Ruiz
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 
Curating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital ExperimentsCurating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital ExperimentsJose Enrique Ruiz
 
Is preserving data enough? Towards the preservation of scientific methods
Is preserving data enough? Towards the preservation of scientific methods Is preserving data enough? Towards the preservation of scientific methods
Is preserving data enough? Towards the preservation of scientific methods dgarijo
 
Presentation to EASE, Tallinn, June 2012
Presentation to EASE, Tallinn, June 2012Presentation to EASE, Tallinn, June 2012
Presentation to EASE, Tallinn, June 2012Sarah Callaghan
 
Deroure Repo3
Deroure Repo3Deroure Repo3
Deroure Repo3guru122
 
Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesHerbert Van de Sompel
 
Group Support Systems - GSS
Group Support Systems - GSSGroup Support Systems - GSS
Group Support Systems - GSSJoão Gratuliano
 

Similar to Grant agreement no.: 27092 Workflow Preservation (20)

Workflow Preservation
Workflow PreservationWorkflow Preservation
Workflow Preservation
 
2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects
 
Digital Science
Digital ScienceDigital Science
Digital Science
 
Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...
Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...
Doing Science Properly in the Digital Age: Software Skills for Free-Range Res...
 
Research Objects in Wf4Ever
Research Objects in Wf4EverResearch Objects in Wf4Ever
Research Objects in Wf4Ever
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overview
 
Collaborative Digital Experiments
Collaborative Digital ExperimentsCollaborative Digital Experiments
Collaborative Digital Experiments
 
Paper talk: Idcc 11
Paper talk: Idcc 11  Paper talk: Idcc 11
Paper talk: Idcc 11
 
Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...
 
Data-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudData-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and Cloud
 
Open Science and Executable Papers
Open Science and Executable PapersOpen Science and Executable Papers
Open Science and Executable Papers
 
Reproducibility 1
Reproducibility 1Reproducibility 1
Reproducibility 1
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
Curating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital ExperimentsCurating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital Experiments
 
Is preserving data enough? Towards the preservation of scientific methods
Is preserving data enough? Towards the preservation of scientific methods Is preserving data enough? Towards the preservation of scientific methods
Is preserving data enough? Towards the preservation of scientific methods
 
Presentation to EASE, Tallinn, June 2012
Presentation to EASE, Tallinn, June 2012Presentation to EASE, Tallinn, June 2012
Presentation to EASE, Tallinn, June 2012
 
Deroure Repo3
Deroure Repo3Deroure Repo3
Deroure Repo3
 
Deroure Repo3
Deroure Repo3Deroure Repo3
Deroure Repo3
 
Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositories
 
Group Support Systems - GSS
Group Support Systems - GSSGroup Support Systems - GSS
Group Support Systems - GSS
 

More from Joint ALMA Observatory

Hablemos de ALMA — Wideband Sensitivity Upgrade
Hablemos de ALMA — Wideband Sensitivity UpgradeHablemos de ALMA — Wideband Sensitivity Upgrade
Hablemos de ALMA — Wideband Sensitivity UpgradeJoint ALMA Observatory
 
From SKA to SKAO: Early progress in SKA project construction.
From SKA to SKAO: Early progress in SKA project construction.From SKA to SKAO: Early progress in SKA project construction.
From SKA to SKAO: Early progress in SKA project construction.Joint ALMA Observatory
 
The Square Kilometre Array Science Cases (CosmoAndes 2018)
The Square Kilometre Array Science Cases (CosmoAndes 2018)The Square Kilometre Array Science Cases (CosmoAndes 2018)
The Square Kilometre Array Science Cases (CosmoAndes 2018)Joint ALMA Observatory
 
Software Development Practices in ESFRIS—SKA Software Development
Software Development Practices in ESFRIS—SKA Software DevelopmentSoftware Development Practices in ESFRIS—SKA Software Development
Software Development Practices in ESFRIS—SKA Software DevelopmentJoint ALMA Observatory
 
Agile Systems Engineering & Agile at SKA Scale
Agile Systems Engineering & Agile at SKA ScaleAgile Systems Engineering & Agile at SKA Scale
Agile Systems Engineering & Agile at SKA ScaleJoint ALMA Observatory
 
Citizen Science in the era of the Square Kilometre Array
Citizen Science in the era of the Square Kilometre ArrayCitizen Science in the era of the Square Kilometre Array
Citizen Science in the era of the Square Kilometre ArrayJoint ALMA Observatory
 
The Square Kilometre Array: Overview and Engineering Update
The Square Kilometre Array: Overview and Engineering UpdateThe Square Kilometre Array: Overview and Engineering Update
The Square Kilometre Array: Overview and Engineering UpdateJoint ALMA Observatory
 
SKA Systems Engineering: from PDR to Construction
SKA Systems Engineering: from PDR to ConstructionSKA Systems Engineering: from PDR to Construction
SKA Systems Engineering: from PDR to ConstructionJoint ALMA Observatory
 
Building a National Virtual Observatory: The Case of the Spanish Virtual Obse...
Building a National Virtual Observatory: The Case of the Spanish Virtual Obse...Building a National Virtual Observatory: The Case of the Spanish Virtual Obse...
Building a National Virtual Observatory: The Case of the Spanish Virtual Obse...Joint ALMA Observatory
 
Wf4Ever: Scientific Workflows and Research Objects as tools for scientific in...
Wf4Ever: Scientific Workflows and Research Objects as tools for scientific in...Wf4Ever: Scientific Workflows and Research Objects as tools for scientific in...
Wf4Ever: Scientific Workflows and Research Objects as tools for scientific in...Joint ALMA Observatory
 
e-Science for the Science Kilometre Array
e-Science for the Science Kilometre Arraye-Science for the Science Kilometre Array
e-Science for the Science Kilometre ArrayJoint ALMA Observatory
 
VO Course 10: Big data challenges in astronomy
VO Course 10: Big data challenges in astronomyVO Course 10: Big data challenges in astronomy
VO Course 10: Big data challenges in astronomyJoint ALMA Observatory
 
Curso VO 07: Sistemas gestores de bases de datos
Curso VO 07: Sistemas gestores de bases de datosCurso VO 07: Sistemas gestores de bases de datos
Curso VO 07: Sistemas gestores de bases de datosJoint ALMA Observatory
 
VO Course 05: VOTable, VO Protocols, and UCDs
VO Course 05: VOTable, VO Protocols, and UCDsVO Course 05: VOTable, VO Protocols, and UCDs
VO Course 05: VOTable, VO Protocols, and UCDsJoint ALMA Observatory
 
VO Course 03: IVOA, the International Virtual Observatory Alliance
VO Course 03: IVOA, the International Virtual Observatory AllianceVO Course 03: IVOA, the International Virtual Observatory Alliance
VO Course 03: IVOA, the International Virtual Observatory AllianceJoint ALMA Observatory
 

Grant agreement no.: 27092 Workflow Preservation

  • 1. Grant agreement no.: 27092. Workflows for Methodology and Science Preservation. Juan de Dios Santander Vela, on behalf of the AMIGA group and the Wf4Ever collaboration. Instituto de Astrofísica de Andalucía-CSIC, AMIGA Group
  • 2. AMIGA █ AMIGA: Analysis of the Interstellar Medium of isolated GAlaxies ‣ Multi-wavelength, multi-object study on isolated galaxies with strict isolation criteria ‣ Careful curation of data ‣ Very careful processing of new parameters from • Group’s own observation programs and data reduction • Literature table scanning • Virtual Observatory table harvesting and parsing ‣ Emphasis on marrying astronomy and computer science, and buy-in of the VO: e-Science believers!
  • 3. What is Wf4Ever? EU funded FP7 STREP Project, December 2010 – December 2013. 1. Intelligent Software Components (ISOCO, Spain) 2. University of Manchester (UNIMAN, UK) 3. Universidad Politécnica de Madrid (UPM, Spain) 4. Poznan Supercomputing and Networking Centre (PSNC, Poland) 5. University of Oxford (OXF, UK) 6. Instituto de Astrofísica de Andalucía (IAA, Spain) 7. Leiden University Medical Centre (LUMC, NL)
  • 4. What is Wf4Ever? Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines. Partners: • One SME • Six public organisations. Core Competencies (Tech): • Digital Libraries • Workflow Management • Semantic Web • Integrity & Authenticity • Provenance • Information Quality. Case Studies: • Astronomy (IAA) • Genome-wide Analysis and Biobanking. Goals: • Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities • Creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge
  • 5. What are workflows? Combination of data and processes into a configurable and structured set of steps that implement semi-automated, problem-solving, computational solutions █ Types of workflows in Astronomy ‣ Personal script-based recipes ‣ Internal group developments✱ ‣ Multi-archive VO experiments ‣ The classical processing pipeline✱ ‣ Driving pipelines from VO services (TBD) ✱ Scientifically exploitable results vs. scientific insight. Easily accessible and reproducible
  • 6. What tools are available?
  • 7. What tools are available? Combination of data and processes into a configurable and structured set of steps that implement semi-automated, problem-solving, computational solutions
  • 8. The importance of workflow preservation. Astronomy research is entirely digital: time to go “beyond the PDF” █ Preserved experiments ‣ Methodology “in action” ‣ All data are exposed ‣ Reproducible ‣ Repeatable ‣ Re-usable ‣ Re-purposeable ‣ Participatory ‣ Collaborative ‣ Formative
  • 9. The importance of workflow preservation. Astronomy research is entirely digital: time to go “beyond the PDF” █ Preserved experiments ‣ Methodology “in action” ‣ All data are exposed ‣ Reproducible ‣ Repeatable ‣ Re-usable ‣ Re-purposeable ‣ Participatory ‣ Collaborative ‣ Formative. Annotation: Trust assessment
  • 10. The importance of workflow preservation. Astronomy research is entirely digital: time to go “beyond the PDF” █ Preserved experiments ‣ Methodology “in action” ‣ All data are exposed ‣ Reproducible ‣ Repeatable ‣ Re-usable ‣ Re-purposeable ‣ Participatory ‣ Collaborative ‣ Formative. Annotation: Social aspect of science
  • 11. The importance of workflow preservation. Astronomy research is entirely digital: time to go “beyond the PDF” █ Preserved experiments ‣ Methodology “in action” ‣ All data are exposed ‣ Reproducible ‣ Repeatable ‣ Re-usable ‣ Re-purposeable ‣ Participatory ‣ Collaborative ‣ Formative. Annotation: New kind of publication?
  • 12. The importance of workflow preservation. Astronomy research is entirely digital: time to go “beyond the PDF” █ Preserved experiments ‣ Methodology “in action” ‣ All data are exposed ‣ Reproducible ‣ Repeatable ‣ Re-usable ‣ Re-purposeable ‣ Participatory ‣ Collaborative ‣ Formative. Annotation: Discoverable!
  • 13. Workflow preservation considerations. Workflow, not data preservation █ Workflows are interpreted through their execution ‣ Complex models are required to describe them █ Severely vulnerable to obsolescence ‣ Applications ‣ Libraries ‣ Operating environment █ Provenance is a complex issue in a cloud of services █ Resources are often beyond control of scientists █ Alleviate decay of external resources via alternates █ Ensure trustworthiness and authenticity
  • 14. Workflow preservation considerations. Workflow, not data preservation █ Versioning of the whole workflow, or its components █ Access control policies on data and processes █ Permissions, licenses, platform, costs, etc. █ Semantic discovery (WFs, processes, web services) █ QA: usage, logs, uptime… Workflows and processes should benefit from the same privileges acquired by data
  • 15. First Approach to Workflow Preservation: Preserve, Retrieve, Reconstruct, Replay █ Retrieve ‣ Functionality of the WF and/or its modules ‣ What are the inputs and outputs ‣ Metadata: Authority, Complexity, Keywords… █ Reconstruct ‣ Understand dependencies and components ‣ Technical specificities █ Replay ‣ Check the success of the preservation method █ Referenced and acknowledged
  • 16. First Approach to Workflow Preservation: Preserve, Retrieve, Reconstruct, Replay █ Retrieve (annotation: Characterisation) ‣ Functionality of the WF and/or its modules ‣ What are the inputs and outputs ‣ Metadata: Authority, Complexity, Keywords… █ Reconstruct (annotations: Tools; Semantics & Modelling) ‣ Understand dependencies and components ‣ Technical specificities █ Replay ‣ Check the success of the preservation method █ Referenced and acknowledged (annotation: Long term IDs)
  • 17. More than a WF: The Research Object (RO) █ All components related to the research lifecycle of an experiment should be available. █ Preserved and easily retrievable ‣ Proposals ‣ Data ‣ Processes ‣ Workflows ‣ Publications. All linked by persistent IDs
  • 18. Wf4Ever Update █ User Requirements ‣ Functional requirements for Wf4Ever “working” platform ‣ Focused on improving collaboration and reuse ‣ Interoperability in exchanging scientific methodology ‣ Expose experiment in a structured way to be understood by others █ RO Modeling ‣ Model for interlinked components in a Research Object ‣ Strategies for assessing integrity and authenticity ‣ Attempts in metrics for Information Quality. Callout: We need to build what we want to preserve!
  • 19. Wf4Ever Update ‣ Architecture • Search & Retrieval Service • Recommender Service • I & A Evaluation Service • Notification Service ‣ User-Tools Prototypes • RO Command Line Tool • RO Annotator • RO Box
  • 20. New Workflows in myExperiment [Screenshot: myExperiment search results for "virtual observatory", showing 5 results (3 workflows, 1 group, 1 user). Two Taverna 2 workflows uploaded by Pique are visible: "AMIGA ConeSearch" (v3, created 11/07/11, BSD License; tags: astronomy, virtual observatory, votable), which provides a VOTable response from the AMIGA ConeSearch service and extracts values from VOTable columns; and "AMIGA ConeSearch from a file of targets/positions" (v1, created 12/07/11, BSD License).]
  • 21. New Workflows in myExperiment [Screenshot, continued: the "AstroGrid and the VO" group (unique name: astrogrid.org; created 5 February 2008; tags: astrogrid-taverna, astrophysics, virtual observatory, workflows). The group description reads: "This group will enable astronomers and astrophysicists who use the AstroGrid-Taverna workflow system to share their workflows. For more information see the AstroGrid website http://www.astrogrid.org. In addition emerging International Virtual Observatory Alliance (IVOA - see http://www.ivoa.net) efforts in the 'workflow' arena will be referenced." Also shown: the member profile for Pique (joined 8 March 2011; website: http://www.iaa.es/~jer).]
  • 22. Wf4Ever Update [Screenshot labels: Structure in Dropbox; Metadata for selected item; Unstructured, rich-text metadata editor]
  • 23. Wf4Ever Update: Notification Service for Authors █ What should be notified? ‣ Failures ‣ Downloads ‣ Annotations ‣ Linked/Similarity ‣ Modifications on Working RO ‣ Acknowledgements █ Notification Management Tool ‣ Avoid spam
  • 24. Conclusions █ Workflows are a powerful, semantically rich way of describing astronomical knowledge discovery methods ‣ Provide both glue and structure to the method ‣ Also allow for metadata encapsulation █ Preserving workflows allows for method reuse, experiment replay, dissemination, attribution, trust building █ Wf4Ever is providing a framework for allowing astronomers to start using workflows without leaving their tools ‣ But with the idea of nudging them toward more structured workflow descriptions
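
As a rough illustration of the Research Object idea from slide 17 (proposals, data, processes, workflows, and publications, all linked by persistent IDs) and of the "Replay" check from slide 15, the following Python sketch may help. It is not the Wf4Ever RO model: the class names, the `missing_kinds` and `verify` helpers, the identifier strings, and the use of SHA-256 checksums are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass, field

# Component kinds listed on the Research Object slide.
KINDS = ("proposal", "data", "process", "workflow", "publication")


@dataclass(frozen=True)
class Resource:
    """One component of an experiment (hypothetical model)."""
    persistent_id: str  # made-up identifier scheme, for illustration only
    kind: str           # one of KINDS
    sha256: str         # checksum recorded when the resource was preserved


@dataclass
class ResearchObject:
    """Aggregates resources and links them by persistent ID."""
    persistent_id: str
    resources: dict = field(default_factory=dict)

    def add(self, persistent_id: str, kind: str, content: bytes) -> Resource:
        if kind not in KINDS:
            raise ValueError(f"unknown kind: {kind}")
        digest = hashlib.sha256(content).hexdigest()
        resource = Resource(persistent_id, kind, digest)
        self.resources[persistent_id] = resource
        return resource

    def missing_kinds(self) -> list:
        """Kinds of component that have not been aggregated yet."""
        present = {r.kind for r in self.resources.values()}
        return sorted(set(KINDS) - present)

    def verify(self, persistent_id: str, content: bytes) -> bool:
        """Replay-style check: does this content match what was preserved?"""
        digest = hashlib.sha256(content).hexdigest()
        return digest == self.resources[persistent_id].sha256


# Example with invented identifiers and contents.
ro = ResearchObject("ro:example/amiga-conesearch")
ro.add("id:example/data-1", "data", b"ra,dec\n10.0,20.0\n")
ro.add("id:example/workflow-1", "workflow", b"cone search, then extract columns")
print(ro.missing_kinds())
print(ro.verify("id:example/data-1", b"ra,dec\n10.0,20.0\n"))
```

Recording a checksum at preservation time is only one possible way to check that a replayed or retrieved component still matches the preserved one; a real repository would also need to track versions, permissions, and provenance, as the considerations slides point out.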