SlideShare a Scribd company logo
1 of 15
Download to read offline
Research Objects
     in Wf4Ever
        Jose Enrique Ruiz
           jer@iaa.es
      On behalf of the Wf4Ever Team

            October 25th 2012
2012 IVOA Fall Interop Meeting - Sao Paolo



                                             1
Wf4Ever
                                                          E-SCIENCE
Wf4Ever                                                2011 - 2013
Advanced Workflow Preservation Technologies for Enhanced Science

                         1.    Intelligent Software Components (ISOCO, Spain)
                         2.    University of Manchester (UNIMAN, UK)
     2
              7          3.    Universidad Politécnica de Madrid (UPM, Spain)
          5       4      4.    Poznan Supercomputing and Networking Centre (Poland)
                         5.    University of Oxford and OeRC (OXF, UK)

     1                   6.  Instituto Astrofísica Andalucía (IAA-CSIC, Spain)
      3
      6                  7.  Leiden University Medical Centre (LUMC, NL)




                                                                                      2
Astronomy Research Lifecycle

Astronomy research lifecycle is entirely digital

»    Observation proposals
»    Data reduction pipelines
»    Analysis of science ready data
»    Catalogs of objects and data
»    Publish process
      ›  Final data results
      ›  Experiment in DL
         ADS/arXiv

     Reproducible research is still not       A normalized preservation of
        possible in a digital world             methodology is needed

 Efficient use of rich data infrastructure                            Tools
          (VO) may be improved
                                                                              3
Efficiency and Reuse


Optimize return on investments made on big facilities
»  Avoid duplication of efforts and reinvention
»  How to discover and not duplicate ?
»  How to re-use and not duplicate ?
»  How to make use of best practices ?
»  How to use the rich infrastructure of data ?
»  Intellectual contributions are encoded in softw

More data in archives does not imply more knowledge
»  Time has come to go beyond the PDF
»  Expose complete scientific record, not the story
»  Allow easy discovery of methods and tools




                                                                           4
Reproducibility: documenting and sharing




I don’t know how
                    Tools




                                       5
Research Objects in Wf4Ever




                                  Multi Workflow Centric




Technical Objects    Social Objects
   Distributed                                        6
Research Objects in Wf4Ever

RO Content
  ›    Process (workflows), data, external resources and bibliography
  ›    Execution environment set-up and local software dependencies
  ›    Experimental protocol followed
  ›    Roles, types and relationships among all digital components
  ›    Provenance of intermediate and final results
  ›    Decomposable attribution and authoring
  ›    Fine-grained access control and permissions
  ›    Example datasets for demonstration, reproducibility, monitoring, etc

RO Template
  ›  Placeholders to ease the aggregation process
  ›  Completeness checking/quality assessment
                                                                              7
Research Objects in Wf4Ever


Semantic Annotations
»    Author of an annotation
»    Author and co-authors of a workflow; reference link to a re-used workflow and its author
»    Who has performed the execution of a workflow leading to the results provided in the RO
»    Computing execution environment of the RO and local software dependencies
»    Special access requirements to web services
»    Datasets provider: person, webpage, survey, data release, etc.
»    How much time does it take to run a workflow using the full data and the provided subsample
»    The number of elements of the sample dataset where one workflow and/or RO iterates
»    Previous and subsequent workflows to be executed, as in the experimental protocol
»    Research institution, country, and scientific domain of the RO
»    The actual size of the RO and/or a folder
»    The version of a workflow



                                                                                                   8
Research Object Wf4Ever Semantic Model




                           DataLink
                                      9
Research Object Golden Exemplar

Luminosity Profiles RO

                                                  1010 Files, 200 MB
                                             External Sources ~ 8 GB




5 Main Workflows, 14 Nested Workflows, 25 Scripts, 11 Configuration files
10 Software dependencies, 1 Web Service

Dataset: 90 galaxies observed in 3 bands

                                                                            10
Incentives
Reproducibility
When organization is better than automation




                                                      11
Incentives
   !
Credit and attribution
 !
Papers with data links are cited more than those without




 Effect of E-printing on Citation Rates in Astronomy and Physics
 2006. Edwin A. Henneken et al.
                                                                           12
Research Object Digital Library Architecture



User
Clients




Extension
Services




Foundation
Services



                                                       13
Research Object Digital Library Architecture




                                          14
Research Objects in Astronomy

ADSLabs Research Objects

ADO Linked Components
»    Authors
»    Publications
»    Journals
»    Objects SIMBAD
»    Tabular data behind the plots CDS
»    ASCL reference of used software
»    Observing time Proposals
»    Used facilities, surveys or missions




                                                                   15

More Related Content

What's hot

A Recommender Story: Improving Backend Data Quality While Reducing Costs
A Recommender Story: Improving Backend Data Quality While Reducing CostsA Recommender Story: Improving Backend Data Quality While Reducing Costs
A Recommender Story: Improving Backend Data Quality While Reducing Costs
Databricks
 

What's hot (20)

Velocity cubes of galaxies
Velocity cubes of galaxiesVelocity cubes of galaxies
Velocity cubes of galaxies
 
Big Data Modeling Challenges and Machine Learning with No Code
Big Data Modeling Challenges and Machine Learning with No CodeBig Data Modeling Challenges and Machine Learning with No Code
Big Data Modeling Challenges and Machine Learning with No Code
 
What to Expect of the LSST Archive: The LSST Science Platform
What to Expect of the LSST Archive: The LSST Science PlatformWhat to Expect of the LSST Archive: The LSST Science Platform
What to Expect of the LSST Archive: The LSST Science Platform
 
Introduction NL-HUG (April)
Introduction NL-HUG (April)Introduction NL-HUG (April)
Introduction NL-HUG (April)
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and Jupyter
 
Accelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneAccelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundane
 
A Biological Internet?: Eywa
A Biological Internet?: EywaA Biological Internet?: Eywa
A Biological Internet?: Eywa
 
Reproducible Research and the Cloud
Reproducible Research and the CloudReproducible Research and the Cloud
Reproducible Research and the Cloud
 
ieee cloud 2015 keynote talk
ieee cloud 2015 keynote talkieee cloud 2015 keynote talk
ieee cloud 2015 keynote talk
 
Accelerating your research with Microsoft Azure
Accelerating your research with Microsoft AzureAccelerating your research with Microsoft Azure
Accelerating your research with Microsoft Azure
 
Accelerating your Research with Microsoft Azure (June 2015)
Accelerating your Research with Microsoft Azure (June 2015)Accelerating your Research with Microsoft Azure (June 2015)
Accelerating your Research with Microsoft Azure (June 2015)
 
A Recommender Story: Improving Backend Data Quality While Reducing Costs
A Recommender Story: Improving Backend Data Quality While Reducing CostsA Recommender Story: Improving Backend Data Quality While Reducing Costs
A Recommender Story: Improving Backend Data Quality While Reducing Costs
 
Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...
 
Doing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis GannonDoing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis Gannon
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in Cytoscape
 
Keynote IEEE International Workshop on Cloud Analytics. Dennis Gannon
Keynote IEEE International Workshop on Cloud Analytics. Dennis  GannonKeynote IEEE International Workshop on Cloud Analytics. Dennis  Gannon
Keynote IEEE International Workshop on Cloud Analytics. Dennis Gannon
 
Cloud com foster december 2010
Cloud com foster december 2010Cloud com foster december 2010
Cloud com foster december 2010
 
A New Partnership for Cross-Scale, Cross-Domain eScience
A New Partnership for Cross-Scale, Cross-Domain eScienceA New Partnership for Cross-Scale, Cross-Domain eScience
A New Partnership for Cross-Scale, Cross-Domain eScience
 
Research Automation for Data-Driven Discovery
Research Automationfor Data-Driven DiscoveryResearch Automationfor Data-Driven Discovery
Research Automation for Data-Driven Discovery
 

Similar to Research Objects in Wf4Ever

Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
Carole Goble
 
Curation and Characterization of Web Services
Curation and Characterization of Web ServicesCuration and Characterization of Web Services
Curation and Characterization of Web Services
Jose Enrique Ruiz
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
Carole Goble
 

Similar to Research Objects in Wf4Ever (20)

Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011
 
2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
 
Metadata for Research Objects
Metadata for Research ObjectsMetadata for Research Objects
Metadata for Research Objects
 
Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
 
Scientific data management from the lab to the web
Scientific data management   from the lab to the webScientific data management   from the lab to the web
Scientific data management from the lab to the web
 
2013-01-17 Research Object
2013-01-17 Research Object2013-01-17 Research Object
2013-01-17 Research Object
 
OAI7 Research Objects
OAI7 Research ObjectsOAI7 Research Objects
OAI7 Research Objects
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
From Open Access to Open Standards, (Linked) Data and Collaborations
From Open Access to Open Standards, (Linked) Data and CollaborationsFrom Open Access to Open Standards, (Linked) Data and Collaborations
From Open Access to Open Standards, (Linked) Data and Collaborations
 
Curation and Characterization of Web Services
Curation and Characterization of Web ServicesCuration and Characterization of Web Services
Curation and Characterization of Web Services
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
 
User engagement in research data curation
User engagement in research data curationUser engagement in research data curation
User engagement in research data curation
 

More from Jose Enrique Ruiz

Wf4Ever: Workflow Preservation
Wf4Ever: Workflow PreservationWf4Ever: Workflow Preservation
Wf4Ever: Workflow Preservation
Jose Enrique Ruiz
 
Use of CharDM in an archive of velocity cubes
Use of CharDM in an archive of velocity cubesUse of CharDM in an archive of velocity cubes
Use of CharDM in an archive of velocity cubes
Jose Enrique Ruiz
 
VO web-services-based astronomy workflows
VO web-services-based astronomy workflowsVO web-services-based astronomy workflows
VO web-services-based astronomy workflows
Jose Enrique Ruiz
 
Web services based workflows to deal with 3D data
Web services based workflows to deal with 3D dataWeb services based workflows to deal with 3D data
Web services based workflows to deal with 3D data
Jose Enrique Ruiz
 
Curating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital ExperimentsCurating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital Experiments
Jose Enrique Ruiz
 
Collaborative Digital Experiments
Collaborative Digital ExperimentsCollaborative Digital Experiments
Collaborative Digital Experiments
Jose Enrique Ruiz
 
El Observatorio Virtual - eCA
El Observatorio Virtual - eCAEl Observatorio Virtual - eCA
El Observatorio Virtual - eCA
Jose Enrique Ruiz
 
Multidimensional Data in the VO
Multidimensional Data in the VOMultidimensional Data in the VO
Multidimensional Data in the VO
Jose Enrique Ruiz
 
B0DEGA 3D VO Archive - IVOA 2010 Fall Interop
B0DEGA 3D VO Archive - IVOA 2010 Fall InteropB0DEGA 3D VO Archive - IVOA 2010 Fall Interop
B0DEGA 3D VO Archive - IVOA 2010 Fall Interop
Jose Enrique Ruiz
 
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science iWf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
Jose Enrique Ruiz
 

More from Jose Enrique Ruiz (14)

Jupyter notebooks on steroids
Jupyter notebooks on steroidsJupyter notebooks on steroids
Jupyter notebooks on steroids
 
Digital Science
Digital ScienceDigital Science
Digital Science
 
Wf4Ever: Workflow Preservation
Wf4Ever: Workflow PreservationWf4Ever: Workflow Preservation
Wf4Ever: Workflow Preservation
 
Use of CharDM in an archive of velocity cubes
Use of CharDM in an archive of velocity cubesUse of CharDM in an archive of velocity cubes
Use of CharDM in an archive of velocity cubes
 
Workflow Preservation
Workflow PreservationWorkflow Preservation
Workflow Preservation
 
VO web-services-based astronomy workflows
VO web-services-based astronomy workflowsVO web-services-based astronomy workflows
VO web-services-based astronomy workflows
 
Web services based workflows to deal with 3D data
Web services based workflows to deal with 3D dataWeb services based workflows to deal with 3D data
Web services based workflows to deal with 3D data
 
Curating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital ExperimentsCurating and Preserving Collaborative Digital Experiments
Curating and Preserving Collaborative Digital Experiments
 
Collaborative Digital Experiments
Collaborative Digital ExperimentsCollaborative Digital Experiments
Collaborative Digital Experiments
 
SVO Activities - SEA 2008
SVO Activities - SEA 2008SVO Activities - SEA 2008
SVO Activities - SEA 2008
 
El Observatorio Virtual - eCA
El Observatorio Virtual - eCAEl Observatorio Virtual - eCA
El Observatorio Virtual - eCA
 
Multidimensional Data in the VO
Multidimensional Data in the VOMultidimensional Data in the VO
Multidimensional Data in the VO
 
B0DEGA 3D VO Archive - IVOA 2010 Fall Interop
B0DEGA 3D VO Archive - IVOA 2010 Fall InteropB0DEGA 3D VO Archive - IVOA 2010 Fall Interop
B0DEGA 3D VO Archive - IVOA 2010 Fall Interop
 
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science iWf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Recently uploaded (20)

DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Research Objects in Wf4Ever

  • 1. Research Objects in Wf4Ever Jose Enrique Ruiz jer@iaa.es On behalf of the Wf4Ever Team October 25th 2012 2012 IVOA Fall Interop Meeting - Sao Paolo 1
  • 2. Wf4Ever E-SCIENCE Wf4Ever 2011 - 2013 Advanced Workflow Preservation Technologies for Enhanced Science 1.  Intelligent Software Components (ISOCO, Spain) 2.  University of Manchester (UNIMAN, UK) 2 7 3.  Universidad Politécnica de Madrid (UPM, Spain) 5 4 4.  Poznan Supercomputing and Networking Centre (Poland) 5.  University of Oxford and OeRC (OXF, UK) 1 6.  Instituto Astrofísica Andalucía (IAA-CSIC, Spain) 3 6 7.  Leiden University Medical Centre (LUMC, NL) 2
  • 3. Astronomy Research Lifecycle Astronomy research lifecycle is entirely digital »  Observation proposals »  Data reduction pipelines »  Analysis of science ready data »  Catalogs of objects and data »  Publish process ›  Final data results ›  Experiment in DL ADS/arXiv Reproducible research is still not A normalized preservation of possible in a digital world methodology is needed Efficient use of rich data infrastructure Tools (VO) may be improved 3
  • 4. Efficiency and Reuse Optimize return on investments made on big facilities »  Avoid duplication of efforts and reinvention »  How to discover and not duplicate ? »  How to re-use and not duplicate ? »  How to make use of best practices ? »  How to use the rich infrastructure of data ? »  Intellectual contributions are encoded in softw More data in archives does not imply more knowledge »  Time has come to go beyond the PDF »  Expose complete scientific record, not the story »  Allow easy discovery of methods and tools 4
  • 5. Reproducibility: documenting and sharing I don’t know how Tools 5
  • 6. Research Objects in Wf4Ever Multi Workflow Centric Technical Objects Social Objects Distributed 6
  • 7. Research Objects in Wf4Ever RO Content ›  Process (workflows), data, external resources and bibliography ›  Execution environment set-up and local software dependencies ›  Experimental protocol followed ›  Roles, types and relationships among all digital components ›  Provenance of intermediate and final results ›  Decomposable attribution and authoring ›  Fine-grained access control and permissions ›  Example datasets for demonstration, reproducibility, monitoring, etc RO Template ›  Placeholders to ease the aggregation process ›  Completeness checking/quality assessment 7
  • 8. Research Objects in Wf4Ever Semantic Annotations »  Author of an annotation »  Author and co-authors of a workflow; reference link to a re-used workflow and its author »  Who has performed the execution of a workflow leading to the results provided in the RO »  Computing execution environment of the RO and local software dependencies »  Special access requirements to web services »  Datasets provider: person, webpage, survey, data release, etc. »  How much time does it take to run a workflow using the full data and the provided subsample »  The number of elements of the sample dataset where one workflow and/or RO iterates »  Previous and subsequent workflows to be executed, as in the experimental protocol »  Research institution, country, and scientific domain of the RO »  The actual size of the RO and/or a folder »  The version of a workflow 8
  • 9. Research Object Wf4Ever Semantic Model DataLink 9
  • 10. Research Object Golden Exemplar Luminosity Profiles RO 1010 Files, 200 MB External Sources ~ 8 GB 5 Main Workflows, 14 Nested Workflows, 25 Scripts, 11 Configuration files 10 Software dependencies, 1 Web Service Dataset: 90 galaxies observed in 3 bands 10
  • 12. Incentives ! Credit and attribution ! Papers with data links are cited more than those without Effect of E-printing on Citation Rates in Astronomy and Physics 2006. Edwin A. Henneken et al. 12
  • 13. Research Object Digital Library Architecture User Clients Extension Services Foundation Services 13
  • 14. Research Object Digital Library Architecture 14
  • 15. Research Objects in Astronomy ADSLabs Research Objects ADO Linked Components »  Authors »  Publications »  Journals »  Objects SIMBAD »  Tabular data behind the plots CDS »  ASCL reference of used software »  Observing time Proposals »  Used facilities, surveys or missions 15