The Symbiotic Nature of Provenance and Workflow




    Eric Stephan, Todd Halter

    Pacific Northwest National Laboratory
1
The Systems Science Challenge
•   Studying complex systems typically has the following characteristics:
     •    Interdisciplinary problems involving various stakeholders
     •    Leverages multiple tools, algorithms, data products, and sensors
     •    Relies on highly iterative and repetitive techniques
     •    Steps are difficult to document and are often committed to memory or notes
•   The solution is to provide:
     •    ‘Plumbing’ to more easily configure and automate integration, calculation, analysis, and visualization
     •    A historical explanation of what occurred

2
Active Computer Science Research Areas
    •   Workflows – plumbing
    •   Provenance – explanation
    •   Without a historical explanation, workflows provide capability but neglect a documentation trail of what transpired.

    •   Without plumbing, provenance is difficult to introduce generically or to support legacy applications.

3
Example Workflow Products

    •   Creating executable workflows from schematic drawings
    I. Altintas, O. Barney, Z. Cheng, T. Critchlow, B. Ludaescher, S. Parker, A. Shoshani, M. Vouk, “Accelerating the Scientific Exploration Process with Scientific Workflows”, In Journal of Physics: Conference Series, SciDAC 2006 proceedings. June 2006.




4
Example Workflow Products

    •   Constructing component based analytical pipelines on enterprise service bus technology
    [Figure: MeDICi: Middleware for Data-Intensive Computing]
    Gorton I, AS Wynne, JP Almquist, and J Chatterton. 2008. ”The MeDICi Integration Framework: A Platform for High Performance Data Streaming Applications.” In WICSA 2008. 7th IEEE/IFIP Working Conference on Software Architecture, Feb. 18-22, 2008, Vancouver, Canada, pp. 95-104. IEEE Computer Society, Los Alamitos, CA. doi:10.1109/WICSA.2008.21




5
Example of Provenance

    •   Digital Library, Lineage

    •   Extensible Open Model - Open Provenance Model
       Moreau L, B Clifford, J Freire, J Futrelle, Y Gil, P Groth, N Kwasnikowska, S Miles, P Missier, J Myers, BA Plale, YL Simmhan, EG Stephan, and J Van den Bussche. 2010. "The Open Provenance Model Core Specification (v1.1)." Future Generations Computer Systems. doi:10.1016/j.future.2010.07.005

    •   Semantic web-based Models - Proof Markup Language
       W3C Incubator Group, http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki




6
Examples of Creating Connectivity…

    •   Workflows
       •   Event listeners
       •   Self describing workflow components, flow

    •   Provenance
       •   Formally described
       •   Support for reasoning, transitive closure etc.
       •   Semantically relevant to provenance consumers.
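
One way to picture the connection between the two lists above: a workflow event listener records each completed step as formal provenance edges, and lineage questions are then answered by transitive closure over those edges. The following is a minimal Python sketch, not the design of any system cited in this deck. The class and method names (ProvenanceListener, on_step_finished, lineage) and the file names are hypothetical; only the edge labels "used" and "wasGeneratedBy" are borrowed from the Open Provenance Model vocabulary.

```python
from collections import defaultdict


class ProvenanceListener:
    """Hypothetical listener that turns workflow step events into provenance edges."""

    def __init__(self):
        # Each edge is (effect, relation, cause), pointing back toward origins.
        self.edges = []

    def on_step_finished(self, step, inputs, outputs):
        # Assumed callback, invoked by a workflow engine when a step completes.
        for artifact in inputs:
            self.edges.append((step, "used", artifact))
        for artifact in outputs:
            self.edges.append((artifact, "wasGeneratedBy", step))

    def lineage(self, node):
        """Return every step and artifact that `node` transitively depends on."""
        causes = defaultdict(set)
        for effect, _relation, cause in self.edges:
            causes[effect].add(cause)
        seen, stack = set(), [node]
        while stack:
            for cause in causes[stack.pop()]:
                if cause not in seen:
                    seen.add(cause)
                    stack.append(cause)
        return seen


listener = ProvenanceListener()
listener.on_step_finished("regrid", inputs=["raw.nc"], outputs=["gridded.nc"])
listener.on_step_finished("average", inputs=["gridded.nc"], outputs=["mean.nc"])
print(sorted(listener.lineage("mean.nc")))
```

Asking for the lineage of the final product walks back through both steps to the raw input, which is the kind of reasoning a formally described provenance record is meant to support.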




7
Existing Deficiencies
    •   Workflows
       •   Listeners only reporting syntactic events
             •   Deluge of atomic transactions

       •   Inability to convey logical constructs
             •   E.g. initialization stage

       •   Lack of support to collect logs from legacy applications
    •   Provenance
       •   Collecting naïve provenance – big graph dilemma
       •   Hardcoded – risk being out of sync with workflow
       •   Collection without end user requirements
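
To illustrate the "deluge of atomic transactions" problem and one conceivable remedy, the sketch below rolls low-level listener events up into logical stages such as initialization. The event format, the file names, and the "stage" tag are all assumptions made for illustration; a listener that reports only syntactic events would not supply such a tag, so something in the workflow framework would have to add it.

```python
from itertools import groupby

# Hypothetical atomic events as a syntactic listener might emit them,
# extended here with an assumed "stage" tag naming the logical construct.
events = [
    {"stage": "initialization", "op": "open", "target": "config.xml"},
    {"stage": "initialization", "op": "read", "target": "config.xml"},
    {"stage": "initialization", "op": "connect", "target": "sensor-db"},
    {"stage": "analysis", "op": "read", "target": "obs.csv"},
    {"stage": "analysis", "op": "write", "target": "stats.csv"},
]


def summarize(events):
    """Collapse consecutive events sharing a stage into one provenance record."""
    records = []
    for stage, group in groupby(events, key=lambda e: e["stage"]):
        group = list(group)
        records.append({
            "stage": stage,
            "event_count": len(group),
            "targets": sorted({e["target"] for e in group}),
        })
    return records


for record in summarize(events):
    print(record)
```

Five atomic events become two stage-level records, which keeps the provenance graph smaller while still naming the resources each stage touched.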


8
Interoperability Aids
    •   Applying provenance execution models to workflow listeners
       •       E.g. Describe Anything DaAPI
      Wynne AS, I Gorton, JM Chase, and EG Stephan. 2009. MeDICi: An Open Platform for Sensor Integration. PNNL-18716, Pacific Northwest National Laboratory, Richland, WA.

    •   Incorporating provenance in workflow framework
       •       Semantic Abstract Workflow (SAW)
      Leonardo Salayandia and Paulo Pinheiro da Silva. On the Use of Semantic Abstract Workflows Rooted on Provenance Concepts. Provenance and Annotation of Data and Processes. Lecture Notes in Computer Science, 2010, Volume 6378/2010, 216-220, DOI: 10.1007/978-3-642-17819-1_24




9
Interoperability Aids
     •   Advanced storage –
               •       Grids, Semantic Wikis

     •   New Provenance Model Abstractions
     Stephan EG, TD Halter, and BD Ermold. 2010. "Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate Research." In The 3rd International Provenance and Annotation Workshop (IPAW'2010).

     Gibson TD, KL Schuchardt, and EG Stephan. 2009. "Application of Named Graphs Towards Custom Provenance Views." In 1st Workshop on the Theory and Practice of Provenance (TaPP '09), p. Paper No. 5. USENIX, Berkeley, CA.


10
Conclusions

     •   Good news - Workflow and provenance interoperability is evolving.

     •   Challenge #1: Recognizing existence of symbiotic relationship between Workflow and Provenance.

     •   Challenge #2: Finding new ways to harness this relationship to advance systems science research.

11
Questions?

     •   Contact: eric.stephan@pnl.gov




12

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

The Symbiotic Nature of Provenance and Workflow

  • 1. The Symbiotic Nature of Provenance and Workflow
      Eric Stephan, Todd Halter
      Pacific Northwest National Laboratory
  • 2. The Systems Science Challenge
      • Studying complex systems typically has the following characteristics:
          • Interdisciplinary problems involving various stakeholders
          • Leverages multiple tools, algorithms, data products, and sensors
          • Reliant on highly iterative and repetitive techniques
          • Steps are difficult to document and are oftentimes committed to memory or notes
      • The solution is to provide:
          • 'Plumbing' to more easily configure and automate integration, calculation, analysis, and visualization
          • A historical explanation of what occurred
  • 3. Active Computer Science Research Areas
      • Workflows: plumbing
      • Provenance: explanation
      • Without a historical explanation, workflows provide capability but neglect a documentation trail of what transpired.
      • Without plumbing, provenance is difficult to introduce generically or to support in legacy applications.
  • 4. Example Workflow Products
      • Creating executable workflows from schematic drawings
      I. Altintas, O. Barney, Z. Cheng, T. Critchlow, B. Ludaescher, S. Parker, A. Shoshani, M. Vouk, "Accelerating the Scientific Exploration Process with Scientific Workflows", in Journal of Physics: Conference Series, SciDAC 2006 proceedings. June 2006.
  • 5. Example Workflow Products
      • Constructing component-based analytical pipelines on enterprise service bus technology
      • MeDICi: Middleware for Data-Intensive Computing
      Gorton I, AS Wynne, JP Almquist, and J Chatterton. 2008. "The MeDICi Integration Framework: A Platform for High Performance Data Streaming Applications." In WICSA 2008. 7th IEEE/IFIP Working Conference on Software Architecture, Feb. 18-22, 2008, Vancouver, Canada, pp. 95-104. IEEE Computer Society, Los Alamitos, CA. doi:10.1109/WICSA.2008.21
  • 6. Example of Provenance
      • Digital Library, Lineage
      • Extensible open model: Open Provenance Model
      Moreau L, B Clifford, J Freire, J Futrelle, Y Gil, P Groth, N Kwasnikowska, S Miles, P Missier, J Myers, BA Plale, YL Simmhan, EG Stephan, and J Van den Bussche. 2010. "The Open Provenance Model Core Specification (v1.1)." Future Generations Computer Systems. doi:10.1016/j.future.2010.07.005
      • Semantic web-based models: Proof Markup Language
      W3C Incubator Group, http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki
  • 7. Examples of Creating Connectivity…
      • Workflows
          • Event listeners
          • Self-describing workflow components, flow
      • Provenance
          • Formally described
          • Support for reasoning, transitive closure, etc.
          • Semantically relevant to provenance consumers
  • 8. Existing Deficiencies
      • Workflows
          • Listeners only reporting syntactic events
          • Deluge of atomic transactions
          • Inability to convey logical constructs, e.g. an initialization stage
          • Lack of support to collect logs from legacy applications
      • Provenance
          • Collecting naïve provenance: the big graph dilemma
          • Hardcoded: risks being out of sync with the workflow
          • Collection without end user requirements
  • 9. Interoperability Aids
      • Applying provenance execution models to workflow listeners
          • E.g. Describe Anything DaAPI
      Wynne AS, I Gorton, JM Chase, and EG Stephan. 2009. MeDICi: An Open Platform for Sensor Integration. PNNL-18716, Pacific Northwest National Laboratory, Richland, WA.
      • Incorporating provenance in workflow framework
          • Semantic Abstract Workflow (SAW)
      Leonardo Salayandia and Paulo Pinheiro da Silva. "On the Use of Semantic Abstract Workflows Rooted on Provenance Concepts." Provenance and Annotation of Data and Processes. Lecture Notes in Computer Science, 2010, Volume 6378/2010, 216-220, DOI: 10.1007/978-3-642-17819-1_24
  • 10. Interoperability Aids
      • Advanced storage
          • Grids, Semantic Wikis
      • New Provenance Model Abstractions
      Stephan EG, TD Halter, and BD Ermold. 2010. "Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate Research." In The 3rd International Provenance and Annotation Workshop (IPAW'2010).
      Gibson TD, KL Schuchardt, and EG Stephan. 2009. "Application of Named Graphs Towards Custom Provenance Views." In 1st Workshop on the Theory and Practice of Provenance (TaPP '09), p. Paper No. 5. USENIX, Berkeley, CA.
  • 11. Conclusions
      • Good news: workflow and provenance interoperability is evolving.
      • Challenge #1: Recognizing the existence of a symbiotic relationship between workflow and provenance.
      • Challenge #2: Finding new ways to harness this relationship to advance systems science research.
  • 12. Questions?
      • Contact: eric.stephan@pnl.gov
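
Slide 7 pairs workflow event listeners with provenance that supports reasoning such as transitive closure. As an illustration only, the sketch below shows one way those two ideas could meet: a listener records which outputs each workflow step derived from which inputs, and a lineage query walks that record transitively. The class name ProvenanceListener, the on_step_finished callback, the run_workflow helper, and the file names are all hypothetical; they are not taken from MeDICi, the Open Provenance Model, or any other system cited in the slides.

```python
from collections import defaultdict


class ProvenanceListener:
    """Hypothetical workflow listener that records derivation edges."""

    def __init__(self):
        # artifact -> set of artifacts it was directly derived from
        self.derived_from = defaultdict(set)
        # artifact -> name of the step that generated it
        self.generated_by = {}

    def on_step_finished(self, step, inputs, outputs):
        """Callback the workflow engine invokes after each step (assumed API)."""
        for out in outputs:
            self.derived_from[out].update(inputs)
            self.generated_by[out] = step

    def lineage(self, artifact):
        """Return every upstream artifact via transitive closure."""
        seen = set()
        stack = [artifact]
        while stack:
            current = stack.pop()
            for parent in self.derived_from.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


def run_workflow(steps, listener):
    """Stand-in for a workflow engine: only reports each step to the listener."""
    for name, inputs, outputs in steps:
        listener.on_step_finished(name, inputs, outputs)


if __name__ == "__main__":
    listener = ProvenanceListener()
    run_workflow(
        [
            ("calibrate", ["raw.nc"], ["cal.nc"]),
            ("average", ["cal.nc", "mask.nc"], ["mean.nc"]),
        ],
        listener,
    )
    print(sorted(listener.lineage("mean.nc")))
```

Because the listener is fed by the workflow itself rather than hardcoded separately, the recorded lineage cannot drift out of sync with the steps that actually ran, which is one of the deficiencies slide 8 lists.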