Use Case for D-PROV:
Querying Provenance Traces Produced by
Workflows Enacted by Different Systems
   Khalid Belhajjame,
   Fernando Seabra Chirigati,
   Victor Cuevas
Context and Objective
• D-PROV is a model that captures workflow definitions, their provenance, and the
  provenance of the results obtained by executing them. It is expressive enough to capture
  workflow definitions and provenance traces that are specified in multiple workflow
  systems, in particular Kepler, Taverna and VisTrails

• D-PROV provides users with an integrated access to workflow definitions and associated
  provenance traces

• It uses (extends) the W3C PROV model to capture the provenance traces produced by the
  execution of such workflows

• The objective of this use case is to show that D-PROV users are able to query (and
  combine) provenance traces that are produced by (equivalent) workflows that are specified
  and enacted using different systems, namely Taverna and VisTrails

• Note that while in the use case we focus on two equivalent workflows, generally
  speaking, D-PROV is expected to allow users to query and combine provenance traces of
  workflows that are not necessarily equivalent.
Approach
• The approach adopted in the use case is a four-step process,
  illustrated below with the status of each step
  1. Enact the workflows within their native system (done)
  2. Export the provenance traces in the native format of the workflow systems (done)
  3. Map the workflows and associated provenance traces to D-PROV (ongoing)
  4. Query the provenance traces produced by the workflow systems using D-PROV
Workflows
We used two (equivalent) workflows specified within Taverna and
VisTrails. Both workflows implement a simple in-silico experiment for
pathway analysis. Given gene IDs, the workflows fetch the
corresponding pathways. To do so, they make use of two KEGG web
services.

[Figure: the Taverna workflow (left) and the VisTrails workflow (right)]
Provenance Traces
• The two workflows were enacted within their respective systems
  using different (yet overlapping) sets of gene IDs as inputs

• The provenance traces were then captured and exported in different
  formats
  • From the Taverna workflow, we used the PROV-O and Janus formats
  • From the VisTrails workflow, we used the native VisTrails provenance format
    (based on XML) and OPM

• The workflows and their provenance are accessible through
  myExperiment [1]

• Workflows and their provenance traces are now being mapped to
  D-PROV

[1] http://www.myexperiment.org/packs/317.html
Queries
Once the mapping is done, we would like to issue queries such as the
ones specified below against D-PROV:

• Q1: Give the pathways that were produced by the pathway analysis
  workflow (as defined within D-PROV), specifying the gene IDs that
  were used as inputs to that workflow
  • The result of this query should be the union of the pathways returned by
    the Taverna and VisTrails workflows, together with the gene IDs used as
    input to both workflows.

• Q2: Give the pathways that were produced by the Taverna
  workflow, and that are associated with gene IDs that were not used
  as input to the VisTrails workflow
  • This is a diff query
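
The intent of Q1 and Q2 can be sketched with plain set operations. This is only an illustration, not D-PROV itself: it assumes the mapped traces have already been flattened into (system, gene ID, pathway) records, and the `Derivation` type, the `q1`/`q2` helpers, and all identifiers in `TRACES` are hypothetical placeholders rather than real KEGG data.

```python
from typing import NamedTuple


class Derivation(NamedTuple):
    """One hypothetical flattened trace record (an assumption, not D-PROV syntax)."""
    system: str    # workflow system that enacted the workflow
    gene_id: str   # gene ID given as input
    pathway: str   # pathway produced for that gene ID


# Placeholder records standing in for the mapped traces; the two
# systems use different but overlapping gene IDs, as in the use case.
TRACES = [
    Derivation("Taverna", "gene-1", "pathway-A"),
    Derivation("Taverna", "gene-2", "pathway-B"),
    Derivation("VisTrails", "gene-2", "pathway-B"),
    Derivation("VisTrails", "gene-3", "pathway-C"),
]


def q1(traces):
    """Q1: union of (gene ID, pathway) pairs across both systems."""
    return {(t.gene_id, t.pathway) for t in traces}


def q2(traces):
    """Q2: Taverna pathways whose gene ID was never a VisTrails input."""
    vistrails_inputs = {t.gene_id for t in traces if t.system == "VisTrails"}
    return {
        t.pathway
        for t in traces
        if t.system == "Taverna" and t.gene_id not in vistrails_inputs
    }


if __name__ == "__main__":
    print(sorted(q1(TRACES)))
    print(sorted(q2(TRACES)))
```

With these placeholder records, Q1 returns three distinct (gene ID, pathway) pairs because the overlapping input `gene-2` is counted once, and Q2 returns only `pathway-A`, the pathway whose gene ID was used in Taverna alone.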
