1. SHARP: Harmonizing Galaxy and Taverna workflow provenance
SeWeBMeDA’17 - Demonstration
Alban Gaignard (1), Khalid Belhajjame (2), Hala Skaf-Molli (3)
May 28, 2017
(1) Nantes Academic Hospital, France
(2) LAMSADE Paris-Dauphine University, France
(3) LS2N - Nantes University, France
2. Multiple workflow engines
[Figure: a Taverna workflow (@research-lab) and a Galaxy workflow (@sequencing-facility). Labels in the figure: Variant effect prediction; VCF file; Exon filtering; output; Merge; Alignment of samples 1.a.R1/1.a.R2, 1.b.R1/1.b.R2 and 2.R1/2.R2; Sort (twice); Variant calling; GRCh37.]
A. Gaignard, K. Belhajjame, H. Skaff Molli – SeWeBMeDA’17 1
7. Demonstration scenario
① Provenance capture
② Provenance interlinking
③ Provenance harmonization
④ Provenance summarization (influence graphs, nanopublications)
• https://github.com/albangaignard/galaxy-PROV
• https://github.com/albangaignard/sharp-prov-toolbox
8. ① Provenance capture
Taverna
Built in: provenance is captured when workflow execution results are saved.
Galaxy
GALAXY-PROV tool + web interface:
• API key
• list Galaxy data processing histories
• generate PROV (turtle)
• visualize PROV (D3.js)
https://github.com/albangaignard/galaxy-PROV
12. ② Provenance interlinking
1. Compute a SHA-512 fingerprint of each file
2. Annotate PROV entities with their SHA-512 digest
3. Produce owl:sameAs links between entities sharing a digest, with a SPARQL CONSTRUCT/WHERE query
Command line tool
java -jar SharpProvToolbox/target/SHARP-1.0-SNAPSHOT-launcher.jar \
    -ri sample-data/control_mm9_chr15_Plekhh2-PigF_forward.fastq \
        sample-data/control_mm9_chr15_Plekhh2-PigF_reverse.fastq \
        sample-data/drugged_mm9_chr15_Plekhh2-PigF_forward.fastq \
        sample-data/drugged_mm9_chr15_Plekhh2-PigF_reverse.fastq \
        sample-data/unknown.fastq
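The three interlinking steps can be illustrated with a minimal Python sketch. It is not the toolbox's implementation: the toolbox produces the links with a SPARQL CONSTRUCT/WHERE query, whereas this sketch groups entities by digest in plain Python. The function names and the entity IRIs in the usage note are hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_as_links(entity_digests):
    """Pair up entities that share a digest.

    entity_digests maps an entity IRI to the hex digest of its file
    (hypothetical input shape). Returns sorted (iri_a, iri_b) pairs,
    one per owl:sameAs link.
    """
    by_digest = defaultdict(list)
    for iri, digest in entity_digests.items():
        by_digest[digest].append(iri)
    links = []
    for iris in by_digest.values():
        links.extend(combinations(sorted(iris), 2))
    return sorted(links)


def to_turtle(links):
    """Serialize the links as Turtle owl:sameAs statements."""
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> ."]
    lines += [f"<{a}> owl:sameAs <{b}> ." for a, b in links]
    return "\n".join(lines)
```

For example, if a Galaxy entity and a Taverna entity (say `http://example.org/galaxy/e1` and `http://example.org/taverna/e7`, both made up) carry the same digest, `same_as_links` returns one pair for them and `to_turtle` prints the corresponding owl:sameAs statement.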
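14. ③ Provenance harmonization
1. OWL entailments, via the Jena API:
ReasonerRegistry.getOWLMiniReasoner()
2. PROV inferences (tuple-generating dependencies, TGD), via the Jena rule engine:
new GenericRuleReasoner(rules) // rules: the list of all PROV rules
3. Blank node removal (equality-generating dependencies, EGD)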
Command line tool
java -jar SharpProvToolbox/target/SHARP-1.0-SNAPSHOT-launcher.jar \
    -i sample-data/taverna.prov.ttl \
       sample-data/galaxy.prov.ttl \
       sample-data/sameas.ttl
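To give an intuition of what harmonization achieves, here is a toy Python sketch. It does not use Jena and does not reproduce the toolbox's rule set. It assumes triples are plain (subject, predicate, object) tuples of IRI strings, merges nodes linked by owl:sameAs into one representative (a simplification: an OWL reasoner would instead materialize the equivalent statements), and applies a single kind of inference, namely that prov:used, prov:wasGeneratedBy and prov:wasDerivedFrom are sub-properties of prov:wasInfluencedBy in PROV-O. All function names are hypothetical.

```python
PROV = "http://www.w3.org/ns/prov#"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# PROV-O declares these as sub-properties of prov:wasInfluencedBy.
INFLUENCE_SUBPROPERTIES = {
    PROV + "used",
    PROV + "wasGeneratedBy",
    PROV + "wasDerivedFrom",
}


def canonical_map(triples):
    """Union-find over owl:sameAs links; maps each node to one representative."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Keep the lexicographically smaller IRI as representative.
                low, high = sorted((root_s, root_o))
                parent[high] = low
    return {node: find(node) for node in list(parent)}


def harmonize(triples):
    """Merge sameAs-equivalent nodes, then add inferred influence triples."""
    canon = canonical_map(triples)
    merged = {
        (canon.get(s, s), p, canon.get(o, o))
        for s, p, o in triples
        if p != OWL_SAME_AS
    }
    inferred = {
        (s, PROV + "wasInfluencedBy", o)
        for s, p, o in merged
        if p in INFLUENCE_SUBPROPERTIES
    }
    return merged | inferred
```

After this step, a Taverna activity that used a file and the Galaxy activity that generated the same file are connected through a single node, with both relations also expressed as prov:wasInfluencedBy.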
16. ④ Provenance summarization: influence graph
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

CONSTRUCT {
    ?x ?p ?y .
    ?x rdfs:label ?lx .
    ?y rdfs:label ?ly .
} WHERE {
    ?x ?p ?y .
    FILTER (?p IN (prov:wasInfluencedBy)) .
    ?x rdfs:label ?lx .
    ?y rdfs:label ?ly .
}
+ HTML/D3.js code generation
Command line tool
java -jar SharpProvToolbox/target/SHARP-1.0-SNAPSHOT-launcher.jar \
    -i sample-data/taverna.prov.ttl \
       sample-data/galaxy.prov.ttl \
       sample-data/sameas.ttl \
    -s
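The effect of the CONSTRUCT query can be mimicked in a few lines of Python, as a rough sketch only. It assumes triples are (subject, predicate, object) tuples and that each node has at most one label, given as a dictionary. The node-link output shape is a common input format for D3.js force layouts, but it is an assumption here, not the format the toolbox actually generates. Function names are hypothetical.

```python
import json

WAS_INFLUENCED_BY = "http://www.w3.org/ns/prov#wasInfluencedBy"


def influence_graph(triples, labels):
    """Keep labelled prov:wasInfluencedBy edges as a node-link structure.

    Like the query, an edge is kept only if both ends have a label.
    """
    links = [
        {"source": s, "target": o}
        for s, p, o in triples
        if p == WAS_INFLUENCED_BY and s in labels and o in labels
    ]
    node_ids = sorted({end for link in links for end in link.values()})
    nodes = [{"id": node, "label": labels[node]} for node in node_ids]
    return {"nodes": nodes, "links": links}


def to_d3_json(graph):
    """Serialize the node-link structure for a D3.js visualization."""
    return json.dumps(graph, indent=2)
```

Every other predicate is dropped, which is what makes the result a summary: only the chain of influences between labelled nodes remains.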
18. ④ Provenance summarization: nanopublication
CONSTRUCT {
    GRAPH :assertion {
        ?ref_genome a sio:Genome .
        ?sample a sio:Sample ;
            sio:is-variant-of ?ref_genome ;
            sio:has-phenotype ?out .
        [...]
    }
} WHERE {
    [...] ?out ( prov:wasInfluencedBy )+ ?sample . [...]
}
Command line tool
java -jar SharpProvToolbox/target/SHARP-1.0-SNAPSHOT-launcher.jar \
    -i sample-data/taverna.prov.ttl \
       sample-data/galaxy.prov.ttl \
       sample-data/sameas.ttl \
    -sq sample-data/nanopub.query
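The key construct in the WHERE clause is the property path ( prov:wasInfluencedBy )+, which matches chains of one or more influence edges. The following Python sketch shows what that path computes, assuming triples are (subject, predicate, object) tuples; it is an illustration of the semantics, not of how a SPARQL engine evaluates the path. The function name is hypothetical.

```python
from collections import defaultdict, deque


def reachable(triples, start, predicate):
    """Return all nodes reachable from start via one or more predicate edges."""
    successors = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            successors[s].add(o)

    seen = set()
    queue = deque(successors[start])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # Follow further edges; already visited nodes are skipped.
        queue.extend(successors[node] - seen)
    return seen
```

With an output that was influenced by an intermediate file, itself influenced by a sample, the sample is in the result for the output. This is how the query can relate a final result to its original sample across both workflows.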