See presentation at http://www.slideshare.net/soilandreyes/2011-0716-scufl2-because-a-workflow-is-more-than-its-definition-bosc-2011
From BOSC 2011 - http://www.open-bio.org/wiki/BOSC_2011_Schedule
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Scufl2 - A Self-Documenting Workflow Format
1. http://www.taverna.org.uk/
Project site:
Scufl2 – because a workflow
http://www.taverna.org.uk/
Source code:
is more than its definition
https://github.com/mygrid/scufl2
Stian Soiland-Reyes, Alan R Williams, Stuart Owen, David Withers and Carole Goble
http://taverna.googlecode.com/
School of Computer Science, University of Manchester, UK
License: GNU Lesser General Public {stian.soiland-reyes, alan.r.williams, stuart.owen, david.withers, carole.a.goble}@manchester.ac.uk
License (LGPL) 2.1
“Structured” zip-file • Taverna’s future workflow and data format
Self-documenting media
(can be unpacked to be • Goal: Simplify third-party reading, writing,
exposed on the web)
type (OpenOffice ODF,
ePub OCF, Adobe UCF) – annotating and extending
•
for tools like file(1) and
mime magic Scufl2: specification, schema, ontology,
Java API and conversion tool
the-workflow-bundle.scufl2
mimetype Manifest listing all
resources in
application/vnd.taverna.scufl2.workflow-bundle
META-INF/manifest.xml bundle with media
META-INF/container.xml types
<manifest:manifest
xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0
">
<container version="1.0“ xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
<manifest:file-entry manifest:media-
<rootfiles>
type="application/vnd.taverna.scufl2.workflow-bundle"
<rootfile full-path="workflowBundle.rdf" media-type="application/rdf+xml" /> manifest:full-path="/"/>
<!-- <rootfile full-path="workflowBundle.ttl" <!– Standard resources -->
media-type="text/turtle" /> Alternative repr. --> <manifest:file-entry manifest:media-type="application/rdf+xml"
</rootfiles> manifest:full-path="workflowBundle.rdf"/>
</container> <manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="workflow/HelloWorld.rdf"/>
<manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="profile/tavernaWorkbench.rdf"/>
<manifest:file-entry manifest:media-type="application/rdf+xml"
workflowBundle.rdf manifest:full-path="profile/tavernaServer.rdf"/>
<!– Any additional resources -->
<manifest:file-entry manifest:media-type="application/rdf+xml"
<rdf:RDF xmlns="http://ns.taverna.org.uk/2010/scufl2#" xmlns:rdf=".." manifest:full-path="annotation/user_annotations.rdf"/>
xmlns:rdfs=".." xmlns:xsi=".."
xsi:schemaLocation="http://ns.taverna.org.uk/2010/scufl2#
<manifest:file-entry manifest:media-type="application/rdf+xml"
manifest:full-path="annotation/myExperiment-wf-765.rdf"/>
Definition of
<manifest:file-entry manifest:media-type="image/png" manifest:full-
http://ns.taverna.org.uk/2010/scufl2/scufl2.xsd .."
xsi:type="WorkflowBundleDocument" xml:base="./"> path="Thumbnails/thumbnail.png"/>
<manifest:file-entry manifest:media-type="image/svg+xml" manifest:full-
workflow structure.
<WorkflowBundle rdf:about="">
<name>HelloWorld</name>
<sameBaseAs rdf:resource=
path="diagram/workflow/HelloWorld.svg"/>
</manifest:manifest> Nested workflows
"http://ns.taverna.org.uk/2010/workflowBundle/28f7c..a0ef731/" />
<mainWorkflow rdf:resource="workflow/HelloWorld/" />
<workflow>
are separate
<Workflow rdf:about="workflow/HelloWorld/">
<rdfs:seeAlso rdf:resource="workflow/HelloWorld.rdf" /> resources.
</Workflow> <!-- plus each nested <Workflow> -->
</workflow>
<mainProfile rdf:resource="profile/tavernaWorkbench/" /> workflow/HelloWorld.rdf
<profile>
<Profile rdf:about="profile/tavernaWorkbench/">
<rdfs:seeAlso rdf:resource="profile/tavernaWorkbench.rdf" /> <rdf:RDF xmlns="http://ns.taverna.org.uk/2010/scufl2#"
</Profile> <!– plus optional alternative <Profile>--> xmlns:rdf=".." xmlns:owl=".." xmlns:rdfs=".." xmlns:xsi=".."
xsi:schemaLocation="http://ns.taverna.org.uk/2010/scufl2#
</profile> http://ns.taverna.org.uk/2010/scufl2/scufl2.xsd ..
<rdfs:seeAlso rdf:resource="annotation/user_annotations.rdf" />
Profile gives implementation <rdfs:seeAlso rdf:resource="annotation/myExperiment-wf-765.rdf" />
xsi:type="WorkflowDocument" xml:base="HelloWorld/">
<Workflow rdf:about="">
</WorkflowBundle> <name>HelloWorld</name>
bindings, alternative profiles </rdf:RDF> <workflowIdentifier
rdf:resource="http://ns.taverna.org.uk/2010/workflow/006...c84e2ca/" />
can customize execution of <inputWorkflowPort>
<InputWorkflowPort rdf:about="in/yourName">
<name>yourName</name>
workflow steps for different
environments (e.g. desktop,
profile/tavernaWorkbench.rdf <portDepth>0</portDepth>
</InputWorkflowPort>
</inputWorkflowPort>
Every workflow part
<outputWorkflowPort><!-- .. --></outputWorkflowPort>
server, cloud) <rdf:RDF xmlns="http://ns.taverna.org.uk/2010/scufl2#"
xmlns:beanshell="http://ns.taverna.org.uk/2010/activity/beanshell#" xmlns:dc=".."
<processor>
<Processor rdf:about="processor/Hello/">
xmlns:owl=".." xmlns:rdf=".." xmlns:rdfs=".." xmlns:xsi=".." <name>Hello</name>
xsi:schemaLocation="http://ns.taverna.org.uk/2010/scufl2#
http://ns.taverna.org.uk/2010/scufl2/scufl2.xsd .."
xsi:type="ProfileDocument" xml:base="tavernaWorkbench/">
<inputProcessorPort><!-- .. --></inputProcessorPort>
<outputProcessorPort>
<OutputProcessorPort rdf:about="processor/Hello/out/greeting">
has a URI, allowing
<Profile rdf:about=""> <name>greeting</name>
<name>tavernaWorkbench</name>
<processorBinding rdf:resource="processorbinding/Hello/" />
<portDepth>0</portDepth>
<granularPortDepth>0</granularPortDepth> deep annotations
Hybrid of RDF/XML and XML <activateConfiguration rdf:resource="configuration/Hello/" />
</Profile>
</OutputProcessorPort>
</outputProcessorPort>
<Activity rdf:about="activity/HelloScript/"> <dispatchStack>
schema - If the xsi:type is <rdf:type rdf:resource="http://ns.taverna.org.uk/2010/activity/beanshell" />
<name>HelloScript</name>
<DispatchStack rdf:about="processor/wait4me/dispatchStack/">
<rdf:type Relative references can also
given, documents can be <inputActivityPort>
<InputActivityPort rdf:about="activity/HelloScript/in/personName">
rdf:resource="http://ns.taverna.org.uk/2010/scufl2/taverna#defaultDispatchStack" />
</DispatchStack> be made absolute using the
<name>personName</name> </dispatchStack>
parsed and generated as <portDepth>0</portDepth> <iterationStrategyStack>
sameBaseAs prefix, RDF clients
</InputActivityPort> <IterationStrategyStack rdf:about="processor/Hello/iterationstrategy/">
regular XML, using xpath, etc. </inputActivityPort>
<outputActivityPort><!-- .. --></outputActivityPort>
<iterationStrategies rdf:parseType="Collection">
<CrossProduct> <!-- .. --></CrossProduct> resolving such URIs at
</Activity> </iterationStrategies>
<ProcessorBinding rdf:about="processorbinding/Hello/"> </IterationStrategyStack> http://ns.taverna.org.uk/2010/
<name>Hello</name> </iterationStrategyStack>
The schema ensures the <bindActivity rdf:resource="activity/HelloScript/" />
<bindProcessor rdf:resource="../../workflow/HelloWorld/processor/Hello/" />
</Processor>
</processor> <!-- plus other <processor>s -->
workflowBundle/can be
document can still be parsed <activityPosition>10</activityPosition>
<inputPortBinding>
<datalink>
<DataLink
redirected to a generated RDF
<InputPortBinding rdf:about="processorbinding/Hello/in/name"> rdf:about="datalink?from=processor/Hello/out/greeting&to=out/results&mergePosit
as RDF/XML as well. Pure RDF <bindInputActivityPort rdf:resource="activity/HelloScript/in/personName" /> ion=0"> resource stating what’s
<bindInputProcessorPort <receiveFrom rdf:resource="processor/Hello/out/greeting" />
writers can ommit the xsi:type rdf:resource="../../workflow/HelloWorld/processor/Hello/in/name" />
</InputPortBinding>
<sendTo rdf:resource="out/results" />
<mergePosition>0</mergePosition>
obvious from the URI.
</inputPortBinding> </DataLink>
<outputPortBinding><!-- --> </outputPortBinding> </datalink> <!-- plus more <datalink>s -->
</ProcessorBinding>
<Configuration rdf:about="configuration/Hello/">
<control>
<Blocking
“Cool URI”-style relative
<rdf:type
rdf:resource="http://ns.taverna.org.uk/2010/activity/beanshell#Config" />
rdf:about="control?block=processor/Hello/&untilFinished=processor/wait4me/">
<block rdf:resource="processor/Hello/" /> references allow parsers to
<name>Hello</name> <untilFinished rdf:resource="processor/wait4me/" />
<configure rdf:resource="activity/HelloScript/" />
<beanshell:script>hello = "Hello, " + personName;
<block
</Blocking>
‘cheat’ and pick out processor
JOptionPane.showMessageDialog(null, hello);</beanshell:script>
</Configuration>
</control>
</Workflow> “Hello” and port “greeting” -
</rdf:RDF> </rdf:RDF>
but only if starts with the
keyword paths out/ in/ or
processor/
Resources Annotations Binaries Provenance Inputs Outputs
The bundle (as a ZIP file) can be extended to include Wf4Ever: A Research
arbitrary resources. These could be referenced Object (RO) captures Embedding provenance, input and output data of a
internally by workflow activities, it could be PDFs, enough data and workflow run can provide example use and
spreadsheets, etc. provenance about results expected outputs. Changing the root-file to point
to make data and methods to the provenance makes the bundle a workflow
reproducible, verifiable, run bundle, where the workflow definition is
shareable, reusable and secondary.
repeatable. The Scufl2
workflow run bundle forms Similarly a data bundle represents primarily the
such an RO for Taverna data, but by including the workflow and the
workflow results – with the provenance also embed information on how it was
http://www.wf4ever-project.org/ generated.
goal of becoming an http://www.mygrid.org.uk/
executable paper.
Funded by: EPSRC EP/G026238/1; European Commission’s 7th FWP FP7-ICT-2007-6 270192; FP7-ICT-2007-6 270137