SlideShare une entreprise Scribd logo
1  sur  44
Scientific Workflows, Science
Gateways, and Apache Airavata:
Opening Science, Opening
Cyberinfrastructure
Marlon Pierce, Suresh Marru, Saminda
Wijeratne, and the IU Science Gateway
Group
{marpierc, smarru}@iu.edu
Science Gateway Group
• Amila Jayasekara
• Chathuri Wimalasena
• Jun Wang
• Lahiru Gunathilake
• Marlon Pierce (Group
Lead)
• Raminder Singh
• Saminda Wijeratne
• Suresh Marru
• Viknes Balasubramanee
• Yu (Marie) Ma
We are primarily funded by NSF (XSEDE, SDCI, others) and NASA
(QuakeSim, E-DECIDER) awards with two base funded positions.
What Is Cyberinfrastructure?
“Cyberinfrastructure consists of computing systems,
data storage systems, advanced instruments and
data repositories, visualization environments, and
people, all linked together by software and high
performance networks to improve research
productivity and enable breakthroughs not otherwise
possible.”
–Craig Stewart, Indiana University
Compute
Resources
Resource
Middleware Cloud Interfaces Grid Middleware
SSH & Resource
Managers
Computational
Clouds
Computational Grids
Gateway
Services
User
Interfacing
Services
Web/Gadget
Container
Web Enabled
Desktop Applications
User
Management
Auditing &
Reporting
Fault
Tolerance
Application
Abstractions
Workflow
System
Information
Services
Monitoring
Registry
Provenance &
Metadata
Management
Local Resources
Web/Gadget
Interfaces
Gateway Abstraction
Interfaces
Color Coding
Dependent resource provider components
Complimentary gateway components
Airavata components
Cyberinfrastructure Layers
Security
7
XSEDE Offers Variety
• Leading-edge distributed memory systems
• Very large shared memory systems
• High throughput systems
• Visualization engines
• Accelerators like GPUs
8
Many scientific problems
have components that call
for use of more than one
architecture.
Extended Collaborative Support Services
• Support staff who understand the discipline as well as
the systems (perhaps more than one support person
working with a project).
– 37 FTEs, spread over ~80 people at almost a dozen sites.
• Brings the best available knowledge and skills to bear
on computational science problems within the XSEDE
user community.
• Wide range of areas,
– Performance analysis
– Petascale optimization techniques to
– Building Science Gateways.
• Novel and Innovative Projects –
– bring in diverse and non-traditional users.
9
ECSS Examples
• Algorithmic or solver change; incorporation or implementation
of new or advanced numerical methods and/or math libraries
• Optimization (single processor performance, parallel scaling,
parallel I/O performance, memory usage, or benchmarking
improvement)
• Development of parallel code from serial code/algorithm
• Data visualization or analysis
• Incorporation of new data management or storage
• New grid computing work
• Implementation of new workflows for automation of scientific
processes
• Incorporation of new visualization methods
• Innovative scheduling implementation
• Integration of XSEDE resources into a portal or Science
Gateway
Knowledge and Expertise
Computational
Resources
Scientific
Instruments
Algorithms and
Models
Archived Data
and Metadata
Advanced Science Tools
Science Gateways:
Enabling & Democratizing Scientific Research
Science Gateway Communities
Community Gateway Capabilities
Universities and
Academic Departments
Provide simplified access to computing resources for
students, faculty, and staff. Gateways should support
MOOCs
Shared Instrument
Facilities
Simplify access to instruments, support research derived
from common data products: UltraScan, LIGO, DES, LSST
Multisite Collaborations
and Virtual
Organizations
Funded or unfunded, including self-organized communities,
who need to collaborate and use a common pool of
resources: LEAD, ENZO
Businesses and Services Provide Software as a Service for a particular application
suite
Small Research Groups “Long tail” of science, need to preserve the work that is
done.
UltraScan: A Science Gateway for
Biophysical Analysis
• Support analysis of experiments on molecules in solution
environments
– Analytical ultracentrifugation (AUC)
– Laser Light Scattering (LS)
– Small angle X-ray/neutron scattering (SAXS/SANS)
• Provide highest possible resolution in the analysis
– Requires HPC
• Offer a flexible model for multiple optimization methods
• Integrate a variety of HPC environments into a uniform
user experience
• Must be easy to learn and use –
– users should not have to be HPC experts
– Provide check-pointing
– easy to understand error messages
• Fast turnaround to support serial workflows
On-Demand
Grid Computing
Example: Adapting Weather Prediction to Observational
Sources Using Dynamic Adaptivity
Streaming
Observations
Storms Forming
Forecast Model
Data Mining
Abstractions
16
Load Leveler
PBS/Torque Sun Grid Engine LSF
Globus
Globus
Unicore
EC2 APIGateway Frameworks
SSH
Apache Airavata Overview
http://airavata.apache.org
Realizing the Universe for the Dark Energy Survey (DES) Using XSEDE Support
(Pis: A. Evrard (UM) and A. Kravtsov (UC)
• The Dark Energy Survey (DES) is an
upcoming international experiment
that aims to constrain the properties of
dark energy and dark matter in the
universe using a deep, 5000-square
degree survey of cosmic structure
traced by galaxies.
• To support this science, the DES
Simulation Working Group is
generating expectations for galaxy
yields in various cosmologies.
• Analysis of these simulated catalogs
offers a quality assurance capability for
cosmological and astrophysical analysis
of upcoming DES telescope data.
• These large, multi-staged
computations are a natural fit for
workflow control atop XSEDE
resources.
Fig. 2: A synthetic 2x3 arcmin DES sky image showing galaxies, stars,
and observational artifacts. Courtesy Huan Lin, FNAL.
Fig. 1 The density of dark matter in a thin radial slice as seen by a
synthetic observer located in the 8 billion light-year computational
volume. Image courtesy Matthew Becker, University of Chicago.
DES
Application
Component Description
CAMB Code for Anisotropies in the Microwave Background is a serial
FORTRAN code that computes the power spectrum of dark matter,
which is necessary for generating the simulation initial conditions.
Output is a small ASCII file describing the power spectrum.
2LPTic Second-order Lagrangian Perturbation Theory initial conditions
code is an MPI based C code that computes the initial conditions for
the simulation from parameters and an input power spectrum
generated by CAMB. Output is a set of binary files that vary in size
from ~80-250 GB depending on the simulation resolution.
LGadget LGadget is an MPI based C code that evolves a gravitational N-body
system. The outputs of this step are system state snapshot files, as
well as lightcone files, and some properties of the matter
distribution, including the power spectrum at various timesteps. The
total output from LGadget depends on resolution and the number of
system snapshots stored, and approaches ~10 TB for large DES
simulation boxes.
DES as a Workflow
There are plenty of issues:
• Long running code: Based on simulation box size
L-gadget can run for 3 to 5 days using more than
1024 cores.
• Local HPC provider policies: XSEDE resource
provider’s job scheduling policy does not allow
jobs to run for more than 24 hours in normal
queue
• Do-While Construct: Restart service support is
needed in workflow. Do-while construct was
developed to address the need.
• Data size and File transfer challenges: L-gadget
produces 10~TB for large DES simulation boxes in
system scratch so data need to moved to
persistent storage ASAP
• File system issues: More than 10,000 lightcone
files are doing continues file I/O. This can cause
problems with the HPC resource’s file system
(usually Lustre-based in XSEDE).
Processing steps to build a synthetic
galaxy catalog.
Workflow
Interpreter
Application
Factory
Message
Box
Registry
Apache
Airavata
API
L
o
r
e
m
i
p
s
u
m
i
n
s
o
l
e
n
s
p
1
m
5
d
u
ox
EndUsersGatewayDeveloper
Scientific
Applicati
on
Core
Developer
Computational
Resources
Apache Airavata
Apache Airavata Components
Component Description
XBaya Workflow graphical composition tool.
Registry Service Insert and access application, host machine,
workflow, and provenance data.
Workflow Interpreter
Service
Execute the workflow on one or more resources.
Application Factory
Service (GFAC)
Manages the execution and management of an
application in a workflow
Messaging System WS-Notification and WS-Eventing compliant
publish/subscribe messaging system for workflow
events
Airavata API Single wrapping client to provide higher level
programming interfaces.
Domain Description
Astronomy Image processing pipeline for One Degree Imager
instrument on XSEDE
Astrophysics Supporting workflow of Dark Energy Survey
simulations working group on XSEDE
Bioinformatics Supported workflow executions on Amazon EC2 for
BioVLAB project
Biophysics Manage large scale data analysis of analytical
ultracentrifugation experiments on XSEDE and
campus resources
Computational
Chemistry
Manage workflows to support computational
chemistry parameter studies for ParamChem.org on
XSEDE
Nuclear Physics Workflows for nuclear structure calculations using
Leadership Class Configuration Interaction (LCCI)
computations on DOE resources
Apache Airavata in Action
Apache and Open Governance
Cyberinfrastructure: How open is
open source?
• What’s missing?
– Open source licensing
– Open Standards
– Open Code (GitHub,
SourceForge, Google
Code e.t.c)
We also need Open Governance
Open Source Software and Governance
• Open source projects need diversity,
governance.
– Reproducibility
– Sustainability
• Incentives for projects to diversify
their developer base.
• Govern
– Software releases
– Contributions
– Credit sharing.
– Members are added
– Project direction decisions.
– IP, legal issues
• Our approach: Apache Software
Foundation
Collaborate
Compete
The Apache Software Foundation
• Apache software powers
65% of web sites worldwide
• 501(c)3 non-profit
foundation
• Reasons for creating ASF
– Create legal entity
– Protect contributors from
liability
– Protect Apache assets
• Membership: individual
• Apache Incubator
• Governance and Staffing
– Board of Directors
– Project Management
Committees
– ASF Members
– Committers
– Contributors
• Funding
– All-volunteer
staffing/development
resources
– Donations
– Corporate investment
Apache Contributions Aren’t Just
Software
• Apache committers and PMC members aren’t just
code writers.
• Successful communities also include
– Important users
– Project evangelists
– Content providers: documentation, tutorials
– Testers, requirements providers, and constructive
complainers
• Using Jira and mailing lists
– Anything else that needs doing.
More Information
• Contact Us:
– marpierc@iu.edu, smarru@iu.edu
• Websites:
– Science Gateway Group:
http://rt.uits.iu.edu/visualization/gateways/index.
php
– Apache Airavata: http://airavata.apache.org
– Science Gateway Institute:
http://sciencegateways.org
What Is Apache Airavata?
• Science Gateway software
system to
• Compose, manage, execute,
and monitor distributed,
computational workflows
• Wrap legacy command line
scientific applications with
Web services.
• Run jobs on computational
resources ranging from local
resources to computational
grids and clouds
• Airavata software is largely
derived from NSF-funded
academic research.
Apache Airavata
Science Gateway Framework
Workflow
Interpreter
Application
Factory
Message
Box
Regist
ry
Apache
Airavata
API
L
o
r
e
m
i
p
s
u
m
i
n
s
o
l
e
n
s
p
1
m
5
d
u
ox
EndUsersGatewayDeveloper
Scientific
Applicati
on
Core
Developer
Computational
Resources
Apache Airavata
Apache Airavata Components
Component Description
XBaya Workflow graphical composition tool.
Registry Service Insert and access application, host machine,
workflow, and provenance data.
Workflow Interpreter
Service
Execute the workflow on one or more resources.
Application Factory
Service (GFAC)
Manages the execution and management of an
application in a workflow
Messaging System WS-Notification and WS-Eventing compliant
publish/subscribe messaging system for workflow
events
Airavata API Single wrapping client to provide higher level
programming interfaces.
Key Airavata Features
• Graphical user interface to construct, execute, control,
manage and reuse scientific workflows.
• Desktop tools and browser-based web interface
components to manage applications, workflows and
generated data.
• Sophisticated server-side tools to register, schedule and
manage scientific applications on high performance
computational resources.
• Ability to Interface and interoperate with various external
(third party) data, workflow and provenance
management tools.
A Classic Scientific Workflow
• Workflows are composite applications built out of independent
parts.
– Parts are executables wrapped as network accessible services
• The classic example is that codes A, B, and C need to be executed in
a specific sequence.
– A, B, C: parallel codes compiled and executable on a cluster,
supercomputer, etc. by schedulers.
• A, B, and C do not need to be co-located
• A, B, and C may be sequential or parallel
• A, B and C may have date or control dependencies
– Data may need to be staged in and out
• Some variations on ABC:
– Conditional execution branches
– Dynamic execution resource binding
– Iterations (Do-while, For-Each) over all or parts of the sequence
– Triggers, events, data streams
Challenges in Scientific Workflows
• Accommodating wide range of execution
patterns
– Iterations: for-each, do-while, dot and Cartesian
products
– Interactivity, adaptivity, non-determinism
• Accommodating error and uncertainties
NextGen Workflow Systems:
Need for Interactivity Across Layers
• Scientific workflow systems and compiled
workflow languages have focused on modeling,
scheduling, data movement, dynamic service
creation and monitoring of workflows.
• Building on these foundations Airavata extends to
a interactive and flexible workflow systems.
• Airavata Workflow Features include:
– interactive ways of interfering and steering the
workflow execution
– interpreted workflow execution model
– high level instruction set
– flexibility to execute individual workflow activity and
wait for further analysis.
Interactivity Contd.
• Derivations during workflow Execution that does
not affect the structure of the workflow
– dynamic change workflow inputs, workflow rerun.
interpreted workflow execution model.
– dynamic change in point of execution, workflow smart
rerun.
– Fault handling and exception models.
• Derivation that change the workflow DAG during
runtime
– Reconfiguration of activity..
– dynamic addition of activities to the workflow.
– Dynamic remove or replace of activity to the workflow
Interactivity
• Mathematical uncertainty:
– PDE’s from domain problems do not have analytical solution and thereby look at numerical
methods to find solutions
– These solvers may not converge depending on method, PDE system, initial conditions and
expected output tolerances
– statistical techniques lead to nondeterministic results.
– closer observation at computational output ensure acceptability of results.
• Domain uncertainty:
– Scenarios of running against range of parameter values in an attempt to find the most
appropriate input set.
– Initial execution providing estimate of the accuracy of the inputs and facilitating further
refinement.
– Outputs are diverse and nondeterministic
• Resource uncertainty:
– Failures in distributed systems are norm than an exception
– transient failures can be retried if computation is side-effect free/Idempotent.
– persistent failures require migration
• Real-time Model refinement
– Real-time event processing systems not having data available prior to initialization of model.
– models evolve over time and can take advantage of more and more events as they become
available
Mathema cal Domain Resource
Model
Refinement
Uncertain es
Orchestra on level Interac ons
Workflow
Steering
Parametric
Sweeps
Provenance
Job Level Interac ons
Job launch,
gliding
Checkpoint/
Restart
Asynchronous
refinements
Applica on
Steering
Illustrating Interactivity
Domain Description
Astronomy Image processing pipeline for One Degree Imager
instrument on XSEDE
Astrophysics Supporting workflow of Dark Energy Survey
simulations working group on XSEDE
Bioinformatics Supported workflow executions on Amazon EC2 for
BioVLAB project
Biophysics Manage large scale data analysis of analytical
ultracentrifugation experiments on XSEDE and
campus resources
Computational
Chemistry
Manage workflows to support computational
chemistry parameter studies for ParamChem.org on
XSEDE
Nuclear Physics Workflows for nuclear structure calculations using
Leadership Class Configuration Interaction (LCCI)
computations on DOE resources
Apache Airavata in Action
Workflow
Interpreter
Application
Factory
Message
Box
Regist
ry
Apache
Airavata
API
L
o
r
e
m
i
p
s
u
m
i
n
s
o
l
e
n
s
p
1
m
5
d
u
ox
EndUsersGatewayDeveloper
Scientific
Applicati
on
Core
Developer
Computational
Resources
Apache Airavata
Apache Airavata Components
Component Description
XBaya Workflow graphical composition tool.
Registry Service Insert and access application, host machine,
workflow, and provenance data.
Workflow Interpreter
Service
Execute the workflow on one or more resources.
Application Factory
Service (GFAC)
Manages the execution and management of an
application in a workflow
Messaging System WS-Notification and WS-Eventing compliant
publish/subscribe messaging system for workflow
events
Airavata API Single wrapping client to provide higher level
programming interfaces.
Scientific

Contenu connexe

Tendances

Bringing complex event processing to Spark streaming
Bringing complex event processing to Spark streamingBringing complex event processing to Spark streaming
Bringing complex event processing to Spark streaming
DataWorks Summit
 

Tendances (20)

Accelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceAccelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy Science
 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science Services
 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009
 
Taming Big Data!
Taming Big Data!Taming Big Data!
Taming Big Data!
 
Monitoring at scale - Sensu Kafka Kafka-connect Cassandra PrestoDB
Monitoring at scale - Sensu Kafka Kafka-connect Cassandra PrestoDBMonitoring at scale - Sensu Kafka Kafka-connect Cassandra PrestoDB
Monitoring at scale - Sensu Kafka Kafka-connect Cassandra PrestoDB
 
Bringing complex event processing to Spark streaming
Bringing complex event processing to Spark streamingBringing complex event processing to Spark streaming
Bringing complex event processing to Spark streaming
 
Distributed Database practicals
Distributed Database practicals Distributed Database practicals
Distributed Database practicals
 
RISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsRISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time Decisions
 
Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid Sherlock: an anomaly detection service on top of Druid
Sherlock: an anomaly detection service on top of Druid
 
DIET_BLAST
DIET_BLASTDIET_BLAST
DIET_BLAST
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Computing Outside The Box June 2009
Computing Outside The Box June 2009Computing Outside The Box June 2009
Computing Outside The Box June 2009
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilities
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
F233842
F233842F233842
F233842
 
My Dissertation 2016
My Dissertation 2016My Dissertation 2016
My Dissertation 2016
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming Data
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
 
TG11 ORPS Poster
TG11 ORPS PosterTG11 ORPS Poster
TG11 ORPS Poster
 
Don't Let the Spark Burn Your House: Perspectives on Securing Spark
Don't Let the Spark Burn Your House: Perspectives on Securing SparkDon't Let the Spark Burn Your House: Perspectives on Securing Spark
Don't Let the Spark Burn Your House: Perspectives on Securing Spark
 

En vedette (6)

OGCE TeraGrid 2010 Science Gateway Tutorial Intro
OGCE TeraGrid 2010 Science Gateway Tutorial IntroOGCE TeraGrid 2010 Science Gateway Tutorial Intro
OGCE TeraGrid 2010 Science Gateway Tutorial Intro
 
IWSG2014: Developing Science Gateways Using Apache Airavata
IWSG2014: Developing Science Gateways Using Apache AiravataIWSG2014: Developing Science Gateways Using Apache Airavata
IWSG2014: Developing Science Gateways Using Apache Airavata
 
OGCE SciDAC2010 Tutorial
OGCE SciDAC2010 TutorialOGCE SciDAC2010 Tutorial
OGCE SciDAC2010 Tutorial
 
Experiences with the Apache Software Foundation
Experiences with the Apache Software Foundation Experiences with the Apache Software Foundation
Experiences with the Apache Software Foundation
 
ACES QuakeSim 2011
ACES QuakeSim 2011ACES QuakeSim 2011
ACES QuakeSim 2011
 
El precio»
El precio»El precio»
El precio»
 

Similaire à Scientific

Ogce Workflow Suite
Ogce Workflow SuiteOgce Workflow Suite
Ogce Workflow Suite
smarru
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Databricks
 
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
Farley Lai
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
Anubhav Jain
 

Similaire à Scientific (20)

Ogce Workflow Suite
Ogce Workflow SuiteOgce Workflow Suite
Ogce Workflow Suite
 
Time to Science/Time to Results: Transforming Research in the Cloud
Time to Science/Time to Results: Transforming Research in the CloudTime to Science/Time to Results: Transforming Research in the Cloud
Time to Science/Time to Results: Transforming Research in the Cloud
 
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesDiscovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme Scales
 
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習 Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
Azure 機器學習 - 使用Python, R, Spark, CNTK 深度學習
 
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationThe Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22
 
grid mining
grid mininggrid mining
grid mining
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science Research
 
Grid Computing
Grid ComputingGrid Computing
Grid Computing
 
grid computing
grid computinggrid computing
grid computing
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing A...
 
Designing HPC & Deep Learning Middleware for Exascale Systems
Designing HPC & Deep Learning Middleware for Exascale SystemsDesigning HPC & Deep Learning Middleware for Exascale Systems
Designing HPC & Deep Learning Middleware for Exascale Systems
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
 
Jetstream: Adding Cloud-based Computing to the National Cyberinfrastructure
Jetstream: Adding Cloud-based Computing to the National CyberinfrastructureJetstream: Adding Cloud-based Computing to the National Cyberinfrastructure
Jetstream: Adding Cloud-based Computing to the National Cyberinfrastructure
 

Plus de marpierc (12)

SC11 Science Gateway Group Overview
SC11 Science Gateway Group OverviewSC11 Science Gateway Group Overview
SC11 Science Gateway Group Overview
 
OGCE MSI Presentation
OGCE MSI PresentationOGCE MSI Presentation
OGCE MSI Presentation
 
PNNL April 2011 ogce
PNNL April 2011 ogcePNNL April 2011 ogce
PNNL April 2011 ogce
 
OGCE SC10
OGCE SC10OGCE SC10
OGCE SC10
 
Building Science Gateways with Gadgets and OpenSocial
Building Science Gateways with Gadgets and OpenSocialBuilding Science Gateways with Gadgets and OpenSocial
Building Science Gateways with Gadgets and OpenSocial
 
OGCE TeraGrid 2010 ASTA Support
OGCE TeraGrid 2010 ASTA SupportOGCE TeraGrid 2010 ASTA Support
OGCE TeraGrid 2010 ASTA Support
 
OGCE Review for Indiana University Research Technologies
OGCE Review for Indiana University Research TechnologiesOGCE Review for Indiana University Research Technologies
OGCE Review for Indiana University Research Technologies
 
Ogce about-sc10
Ogce about-sc10Ogce about-sc10
Ogce about-sc10
 
OGCE TG09 Tech Track Presentation
OGCE TG09 Tech Track PresentationOGCE TG09 Tech Track Presentation
OGCE TG09 Tech Track Presentation
 
GTLAB Installation Tutorial for SciDAC 2009
GTLAB Installation Tutorial for SciDAC 2009GTLAB Installation Tutorial for SciDAC 2009
GTLAB Installation Tutorial for SciDAC 2009
 
OGCE Overview for SciDAC 2009
OGCE Overview for SciDAC 2009OGCE Overview for SciDAC 2009
OGCE Overview for SciDAC 2009
 
GTLAB Overview
GTLAB OverviewGTLAB Overview
GTLAB Overview
 

Dernier

%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 

Dernier (20)

WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 

Scientific

  • 1. Scientific Workflows, Science Gateways, and Apache Airavata: Opening Science, Opening Cyberinfrastructure Marlon Pierce, Suresh Marru, Saminda Wijeratne, and the IU Science Gateway Group {marpierc, smarru}@iu.edu
  • 2. Science Gateway Group • Amila Jayasekara • Chathuri Wimalasena • Jun Wang • Lahiru Gunathilake • Marlon Pierce (Group Lead) • Raminder Singh • Saminda Wijeratne • Suresh Marru • Viknes Balasubramanee • Yu (Marie) Ma We are primarily funded by NSF (XSEDE, SDCI, others) and NASA (QuakeSim, E-DECIDER) awards with two base funded positions.
  • 3.
  • 4. What Is Cyberinfrastructure? “Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.” –Craig Stewart, Indiana University
  • 5. Compute Resources Resource Middleware Cloud Interfaces Grid Middleware SSH & Resource Managers Computational Clouds Computational Grids Gateway Services User Interfacing Services Web/Gadget Container Web Enabled Desktop Applications User Management Auditing & Reporting Fault Tolerance Application Abstractions Workflow System Information Services Monitoring Registry Provenance & Metadata Management Local Resources Web/Gadget Interfaces Gateway Abstraction Interfaces Color Coding Dependent resource provider components Complimentary gateway components Airavata components Cyberinfrastructure Layers Security
  • 6.
  • 7. 7
  • 8. XSEDE Offers Variety • Leading-edge distributed memory systems • Very large shared memory systems • High throughput systems • Visualization engines • Accelerators like GPUs 8 Many scientific problems have components that call for use of more than one architecture.
  • 9. Extended Collaborative Support Services • Support staff who understand the discipline as well as the systems (perhaps more than one support person working with a project). – 37 FTEs, spread over ~80 people at almost a dozen sites. • Brings the best available knowledge and skills to bear on computational science problems within the XSEDE user community. • Wide range of areas, – Performance analysis – Petascale optimization techniques to – Building Science Gateways. • Novel and Innovative Projects – – bring in diverse and non-traditional users. 9
  • 10. ECSS Examples • Algorithmic or solver change; incorporation or implementation of new or advanced numerical methods and/or math libraries • Optimization (single processor performance, parallel scaling, parallel I/O performance, memory usage, or benchmarking improvement) • Development of parallel code from serial code/algorithm • Data visualization or analysis • Incorporation of new data management or storage • New grid computing work • Implementation of new workflows for automation of scientific processes • Incorporation of new visualization methods • Innovative scheduling implementation • Integration of XSEDE resources into a portal or Science Gateway
  • 11. Knowledge and Expertise Computational Resources Scientific Instruments Algorithms and Models Archived Data and Metadata Advanced Science Tools Science Gateways: Enabling & Democratizing Scientific Research
  • 12. Science Gateway Communities Community Gateway Capabilities Universities and Academic Departments Provide simplified access to computing resources for students, faculty, and staff. Gateways should support MOOCs Shared Instrument Facilities Simplify access to instruments, support research derived from common data products: UltraScan, LIGO, DES, LSST Multisite Collaborations and Virtual Organizations Funded or unfunded, including self-organized communities, who need to collaborate and use a common pool of resources: LEAD, ENZO Businesses and Services Provide Software as a Service for a particular application suite Small Research Groups “Long tail” of science, need to preserve the work that is done.
  • 13. UltraScan: A Science Gateway for Biophysical Analysis • Support analysis of experiments on molecules in solution environments – Analytical ultracentrifugation (AUC) – Laser Light Scattering (LS) – Small angle X-ray/neutron scattering (SAXS/SANS) • Provide highest possible resolution in the analysis – Requires HPC • Offer a flexible model for multiple optimization methods • Integrate a variety of HPC environments into a uniform user experience • Must be easy to learn and use – – users should not have to be HPC experts – Provide check-pointing – easy to understand error messages • Fast turnaround to support serial workflows
  • 14.
  • 15. On-Demand Grid Computing Example: Adapting Weather Prediction to Observational Sources Using Dynamic Adaptivity Streaming Observations Storms Forming Forecast Model Data Mining
  • 16. Abstractions 16 Load Leveler PBS/Torque Sun Grid Engine LSF Globus Globus Unicore EC2 APIGateway Frameworks SSH
  • 18. Realizing the Universe for the Dark Energy Survey (DES) Using XSEDE Support (Pis: A. Evrard (UM) and A. Kravtsov (UC) • The Dark Energy Survey (DES) is an upcoming international experiment that aims to constrain the properties of dark energy and dark matter in the universe using a deep, 5000-square degree survey of cosmic structure traced by galaxies. • To support this science, the DES Simulation Working Group is generating expectations for galaxy yields in various cosmologies. • Analysis of these simulated catalogs offers a quality assurance capability for cosmological and astrophysical analysis of upcoming DES telescope data. • These large, multi-staged computations are a natural fit for workflow control atop XSEDE resources. Fig. 2: A synthetic 2x3 arcmin DES sky image showing galaxies, stars, and observational artifacts. Courtesy Huan Lin, FNAL. Fig. 1 The density of dark matter in a thin radial slice as seen by a synthetic observer located in the 8 billion light-year computational volume. Image courtesy Matthew Becker, University of Chicago.
  • 19. DES Application Component Description CAMB Code for Anisotropies in the Microwave Background is a serial FORTRAN code that computes the power spectrum of dark matter, which is necessary for generating the simulation initial conditions. Output is a small ASCII file describing the power spectrum. 2LPTic Second-order Lagrangian Perturbation Theory initial conditions code is an MPI based C code that computes the initial conditions for the simulation from parameters and an input power spectrum generated by CAMB. Output is a set of binary files that vary in size from ~80-250 GB depending on the simulation resolution. LGadget LGadget is an MPI based C code that evolves a gravitational N-body system. The outputs of this step are system state snapshot files, as well as lightcone files, and some properties of the matter distribution, including the power spectrum at various timesteps. The total output from LGadget depends on resolution and the number of system snapshots stored, and approaches ~10 TB for large DES simulation boxes.
  • 20. DES as a Workflow There are plenty of issues: • Long running code: Based on simulation box size L-gadget can run for 3 to 5 days using more than 1024 cores. • Local HPC provider policies: XSEDE resource provider’s job scheduling policy does not allow jobs to run for more than 24 hours in normal queue • Do-While Construct: Restart service support is needed in workflow. Do-while construct was developed to address the need. • Data size and File transfer challenges: L-gadget produces 10~TB for large DES simulation boxes in system scratch so data need to moved to persistent storage ASAP • File system issues: More than 10,000 lightcone files are doing continues file I/O. This can cause problems with the HPC resource’s file system (usually Lustre-based in XSEDE). Processing steps to build a synthetic galaxy catalog.
  • 22. Apache Airavata Components Component Description XBaya Workflow graphical composition tool. Registry Service Insert and access application, host machine, workflow, and provenance data. Workflow Interpreter Service Execute the workflow on one or more resources. Application Factory Service (GFAC) Manages the execution and management of an application in a workflow Messaging System WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events Airavata API Single wrapping client to provide higher level programming interfaces.
  • 23. Domain Description Astronomy Image processing pipeline for One Degree Imager instrument on XSEDE Astrophysics Supporting workflow of Dark Energy Survey simulations working group on XSEDE Bioinformatics Supported workflow executions on Amazon EC2 for BioVLAB project Biophysics Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources Computational Chemistry Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE Nuclear Physics Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources Apache Airavata in Action
  • 24. Apache and Open Governance
  • 25. Cyberinfrastructure: How open is open source? • What’s missing? – Open source licensing – Open Standards – Open Code (GitHub, SourceForge, Google Code e.t.c) We also need Open Governance
  • 26. Open Source Software and Governance • Open source projects need diversity, governance. – Reproducibility – Sustainability • Incentives for projects to diversify their developer base. • Govern – Software releases – Contributions – Credit sharing. – Members are added – Project direction decisions. – IP, legal issues • Our approach: Apache Software Foundation Collaborate Compete
  • 27. The Apache Software Foundation • Apache software powers 65% of web sites worldwide • 501(c)3 non-profit foundation • Reasons for creating ASF – Create legal entity – Protect contributors from liability – Protect Apache assets • Membership: individual • Apache Incubator • Governance and Staffing – Board of Directors – Project Management Committees – ASF Members – Committers – Contributors • Funding – All-volunteer staffing/development resources – Donations – Corporate investment
  • 28. Apache Contributions Aren’t Just Software • Apache committers and PMC members aren’t just code writers. • Successful communities also include – Important users – Project evangelists – Content providers: documentation, tutorials – Testers, requirements providers, and constructive complainers • Using Jira and mailing lists – Anything else that needs doing.
  • 29. More Information • Contact Us: – marpierc@iu.edu, smarru@iu.edu • Websites: – Science Gateway Group: http://rt.uits.iu.edu/visualization/gateways/index. php – Apache Airavata: http://airavata.apache.org – Science Gateway Institute: http://sciencegateways.org
  • 30. What Is Apache Airavata? • Science Gateway software system to • Compose, manage, execute, and monitor distributed, computational workflows • Wrap legacy command line scientific applications with Web services. • Run jobs on computational resources ranging from local resources to computational grids and clouds • Airavata software is largely derived from NSF-funded academic research.
  • 33. Apache Airavata Components Component Description XBaya Workflow graphical composition tool. Registry Service Insert and access application, host machine, workflow, and provenance data. Workflow Interpreter Service Execute the workflow on one or more resources. Application Factory Service (GFAC) Manages the execution and management of an application in a workflow Messaging System WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events Airavata API Single wrapping client to provide higher level programming interfaces.
  • 34. Key Airavata Features • Graphical user interface to construct, execute, control, manage and reuse scientific workflows. • Desktop tools and browser-based web interface components to manage applications, workflows and generated data. • Sophisticated server-side tools to register, schedule and manage scientific applications on high performance computational resources. • Ability to Interface and interoperate with various external (third party) data, workflow and provenance management tools.
  • 35. A Classic Scientific Workflow • Workflows are composite applications built out of independent parts. – Parts are executables wrapped as network accessible services • The classic example is that codes A, B, and C need to be executed in a specific sequence. – A, B, C: parallel codes compiled and executable on a cluster, supercomputer, etc. by schedulers. • A, B, and C do not need to be co-located • A, B, and C may be sequential or parallel • A, B and C may have date or control dependencies – Data may need to be staged in and out • Some variations on ABC: – Conditional execution branches – Dynamic execution resource binding – Iterations (Do-while, For-Each) over all or parts of the sequence – Triggers, events, data streams
  • 36. Challenges in Scientific Workflows • Accommodating wide range of execution patterns – Iterations: for-each, do-while, dot and Cartesian products – Interactivity, adaptivity, non-determinism • Accommodating error and uncertainties
  • 37. NextGen Workflow Systems: Need for Interactivity Across Layers • Scientific workflow systems and compiled workflow languages have focused on modeling, scheduling, data movement, dynamic service creation and monitoring of workflows. • Building on these foundations Airavata extends to a interactive and flexible workflow systems. • Airavata Workflow Features include: – interactive ways of interfering and steering the workflow execution – interpreted workflow execution model – high level instruction set – flexibility to execute individual workflow activity and wait for further analysis.
  • 38. Interactivity Contd. • Derivations during workflow Execution that does not affect the structure of the workflow – dynamic change workflow inputs, workflow rerun. interpreted workflow execution model. – dynamic change in point of execution, workflow smart rerun. – Fault handling and exception models. • Derivation that change the workflow DAG during runtime – Reconfiguration of activity.. – dynamic addition of activities to the workflow. – Dynamic remove or replace of activity to the workflow
  • 39. Interactivity • Mathematical uncertainty: – PDE’s from domain problems do not have analytical solution and thereby look at numerical methods to find solutions – These solvers may not converge depending on method, PDE system, initial conditions and expected output tolerances – statistical techniques lead to nondeterministic results. – closer observation at computational output ensure acceptability of results. • Domain uncertainty: – Scenarios of running against range of parameter values in an attempt to find the most appropriate input set. – Initial execution providing estimate of the accuracy of the inputs and facilitating further refinement. – Outputs are diverse and nondeterministic • Resource uncertainty: – Failures in distributed systems are norm than an exception – transient failures can be retried if computation is side-effect free/Idempotent. – persistent failures require migration • Real-time Model refinement – Real-time event processing systems not having data available prior to initialization of model. – models evolve over time and can take advantage of more and more events as they become available
  • 40. Mathema cal Domain Resource Model Refinement Uncertain es Orchestra on level Interac ons Workflow Steering Parametric Sweeps Provenance Job Level Interac ons Job launch, gliding Checkpoint/ Restart Asynchronous refinements Applica on Steering Illustrating Interactivity
  • 41. Domain Description Astronomy Image processing pipeline for One Degree Imager instrument on XSEDE Astrophysics Supporting workflow of Dark Energy Survey simulations working group on XSEDE Bioinformatics Supported workflow executions on Amazon EC2 for BioVLAB project Biophysics Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources Computational Chemistry Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE Nuclear Physics Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources Apache Airavata in Action
  • 43. Apache Airavata Components Component Description XBaya Workflow graphical composition tool. Registry Service Insert and access application, host machine, workflow, and provenance data. Workflow Interpreter Service Execute the workflow on one or more resources. Application Factory Service (GFAC) Manages the execution and management of an application in a workflow Messaging System WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events Airavata API Single wrapping client to provide higher level programming interfaces.