3. Nephele: What for?
• General trend: resources are allocated once, at the
beginning of processing (static); there is no scope to
add or remove resources during execution (dynamic) …
• Paper (claim): “…first data processing
framework to include the possibility of
dynamically allocating/deallocating different
compute resources …”
4. Known Issues in Cloud
• Cloud resources are dynamic and
heterogeneous
• Provisioning of resources on demand
• Cloud challenge: opaqueness
6. Jobs @ Nephele
• I. Steps to create a job (DAG):
1. Write your own code for each task.
2. Assign each task to a vertex.
3. Define the communication paths for
the job.
• II. Add annotations to the job
description.
• III. Transform the Job Graph
→ Execution Graph
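The steps above can be sketched in a few lines of Python. This is only an illustration of a DAG-shaped job description: the class `JobGraph`, its methods, and the `subtasks` annotation are hypothetical names chosen for this sketch and are not Nephele's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class JobGraph:
    """Toy DAG-shaped job description (hypothetical, not Nephele's API)."""
    tasks: Dict[str, Callable] = field(default_factory=dict)    # vertex -> task code
    edges: List[Tuple[str, str]] = field(default_factory=list)  # communication paths
    annotations: Dict[str, dict] = field(default_factory=dict)  # vertex -> user hints

    def add_vertex(self, name, task, **annotations):
        # Steps I.1 and I.2: user-written task code is assigned to a vertex.
        # Step II: optional annotations are stored with the job description.
        self.tasks[name] = task
        self.annotations[name] = annotations

    def connect(self, src, dst):
        # Step I.3: define a communication path between two existing vertices.
        if src not in self.tasks or dst not in self.tasks:
            raise KeyError("both vertices must exist before connecting them")
        self.edges.append((src, dst))

    def topological_order(self):
        # Kahn's algorithm; raises if the description is not a DAG.
        indegree = {v: 0 for v in self.tasks}
        for _, dst in self.edges:
            indegree[dst] += 1
        ready = [v for v, d in indegree.items() if d == 0]
        order = []
        while ready:
            v = ready.pop(0)
            order.append(v)
            for src, dst in self.edges:
                if src == v:
                    indegree[dst] -= 1
                    if indegree[dst] == 0:
                        ready.append(dst)
        if len(order) != len(self.tasks):
            raise ValueError("job graph contains a cycle")
        return order


job = JobGraph()
job.add_vertex("input", lambda: [3, 1, 2])
job.add_vertex("sort", sorted, subtasks=2)  # 'subtasks' is an assumed annotation
job.add_vertex("output", print)
job.connect("input", "sort")
job.connect("sort", "output")
```

Calling `job.topological_order()` checks that the description is a valid DAG before it would be handed on for transformation (step III).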
7. An Execution Graph
Efficient Execution Graph creation depends on user input, i.e. the
annotations in the job description
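To make the role of annotations concrete, here is a minimal sketch of a job graph being expanded into an execution graph. It assumes a single per-vertex annotation giving the number of parallel subtasks and assumes all-to-all wiring between the subtasks of connected vertices; the function name `build_execution_graph` and both assumptions are illustrative, not taken from Nephele.

```python
from itertools import product


def build_execution_graph(vertices, edges, subtasks):
    """Expand a job graph into parallel subtasks.

    vertices: list of job vertex names
    edges:    list of (src, dst) job edges
    subtasks: dict vertex -> degree of parallelism (hypothetical annotation);
              vertices without an annotation default to one subtask.
    """
    exec_vertices = {
        v: [f"{v}[{i}]" for i in range(subtasks.get(v, 1))] for v in vertices
    }
    # Assumption: every producer subtask is wired to every consumer subtask.
    exec_edges = [
        (s, d)
        for src, dst in edges
        for s, d in product(exec_vertices[src], exec_vertices[dst])
    ]
    return exec_vertices, exec_edges


exec_vertices, exec_edges = build_execution_graph(
    vertices=["input", "task", "output"],
    edges=[("input", "task"), ("task", "output")],
    subtasks={"task": 2},
)
```

Without the annotation on `task`, the execution graph would simply mirror the job graph, which is why the quality of the result depends on what the user supplies.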
8. Pegasus: A framework for mapping
complex scientific workflows onto
distributed systems
By: Ewa Deelman, Gurmeet Singh, Mei-Hui Su, James Blythe, Yolanda Gil, Carl
Kesselman, Gaurang Mehta, Karan Vahi, G. Bruce Berriman, John Good, Anastasia
Laity, Joseph C. Jacob and Daniel S. Katz
University of Southern California Information Sciences Institute, CA, USA
Infrared Processing and Analysis Center, Jet Propulsion Laboratory, Caltech, USA
Published in: Scientific Programming, Volume 13, Issue 3, July 2005
IOS Press, Amsterdam, The Netherlands
9. Before Starting…
✓ A workflow can capture the behavior of an
application (abstract and concrete).
✓ At the application level, workflows are abstract:
they describe application components and
their dependencies.
✓ This simplifies the application development process (+).
✓ A concrete workflow describes the resources that
would be used to execute specific tasks.
10. Pegasus: What for?
✓ Describes how the process of mapping an abstract workflow
to an executable workflow can be automated.
✓ Assumptions:
1. The application is already represented in abstract
workflow form.
2. Data does not specify particular resources to be used.
✓ The scheduling horizon encompasses the tasks that can be
sent to the execution system.
✓ The mapping horizon indicates how far into the
workflow tasks are mapped.
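The two horizons can be illustrated on a small workflow. The sketch below rests on assumed definitions: the scheduling horizon is taken to be the unfinished tasks whose parents have all completed, and the mapping horizon is taken to be the tasks within `depth` dependency levels of that frontier. Both function names and the `depth` parameter are hypothetical and are not Pegasus code.

```python
def scheduling_horizon(parents, done):
    """Unfinished tasks whose parents have all completed (assumed definition)."""
    return sorted(
        t for t, ps in parents.items()
        if t not in done and all(p in done for p in ps)
    )


def mapping_horizon(parents, done, depth):
    """Tasks within `depth` levels of the ready frontier (assumed definition).

    depth=1 yields just the scheduling horizon; larger values map further
    into the workflow.
    """
    mapped = set()
    treated_done = set(done)
    for _ in range(depth):
        level = scheduling_horizon(parents, treated_done)
        if not level:
            break
        mapped.update(level)
        # Pretend this level has finished in order to look one level further.
        treated_done.update(level)
    return sorted(mapped)


# Diamond-shaped workflow: a -> (b, c) -> d
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
```

With `a` finished, `scheduling_horizon(parents, {"a"})` gives `b` and `c`, while `mapping_horizon(parents, {"a"}, 2)` also includes `d`. A small `depth` defers mapping decisions until more is known about the resources; a large one maps the whole workflow up front.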
11. Horizons & Costs
Mapping depends on which
specific resources can execute
specific tasks, as well as on data
locality.
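A data-locality-aware mapping decision can be sketched as follows. The cost model is an assumption made for illustration (a flat cost per input file that has to be moved to the chosen site); the names `map_task`, `replicas`, and `transfer_cost` are hypothetical and do not reflect how Pegasus computes costs.

```python
def map_task(task_inputs, replicas, sites, transfer_cost=1.0):
    """Pick the site that needs the fewest input files staged in.

    task_inputs:   logical file names the task reads
    replicas:      dict file -> set of sites already holding a copy
    sites:         candidate execution sites
    transfer_cost: hypothetical flat cost per file that must be moved
    """
    def stage_in_cost(site):
        missing = [f for f in task_inputs if site not in replicas.get(f, set())]
        return transfer_cost * len(missing)

    # min() keeps the first site on ties, so the order of `sites` breaks ties.
    best = min(sites, key=stage_in_cost)
    return best, stage_in_cost(best)


replicas = {"f1": {"siteA"}, "f2": {"siteA", "siteB"}, "f3": {"siteB"}}
site, cost = map_task(["f1", "f2"], replicas, ["siteB", "siteA"])
```

Here the task is mapped to `siteA`, which already holds both inputs, even though `siteB` is listed first. A task reading only `f3` would go to `siteB` instead.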