Bridging Concepts and Practice in eScience via Simulation-driven Engineering
Rafael Ferreira da Silva (1), Henri Casanova (2), Ryan Tanaka (2), Frédéric Suter (3)
https://wrench-project.org
(1) USC Information Sciences Institute, Marina del Rey, CA, USA
(2) Information and Computer Sciences, University of Hawaii, Honolulu, HI, USA
(3) IN2P3 Computing Center, CNRS, Villeurbanne, France
Disconnect between theoretical and practical works
Theoreticians produce
results that are never used
by practitioners
Practitioners use approaches that
may be vastly suboptimal because
they are not informed by any theory
One of the reasons for this disconnect is that theoretical work must be
done using formally defined models of computation
Ideally, these models are complete enough to be relevant to practice, but
simple enough that obtaining theoretical results (e.g., optimality results,
complexity bounds) is tractable
Real-world experiments are limited
One is limited to particular platform configurations (and sub-configurations)
How can “what if?” scenarios be explored?
How can generality be claimed?
One is limited by specifics of the software infrastructure that impose
constraints on cyberinfrastructure (CI) application executions
Modifying complex software stacks (often written by others) just to test out ideas is not
feasible
In the end, the scope of real-world experiments is limited, which impedes
progress / discovery
Simulation
When one works in an experimental field in which experiments are
problematic, one resorts to simulation
Physicists understood this decades ago :)
In some fields of Computer Science simulation is a standard research and
development methodology
e.g., Networking, Computer Architecture
Several simulators and simulation frameworks have been developed for
parallel and distributed computing
Some of them developed explicitly for workflows
Simulation-driven engineering life cycle
[Diagram: research idea, design of research solution, experimental simulation, evaluation of simulation results, implementation onto CI platform, research product. An arrow labeled "unsatisfactory results" loops back from the evaluation step. A second path goes from design of CI simulator to accurate CI simulator.]
The ability to define
parameterizable services is key for
developing accurate CI simulators,
from which research products
evaluated via experimental
simulation could be seamlessly
integrated into actual CI platforms
The SimGrid framework
SimGrid is a research project
Development of simulation models of hardware/software stacks
Models are accurate (validated/invalidated) and scalable (low computational complexity, low
memory footprint)
SimGrid is open-source, usable software
Provides different APIs for a range of simulation needs, e.g.:
S4U: General simulation of Concurrent Sequential Processes
SMPI: Fine-grained simulation of MPI applications
SimGrid is a versatile scientific instrument
Used for (combinations of) Grid, HPC, Peer-to-Peer, Cloud, Fog simulation projects
First developed in 2000, latest release: v3.23.2 (July 2019)
https://simgrid.org
SimGrid’s philosophy
SimGrid’s philosophy: provide low-level abstractions
Advantage: you can do anything with it
Drawback: implementing a simulation of a complex system is a lot of work
Critical analysis:
[Kecskemeti et al.’14] pinpoints exactly the above trade-off:
"SimGrid is more scalable and validated than competing frameworks, but just too much
work when wanting to simulate a WMS that interacts with CI components"
The WRENCH simulation framework
Objective #1: Make it easy to develop simulators of complex CI
application executions
Done by providing high-level, reusable simulation abstractions
Objective #2: Produce accurate and scalable simulations
Done by building on SimGrid
Let’s look at an example system one can simulate with WRENCH…
System to simulate
WRENCH core services
Simulation core
All necessary simulation models and base
abstractions (computing, communicating,
storing), provided by SimGrid
Simulated core CI services
Abstractions for simulated CI
components to execute
computational workloads
Compute Services
Provide mechanisms for
executing application tasks,
which entail I/O and
computation
Types shown: cloud, bare-metal, virtualized cluster, batch-scheduled cluster
Storage Services
Store application files, which
can then be accessed in
reading/writing by the
compute services when
executing tasks that
read/write files
File Registry Services
Databases of key-value pairs
of storage services and file
replicas
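A file registry of this kind can be pictured as a small key-value store. The sketch below is a toy illustration only, not WRENCH's API: the class name `FileRegistry`, its methods, and the file and storage service names are all hypothetical.

```python
from collections import defaultdict


class FileRegistry:
    """Toy key-value registry: file name -> storage services holding a replica.

    Hypothetical illustration; this is not the WRENCH file registry API.
    """

    def __init__(self):
        self._replicas = defaultdict(set)

    def add_entry(self, file_name, storage_service):
        # Record that storage_service holds a replica of file_name.
        self._replicas[file_name].add(storage_service)

    def remove_entry(self, file_name, storage_service):
        # Forget one replica; drop the key once no replica is left.
        self._replicas[file_name].discard(storage_service)
        if not self._replicas[file_name]:
            del self._replicas[file_name]

    def lookup(self, file_name):
        # Sorted list of known replica locations (empty if the file is unknown).
        return sorted(self._replicas.get(file_name, ()))


registry = FileRegistry()
registry.add_entry("input.dat", "storage_A")
registry.add_entry("input.dat", "storage_B")
print(registry.lookup("input.dat"))
```

A workflow management system could query such a registry to decide which replica to read from before launching a task.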
Network Proximity Services
Monitor the network and
maintain a database of
host-to-host network distances
Workflow Management
System
Provides the mechanisms for
executing workflow
applications, including
decision-making for
optimizing various objectives
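To make "decision-making for optimizing various objectives" concrete, here is a minimal sketch of a simulated scheduler, assuming identical cores, known task runtimes, and an acyclic task graph. It is a toy discrete-event loop written for illustration; it does not use WRENCH or SimGrid, and the function name, task names, and runtimes are hypothetical.

```python
import heapq


def simulate_makespan(runtimes, parents, num_cores, priority):
    """Simulate list scheduling of a task graph on identical cores.

    runtimes: dict task -> runtime in seconds
    parents:  dict task -> list of tasks that must finish first
    priority: key function ordering the ready tasks (the scheduling decision)
    Assumes num_cores >= 1 and an acyclic graph. Returns the makespan in seconds.
    """
    remaining = {t: len(parents.get(t, [])) for t in runtimes}
    children = {t: [] for t in runtimes}
    for task, deps in parents.items():
        for dep in deps:
            children[dep].append(task)

    ready = [t for t in runtimes if remaining[t] == 0]
    running = []  # min-heap of (finish_time, task)
    now = 0.0
    free_cores = num_cores
    done = 0

    while done < len(runtimes):
        # Start as many ready tasks as there are free cores, best priority first.
        ready.sort(key=priority)
        while ready and free_cores > 0:
            task = ready.pop(0)
            heapq.heappush(running, (now + runtimes[task], task))
            free_cores -= 1
        # Advance simulated time to the next task completion.
        now, finished = heapq.heappop(running)
        free_cores += 1
        done += 1
        for child in children[finished]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    return now


# Hypothetical workflow: task "d" depends on "a", "b", and "c".
runtimes = {"a": 2, "b": 3, "c": 4, "d": 1}
parents = {"d": ["a", "b", "c"]}

shortest_first = simulate_makespan(runtimes, parents, 2, lambda t: runtimes[t])
longest_first = simulate_makespan(runtimes, parents, 2, lambda t: -runtimes[t])
print(shortest_first, longest_first)  # 7.0 6.0
```

Swapping the `priority` function changes the simulated makespan without touching the rest of the simulator, which is the kind of "what if?" exploration that simulation makes cheap.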
WRENCH’s impact on
CI research
Accuracy: the ability to capture the
behavior of a real-world system with
as little bias as possible
Scalability: the ability to simulate
large systems with as few CPU cycles
and bytes of RAM as possible
Empirical cumulative distribution function of task
completion times for sample real-world (“pegasus” and
“workqueue”) and simulated (“wrench”) executions.
Simulation Accuracy and Scalability
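One way to quantify how closely a simulated distribution of task completion times tracks a real one is the largest vertical gap between the two empirical CDFs (the two-sample Kolmogorov-Smirnov statistic). The sketch below is an illustration under that assumption; the slides do not say which metric was used, and the sample values are made up.

```python
def ecdf_at(samples, x):
    """Empirical CDF: fraction of samples that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)


def max_ecdf_gap(real, simulated):
    """Largest vertical distance between the two ECDFs.

    Both ECDFs are step functions that only change at sample values,
    so it is enough to compare them at every observed value.
    """
    points = sorted(set(real) | set(simulated))
    return max(abs(ecdf_at(real, x) - ecdf_at(simulated, x)) for x in points)


# Hypothetical task completion times in seconds (not data from the talk).
real_times = [10, 12, 15, 20]
simulated_times = [11, 12, 16, 19]

gap = max_ecdf_gap(real_times, simulated_times)
print(gap)  # 0.25
```

A gap of 0 means the two distributions coincide at every observed value; a gap of 1 means they do not overlap at all.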
[Figure: two panels plotting power consumption (W) and energy consumption (kWh) against the number of cores (1 to 12), with series "estimation", "real", and "wrench"]
WRENCH’s impact on
CI research
Investigated the impact of resource
utilization and I/O operations on
the energy usage, as well as the
impact of executing multiple tasks
concurrently on multi-socket,
multi-core compute nodes
Comparison of power (left) and energy (right) consumption
measurements for a real-world application (“real”) using a
well-known model from the literature (“estimation”) and
our WRENCH model (“wrench”)
Energy-aware Computing
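The slides do not name the "well-known model from the literature". A common candidate, assumed here purely for illustration, is the linear model in which power grows proportionally with CPU utilization between an idle and a peak value. The sketch below shows that assumed model and the conversion from power to energy; the function names and the 100 W / 200 W figures are hypothetical and are not measurements from the talk.

```python
def linear_power_watts(active_cores, total_cores, p_idle, p_max):
    """Assumed linear model: P = P_idle + (P_max - P_idle) * utilization."""
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    utilization = active_cores / total_cores
    return p_idle + (p_max - p_idle) * utilization


def energy_kwh(power_watts, duration_seconds):
    """Energy for a constant power draw: 1 kWh = 3.6e6 joules (watt-seconds)."""
    return power_watts * duration_seconds / 3.6e6


# Hypothetical 12-core node drawing 100 W when idle and 200 W fully loaded.
power = linear_power_watts(active_cores=6, total_cores=12, p_idle=100.0, p_max=200.0)
energy = energy_kwh(power, duration_seconds=3600)
print(power, energy)
```

A model of this shape ignores I/O activity and the effect of how tasks are spread across sockets, which are the factors the slide above says were investigated.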
WRENCH
Pedagogic Modules
Simulation-driven self-contained
pedagogic modules supported by
WRENCH-based simulators
Activities entail running, through a
Web application, a simulator with
different input parameters
https://wrench-project.org/wrench-pedagogic-modules
Thank You
Questions?
This work is funded by NSF contracts #1642369 and #1642335; by CNRS
under grant #PICS07239; and partly funded by NSF contracts #1923539
and #1923621.
https://wrench-project.org

Contenu connexe

Tendances

Embacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDEmbacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CD
Nebulaworks
 
A multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesA multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker images
Ahmed Zerouali
 
Utility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDUtility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDD
XP Conference India
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
Ash Winter
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020
Len Bass
 
Static Analysis Primer
Static Analysis PrimerStatic Analysis Primer
Static Analysis Primer
Coverity
 
Devops syllabus
Devops syllabusDevops syllabus
Devops syllabus
Len Bass
 
Continuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsContinuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database Objects
Prabhu Ramasamy
 
Adopting Agile
Adopting AgileAdopting Agile
Adopting Agile
Coverity
 
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorLFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
Eric Smalling
 
Kelis king - engineering approach to develop software.
Kelis king -  engineering approach to develop software.Kelis king -  engineering approach to develop software.
Kelis king - engineering approach to develop software.
KelisKing
 
Source Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsSource Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code Scripts
Akond Rahman
 
Static Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoverityStatic Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with Coverity
Samsung Open Source Group
 

Tendances (13)

Embacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDEmbacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CD
 
A multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesA multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker images
 
Utility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDUtility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDD
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020
 
Static Analysis Primer
Static Analysis PrimerStatic Analysis Primer
Static Analysis Primer
 
Devops syllabus
Devops syllabusDevops syllabus
Devops syllabus
 
Continuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsContinuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database Objects
 
Adopting Agile
Adopting AgileAdopting Agile
Adopting Agile
 
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorLFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
 
Kelis king - engineering approach to develop software.
Kelis king -  engineering approach to develop software.Kelis king -  engineering approach to develop software.
Kelis king - engineering approach to develop software.
 
Source Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsSource Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code Scripts
 
Static Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoverityStatic Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with Coverity
 

Similaire à Bridging Concepts and Practice in eScience via Simulation-driven Engineering

Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Rafael Ferreira da Silva
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Rafael Ferreira da Silva
 
Grid Presentation
Grid PresentationGrid Presentation
Grid Presentation
Marielisa Peralta
 
Microkontroler
MicrokontrolerMicrokontroler
Microkontroler
Feira Project
 
01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf
OCRE | Open Clouds for Research Environments
 
Parallex - The Supercomputer
Parallex - The SupercomputerParallex - The Supercomputer
Parallex - The Supercomputer
Ankit Singh
 
AF-2599-P.docx
AF-2599-P.docxAF-2599-P.docx
AF-2599-P.docx
Sami Siddiqui
 
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiDeep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Gaurav Raina
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon Phi
Gaurav Raina
 
Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2
Gaurav Raina
 
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET Journal
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland
mictc
 
GRID COMPUTING
GRID COMPUTINGGRID COMPUTING
GRID COMPUTING
Abhiram Kanigolla
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptx
achakracu
 
Presentation for min project
Presentation for min projectPresentation for min project
Presentation for min project
araya kiros
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
Benoit Combemale
 
Using Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridUsing Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGrid
gwickler
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTING
Amy Roman
 
Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...
WMG centre High Value Manufacturing Catapult
 
Computer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching TrendsComputer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching Trends
yogesh1617
 

Similaire à Bridging Concepts and Practice in eScience via Simulation-driven Engineering (20)

Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
 
Grid Presentation
Grid PresentationGrid Presentation
Grid Presentation
 
Microkontroler
MicrokontrolerMicrokontroler
Microkontroler
 
01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf
 
Parallex - The Supercomputer
Parallex - The SupercomputerParallex - The Supercomputer
Parallex - The Supercomputer
 
AF-2599-P.docx
AF-2599-P.docxAF-2599-P.docx
AF-2599-P.docx
 
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiDeep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon Phi
 
Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2
 
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland
 
GRID COMPUTING
GRID COMPUTINGGRID COMPUTING
GRID COMPUTING
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptx
 
Presentation for min project
Presentation for min projectPresentation for min project
Presentation for min project
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
 
Using Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridUsing Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGrid
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTING
 
Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...
 
Computer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching TrendsComputer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching Trends
 

Plus de Rafael Ferreira da Silva

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
Rafael Ferreira da Silva
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Rafael Ferreira da Silva
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
Rafael Ferreira da Silva
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
Rafael Ferreira da Silva
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Rafael Ferreira da Silva
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
Rafael Ferreira da Silva
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
Rafael Ferreira da Silva
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Rafael Ferreira da Silva
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Rafael Ferreira da Silva
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
Rafael Ferreira da Silva
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Rafael Ferreira da Silva
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Rafael Ferreira da Silva
 
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
Rafael Ferreira da Silva
 
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsLeveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Rafael Ferreira da Silva
 
A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...
Rafael Ferreira da Silva
 
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Rafael Ferreira da Silva
 
On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...
Rafael Ferreira da Silva
 
Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...
Rafael Ferreira da Silva
 
VIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution serviceVIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution service
Rafael Ferreira da Silva
 

Plus de Rafael Ferreira da Silva (20)

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures


Bridging Concepts and Practice in eScience via Simulation-driven Engineering

  • 1. Bridging Concepts and Practice in eScience via Simulation-driven Engineering. Rafael Ferreira da Silva (1), Henri Casanova (2), Ryan Tanaka (2), Frédéric Suter (3). (1) USC Information Sciences Institute, Marina del Rey, CA, USA; (2) Information and Computer Sciences, University of Hawaii, Honolulu, HI, USA; (3) IN2P3 Computing Center, CNRS, Villeurbanne, France. https://wrench-project.org
  • 2. Disconnect between theoretical and practical work. Theoreticians produce results that are never used by practitioners. Practitioners use approaches that may be vastly suboptimal because they are not informed by any theory. One reason for this disconnect is that theoretical work must be done using formally defined models of computation. Ideally, these models are complete enough to be relevant to practice, yet simple enough that obtaining theoretical results (e.g., optimality results, complexity bounds) is tractable.
  • 3. Real-world experiments are limited. One is limited to particular platform configurations (and sub-configurations): how can “what if?” scenarios be explored, and how can generality be claimed? One is also limited by specifics of the software infrastructure that impose constraints on cyberinfrastructure (CI) application executions: modifying complex software stacks (often written by others) just to test out ideas is not feasible. In the end, the scope of real-world experiments is limited, which impedes progress and discovery.
  • 4. Simulation. When one works in an experimental field in which experiments are problematic, one resorts to simulation. Physicists understood this decades ago :) In some fields of computer science (e.g., networking, computer architecture), simulation is a standard research and development methodology. Several simulators and simulation frameworks have been developed for parallel and distributed computing, some of them explicitly for workflows.
  • 5. Simulation-driven engineering life cycle. [Diagram; labels as they appear: experimental simulation, research idea, evaluation of simulation results, research product, implementation onto CI platform, design of research solution, unsatisfactory results, accurate CI simulator, design of CI simulator.] The ability to define parameterizable services is key for developing accurate CI simulators, from which research products evaluated via experimental simulation could be seamlessly integrated into actual CI platforms.
  • 6. The SimGrid framework. SimGrid is a research project: it develops simulation models of hardware/software stacks, and these models are accurate (validated/invalidated) and scalable (low computational complexity, low memory footprint). SimGrid is open-source, usable software: it provides different APIs for a range of simulation needs, e.g., S4U (general simulation of Concurrent Sequential Processes) and SMPI (fine-grained simulation of MPI applications). SimGrid is a versatile scientific instrument: it is used for (combinations of) Grid, HPC, Peer-to-Peer, Cloud, and Fog simulation projects. First developed in 2000; latest release: v3.23.2 (July 2019). https://simgrid.org
  • 7. SimGrid’s philosophy: provide low-level abstractions. Advantage: you can do anything with it. Drawback: implementing a simulation of a complex system is a lot of work. Critical analysis: [Kecskemeti et al.’14] pinpoints exactly this trade-off: "SimGrid is more scalable and validated than competing frameworks, but just too much work when wanting to simulate a WMS that interacts with CI components". https://simgrid.org
  • 8. The WRENCH simulation framework. Objective #1: make it easy to develop simulators of complex CI application executions, by providing high-level, reusable simulation abstractions. Objective #2: produce accurate and scalable simulations, by building on SimGrid. Let’s look at an example system one can simulate with WRENCH… https://wrench-project.org
  • 10. WRENCH core services. Simulation core: all necessary simulation models and base abstractions (computing, communicating, storing), provided by SimGrid. Simulated core CI services: abstractions for simulated CI components to execute computational workloads. Compute Services: provide mechanisms for executing application tasks, which entail I/O and computation (cloud, bare-metal, virtualized cluster, batch-scheduled cluster). Storage Services: store application files, which the compute services can then read and write when executing tasks. File Registry Services: databases of key-value pairs of storage services and file replicas. Network Proximity Services: monitor the network and maintain a database of host-to-host network distances. Workflow Management System: provides the mechanisms for executing workflow applications, including decision-making for optimizing various objectives.
  • 11. WRENCH’s impact on CI research: simulation accuracy and scalability. Accuracy: the ability to capture the behavior of a real-world system with as little bias as possible. Scalability: the ability to simulate large systems with as few CPU cycles and bytes of RAM as possible. [Figure: empirical cumulative distribution function of task completion times for sample real-world (“pegasus” and “workqueue”) and simulated (“wrench”) executions.]
  • 12. WRENCH’s impact on CI research: energy-aware computing. Investigated the impact of resource utilization and I/O operations on energy usage, as well as the impact of executing multiple tasks concurrently on multi-socket, multi-core compute nodes. [Figure: power consumption in W (left) and energy consumption in kWh (right) versus number of cores (1 to 12), comparing measurements for a real-world application (“real”), a well-known model from the literature (“estimation”), and our WRENCH model (“wrench”).]
  • 13. WRENCH Pedagogic Modules. Simulation-driven, self-contained pedagogic modules supported by WRENCH-based simulators. Activities entail running, through a Web application, a simulator with different input parameters. https://wrench-project.org/wrench-pedagogic-modules
  • 14. Thank you. Questions? This work is funded by NSF contracts #1642369 and #1642335; by CNRS under grant #PICS07239; and partly funded by NSF contracts #1923539 and #1923621. https://wrench-project.org
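
Slide 11 judges simulation accuracy by comparing empirical cumulative distribution functions (ECDFs) of task completion times from real and simulated executions. As a minimal sketch of what such a comparison involves, the Python below builds an ECDF and reports the largest vertical gap between two of them (the two-sample Kolmogorov-Smirnov statistic). The choice of this statistic is an assumption made here for illustration, not something the slides state, and the sample times are invented.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return F, where F(t) is the fraction of samples that are <= t."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def F(t):
        # bisect_right counts how many sorted values are <= t.
        return bisect_right(ordered, t) / n

    return F


def max_ecdf_gap(a, b):
    """Largest vertical distance between the ECDFs of a and b.

    Both ECDFs are step functions that only change at sample values,
    so checking the sample values is enough to find the maximum gap.
    """
    Fa, Fb = ecdf(a), ecdf(b)
    return max(abs(Fa(t) - Fb(t)) for t in set(a) | set(b))


# Hypothetical task completion times in seconds (invented for illustration).
real_times = [10.0, 12.0, 15.0, 20.0]
simulated_times = [11.0, 12.0, 16.0, 19.0]

gap = max_ecdf_gap(real_times, simulated_times)
print(f"largest ECDF gap: {gap}")
```

A gap near 0 means the simulated distribution closely tracks the real one; a gap near 1 means the two distributions barely overlap.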
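
Slide 12 compares measured power and energy consumption with model estimates. As a rough illustration of how an estimate of this kind can be computed, the Python sketch below uses a linear utilization-based power model and integrates power samples over time with the trapezoidal rule. The linear model, the idle and peak wattages, and the function names are assumptions made for illustration; they are not taken from the slides and this is not WRENCH's energy model.

```python
def linear_power(utilization, p_idle, p_max):
    """Power draw in watts, assuming it grows linearly with utilization.

    utilization is the busy fraction of the node, between 0.0 and 1.0.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return p_idle + (p_max - p_idle) * utilization


def energy_kwh(samples):
    """Integrate (time_in_seconds, power_in_watts) samples into kWh.

    Uses the trapezoidal rule between consecutive samples, which must
    be sorted by time.
    """
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += 0.5 * (p0 + p1) * (t1 - t0)
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


# Hypothetical node: 12 cores, 6 of them busy for one hour.
P_IDLE_W = 140.0  # assumed idle power
P_MAX_W = 200.0   # assumed power at full utilization

power = linear_power(6 / 12, P_IDLE_W, P_MAX_W)
energy = energy_kwh([(0.0, power), (3600.0, power)])
print(f"power: {power} W, energy: {energy:.2f} kWh")
```

A model this simple ignores I/O and the effect of co-locating tasks on multi-socket nodes, which are the factors slide 12 says were investigated.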