SlideShare une entreprise Scribd logo
1  sur  14
Télécharger pour lire hors ligne
Bridging Concepts and Practice in
eScience via Simulation-driven
Engineering
Rafael Ferreira da Silva1, Henri Casanova2, Ryan Tanaka2, Frédéric Suter3
https://wrench-project.org
1 USC Information Sciences Institute, Marina del Rey, CA, USA
2 Information and Computer Sciences, University of Hawaii, Honolulu, HI, USA
3 IN2P3 Computing Center, CNRS, Villeurbanne, France
#1
Disconnect between theoretical and practical works
Theoreticians produce
results that are never used
by practitioners
2
Practitioners use approaches that
may be vastly suboptimal because
they are not informed by any theory
One of the reasons for this disconnect is that theoretical work must be
done using formally defined models of computation
Ideally, these models are complete enough to be relevant to practice, but
simple enough that obtaining theoretical results (e.g., optimality results,
complexity bounds) is tractable
Real-world experiments are limited
One is limited to particular platform configurations (and sub-configurations)
How can “what if?” scenarios be explored?
How can generality be claimed?
One is limited by specifics of the software infrastructure that impose
constraints on CI application executions
Modifying complex software stacks (often written by others) just to test out ideas is not
feasible
In the end, the scope of real-world experiments is limited, which impedes
progress / discovery
3
Simulation
When one works in an experimental field in which experiments are
problematic, one resorts to simulation
Physicists have understood this decades ago :)
In some fields of Computer Science simulation is a standard research and
development methodology
e.g., Networking, Computer Architecture
Several simulators and simulation frameworks have been developed for
parallel and distributed computing
Some of them developed explicitly for workflows
4
Simulation-driven
engineering life cycle
Experimental
simulation
Research
idea
Evaluation of
simulation
results
Research
product
Implementation
onto CI platform
Design of
research
solution
unsatisfactory results
Accurate CI
simulator
Design of CI
simulator
5
The ability to define
parameterizable services is key for
developing accurate CI simulators,
from which research products
evaluated via experimental
simulation could be seamlessly
integrated into actual CI platforms
The SimGrid framework
SimGrid is a research project
Development of simulation models of hardware/software stacks
Models are accurate (validated/invalidated) and scalable (low computational complexity, low
memory footprint)
SimGrid is open source usable software
Provides different APIs for a range of simulation needs, e.g.:
S4U: General simulation of Concurrent Sequential Processes
SMPI: Fine-grained simulation of MPI applications
SimGrid is versatile scientific instrument
Used for (combinations of) Grid, HPC, Peer-to-Peer, Cloud, Fog simulation projects
First developed in 2000, latest release: v3.23.2 (July 2019)
6
https://simgrid.org
SimGrid’s philosophy
SimGrid’s philosophy: provide low-level abstractions
Advantage: you can do anything with it
Drawback: implementing a simulation of a complex system is a lot of work
Critical analysis:
In [Kecskemeti et al.’14] pinpoints exactly the above trade-off:
"SimGrid is more scalable and validated than competing frameworks, but just too much
work when wanting to simulate a WMS that interacts with CI components"
7
https://simgrid.org
The WRENCH simulation framework
Objective #1: Make it easy to develop simulators of complex CI
application executions
Done by providing high-level, reusable simulation abstractions
Objective #2: Produce accurate and scalable simulations
Done by building on SimGrid
Let’s look at an example system one can simulate with WRENCH…
8
wrench-project.org
System to simulate
9
WRENCH core services
10
Simulation core
All necessary simulation models and base
abstractions (computing, communicating,
storing), provided by SimGrid
Simulated core CI services
Abstractions for simulated CI
components to execute
computational workloads
Compute Services
Provide mechanisms for
executing application tasks,
which entail I/O and
computation
cloud
bare-metal virtualized cluster
batch-scheduled cluster
Storage Services
Store application files, which
can then be accessed in
reading/writing by the
compute services when
executing tasks that
read/write files
File Registry Services
Databases of key-value pairs
of storage services and files
replicas
Network Proximity Services
monitor the network and
maintain a database of host-
to-host network distances
Workflow Management
System
Provides the mechanisms for
executing workflow
applications, including
decision-making for
optimizing various objectives
WRENCH’s impact on
CI research
Accuracy: the ability to capture the
behavior of a real-world system with
as little bias as possible
Scalability: the ability to simulate
large systems with as few CPU cycles
and bytes of RAM as possible
11
Empirical cumulative distribution function of task
completion times for sample real-world (“pegasus” and
“workqueue”) and simulated (“wrench”) executions.
Simulation Accuracy and Scalability
●
●
●
●
●
●
●
●
●
●
●
●
140
160
180
200
1 2 3 4 5 6 7 8 9 10 11 12
# cores
PowerConsumption(W)
● estimation real wrench
●
●
●
●
●
●
●
●
●
●
●
●
0.1
0.2
0.3
1 2 3 4 5 6 7 8 9 10 11 12
# cores
EnergyConsumption(KWh)
● estimation real wrench
WRENCH’s impact on
CI research
Investigated the impact of resource
utilization and I/O operations on
the energy usage, as well as the
impact of executing multiple tasks
concurrently on multi-socket,
multi-core compute nodes
12
Comparison of power (left) and energy (right) consumption
measurements for a real-world application (“real”) using a
well-known model from the literature (“estimation”) and
our WRENCH model (“wrench”)
Energy-aware Computing
WRENCH
Pedagogic Modules
Simulation-driven self-contained
pedagogic modules supported by
WRENCH-based simulators
Activities entail running, through a
Web application, a simulator with
different input parameters
13
https://wrench-project.org/wrench-pedagogic-modules
Thank You
Questions?
14
This work is funded by NSF contracts #1642369 and #1642335; by CNRS
under grant #PICS07239; and partly funded by NSF contracts #1923539
and #1923621.
https://wrench-project.org

Contenu connexe

Tendances

Embacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDEmbacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDNebulaworks
 
A multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesA multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesAhmed Zerouali
 
Utility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDUtility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDXP Conference India
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the ApplicationAsh Winter
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020Len Bass
 
Static Analysis Primer
Static Analysis PrimerStatic Analysis Primer
Static Analysis PrimerCoverity
 
Devops syllabus
Devops syllabusDevops syllabus
Devops syllabusLen Bass
 
Continuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsContinuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsPrabhu Ramasamy
 
Adopting Agile
Adopting AgileAdopting Agile
Adopting AgileCoverity
 
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorLFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorEric Smalling
 
Kelis king - engineering approach to develop software.
Kelis king -  engineering approach to develop software.Kelis king -  engineering approach to develop software.
Kelis king - engineering approach to develop software.KelisKing
 
Source Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsSource Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsAkond Rahman
 
Static Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoverityStatic Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoveritySamsung Open Source Group
 

Tendances (13)

Embacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CDEmbacing service-level-objectives of your microservices in your Cl/CD
Embacing service-level-objectives of your microservices in your Cl/CD
 
A multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker imagesA multi-dimensional analysis of technical lag in Debian-based Docker images
A multi-dimensional analysis of technical lag in Debian-based Docker images
 
Utility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDDUtility of Test Coverage Metrics in TDD
Utility of Test Coverage Metrics in TDD
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020
 
Static Analysis Primer
Static Analysis PrimerStatic Analysis Primer
Static Analysis Primer
 
Devops syllabus
Devops syllabusDevops syllabus
Devops syllabus
 
Continuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database ObjectsContinuous Integration - Oracle Database Objects
Continuous Integration - Oracle Database Objects
 
Adopting Agile
Adopting AgileAdopting Agile
Adopting Agile
 
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your doorLFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
LFX Nov 16, 2021 - Find vulnerabilities before security knocks on your door
 
Kelis king - engineering approach to develop software.
Kelis king -  engineering approach to develop software.Kelis king -  engineering approach to develop software.
Kelis king - engineering approach to develop software.
 
Source Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code ScriptsSource Code Properties of Defective Infrastructure as Code Scripts
Source Code Properties of Defective Infrastructure as Code Scripts
 
Static Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with CoverityStatic Analysis of Your OSS Project with Coverity
Static Analysis of Your OSS Project with Coverity
 

Similaire à Bridging Concepts and Practice in eScience via Simulation-driven Engineering

Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Rafael Ferreira da Silva
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Rafael Ferreira da Silva
 
Parallex - The Supercomputer
Parallex - The SupercomputerParallex - The Supercomputer
Parallex - The SupercomputerAnkit Singh
 
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiDeep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiGaurav Raina
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiGaurav Raina
 
Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Gaurav Raina
 
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET Journal
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland mictc
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptxachakracu
 
Presentation for min project
Presentation for min projectPresentation for min project
Presentation for min projectaraya kiros
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...Benoit Combemale
 
Using Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridUsing Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridgwickler
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGAmy Roman
 
Computer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching TrendsComputer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching Trendsyogesh1617
 

Similaire à Bridging Concepts and Practice in eScience via Simulation-driven Engineering (20)

Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
Modeling and Simulation of Parallel and Distributed Computing Systems with Si...
 
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
Running Accurate, Scalable, and Reproducible Simulations of Distributed Syste...
 
Grid Presentation
Grid PresentationGrid Presentation
Grid Presentation
 
Microkontroler
MicrokontrolerMicrokontroler
Microkontroler
 
01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf
 
Parallex - The Supercomputer
Parallex - The SupercomputerParallex - The Supercomputer
Parallex - The Supercomputer
 
AF-2599-P.docx
AF-2599-P.docxAF-2599-P.docx
AF-2599-P.docx
 
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon PhiDeep Convolutional Neural Network acceleration on the Intel Xeon Phi
Deep Convolutional Neural Network acceleration on the Intel Xeon Phi
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon Phi
 
Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2Thesis Report - Gaurav Raina MSc ES - v2
Thesis Report - Gaurav Raina MSc ES - v2
 
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratoryIRJET- Methodologies to the Strategy of Computer Networking Research laboratory
IRJET- Methodologies to the Strategy of Computer Networking Research laboratory
 
Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland Cloud Roundtable at Microsoft Switzerland
Cloud Roundtable at Microsoft Switzerland
 
GRID COMPUTING
GRID COMPUTINGGRID COMPUTING
GRID COMPUTING
 
Lecture_IIITD.pptx
Lecture_IIITD.pptxLecture_IIITD.pptx
Lecture_IIITD.pptx
 
Presentation for min project
Presentation for min projectPresentation for min project
Presentation for min project
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
 
Using Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGridUsing Simulation for Decision Support: Lessons Learned from FireGrid
Using Simulation for Decision Support: Lessons Learned from FireGrid
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTING
 
Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...Deep learning in manufacturing predicting and preventing manufacturing defect...
Deep learning in manufacturing predicting and preventing manufacturing defect...
 
Computer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching TrendsComputer Organisation and Architecture Teaching Trends
Computer Organisation and Architecture Teaching Trends
 

Plus de Rafael Ferreira da Silva

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Rafael Ferreira da Silva
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsRafael Ferreira da Silva
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningRafael Ferreira da Silva
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsRafael Ferreira da Silva
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Rafael Ferreira da Silva
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsRafael Ferreira da Silva
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCRafael Ferreira da Silva
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Rafael Ferreira da Silva
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Rafael Ferreira da Silva
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsRafael Ferreira da Silva
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsRafael Ferreira da Silva
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Rafael Ferreira da Silva
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresRafael Ferreira da Silva
 
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...Rafael Ferreira da Silva
 
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsLeveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsRafael Ferreira da Silva
 
A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...Rafael Ferreira da Silva
 
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...Rafael Ferreira da Silva
 
On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...Rafael Ferreira da Silva
 
Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...Rafael Ferreira da Silva
 
VIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution serviceVIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution serviceRafael Ferreira da Silva
 

Plus de Rafael Ferreira da Silva (20)

Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...Towards an Infrastructure for Enabling Systematic Development and Research of...
Towards an Infrastructure for Enabling Systematic Development and Research of...
 
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific WorkflowsAccurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
Accurately Simulating Energy Consumption of I/O-intensive Scientific Workflows
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
 
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific WorkflowsOn the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows
 
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
 
Automating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific WorkflowsAutomating Environmental Computing Applications with Scientific Workflows
Automating Environmental Computing Applications with Scientific Workflows
 
Analysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTCAnalysis of User Submission Behavior on HPC and HTC
Analysis of User Submission Behavior on HPC and HTC
 
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
Automating Real-time Seismic Analysis Through Streaming and High Throughput W...
 
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
Performance Analysis of an I/O-Intensive Workflow executing on Google Cloud a...
 
Pegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computationsPegasus - automate, recover, and debug scientific computations
Pegasus - automate, recover, and debug scientific computations
 
Task Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and WorkflowsTask Resource Consumption Prediction for Scientific Applications and Workflows
Task Resource Consumption Prediction for Scientific Applications and Workflows
 
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
 
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud InfrastructuresExperiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
Experiments with Complex Scientific Applications on Hybrid Cloud Infrastructures
 
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
A Unified Approach for Modeling and Optimization of Energy, Makespan and Reli...
 
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific WorkflowsLeveraging Semantics to Improve Reproducibility in Scientific Workflows
Leveraging Semantics to Improve Reproducibility in Scientific Workflows
 
A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...A science-gateway for workflow executions: online and non-clairvoyant self-h...
A science-gateway for workflow executions: online and non-clairvoyant self-h...
 
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Work...
 
On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...On-line, non-clairvoyant optimization of workflow activity granularity task o...
On-line, non-clairvoyant optimization of workflow activity granularity task o...
 
Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...Workflow fairness control on online and non-clairvoyant distributed computing...
Workflow fairness control on online and non-clairvoyant distributed computing...
 
VIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution serviceVIP: design and implementation of the portal and execution service
VIP: design and implementation of the portal and execution service
 

Dernier

WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxJennifer Lim
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Patrick Viafore
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024TopCSSGallery
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...CzechDreamin
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024Stephanie Beckett
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty SecureFemke de Vroome
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoTAnalytics
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomCzechDreamin
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCzechDreamin
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FIDO Alliance
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsStefano
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlPeter Udo Diehl
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfFIDO Alliance
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...CzechDreamin
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
Strategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering TeamsStrategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering TeamsUXDXConf
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Julian Hyde
 

Dernier (20)

WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty Secure
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Strategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering TeamsStrategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering Teams
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 

Bridging Concepts and Practice in eScience via Simulation-driven Engineering

  • 1. Bridging Concepts and Practice in eScience via Simulation-driven Engineering Rafael Ferreira da Silva1, Henri Casanova2, Ryan Tanaka2, Frédéric Suter3 https://wrench-project.org 1 USC Information Sciences Institute, Marina del Rey, CA, USA 2 Information and Computer Sciences, University of Hawaii, Honolulu, HI, USA 3 IN2P3 Computing Center, CNRS, Villeurbanne, France #1
  • 2. Disconnect between theoretical and practical works Theoreticians produce results that are never used by practitioners 2 Practitioners use approaches that may be vastly suboptimal because they are not informed by any theory One of the reasons for this disconnect is that theoretical work must be done using formally defined models of computation Ideally, these models are complete enough to be relevant to practice, but simple enough that obtaining theoretical results (e.g., optimality results, complexity bounds) is tractable
  • 3. Real-world experiments are limited One is limited to particular platform configurations (and sub-configurations) How can “what if?” scenarios be explored? How can generality be claimed? One is limited by specifics of the software infrastructure that impose constraints on CI application executions Modifying complex software stacks (often written by others) just to test out ideas is not feasible In the end, the scope of real-world experiments is limited, which impedes progress / discovery 3
  • 4. Simulation When one works in an experimental field in which experiments are problematic, one resorts to simulation Physicists have understood this decades ago :) In some fields of Computer Science simulation is a standard research and development methodology e.g., Networking, Computer Architecture Several simulators and simulation frameworks have been developed for parallel and distributed computing Some of them developed explicitly for workflows 4
  • 5. Simulation-driven engineering life cycle Experimental simulation Research idea Evaluation of simulation results Research product Implementation onto CI platform Design of research solution unsatisfactory results Accurate CI simulator Design of CI simulator 5 The ability to define parameterizable services is key for developing accurate CI simulators, from which research products evaluated via experimental simulation could be seamlessly integrated into actual CI platforms
  • 6. The SimGrid framework SimGrid is a research project Development of simulation models of hardware/software stacks Models are accurate (validated/invalidated) and scalable (low computational complexity, low memory footprint) SimGrid is open source usable software Provides different APIs for a range of simulation needs, e.g.: S4U: General simulation of Concurrent Sequential Processes SMPI: Fine-grained simulation of MPI applications SimGrid is versatile scientific instrument Used for (combinations of) Grid, HPC, Peer-to-Peer, Cloud, Fog simulation projects First developed in 2000, latest release: v3.23.2 (July 2019) 6 https://simgrid.org
  • 7. SimGrid’s philosophy SimGrid’s philosophy: provide low-level abstractions Advantage: you can do anything with it Drawback: implementing a simulation of a complex system is a lot of work Critical analysis: In [Kecskemeti et al.’14] pinpoints exactly the above trade-off: "SimGrid is more scalable and validated than competing frameworks, but just too much work when wanting to simulate a WMS that interacts with CI components" 7 https://simgrid.org
  • 8. The WRENCH simulation framework Objective #1: Make it easy to develop simulators of complex CI application executions Done by providing high-level, reusable simulation abstractions Objective #2: Produce accurate and scalable simulations Done by building on SimGrid Let’s look at an example system one can simulate with WRENCH… 8 wrench-project.org
  • 10. WRENCH core services 10 Simulation core All necessary simulation models and base abstractions (computing, communicating, storing), provided by SimGrid Simulated core CI services Abstractions for simulated CI components to execute computational workloads Compute Services Provide mechanisms for executing application tasks, which entail I/O and computation cloud bare-metal virtualized cluster batch-scheduled cluster Storage Services Store application files, which can then be accessed in reading/writing by the compute services when executing tasks that read/write files File Registry Services Databases of key-value pairs of storage services and files replicas Network Proximity Services monitor the network and maintain a database of host- to-host network distances Workflow Management System Provides the mechanisms for executing workflow applications, including decision-making for optimizing various objectives
  • 11. WRENCH’s impact on CI research Accuracy: the ability to capture the behavior of a real-world system with as little bias as possible Scalability: the ability to simulate large systems with as few CPU cycles and bytes of RAM as possible 11 Empirical cumulative distribution function of task completion times for sample real-world (“pegasus” and “workqueue”) and simulated (“wrench”) executions. Simulation Accuracy and Scalability
  • 12. ● ● ● ● ● ● ● ● ● ● ● ● 140 160 180 200 1 2 3 4 5 6 7 8 9 10 11 12 # cores PowerConsumption(W) ● estimation real wrench ● ● ● ● ● ● ● ● ● ● ● ● 0.1 0.2 0.3 1 2 3 4 5 6 7 8 9 10 11 12 # cores EnergyConsumption(KWh) ● estimation real wrench WRENCH’s impact on CI research Investigated the impact of resource utilization and I/O operations on the energy usage, as well as the impact of executing multiple tasks concurrently on multi-socket, multi-core compute nodes 12 Comparison of power (left) and energy (right) consumption measurements for a real-world application (“real”) using a well-known model from the literature (“estimation”) and our WRENCH model (“wrench”) Energy-aware Computing
  • 13. WRENCH Pedagogic Modules Simulation-driven self-contained pedagogic modules supported by WRENCH-based simulators Activities entail running, through a Web application, a simulator with different input parameters 13 https://wrench-project.org/wrench-pedagogic-modules
  • 14. Thank You Questions? 14 This work is funded by NSF contracts #1642369 and #1642335; by CNRS under grant #PICS07239; and partly funded by NSF contracts #1923539 and #1923621. https://wrench-project.org