1. Using Simulation for Decision Support:
Lessons Learned from FireGrid
Gerhard Wickler (1), George Beckett (2), Liangxiu Han (3), Sung Han Koo (4),
Stephen Potter (1), Gavin Pringle (2), Austin Tate (1)
(1) AIAI, (2) EPCC, (3) NeSC, (4) SEE,
University of Edinburgh, United Kingdom
www.ed.ac.uk
g.wickler@ed.ac.uk
Intelligent Systems
@ ISCRAM 2009
2. FireGrid
[Diagram labels: 1000s of sensors; Grid; computational models (HPC); super-real-time simulation; I-X Technologies; command-and-control; emergency responders]
5. FireGrid Final Experiment:
User Interface
3D schematized overview
of relevant locations
for each location:
– double traffic light
(current/future hazard
level) per location
– time-line window on
demand
» time slider
» hazard points
» beliefs with
justifications
» link for more
information
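The double traffic light can be read as a pair of hazard levels, one for the current state and one for the predicted state. A minimal sketch of that mapping follows; the temperature thresholds and the names `Hazard`, `hazard_level` and `double_traffic_light` are hypothetical and chosen only for illustration, not taken from FireGrid.

```python
from enum import Enum


class Hazard(Enum):
    GREEN = "green"
    AMBER = "amber"
    RED = "red"


def hazard_level(temperature_c, amber_at=60.0, red_at=150.0):
    """Map a temperature to a hazard level (thresholds are illustrative assumptions)."""
    if temperature_c >= red_at:
        return Hazard.RED
    if temperature_c >= amber_at:
        return Hazard.AMBER
    return Hazard.GREEN


def double_traffic_light(current_c, predicted_c):
    """Return (current hazard, future hazard) for one location."""
    return hazard_level(current_c), hazard_level(predicted_c)
```

A location that is safe now but predicted to become dangerous would show green next to red, for example `double_traffic_light(35.0, 180.0)`.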
6. Lessons Learned: Overview
[Diagram labels: sensor data acquisition; simulation software; model; HPC / Grid; interpretation]
question: can we re-apply the FireGrid approach in a different
scenario, e.g. FloodGrid, QuakeGrid, PandemicGrid, etc.?
lessons learned structured according to data flow:
– data acquisition from sensors
– high-performance computing (HPC)
– the Grid
– models and simulation
– intelligent decision support
7. Data Acquisition from Sensors:
Overview
aim: collect raw data from available sensors
experiment: ca. 140 sensors of different types
(mostly thermocouples) used
caveats for lessons learned:
– sensors used were simple: single quantity at
specific location; no image data used/analysed
– sensors were pre-installed: exact number and
location known; may not be possible in other
scenarios (e.g. oil spill)
8. Data Acquisition from Sensors:
Lessons Learned (1)
Is all the data required by the models actually available?
– problem: models may demand inputs that cannot be
measured realistically, e.g. location of furniture, heat release
rates over time
– problem: number and location of sensors, e.g. centre of room
not practical
Can the sensor data be channelled to and processed by the
simulator?
– problem: data logger is set up to write to file, e.g. when aim
is post-experimental data analysis
– problem: data is in proprietary format, e.g. to protect
commercial interest
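When the data logger can only write to a file, one workaround is to poll that file and forward whatever has been appended since the last poll. The sketch below assumes a plain-text log with one `sensor_id,timestamp,value` record per line; that record format and the function names are assumptions for illustration, and a proprietary format would need its own parser.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (complete lines appended since byte offset, new offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so a half-written record
    # is left in place for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


def parse_reading(line):
    """Parse a hypothetical 'sensor_id,timestamp,value' record."""
    sensor_id, timestamp, value = line.split(",")
    return sensor_id, float(timestamp), float(value)


def demo():
    """Simulate a logger that has flushed one record and half of the next."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w") as log:
            log.write("tc1,0.0,21.5\ntc2,0.0,2")
        lines, offset = read_new_lines(path, 0)
        first = [parse_reading(line) for line in lines]
        with open(path, "a") as log:
            log.write("2.0\n")  # the logger finishes the second record
        lines, offset = read_new_lines(path, offset)
        second = [parse_reading(line) for line in lines]
        return first, second
    finally:
        os.remove(path)
```

The first poll yields only the complete record for `tc1`; the record for `tc2` is picked up by the second poll once it has been written in full.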
9. Data Acquisition from Sensors:
Lessons Learned (2)
At what frequency can sensor values be expected?
– not a problem in FireGrid
– problem: sensor readings not synchronized
Is there an ontology that describes the required sensor
types?
– problem: design database to hold sensor readings
Is there a reliable way of grading the sensor output?
– problem: failing or dislocated sensors give incorrect readings
resulting in poor predictions
» sensor grading: decide which sensor readings are to be
believed
» developed a constraint-based algorithm that results in a
consistent picture (minimize violated constraints)
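The idea of sensor grading by minimizing violated constraints can be illustrated with a toy example. The sketch below is not the FireGrid algorithm, whose details are not given here; it assumes a single kind of constraint (two neighbouring sensors should agree within a tolerance) and uses a simple greedy heuristic that stops believing the sensor involved in the most violations until the remaining picture is consistent.

```python
def violated(constraints, readings, believed):
    """Constraints between two believed sensors whose readings disagree."""
    return [
        (a, b, tol)
        for a, b, tol in constraints
        if a in believed and b in believed
        and abs(readings[a] - readings[b]) > tol
    ]


def grade_sensors(readings, constraints):
    """Return the set of sensors still believed once no constraint is violated."""
    believed = set(readings)
    while True:
        bad = violated(constraints, readings, believed)
        if not bad:
            return believed
        counts = {}
        for a, b, _ in bad:
            counts[a] = counts.get(a, 0) + 1
            counts[b] = counts.get(b, 0) + 1
        # Disbelieve the sensor involved in the most violations;
        # ties are broken by name so the result is deterministic.
        worst = max(sorted(counts), key=counts.get)
        believed.discard(worst)


# Hypothetical readings in degrees Celsius; tc3 disagrees with its neighbours.
readings = {"tc1": 20.0, "tc2": 21.0, "tc3": 400.0, "tc4": 22.0}
constraints = [
    ("tc1", "tc2", 10.0),
    ("tc2", "tc3", 10.0),
    ("tc3", "tc4", 10.0),
    ("tc1", "tc4", 10.0),
]
trusted = grade_sensors(readings, constraints)
```

Here `tc3` is dropped because it violates two constraints. In a real fire a single hot sensor may of course be the correct one, so realistic constraints would need to encode more than neighbour agreement.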
10. High Performance Computing:
Lessons Learned (1)
How fast does the simulation run on a “normal” computer?
– problem: speed-up is expected from using multiple processors, but
linear speed-up is the best case and might still not be sufficient
– problem: current CFD models for fires do not scale well
What is the execution bottleneck for the simulation?
– problem: computational bottleneck may be input/output
operations; using multiple CPUs will not provide
solution
– problem: inter-process communication may slow
down computation
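Why linear speed-up is only the best case can be made concrete with Amdahl's law, a standard model that is not specific to FireGrid: if a fraction of the run time cannot be parallelized (for example input/output), that fraction bounds the achievable speed-up. The sketch below is a rough planning aid under that assumption; the function names and example numbers are illustrative.

```python
import math


def amdahl_speedup(parallel_fraction, processors):
    """Speed-up predicted by Amdahl's law for a given parallel fraction."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / processors)


def processors_needed(parallel_fraction, target_speedup):
    """Smallest processor count reaching the target, or None if unreachable."""
    serial = 1.0 - parallel_fraction
    headroom = 1.0 / target_speedup - serial
    if headroom <= 0.0:
        return None  # the serial part alone already exceeds the time budget
    return max(1, math.ceil(parallel_fraction / headroom))
```

With 90% of the work parallelizable, the speed-up stays below 10 however many processors are used, so a simulation that runs 20 times slower than real time on one processor cannot be made super-real-time by adding processors alone: `processors_needed(0.9, 20)` returns `None`.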
11. High Performance Computing:
Lessons Learned (2)
Is the model implementation suitable for running on a (parallel)
HPC resource?
– problem: domain experts often produce serial code; need to
parallelize the simulation software
– approach: ensemble computing (used in FireGrid)
Can the existing implementation be compiled on the HPC
resource?
– problem: simulator (in Fortran) using non-standard features;
need to port to HPC platform using different compiler and
libraries
How quickly do simulators need to start running?
– problem: batch system causes delay on HPC
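Ensemble computing avoids parallelizing the serial code itself: many independent copies of the serial simulator run side by side, each with different input parameters. The sketch below shows the pattern only. `toy_simulation` is a made-up stand-in for a real simulator, and the thread pool stands in for whatever job submission mechanism the HPC resource offers; both are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_simulation(growth_rate, steps=5, ambient=20.0):
    """Stand-in for one serial simulator run: a temperature per time step."""
    return [ambient + growth_rate * t * t for t in range(steps)]


def run_ensemble(growth_rates, max_workers=4):
    """Run one independent member per parameter value and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(toy_simulation, growth_rates))
    return dict(zip(growth_rates, results))


ensemble = run_ensemble([1.0, 2.0, 3.0])
```

Because the members do not communicate, this pattern sidesteps the inter-process communication bottleneck, at the cost of running each member no faster than the serial code.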
12. The Grid:
Background
aim: use Grid to provide on-demand access to HPC
resources
Grid: “… a form of distributed computing whereby a
"super and virtual computer" is composed of a cluster of
networked, loosely coupled computers, acting in concert to
perform very large tasks. […] What distinguishes grid
computing from conventional cluster computing systems is
that grids tend to be more loosely
coupled, heterogeneous, and geographically
dispersed.”
issues:
– not aiming to fully exploit Grid capabilities
– pre-installation of simulation software
on heterogeneous systems very difficult
13. The Grid:
Lessons Learned
How many (heterogeneous) computing resources should be
available through the Grid?
– advice: start with small number (one + one spare);
minimizes porting effort
Is there a Grid expert available?
– problem: software for accessing the Grid still seems
experimental
Can the simulator be adapted to the resource it
is running on?
– problem: Grid provides unified interface, but
setting parameters may be necessary to get
optimal performance out of an HPC resource
14. Models and Simulation:
Lessons Learned
Have the models ever been used to generate predictions?
– problem: models developed in research context;
usable for predictions? validation?
Can the simulation be “calibrated on the fly”?
– problem: model may not be able to assimilate live
sensor data
– FireGrid approach: parameter-sweep
Can the model be used to address “what-if” questions?
– problem: model does not take into account
hypothetical actions of emergency responders
Can the model assess the accuracy of its own results?
– problem: responders need confidence in model
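A parameter sweep can stand in for on-the-fly calibration when the model cannot assimilate live data: run the model for many parameter values, then keep the run that best matches the sensor readings seen so far and use the rest of that run as the prediction. The sketch below uses a sum-of-squares misfit and made-up numbers; the selection rule, names and data are assumptions and not necessarily what FireGrid does.

```python
def misfit(predicted, observed):
    """Sum of squared differences over the time steps observed so far."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def best_member(ensemble, observed):
    """Parameter value whose run best matches the observations."""
    return min(ensemble, key=lambda param: misfit(ensemble[param], observed))


# Hypothetical pre-computed runs: parameter value -> temperature per time step.
ensemble = {
    1.0: [20.0, 21.0, 24.0, 29.0, 36.0],
    2.0: [20.0, 22.0, 28.0, 38.0, 52.0],
    3.0: [20.0, 23.0, 32.0, 47.0, 68.0],
}
observed = [20.5, 22.4, 27.1]  # sensor readings for the first three steps

chosen = best_member(ensemble, observed)
forecast = ensemble[chosen][len(observed):]
```

The misfit of the chosen run, or the spread among runs with similar misfit, is one possible starting point for telling responders how far a prediction can be trusted.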
15. Intelligent Decision Support:
Lessons Learned
Are the model outputs in terms the emergency responders can
understand?
– problem: model output consists of large amounts of numbers, which
need to be contextualized and interpreted
– approaches: AI system vs. expert at emergency
Is there a set of standard operating procedures available?
– SOPs: give ways in which a task can be accomplished;
their preconditions represent the kind of information decision
makers need to know
Can uncertainty about the model results be conveyed
to the user in a useful way?
– problem: what do percentages mean?
16. Conclusions
aim of this paper: provide lessons learned for people
trying to build a system that:
– uses (large amounts of) sensor data to
– steer a super-real-time simulation that
– generates predictions which are the basis for
– decision support for emergency responders.
but: for a different type of scenario/model, e.g.
– an oil spill simulator
– a flood simulator (for a river)
creating such a system requires experts from a variety of
technical domains, and pitfalls that are obvious to an
expert in one field may be far from it to an expert in a
different field, even if they are all experts in computing!