2. Framework
• Computational modelling of atmospheric dispersion
of hazardous pollutants
• How can BigDataEurope Integrator tools help perform
computational tasks related to the atmospheric dispersion
of hazardous pollutants more efficiently?
11-oct.-16 www.big-data-europe.eu
3. Purposes and means
• Air pollution abatement / early warning / countermeasures
o Anthropogenic emissions: routine, accidental (nuclear, chemical),
malevolent (terrorist) – unannounced releases
o Natural emissions (e.g., volcanic eruptions)
• Measurements (from earth or space)
• Mathematical modelling
• Combination of the above → “forward” or “inverse” modelling
through “data assimilation”
4. Input data for dispersion modelling
• Meteorology
• “Source term”: knowledge of the emitted pollutant(s) and
source(s): location, quantity, conditions and timing
of the release
• Terrain characteristics, geometry of buildings, etc.
• Depending on available input and measurement data:
“forward” or “inverse” modelling
5. Cases of “inverse” computations
• The pollutant emission sources are NOT known:
location and/or quantity of emitted substances
o Technological accidents (e.g., chemical, nuclear), natural
disasters (e.g., volcanoes): known location, unknown
emission
o Unannounced technological accidents (e.g., Chernobyl),
malevolent intentional releases (terrorism), nuclear tests
• Inverse “source-term” estimation techniques
6. Inverse source-term estimation
• Available information:
o Measurements indicating the presence of an air pollutant
o Meteorological data for the present and the recent past
• Mathematical techniques blending the above with
results of dispersion models to infer the position and
strength of the emitting source
o Special attention: multiple solutions
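A minimal sketch of the idea, not the technique used in the pilot: it assumes that the concentration at each station scales linearly with the release strength, so that for every candidate source a dispersion model supplies the station response to a unit release. All names (`fit_strength`, `rank_sources`, the site labels, the tolerance) are hypothetical. For each candidate the strength is fitted by least squares, and every candidate whose misfit is close to the best one is reported, which makes the "multiple solutions" issue visible.

```python
def fit_strength(unit_response, observed):
    """Least-squares release strength q minimising sum((q * a_i - c_i)^2).

    unit_response: modelled station concentrations for a unit release (assumed input).
    observed: measured station concentrations.
    Returns (q, squared_residual); q is clipped at zero (no negative releases).
    """
    denom = sum(a * a for a in unit_response)
    if denom == 0.0:
        # This candidate never reaches any station: it explains nothing.
        return 0.0, sum(c * c for c in observed)
    q = max(0.0, sum(a * c for a, c in zip(unit_response, observed)) / denom)
    residual = sum((q * a - c) ** 2 for a, c in zip(unit_response, observed))
    return q, residual


def rank_sources(candidates, observed, tolerance=0.05):
    """Return every (source, strength) whose misfit is within `tolerance`
    (relative to the observed signal energy) of the best candidate."""
    fits = sorted(
        (fit_strength(response, observed)[1], name, fit_strength(response, observed)[0])
        for name, response in candidates.items()
    )
    best_residual = fits[0][0]
    scale = sum(c * c for c in observed) or 1.0
    return [(name, q) for residual, name, q in fits
            if (residual - best_residual) / scale <= tolerance]


if __name__ == "__main__":
    # Hypothetical unit-release responses at three stations.
    candidates = {
        "site_A": [1.0, 2.0, 0.0],
        "site_B": [0.0, 1.0, 3.0],
        "site_C": [2.0, 4.0, 0.0],  # same footprint as site_A, twice as strong
    }
    print(rank_sources(candidates, observed=[2.0, 4.0, 0.0]))
```

In this toy case two sites explain the measurements equally well with different strengths, so both are returned instead of a single "answer".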
7. Introducing the 2nd BDE SC5 Pilot
• The mathematical techniques mentioned previously require
long computing times
• Purpose: fast estimation of source location in emergencies
• Proposed solution: pre-calculate a large number of scenarios,
store them, and at the time of an emergency select the “most
appropriate” one
• BDE will provide the tools to perform this functionality
efficiently
8. Structure of the 2nd BDE SC5 Pilot
• Geographic area: Europe
• Cases of interest: accidents at Nuclear Power Plants
• Weather calculations:
o Re-analysis data for 20 years
o Clustering → “typical” weather circulation patterns
o Downscaling through WRF for the “typical” weather
circulation patterns
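The clustering step can be illustrated with a small k-means sketch. It assumes each weather snapshot has already been flattened into a numeric vector (for example a gridded field read row by row); the function names, the random initialisation and the toy data are illustrative assumptions, and a production run over years of re-analysis data would use an optimised library instead.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(items, k, iterations=50, seed=0):
    """Plain k-means: returns (labels, centroids).

    items: list of equal-length numeric vectors (one per weather snapshot).
    Centroids are initialised from k randomly chosen items (an assumption;
    other initialisations are possible).
    """
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(items, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each snapshot joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(item, centroids[j]))
            for item in items
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [item for item, lab in zip(items, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    snapshots = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                 [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
    labels, centroids = kmeans(snapshots, k=2)
    print(labels, centroids)
```

Each resulting centroid plays the role of a “typical” circulation pattern; in the workflow above, those patterns would then be downscaled with WRF.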
9. Structure of the 2nd BDE SC5 Pilot
• Dispersion calculations:
o Calculation of dispersion patterns from NPPs for the
above downscaled typical weather circulation patterns
o Dispersion results: gridded and (optionally) at
monitoring stations
10. Structure of the 2nd BDE SC5 Pilot
• In the event of radiation signals at some stations:
o Matching of current and recent weather to the closest
typical circulation pattern
o From the stored dispersion results pertaining to the
matched weather circulation patterns, select the one that
most closely matches the monitoring data
o The matched dispersion pattern will reveal the most
probable emission source
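The two-stage lookup described above could look roughly like the following sketch. The data layout (a dictionary of pattern centroids and a dictionary of pre-computed station concentrations per pattern and per plant), the Euclidean distance, and the normalisation of readings to remove the unknown release strength are all assumptions made for illustration; the names are hypothetical.

```python
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalise(values):
    """Scale to unit sum so that only the spatial pattern is compared."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


def closest_pattern(current_weather, centroids):
    """Stage 1: match the current weather vector to the nearest typical pattern."""
    return min(centroids, key=lambda name: distance(current_weather, centroids[name]))


def most_probable_source(current_weather, station_readings, centroids, stored_results):
    """Stage 2: among the stored results for the matched pattern, pick the
    source whose station footprint is closest to the monitoring data."""
    pattern = closest_pattern(current_weather, centroids)
    readings = normalise(station_readings)
    candidates = stored_results[pattern]
    source = min(candidates,
                 key=lambda npp: distance(readings, normalise(candidates[npp])))
    return pattern, source


if __name__ == "__main__":
    centroids = {"westerly": [1.0, 0.0], "northerly": [0.0, 1.0]}
    stored_results = {
        "westerly": {"NPP_1": [5.0, 1.0, 0.0], "NPP_2": [0.0, 1.0, 5.0]},
        "northerly": {"NPP_1": [1.0, 1.0, 1.0], "NPP_2": [3.0, 0.0, 0.0]},
    }
    print(most_probable_source([0.9, 0.2], [50.0, 12.0, 1.0],
                               centroids, stored_results))
```

Because everything expensive is pre-computed, the emergency-time work reduces to two nearest-neighbour searches.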
11. So far …
• Preliminary clustering studies on a limited amount of
re-analysis data (while waiting for the full download)
o On the basis of different variables at different
pressure levels
• Dispersion calculations for a selected NPP for the
identified weather classes
12. So far …
• Selected a random date, taken as the “true” accident day
• Matched the “true” day’s weather data with the closest
weather class from the clustering procedure
• Ran dispersion calculations with the weather data of the “true” day
• Compared dispersion results based on “true” and matched
weather data
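The slides do not say which measure is used for this comparison. As one possible way to quantify it, the sketch below computes an overlap score often called the figure of merit in space: the area where both gridded fields exceed a threshold divided by the area where at least one does. The choice of metric, the threshold and the function name are assumptions, not part of the pilot.

```python
def figure_of_merit_in_space(field_a, field_b, threshold):
    """Overlap of two gridded plumes: |A and B| / |A or B|.

    field_a, field_b: 2-D lists of concentrations on the same grid.
    A cell belongs to a plume when its value exceeds `threshold`.
    Returns 1.0 for identical plume areas and 0.0 for disjoint ones.
    """
    overlap = 0
    union = 0
    for row_a, row_b in zip(field_a, field_b):
        for a, b in zip(row_a, row_b):
            in_a, in_b = a > threshold, b > threshold
            overlap += int(in_a and in_b)
            union += int(in_a or in_b)
    # Both plumes empty: treated here as perfect agreement (a convention).
    return overlap / union if union else 1.0


if __name__ == "__main__":
    true_day = [[0.0, 1.0, 1.0],
                [0.0, 1.0, 0.0]]
    matched_class = [[0.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0]]
    print(figure_of_merit_in_space(true_day, matched_class, threshold=0.5))
```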
14. Data
• ECMWF Reanalysis data
• NCAR-UCAR Archive
o Better compatibility with WPS/WRF
• 20-30 years
o Approx. 6 TB in total
• GRIB2 format – again for better compatibility with WRF
o NetCDF via WPS
• Many variables at multiple geopotential heights
16. Clustering
• Traditional methods
o Agglomerative hierarchical
o K-means
• Soon to be implemented
o NN-based feature extraction (e.g., autoencoders,
convolutional nets)
o (Possibly) followed by k-means
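For the agglomerative hierarchical option, a naive sketch is given below. It assumes average linkage with Euclidean distance (the slides do not specify the linkage) and recomputes all pairwise cluster distances at every merge, so it is only suitable for small illustrative inputs; the function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, items):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(items[i], items[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative(items, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Returns a list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], items),
        )
        # a < b is guaranteed by combinations, so deleting b keeps a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    snapshots = [[0.0], [0.2], [5.0], [5.1], [9.0]]
    print(agglomerative(snapshots, k=3))
```

Unlike k-means, this approach needs no initial centroids, and stopping the merging at different values of k yields a nested family of clusterings.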
17. Evaluation
• Incremental
o Clustering outcome
o Closeness of constituent weather within clusters / distance between
clusters
o Dispersion characteristics
o Different cluster descriptors for
- Creating cluster-based dispersions
- Matching “real data” to clusters
• Complete
o Compare cluster-based dispersion against
“real data” dispersion
- For a number of hypothetical scenarios
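The "closeness within clusters / distance between clusters" check can be made concrete with two simple numbers, sketched below: the mean distance of each item to its own cluster centroid, and the smallest distance between any two centroids. These particular definitions and the function names are assumptions; other indices (for example silhouette scores) would serve the same purpose.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(members):
    return [sum(col) / len(members) for col in zip(*members)]


def cluster_quality(items, labels):
    """Return (within, between).

    within:  mean distance from each item to its own cluster centroid
             (smaller means tighter clusters).
    between: smallest distance between two cluster centroids
             (larger means better separated clusters); inf for one cluster.
    """
    groups = {}
    for item, label in zip(items, labels):
        groups.setdefault(label, []).append(item)
    centroids = {label: centroid(members) for label, members in groups.items()}
    within = sum(euclidean(item, centroids[label])
                 for item, label in zip(items, labels)) / len(items)
    names = sorted(centroids)
    between = min(
        (euclidean(centroids[a], centroids[b])
         for i, a in enumerate(names) for b in names[i + 1:]),
        default=math.inf,
    )
    return within, between


if __name__ == "__main__":
    items = [[0.0], [2.0], [10.0], [12.0]]
    print(cluster_quality(items, labels=[0, 0, 1, 1]))
```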
18. Preliminary results
• Clustering over a 2-year period (1986, 1987)
o K=6 clusters
• Multiple geopotential heights
• Other variables – notably wind speed – at
different heights
• “Visual comparison” against “real data” dispersions
• Incrementally combining more variables
19. Cluster quality / GHT 500hPa
• 1986, 1987
• Resolution=
• Items (6-hr snapshots) =
• K-means, for K=6
• Geopotential height at 500 hPa
• Dispersions well differentiated for a
specific hypothetical origin
• Real data:
21. Immediate Future Work
• Feature extraction
o Taking into account multiple variables
o At more heights
• Automatic evaluation
o For a number of pre-selected scenarios
• Dockerisation and inclusion into the BDE architecture
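On the feature-extraction item: when several variables at several heights are combined, their units and ranges differ widely (geopotential height in metres, wind speed in m/s), so some rescaling is usually needed before distances are computed. The sketch below shows one simple option, assumed here purely for illustration: z-score each variable over all snapshots and grid points, then concatenate the variables per snapshot. The variable names and data layout are hypothetical.

```python
import math


def build_features(variables):
    """Standardise each variable and concatenate per snapshot.

    variables: dict mapping a variable name to a list of flattened fields,
               one field (list of grid values) per snapshot.
    Returns one feature vector per snapshot.
    """
    n_snapshots = len(next(iter(variables.values())))
    features = [[] for _ in range(n_snapshots)]
    for name in sorted(variables):  # fixed order keeps vectors comparable
        fields = variables[name]
        values = [v for field in fields for v in field]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        for row, field in zip(features, fields):
            # A constant variable carries no information: map it to zeros.
            row.extend((v - mean) / std if std > 0 else 0.0 for v in field)
    return features


if __name__ == "__main__":
    variables = {
        "ght_500": [[5500.0, 5600.0], [5700.0, 5800.0]],  # metres
        "wind_850": [[5.0, 10.0], [15.0, 20.0]],          # m/s
    }
    for vector in build_features(variables):
        print([round(v, 3) for v in vector])
```

After this step both variables contribute on the same scale, so neither dominates a subsequent k-means or hierarchical clustering.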