2. Climate 1st
Pilot case: Supporting data-intensive climate
research
Pilot under development by NCSR “Demokritos”
Impact on / basis for synergies with:
o SCs that need feedback on the potential effects of climate change for considering adaptation,
mitigation and prevention measures
SC1 - Health
SC2 - Food and Agriculture
SC3 - Secure, clean and efficient Energy
SC4 - Smart, green and integrated Transport
SC6 - Europe in a changing world - inclusive, innovative and reflective societies
SC7 - Secure societies - protecting freedom and security of Europe and its citizens
3. Background & Problem Statement
Decades of scientific effort to understand and study the earth’s climate and climate
change have produced a huge volume of model data (e.g. Global Circulation Models,
Regional Models) and observational data (e.g. satellite, aircraft, station).
Such model data serve an important objective: assessing the potential impacts of
climate change on well-being, in order to inform adaptation, prevention and
mitigation measures and to support other policy-making decisions.
The climate research and impact assessment communities need to interface with
useful data resources so that they can extract data efficiently and in a timely
manner. One of the methodologies applied is dynamical downscaling, which yields
values of physical atmospheric variables at smaller spatial and temporal scales.
The present BDE pilot aims to facilitate dynamical downscaling from global climate
data to regional / local scales, with the support of tools aggregated on the Big Data
Europe platform.
4. Dynamical downscaling
http://hosted.verticalresponse.com/1022925/c3c95b44ec/TEST/TEST/
Output of Global Circulation Models
(GCMs) run at coarse spatial
resolution drives Regional or Local
Models run at higher resolution
Aim: To obtain local weather and
climate variables needed for impact
assessment and planning by
decision makers
Assumption: local climate depends
on large-scale atmospheric
characteristics and local-scale
features
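The change of scale between a driving global model and nested regional grids can be illustrated with a short calculation. This is only a sketch: the 90 km parent spacing and the refinement ratio of 3 per nest are hypothetical values chosen for illustration, not settings taken from the pilot.

```python
def nest_resolutions(parent_km, ratios):
    """Return the grid spacing (km) of the parent grid and of each successive nest."""
    spacings = [parent_km]
    for ratio in ratios:
        # Each nest refines the grid it is embedded in by an integer ratio.
        spacings.append(spacings[-1] / ratio)
    return spacings


def cells_per_parent_cell(ratio):
    """Number of nest cells covering one parent cell on a 2-D horizontal grid."""
    return ratio * ratio


# Hypothetical configuration: one coarse grid and two nests, each refined 3:1.
spacings = nest_resolutions(90.0, [3, 3])
for level, km in enumerate(spacings):
    print(f"grid {level}: {km} km")
```

Each refinement multiplies the number of horizontal cells per unit area by the square of the ratio, which is one reason high-resolution runs are restricted to limited regional domains.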
5. Purpose of the Pilot
Provide an intuitive interface between researchers and specific climate data portals and
providers.
Search and download climate model and, optionally, observational data, according to
user requirements, such as geographic coverage and / or experiments (scenarios).
Set up and orchestrate the execution of the dynamical downscaling process on institutional
computational resources, while gathering and managing data products.
Establish a workflow for useful metadata mappings and data lineage.
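Selecting data by geographic coverage and experiment, as listed above, can be sketched as a simple filter over dataset records. The record layout, the dataset identifiers and the bounding boxes below are hypothetical and only illustrate the idea; boxes that cross the date line are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        """True if the two boxes overlap (no date-line wrapping)."""
        return not (
            self.east < other.west
            or other.east < self.west
            or self.north < other.south
            or other.north < self.south
        )


@dataclass(frozen=True)
class DatasetRecord:
    dataset_id: str      # hypothetical identifier
    experiment: str      # scenario name, e.g. "rcp45"
    coverage: BoundingBox


def select_datasets(records, area, experiments):
    """Keep records whose experiment is requested and whose coverage overlaps the area."""
    wanted = set(experiments)
    return [r for r in records if r.experiment in wanted and r.coverage.intersects(area)]


# Illustrative records, not real catalogue entries.
records = [
    DatasetRecord("global-rcp45", "rcp45", BoundingBox(-180, -90, 180, 90)),
    DatasetRecord("australasia-rcp45", "rcp45", BoundingBox(100, -50, 160, 0)),
    DatasetRecord("global-historical", "historical", BoundingBox(-180, -90, 180, 90)),
]
area_of_interest = BoundingBox(19, 34, 29, 42)  # roughly Greece
selected = select_datasets(records, area_of_interest, ["rcp45"])
print([r.dataset_id for r in selected])
```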
6. Scenario
The user accesses the climate data portal and defines the geographic area and
scenarios.
The user then selects and downloads data on the BDE platform.
The user defines the case-specific parameters for the dynamical downscaling process
o several of the above options will be predefined for the present pilot
The platform initiates and controls the dynamical downscaling process by invoking the
modules that will perform the preprocessing and main computation steps.
o on institutional resources provided and used by the partner
o data mapping and necessary transformations will be addressed at the pre-processing stage, to
ensure compatibility of the data with the modelling software
The intermediate and final data products are stored on the BDE platform, while data
lineage is also tracked.
The process described can be repeated either from any intermediate step or from the
beginning, using data from different climate models.
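The scenario above (run steps in order, store intermediate products, track lineage, repeat from any step) can be sketched as follows. This is a minimal illustration and not the BDE platform's implementation: the Workflow class, the step functions and the lineage record layout are all hypothetical, and the "downscale" step is a placeholder rather than a real model run.

```python
import hashlib
import json


def fingerprint(product):
    """Short, stable hash of a JSON-serialisable data product."""
    payload = json.dumps(product, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


class Workflow:
    def __init__(self, steps):
        self.steps = steps    # ordered list of (name, function)
        self.products = {}    # step name -> stored data product
        self.lineage = []     # one record per executed step

    def run(self, initial_input=None, start_at=None):
        names = [name for name, _ in self.steps]
        start = 0 if start_at is None else names.index(start_at)
        # Resume from the stored product of the preceding step when restarting.
        data = initial_input if start == 0 else self.products[names[start - 1]]
        for name, func in self.steps[start:]:
            result = func(data)
            self.lineage.append(
                {"step": name, "input": fingerprint(data), "output": fingerprint(result)}
            )
            self.products[name] = result
            data = result
        return data


def preprocess(values):
    # Placeholder preprocessing: drop missing values.
    return [v for v in values if v is not None]


def downscale(values):
    # Placeholder for the regional model: repeat each coarse value twice.
    return [v for v in values for _ in range(2)]


workflow = Workflow([("preprocess", preprocess), ("downscale", downscale)])
final = workflow.run([1, None, 2])
rerun = workflow.run(start_at="downscale")  # repeat from an intermediate step
```

Because each lineage record stores the fingerprints of its input and output, the chain from a final product back to the downloaded data can be reconstructed by matching hashes.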
7. Current practice
There is no generally agreed workflow for performing the dynamical downscaling process.
Data acquisition, handling and management are often performed in an ad hoc manner.
o observational and analysis data from external services are transferred to local filesystems
Researchers then preprocess the data using third-party or custom-made scripts.
The preprocessed data (i.e. the resulting set of files) are then fed into the regional model
of choice along with a set of suitable parameters.
The regional model is usually run on local computational infrastructure.
The resulting files are used for visualisation and further analysis, either on local
infrastructure or, data size and task permitting, on the researchers’ machines.
Managing and archiving the data produced at the various stages can be difficult.
It is also difficult to track progress and intermediate data products, and to encourage
reuse and collaboration across disciplines.
8. Primary content / data involved
This Pilot will make extensive use of publicly available climate data
Earth System Grid Federation (ESGF) http://esgf.llnl.gov
o international collaboration for the management, dissemination and analysis of model output and
observational data
o complete set of CMIP5 data (global climate model simulations)
o most CORDEX data (regional climate model simulations)
o more than 15 portals and more than 45 data nodes worldwide
o IS-ENES operates the European part of ESGF (https://verc.enes.org/ISENES2)
European Centre for Medium-Range Weather Forecasts (ECMWF)
o intergovernmental organisation supported by 21 European Member States and 13 cooperating
States
o ECMWF’s public data http://apps.ecmwf.int/datasets/data/interim-full-daily
o Certain subsets of ECMWF data are also published via ESGF
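A query against an ESGF search service can be sketched by assembling a URL from search facets. The host name below is a placeholder, and the endpoint path, facet names and format value are assumptions that should be checked against the documentation of the node being used; the sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode


def build_search_url(base_url, **facets):
    """Assemble a search URL from facet constraints (no request is sent)."""
    params = {"format": "application/solr+json", "limit": 10}
    params.update(facets)
    # Sort the parameters so the resulting URL is deterministic.
    return base_url + "?" + urlencode(sorted(params.items()))


# Placeholder host; a real data node address would be substituted here.
BASE_URL = "https://esgf-node.example.org/esg-search/search"

url = build_search_url(BASE_URL, project="CMIP5", experiment="rcp45", variable="tas")
print(url)
```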
9. Data structures
This pilot use case will make use of NetCDF data sets
o binary, header-based format
o variable data are stored in multidimensional arrays
Access to relevant datasets is described via THREDDS records on each node’s
catalogue http://www.unidata.ucar.edu/software/thredds/current/tds/catalog/
The data are described using the Climate and Forecast (CF) conventions
http://cfconventions.org
The implementation of this pilot will make use of WRF - the Weather Research and
Forecasting model http://wrf-model.org - to perform the dynamical downscaling
o the internal metadata will be expressed in a WRF-compatible format. Appropriate mappings will
be put in place so that the conversion is transparent to the user and that it takes place
automatically upon ingestion.
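The metadata mapping applied at ingestion can be sketched as a lookup from CF standard names to the field names expected by the modelling software. The target names and the mapping table below are illustrative placeholders, not the pilot's actual mapping, and the variable dictionary stands in for attributes that would be read from NetCDF file headers.

```python
# Illustrative mapping from CF standard names to hypothetical target field names.
CF_TO_TARGET = {
    "air_temperature": "TT",
    "eastward_wind": "UU",
    "northward_wind": "VV",
    "surface_air_pressure": "PSFC",
}


def map_variables(variables, mapping=CF_TO_TARGET):
    """Split variables into those with a known target name and those without.

    variables: dict of variable name -> attribute dict holding a CF 'standard_name'.
    """
    mapped, unmapped = {}, []
    for name, attrs in variables.items():
        target = mapping.get(attrs.get("standard_name"))
        if target is None:
            unmapped.append(name)  # report rather than silently drop
        else:
            mapped[target] = {"source_variable": name, "units": attrs.get("units")}
    return mapped, unmapped


# Stand-in for attributes read from NetCDF headers.
variables = {
    "ta": {"standard_name": "air_temperature", "units": "K"},
    "ua": {"standard_name": "eastward_wind", "units": "m s-1"},
    "pr": {"standard_name": "precipitation_flux", "units": "kg m-2 s-1"},
}
mapped, unmapped = map_variables(variables)
```

Returning the unmapped variables explicitly lets the ingestion step flag data it cannot convert, instead of passing incomplete input to the model.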
10. Current technological infrastructure at the Pilot partner
The pilot partner - NCSR Demokritos - currently operates WRF on a daily basis to
produce weather forecasts on 3 nested computational grids covering Europe, Greece
and the Attica peninsula.
Initial and boundary conditions are provided by Global Forecast System (GFS) data
(from the US). The GFS data are downloaded automatically and pre-processed to produce
the necessary input file for running WRF.
11. Functional roles involved in this Use Case
Research site/centre (stakeholder): use of the functionality of this pilot either as an
external service/product or by deploying an internal instance of the BDE platform
o Benefit: increase the potential for its internal processes with minimal disruption and expansion
requirements to its infrastructure and internal policies.
Researcher (primary user): effectively search for data, potentially across different
providers, perform processing tasks on the BDE platform or on their departmental
resources
o gain data provenance information
o increase the efficiency of experimental runs
o increase the value of their experiments (as they will be archivable and retrievable for potential
future reference, publication and scientific replication)
o increase the potential for primary users to perform multiple downscaling computations and
inter-compare their results
12. IT infrastructure
Potential configurations for a research institute to use the Pilot:
o As an external set of services: through agreement with the instance provider, the pilot
services are made accessible and can be integrated into current research procedures.
o As an internal infrastructure component: the research institute provides the necessary hardware
to be used for hosting the BDE instance.
In showcasing the pilot we aim to follow the first approach, taking advantage of the
common administration within NCSR-D.
13. Impacts
This pilot aims to aid climate research, and therefore encourage consequent impactful
synergies, by providing
o a user-friendly, interactive way to query and fetch data from resources such as ESGF and
ECMWF
o a framework to support the incremental, data-oriented carrying out of climate-related
experiments, while storing intermediate data products and associated data lineage.
In particular, this pilot aims at:
o improving the productivity of researchers through easier management, ingestion and
transformation of external data; oversight of the execution of model runs over the data,
making use of existing infrastructure and procedures; and improved efficiency and
reusability in downscaling computational experiments
o creating opportunities for pilots across communities within the BDE platform: climate
change impact assessment studies on sectors such as energy, food and agriculture are
potential future pilot use cases across societal challenges in the BDE platform.