The document discusses the e-MAST data-model interface and the Protocol for the Analysis of the Land Surface (PALS) data portal. PALS aims to bridge the observation and modelling communities by hosting standardised model experiments that include multiple observational data streams for model evaluation and development. It addresses issues such as data formatting and access that have hindered interaction between the two communities. PALS currently has around 140 users conducting site-based experiments and is expanding to larger-scale distributed experiments to further the evaluation of land surface, hydrological and ecosystem models.
1. The e-MAST data-model interface
Gab Abramowitz
Climate Change Research Centre, UNSW
ARC Centre of Excellence for Climate System Science
2. Outline
• Why care about models?
• Why model evaluation is complicated
• How models and observations interact
• How model-based experiments quantify uncertainty
• Why multiple data streams are particularly important
• Protocol for the Analysis of the Land Surface (PALS) – the
eMAST data portal – and what it’s trying to achieve
3. Why care about models?
• Land surface models; hydrological models; ecosystem models.
• They drive: climate projections, weather forecasts, water resources
assessments, impacts assessments for natural systems
• Many modelling areas have a long history of development before high
quality observations became available – legacy code is still a real issue.
• Observations play a key role in reducing uncertainty in model
predictions, yet communication channels between modelling and
observational groups are often poor
4. Model evaluation is complicated
• Diagnostic model evaluation is contingent upon the quality of the
observations used in all four green categories of the figure
(inputs, parameters, states, outputs).
• Model outputs may cover a wide range of systems, e.g. carbon, water, energy
• Uncertainty in these is rarely low enough to tightly constrain simulations
• Leads to “equifinality” – several different
combinations of inputs / initial states /
parameters give equally good results.
• This makes identification of the “best”
model structure very difficult
[Figure: schematic of a model with its inputs, parameters, states and outputs; labelled NASA LIS]
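Equifinality can be shown with a toy case. The sketch below is not from the talk: `toy_flux`, its two parameters and all numbers are hypothetical, chosen so that the output depends only on the product of the parameters. Any pair with the same product then fits the observations equally well, so the observations alone cannot identify the "best" pair.

```python
# Toy illustration of equifinality (hypothetical model, not from the talk).

def toy_flux(light, efficiency, leaf_area):
    """Flux depends only on the product efficiency * leaf_area."""
    return [efficiency * leaf_area * x for x in light]


def rmse(sim, obs):
    """Root-mean-square error between a simulation and observations."""
    return (sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)) ** 0.5


light = [0.0, 100.0, 250.0, 400.0]
# Synthetic "observations" generated with one particular parameter pair.
obs = toy_flux(light, efficiency=0.02, leaf_area=3.0)

# Three candidates share the product 0.06; the last one does not.
candidates = [(0.02, 3.0), (0.03, 2.0), (0.06, 1.0), (0.05, 2.0)]
scores = {params: rmse(toy_flux(light, *params), obs) for params in candidates}

# Parameter pairs that reproduce the observations to within rounding error.
equally_good = [params for params, score in scores.items() if score < 1e-9]
print(equally_good)
```

A second, independent observation of either parameter (for example leaf area measured in the field) would break the tie, which is the argument for multiple data streams made later in the talk.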
5. How models and observations interact
• Simple comparison of observations with predictions (obs ≈ model output)
• Parameter estimation (obs ≈ model output)
• Direct restriction of parameter ranges (obs ≈ model parameters)
• Comparison with empirical approaches (obs ≈ model input / output / params)
• Data assimilation generating reanalysis products (obs ≈ model output / state)
All can give diagnostic information about model structure
6. How models and observations don’t interact
• Long pathways to data access / uncertainty in availability
• Physical inconsistencies in data (i.e. quality control)
• Data formatting and a lack of standardisation (file formats, standards within
formats, time and space sampling)
• e.g. CMIP5 database is anticipated to be ~50PB
• Example – Fluxnet and the land surface modelling community
7. Gauging uncertainty in model predictions
• Uncertainty in predictions is typically estimated by sampling
uncertainty in parameters, inputs, and/or initial states
• Multiple streams of data mean that uncertainty ranges can be
better constrained => more reliable predictions
• Examples: meteorology, C fluxes, water
and heat fluxes, biomass, carbon pool
sizes, physical soil properties, vegetation
characteristics, soil moisture and
temperature, streamflow, N, P etc
[Figure: schematic of a model with its inputs, parameters, states and outputs; labelled NASA LIS]
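A minimal sketch of estimating prediction uncertainty by sampling, assuming a made-up one-line model. The function `toy_model`, the prior ranges and the forcing value are all hypothetical; the point is only that the spread of the ensemble of predictions serves as the uncertainty estimate.

```python
import random
import statistics


def toy_model(forcing, sensitivity, initial_store):
    """Hypothetical model: prediction = initial store + sensitivity * forcing."""
    return initial_store + sensitivity * forcing


def sample_predictions(n_samples, forcing, seed=0):
    """Draw parameters and initial states from assumed priors and run the model."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_samples):
        sensitivity = rng.uniform(0.5, 1.5)    # assumed parameter range
        initial_store = rng.gauss(10.0, 2.0)   # assumed initial-state mean and sd
        predictions.append(toy_model(forcing, sensitivity, initial_store))
    return predictions


ensemble = sample_predictions(n_samples=2000, forcing=20.0)
mean_prediction = statistics.mean(ensemble)
spread = statistics.stdev(ensemble)  # ensemble spread used as the uncertainty
print(f"prediction = {mean_prediction:.1f} +/- {spread:.1f}")
```

Narrowing either assumed range, which is what additional observations allow, shrinks `spread` directly.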
8. Multiple data streams – an example
Model parameter estimation based on each data type separately, and in
combination (error bars show 1σ uncertainty propagated from parameter
uncertainty). See Vanessa’s talk tomorrow at 11:10am.
[Figure: estimated NPP (GtC y⁻¹), axis from 0 to 4, for each case: Prior
estimate; Eddy fluxes; Streamflow; Litterfall; Eddy fluxes + Litterfall;
Streamflow + Litterfall; Streamflow + Eddy fluxes; Eddy fluxes + Litterfall +
Streamflow. Source: Haverd et al, BGD, 2012]
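The effect of combining data streams can be sketched with simple rejection sampling. This is an illustration only and does not reproduce the method or numbers of Haverd et al. The parameters `gain` and `loss`, the two constraint functions and their tolerances are hypothetical. Each stream observes a different combination of the two parameters, so either one alone leaves `gain` poorly constrained, while both together pin it down.

```python
import random
import statistics


def stream_a(gain, loss):
    """Hypothetical stream observing a net quantity: gain - loss = 1.0 +/- 0.2."""
    return abs((gain - loss) - 1.0) <= 0.2


def stream_b(gain, loss):
    """Hypothetical stream observing a total: gain + loss = 3.0 +/- 0.2."""
    return abs((gain + loss) - 3.0) <= 0.2


def accepted_spread(samples, constraints):
    """Std. dev. of `gain` over samples that satisfy every constraint."""
    kept = [gain for gain, loss in samples
            if all(check(gain, loss) for check in constraints)]
    return statistics.stdev(kept), len(kept)


rng = random.Random(1)
# Assumed prior: both parameters uniform on [0, 4].
prior = [(rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(20000)]

spread_prior, _ = accepted_spread(prior, [])
spread_a, _ = accepted_spread(prior, [stream_a])
spread_b, _ = accepted_spread(prior, [stream_b])
spread_both, n_both = accepted_spread(prior, [stream_a, stream_b])

for label, value in [("prior", spread_prior), ("A only", spread_a),
                     ("B only", spread_b), ("A + B", spread_both)]:
    print(f"{label:>7}: 1-sigma spread in gain = {value:.2f}")
```

Rejection sampling is used here only because it is short; a real application would more likely use a formal parameter estimation or data assimilation scheme.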
12. How the eMAST data portal addresses this issue
• The Protocol for the Analysis of the Land Surface (PALS) is a web application
for evaluating land surface/ecosystem/hydrology models
• PALS hosts Experiments:
• All data sets required to drive/force a model for an experiment are
downloadable in a standardised NetCDF format, easily read by most
LS modelling groups internationally
• Users upload their model simulations for an experiment together with
ancillary files (log and parameter files, run scripts etc), in a way
that maximises reproducibility
• PALS automatically runs analysis of the model output, comparing with a
range of observational data sources (where available), other models
and standard benchmarks
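The kind of comparison described above can be sketched as follows. This is not PALS code: the metric choices, the `linear_benchmark` helper and all numbers are assumptions made for illustration. The benchmark here is a least-squares regression of the observed flux on a single meteorological driver, fitted in-sample purely for brevity.

```python
def rmse(sim, obs):
    """Root-mean-square error between a simulation and observations."""
    return (sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)) ** 0.5


def mean_bias(sim, obs):
    """Mean of (simulation - observation)."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def linear_benchmark(driver, obs):
    """Least-squares fit obs ~ a + b * driver, used as an empirical benchmark."""
    n = len(obs)
    mean_x, mean_y = sum(driver) / n, sum(obs) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(driver, obs))
             / sum((x - mean_x) ** 2 for x in driver))
    return [mean_y + slope * (x - mean_x) for x in driver]


# Made-up numbers standing in for a flux-tower record and one model run.
shortwave = [0.0, 150.0, 400.0, 650.0, 300.0, 50.0]      # W m-2
obs_latent = [5.0, 40.0, 110.0, 170.0, 85.0, 15.0]       # W m-2
model_latent = [10.0, 55.0, 120.0, 200.0, 100.0, 30.0]   # W m-2

benchmark_latent = linear_benchmark(shortwave, obs_latent)
report = {
    "model": {"rmse": rmse(model_latent, obs_latent),
              "bias": mean_bias(model_latent, obs_latent)},
    "benchmark": {"rmse": rmse(benchmark_latent, obs_latent),
                  "bias": mean_bias(benchmark_latent, obs_latent)},
}
model_beats_benchmark = report["model"]["rmse"] < report["benchmark"]["rmse"]
print(report, model_beats_benchmark)
```

A model that cannot beat a simple regression on its own forcing data is a useful diagnostic signal, which is the motivation for standard reference benchmarks.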
13. What PALS aims to achieve
• Bridge the observation and modelling communities
• Avoid data formatting issues and quality control issues
• Expose modelling communities to the nature of uncertainties within observational data
• Give data collection communities access to model simulations that utilise their data
• Provide international exposure for a broad collection of ecosystem field data
• Introduce standard reference benchmarks for modelling communities
• Provide a fast, free, and comprehensive model evaluation facility for model
developers
• Provide detailed diagnostic information that can identify model weaknesses
across the international modelling community
14. PALS – progress so far
• Structure for site-based experiments (primarily based on flux tower sites) is
in place and being used by around 140 researchers internationally.
• Met Office (UK) coordinating an experiment between 6+ land surface
modelling groups using what’s already in place (presentation from Martin
Best at AMS, Austin, Jan 2013).
15. PALS – progress so far
• Involvement with GEWEX Global Land Atmosphere System Study (GLASS)
panel; OzEWEX; CABLE LSM management group; Met Office training…
• Generic Experiments structure being built now:
• Single-site; multiple-site; catchment-based; regional; global
• Field ecosystem, hydrological and satellite-based data streams
• First inclusion likely continental-scale C budget – Vanessa’s talk 11:10am tomorrow
• Early engagement from hydrological research community
• PALS adopted as offline benchmarking environment for CABLE (Australian
community land surface model)
16. The last slide
• Models that produce climate, weather, hydrological, ecological projections
are ripe to benefit from integration of ecological data streams
• “Benefit” can mean diagnostic model evaluation AND reduction in
projection uncertainty - concurrent multiple data streams especially useful
• To date the modelling and observation communities have not
communicated as well as they might – we’d like to try to help
• The PALS web application tries to address this by serving standardised
model experiments that utilise multiple data streams
• PALS already has around 140 users, active international experiments and is
used as a model development tool by a number of modelling groups
• PALS is expanding to include distributed experiments (up to global scale)
for a range of model types
Gab Abramowitz: gabriel@unsw.edu.au
Editor's Notes
Structured talk mostly to speak to non-modelling folk
Models incorporate our understanding of how natural systems work
The types of models that can benefit from TERN-type data
Take, for example, a continental scale simulation of Australian C cycle over a decade.
Equifinality = underconstrained modelling system