Henry & Hobbs, 'Developing long-term agro-ecological trial datasets for C and N modelling.'
1. New value from old trials:
Developing long-term agro-ecological
trial datasets for C and N modelling
Australian Centre for Ecological Analysis and Synthesis (ACEAS)
C&N Dynamics Working Group
Beverley Henry (QUT), John Carter, Ram Dalal, Steven Reeves (Qld Department of
Science, Information Technology, Innovation and the Arts); Craig Thornton (Qld
Department of Natural Resources & Mines), Robyn Cowley (NT Department of
Resources); Leigh Hunt, Andrew Moore (CSIRO); Bill Parton (Colorado State
University); Bill Slattery (Department of Climate Change and Energy Efficiency);
Peter Grace, Richard Conant (Institute for Future Environments, QUT); Alison Specht
(ACEAS); Murray Unkovich (University of Adelaide).
Semaphore project team (Australian National Data Service - ANDS)
Vaughan Hobbs, Marco Fahmi, Beverley Henry, Alvin Sebastian, Siobhann
McCafferty (Institute for Future Environments, QUT), Mingfang Wu (ANDS), Richard
Conant (CSU)
CCRSPI Conference 27th – 29th November 2012 Melbourne
3. Purpose of C&N modelling
• To evaluate effects of environmental change
• To evaluate changes in management
Current applications include:
• Understanding impacts of climate variability and change
• Testing climate adaptation strategies
• Evaluating practices and activities for soil carbon
sequestration
• Understanding Nutrient Use Efficiency and N2O emissions
• Managing productivity and sustainability goals in agro-
ecosystems
5. Project Objectives
ACEAS C&N Dynamics Working Group
To develop a database with high-quality climate, soil,
management, NPP and nutrient datasets (with metadata)
suitable for validation of C&N models
ANDS Semaphore
To develop software to [semi-]automatically extract and
transform data to calibrate and validate C&N models.
Outcome: Improved predictions of carbon and
nutrient dynamics in agro-ecosystems
6. Models & sites
Sites Targeted
– Brigalow catchment study
– Hermitage cropping trial
– Kidman Springs fire trial
– Wambiana grazing trial
– Hamilton long-term P trial
Data & Information
– Site information and climate files
– Experimental data (treatments, measurements)
– Management schedules
7. Brigalow Catchment Study
Brigalow (Acacia harpophylla) bioregion (40 Mha) of central QLD
Study commenced 1965
3 land uses: brigalow forest, cropping and grazed pasture
Monitored for water balance, resource condition, productivity
Qld Government – Craig Thornton, John Carter
8. Hermitage Cropping Trial
Hermitage Long-Term Tillage Trial near Warwick, QLD
Study commenced 1968
Tillage, stubble management, N fertiliser treatments
Monitored for soil C, N, yield, N2O, (& disease tolerance, LCA) etc
Qld Government – Ram Dalal, Steven Reeves; CSU – Bill
Parton
9. Data collation issues
Examples of data issues:
data in different sources (e.g. paper, electronic) with different custodians
data stored in different layouts and file formats
different units used
different sampling strategies – may affect accuracy, comparability
may be only processed data (e.g. means) – limits interpretations & uses
frequency of data collection changes over time (depending on resources)
metadata may be missing or incomplete
type of physical or chemical analysis may change over time (e.g. for
carbon - Walkley & Black vs Leco)
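The unit mismatches listed above can be handled by converting every record to one common unit before collation. The following is a minimal sketch, not the project's code: the unit table and both function names are hypothetical. The stock calculation uses the standard relation SOC stock (t C/ha) = C (%) x bulk density (g/cm3) x depth (cm).

```python
# Illustrative conversion factors to a common unit (kg/ha).
# The set of units is an assumption; a real trial archive may use others.
TO_KG_PER_HA = {
    "kg/ha": 1.0,
    "t/ha": 1000.0,
    "g/m2": 10.0,  # 1 g/m2 = 10,000 g/ha = 10 kg/ha
}


def to_kg_per_ha(value, unit):
    """Convert a measurement to kg/ha, failing loudly on unknown units."""
    if unit not in TO_KG_PER_HA:
        raise ValueError(f"unknown unit: {unit!r}")
    return value * TO_KG_PER_HA[unit]


def soc_stock_t_per_ha(carbon_percent, bulk_density_g_cm3, depth_cm):
    """Soil organic carbon stock in t C/ha for one soil layer."""
    return carbon_percent * bulk_density_g_cm3 * depth_cm


if __name__ == "__main__":
    print(to_kg_per_ha(2.5, "t/ha"))
    print(soc_stock_t_per_ha(1.0, 1.3, 20))
```

Rejecting unknown units, rather than passing values through unchanged, keeps a mislabelled column from silently entering the collated dataset. A change of laboratory method (such as Walkley & Black vs Leco) is a separate problem that needs a site-specific correction and is not covered by this sketch.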
10. Data collation issues (2)
Access to specific site history knowledge is critical for both
data collation and developing model inputs, e.g.
pre-trial site history,
data outlier reasoning,
data gap filling and actual location of data sources
to ensure any model parameterisation was realistic
ACEAS investment provided an opportunity to bring together
data custodians, modellers & end-users
Data inputs for different models require further conversion –
hence the ANDS project
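One example of the conversion mentioned above: DayCent runs on a daily time step, whereas Century and RothC take monthly inputs, so the same climate record has to be supplied at two resolutions. The sketch below aggregates daily records to monthly values. The record layout (date, rainfall, maximum and minimum temperature) and the function name are assumptions for illustration, not the Semaphore file format.

```python
from collections import defaultdict
from datetime import date


def daily_to_monthly(records):
    """Aggregate daily climate records to monthly values.

    records: iterable of (date, rain_mm, tmax_c, tmin_c) tuples.
    Returns {(year, month): {...}} with rainfall summed and
    temperatures averaged over the days present in that month.
    """
    buckets = defaultdict(list)
    for day, rain, tmax, tmin in records:
        buckets[(day.year, day.month)].append((rain, tmax, tmin))

    monthly = {}
    for key in sorted(buckets):
        rows = buckets[key]
        n = len(rows)
        monthly[key] = {
            "rain_mm": sum(r[0] for r in rows),
            "tmax_c": sum(r[1] for r in rows) / n,
            "tmin_c": sum(r[2] for r in rows) / n,
            "n_days": n,  # kept so incomplete months can be flagged
        }
    return monthly


if __name__ == "__main__":
    sample = [
        (date(2012, 1, 1), 5.0, 30.0, 20.0),
        (date(2012, 1, 2), 0.0, 32.0, 22.0),
        (date(2012, 2, 1), 1.0, 28.0, 18.0),
    ]
    for month, values in daily_to_monthly(sample).items():
        print(month, values)
```

Returning the day count alongside each monthly value makes data gaps visible, which matters because gap filling was one of the steps that needed site-specific knowledge.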
12. Preliminary findings - Hermitage
Example: Simulated SOC (0-20 cm), zero till; (A) stubble retained, (B) stubble burnt; 3 N fertiliser rates
[Figure: panels A and B]
Preliminary results using DayCent:
• Trends in SOC across treatments OK
• Magnitude of SOC changes less
accurate
• Further parameterisation required for
outputs such as N2O fluxes
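The distinction drawn above between getting the trend right and getting the magnitude right can be made explicit with separate statistics. In this sketch the correlation coefficient stands in for trend agreement, while mean bias and RMSE measure magnitude error. The choice of statistics and the function name are assumptions, not the working group's stated method.

```python
import math


def evaluate(observed, simulated):
    """Compare simulated against observed values (e.g. SOC in t C/ha).

    Returns Pearson r (trend agreement), mean bias and RMSE
    (magnitude agreement). r is NaN if either series is constant.
    """
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("need two series of equal length (at least 2 values)")

    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    bias = mean_s - mean_o
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)

    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    r = cov / math.sqrt(var_o * var_s) if var_o > 0 and var_s > 0 else float("nan")

    return {"r": r, "bias": bias, "rmse": rmse}


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not trial data.
    print(evaluate([30.0, 32.0, 34.0, 36.0], [35.0, 36.0, 37.0, 38.0]))
```

A simulation can score a high r and still carry a large bias or RMSE, which is the pattern the slide describes: trends acceptable, magnitudes less accurate.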
14. Preliminary findings - General
Preliminary comparison of model outputs for FullCAM,
DayCent, Century and a Microsoft Excel version of RothC
Initial estimates suggest that if the input data are the
same/very similar, the 4 models will all deliver soil C stock
results within 10 t C/ha of the final result.
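One possible reading of the 10 t C/ha statement is that the final soil C stocks from the four models differ from each other by no more than that amount. The sketch below checks agreement under that reading. The interpretation, the function names and the numbers are all assumptions; the values are placeholders, not results from the project.

```python
def model_spread(final_stocks):
    """Largest difference (t C/ha) between any two models' final stocks."""
    values = list(final_stocks.values())
    if not values:
        raise ValueError("no model results supplied")
    return max(values) - min(values)


def within_tolerance(final_stocks, tolerance_t_per_ha=10.0):
    """True if all models agree to within the given tolerance."""
    return model_spread(final_stocks) <= tolerance_t_per_ha


if __name__ == "__main__":
    # Placeholder values for illustration only.
    stocks = {"FullCAM": 40.0, "DayCent": 44.0, "Century": 46.0, "RothC": 38.0}
    print(model_spread(stocks), within_tolerance(stocks))
```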
Semaphore
Improving the quality of modelling using scientific
workflows
15. Opportunities/challenges
• The rise of “data- and CPU-intensive”
analysis
– Multiplication of data sources and tools to analyse
the data
• Need for a scientific “cyber infrastructure”
– Increased expectation to share data and tools in
a structured way
• A new practice – “eScience”
– Ability for scientists to make sense of the
software written by others and modify it to fit their
needs
16. Pitfalls/solutions
• Problems
– Low visibility because data and code are not publicly
available
– Poor quality of software due to bugs/heavy
customisation
– Lack of provenance information and documentation of
procedures
• The project
– Ability to prepare data and run simulations remotely
– Rapid sharing of well described data and tools
– Allows users to examine the data analysis and re-
purpose the tools for other analyses
17. Data processing and analysis
• A manual process
prone to error and
inconsistency
• Capture (and expose)
implicit knowledge
and local conditions
• Integrate tools for
data cleaning and
preparation
18. How the software works
• Scientific workflow
software to capture
the steps of data
preparation and processing
• Visual interface that
allows user
manipulation of data
• Integration with data
management software
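The idea of capturing processing steps can be sketched as a small pipeline that records what each step did to the data. This is an illustration of the general technique only, not the Semaphore design: the Workflow class, the step functions and the record fields are hypothetical.

```python
import hashlib
import json


def _digest(data):
    """Short fingerprint of JSON-serialisable data, for the provenance log."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


class Workflow:
    """Runs registered steps in order and logs each step's input and output."""

    def __init__(self):
        self.steps = []
        self.log = []

    def step(self, func):
        """Decorator that registers a function as the next step."""
        self.steps.append(func)
        return func

    def run(self, data):
        self.log = []
        for func in self.steps:
            before = _digest(data)
            data = func(data)
            self.log.append({
                "step": func.__name__,
                "doc": (func.__doc__ or "").strip(),
                "input": before,
                "output": _digest(data),
            })
        return data


workflow = Workflow()


@workflow.step
def drop_missing(rows):
    """Remove records with no soil carbon measurement."""
    return [r for r in rows if r["soc_pct"] is not None]


@workflow.step
def to_stock(rows):
    """Convert %C to t C/ha for a 20 cm layer (depth is an assumption)."""
    return [dict(r, soc_t_ha=r["soc_pct"] * r["bd"] * 20) for r in rows]


if __name__ == "__main__":
    result = workflow.run([{"soc_pct": 1.0, "bd": 1.3}, {"soc_pct": None, "bd": 1.2}])
    print(result)
    for entry in workflow.log:
        print(entry)
```

Because each step is a named, documented function and the log records input and output fingerprints, another scientist can see which transformation produced a given dataset and swap a single step without rewriting the whole chain.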
20. Advantages
• Ability to run simulations with zero software
installation
• Public availability of data and software for peer-
review and public use
• Expose data processing and analysis for easy
bug fixes and re-use
• Flexible design to extend to other standard
modelling tools
21. For more information
• Project blog: semaphoreblog.wordpress.com
• Software prototype available on GitHub
• Final product will be available by June 2013
• Software development team
– Alvin Sebastian, Marco Fahmi, Siobhann
McCafferty, Jodie Vaughan, Vaughan Hobbs
• Project funders
– Australian National Data Service (ANDS)
– Australian Centre for Ecological Analysis and
Synthesis (ACEAS)
22. Thank You
Acknowledgements
Funding:
Australian Centre for Ecological Analysis and Synthesis (ACEAS)
Australian National Data Service (ANDS)
Expert contributions:
QUT
Queensland Government (DSITIA, DNR&M)
CSIRO
Colorado State University
DCCEE
University of Adelaide
NT Department of Resources
23. This project is supported by the Australian National Data Service (ANDS)
ANDS is supported by the Australian Government through the National
Collaborative Research Infrastructure Strategy Program and the Education
Investment Fund (EIF) Super Science Initiative
Editor's notes
Initially for 3 to 5 long-term research trials, and having a framework and capacity for expansion to include additional datasets and trials.
Data and metadata collation is very time-consuming and requires multiple sources. Throughout the data collation and modelling process, the importance of individual site history knowledge was evident. Many queries which arose throughout the process could only be answered by someone with a long-term association with the trial. These include pre-trial site history, data outlier reasoning, data gap filling and actual location of data sources. Once the data was collated, site historical knowledge was still important to ensure any model parameterisation was reasonable.
The importance of long-term trial data in refining current models, to enable them to predict carbon and nutrient dynamics in Australian agroecosystems, cannot be overstated. Many models currently in use have been developed for Northern Hemisphere systems, or have only been tested on a limited number of sites in Australia. To ensure that the models can accurately predict long-term temporal variability in agroecosystem dynamics under Australian conditions, data collated over longer time scales in Australia are vital. This will enable the effects of different climate scenarios and management regimes on carbon and nitrogen dynamics to be confidently modelled. This will further enable primary industries to investigate a range of issues, such as carbon sequestration or nitrogen leaching, in a timely and economical manner.
With the number of long-term trials constantly dwindling due to financial constraints, data from existing sites need to be actively collated with appropriate metadata and stored for future use in modelling. Without this emphasis, the scientific community will lose vital data, and our ability to utilise the knowledge associated with it will also be lost.
Preliminary results show the DayCent model is able to model the overall trends in soil carbon across treatments (i.e. soil carbon increased with increased applied nitrogen, stubble burning reduced soil carbon compared to stubble retention, and conventional tillage reduced soil carbon compared to zero till). However, the magnitude of temporal soil carbon changes is not accurate. Further parameterisation is required to refine the carbon outputs before other outputs, such as nitrous oxide fluxes, are investigated.
Throughout the data collation and modelling process, the importance of individual site history knowledge was evident. Many queries which arose throughout the process could only be answered by someone with a long-term association with the trial. These include pre-trial site history, data outlier reasoning, data gap filling and actual location of data sources. Once the data was collated, site historical knowledge was still important to ensure any model parameterisation was reasonable.
Issues with sharing and re-using data and code from others, due to lack of documentation, the need to re-purpose the software and the generally poor quality of the code
Various ways issues can be addressed using technologies that capture scientific workflows, automatic or semi-automatic data transformation and running software in a cloud environment
Issues that scientists encounter when developing custom code for data preparation and calibration of models due to the high specificity of each
Ancillary benefits of sound data management practices, and the ability to value the contribution of data officers, software developers and other technicians by publicising the existence of code and data, listing them in repositories and making them citable
The scientific benefits (better due diligence, increase efficiency, enabling synthesis) that could be gleaned from software that addresses immediate needs but that is designed with sufficient flexibility so it can be examined, used and modified by others for different data sets or different modelling tools.