Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Introduction to Digital Soil Mapping
(DSM)

R. A. (Bob) MacMillan
LandMapper Environmental Solutions Inc.

Presented to Golder Associates: Feb 6, 2013

Outline
• Unifying DSM Framework: Universal Model of Variation
– Z(s) = Z*(s) + ε(s) + ε
• Past: Early History of Development of DSM (pre 2003)
– Theory, Concepts, Models, Software, Inputs, Developments
– Examples of early methods and outputs
• Key Recent Developments in DSM post 2003
– Theory, Concepts, Models, Software, Inputs, Developments
– Examples of recent methods and outputs
• Future Trends: How do I See DSM Developing?
– Theory, Concepts, Inputs, Models, Software, Developments
– From Static Maps to Dynamic Real-Time Models

Introduction

Universal Model of Soil Variation
A Unifying Framework for DSM

Source: Burrough, 1986 eq. 8.14

Universal Model of Soil Variation
• A Unifying Framework for Digital Soil Mapping
Z(s) = Z*(s) + ε(s) + ε
– Z(s): Predicted soil type or soil property value
• Predicted spatial pattern of some soil property or class, including uncertainty of the estimate
– Z*(s): Deterministic part of the predictive model
• Part of the variation that is predictable by means of some statistical or heuristic soil-landscape model
– ε(s): Stochastic part of the predictive model
• Part of the variation that shows spatial structure and can be modelled with a variogram
– ε: Pure noise part of the predictive model
• Part of the variation that can’t be predicted at the current scale with the available data and models

Deterministic Part of Prediction Model:
Z*(s)
• Conceptual Models
– Conceptual or mental soil-landscape models
– Produce area-class maps
• Statistical Models
– Scorpan – relate soils/soil properties to covariates
– Explain spatial distribution of soils in terms of known soil forming factors as represented by covariates
[Figure: soil series in the landscape (KLM, FMN, EOR, DYD and COR series)]
[Figure: salinity hazard map example on a 100 x 100 m grid; individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils, land surface) are combined using layer weightings of 1x to 3x into a total salinity hazard rating]

Stochastic Part of Prediction Model:
ε(s)
• Geostatistical Estimation
– Predict soil properties
• Point or block kriging
– Predict soil classes
• Indicator kriging
– Predict error of estimate
• Correct Deterministic Part
– Error in deterministic part
is computed (residuals)
– If structure exists in error
then krige error & subtract

Pure Noise Part of Prediction Model:
ε
• Some Variation not Predictable
– Have to be honest about this
• Should quantify and report it
• Deterministic Prediction
– Mental and Statistical Models
• Not perfect – often lack suitable covariates to predict target variable
• Lack covariates at finer resolution
• Geostatistical Prediction
– Insufficient point input data
• Can’t predict at less than the smallest spacing of input point data
[Figure: semi-variogram of semi-variance against lag (distance) d1 to d4, showing the nugget, sill and range]
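The nugget, sill and range named on the variogram figure can be tied together in a model function. The sketch below uses the spherical model, which is one common choice and an assumption here (the slides do not say which model was fitted); `sill` is taken to be the total sill, nugget included.

```python
def spherical_variogram(h, nugget, sill, rng):
    """Semi-variance at lag distance h for a spherical model.

    nugget: variation not resolved at the shortest lags (the pure noise part)
    sill:   total semi-variance reached at and beyond the range
    rng:    lag distance beyond which points are no longer spatially correlated
    """
    if h <= 0:
        return 0.0          # a point compared with itself
    if h >= rng:
        return sill         # beyond the range the curve is flat
    ratio = h / rng
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


if __name__ == "__main__":
    # Illustrative parameters only, not taken from the presentation
    for lag in (0, 25, 50, 100, 150):
        print(lag, spherical_variogram(lag, nugget=2.0, sill=10.0, rng=100.0))
```

A large nugget relative to the sill indicates that most of the variation cannot be predicted from the spacing of the available point data.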

Past

Early History of DSM Development
(pre 2003)
On Digital Soil Mapping
McBratney et al., 2003

Early History of Development of DSM

[Figure: 2 x 2 matrix of DSM approaches: deterministic or stochastic models, each predicting soil classes or soil properties]

Past Theory: Deterministic Component
Z*(s) Classed Conceptual Models
– Jenny (1941)
• CLORPT (Note no N=space)
– Simonson (1959)
• Process Model of additions,
removals, translocations,
transformations
– Ruhe (1975)
• Erosional-depositional surfaces, open/closed basins
– Dalrymple et al. (1968)
• Nine unit hill slope model
– Milne (1936a, 1936b)
• Catena concept, toposequences

Past Concepts: Deterministic Component
Z*(s) Classed Conceptual Models
Soil = f (C, O, R, P, T, …)

[Figure: soil as a function of climate, organisms, topography, parent material and time]

Source: Lin, 2005 Frontiers in Soil Science
http://www7.nationalacademies.org/soilfrontiers/

http://solim.geography.wisc.edu/index.htm

Past Models: Deterministic Component
Z*(s) Classed Statistical Predictions
• Fuzzy Inference
– Zhu, 1997, Zhu et al., 1996
– MacMillan et al., 2000, 2005
• Neural Networks
– Zhu, 2000
• Expert Knowledge (Bayesian)
– Skidmore et al., 1991
– Cook et al., 1996, Corner et al., 1997
• Regression Trees
– Moran and Bui, 2002, Bui and Moran, 2003
[Figure: salinity hazard map example on a 100 x 100 m grid; individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils, land surface) are combined using layer weightings of 1x to 3x into a total salinity hazard rating]
Source: Jones et al., 2000

Past Software: Deterministic Component
• Regression Trees
– CUBIST
• Rulequest Research, 2000
– CART
• Breiman et al., 1984
– C4.5 & See5
• Quinlan, 1992
– JMP (SAS)
• http://www.jmp.com/
– R
• http://www.r-project.org/
• Fuzzy Logic
– SoLIM
• Zhu et al., 1996, 1997
– LandMapR, FuzME
• Bayesian Logic
– Prospector
• Duda et al., 1978
– Expector
• Skidmore et al., 1991
– Netica
• Norsys.com/netica

Past Inputs: Deterministic Component
• C = Climate
– Temp, Ppt, ET, Solar Rad
• Mean, min, max, variance
• Annual, monthly, indices
• O = Organisms
– Manual Maps
• Land Use
• Vegetation
– Remotely Sensed Imagery
• Classified RS imagery
• NDVI, EVI, other ratios
• R = Relief (topography)
– Primary Attributes
• Slope, aspect, curvatures
• Slope Position, roughness
– Secondary Attributes
• CTI, WI, SPI, STC
• P = Parent Material
– Published geology maps
– Gamma radiometrics
– Thermal IR, RS Ratios
• A = Age

• Common Topo Inputs
– Profile Curvature
– Plan (Contour) Curvature
– Slope Gradient (& Aspect)
– CTI or Wetness Index
• Sometimes, not always
• Less Common Topo Inputs
– Surface Roughness
– Relief within a window
– Relief relative to drainage
• Pit, peak, ridge, channel
[Figure panels: Profile Curvature, Plan Curvature, Slope Gradient, Wetness Index, Pit 2 Peak Relief, Divide 2 Channel]
Source: MacMillan, 2005

Past Inputs: Non-DEM Airborne
Radiometrics
• Radiometrics 4 Subsurface
• Infer Parent Material

Source: Mayr, 2005

Past Inputs: Non-DEM Satellite Imagery
Grassland Land Cover Types Alpine Land Cover Types

Z*(s)

Examples of Predictions of Soil Class
Maps

Approaches to Producing Predictive Area-
Class Maps

Knowledge-Based Classification In SoLIM

Source: Zhu, SoLIM Handbook

Knowledge-Based Classification Using
Boolean Decision Tree in USA
Component Soils
Gilpin
Pineville
Laidig
Guyandotte
Dekalb
Craigsville
Meckesville
Cateache
Shouns

Source: Thompson et al., 2010 WCSS

Knowledge-Based Classification In LandMapR

Source: Steen and Coupé, 1997


Knowledge-Based Classification In LandMapR

Source: Global Forest Watch Canada, 2012

Knowledge-Based Classification In Utah,
Knowledge-Based PURC Approach

Note: Not simple slope
elements but complex patterns

Source: Cole and Boettinger, 2004

Source: Zhou et al., 2004,
JZUS

Supervised Classification Using Regression Trees
Note similarity of supervised rules
and classes to typical soil-landform
conceptual classes
Note numeric estimate of
likelihood of occurrence of classes

Supervised Classification Using Bayesian
Analysis of Evidence/Classification Trees

Source: Zhou et al., 2004,
JZUS

Predicting Area-Class Soil Maps Using
Discriminant Analysis

Source: Scull et al., 2005, Ecological Modelling

Regression Trees

Extrapolation

Uncertainty of prediction

Bui and Moran (2003)
Geoderma 111:21-44
Source: Bui and Moran, 2003

Supervised Classification Using Fuzzy Logic
• Shi et al., 2004 Fuzzy likelihood of being a broad ridge
– Used multiple cases of reference
sites
– Each site was used to establish
fuzzy similarity of unclassified
locations to reference sites
– Used Fuzzy-minimum function to
compute fuzzy similarity
– Harden class using largest (Fuzzy-
maximum) value
– Considered distance to each
reference site in computing
Fuzzy-similarity
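The fuzzy-minimum and fuzzy-maximum steps listed above can be sketched as follows. This is a minimal illustration, not the method of Shi et al., 2004: the bell-shaped membership function, the covariate names and the `widths` parameter are assumptions, and the distance-to-reference-site weighting mentioned on the slide is left out.

```python
import math


def membership(value, typical, width):
    """Similarity (0 to 1) of one covariate value to a reference value.
    The Gaussian bell shape is an assumed form."""
    return math.exp(-0.5 * ((value - typical) / width) ** 2)


def similarity_to_site(cell, site, widths):
    """Fuzzy minimum: a location is only as similar to a reference site
    as its least similar covariate."""
    return min(membership(cell[k], site[k], widths[k]) for k in widths)


def classify(cell, reference_sites, widths):
    """reference_sites is a list of (class_name, covariates) pairs; a class
    may have several reference sites. Returns the hardened class and the
    fuzzy similarity to every class."""
    scores = {}
    for class_name, site in reference_sites:
        s = similarity_to_site(cell, site, widths)
        # Fuzzy maximum over the reference sites of the same class
        scores[class_name] = max(scores.get(class_name, 0.0), s)
    hardened = max(scores, key=scores.get)
    return hardened, scores


if __name__ == "__main__":
    # Hypothetical reference sites and covariates, for illustration only
    sites = [
        ("broad ridge", {"slope": 2.0, "relative_position": 90.0}),
        ("broad ridge", {"slope": 3.0, "relative_position": 80.0}),
        ("footslope", {"slope": 5.0, "relative_position": 20.0}),
    ]
    widths = {"slope": 3.0, "relative_position": 20.0}
    print(classify({"slope": 2.5, "relative_position": 85.0}, sites, widths))
```

Keeping the full `scores` dictionary, rather than only the hardened class, is what allows a map such as the fuzzy likelihood of being a broad ridge to be drawn.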

Source: Shi et al., 2004

Credit: J. Balkovič & G. Čemanová

Concept of Fuzzy K-means Clustering

Source: Sobocká et al., 2003

Example of Application of Fuzzy K-means
Unsupervised Classification

From: Burrough et al.,
2001, Landscape
Ecology

Note similarity of unsupervised
classes to conceptual classes

Example of Application of Disaggregation of
a Soil Map by Clustering into Components

Source: Faine, 2001

Developments: Deterministic Component
Z*(s) Classed Predictive Maps in Past
• Characteristics of Models
– Models largely ignored ε
• Seldom estimate error
• Rarely correct for error
– Mainly use DEM inputs
• Initially 3x3 windows
• Slope, aspect, curvatures
• Maybe wetness index
• Later improvements were measures of slope position
– Rarely use ancillary data
• Exceptions like Bui, Scull
– Operate at single scale
• Characteristics of Models
– Many use expert knowledge
• Data mining is the exception
• Training data seldom used
– Specialty software prevails
• Software for DEM analysis
– SoLIM, TAPES-G, TOPAZ, TOPOG, TAS, SAGA, ESRI, IDRISI, LandMapR
• Software for extracting rules
– Expector, Netica, CART, See 5, Cubist, Prospector
• Software for applying rules
– ESRI, SoLIM, SIE, SAGA

Z*(s) for Continuous Soil Properties

Approaches Aimed at Predicting
Continuous Soil Properties

Past Concepts: Deterministic Component
Z*(s) Continuous Soil Properties
• Same Theory-Concepts as for Classed Maps
Soil = f (C, O, R, P, T, …)
– Except theory applied to individual soil properties
– Initially referred to as environmental correlation
– Soil properties related to
• Landscape attributes
• Climate variables
• Geology, lithology, soil pm
• Key Papers
– Moore et al., 1993
• Linear regression
– McSweeney et al., 1994
– McKenzie & Austin, 1993
– Gessler et al., 1995
• GLMs in S-Plus
– McKenzie & Ryan, 1999
• Regression Trees

Z*(s) Continuous Soil Properties
– McKenzie & Ryan, 1998, Odeh et
al., 1994
• Fuzzy Logic-Neural Networks
– Zhu, 1997
• Bayesian Expert Knowledge
– Skidmore et al., 1996
– Cook et al., 1996, Corner et al., 1997
• GLMs – General Linear Models
– McKenzie & Austin, 1993
– Gessler et al., 1995
Source: McKenzie and Ryan, 1998

• Similar to Classed Maps But:
– Many innovations originated
with continuous modelers
• Increased use of non-DEM
attributes
– climate, radiometrics, imagery
• Improved DEM derivatives
– Wetness Index & CTI
– Upslope means for slope, etc.
– Inverted DEMs to compute
» Down slope dispersal
» Down slope means
» New slope position data


Examples of Predictions of Soil
Property Maps

Z*(s) Continuous Maps
• Aandahl, 1948 (Note Date!)
– Regression model
• Predicted
– Average Nitrogen (3-24 inch)
– Total Nitrogen by depth
– Total Organic Carbon by
depth interval
– Depth of profile to loess
• Predictor (covariate)
– Slope position as expressed by
length of slope from shoulder
– Lost in the depths of time

Source: Aandahl, 1948

• Moore et al., 1993
– Seminal paper
– Focus on topography
• Small sites
• Other covariates were
assumed constant
– Got people thinking
• About quantifying
environmental
correlation, especially
soil-topography
relationships

Source: Moore et al, 1993


• McKenzie & Ryan, 1998
– Regression Tree: Soil Depth


• Gessler et al., 1995
– GLMs
– Largely based on topography
• CTI
– Other covariates held steady

Source: Gessler, 2005

Credit: Minasny & McBratney

[Figure: regression tree with splits on texture class (C versus S, LS, L, CL, LiC), bulk density (BD, thresholds 1.42 and 1.43) and clay content (threshold 46.5)]
Source: Minasny and McBratney

Developments: Deterministic Component
Z*(s) Predictive Maps up to 2003
• Main Developments
– Better DEM derivatives
• More and better measures of landform position or context
• Some recognition of scale and resolution effects
– Different window sizes
– Different grid resolutions
– More non-DEM inputs
• Increased use of imagery
• New surrogates for PM
• Main Developments
– Integration of single models into multi-purpose software
• ArcGIS, ArcSIE, ArcView
• SAGA, Whitebox, IDRISI
– Improved processing ability
• Bigger files, faster processing
– Emergence of 2 main scales
• Hillslope elements (series)
– Quite similar across models
• Landscape patterns (domains)
– Similar to associations

Past Theory: Stochastic Component
ε(s)
– Waldo Tobler (1970)
• First law of geography
– Everything is related to
everything else, but near things
are more related than distant
things
– Matheron (1971)
• Theory of regionalized variables
– Webster and Cuanalo (1975)
• clay, silt, pH, CaCO3, colour
value, and stoniness on transect
– Burgess and Webster (1980a, b)
• Soil Property maps by kriging
• Universal kriging (drift) of EC

Source: Oliver, 1989

Past Models: Stochastic Component
ε(s)
– Universal Model of Variation
• Matheron (1971)

• Burgess and Webster (1980a, b)
• Webster and Burrough (1980)
• Burrough (1986)
• Webster and McBratney (1987)
• Oliver (1989)

Past Models: Stochastic Component
ε(s) Optimal Interpolation by Kriging
• Collect point sample observations
– Irregular spatial distribution (of observed point values)
• Compute semi-variance at different lag distances
• Fit semi-variogram to lag data
• Estimate values and error at fixed grid locations
[Figure: scattered point observations (values 5 to 8) in x,y space, a semi-variogram fitted to the lag data, and a regular grid of estimated values]
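The step of computing semi-variance at different lag distances can be sketched as below. The function name, the equal-width lag bins and the sample coordinates are illustrative assumptions; fitting a model curve to the resulting points is a separate step that is not shown.

```python
import math


def empirical_semivariogram(points, lag_width, n_lags):
    """points: list of (x, y, z) observations.

    Every pair of points is assigned to a lag bin by its separation
    distance; the semi-variance of a bin is half the mean squared
    difference of the paired values. Returns a list of
    (mean_distance, semi_variance, n_pairs) for the non-empty bins.
    """
    sq_diff = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            b = int(h // lag_width)
            if b < n_lags:
                sq_diff[b] += (zi - zj) ** 2
                dist_sum[b] += h
                counts[b] += 1
    result = []
    for b in range(n_lags):
        if counts[b]:
            result.append(
                (dist_sum[b] / counts[b], sq_diff[b] / (2 * counts[b]), counts[b])
            )
    return result


if __name__ == "__main__":
    # Illustrative observations only
    obs = [(0, 0, 6.0), (30, 10, 5.0), (10, 40, 7.0), (50, 50, 6.0), (80, 20, 8.0)]
    for mean_h, gamma, n in empirical_semivariogram(obs, lag_width=25.0, n_lags=4):
        print(round(mean_h, 1), round(gamma, 2), n)
```

Bins with few pairs give unstable semi-variance estimates, which is one reason the slides stress having sufficient point samples.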

Past Software: Stochastic Component
ε(s)
• Earlier Stand Alone
– Pc-Geostat (PC-Raster)
• Early version of GSTAT
– VESPER
• Variogram estimation and spatial prediction with error
• Minasny et al., 2005
• http://sydney.edu.au/agriculture/pal/software/vesper
– GEO-EAS (DOS, 1991)
• http://www.epa.gov/ada/csmos/models/geoeas.html
• Later More Integrated
– GSTAT
• Pebesma and Wesseling, 1998
• Incorporated into IDRISI
• Now incorporated into R and S-Plus packages
– Pebesma, 2004
• http://www.gstat.org/index.html
– ArcGIS
• Geostatistical Analyst
– SGeMS (Stanford Univ)
• http://sgems.sourceforge.net/

Past Inputs: Stochastic Component ε(s)
• Essentially Just x,y,z Values at Point Locations
1. Start with set of soil property values irregularly distributed in x,y Cartesian space
2. Locate the regularly spaced grid nodes where predicted soil property values are to be calculated
3. Locate the n soil property data points within a search window around the current grid cell for which a value is to be calculated
4. Compute a new value for each location as the weighted average of the n neighbouring values, with weights established by the semi-variogram
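Steps 3 and 4 can be sketched for a single grid node with ordinary kriging. This is a minimal standard-library illustration under stated assumptions: the exponential variogram and its parameters are assumed rather than fitted, and `neighbours_within`, `solve` and `ordinary_kriging` are hypothetical helper names.

```python
import math


def exp_variogram(h, nugget=0.0, sill=1.0, rng=100.0):
    """Exponential semi-variogram with assumed parameters."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))


def neighbours_within(points, target, radius):
    """Step 3: keep the (x, y, z) points inside the search window."""
    return [p for p in points
            if math.hypot(p[0] - target[0], p[1] - target[1]) <= radius]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(points, target, variogram=exp_variogram):
    """Step 4: weighted average of neighbouring values.

    The weights come from the semi-variogram and are constrained to
    sum to 1. Returns (estimate, kriging_variance, weights).
    """
    n = len(points)
    a = [[variogram(math.hypot(p[0] - q[0], p[1] - q[1])) for q in points] + [1.0]
         for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.hypot(p[0] - target[0], p[1] - target[1]))
         for p in points] + [1.0]
    sol = solve(a, b)
    weights, lagrange = sol[:n], sol[n]
    estimate = sum(w * p[2] for w, p in zip(weights, points))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return estimate, variance, weights


if __name__ == "__main__":
    # Illustrative observations only
    obs = [(0, 0, 6.0), (30, 10, 5.0), (10, 40, 7.0), (50, 50, 6.0), (200, 200, 9.0)]
    node = (20.0, 20.0)
    print(ordinary_kriging(neighbours_within(obs, node, radius=80.0), node))
```

Repeating the call for every grid node from step 2 gives both the predicted surface and a map of the kriging variance.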

Past Models: Stochastic Component ε(s)
for Continuous Soil Properties

Examples of Predictions of Soil
Property Maps by Kriging

Continuous Soil Property Maps by
Kriging
• Very Early Alberta Example
– Lacombe Research Station
• Sampled soils on a 50 m grid
– Sand, Silt, Clay,
– pH, OC, EC, others
– 3 depths (0-15, 15-50, 50-100)
• Used custom written software
– Compute variograms
– Interpolate using the variograms
• Only visualised as contour maps
– Only got 3D drapes in 1988
– Used PC-Raster to drape
– Saw strong soil-landscape pattern
[Figure: semi-variogram for A-horizon % sand; semi-variance (0 to 160) against lag (1 to 19; 1 lag = 30 m)]
[Figure: Lacombe site, A horizon % sand (1985)]
Source: MacMillan, 1985 unpublished

Kriging

Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml

Kriging
• Yasribi et al., 2009
– Simple ordinary kriging
of soil properties (OK)
• No co-kriging
• No regression prediction
– Relies on presence of
• Sufficient point samples
• Spatial structure over
distances longer than the
smallest sampling
interval

Source: Yasribi et al., 2009

Kriging
• Shi, 2009
– Comparison of pH by
four different methods
• a) HASM
• b) Kriging
• c) IDW
• d) Splines

Source: Yasribi et al., 2009

Developments: Stochastic Component
ε(s) Predictive Maps up to 2003
• Main Developments
– Theory
• Becomes better understood and accepted
– Concepts
• Regression-kriging evolves to include a separate part for regression prediction
– Models
• Understanding and use of universal model grows
– Inputs
• Developments in sampling designs and sampling theory
• Main Developments
– Software
• From stand alone and single purpose to integrated software
• Improvements in
– Visualization
– Capacity to process large data sets
– Automated variogram fitting
– Ease of use
• Directional, local variograms

Present and Recent Past

Key Developments in DSM Since 2003
(2003-2012)
On Digital Soil Mapping
McBratney et al., 2003

Developments in DSM Since 2003
Increasing Convergence and Interplay

[Figure: 2 x 2 matrix of DSM approaches: deterministic or stochastic models, each predicting soil classes or soil properties]
Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Theory: Key Developments Since 2003
• Deterministic Part
– Pretty much unchanged
• Still based on attempting to elucidate quantitative relationships between soils & environmental covariates
– But
• Scorpan elaboration highlights importance of the spatial component (n) and of spatially correlated error ε(s)
• Stochastic Part
– Same underlying theory
• Still based on theory of regionalized variables
– But
• Increasing realization that the structural part of variation (non-stationary mean or drift) can be better modelled by a deterministic function than by purely spatial calculations

Concepts: Key Developments Since 2003
• Deterministic Part
– Scorpan Model
• Explicitly recognizes soil data (s) as a potential input to predict other soil data
– Soil inputs can include soil maps, point observations, even expert knowledge
• Explicitly recognizes space (n) or location as a factor in predicting soil data
– Space as in x,y location
– Space as in context, kriging
• Factors as predictors
– Factors explicitly seen as quantitative predictors in prediction function
Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Concepts: Key Developments Since 2003
• Stochastic Part
– Emergence of Regression
Kriging (RK)
• Key difference to ordinary
kriging is that it is no longer
assumed that the mean of a
variable is constant
• Local variation or drift can
be modelled by some
deterministic function
– Local regression lowers
error, improves predictions
– Local regression function
can even be a soil map
Source: Heuvelink, personal communication

Models: Key Developments Since 2003
• Deterministic Part
– Improvements in Data Mining and Knowledge Extraction
• Supervised Classification
– Training data obtained from both points and maps
» Sample maps at points
– Ensemble or multiple realization models (100 x)
» Boosting, bagging
» Random Forests
» ANN, Regression tree
• Expert Knowledge Extraction
– Bayesian Analysis of Evidence
– Prototype Category Theory
– Fuzzy Neural Networks
– Tools for Manual Extraction of Fuzzy Expert Knowledge
» ArcSIE, SoLIM
• Unsupervised classification
– Fuzzy k-means, c-means

Models: Key Developments Since 2003
• Stochastic Part
– Regression Kriging
• Recognized as equivalent to universal kriging or kriging with external drift
• Use of external knowledge and maps made easier
– Incorporation of soft data
• Made more accessible through implementation in commercial (ESRI) and open source software (R)
– Regression Kriging: key references
• Odeh et al., 1995
• McBratney et al., 2003
• Hengl et al., 2003, 2004, 2007
• Heuvelink, 2006
• Hengl how to books
– http://spatial-analyst.net/book/
– http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf

Source: Hengl et al., 2012

Comparison of Soil Property Maps by
Kriging & RK
• Hengl et al., 2012
– Comparison of ordinary
kriging and regression
kriging
• Evidence supports RK as
explaining more of the
variation than OK alone
– Greater spatial detail
– Fewer extrapolation
areas
– Better fit to data

Software: Key Developments Since 2003
• Commercial Software
– JMP (SAS) (McBratney)
• http://www.jmp.com/
– S-Plus, Matlab
• Used by soil researchers
– See5, CUBIST, CART
– Netica (Bayesian)
• Norsys.com/netica
– Improvements
• Better visualization
• Better interfaces
• Non-commercial Software
– Fuzzy Logic
• SoLIM Zhu et al., 1996, 1997
• ArcSIE Shi, FuzME
– Bayesian Logic
– Full Range of Options
• R
– http://www.r-project.org
– Regression Kriging
– Random Forests
– Regression Trees
– GLMs
• GSTAT (in R)

Source: Schmidt and Andrew, 2005

Inputs: Key Developments Since 2003
• Terrain Attributes
– More and better measures
• Primarily contextual and
related to landform position
– Real advances related to
• Multi-scale analysis
– varying window size and
grid resolution
• Window-based and flow-
based hill slope context
• Systematic examination of
relationships of properties
and processes to scale
Source: Smith et al., 2006

• Terrain Attributes
– Multi-scale analysis
• Varying window size and
grid resolution
• Identifies that some
variables are more useful
when computed over larger
windows or coarser grids
– Finer resolution grids not
always needed or better
– Drop off in predictive
power of DEMs after
about 30-50 m grid
resolution
Source: Deng et al., 2007

• ConMAP: Hyper-scale Contextual Analysis of Topographic Parameters
– Neighborhood
example
• Diameter
– 21 km
• Predictors
– 775

Source: Behrens et al., in press

• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters
ConStat (ConMap)
- neighborhood reduction

a) Full neighborhood
b) Reduction of radii
c) Reduction on radii
d) Combination of b and c


• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters


• Hyper-scale Terrain
Analysis in ConSTAT
– Systematic analysis of relative
importance of terrain
measures at different scales
• Compute statistics of terrain
measures at different scales
– Use data mining (Random
Forests) to identify
importance of different
statistics at different scales
and at each different location


MrVBF: Multi-scale DEM Analysis
Smooth and subsample
[Figure: flatness, bottomness and valley bottom flatness computed from the original 25 m DEM and from DEMs generalised to 75 m and 675 m]
Source: Gallant, 2012

Multiple Resolution Landform Position
MrVBF Example Outputs

Broader Scale 9” DEM

MRVBF for 25 m DEM

Source: Gallant, 2012

Developments: Improved Measures of
Landform Position
• SAGA-RHSP: relative hydrologic slope position
– Source: C. Bulmer, unpublished
– Calculation based on: MacMillan, 2005
• SAGA-ABC: altitude above channel
– Source: C. Bulmer, unpublished

Landform Position
• TOPHAT – Schmidt and Hewitt (2004)
– Source: Schmidt & Hewitt, (2004)
• Slope Position – Hatfield (1996)
– Source: Hatfield (1996)

Landform Position - Scilands

Source: Rüdiger Köthe , 2012

Measures of Relative Slope Length (L)
Computed by LandMapR
• Percent L Pit to Peak • Percent L Channel to Divide

MEASURE OF REGIONAL CONTEXT MEASURE OF LOCAL CONTEXT

Image Data Copyright the Province of British Columbia, 2003

Measures of Relative Slope Position
Computed by LandMapR
• Percent Diffuse Upslope Area • Percent Z Channel to Divide

SENSITIVE TO HOLLOWS & DRAWS RELATIVE TO MAIN STREAM CHANNELS
Image Data Copyright the Province of British Columbia, 2003


Developments: Improved Classification of
Landform Patterns Iwahashi & Pike (2006)
• Iwahashi landform underlying 1:650k soil map

Terrain Classes and Terrain Series (numbered from steep to gentle)
Fine texture, High convexity: 1, 5, 9, 13
Fine texture, Low convexity: 3, 7, 11, 15
Coarse texture, High convexity: 2, 6, 10, 14
Coarse texture, Low convexity: 4, 8, 12, 16

Source: Reuter, H.I. (unpublished)

• Non-Terrain Attributes
– Systematic analysis of
environmental covariates
• Detect distances and scales
over which each covariate
exhibits a strong relationship
with a soil or property to be
predicted or just with itself
– Vary window sizes and grid
resolutions and compute
regressions on derivatives
– analyse range of variation
inherent to each covariate
» Functional relationships
are dependent on scale
Source: Park, 2004

• Non-Terrain Attributes
– Systematic analysis of scale of
environmental covariates
• Select and use input covariates
at the most appropriate scale
– Explicitly recognize the
hierarchical nature of
environmental controls on
soils
– Select variables at the scales,
resolutions or window sizes
with the strongest predictive
power for each property or
class to be predicted.

Source: Park, 2004

Harmonization of soil profile depth data through spline fitting

Source: David Jacquier, 2010

From discrete soil classes to continuous soil properties
Clearfield soil series, Wapello County, Iowa (Mukey: 411784, Musym: 230C)
Harmonization of soil profile data through spline fitting
• ‘Modal’ profile
• Fit mass-preserving spline
• Fitted spline
• Estimate averages for spline at standardised depth ranges, e.g., globalsoilmap depth ranges
• Spline averages at specified depth ranges
Source: Sun et al., (2010)

Source: Hempel et al., 2011

Outputs: Key Developments Since 2003
• From Classes to Properties
– Non-disaggregated soil maps
• Weighted averages by polygon
by soil property and depth
– Calling version 0.5
– Disaggregated Soil Class Maps
• Estimate soil property values at
every grid cell location & depth
– Based on weighted likelihood
value of occurrence of each of
n soils times property value for
that soil at that depth
– Likelihood value can come
from various methods
Source: Sun et al, 2010
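The likelihood-weighted estimate described for disaggregated soil class maps amounts to a weighted average at each grid cell and depth. The sketch below is a minimal illustration; the class names and values are hypothetical, and dividing by the sum of the likelihoods is an added assumption so that the function also works when the likelihoods do not sum to 1.

```python
def property_from_likelihoods(likelihoods, class_values):
    """Estimate a soil property value at one grid cell and depth.

    likelihoods:  {soil_class: likelihood of occurrence at this cell}
    class_values: {soil_class: property value for that soil at this depth}
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    weighted = sum(likelihoods[c] * class_values[c] for c in likelihoods)
    return weighted / total


if __name__ == "__main__":
    # Hypothetical soils and clay contents (%) for one depth interval
    cell_likelihoods = {"soil_a": 0.6, "soil_b": 0.3, "soil_c": 0.1}
    clay_percent = {"soil_a": 20.0, "soil_b": 30.0, "soil_c": 50.0}
    print(property_from_likelihoods(cell_likelihoods, clay_percent))
```

The likelihood values themselves can come from any of the classification methods covered earlier, such as fuzzy similarity or the per-pixel probabilities of an ensemble of trees.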

Outputs: Key Developments Since 2003
• From Classes to Properties
– Disaggregated Soil Class Maps
• Estimate soil property values at
every grid cell location

Source: Zhu et al., 1997

Recent Models

Recent Examples of Predictions of
Soil Class Maps

Predicting Area-Class Soil Maps
Clovis Grinand, Dominique Arrouays,
Bertrand Laroche, and Manuel Pascal Martin.
Extrapolating regional soil landscapes from an
existing soil map: Sampling intensity,
validation procedures, and integration of
spatial context. Geoderma 143, 180-190

Source: Grinand et al., 2008

Recent Knowledge-Based Classification In
Africa, Multi-scale, Hierarchical Landforms
Elevation + Slope + UPA + Catena (2 km support)
SOTER Soil and landforms (1:1 million – 1.5 million)

Source: Park et al, 2004

Digital Soil Mapping
in England & Wales
using Legacy Data
[Figure: workflow. Training data (point data, detailed soil maps, expert knowledge) and covariates (DEM derivatives from TOPAZ, TAPES-G and LandMapR) feed modelling in Netica; outputs are predicted soil series and an accuracy assessment]

Source: Mayr, 2010

Multiple Regression Trees (100 x)
• Prepare a database and tables of mapping units & soil series, and covariates
• Select 1/n of the points systematically (n=100)
• Repeat n times
– Sample soil series randomly from the multinomial distribution of mapping unit composites
– Construct decision tree
– Predict soil series at all pixels
• Calculate the soil series statistics based on the n predictions for each pixel
• Calculate the probability for each soil series
• Generate soil series maps
Used See 5 (RuleQuest Research, 2009)

Source: Sun et al., 2010
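Two steps of this workflow can be sketched without the tree software: drawing a soil series from the composition of a map unit, and turning the n predictions for a pixel into probabilities. The function names and series names are hypothetical, and the decision-tree step (See 5 in the source) is not reproduced.

```python
import random
from collections import Counter


def sample_series(composition, rng):
    """Draw one soil series from a map unit, with probability
    proportional to its component percentage."""
    names = list(composition)
    weights = [composition[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def series_probabilities(predictions):
    """predictions: the n soil series predicted for one pixel, one per
    tree. The probability of a series is its share of the predictions."""
    n = len(predictions)
    return {name: count / n for name, count in Counter(predictions).items()}


def most_likely_series(predictions):
    """Return the most frequently predicted series and its probability."""
    probs = series_probabilities(predictions)
    best = max(probs, key=probs.get)
    return best, probs[best]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical map unit made up of 70 % series_a and 30 % series_b
    unit = {"series_a": 70, "series_b": 30}
    training_labels = [sample_series(unit, rng) for _ in range(5)]
    print(training_labels)
    # Hypothetical predictions of 100 trees at one pixel
    pixel_predictions = ["series_a"] * 62 + ["series_b"] * 38
    print(most_likely_series(pixel_predictions))
```

Mapping the first element of `most_likely_series` gives the most likely soil series map, and mapping the second gives its associated probability.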

Multiple Regression Trees (100 x)
A closer look at the junction point in the middle of 4 combined maps,
(a) the original map units, and
(b) the most likely soil series map and its associated probability.
The length of the image is approximately 14 km.

[Figure legend: monr_comppct value, high 100, low 7]

Source: Sun et al., 2010

Recent Models

Recent Examples of Predictions of
Continuous Soil Property Maps


Kriging & RK
• Hengl et al., 2004
– Comparison of topsoil
thickness by four
different methods
• a) Point locations
• b) Soil Map only
• c) Ordinary Kriging
• d) Plain Regression
• e) Regression-kriging
– Evidence supports RK

Source: Minasny et al., 2010

Recent Example: Regression-Kriging
(scorpan + ε)
300 soil point data

Assemble
field data


(scorpan + ε)

Assemble covariates for
the predictive model


(scorpan + ε)
Perform regression to
build a predictive model

Linear Model
OC = f(x) + e

Predictors
Elevation
Aspect
Landsat band 6
NDVI
Land-use
Soil-Landscape
Unit


(scorpan + ε)

Predict both
property value
and standard
error over the
entire area


(scorpan + ε)
Fit a variogram to the
residuals


(scorpan + ε)

Krige the residuals


(scorpan + ε)
Linear Model + Residuals
Add interpolated
residuals to the
prediction from
regression

Final Prediction
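The regression-kriging sequence walked through on these slides (regress, take residuals, krige the residuals, add them back) can be sketched as follows. This is a simplified illustration and not the implementation used for the example: it uses a single covariate instead of the listed predictors, and an exponential variogram with assumed parameters in place of one fitted to the residuals. All function names are hypothetical.

```python
import math


def fit_line(cov, obs):
    """Least-squares intercept and slope for a single covariate."""
    n = len(cov)
    mean_c, mean_o = sum(cov) / n, sum(obs) / n
    sxx = sum((c - mean_c) ** 2 for c in cov)
    slope = sum((c - mean_c) * (o - mean_o) for c, o in zip(cov, obs)) / sxx
    return mean_o - slope * mean_c, slope


def gamma(h, nugget=0.0, sill=1.0, rng=500.0):
    """Exponential semi-variogram; the parameters are assumed, not fitted."""
    return 0.0 if h == 0 else nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def krige(xy, values, target):
    """Ordinary kriging of values observed at xy to a target location."""
    n = len(xy)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    a = [[gamma(dist(p, q)) for q in xy] + [1.0] for p in xy]
    a.append([1.0] * n + [0.0])
    b = [gamma(dist(p, target)) for p in xy] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values))


def regression_kriging(samples, target_xy, target_cov):
    """samples: list of (x, y, covariate, observed value)."""
    intercept, slope = fit_line([s[2] for s in samples], [s[3] for s in samples])
    residuals = [s[3] - (intercept + slope * s[2]) for s in samples]
    trend = intercept + slope * target_cov                 # regression prediction
    kriged_residual = krige([(s[0], s[1]) for s in samples], residuals, target_xy)
    return trend + kriged_residual                         # final prediction


if __name__ == "__main__":
    # Illustrative samples only: (x, y, covariate, observed value)
    data = [(0, 0, 1.0, 5.0), (100, 0, 2.0, 9.0), (0, 100, 3.0, 10.0), (100, 100, 4.0, 15.0)]
    print(regression_kriging(data, (50.0, 50.0), 2.5))
```

With a zero nugget the kriged residual reproduces the observed residual at a sampled location, so the final prediction honours the field data there.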


(scorpan + ε)

Add regression variance
and kriging variance to
get total variance
$(\text{Total variance})^{1/2} = \sqrt{(\text{Std. err. of regression})^2 + (\text{Std. err. of kriging})^2}$
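The combination of the two error terms is a sum of variances followed by a square root. A minimal sketch, with a hypothetical function name; treating the regression and kriging errors as independent is the assumption that makes simple addition valid.

```python
import math


def total_standard_error(se_regression, se_kriging):
    """Square root of the summed regression and kriging variances."""
    return math.sqrt(se_regression ** 2 + se_kriging ** 2)


if __name__ == "__main__":
    # Illustrative standard errors only
    print(total_standard_error(3.0, 4.0))
```

Applied cell by cell to the regression standard error grid and the kriging standard error grid, this yields the map of total prediction uncertainty.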


[Figure: regression-kriging of carbon (C). The regression model C = 100 - 1.2EC - 5.2REF - 0.6REF2 - 2.1EL gives C predicted for sampled locations and for all grid locations; the residuals are kriged and added to produce the final C map (Mg C/ha, legend 15 to 95)]
Reported statistics: Mean 64.0, Min 27.0, Max 87.9, CV% 18.4, RMSE 9.8, RI (%) 19.7

Source: Mayr et al., 2010

Hybrid Bayesian Analysis

Future Trends

Personal View of Likely Future DSM
Development
(Post 2012)

Source: Heuvelink et al., 2004

The Future: Let's Go Back and Talk About
the Universal Model of Variation Again
Z(s) = Z*(s) + ε(s) + ε
Lots of things qualify
as regression!

Deterministic part of
the predictive model

Regression just
means minimizing
variance

Stochastic part of the
predictive model

What is all this talk
about optimization?

The Future: A Conceptual Framework for
GSIF – A Global Soil Information Facility
• Collaborative and open production, assembly and sharing of covariate data (World Grids)
• Collaborative and open collection, input and sharing of geo-registered field evidence (Open Soil Profiles)
• Collaborative and open modelling on an interactive, web-based server-side platform
• Everything is accessible, transparent and repeatable

Maps we can all contribute to, access, use, modify and
update, continuously and transparently

The Future: Functionality for GSIF – A
Global Soil Information Facility
• Possibility of making use of existing legacy soil maps (even new soil maps) needed for soil prediction anywhere
• Possibility of rescuing, sharing, harmonizing and archiving soil profile point data needed for soil prediction anywhere
• Possibility to assess error and correct for it everywhere
• Possibility to develop and use global models (even for local mapping)
• Possibility to develop and use multi-scale and multi-resolution hierarchical models


The Future: Conceptual Framework for
GSIF – World Soil Profiles


The Future: Implemented Framework for
GSIF – World Soil Profiles
Source: www.worldsoilprofiles.org

The Future: Conceptual Framework for
GSIF – World Grids


GSIF – World Grids
Source: www.worldgrids.org

GSIF

The Future: Collaborative Global, Multi-
Scale Mapping through GSIF

• Possibility to develop and use global models (even for local mapping)
• Possibility for combining Top-Down and Bottom-up mapping through weighted averaging of 2 or more sets of predictions
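Weighted averaging of two or more sets of predictions can be sketched for a single grid cell as below. The slide does not say how the weights are chosen; weighting each prediction by the inverse of its prediction variance is an assumption made here for illustration, and the function name is hypothetical.

```python
def combine_predictions(predictions, variances):
    """Weighted average of several predictions for the same grid cell.

    Each prediction is weighted by 1 / variance, so that the more
    certain prediction (for example a local model where local data
    are dense) contributes more to the combined value.
    """
    if len(predictions) != len(variances):
        raise ValueError("one variance is needed per prediction")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


if __name__ == "__main__":
    # Illustrative values only: a global (top-down) and a local (bottom-up) prediction
    print(combine_predictions([10.0, 20.0], [1.0, 3.0]))
```

With equal variances this reduces to a plain average of the global and local predictions.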

The Future: Global, Multi-Scale Modeling
of Soil Properties through GSIF
Possibility to
develop and use
multi-scale and
multi-resolution
hierarchical models
Possibility to
develop and use
global models (even
for local mapping)


• Global DSM Models
– Make use of ALL data
• From everywhere in the world
– Provide initial coarse local predictions
• That can be refined and improved with:
– More & finer local data
– Local model runs
Global Models inform and improve local mapping

Source: Hengl personal communication, 2013


Global Models
inform and
improve local
mapping



Anyone can
access and
display the
maps


With Google
Earth everyone
has a GIS to
view free soil
maps and data

Slide credit: Tom Hengl,
2011

The Future: Collaborative Global, Multi-
Scale Mapping through GSIF
A Global
Collaboratory!
Working together
we can map the
world one tile at a
time!

The next generation
of soil surveyors is
everyone!


The Future: From Mapping to
Continuously Updated Modelling
Possibility to move from single
snapshot mapping of static soil
properties to continuous update and
improvement of maps of both static and
dynamic properties within a structured
and consistent framework.
