1. Introduction to Digital Soil Mapping
(DSM)
R. A. (Bob) MacMillan
LandMapper Environmental Solutions Inc.
Presented to Golder Associates: Feb 6, 2013
2. Outline
• Unifying DSM Framework: Universal Model of Variation
– Z(s) = Z*(s) + ε(s) + ε
• Past: Early History of Development of DSM (pre 2003)
– Theory, Concepts, Models, Software, Inputs, Developments
– Examples of early methods and outputs
• Key Recent Developments in DSM post 2003
– Theory, Concepts, Models, Software, Inputs, Developments
– Examples of recent methods and outputs
• Future Trends: How do I See DSM Developing?
– Theory, Concepts, Inputs, Models, Software, Developments
– From Static Maps to Dynamic Real-Time Models
4. Source: Burrough, 1986 eq. 8.14
Universal Model of Soil Variation
• A Unifying Framework for Digital Soil Mapping
Z(s) = Z*(s) + ε(s) + ε
– Z(s): Predicted soil type or soil property value
• Predicted spatial pattern of some soil property or class, including uncertainty of the estimate
– Z*(s): Deterministic part of the predictive model
• Part of the variation that is predictable by means of some statistical or heuristic soil-landscape model
– ε(s): Stochastic part of the predictive model
• Part of the variation that shows spatial structure and can be modelled with a variogram
– ε: Pure noise part of the predictive model
• Part of the variation that can't be predicted at the current scale with the available data and models
5. Deterministic Part of Prediction Model:
Z*(s)
• Conceptual Models
– Conceptual or mental soil-landscape models
– Produce area-class maps
[Figure: soil-landscape diagram showing the KLM, FMN, EOR, DYD and COR series]
• Statistical Models
– Scorpan – relate soils/soil properties to covariates
– Explain spatial distribution of soils in terms of known soil forming factors as represented by covariates
[Figure: weighted overlay on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils, land surface) are multiplied by layer weightings (1x to 3x) and combined into a total salinity hazard rating and a salinity hazard map]
6. Stochastic Part of Prediction Model:
ε(s)
• Geostatistical Estimation
– Predict soil properties
• Point or block kriging
– Predict soil classes
• Indicator kriging
– Predict error of estimate
• Correct Deterministic Part
– Error in deterministic part
is computed (residuals)
– If structure exists in error
then krige error & subtract
7. Pure Noise Part of Prediction Model:
ε
• Some Variation not Predictable
– Have to be honest about this
• Should quantify and report it
• Deterministic Prediction
– Mental and Statistical Models
• Not perfect – often lack suitable covariates to predict target variable
• Lack covariates at finer resolution
• Geostatistical Prediction
– Insufficient point input data
• Can't predict at less than the smallest spacing of input point data
[Figure: semi-variogram showing nugget, sill and range; y-axis: semi-variance, x-axis: lag (distance) d1 to d4]
8. Past
Early History of DSM Development
(pre 2003)
On Digital Soil Mapping
McBratney et al., 2003
9. Early History of Development of DSM
[Diagram: 2 x 2 matrix of approaches: Deterministic vs. Stochastic, each applied to Soil Classes and to Soil Properties]
10. Past Theory: Deterministic Component
Z*(s) Classed Conceptual Models
– Jenny (1941)
• CLORPT (Note no N=space)
– Simonson (1959)
• Process Model of additions,
removals, translocations,
transformations
– Ruhe (1975)
• Erosional-depositional
surfaces, open/closed basins
– Dalrymple et al. (1968)
• Nine unit hill slope model
– Milne (1936a, 1936b)
• Catena concept, toposequences
11. Past Concepts: Deterministic Component
Z*(s) Classed Conceptual Models
Soil = f (C, O, R, P, T, …)
[Figure: the soil forming factors (Climate, Organisms, Topography, Parent Material, Time) acting on Soil]
Source: Lin, 2005 Frontiers in Soil Science
http://www7.nationalacademies.org/soilfrontiers/
13. Past Models: Deterministic Component
Z*(s) Classed Statistical Predictions
• Fuzzy Inference
– Zhu, 1997, Zhu et al., 1996
– MacMillan et al., 2000, 2005
• Neural Networks
– Zhu, 2000
• Expert Knowledge (Bayesian)
– Skidmore et al., 1991
– Cook et al., 1996, Corner et al., 1997
• Regression Trees
– Moran and Bui, 2002, Bui and Moran, 2003
[Figure: weighted overlay on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils, land surface) are multiplied by layer weightings (1x to 3x) and combined into a total salinity hazard rating and a salinity hazard map]
Source: Jones et al., 2000
14. Past Software: Deterministic Component
Z*(s) Classed Statistical Predictions
• Regression Trees
– CUBIST
• Rulequest Research, 2000
– CART
• Breiman et al., 1984
– C4.5 & See5
• Quinlan, 1992
– JMP (SAS)
• http://www.jmp.com/
– R
• http://www.r-project.org/
• Fuzzy Logic
– SoLIM
• Zhu et al., 1996, 1997
– LandMapR, FuzME
• Bayesian Logic
– Prospector
• Duda et al., 1978
– Expector
• Skidmore et al., 1991
– Netica
• Norsys.com/netica
15. Past Inputs: Deterministic Component
Z*(s) Classed Statistical Predictions
• C = Climate
– Temp, Ppt, ET, Solar Rad
• Mean, min, max, variance
• Annual, monthly, indices
• O = Organisms
– Manual Maps
• Land Use
• Vegetation
– Remotely Sensed Imagery
• Classified RS imagery
• NDVI, EVI, other ratios
• R = Relief (topography)
– Primary Attributes
• Slope, aspect, curvatures
• Slope Position, roughness
– Secondary Attributes
• CTI, WI, SPI, STC
• P = Parent Material
– Published geology maps
– Gamma radiometrics
– Thermal IR, RS Ratios
• A = Age
16. Past Inputs: Deterministic Component
Z*(s) Classed Statistical Predictions
• Common Topo Inputs
– Profile Curvature
– Plan (Contour) Curvature
– Slope Gradient (& Aspect)
– CTI or Wetness Index
• Sometimes, not always
• Less Common Topo Inputs
– Surface Roughness
– Relief within a window
– Relief relative to drainage
• Pit, peak, ridge, channel
[Figure panels: Profile Curvature, Plan Curvature, Slope Gradient, Wetness Index, Pit 2 Peak Relief, Divide 2 Channel]
Source: MacMillan, 2005
17. Past Inputs: Non-DEM Airborne
Radiometrics
• Radiometrics 4 Subsurface • Infer Parent Material
Source: Mayr, 2005
18. Past Inputs: Non-DEM Satellite Imagery
Grassland Land Cover Types Alpine Land Cover Types
25. Knowledge-Based Classification In Utah,
Knowledge-Based PURC Approach
Note: Not simple slope
elements but complex patterns
Source: Cole and Boettinger, 2004
27. Source: Zhou et al., 2004,
JZUS
Supervised Classification Using Regression Trees
Note similarity of supervised rules
and classes to typical soil-landform
conceptual classes
Note numeric estimate of
likelihood of occurrence of classes
29. Predicting Area-Class Soil Maps Using
Discriminant Analysis
Source: Scull et al., 2005, Ecological Modelling
30. Predicting Area-Class Soil Maps Using
Regression Trees
Extrapolation
Uncertainty of prediction
Bui and Moran (2003)
Geoderma 111:21-44
Source: Bui and Moran., 2003
31. Supervised Classification Using Fuzzy Logic
• Shi et al., 2004 Fuzzy likelihood of being a broad ridge
– Used multiple cases of reference
sites
– Each site was used to establish
fuzzy similarity of unclassified
locations to reference sites
– Used Fuzzy-minimum function to
compute fuzzy similarity
– Harden class using largest (Fuzzy-
maximum) value
– Considered distance to each
reference site in computing
Fuzzy-similarity
Source: Shi et al., 2004
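The fuzzy-minimum and fuzzy-maximum rules listed on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bell-shaped membership curve, the covariate names, the widths and the reference sites are all hypothetical, taking the best match over a class's reference sites is an assumption, and the distance-to-reference-site weighting mentioned on the slide is left out.

```python
import math

def membership(value, optimum, width):
    """Bell-shaped membership in (0, 1]; equals 1 when value == optimum (assumed curve)."""
    return math.exp(-0.5 * ((value - optimum) / width) ** 2)

def similarity_to_site(cell, site, widths):
    """Fuzzy-minimum: overall similarity is limited by the least similar covariate."""
    return min(membership(cell[k], site[k], widths[k]) for k in site)

def classify(cell, reference_sites, widths):
    """Return (hardened class, fuzzy similarity per class).

    reference_sites maps a class name to a list of reference-site covariate dicts.
    Similarity to a class is taken as the best match over its reference sites
    (an assumption); the class is hardened with the fuzzy-maximum rule.
    """
    scores = {
        name: max(similarity_to_site(cell, site, widths) for site in sites)
        for name, sites in reference_sites.items()
    }
    hardened = max(scores, key=scores.get)
    return hardened, scores

# Hypothetical covariates, widths and reference sites, for illustration only.
widths = {"slope_pct": 5.0, "rel_position": 0.2}
reference_sites = {
    "broad ridge": [{"slope_pct": 2.0, "rel_position": 0.95},
                    {"slope_pct": 4.0, "rel_position": 0.90}],
    "footslope": [{"slope_pct": 6.0, "rel_position": 0.20}],
}
cell = {"slope_pct": 3.0, "rel_position": 0.85}
label, scores = classify(cell, reference_sites, widths)
print(label, {k: round(v, 3) for k, v in scores.items()})
```

Keeping the per-class similarities alongside the hardened label preserves the fuzzy likelihood map (for example, the likelihood of being a broad ridge) as well as the area-class map.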
33. Credit: J. Balkovič & G. Čemanová
Concept of Fuzzy K-means Clustering
Source: Sobocká et al., 2003
34. Example of Application of Fuzzy K-means
Unsupervised Classification
From: Burrough et al.,
2001, Landscape
Ecology
Note similarity of unsupervised
classes to conceptual classes
35. Example of Application of Disaggregation of
a Soil Map by Clustering into Components
Source: Faine, 2001
36. Developments: Deterministic Component
Z*(s) Classed Predictive Maps in Past
• Characteristics of Models
– Models largely ignored ε
• Seldom estimate error
• Rarely correct for error
– Mainly use DEM inputs
• Initially 3x3 windows
• Slope, aspect, curvatures
• Maybe wetness index
• Later improvements were measures of slope position
– Rarely use ancillary data
• Exceptions like Bui, Scull
– Operate at single scale
– Many use expert knowledge
• Data mining is the exception
• Training data seldom used
– Specialty software prevails
• Software for DEM analysis
– SoLIM, TAPESG, TOPAZ, TOPOG, TAS, SAGA, ESRI, IDRISI, LandMapR
• Software for extracting rules
– Expector, Netica, CART, See 5, Cubist, Prospector
• Software for applying rules
– ESRI, SoLIM, SIE, SAGA
37. Past Models: Deterministic Component
Z*(s) for Continuous Soil Properties
Approaches Aimed at Predicting
Continuous Soil Properties
38. Past Concepts: Deterministic Component
Z*(s) Continuous Soil Properties
• Same Theory-Concepts as for Classed Maps
Soil = f (C, O, R, P, T, …)
– Except theory applied to individual soil properties
– Initially referred to as environmental correlation
– Soil properties related to
• Landscape attributes
• Climate variables
• Geology, lithology, soil pm
• Key Papers
– Moore et al., 1993
• Linear regression
– McSweeney et al., 1994
– McKenzie & Austin, 1993
– Gessler et al., 1995
• GLMs in S-Plus
– McKenzie & Ryan, 1999
• Regression Trees
39. Past Models: Deterministic Component
Z*(s) Continuous Soil Properties
• Regression Trees
– McKenzie & Ryan, 1998, Odeh et
al., 1994
• Fuzzy Logic-Neural Networks
– Zhu, 1997
• Bayesian Expert Knowledge
– Skidmore et al., 1996
– Cook et al., 1996, Corner et al., 1997
• GLMs – Generalized Linear Models
– McKenzie & Austin, 1993
– Gessler et al., 1995
Source: McKenzie and Ryan, 1998
40. Past Inputs: Deterministic Component
Z*(s) for Continuous Soil Properties
• Similar to Classed Maps But:
– Many innovations originated
with continuous modelers
• Increased use of non-DEM
attributes
– climate, radiometrics, imagery
• Improved DEM derivatives
– Wetness Index & CTI
– Upslope means for slope, etc.
– Inverted DEMs to compute
» Down slope dispersal
» Down slope means
» New slope position data
Source: McKenzie and Ryan, 1998
41. Past Models: Deterministic Component
Z*(s) for Continuous Soil Properties
Examples of Predictions of Soil
Property Maps
42. Past Models: Deterministic Component
Z*(s) Continuous Maps
• Aandahl, 1948 (Note Date!)
– Regression model
• Predicted
– Average Nitrogen (3-24 inch)
– Total Nitrogen by depth
– Total Organic Carbon by
depth interval
– Depth of profile to loess
• Predictor (covariate)
– Slope position as expressed by
length of slope from shoulder
– Lost in the depths of time
Source: Aandahl, 1948
43. Past Models: Deterministic Component
Z*(s) for Continuous Soil Properties
• Moore et al., 1993
– Seminal paper
– Focus on topography
• Small sites
• Other covariates were
assumed constant
– Got people thinking
• About quantifying
environmental
correlation, especially
soil-topography
relationships
Source: Moore et al, 1993
44. Source: McKenzie and Ryan, 1998
Past Models: Deterministic Component
Z*(s) for Continuous Soil Properties
• McKenzie & Ryan, 1998
– Regression Tree: Soil Depth
45. Source: McKenzie and Ryan, 1998
Past Models: Deterministic Component
Z*(s) for Continuous Soil Properties
• Gessler et al., 1995
– GLMs
– Largely based on topography
• CTI
– Other factors held steady
Source: Gessler, 2005
46. Credit: Minasny & McBratney
Past Models: Deterministic Component
Z*(s) for Continuous Soil Properties
[Figure: regression tree. The first split is on texture class (C vs. S, LS, L, CL, LiC); lower splits are on bulk density (BD<1.43 vs. BD>1.43, BD<1.42 vs. BD>1.42) and clay content (Clay<46.5 vs. Clay>46.5)]
Source: Minasny and McBratney
47. Developments: Deterministic Component
Z*(s) Predictive Maps up to 2003
• Main Developments
– Better DEM derivatives
• More and better measures of landform position or context
• Some recognition of scale and resolution effects
– Different window sizes
– Different grid resolutions
– More non-DEM inputs
• Increased use of imagery
• New surrogates for PM
– Integration of single models into multi-purpose software
• ArcGIS, ArcSIE, ArcView
• SAGA, Whitebox, IDRISI
– Improved processing ability
• Bigger files, faster processing
– Emergence of 2 main scales
• Hillslope elements (series)
– Quite similar across models
• Landscape patterns (domains)
– Similar to associations
48. Early History of Development of DSM
[Diagram: 2 x 2 matrix of approaches: Deterministic vs. Stochastic, each applied to Soil Classes and to Soil Properties]
49. Past Theory: Stochastic Component
ε(s)
– Waldo Tobler (1970)
• First law of geography
– Everything is related to
everything else, but near things
are more related than distant
things
– Matheron (1971)
• Theory of regionalized variables
– Webster and Cuanalo (1975)
• clay, silt, pH, CaCO3, colour
value, and stoniness on transect
– Burgess and Webster (1980a, b)
• Soil Property maps by kriging
• Universal kriging (drift) of EC
50. Source: Oliver, 1989
Past Models: Stochastic Component
ε(s)
– Universal Model of Variation
• Matheron (1971)
• Burgess and Webster (1980a, b)
• Webster and Burrough (1980)
• Burrough (1986)
• Webster and McBratney (1987)
• Oliver (1989)
51. Past Models: Stochastic Component
ε(s) Optimal Interpolation by Kriging
1. Collect point sample observations
– Irregular spatial distribution (of observed point values)
2. Compute semi-variance at different lag distances
3. Fit semi-variogram to lag data
4. Estimate values and error at fixed grid locations
[Figure: map of irregularly spaced observed point values on x, y axes, and the resulting regular grid of estimated values]
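The "compute semi-variance at different lag distances" step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any package named in this deck: it is isotropic, uses equal-width lag bins, and the function name, bin rule and sample points are hypothetical.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, lag_width, n_lags):
    """points: list of (x, y, z). Returns [(mean lag distance, semi-variance, n pairs)]."""
    # bin index -> [sum of distances, sum of squared differences, number of pairs]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            b = int(h // lag_width)
            if b < n_lags:
                acc = sums[b]
                acc[0] += h
                acc[1] += (zi - zj) ** 2
                acc[2] += 1
    # Semi-variance is half the mean squared difference of the pairs in each bin
    return [(acc[0] / acc[2], acc[1] / (2 * acc[2]), acc[2])
            for b, acc in sorted(sums.items())]

# Hypothetical observations (x, y, value), for illustration only.
obs = [(0, 0, 6.0), (30, 10, 5.0), (60, 5, 6.0), (20, 40, 7.0),
       (50, 45, 6.0), (80, 30, 5.0), (10, 80, 8.0), (70, 75, 7.0)]
for lag, gamma, n in empirical_semivariogram(obs, lag_width=30.0, n_lags=4):
    print(f"lag {lag:6.1f}  semi-variance {gamma:5.2f}  pairs {n}")
```

A model (for example spherical or exponential) would then be fitted to these lag and semi-variance pairs to obtain the nugget, sill and range used for kriging.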
52. Past Software: Stochastic Component
ε(s)
• Earlier Stand Alone
– Pc-Geostat (PC-Raster)
• Early version of GSTAT
– VESPER
• Variogram estimation and spatial prediction with error
• Minasny et al., 2005
• http://sydney.edu.au/agriculture/pal/software/vesper
– GEO-EAS (DOS, 1991)
• http://www.epa.gov/ada/csmos/models/geoeas.html
• Later More Integrated
– GSTAT
• Pebesma and Wesseling, 1998
• Incorporated into IDRISI
• Now incorporated into R and S-Plus packages
– Pebesma, 2004
• http://www.gstat.org/index.html
– ArcGIS
• Geostatistical Analyst
– SGeMS (Stanford Univ)
• http://sgems.sourceforge.net/
53. Past Inputs: Stochastic Component ε(s)
• Essentially Just x,y,z Values at Point Locations
1. Start with set of soil property values irregularly distributed in x,y Cartesian space
2. Locate the regularly spaced grid nodes where predicted soil property values are to be calculated
3. Locate the n soil property data points within a search window around the current grid cell for which a value is to be calculated
4. Compute a new value for each location as the weighted average of the n neighboring values, with weights established by the semi-variogram
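The four steps above correspond to ordinary kriging at each grid node. The sketch below is a minimal pure-Python illustration, assuming a spherical semi-variogram with hypothetical nugget, sill and range values and hypothetical sample points; a real project would use a tested package such as gstat rather than this hand-rolled solver.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical semi-variogram model; gamma(0) = 0 by definition."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def krige_node(node, points, gamma, radius, n_max):
    """Ordinary kriging estimate and kriging variance at one grid node."""
    # Step 3: the n nearest data points inside the search window
    near = sorted((math.hypot(x - node[0], y - node[1]), x, y, z) for x, y, z in points)
    near = [p for p in near if p[0] <= radius][:n_max]
    n = len(near)
    if n == 0:
        return None, None
    # Kriging system: semi-variances between data points plus the unbiasedness row
    a = [[gamma(math.hypot(near[i][1] - near[j][1], near[i][2] - near[j][2]))
          for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(p[0]) for p in near] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    # Step 4: weighted average of the neighboring values, plus the error estimate
    estimate = sum(w * p[3] for w, p in zip(weights, near))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance

# Hypothetical data and variogram parameters, for illustration only.
points = [(0.0, 0.0, 6.0), (10.0, 0.0, 5.0), (0.0, 10.0, 7.0), (10.0, 10.0, 6.0)]
gamma = lambda h: spherical(h, nugget=0.1, sill=1.0, rng=30.0)
for gy in (0.0, 5.0, 10.0):  # step 2: regularly spaced grid nodes
    row = [krige_node((gx, gy), points, gamma, 50.0, 4)[0] for gx in (0.0, 5.0, 10.0)]
    print(["%.2f" % v for v in row])
```

The kriging variance returned alongside each estimate is what allows the method to report the error of the estimate as well as the predicted value.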
54. Past Models: Stochastic Component ε(s)
for Continuous Soil Properties
Examples of Predictions of Soil
Property Maps by Kriging
55. Continuous Soil Property Maps by
Kriging
• Very Early Alberta Example
– Lacombe Research Station
• Sampled soils on a 50 m grid
– Sand, Silt, Clay,
– pH, OC, EC, others
– 3 depths (0-15, 15-50, 50-100)
[Figure: semi-variogram for A-horizon %sand; y-axis: semi-variance (0 to 160), x-axis: lag 1 to 19 (1 lag = 30 m)]
• Used custom written software
– Compute variograms
– Interpolate using the variograms
• Only visualised as contour maps
– Only got 3D drapes in 1988
– Used PC-Raster to drape
– Saw strong soil-landscape pattern
LACOMBE SITE: A HORIZON %SAND (1985)
Source: MacMillan, 1985 unpublished
56. Continuous Soil Property Maps by
Kriging
Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml
57. Continuous Soil Property Maps by
Kriging
• Yasribi et al., 2009
– Simple ordinary kriging
of soil properties (OK)
• No co-kriging
• No regression prediction
– Relies on presence of
• Sufficient point samples
• Spatial structure over
distances longer than the
smallest sampling
interval
Source: Yasribi et al., 2009
58. Continuous Soil Property Maps by
Kriging
• Shi, 2009
– Comparison of pH by
four different methods
• a) HASM
• b) Kriging
• c) IDW
• d) Splines
Source: Yasribi et al., 2009
59. Developments: Stochastic Component
ε(s) Predictive Maps up to 2003
• Main Developments
– Theory
• Becomes better understood and accepted
– Concepts
• Regression-kriging evolves to include a separate part for regression prediction
– Models
• Understanding and use of universal model grows
– Inputs
• Developments in sampling designs and sampling theory
– Software
• From stand alone and single purpose to integrated software
• Improvements in
– Visualization
– Capacity to process large data sets
– Automated variogram fitting
– Ease of use
• Directional, local variograms
60. Present and Recent Past
Key Developments in DSM Since 2003
(2003-2012)
On Digital Soil Mapping
McBratney et al., 2003
61. Developments in DSM Since 2003
Increasing Convergence and Interplay
[Diagram: 2 x 2 matrix of approaches: Deterministic vs. Stochastic, each applied to Soil Classes and to Soil Properties]
Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation
62. Theory: Key Developments Since 2003
• Deterministic Part
– Pretty much unchanged
• Still based on attempting to elucidate quantitative relationships between soils & environmental covariates
– But
• Scorpan elaboration highlights importance of the spatial component (n) and of spatially correlated error ε(s)
• Stochastic Part
– Same underlying theory
• Still based on theory of regionalized variables
– But
• Increasing realization that the structural part of variation (non-stationary mean or drift) can be better modelled by a deterministic function than by purely spatial calculations
63. Concepts: Key Developments Since 2003
• Deterministic Part
– Scorpan Model
• Explicitly recognizes soil data (s) as a potential input to predict other soil data
– Soil inputs can include soil maps, point observations, even expert knowledge
• Explicitly recognizes space (n) or location as a factor in predicting soil data
– Space as in x,y location
– Space as in context, kriging
• Factors as predictors
– Factors explicitly seen as quantitative predictors in prediction function
Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation
64. Concepts: Key Developments Since 2003
• Stochastic Part
– Emergence of Regression
Kriging (RK)
• Key difference to ordinary
kriging is that it is no longer
assumed that the mean of a
variable is constant
• Local variation or drift can
be modelled by some
deterministic function
– Local regression lowers
error, improves predictions
– Local regression function
can even be a soil map
Source: Heuvelink, personal communication
65. Models: Key Developments Since 2003
• Deterministic Part
– Improvements in Data Mining and Knowledge Extraction
• Supervised Classification
– Training data obtained from both points and maps
» Sample maps at points
– Ensemble or multiple realization models (100 x)
» Boosting, bagging
» Random Forests
» ANN, Regression tree
• Expert Knowledge Extraction
– Bayesian Analysis of Evidence
– Prototype Category Theory
– Fuzzy Neural Networks
– Tools for Manual Extraction of Fuzzy Expert Knowledge
» ArcSIE, SoLIM
• Unsupervised classification
– Fuzzy k-means, c-means
66. Models: Key Developments Since 2003
• Stochastic Part
– Regression Kriging
• Recognized as equivalent to universal kriging or kriging with external drift
• Use of external knowledge and maps made easier
– Incorporation of soft data
• Made more accessible through implementation in commercial (ESRI) and open source software (R)
– Regression Kriging: key references
• Odeh et al., 1995
• McBratney et al., 2003
• Hengl et al., 2003, 2004, 2007
• Heuvelink, 2006
• Hengl how to books
– http://spatial-analyst.net/book/
– http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf
67. Source: Hengl et al., 2012
Comparison of Soil Property Maps by
Kriging & RK
• Hengl et al., 2012
– Comparison of ordinary
kriging and regression
kriging
• Evidence supports RK as
explaining more of the
variation than OK alone
– Greater spatial detail
– Fewer extrapolation
areas
– Better fit to data
68. Software: Key Developments Since 2003
• Commercial Software
– JMP (SAS) (McBratney)
• http://www.jmp.com/
– S-Plus, Matlab
• Used by soil researchers
– See5, CUBIST, CART
• Regression Trees
– Netica (Bayesian)
• Norsys.com/netica
– Improvements
• Better visualization
• Better interfaces
• Non-commercial Software
– Fuzzy Logic
• SoLIM Zhu et al., 1996, 1997
• ArcSIE Shi, FuzME
– Bayesian Logic
– Full Range of Options
• R
– http://www.r-project.org
– Regression Kriging
– Random Forests
– Regression Trees
– GLMs
• GSTAT (in R)
Inputs: Key Developments Since 2003
• Terrain Attributes
– More and better measures
• Primarily contextual and
related to landform position
– Real advances related to
• Multi-scale analysis
– varying window size and
grid resolution
• Window-based and flow-
based hill slope context
• Systematic examination of
relationships of properties
and processes to scale
Source: Smith et al., 2006
70. Inputs: Key Developments Since 2003
• Terrain Attributes
– Multi-scale analysis
• Varying window size and
grid resolution
• Identifies that some
variables are more useful
when computed over larger
windows or coarser grids
– Finer resolution grids not
always needed or better
– Drop off in predictive
power of DEMs after
about 30-50 m grid
resolution
Source: Deng et al., 2007
71. Inputs: Key Developments Since 2003
• ConMAP: Hyper-scale Contextual Analysis of Topographic Parameters
– Neighborhood
example
• Diameter
– 21 km
• Predictors
– 775
Source: Behrens et al., in press
72. Inputs: Key Developments Since 2003
• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters
ConStat (ConMap)
- neighborhood reduction
a) Full neighborhood
b) Reduction of radii
c) Reduction on radii
d) Combination of b and c
Source: Behrens et al., in press
73. Inputs: Key Developments Since 2003
• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters
Source: Behrens et al., in press
74. Inputs: Key Developments Since 2003
• Hyper-scale Terrain
Analysis in ConSTAT
– Systematic analysis of relative
importance of terrain
measures at different scales
• Compute statistics of terrain
measures at different scales
– Use data mining (Random
Forests) to identify
importance of different
statistics at different scales
and at each different location
Source: Behrens et al., in press
75. MrVBF: Multi-scale DEM Analysis
Source: Gallant, 2012
• Smooth and subsample
– Original: 25 m
– Generalised: 75 m
– Generalised: 675 m
[Figure: panels of Flatness, Bottomness and Valley Bottom Flatness at successive resolutions]
76. Multiple Resolution Landform Position
MrVBF Example Outputs
Broader Scale 9” DEM
MRVBF for 25 m DEM
Source: Gallant, 2012
77. Developments: Improved Measures of
Landform Position
• SAGA-RHSP: relative hydrologic slope position
• SAGA-ABC: altitude above channel
Source: C. Bulmer, unpublished
Calculation based on: MacMillan, 2005
78. Developments: Improved Measures of
Landform Position
• TOPHAT – Schmidt and Hewitt (2004)
• Slope Position – Hatfield (1996)
Source: Schmidt & Hewitt (2004); Hatfield (1996)
80. Measures of Relative Slope Length (L)
Computed by LandMapR
• Percent L Pit to Peak
– Measure of regional context
• Percent L Channel to Divide
– Measure of local context
Image Data Copyright the Province of British Columbia, 2003
Source: MacMillan, 2005
81. Measures of Relative Slope Position
Computed by LandMapR
• Percent Diffuse Upslope Area
– Sensitive to hollows & draws
• Percent Z Channel to Divide
– Relative to main stream channels
Image Data Copyright the Province of British Columbia, 2003
Source: MacMillan, 2005
82. Developments: Improved Classification of
Landform Patterns Iwahashi & Pike (2006)
• Iwahashi landform underlying 1:650k soil map
• Terrain classes (terrain series, ordered from steep to gentle)
– Fine texture, high convexity: 1, 5, 9, 13
– Fine texture, low convexity: 3, 7, 11, 15
– Coarse texture, high convexity: 2, 6, 10, 14
– Coarse texture, low convexity: 4, 8, 12, 16
Source: Reuter, H.I. (unpublished)
83. Inputs: Key Developments Since 2003
• Non-Terrain Attributes
– Systematic analysis of
environmental covariates
• Detect distances and scales
over which each covariate
exhibits a strong relationship
with a soil or property to be
predicted or just with itself
– Vary window sizes and grid
resolutions and compute
regressions on derivatives
– analyse range of variation
inherent to each covariate
» Functional relationships
are dependent on scale
Source: Park, 2004
84. Inputs: Key Developments Since 2003
• Non-Terrain Attributes
– Systematic analysis of scale of
environmental covariates
• Select and use input covariates
at the most appropriate scale
– Explicitly recognize the
hierarchical nature of
environmental controls on
soils
– Select variables at the scales,
resolutions or window sizes
with the strongest predictive
power for each property or
class to be predicted.
Source: Park, 2004
85. Inputs: Key Developments Since 2003
Harmonization of soil profile depth data through spline fitting
Source: David Jacquier, 2010
86. Inputs: Key Developments Since 2003
From discrete soil classes to continuous soil properties
Clearfield soil series
Wapello County, Iowa
Mukey: 411784
Musym: 230C
Harmonization of soil profile data through spline fitting:
1. Start from the 'modal' profile
2. Fit a mass-preserving spline
3. Estimate averages of the fitted spline at standardised depth ranges, e.g., globalsoilmap depth ranges
[Figure panels: 'Modal' profile, Fitted spline, Spline averages at specified depth ranges]
Source: Sun et al. (2010)
87. Source: Hempel et al., 2011
Outputs: Key Developments Since 2003
• From Classes to Properties
– Non-disaggregated soil maps
• Weighted averages by polygon
by soil property and depth
– Calling version 0.5
– Disaggregated Soil Class Maps
• Estimate soil property values at
every grid cell location & depth
– Based on weighted likelihood
value of occurrence of each of
n soils times property value for
that soil at that depth
– Likelihood value can come
from various methods
Source: Sun et al, 2010
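The disaggregated estimate described above, a likelihood-weighted combination of the property values of the n candidate soils, can be written as a short Python sketch. The soil names, likelihoods and clay values are hypothetical, and dividing by the sum of the likelihoods is an assumption made here so that likelihoods that do not sum to 1 still give a weighted mean.

```python
def weighted_property(likelihoods, property_by_soil):
    """Likelihood-weighted mean of one soil property at one grid cell and depth.

    likelihoods: soil name -> likelihood of occurrence at this cell
    property_by_soil: soil name -> property value for that soil at this depth
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    # Normalising by the total is an assumption of this sketch
    return sum(lk * property_by_soil[soil] for soil, lk in likelihoods.items()) / total

# Hypothetical cell: three candidate soils and their clay content (%) at one depth.
likelihoods = {"soil A": 0.6, "soil B": 0.3, "soil C": 0.1}
clay_pct = {"soil A": 20.0, "soil B": 30.0, "soil C": 40.0}
print(round(weighted_property(likelihoods, clay_pct), 2))
```

The likelihood values themselves can come from any of the methods in this deck (fuzzy similarity, tree-based probabilities, Bayesian analysis); the weighting step is the same in each case.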
88. Outputs: Key Developments Since 2003
• From Classes to Properties
– Disaggregated Soil Class Maps
• Estimate soil property values at
every grid cell location
Source: Zhu et al., 1997
90. Predicting Area-Class Soil Maps
Clovis Grinand, Dominique Arrouays,
Bertrand Laroche, and Manuel Pascal Martin.
Extrapolating regional soil landscapes from an
existing soil map: Sampling intensity,
validation procedures, and integration of
spatial context. Geoderma 143, 180-190
Source: Grinand et al., 2008
91. Recent Knowledge-Based Classification In
Africa, Multi-scale, Hierarchical Landforms
• Elevation + Slope + UPA + Catena (2 km support)
• SOTER Soil and landforms (1:1 million – 1.5 million)
Source: Park et al, 2004
92. Digital Soil Mapping in England & Wales
using Legacy Data
• Training data
– Point data
– Detailed soil maps
– Covariates
• DEM derivatives (TOPAZ, TAPES-G, LandMapR)
– Expert knowledge
• Modelling (NETICA)
• Outputs
– Predicted soil series
– Accuracy assessment
Source: Mayr, 2010
93. Predicting Area-Class Soil Maps Using
Multiple Regression Trees (100 x)
1. Prepare a database and tables of mapping units & soil series, and covariates
2. Repeat n times (n=100):
– Select 1/n of the points systematically
– Sample soil series randomly from the multinomial distribution of mapping unit composites
– Construct decision tree, using See 5 (RuleQuest Research, 2009)
– Predict soil series at all pixels
3. Calculate the soil series statistics based on the n predictions for each pixel
4. Calculate the probability for each soil series
5. Generate soil series maps
Source: Sun et al., 2010
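Two parts of this workflow are easy to illustrate in Python: drawing a soil series from a mapping unit's composition, and turning n predictions per pixel into probabilities. The sketch below is a hedged illustration only. The map unit composition is hypothetical, and the decision-tree step (See 5 in the source) is left out, so each "prediction" here is simply the sampled label rather than the output of a fitted tree.

```python
import random
from collections import Counter

def sample_series(composition, rng):
    """Draw one soil series from a map unit's composition (series -> percent)."""
    names = list(composition)
    return rng.choices(names, weights=[composition[s] for s in names], k=1)[0]

def series_probabilities(predictions):
    """Turn the n predictions for one pixel into per-series probabilities."""
    counts = Counter(predictions)
    n = len(predictions)
    return {series: count / n for series, count in counts.items()}

def most_likely(probabilities):
    """Series with the highest probability, and that probability."""
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]

# Hypothetical map unit composition (percent of each series), for illustration only.
composition = {"series 1": 60, "series 2": 30, "series 3": 10}
rng = random.Random(42)

# Stand-in for n = 100 realizations: in the full workflow each entry would be
# the prediction of a decision tree trained on one sampled realization.
predictions = [sample_series(composition, rng) for _ in range(100)]
probs = series_probabilities(predictions)
print(most_likely(probs), probs)
```

Mapping the most likely series gives the area-class map, and mapping its probability gives a per-pixel measure of confidence.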
94. Predicting Area-Class Soil Maps Using
Multiple Regression Trees (100 x)
A closer look at the junction point in the middle of 4 combined maps,
(a) the original map units, and
(b) the most likely soil series map and its associated probability.
The length of the image is approximately 14 km.
[Legend for (a): monr_comppct, value from Low : 7 to High : 100]
Source: Sun et al., 2010
96. Source: Hengl et al., 2004
Continuous Soil Property Maps by
Kriging & RK
• Hengl et al., 2004
– Comparison of topsoil
thickness by four
different methods
• a) Point locations
• b) Soil Map only
• c) Ordinary Kriging
• d) Plain Regression
• e) Regression-kriging
– Evidence supports RK
97. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Assemble field data: 300 soil point data
98. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Assemble covariates for
the predictive model
99. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Perform regression to
build a predictive model
Linear Model
OC = f(x) + e
Predictors
Elevation
Aspect
Landsat band 6
NDVI
Land-use
Soil-Landscape Unit
100. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Predict both
property value
and standard
error over the
entire area
101. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Fit a variogram to the
residuals
102. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Krige the residuals
103. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Linear Model + Residuals
Add interpolated
residuals to the
prediction from
regression
Final Prediction
104. Source: Minasny et al., 2010
Recent Example: Regression-Kriging
(scorpan + ε)
Add regression variance
and kriging variance to
get total variance
Total std. err. = (Total variance)^(1/2) = sqrt((Std. err. of regression)^2 + (Std. err. of kriging)^2)
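The regression-kriging steps in this example (fit a regression, take residuals, add the kriged residual back, then add the two variances) can be summarised in a small Python sketch. It is a simplified illustration: a single hypothetical covariate stands in for the predictor list, the kriged residual and the two standard errors are taken as given numbers rather than computed, and summing the variances follows the slide and treats the two error sources as independent.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (single covariate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return my - b1 * mx, b1

def regression_kriging(reg_pred, kriged_resid, se_reg, se_krig):
    """Final prediction and total standard error at one location."""
    prediction = reg_pred + kriged_resid                # regression + interpolated residual
    total_se = math.sqrt(se_reg ** 2 + se_krig ** 2)    # sqrt of summed variances
    return prediction, total_se

# Hypothetical field data: elevation (m) and organic carbon (%) at sampled points.
elevation = [100.0, 200.0, 300.0, 400.0]
oc = [1.0, 1.5, 2.1, 2.4]
b0, b1 = fit_line(elevation, oc)
residuals = [y - (b0 + b1 * x) for x, y in zip(elevation, oc)]  # these would be kriged

# At an unsampled cell with elevation 250 m; the kriged residual and the two
# standard errors are assumed values standing in for real kriging output.
pred, se = regression_kriging(b0 + b1 * 250.0, kriged_resid=0.05, se_reg=0.3, se_krig=0.4)
print(round(pred, 3), round(se, 3))
```

With an intercept in the model the residuals sum to zero, so any spatial structure left in them is what the variogram and kriging steps are meant to capture.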
105. Recent Example: Regression-Kriging
• Regression model: C = 100 - 1.2EC - 5.2REF - 0.6REF2 - 2.1EL
• Workflow
– C predicted for sampled locations and for all grid locations
– Residuals at sampled locations are kriged
– Final C map (Mg C/ha)
• Statistics shown
– Mean 64.0, Min 27.0, Max 87.9
– CV% 18.4, RMSE 9.8, RI (%) 19.7
[Figure: map legend in Mg C/ha, from 15 to 95]
106. Source: Mayr et al., 2010
Continuous Soil Property Maps by
Hybrid Bayesian Analysis
108. Source: Heuvelink et al., 2004
The Future: Let's Go Back and Talk About
the Universal Model of Variation Again
Z(s) = Z*(s) + ε(s) + ε
Lots of things qualify
as regression!
Deterministic part of
the predictive model
Regression just
means minimizing
variance
Stochastic part of the
predictive model
What is all this talk
about optimization?
109. The Future: A Conceptual Framework for
GSIF – A Global Soil Information Facility
• Collaborative and open production, assembly and sharing of covariate data (World Grids)
• Collaborative and open collection, input and sharing of geo-registered field evidence (Open Soil Profiles)
• Collaborative and open modelling on an interactive, web-based server-side platform
• Everything is accessible, transparent and repeatable
Maps we can all contribute to, access, use, modify and
update, continuously and transparently
Source: Hengl et al., 2011
110. The Future: Functionality for GSIF – A
Global Soil Information Facility
• Possibility of making use of existing legacy soil maps (even new soil maps) needed for soil prediction anywhere
• Possibility to assess error and correct for it everywhere
• Possibility of rescuing, sharing, harmonizing and archiving soil profile point data needed for soil prediction anywhere
• Possibility to develop and use global models (even for local mapping)
• Possibility to develop and use multi-scale and multi-resolution hierarchical models
Source: Hengl et al., 2011
111. The Future: Conceptual Framework for
GSIF – World Soil Profiles
Source: Hengl et al., 2011
112. The Future: Implemented Framework for
GSIF – World Soil Profiles
Source: www.worldsoilprofiles.org
113. The Future: Implemented Framework for
GSIF – World Soil Profiles
Source: www.worldsoilprofiles.org
118. The Future: Collaborative Global, Multi-
Scale Mapping through GSIF
• Possibility to develop and use global models (even for local mapping)
• Possibility for combining Top-Down and Bottom-up mapping through weighted averaging of 2 or more sets of predictions
Source: Hengl et al., 2011
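Weighted averaging of two or more sets of predictions can be sketched as follows. The slide does not say how the weights are chosen; inverse-variance weighting is used here purely as an assumption, and the combined variance it returns is only valid if the errors of the individual predictions are independent. The numbers are hypothetical.

```python
def combine_predictions(predictions, variances):
    """Inverse-variance weighted average of two or more predictions for one cell.

    Returns (combined prediction, combined variance). The weighting rule is an
    assumption of this sketch, not something stated in the source.
    """
    if len(predictions) != len(variances) or not predictions:
        raise ValueError("need one variance per prediction")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * p for w, p in zip(weights, predictions)) / total
    return mean, 1.0 / total

# Hypothetical cell: a coarse global (top-down) prediction and a local
# (bottom-up) prediction of the same soil property, with their variances.
global_pred, global_var = 20.0, 3.0
local_pred, local_var = 10.0, 1.0
print(combine_predictions([local_pred, global_pred], [local_var, global_var]))
```

Under this rule the more certain prediction dominates, so a global model carries the estimate where local data are sparse and gives way where local mapping is well supported.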
119. The Future: Global, Multi-Scale Modeling
of Soil Properties through GSIF
Possibility to
develop and use
multi-scale and
multi-resolution
hierarchical models
Possibility to
develop and use
global models (even
for local mapping)
Source: Hengl et al., 2011
120. The Future: Global, Multi-Scale Modeling
of Soil Properties through GSIF
• Global DSM Models
– Make use of ALL data
• From everywhere in the world
– Provide initial coarse local predictions
• That can be refined and improved with:
– More & finer local data
– Local model runs
• Global Models inform and improve local mapping
Source: Hengl personal communication, 2013
121. The Future: Global, Multi-Scale Modeling
of Soil Properties through GSIF
Global Models
inform and
improve local
mapping
Source: Hengl et al., 2011
122. The Future: Functionality for GSIF – A
Global Soil Information Facility
Anyone can
access and
display the
maps
Source: Hengl et al., 2011
123. The Future: Functionality for GSIF – A
Global Soil Information Facility
With Google
Earth everyone
has a GIS to
view free soil
maps and data
Slide credit: Tom Hengl,
2011
Source: Hengl et al., 2011
124. The Future: Collaborative Global, Multi-
Scale Mapping through GSIF
A Global
Collaboratory!
Working together
we can map the
world one tile at a
time!
The next generation
of soil surveyors is
everyone!
Source: Hengl et al., 2011
125. The Future: From Mapping to
Continuously Updated Modelling
Possibility to move from single
snapshot mapping of static soil
properties to continuous update and
improvement of maps of both static and
dynamic properties within a structured
and consistent framework.