Global Modeling of Biodiversity and Climate Change
1. Global Modeling of Biodiversity and Climate Change
Falk Huettmann et al.
-EWHALE lab-
Biology and Wildlife Department
Institute of Arctic Biology
University of Alaska-Fairbanks
Fairbanks Alaska
3. Re. Scientific Thinking and Thought
Karl Popper Leo Breiman
Felix Shtilmark
Herman Daly
Dave Carlson
4. Scientific Landmines ?!
Spatial/Geographic Information Systems (GIS) and…
Data Sharing (online)
Machine Learning
Predictions
Data Mining
Diseases (Influenza)
Metadata
Sustainability
Economic Growth problem
Management
5. Central to our work:
Predictions in Space and Time,
e.g. done best with Machine Learning
-quantitative
-spatial
-statistical interactions included
-one formula
-one algorithm
-repeatable
-testable
-transparent
-open access
6. How GIS and machine learning connect… A Work Flow
ArcGIS 10.2
Salford
R
GME
Python etc
7. Tree/CART - Family
Binary recursive partitioning
Temp>15
Precip <100
Temp<5
YES NO
Leo Breiman 1984, and others
PURITY METRIC OF NODES
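A minimal sketch of one step of binary recursive partitioning may help here. It assumes Gini impurity as the node purity metric (one common choice; the slide does not say which metric is meant), and the site values, the `gini` and `best_split` helpers and the feature names are hypothetical illustrations, not the lab's workflow.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 = perfectly pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Try every (feature, threshold) pair and return the binary split
    whose two child nodes have the lowest weighted impurity."""
    best = None  # (weighted impurity, feature, threshold)
    n = len(rows)
    for feature in rows[0]:
        for threshold in sorted({row[feature] for row in rows}):
            yes = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            no = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            if not yes or not no:
                continue  # a split must send cases down both branches
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / n
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


# Hypothetical toy sites: temperature, precipitation, and a presence flag.
rows = [
    {"temp": 18, "precip": 80},
    {"temp": 20, "precip": 60},
    {"temp": 17, "precip": 150},
    {"temp": 4, "precip": 90},
    {"temp": 10, "precip": 120},
    {"temp": 2, "precip": 200},
]
labels = [1, 1, 1, 0, 0, 0]

split = best_split(rows, labels)
print(split)  # (impurity, feature, threshold) of the chosen YES/NO question
```

A full CART would apply `best_split` again to each child node until the nodes are pure or too small, which is the "recursive" part.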
8. TreeNet (~A sequence of CARTs)
‘boosting’: tree + tree + tree + tree + …
each explains the remaining variance til the end…
The more nodes…
…the more detail
…the slower
ROC curves for accuracy tests
e.g. correctly predicted absence approx. 97%
e.g. correctly predicted presence approx. 92%
Variable Importance
Variable    Score
LDUSE       100.00
TAIR_AUG     97.62
HYDRO        94.35
DEM          94.01
PREC_AUG     90.17
POP          82.54
HMFPT        81.46
Difficult to interpret but good graphs
=>Apply to a dataset for predictions
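The idea that each tree in the sequence explains the variance left over by the previous ones can be sketched as follows. This is a generic least-squares boosting loop with one-split trees, written under the assumption of a single predictor and a hypothetical learning rate; it is not the TreeNet implementation, and all names and data are illustrative.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None  # (sum of squared errors, threshold, left mean, right mean)
    for t in sorted(set(xs))[:-1]:  # dropping the largest value keeps the right side non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Add small trees one after another, each fitted to what is still unexplained."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, stumps, preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical toy data: one predictor, one continuous response.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.0, 2.1, 1.9, 2.2, 6.0, 6.1, 5.9, 6.2]

base, stumps, preds = boost(xs, ys)
print(mse(ys, [base] * len(ys)), mse(ys, preds))  # error before vs. after boosting
```

More trees or more nodes per tree capture more detail at the cost of run time, which matches the trade-off noted on the slide.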
9. RandomForest (Prasad et al. 2006, Furlanello et al. 2003, Breiman 2001)
Boosting & Bagging algorithms
Handles ‘noise’, interactions and categorical data fine!
Random set of Columns (Predictors): DEM, Slope, Aspect, Climate, Landcover
Random set of Rows (Cases): 1, 2, 3, 4, 5
Random set 1, Random set 2, …
Average Final Tree from e.g. >2000 trees, done by VOTING
Bagging: Optimization based on In-Bag, Out-of-Bag samples
In RF no pruning => Difficult to overfit (robust)
Difficult to interpret but good graphs
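The random rows, random columns and voting described on this slide can be sketched in a few lines. To stay short, the sketch uses one-split trees instead of full unpruned trees and draws one random column set per tree, following the slide's picture (Breiman's algorithm re-draws the columns at every split). The site data and all function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label, i.e. the result of a vote."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, features):
    """One-split tree: choose the (feature, threshold) with the fewest errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            l_vote, r_vote = majority(left), majority(right)
            errors = sum(y != l_vote for y in left) + sum(y != r_vote for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_vote, r_vote)
    if best is None:  # the random sample offered no usable split
        vote = majority(labels)
        return (features[0], float("inf"), vote, vote)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_vote, r_vote = stump
    return l_vote if row[f] <= t else r_vote


def fit_forest(rows, labels, n_trees=200, n_features=2, seed=0):
    rng = random.Random(seed)
    all_features = sorted(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # random rows, with replacement
        feats = rng.sample(all_features, n_features)    # random columns
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Final prediction by voting across all trees."""
    return majority([predict_stump(stump, row) for stump in forest])


# Hypothetical toy sites: presence (1) on high, steep ground; aspect is noise.
rows = [
    {"dem": 600, "slope": 25, "aspect": 0},
    {"dem": 700, "slope": 30, "aspect": 90},
    {"dem": 650, "slope": 28, "aspect": 180},
    {"dem": 800, "slope": 35, "aspect": 270},
    {"dem": 200, "slope": 5, "aspect": 0},
    {"dem": 300, "slope": 8, "aspect": 90},
    {"dem": 250, "slope": 6, "aspect": 180},
    {"dem": 150, "slope": 3, "aspect": 270},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(rows, labels)
print(predict_forest(forest, {"dem": 900, "slope": 40, "aspect": 90}))
print(predict_forest(forest, {"dem": 50, "slope": 1, "aspect": 90}))
```

The rows left out of each random sample are the Out-of-Bag cases; scoring each tree on them is what gives the built-in accuracy estimate mentioned above.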
10. Machine Learning example with GIS:
Spoon-billed Sandpiper and Predictions
(where are the wintering grounds of
ca. 1000 highly endangered birds…?)
(breeding,
Kamchatka)
(winter)
Engler et al.
(in prep)
11. Data means Metadata and Data Management
(specifically for GIS, for science projects, machine learning
and for graduate students)
Field Season 1 | Field Season 2 & 3
Raw Dataset 1 Metadata
Raw Dataset 2 Metadata
Raw Dataset 3 Metadata
Raw Dataset 4 Metadata
Raw Dataset 5 Metadata
etc.
A. Baltensperger
http://mercury.ornl.gov/clearinghouse/
=> Digital Publications
12. Two books by the EWHALE lab re. Predictions and related Philosophies
as presented here
13. Students & Projects of the EWHALE lab
Andy Baltensperger Katherine Miller
Shana Losbaugh Sue Hazlett
Tim Mullet
Keiko Akasofu Herrick
14. Students & Projects of the EWHALE lab
Ben Best
Imme Rutzen
Betsy Young
Brian Young
Michal Lindgren
Zach Meyers
15. Students & Projects of the EWHALE lab:
Visitors
Moritz Schmid Laszlo Koever
(Uni Goettingen, (Uni Debrecen,
Germany) Hungary)
David Lieske
Dmitry Korobitsyn (Mount Allison,
(Uni Archangelsk, Canada)
Russia)
Cynthia Resendiz
(Mexico)
16. Our Business Model
NOT A WETLAB
NOT FOR RE-CHARGE
CONSTANT, STEADY SMALL FLOW
17. Some Examples of what the EWHALE lab does, internationally
(~how Falk spent his sabbatical and time)
19. Ocean View I: A Global Benthos Model…(RandomForest Predictions)
Wei et al. (2011). Global Patterns and
Predictions of Seafloor Biomass using Random
Forests. PLoS ONE 5(12): e15323.
20. Ocean View II: Dimethylsulfide (DMS), globally per month
Humphries et al.
(in review)
21. Spatial Predictions of Arctic (Pelagic) Seabirds
What Data are used: Pelagic Seabird Data ?!
Public data + High Quality Content + Metadata ?! (Polarstern)
vs.
Relevance of Arctic Specimen Collections
22. Spatial Predictions of Arctic (Pelagic) Seabirds
What Environmental Data were Used
(Listed in no order)
1. Distance to ice edge
2. Sea temperature at 10m depth
3. Sea temperature at 0m depth
4. Phosphate concentration at 10m depth
5. Silicate concentration at surface
6. Phosphate concentration at surface
7. Salinity at 20m depth
8. Distance to Settlements (!)
9. Salinity at surface
10. Silicate concentration at 10m depth
11. Discharge from rivers
12. Distance to shelf edge
13. Sea ice thickness
14. Nitrate concentration at surface
15. DMS (Dimethyl Sulfide) at surface (G. Humphries in prep.)
16. Nitrate concentration at 10m depth
17. Bathymetric slope
Public Sources & Availability
Huettmann & Hazlett (2009) for 50 layers
23. Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Training and Assessment Data
Presence (blue) vs. Random (red; pseudo-absence)
+ Env. Data
… Algorithm
=>Predictions
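The presence vs. random (pseudo-absence) training table can be sketched as below. The bounding box, the coordinates and the `make_training_table` helper are hypothetical; the sketch draws points uniformly in longitude and latitude, which over-samples high latitudes, so an equal-area design would be preferable in a real Arctic study. Environmental values would be attached to each row afterwards.

```python
import random


def make_training_table(presences, bbox, n_random, seed=42):
    """Combine presence points with random background points.

    presences: list of (lon, lat) tuples
    bbox:      (min_lon, min_lat, max_lon, max_lat) of the study area
    """
    rng = random.Random(seed)
    min_lon, min_lat, max_lon, max_lat = bbox
    table = [{"lon": lon, "lat": lat, "presence": 1} for lon, lat in presences]
    for _ in range(n_random):
        table.append(
            {
                "lon": rng.uniform(min_lon, max_lon),
                "lat": rng.uniform(min_lat, max_lat),
                "presence": 0,  # random point, treated as pseudo-absence
            }
        )
    return table


# Hypothetical sightings and study area.
sightings = [(10.0, 75.0), (12.5, 78.0)]
study_area = (0.0, 70.0, 30.0, 85.0)

table = make_training_table(sightings, study_area, n_random=100)
print(len(table), sum(row["presence"] for row in table))
```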
24. Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Training and Assessment Data
Presence (blue) vs. Random (red; pseudo-absence)
+ Env. Data
Assessment (green; telemetry, O. Gilg)
… Algorithm
=>Predictions
25. Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Predictions
Prediction Surface (Draft 1)
Legend
Red/Yellow = Presence
Light blue: Weak Presence
Dark blue: Pseudo-absence
26. Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Predictions and their data
Prediction Surface (Draft 1)
Legend
Red/Yellow = Presence
Light blue: Weak Presence
Dark blue: Pseudo-absence
Green: Assessment Data (O. Gilg)
27. Circumpolar Arctic: 27 Seabird Open Access Predictions
Tufted Puffin, Horned Puffin, Northern Fulmar
Ivory Gull, Ross’s Gull, Black-legged Kittiwake
…add up all predictions…
Huettmann et al. (2011)
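Adding up all predictions can be read as a cell-by-cell sum of the individual species surfaces. A minimal sketch, assuming every species grid shares the same extent and resolution and holds 0/1 presence values; the tiny grids and the `stack_predictions` helper are hypothetical.

```python
def stack_predictions(grids):
    """Cell-wise sum of equally shaped 2D grids (lists of rows)."""
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    for grid in grids:
        if len(grid) != n_rows or any(len(row) != n_cols for row in grid):
            raise ValueError("all grids must share the same shape")
    return [
        [sum(grid[r][c] for grid in grids) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 2 x 3 presence/absence surfaces for three species.
tufted_puffin = [[1, 0, 0], [1, 1, 0]]
ivory_gull = [[0, 0, 1], [1, 0, 0]]
northern_fulmar = [[1, 1, 1], [1, 0, 0]]

richness = stack_predictions([tufted_puffin, ivory_gull, northern_fulmar])
print(richness)  # predicted number of species per cell
```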
28. Circumpolar Arctic: Putting Models to Use
Seabird
vs.
=>We are running out of
space and time in the Arctic
(and anywhere else)
29. Circumpolar Arctic: Alaskan Crab
Ensemble Model
=> Open Access (Raw Data + Model)
in a highly commercial setting!
Compiled Raw Crab Data Predicted Crab Pres/Abs
(and Abundance)
Snow Crab off Alaska
(Hardy et al. 2011)
30. Circumpolar Arctic: Marine Protected Areas (MPA) and Biodiversity
MARXAN optimization
based on over 60 GIS layers
=>Over 20 GIS data layers for each
Pole (Arctic and Antarctic)
Huettmann and Hazlett (2010)
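The kind of problem MARXAN addresses, meeting conservation targets at low cost, can be illustrated with a much simpler greedy heuristic. This is a swapped-in technique for illustration only: MARXAN itself searches with simulated annealing over many runs. The planning units, costs, feature amounts and targets below are hypothetical.

```python
def unmet_gain(unit, remaining):
    """Amount of still-unmet target this unit would cover, per unit of cost."""
    covered = sum(
        min(unit["features"].get(name, 0), need)
        for name, need in remaining.items()
        if need > 0
    )
    return covered / unit["cost"]


def greedy_reserve(units, targets):
    """Pick planning units one at a time until every target is met."""
    remaining = dict(targets)
    candidates = sorted(units)
    selected = []
    while any(need > 0 for need in remaining.values()):
        if not candidates:
            raise ValueError("targets cannot be met with the available units")
        best = max(candidates, key=lambda u: unmet_gain(units[u], remaining))
        if unmet_gain(units[best], remaining) == 0:
            raise ValueError("targets cannot be met with the available units")
        selected.append(best)
        candidates.remove(best)
        for name, amount in units[best]["features"].items():
            if name in remaining:
                remaining[name] = max(0, remaining[name] - amount)
    return selected


# Hypothetical planning units with a cost and the amount of each GIS layer they hold.
units = {
    "A": {"cost": 1, "features": {"seabirds": 5}},
    "B": {"cost": 1, "features": {"sea_ice": 5}},
    "C": {"cost": 3, "features": {"seabirds": 5, "sea_ice": 5}},
}
targets = {"seabirds": 5, "sea_ice": 5}

selected = greedy_reserve(units, targets)
print(selected)
```

Repeating a randomized search many times and counting how often each unit is chosen gives a selection frequency map.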
31. Antarctica: MPA by WWF-Australia
for the Scientific Committee on Antarctic Research (SCAR)
WWF-Australia,
SCAR 2012
34. What is a Soundscape?
• Biological Sounds
– Biophony
• Geophysical Sounds
– Geophony
• Anthropogenic Sounds
– Anthrophony
Mullet et al.
(in prep)
35. Model-Predicting Sound
(‘Soundscapes’)
Models based on:
- 7 permanent sound stations
- Stratified according to expected
sounds
- Rotate 6 sound stations
– Input GPS coordinates and related sound
data into TreeNet modeling software
– Include environmental and human-related
covariates (e.g., vegetation, distance to
roads, aspect)
– Extrapolate sound levels and sound source
data to rest of Refuge
Mullet et al.
(in prep)
39. Regionalized IPCC models,
e.g. Alaska
Temperature (August and January)
(SNAP UAF data)
2099
2008
Murphy et al.
(2010)
40. Alaskan Caribou:
Summer & winter ranges 2008 & 2099
Summer Range: 2008, 2099
Winter Range: 2008, 2099
Model in RF with IPCC
Murphy et al. (2010)
41. RandomForest: Supervised and Unsupervised Classification
Supervised Classification:
-Multiple Regression (classification or continuous)
-Multiple Response, e.g. YAIMPUTE, RandomForest
Unsupervised Classification:
1. Proximity Matrix via Bagging/Voting (RF)
2. Similarity Matrix
3. e.g. Regular Clustering (mclust, PAM)
4. Visualize Result
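The proximity matrix in step 1 counts how often two cases end up in the same terminal node across the forest. A minimal sketch, assuming the terminal node of every case in every tree is already known; the node ids and the `proximity_matrix` helper are hypothetical.

```python
def proximity_matrix(leaf_ids):
    """leaf_ids[t][i] is the terminal node that case i reaches in tree t.

    Returns an n x n matrix whose entry (i, j) is the fraction of trees
    in which cases i and j share a terminal node.
    """
    n_trees, n_cases = len(leaf_ids), len(leaf_ids[0])
    counts = [[0] * n_cases for _ in range(n_cases)]
    for tree in leaf_ids:
        for i in range(n_cases):
            for j in range(n_cases):
                if tree[i] == tree[j]:
                    counts[i][j] += 1
    return [[count / n_trees for count in row] for row in counts]


# Hypothetical terminal nodes for 4 cases in 4 trees.
leaf_ids = [
    [1, 1, 2, 2],
    [1, 1, 1, 2],
    [3, 3, 3, 4],
    [5, 5, 6, 6],
]

prox = proximity_matrix(leaf_ids)
dissimilarity = [[1.0 - p for p in row] for row in prox]
print(prox[0])
```

The dissimilarity matrix (1 minus proximity) is the kind of input a clustering method such as PAM works from in step 3.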
42. 11 Cliome Clusters (RF)
Climate Cluster Data, Canada &
Alaska
Credit: M. Lindgren et al.
43. Now, a topical shift to Circumpolar
Arctic and Zooplankton Forecasting
til 2100
Metridia longa showed the highest
increase in the copepodite life stage from
2010 to 2100.
Credit: M. Schmid et al.
44. Calanus hyperboreus showed the highest
change in the predicted relative index of
depth from 2010 to 2100.
Credit: M. Schmid et al.
45. GMBA Case Study: Himalaya Uplands Plant Database
Bernhard Dickoré et al.
(red: sampling points)
+ FGDC NBII/ISO Metadata
46. A High Priority Ethnomedicinal Plant in Nepal
Dactylorhiza hatagirea (Marsh Orchids)
81 “points”
Ethnobotanical Use: Tubers are used as nervine tonic and aphrodisiac. It is
also used to treat cuts, wounds, cough and anemia.
47. Prediction of a High Priority Ethnomedicinal Plant in Nepal
Dactylorhiza hatagirea (Marsh Orchids)
48. MARXAN Solution for the Three Poles: 50% Protection Scenario
(birds, glaciers/ice and freezing temperatures)
Legend: Selection Frequency!
(the darker the more frequently selected)
Note: Terrestrial areas of Arctic &
Antarctic are not included, yet
50. Avian Influenza (AI) Prediction globally…
(all based on Machine Learning!)
Global AI model (Ecological Niche) based on K. Herrick-Akasofu, F. Huettmann, J. Runstadler et al.
(unpublished; forthcoming thesis chapter)
51. Acknowledgements
L. Strecker, all co-authors, all EWHALE lab students, NCEAS, University
of Alaska-Fairbanks, D. Steinberg (Salford Systems Ltd), COML, CAML,
ArcOD, GMBA, IPY, A.W. Diamond, and many colleagues worldwide
(a 20-year summary...) AND HUGE THANKS TO SALFORD SYSTEMS &
Dan Steinberg’s team