Global Modeling of Biodiversity and
         Climate Change




       Falk Huettmann et al.
       -EWHALE lab-
       Biology and Wildlife Department
       Institute of Arctic Biology
       University of Alaska-Fairbanks
       Fairbanks Alaska
EWHALE lab
Re. Scientific Thinking and Thought




     Karl Popper                       Leo Breiman




Felix Shtilmark
                                                     Herman Daly

                        Dave Carlson
Scientific Landmines ?!



              Spatial/Geographic Information Systems (GIS) and…


      Data Sharing (online)           Machine Learning



Predictions                                     Data Mining



     Diseases
                                              Metadata
     (Influenza)


          Sustainability          Economic Growth problem
          Management
Central to our work:
Predictions in Space and Time,
e.g. done best with Machine Learning




                        -quantitative
                        -spatial
                        -statistical interactions included
                        -one formula
                        -one algorithm
                        -repeatable
                        -testable
                        -transparent
                        -open access
How GIS and machine learning connect… A Work Flow




ArcGIS 10.2


Salford
R
GME
Python etc
Tree/CART - Family

                       Binary recursive partitioning




                                  Temp>15


                                          Precip <100

                                               Temp<5




                                       YES       NO
Leo Breiman 1984, and others
                                   PURITY METRIC OF NODES
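
A minimal sketch of one step of binary recursive partitioning, assuming Gini impurity as the node purity metric (the slide does not say which metric is meant). The toy temperature/precipitation records and the function names are hypothetical and only illustrate how a threshold such as Temp>15 is chosen.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 0.0 means the node is perfectly pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Try every predictor and threshold; keep the split with the lowest
    size-weighted impurity of the two child nodes."""
    best = (None, None, gini(labels))
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best

# Hypothetical records: (temperature, precipitation); label 1 = presence.
rows = [(18, 80), (20, 120), (16, 60), (4, 90), (10, 150), (3, 40)]
labels = [1, 1, 1, 0, 0, 0]
feature, threshold, impurity = best_split(rows, labels)
```

A full tree would apply `best_split` again to each child node (YES/NO branch) until the nodes are pure or too small to split.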
TreeNet                                            The more nodes
                                           (~A sequence of CARTs)                                      …the more detail
                                                                                                       …the slower
                                                 ‘boosting’

                               +                                   +                  +                       +

                                                                 each explains the remaining variance until the end…
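
The idea of a sequence of small trees, each fitted to what the previous ones left unexplained, can be sketched as below. This is not TreeNet itself: it is a simplified boosting loop with one-split regression trees ("stumps") on a single predictor, and the data, the learning rate and the function names are assumptions made for illustration.

```python
def fit_stump(x, r):
    """Fit a one-split regression tree to targets r on a 1-D predictor x."""
    best = None
    for t in sorted(set(x))[:-1]:          # skip the largest value: right side must be non-empty
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda xi: ml if xi <= t else mr

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Each new stump is fitted to the residuals of the current ensemble."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)

def mse(model, x, y):
    return sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)

# Hypothetical data with a step-like response.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]
mean_y = sum(y) / len(y)
mse_before = mse(lambda xi: mean_y, x, y)
mse_after = mse(boost(x, y), x, y)
```

More trees capture more detail at the cost of run time, which matches the trade-off noted on the slide.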
ROC curves for accuracy tests
   e.g. correctly predicted absence approx. 97%
   e.g. correctly predicted presence approx. 92%

Variable     Importance Score
LDUSE        100.00
TAIR_AUG      97.62
HYDRO         94.35
DEM           94.01
PREC_AUG      90.17
POP           82.54
HMFPT         81.46

Difficult to interpret but good graphs
=> Apply to a dataset for predictions
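
As a rough illustration of the accuracy figures on this slide, the sketch below computes the area under the ROC curve and the share of correctly predicted presences and absences at one cut-off. The scores, labels and the 0.5 threshold are made up; they are not the values behind the slide.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a random
    presence gets a higher score than a random absence (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rates(scores, labels, threshold):
    """Return (share of presences predicted correctly,
    share of absences predicted correctly) for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return tp / labels.count(1), tn / labels.count(0)

# Hypothetical model scores and true labels (1 = presence, 0 = absence).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

area = auc(scores, labels)
presence_rate, absence_rate = rates(scores, labels, 0.5)
```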
RandomForest (Prasad et al. 2006, Furlanello et al. 2003, Breiman 2001)
Boosting & Bagging algorithms
Handles ‘noise’, interactions and categorical data fine!

[Diagram: a data table with a random set of columns (predictors: DEM, Slope, Aspect, Climate, Landcover) and a random set of rows (cases 1-5), drawn repeatedly (Random set 1, Random set 2, …)]

Average final tree from e.g. >2000 trees, done by VOTING

Bagging: Optimization based on In-Bag, Out-of-Bag samples

In RF no pruning => Difficult to overfit (robust)
Difficult to interpret but good graphs
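
The random sets of rows and columns and the final vote can be sketched as follows. This only shows the sampling and voting steps, not tree fitting; the seed, the number of sets and the vote labels are assumptions. Note also that Breiman's RandomForest draws the random predictor subset at each split rather than once per tree, so the per-set draw here is a simplification that follows the diagram.

```python
import random
from collections import Counter

def bootstrap_rows(n_rows, rng):
    """Sample row indices with replacement (in-bag);
    rows never drawn are the out-of-bag cases."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag

def majority_vote(votes):
    """Final class = the label predicted by most trees."""
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
predictors = ["DEM", "Slope", "Aspect", "Climate", "Landcover"]
n_cases = 5

random_sets = []
for _ in range(3):  # a real forest would use hundreds or thousands of sets
    in_bag, out_of_bag = bootstrap_rows(n_cases, rng)
    columns = rng.sample(predictors, 2)  # random subset of predictors
    random_sets.append((in_bag, out_of_bag, columns))

# Hypothetical votes from three trees for one map cell.
final_class = majority_vote(["presence", "absence", "presence"])
```

The out-of-bag cases of each set are the ones available for testing the tree grown on that set, which is what the in-bag/out-of-bag optimization refers to.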
Machine Learning example with GIS:
                Spoon-billed Sandpiper and Predictions
                 (where are the wintering grounds of
                  ca. 1000 highly endangered birds…?)

  (breeding,
  Kamchatka)




   (winter)

Engler et al.
(in prep)
Data means Metadata and Data Management
       (specifically for GIS, for science projects, machine learning
        and for graduate students)
                   ___________Field Season 1_________   ___________Field Season 2 & 3_________



                   Raw Dataset 1            Metadata


                   Raw Dataset 2            Metadata


                   Raw Dataset 3            Metadata                     etc.


                   Raw Dataset 4            Metadata


A. Baltensperger   Raw Dataset 5            Metadata




                   http://mercury.ornl.gov/clearinghouse/

                     => Digital Publications
Two books by the EWHALE lab re. Predictions and related Philosophies
                     as presented here
Students & Projects of the EWHALE lab


 Andy Baltensperger            Katherine Miller




 Shana Losbaugh                Sue Hazlett




                                Tim Mullet
Keiko Akasofu Herrick
Students & Projects of the EWHALE lab

                         Ben Best
  Imme Rutzen




                        Betsy Young
 Brian Young




                         Michal Lindgren
Zach Meyers
Students & Projects of the EWHALE lab:
                                  Visitors

  Moritz Schmid                     Laszlo Koever
  (Uni Goettingen,                  (Uni Debrecen,
   Germany)                          Hungary)




                                    David Lieske
 Dmitry Korobitsyn                  (Mount Allison,
 (Uni Archangelsk,                   Canada)
  Russia)




Cynthia Resendiz
(Mexico)
Our Business Model




      NOT A WETLAB




   NOT FOR RE-CHARGE




CONSTANT, STEADY SMALL FLOW
Some Examples of what the EWHALE lab does, internationally

     (~how Falk spent his sabbatical and time)
Bioice/Iceland: A research cruise “in” a predictive model…
                    ‘RV Meteor’ (Germany)
Ocean View I: A Global Benthos Model…(RandomForest Predictions)




                                    Wei et al. (2010). Global Patterns and
                                    Predictions of Seafloor Biomass using Random
                                    Forests. PLoS ONE 5(12): e15323.
Ocean View II: Dimethylsulfide (DMS), globally per month




                                                    Humphries et al.
                                                     (in review)
Spatial Predictions of Arctic (Pelagic) Seabirds
                 What Data are used: Pelagic Seabird Data ?!




Public data

+ High Quality                                       Relevance of Arctic
  Content                                            Specimen Collections
                                                            vs.
+ Metadata ?!




                                                         (Polarstern)
Spatial Predictions of Arctic (Pelagic) Seabirds
                       What Environmental Data were Used
                         (Listed in no particular order)
                   1. Distance to ice edge
                   2. Sea temperature at 10 m depth
                   3. Sea temperature at 0 m depth
                   4. Phosphate concentration at 10 m depth
                   5. Silicate concentration at surface
                   6. Phosphate concentration at surface
                   7. Salinity at 20 m depth
                   8. Distance to settlements (!)
                   9. Salinity at surface
                   10. Silicate concentration at 10 m depth
                   11. Discharge from rivers
                   12. Distance to shelf edge
                   13. Sea ice thickness
                   14. Nitrate concentration at surface
                   15. DMS (dimethyl sulfide) at surface (G. Humphries in prep.)
                   16. Nitrate concentration at 10 m depth
                   17. Bathymetric slope

Public Sources & Availability: Huettmann & Hazlett (2009) for 50 layers
Spatial Predictions of Arctic (Pelagic) Seabirds
                  What it looks like: Training and Assessment Data

Presence (blue) vs. Random (red) (Pseudo-absence)
   + Env. Data
   … Algorithm
   => Predictions
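
A minimal sketch of how a presence vs. random (pseudo-absence) training table could be assembled. The coordinates, the bounding box and the 2:1 ratio are hypothetical. Drawing longitude and latitude uniformly is a simplification: it over-samples area towards the pole, so an equal-area projection would be preferable in practice.

```python
import random

def pseudo_absences(n, bbox, rng):
    """Draw n random (lon, lat) points inside
    bbox = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [(rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
            for _ in range(n)]

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical seabird sightings as (lon, lat).
presence = [(-150.2, 71.3), (20.5, 78.9), (95.0, 80.1)]
arctic_bbox = (-180.0, 60.0, 180.0, 90.0)
background = pseudo_absences(len(presence) * 2, arctic_bbox, rng)

# Label 1 = presence, 0 = pseudo-absence; environmental values would be
# attached to each point (e.g. in a GIS) before the algorithm is run.
training = ([(lon, lat, 1) for lon, lat in presence]
            + [(lon, lat, 0) for lon, lat in background])
```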
Spatial Predictions of Arctic (Pelagic) Seabirds
                  What it looks like: Training and Assessment Data

Presence (blue) vs. Random (red) (Pseudo-absence)
Assessment (green; telemetry, O. Gilg)
   + Env. Data
   … Algorithm
   => Predictions
Spatial Predictions of Arctic (Pelagic) Seabirds
                 What it looks like: Predictions

Draft 1

Prediction Surface Legend
   Red/Yellow = Presence
   Light blue: Weak Presence
   Dark blue: Pseudo-absence
Spatial Predictions of Arctic (Pelagic) Seabirds
                 What it looks like: Predictions and their data

Draft 1

Prediction Surface Legend
   Red/Yellow = Presence
   Light blue: Weak Presence
   Dark blue: Pseudo-absence
   Green: Assessment Data (O. Gilg)
Circumpolar Arctic: 27 Seabird Open Access Predictions




              Tufted Puffin        Horned Puffin     Northern Fulmar
              Ivory Gull           Ross’s Gull       Black-legged Kittiwake

              …add up all predictions…

Huettmann et al. (2011)
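
Adding up the single-species predictions can be sketched as a cell-by-cell sum of presence/absence grids. The tiny 2x2 grids below are invented, and real layers would be GIS rasters with the same extent and cell size.

```python
def stack_predictions(layers):
    """Cell-wise sum of equally shaped 2-D grids (each grid is a list of rows)."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*layers)]

# Hypothetical predicted presence (1) / absence (0) grids for three species.
tufted_puffin = [[1, 0],
                 [1, 1]]
northern_fulmar = [[1, 1],
                   [0, 1]]
ivory_gull = [[0, 0],
              [0, 1]]

# Each cell now holds the number of species predicted present there.
richness = stack_predictions([tufted_puffin, northern_fulmar, ivory_gull])
```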
Circumpolar Arctic: Putting Models to Use

                                                      Seabird




      vs.




=>We are running out of
  space and time in the Arctic
  (and anywhere else)
Circumpolar Arctic: Alaskan Crab
                  Ensemble Model

            => Open Access (Raw Data + Model)
               in a highly commercial setting!




Compiled Raw Crab Data             Predicted Crab Pres/Abs
                                      (and Abundance)

                                                     Snow Crab off Alaska
                                                     (Hardy et al. 2011)
Circumpolar Arctic: Marine Protected Areas (MPA) and Biodiversity


                                                                MARXAN optimization
                                                              based on over 60 GIS layers




                                                  =>Over 20 GIS data layers for each
                                                     Pole (Arctic and Antarctic)
Huettmann and Hazlett (2010)
Antarctica: MPA by WWF-Australia
for the Scientific Committee on Antarctic Research (SCAR)




                                                   WWF-Australia,
                                                   SCAR 2012
Antarctica: Isopod Data, Penguin Data etc.




                                             Kaiser et al.,

                                             French
                                             Antarctic
                                             Service Data
Global Modeling of Biodiversity and Climate Change
What is a Soundscape?
• Biological Sounds
  – Biophony


• Geophysical Sounds
  – Geophony


• Anthropogenic Sounds
  – Anthrophony
                  Mullet et al.
                   (in prep)
Model-Predicting Sound
         (‘Soundscapes’)
Models based on:
 - 7 permanent sound stations
   - Stratified according to expected
       sounds
  - Rotate 6 sound stations
   – Input GPS coordinates and related sound
     data into TreeNet modeling software
   – Include environmental and human-related
     covariates (e.g., vegetation, distance to
     roads, aspect)
   – Extrapolate sound levels and sound source
     data to rest of Refuge


                                Mullet et al.
                                 (in prep)
Spatial Predictions of Forest Cover in Alaska




                                                Young et al.
                                                 (in prep)
Spatial Predictions of Forest Cover in Alaska
   2010                           2050




                                   Young et al.
                                    (in prep)
Regionalized IPCC models,
      e.g. Alaska
Temperature (August and January)
 (SNAP UAF data)




                                          2099




                2008
                                   Murphy et al.
                                    (2010)
Alaskan Caribou:
            Summer & winter ranges 2008 & 2099

          2008                    2099




Summer
Range

                    Model in RF
                    with IPCC               Murphy et al. (2010)


Winter
Range
RandomForest: Supervised and Unsupervised Classification

Supervised Classification: -Multiple Regression (classification or continuous)

                             -Multiple Response
                              e.g. YAIMPUTE
                                                              RandomForest

Unsupervised Classification: 1. Proximity Matrix via Bagging/Voting (RF)

                                 2. Similarity Matrix

                                 3. e.g. Regular Clustering (mclust, PAM)

                                 4. Visualize Result
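
The proximity step of the unsupervised route can be sketched as follows: two cases are similar in proportion to the number of trees in which they end up in the same terminal node. The leaf assignments below are invented; in practice they would come from a fitted forest, and the resulting dissimilarity matrix would be passed to a clustering method such as PAM.

```python
def proximity_matrix(leaf_ids):
    """leaf_ids[t][i] is the terminal node of case i in tree t.
    Proximity of cases i and j = share of trees where they share a node."""
    n_trees = len(leaf_ids)
    n_cases = len(leaf_ids[0])
    return [[sum(tree[i] == tree[j] for tree in leaf_ids) / n_trees
             for j in range(n_cases)]
            for i in range(n_cases)]

# Hypothetical terminal-node ids for 4 cases in 3 trees.
leaf_ids = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [2, 2, 3, 3],
]

proximity = proximity_matrix(leaf_ids)
# Clustering methods usually expect a dissimilarity: 1 - proximity.
dissimilarity = [[1.0 - p for p in row] for row in proximity]
```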
11 Cliome Clusters (RF)
Climate Cluster Data, Canada &
            Alaska




                     Credit: M. Lindgren et al.
Now, a topical shift to Circumpolar Arctic
Zooplankton Forecasting until 2100

Metridia longa showed the highest
increase in the copepodite life stage from
2010 to 2100.




                                        Credit: M. Schmid et al.
Calanus hyperboreus showed the highest
change in the predicted relative index of
depth from 2010 to 2100.




                                  Credit: M. Schmid et al.
GMBA Case Study: Himalaya Uplands Plant Database
              Bernhard Dickoré et al.




                    (red: sampling points)

                               + FGDC NBII/ISO Metadata
A High Priority Ethnomedicinal Plant in Nepal

              Dactylorhiza hatagirea (Marsh Orchids)




                81 “points”



Ethnobotanical Use: Tubers are used as nervine tonic and aphrodisiac. It is
                    also used to treat cuts, wounds, cough and anemia.
Prediction of a High Priority Ethnomedicinal Plant in Nepal

    Dactylorhiza hatagirea (Marsh Orchids)
MARXAN Solution for the Three Poles: 50% Protection Scenario
      (birds, glaciers/ice and freezing temperatures)





                         Legend:
                         Selection
                         Frequency!
                         (the darker the more frequently selected)

                            Note: Terrestrial areas of Arctic &
                                  Antarctic are not yet included
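
The selection frequency shown in the legend can be sketched as the share of optimization runs in which a planning unit was chosen. The four runs and the unit names are hypothetical, and the optimization itself (done by MARXAN) is not reproduced here.

```python
from collections import Counter

def selection_frequency(solutions):
    """Share of runs in which each planning unit was selected."""
    counts = Counter(unit for solution in solutions for unit in solution)
    return {unit: counts[unit] / len(solutions) for unit in sorted(counts)}

# Hypothetical results of four runs: each set lists the selected planning units.
runs = [{"A", "B"}, {"A", "C"}, {"A", "B"}, {"B", "D"}]
frequency = selection_frequency(runs)
```

Units with a frequency close to 1.0 would be drawn darkest on the map.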
Another book by the EWHALE lab
Avian Influenza (AI) Prediction globally…
                         (all based on Machine Learning!)




Global AI model (Ecological Niche) based on K. Herrick-Akasofu, F. Huettmann, J. Runstadler et al.
(unpublished; forthcoming thesis chapter)
Acknowledgements


L. Strecker, all co-authors, all EWHALE lab students, NCEAS, University
of Alaska-Fairbanks, D. Steinberg (Salford Systems Ltd), COML, CAML,
ArcOD, GMBA, IPY, A.W. Diamond, and many colleagues worldwide
 (a 20-year summary...) AND HUGE THANKS TO SALFORD SYSTEMS &
Dan Steinberg’s team

Contenu connexe

Similaire à Global Modeling of Biodiversity and Climate Change

Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Databricks
 
Paradigm shifts in wildlife and biodiversity management through machine learning
Paradigm shifts in wildlife and biodiversity management through machine learningParadigm shifts in wildlife and biodiversity management through machine learning
Paradigm shifts in wildlife and biodiversity management through machine learningSalford Systems
 
The spectre of the spectrum
The spectre of the spectrumThe spectre of the spectrum
The spectre of the spectrumDavid Gleich
 
Phylogenomic Supertrees. ORP Bininda-Emond
Phylogenomic Supertrees. ORP Bininda-EmondPhylogenomic Supertrees. ORP Bininda-Emond
Phylogenomic Supertrees. ORP Bininda-EmondRoderic Page
 
(Talk in Powerpoint Format)
(Talk in Powerpoint Format)(Talk in Powerpoint Format)
(Talk in Powerpoint Format)butest
 
U Florida / Gainesville talk, apr 13 2011
U Florida / Gainesville  talk, apr 13 2011U Florida / Gainesville  talk, apr 13 2011
U Florida / Gainesville talk, apr 13 2011c.titus.brown
 
2012 hpcuserforum talk
2012 hpcuserforum talk2012 hpcuserforum talk
2012 hpcuserforum talkc.titus.brown
 
Jan vitek distributedrandomforest_5-2-2013
Jan vitek distributedrandomforest_5-2-2013Jan vitek distributedrandomforest_5-2-2013
Jan vitek distributedrandomforest_5-2-2013Sri Ambati
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IRamez Abdalla, M.Sc
 
Exploring explainable machine learning for detecting changes in climate
Exploring explainable machine learning for detecting changes in climateExploring explainable machine learning for detecting changes in climate
Exploring explainable machine learning for detecting changes in climateZachary Labe
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the CloudDataMine Lab
 
Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in R
Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in RFinding Meaning in Points, Areas and Surfaces: Spatial Analysis in R
Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in RRevolution Analytics
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...Salford Systems
 
Using explainable AI to identify key regions of climate change in GFDL SPEAR ...
Using explainable AI to identify key regions of climate change in GFDL SPEAR ...Using explainable AI to identify key regions of climate change in GFDL SPEAR ...
Using explainable AI to identify key regions of climate change in GFDL SPEAR ...Zachary Labe
 
End of Sprint 5
End of Sprint 5End of Sprint 5
End of Sprint 5dm_work
 
EOS5 Demo
EOS5 DemoEOS5 Demo
EOS5 Demodm_work
 

Similaire à Global Modeling of Biodiversity and Climate Change (20)

Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
 
Resolution
ResolutionResolution
Resolution
 
Paradigm shifts in wildlife and biodiversity management through machine learning
Paradigm shifts in wildlife and biodiversity management through machine learningParadigm shifts in wildlife and biodiversity management through machine learning
Paradigm shifts in wildlife and biodiversity management through machine learning
 
The spectre of the spectrum
The spectre of the spectrumThe spectre of the spectrum
The spectre of the spectrum
 
Phylogenomic Supertrees. ORP Bininda-Emond
Phylogenomic Supertrees. ORP Bininda-EmondPhylogenomic Supertrees. ORP Bininda-Emond
Phylogenomic Supertrees. ORP Bininda-Emond
 
(Talk in Powerpoint Format)
(Talk in Powerpoint Format)(Talk in Powerpoint Format)
(Talk in Powerpoint Format)
 
U Florida / Gainesville talk, apr 13 2011
U Florida / Gainesville  talk, apr 13 2011U Florida / Gainesville  talk, apr 13 2011
U Florida / Gainesville talk, apr 13 2011
 
2012 hpcuserforum talk
2012 hpcuserforum talk2012 hpcuserforum talk
2012 hpcuserforum talk
 
Jan vitek distributedrandomforest_5-2-2013
Jan vitek distributedrandomforest_5-2-2013Jan vitek distributedrandomforest_5-2-2013
Jan vitek distributedrandomforest_5-2-2013
 
Modeling full scale-data(2)
Modeling full scale-data(2)Modeling full scale-data(2)
Modeling full scale-data(2)
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part I
 
Exploring explainable machine learning for detecting changes in climate
Exploring explainable machine learning for detecting changes in climateExploring explainable machine learning for detecting changes in climate
Exploring explainable machine learning for detecting changes in climate
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the Cloud
 
Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in R
Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in RFinding Meaning in Points, Areas and Surfaces: Spatial Analysis in R
Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in R
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
 
2014 nci-edrn
2014 nci-edrn2014 nci-edrn
2014 nci-edrn
 
Using explainable AI to identify key regions of climate change in GFDL SPEAR ...
Using explainable AI to identify key regions of climate change in GFDL SPEAR ...Using explainable AI to identify key regions of climate change in GFDL SPEAR ...
Using explainable AI to identify key regions of climate change in GFDL SPEAR ...
 
Surveys
SurveysSurveys
Surveys
 
End of Sprint 5
End of Sprint 5End of Sprint 5
End of Sprint 5
 
EOS5 Demo
EOS5 DemoEOS5 Demo
EOS5 Demo
 

Plus de Salford Systems

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4Salford Systems
 
Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsSalford Systems
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Salford Systems
 
Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Salford Systems
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningSalford Systems
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerSalford Systems
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like YouSalford Systems
 
Statistically Significant Quotes To Remember
Statistically Significant Quotes To RememberStatistically Significant Quotes To Remember
Statistically Significant Quotes To RememberSalford Systems
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetSalford Systems
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideSalford Systems
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to marsSalford Systems
 
Data Mining for Higher Education
Data Mining for Higher EducationData Mining for Higher Education
Data Mining for Higher EducationSalford Systems
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingSalford Systems
 
Molecular data mining tool advances in hiv
Molecular data mining tool  advances in hivMolecular data mining tool  advances in hiv
Molecular data mining tool advances in hivSalford Systems
 
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees:  A Winning CombinationTreeNet Tree Ensembles & CART Decision Trees:  A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees: A Winning CombinationSalford Systems
 
SPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSalford Systems
 
Hybrid cart logit model 1998
Hybrid cart logit model 1998Hybrid cart logit model 1998
Hybrid cart logit model 1998Salford Systems
 
Session Logs Tutorial for SPM
Session Logs Tutorial for SPMSession Logs Tutorial for SPM
Session Logs Tutorial for SPMSalford Systems
 
Some of the new features in SPM 7
Some of the new features in SPM 7Some of the new features in SPM 7
Some of the new features in SPM 7Salford Systems
 

Plus de Salford Systems (20)

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4
 
Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForests
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
 
Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data Mining
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele Cutler
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You
 
Statistically Significant Quotes To Remember
Global Modeling of Biodiversity and Climate Change

  • 1. Global Modeling of Biodiversity and Climate Change. Falk Huettmann et al., EWHALE lab, Biology and Wildlife Department, Institute of Arctic Biology, University of Alaska-Fairbanks, Fairbanks, Alaska
  • 3. Re. Scientific Thinking and Thought: Karl Popper, Leo Breiman, Felix Shtilmark, Herman Daly, Dave Carlson
  • 4. Scientific Landmines?! Spatial/Geographic Information Systems (GIS) and… Data Sharing (online); Machine Learning; Predictions; Data Mining; Diseases (Influenza); Metadata; Sustainability Management; Economic Growth problem
  • 5. Central to our work: Predictions in Space and Time, e.g. done best with Machine Learning: quantitative; spatial; statistical interactions included; one formula; one algorithm; repeatable; testable; transparent; open access
  • 6. How GIS and machine learning connect… A Work Flow: ArcGIS 10.2, Salford, R, GME, Python, etc.
  • 7. Tree/CART Family: binary recursive partitioning (Leo Breiman 1984, and others). Example splits, each with a YES and a NO branch: Temp > 15; Precip < 100; Temp < 5. Purity metric of nodes.
  • 8. TreeNet (~a sequence of CARTs; ‘boosting’): the more nodes, the more detail and the slower; each tree explains the remaining variance until the end. ROC curves for accuracy tests, e.g. correctly predicted absence approx. 97%, correctly predicted presence approx. 92%. Variable importance scores: LDUSE 100.00; TAIR_AUG 97.62; HYDRO 94.35; DEM 94.01; PREC_AUG 90.17; POP 82.54; HMFPT 81.46. Difficult to interpret, but good graphs. => Apply to a dataset for predictions
  • 9. RandomForest (Prasad et al. 2006, Furlanello et al. 2003, Breiman 2001): Boosting & Bagging algorithms; handles ‘noise’, interactions and categorical data fine! Each tree uses a random set of columns (predictors, e.g. DEM, slope, aspect, climate, landcover) and a random set of rows (cases): random set 1, random set 2, etc. The final tree is averaged from e.g. >2000 trees, done by VOTING. Bagging: optimization based on in-bag and out-of-bag samples. In RF no pruning => difficult to overfit (robust). Difficult to interpret, but good graphs
  • 10. Machine Learning example with GIS: Spoon-billed Sandpiper and Predictions (where are the wintering grounds of ca. 1000 highly endangered birds…?) (breeding, Kamchatka) (winter) Engler et al. (in prep)
  • 11. Data means Metadata and Data Management (specifically for GIS, for science projects, machine learning and for graduate students). Field Season 1 and Field Seasons 2 & 3: Raw Datasets 1 to 5, etc., each with its own Metadata (A. Baltensperger). http://mercury.ornl.gov/clearinghouse/ => Digital Publications
  • 12. Two books by the EWHALE lab re. Predictions and related Philosophies as presented here
  • 13. Students & Projects of the EWHALE lab: Andy Baltensperger, Katherine Miller, Shana Losbaugh, Sue Hazlett, Tim Mullet, Keiko Akasofu Herrick
  • 14. Students & Projects of the EWHALE lab: Ben Best, Imme Rutzen, Betsy Young, Brian Young, Michal Lindgren, Zach Meyers
  • 15. Students & Projects of the EWHALE lab: Visitors. Moritz Schmid (Uni Goettingen, Germany), Laszlo Koever (Uni Debrecen, Hungary), David Lieske (Mount Allison, Canada), Dmitry Korobitsyn (Uni Archangelsk, Russia), Cynthia Resendiz (Mexico)
  • 16. Our Business Model: not a wet lab; not for re-charge; constant, steady small flow
  • 17. Some Examples of what the EWHALE lab does, internationally (~how Falk spent his sabbatical and time)
  • 18. Bioice/Iceland: A research cruise “in” a predictive model… ‘RV Meteor’ (Germany)
  • 19. Ocean View I: A Global Benthos Model… (RandomForest Predictions) Wei et al. (2011). Global Patterns and Predictions of Seafloor Biomass using Random Forests. PLoS ONE 5(12): e15323.
  • 20. Ocean View II: Dimethylsulfide (DMS), globally per month. Humphries et al. (in review)
  • 21. Spatial Predictions of Arctic (Pelagic) Seabirds. What data are used: pelagic seabird data?! Public data + high quality + metadata?! Relevance of Arctic content. Specimen collections vs. (Polarstern)
  • 22. Spatial Predictions of Arctic (Pelagic) Seabirds. What environmental data were used (listed in no order; public sources & availability; Huettmann & Hazlett 2009): 1. Distance to ice edge; 2. Sea temperature at 10m depth; 3. Sea temperature at 0m depth; 4. Phosphate concentration at 10m depth; 5. Silicate concentration at surface; 6. Phosphate concentration at surface; 7. Salinity at 20m depth; 8. Distance to settlements (!); 9. Salinity at surface; 10. Silicate concentration at 10m depth; 11. Discharge from rivers for 50 layers; 12. Distance to shelf edge; 13. Sea ice thickness; 14. Nitrate concentration at surface; 15. DMS (dimethyl sulfide) at surface (G. Humphries in prep.); 16. Nitrate concentration at 10m depth; 17. Bathymetric slope
  • 23. Spatial Predictions of Arctic (Pelagic) Seabirds. How it looks: Training and Assessment Data. Presence (blue) vs. Random (red; pseudo-absence) + Env. Data … Algorithm => Predictions
  • 24. Spatial Predictions of Arctic (Pelagic) Seabirds. How it looks: Training and Assessment Data. Presence (blue) vs. Random (red; pseudo-absence) + Env. Data; Assessment (green; telemetry … O. Gilg) … Algorithm => Predictions
  • 25. Spatial Predictions of Arctic (Pelagic) Seabirds. How it looks: Predictions (Draft 1). Prediction surface legend: Red/Yellow = Presence; Light blue = Weak Presence; Dark blue = Pseudo-absence
  • 26. Spatial Predictions of Arctic (Pelagic) Seabirds. How it looks: Predictions and their data (Draft 1). Prediction surface legend: Red/Yellow = Presence; Light blue = Weak Presence; Dark blue = Pseudo-absence; Green = Assessment Data (O. Gilg)
  • 27. Circumpolar Arctic: 27 Seabird Open Access Predictions, e.g. Tufted Puffin, Horned Puffin, Northern Fulmar, Ivory Gull, Ross’s Gull, Black-legged Kittiwake …add up all predictions… Huettmann et al. (2011)
  • 28. Circumpolar Arctic: Putting Models to Use Seabird vs. =>We are running out of space and time in the Arctic (and anywhere else)
  • 29. Circumpolar Arctic: Alaskan Crab Ensemble Model => Open Access (Raw Data + Model) in a highly commercial setting! Compiled raw crab data; predicted crab presence/absence (and abundance). Snow Crab off Alaska (Hardy et al. 2011)
  • 30. Circumpolar Arctic: Marine Protected Areas (MPA) and Biodiversity MARXAN optimization based on over 60 GIS layers =>Over 20 GIS data layers for each Pole (Arctic and Antarctic) Huettmann and Hazlett (2010)
  • 31. Antarctica: MPA by WWF-Australia for the Scientific Committee on Antarctic Research (SCAR) WWF-Australia, SCAR 2012
  • 32. Antarctica: Isopod data, penguin data, etc. (Kaiser et al.; French Antarctic Service data)
  • 34. What is a Soundscape? Biological sounds: Biophony; Geophysical sounds: Geophony; Anthropogenic sounds: Anthrophony. Mullet et al. (in prep)
  • 35. Model-Predicting Sound (‘Soundscapes’). Models based on: 7 permanent sound stations; stratified according to expected sounds; rotate 6 sound stations. Input GPS coordinates and related sound data into TreeNet modeling software; include environmental and human-related covariates (e.g., vegetation, distance to roads, aspect); extrapolate sound levels and sound source data to the rest of the Refuge. Mullet et al. (in prep)
  • 36. Spatial Predictions of Forest Cover in Alaska Young et al. (in prep)
  • 37. Spatial Predictions of Forest Cover in Alaska Young et al. (in prep)
  • 38. Spatial Predictions of Forest Cover in Alaska: 2010 and 2050. Young et al. (in prep)
  • 39. Regionalized IPCC models, e.g. Alaska Temperature (August and January), 2008 and 2099 (SNAP UAF data). Murphy et al. (2010)
  • 40. Alaskan Caribou: Summer & winter ranges, 2008 & 2099. Model in RF with IPCC. Murphy et al. (2010)
  • 41. RandomForest: Supervised and Unsupervised Classification. Supervised Classification: Multiple Regression (classification or continuous); Multiple Response, e.g. yaImpute RandomForest. Unsupervised Classification: 1. Proximity Matrix via Bagging/Voting (RF); 2. Similarity Matrix; 3. e.g. Regular Clustering (mclust, PAM); 4. Visualize Result
  • 42. 11 Cliome Clusters (RF) Climate Cluster Data, Canada & Alaska Credit: M. Lindgren et al.
  • 43. Now, a topical shift to Circumpolar Arctic and Zooplankton Forecasting until 2100. Metridia longa showed the highest increase in the copepodite life stage from 2010 to 2100. Credit: M. Schmid et al.
  • 44. Calanus hyperboreus showed the highest change in the predicted relative index of depth from 2010 to 2100. Credit: M. Schmid et al.
  • 45. GMBA Case Study: Himalaya Uplands Plant Database Bernhard Dickoré et al. (red: sampling points) + FGDC NBII/ISO Metadata
  • 46. A High Priority Ethnomedicinal Plant in Nepal Dactylorhiza hatagirea (Marsh Orchids) 81 “points” Ethnobotanical Use: Tubers are used as nervine tonic and aphrodisiac. It is also used to treat cuts, wounds, cough and anemia.
  • 47. Prediction of a High Priority Ethnomedicinal Plant in Nepal: Dactylorhiza hatagirea (Marsh Orchids)
  • 48. MARXAN Solution for the Three Poles: 50% Protection Scenario (birds, glaciers/ice and freezing temperatures). Legend: Selection Frequency! (the darker, the more frequently selected) Note: Terrestrial areas of Arctic & Antarctic are not included yet
  • 49. Another book by the EWHALE lab
  • 50. Avian Influenza (AI) Prediction globally… (all based on Machine Learning!) Global AI model (Ecological Niche) based on K. Herrick-Akasofu, F. Huettmann, J. Runstadler et al. (unpublished; forthcoming thesis chapter)
  • 51. Acknowledgements: L. Strecker, all co-authors, all EWHALE lab students, NCEAS, University of Alaska-Fairbanks, D. Steinberg (Salford Systems Ltd), COML, CAML, ArcOD, GMBA, IPY, A.W. Diamond, and many colleagues worldwide (a 20-year summary...) AND HUGE THANKS TO SALFORD SYSTEMS & Dan Steinberg’s team
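
The binary recursive partitioning and node purity metric named on slide 7 can be illustrated with a small sketch. This is not the Salford CART implementation: it assumes Gini impurity as the purity metric, and the toy `temp`/`precip` records, the labels, and the function names `gini`, `best_split`, `grow` and `predict` are all hypothetical, chosen only to mirror the Temp/Precip splits shown on the slide.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 0.0 when all labels agree (a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Try every predictor and midpoint threshold; return the
    (feature, threshold, weighted impurity) with the lowest impurity."""
    best = None
    n = len(rows)
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[feature] <= t]
            right = [y for r, y in zip(rows, labels) if r[feature] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, t, score)
    return best

def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition until a node is pure or max_depth is reached."""
    majority = Counter(labels).most_common(1)[0][0]
    if gini(labels) == 0.0 or depth == max_depth:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no predictor has two distinct values left
        return majority
    feature, t, _ = split
    yes = [i for i, r in enumerate(rows) if r[feature] > t]
    no = [i for i, r in enumerate(rows) if r[feature] <= t]
    return {
        "feature": feature,
        "threshold": t,
        "yes": grow([rows[i] for i in yes], [labels[i] for i in yes], depth + 1, max_depth),
        "no": grow([rows[i] for i in no], [labels[i] for i in no], depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow YES/NO branches until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "yes" if row[tree["feature"]] > tree["threshold"] else "no"
        tree = tree[branch]
    return tree

# Hypothetical toy records (not data from the slides).
rows = [
    {"temp": 20, "precip": 50},
    {"temp": 18, "precip": 80},
    {"temp": 19, "precip": 150},
    {"temp": 10, "precip": 60},
    {"temp": 3, "precip": 40},
    {"temp": 16, "precip": 120},
]
labels = ["presence", "presence", "absence", "absence", "absence", "absence"]

tree = grow(rows, labels)
print(tree)
print(predict(tree, {"temp": 21, "precip": 60}))
```

On this toy data the root split falls on `temp` and the second split on `precip`, which is the same YES/NO structure the slide draws. Ensemble methods such as TreeNet and RandomForest build many such trees rather than one.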