EURO-BASIN, www.euro-basin.eu
Introduction to Statistical Modelling Tools for Habitat Models Development, 26-28th Oct 2011


                     OUTLINE
• Why to model?


• Habitat models


• Model properties


• Steps for modelling


• What about data?


             WHY TO MODEL?
• “All models are wrong, but some are useful” (G. Box)


• Models are how we understand the world:
       We see the world through models
       We learn about the world using formal descriptions


• Model types:


   – Static vs dynamic
   – Explanatory vs predictive
   – Deterministic vs stochastic
   – Discrete vs continuous


             HABITAT MODELS
• Habitat models are focused on how environmental factors control
  the distribution of species and communities.


• Multiple applications:


    – Biogeography, impacts of global change, management,
      conservation, ecology, …


• New conceptual and operational advances due to the growth in
  computing power, e.g. GIS, remote sensing, new (computer-intensive)
  statistical modelling tools, etc.


          MODEL PROPERTIES
Some desirable model properties:


• Parsimony (Occam’s razor): “All things being equal, the simplest
  solution tends to be the best one”
• Tractability: easy to analyse
• Conceptually insightful: reveal fundamental properties
• Generalizability: can be applied to other situations/species/…
• Empirical consistency: consistent with the available data
• Falsifiability: can be tested by observations
• Predictive precision


         MODEL PROPERTIES

[Figure: predictive habitat distribution models. Levins (1966); Sharpe (1990); Guisan and Zimmermann (2000)]


 MODEL PROPERTIES

[Figure: COMPLEXITY and GENERALITY]

The more complex model is not necessarily the best…


STEPS FOR MODELLING
 1) Conceptual phase


 2) Model formulation


 3) Model calibration


 4) Spatial predictions


 5) Model evaluation


 6) Model applicability


STEPS FOR MODELLING

[Figure: overview of the modelling steps. Guisan and Zimmermann (2000)]


             1. Conceptual phase
• A theoretical model should be in mind before a statistical
  model is even considered
• This phase includes:
    – Literature review
    – Define an up-to-date conceptual model
    – Set multiple hypotheses
    – Assess available and missing data
    – Identify appropriate sampling strategy for new data
    – Choose appropriate spatio-temporal resolution and geographic
      extent
    – Identify the most appropriate statistical methods for the other
      phases


             2. Model formulation
• The model depends on the type of response variable and its
  associated probability distribution


        Distribution             Examples
        Gaussian                 Biomass
        Poisson                  Individual counts
        Negative Binomial        Individual counts
        Multinomial              Communities
        Binomial                 Presence/absence
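
As a rough illustration of the table above, the choice of distribution can often be read off the type of the response values. The following is a minimal sketch, not part of the original course material: the function name `suggest_family` and the overdispersion rule (variance more than twice the mean) are assumptions made for illustration, and the Multinomial case for communities is left out.

```python
from statistics import mean, pvariance


def suggest_family(values):
    """Suggest a probability distribution for a response variable.

    Heuristic only (an assumption of this sketch): the final choice
    should rest on ecological knowledge and model diagnostics.
    """
    values = list(values)
    if all(v in (0, 1) for v in values):
        return "Binomial"  # presence/absence
    if all(float(v).is_integer() and v >= 0 for v in values):
        # Counts: Poisson assumes variance == mean; a much larger variance
        # (overdispersion) points towards the Negative Binomial.
        m = mean(values)
        if m > 0 and pvariance(values) > 2 * m:
            return "Negative Binomial"
        return "Poisson"
    return "Gaussian"  # continuous response such as biomass


print(suggest_family([0, 1, 1, 0, 1]))       # presence/absence
print(suggest_family([2, 3, 1, 4, 2, 3]))    # counts, variance below the mean
print(suggest_family([0, 0, 1, 40, 0, 55]))  # overdispersed counts
print(suggest_family([12.4, 8.1, 15.9]))     # biomass
```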


2. Model formulation

[Figure: Guisan and Zimmermann (2000)]


REGRESSION ANALYSIS   2. Model formulation

[Figures: four scatter plots of y against x, two with x from 0 to 10 and y from 0 to 50, two with x from 0 to 1 and y from -5 to 10. © AZTI-Tecnalia]

REGRESSION ANALYSIS        2. Model formulation

[Figure: model equation with the LINK FUNCTION labelled]

The response variable y can follow distributions like:
NORMAL, BINOMIAL, POISSON, GAMMA, etc.

McCullagh and Nelder (1989); Dobson (2008)
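
The equation on this slide did not survive extraction. As a hedged reconstruction, the standard generalized linear model described by McCullagh and Nelder (1989) relates the mean of the response to a linear predictor through a link function g. The symbols β_j, x_j and p are notation assumed here, not taken from the slide.

```latex
g\big(\mathrm{E}[y]\big) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p
```

Commonly used links are the identity for a Normal response, the logit for a Binomial response and the log for a Poisson response.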


REGRESSION ANALYSIS        2. Model formulation

[Figure: model equation with the LINK FUNCTION and the SMOOTHS labelled]

The response variable y can follow distributions like:
NORMAL, BINOMIAL, POISSON, GAMMA, etc.

Hastie and Tibshirani (1990); Wood (2006)
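
The equation on this slide is also missing from the extracted text. A hedged reconstruction, using the standard generalized additive model form of Hastie and Tibshirani (1990), replaces each linear term by a smooth function of the explanatory variable. The symbols f_j, x_j and p are notation assumed here.

```latex
g\big(\mathrm{E}[y]\big) = \beta_0 + f_1(x_1) + \dots + f_p(x_p)
```

Here g is again the link function and each f_j is a smooth function estimated from the data rather than a fixed straight line.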


REGRESSION ANALYSIS      2. Model formulation

                           Linear model                       Additive model
                              (LM)                                (AM)

                     Generalized linear model           Generalized additive model
                              (GLM)                              (GAM)




REGRESSION ANALYSIS         2. Model formulation
                      Other regression models:


                      • Mixed models: LM, GLM and GAMs including random effect
                        terms. Useful for meta-analysis.


                      • Quantile regression: the quantiles are modelled instead of
                        the mean. Useful for finding limiting factors


                      • Segmented regression: the model changes depending on a
                        partition of the explanatory variable. Useful for detecting
                        regime changes


                      • Spatial autocorrelation and autoregressive models
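
To make the idea of modelling quantiles instead of the mean concrete, the sketch below fits the simplest possible quantile regression, a model with an intercept only, by minimising the quantile ("pinball") loss. The function names and the data are hypothetical, and a real analysis would include explanatory variables and use a dedicated statistical package.

```python
def pinball_loss(q, ys, tau):
    """Quantile (pinball) loss of a constant prediction q at level tau."""
    return sum(tau * (y - q) if y >= q else (1 - tau) * (q - y) for y in ys)


def fit_constant_quantile(ys, tau):
    """Intercept-only quantile regression: the observed value minimising the loss."""
    return min(ys, key=lambda q: pinball_loss(q, ys, tau))


ys = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(fit_constant_quantile(ys, 0.5))  # the median
print(fit_constant_quantile(ys, 0.9))  # an upper quantile, closer to a limit
```

Raising `tau` moves the fitted value towards the upper edge of the data, which is why high quantiles are used to look for limiting factors.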


CLASSIFICATION TECHNIQUES           2. Model formulation
                            • Classification is the placement of species and/or sample units
                              into groups based on the environmental variables


                            • Many techniques are included: classification decision trees,
                              regression decision trees, rule-based classification,
                              maximum-likelihood classification


                            • Mainly two groups:
                               – Supervised classification: a training data set is required
                                 (groups are known beforehand)
                               – Unsupervised classification: groups are unknown and need
                                 to be defined, as in cluster analysis


ENVIRONMENTAL ENVELOPES           2. Model formulation
                          • The environmental envelope of a species is defined as the set
                            of environments within which it is believed that the species can
                            persist (Walker and Cocks, 1991)


                          • Examples of models:


                              – BIOCLIM: minimal rectilinear envelopes based on
                                classification trees
                              – HABITAT: convex polytope envelopes based on
                                classification trees
                              – DOMAIN: based on multivariate distance metrics
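
The simplest rectilinear envelope is the bounding box of the environmental values observed at presence sites. The sketch below illustrates that idea only; it is not the BIOCLIM implementation, and the variable names and presence records are hypothetical.

```python
def fit_envelope(sites):
    """Rectilinear envelope: (min, max) of each variable over presence sites.

    `sites` is a list of dicts mapping variable name -> value.
    """
    names = sites[0].keys()
    return {n: (min(s[n] for s in sites), max(s[n] for s in sites)) for n in names}


def inside_envelope(envelope, site):
    """True if every variable of `site` lies within the envelope bounds."""
    return all(lo <= site[n] <= hi for n, (lo, hi) in envelope.items())


# Hypothetical presence records (temperature in degrees C, depth in m).
presences = [
    {"temperature": 12.0, "depth": 40.0},
    {"temperature": 15.5, "depth": 120.0},
    {"temperature": 13.2, "depth": 75.0},
]
envelope = fit_envelope(presences)
print(envelope)
print(inside_envelope(envelope, {"temperature": 14.0, "depth": 100.0}))  # inside
print(inside_envelope(envelope, {"temperature": 18.0, "depth": 100.0}))  # too warm
```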


ORDINATION TECHNIQUES           2. Model formulation
                        • Ordination is the arrangement or ‘ordering’ of species and/or
                          sample units along gradients


                        • Usually applied to community data matrices (row: species,
                          column: samples, value: abundance)


ORDINATION TECHNIQUES           2. Model formulation
                        •   Indirect gradient analysis (no environmental data used)
                             – Distance-based approaches:
                                  • Polar ordination, Principal Coordinates Analysis, Nonmetric
                                    Multidimensional Scaling
                             – Eigenanalysis-based approaches:
                                  • Linear model
                                       – Principal Components Analysis
                                  • Unimodal model
                                       – Correspondence Analysis, Detrended Correspondence Analysis
                        •   Direct gradient analysis (environmental data used)
                             – Linear model
                                  • Redundancy Analysis
                             – Unimodal model
                                  • Canonical Correspondence Analysis, Detrended Canonical
                                    Correspondence Analysis


                                                                         ter Braak and Prentice (1988)
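
As an example of an eigenanalysis-based method, the sketch below finds the first axis of a Principal Components Analysis. It uses power iteration on the covariance matrix, chosen here only because it needs no external library; statistical packages normally use a full eigen or singular value decomposition. The function name and the data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """First principal axis of `rows` (samples x variables) by power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical samples described by two correlated variables.
rows = [[1, 2.1], [2, 3.9], [3, 6.2], [4, 7.8], [5, 10.1]]
axis = first_principal_component(rows)
print(axis)  # unit vector along the main gradient of the data
```

Projecting the centred samples onto this axis orders them along the main gradient, which is the sense of "ordination" used on the slide.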


NEURAL NETWORKS           2. Model formulation
                  • Models inspired by the human brain (interconnected groups of
                    neurons)


                  • They define a non-linear function, decomposed as a weighted
                    sum of functions, each of which can be decomposed further
                    in the same way. The result is a complex non-parametric
                    model (black box?)


                  • Adjusted by varying parameters, connection weights, or
                    specifics of the architecture such as the number of neurons or
                    their connectivity


                  • Few examples are available so far
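
The "weighted sum of functions" structure can be seen in a very small feed-forward network. This is a minimal sketch with one hidden layer; the weights, the sigmoid activation and the input variables are all assumptions chosen for illustration, and no training step is shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One layer: each neuron applies a sigmoid to a weighted sum of its inputs."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]


def predict(x, hidden_w, hidden_b, out_w, out_b):
    """Feed-forward network with one hidden layer and one output neuron."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, [out_w], [out_b])[0]


# Hypothetical weights; in practice they are adjusted during calibration.
hidden_w = [[0.8, -0.5], [-0.3, 0.9]]  # 2 hidden neurons, 2 inputs each
hidden_b = [0.1, -0.2]
out_w = [1.2, -0.7]
out_b = 0.05

# Inputs could be standardised temperature and depth (assumed here).
p = predict([0.5, -1.0], hidden_w, hidden_b, out_w, out_b)
print(p)  # a value between 0 and 1, e.g. a probability of presence
```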


             3. Model calibration
• It includes model fitting (finding the values of the unknown
  parameters that give the best agreement between the data and the
  model outputs) and model selection (which explanatory variables to
  include)


• Points to take into account:
   – Use of predictors that are ecologically relevant: direct vs indirect
     (proxy) variables
   – Correlation between explanatory variables


• Each method has its own diagnostic tools according to its
  assumptions, e.g. the residual deviance in regression models
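
Model fitting and the residual deviance can be sketched for a Poisson regression with a log link and one explanatory variable. The code below is a minimal illustration under stated assumptions: Newton's method is written out by hand for two parameters, the function names are hypothetical, and the counts are invented. In practice a statistical package would do this.

```python
import math


def fit_poisson(xs, ys, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton's method (Poisson likelihood)."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start from the mean count
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Fisher information matrix [[a, b], [b, c]].
        a = sum(mus)
        b = sum(m * x for x, m in zip(xs, mus))
        c = sum(m * x * x for x, m in zip(xs, mus))
        det = a * c - b * b
        # Newton step: solve the 2 x 2 system by hand.
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1


def residual_deviance(xs, ys, b0, b1):
    """Poisson residual deviance; the y*log(y/mu) term is 0 when y == 0."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = math.exp(b0 + b1 * x)
        total += (y * math.log(y / mu) if y > 0 else 0.0) - (y - mu)
    return 2.0 * total


# Hypothetical counts of individuals along an environmental gradient x.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2, 1, 4, 3, 6, 8, 9, 14, 17, 26]
b0, b1 = fit_poisson(xs, ys)
print(b0, b1, residual_deviance(xs, ys, b0, b1))
```

A smaller residual deviance means closer agreement between the fitted means and the observed counts, so it can be compared across candidate models during model selection.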


             4. Spatial predictions

• Spatial predictions can be made on the data set used for calibration
  or on new data sets. Care must be taken when a new data set contains
  combinations of the explanatory variables not seen in calibration, or
  values outside the range of the calibration data set


• GIS tools are very often used, but many statistical models are still
  not implemented in a GIS environment


              5. Model evaluation
• The aim is to evaluate the predictive power of a model


• If only one data set is available (the one used for calibration),
  resampling methods are used: bootstrap, cross-validation, jackknife


• If other data sets are available (independent of the calibration data
  set), predicted and observed values are compared using:
    – the same goodness of fit measure as used for model calibration
    – any other measure of association


    The data sets for calibration and evaluation are called the
    training and evaluation data sets, respectively. Sometimes the
    original single data set is split in two (split-sample approach)
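
Cross-validation with a single data set can be sketched as follows. This is a minimal illustration assuming a straight-line model fitted by least squares and interleaved folds; the function names and the data are hypothetical, and the mean squared prediction error stands in for whichever measure of association is preferred.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(xs, ys, k=5):
    """k-fold cross-validation; returns the mean squared prediction error."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train = [i for i in range(n) if i not in test_idx]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors += [(ys[i] - (a + b * xs[i])) ** 2 for i in test_idx]
    return sum(squared_errors) / n


xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.8, 15.3, 16.9, 19.2]
print(cross_validate(xs, ys, k=5))
```

Each observation is predicted by a model that never saw it during fitting, so the error reflects predictive power rather than goodness of fit to the calibration data.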


            6. Model applicability
• It refers to the domain over which a validated model can be properly
  used


• Potential uses (Decoursey, 1992):


   – Screening


   – Research


   – Planning, monitoring and assessment


         WHAT ABOUT DATA?
• Data is even more important than the model itself.


• Usually from multiple sources: surveys (continuous, stations, vertical
  profiles), remote sensing, circulation models, …


• The scale of the response and the environmental variables might not
  be the same, so a common scale unit needs to be defined. Sometimes
  interpolation is needed, which might introduce additional
  uncertainties


• Simple exploratory statistics and figures can be very useful before
  even starting to think about any model. They also help to spot errors
  in the data.
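
Bringing an environmental variable onto the scale of the response often comes down to interpolation. The sketch below uses linear interpolation along one dimension and refuses to extrapolate; the function name, the temperature profile and the sampling depths are hypothetical.

```python
def interpolate(x_known, y_known, x_new):
    """Linear interpolation of (x_known, y_known) at x_new.

    x_known must be strictly increasing. Returns None outside the
    observed range rather than extrapolating.
    """
    if x_new < x_known[0] or x_new > x_known[-1]:
        return None
    for (x0, y0), (x1, y1) in zip(zip(x_known, y_known),
                                  zip(x_known[1:], y_known[1:])):
        if x0 <= x_new <= x1:
            return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)


# Hypothetical vertical temperature profile: depth (m) -> temperature (degrees C).
depths = [0, 10, 50, 100]
temps = [18.0, 17.0, 13.0, 11.0]

# Depths at which the response (e.g. fish catches) was sampled.
for d in [5, 30, 75, 150]:
    print(d, interpolate(depths, temps, d))
```

The interpolated values are estimates, not measurements, which is one source of the additional uncertainty mentioned above.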
Predictive Habitat Distribution Models, Leire Ibaibarriaga
