Peter Sarlin. Toward robust early-warning models: A horse race, ensembles and model uncertainty

RiskLab
Toward robust early-warning models:
A horse race, ensembles and model uncertainty
Peter Sarlin, joint with Markus Holopainen
Hanken School of Economics and RiskLab Finland
Seminar at Bank of Estonia
June 30, 2015

RiskLab Motivation
An acute interest in new approaches to assess systemic risk
Financial crises triggered by various shocks (unpredictable)...
...but widespread imbalances build up ex ante (identifiable)
Early-warning models to identify systemic risk at early stages
Yet: which method(s) to use & when can we trust results?

RiskLab Systemic risk
Systemic risk along two dimensions (Borio, 2009)
1. Build-up of risk in tranquil times & abrupt unraveling in crises
2. How risk is distributed and how shocks transmit in the system
Three types of systemic risks (ECB, 2010):
endogenous build-up and unraveling of widespread imbalances
exogenous aggregate shocks
contagion and spillover

RiskLab Early-warning indicators & models
Text-book example of 2-class classification: crisis vs. tranquil
To identify vulnerable states of a country you need...
Dates of historical crisis occurrences
Indicators to identify sources of vulnerability
Estimate the probability of being in a vulnerable state
Signaling: Monitor univariate indicators
Non/linear approaches for combining indicators
Set a threshold on the probability to optimize a loss function
Transforms probabilities into binary point forecasts (0/1)
Depends on preferences between type I/II errors

RiskLab EWIs & Financial Stability Maps
Mapping the State of Financial Stability (joint with Peltonen)
How to represent multiple indicators visually?
Large-volume and high-dimensional data
Clustering: Reduce large volumes of data
Projection: From high-dimensional to low-dimensional
Financial Stability Map based upon 14 macro-financial
indicators for 28 economies from 1990Q1-2011Q4
VisRisk: A visualization platform for systemic risk analytics

RiskLab Interconnectedness & EWMs
Interconnectedness of the banking sector as a vulnerability to crises
(joint with Rancan & Peltonen)
This paper enriches an EWM with network measures
Financial networks of institutional sectors in Europe
MFI, INS, OFI, NFC, GOV, HH and ROW
Loans, deposits, debt and shares
Centrality of the MFI as an indicator for banking crises
Interconnectedness of the banking sector entails a vulnerability
Cross-border linkages capture vulnerabilities to crises...
...and larger domestic sectoral linkages amplify vulnerability...
...which yields useful predictions.
Most vulnerability descends from loans and debt securities

RiskLab Bank EWM
Predicting Bank Distress in Europe (with Betz, Oprica, Peltonen)
One of the first EWMs for individual banks and analysis of
determinants of bank vulnerabilities in the EU
Introduces a new dataset of bank distress in Europe
Micro-macro perspective: banking sector & MIP indicators
Loss function accounts for importance of individual banks
Conclusions
Importance of complementing bank-specific vulnerabilities with
macro-financial indicators
EWM based on publicly available data would have been useful
to predict individual bank distress during this crisis
For a policymaker, it is essential to be more concerned about type
I/II errors related to systemically important banks

RiskLab Networks & EWMs
Network linkages to predict bank distress (with Piloiu & Peltonen)
Does predictive performance improve if the EWM is
augmented with estimated bank interdependencies?
Banks are interconnected, yet EWMs model individual distress
A bank's risk modeled as a function of its neighbors' risk
Conclusions
Two-step estimation incorporating neighbors' vulnerabilities
Accounting for interconnections improves EWM performance
Allows comparing relative efficiency of different networks

RiskLab RiskRank: Joint measurement
RiskRank: Measuring interconnected systemic risk (with Mezei)
EWMs aggregate indicators; network measures capture connectivity
We assume a hierarchical system of interconnected nodes
RiskRank: Joint measure of cyclical & cross-sectional risk
Conclusions
Bottom-up aggregation: direct, indirect & feedback effects
Improved performance for bank and country models
General framework to combine the 2 systemic risk dimensions

RiskLab This paper
A three-fold contribution:
Conduct a horse race of early-warning models (EWMs)
Test various approaches to aggregating these methods
Estimate model performance and output uncertainty
Key questions:
How do EWMs perform in an objective & robust ranking?
Is one method superior to the others, or should they be used concurrently?
Statistical significance
is a method better than others?
are the probabilities above the threshold?

RiskLab Literature
Early-warning method comparisons
Often entirely missing
Bilateral tests (e.g., Peltonen, '06; Marghescu et al., '11)
ESCB's horse race shows little comparability (Alessi et al., '14)
Aggregation or ensemble learning
No previous use of model aggregation
Partly incorporated in random forest by Alessi & Detken ('14)
Statistical significance and uncertainty
El-Shagi et al. ('13): is a model useful?
Hurlin et al. ('14): similarity of two firms' risk measures

RiskLab Data
Quarterly data for 15 EU countries, 1976-2014Q3
Systemic banking crises
Laeven and Valencia & ESCB Heads of Research
Pre-crisis indicator: 5-12 quarters
Late-pre, crisis, and post-crisis periods removed
Macro-financial indicators
asset price misalignments (house and stock prices)
excessive credit growth (growth and gaps)
business cycle indicators (GDP and inflation)
macroeconomic factors (debt and CA)
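
As a minimal sketch of how the pre-crisis indicator above could be constructed, assuming a 0/1 crisis series for one country: quarters 5-12 before a crisis start are labeled 1, while late pre-crisis, crisis and post-crisis quarters are dropped. The function name and the post-crisis window length (`post=4`) are hypothetical; the slides do not state them.

```python
from typing import List, Optional

def pre_crisis_labels(crisis: List[int], lo: int = 5, hi: int = 12,
                      post: int = 4) -> List[Optional[int]]:
    """Return 1 (pre-crisis), 0 (tranquil) or None (dropped) per quarter.

    crisis[t] is 1 if quarter t is a crisis quarter, else 0.
    `post` is an assumed post-crisis window, not taken from the slides.
    """
    n = len(crisis)
    starts = [t for t in range(n)
              if crisis[t] == 1 and (t == 0 or crisis[t - 1] == 0)]
    ends = [t for t in range(n)
            if crisis[t] == 1 and (t == n - 1 or crisis[t + 1] == 0)]
    labels: List[Optional[int]] = [0] * n
    # Pre-crisis window: lo..hi quarters before each crisis start.
    for s in starts:
        for t in range(max(0, s - hi), max(0, s - lo + 1)):
            labels[t] = 1
    # Late pre-crisis quarters (1..lo-1 before a start) are removed.
    for s in starts:
        for t in range(max(0, s - lo + 1), s):
            labels[t] = None
    # Crisis quarters are removed.
    for t in range(n):
        if crisis[t] == 1:
            labels[t] = None
    # Post-crisis quarters are removed.
    for e in ends:
        for t in range(e + 1, min(n, e + 1 + post)):
            labels[t] = None
    return labels

# Example: 20 tranquil quarters, a 3-quarter crisis, then 6 more quarters.
example = pre_crisis_labels([0] * 20 + [1] * 3 + [0] * 6)
```

Dropped quarters are returned as `None` so that they can be filtered out before estimation.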

RiskLab Methods in this paper
A horse race of multiple methods for early-warning exercises
Signal extraction
LDA & QDA
Logit & Logit LASSO
Naive Bayes
KNN
Classification tree & Random forest
ANN & ELM
SVM

RiskLab Taxonomy of methods
Predictive analytics: Clustering, Classification, Regression
Classification methods by type:
Covariance matrix: LDA, QDA, Logit, Logit LASSO
Frequency table: Signal extraction, Naive Bayes, Decision tree, Random forest
Similarity functions: KNN
Others: ANN, ELM, SVM

RiskLab Ensembles and uncertainty
Ensemble approaches for concurrent use of EWMs
Best-of & voting
Arithmetic & weighted averages of probabilities
Empirical resampling distributions to assess uncertainty
Use repeated cross-validation and bootstrapping
Model performance uncertainty
Variation in relative Usefulness of EWMs
Model output uncertainty
Variation in probabilities and thresholds
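
The four ensemble approaches listed above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, `scores` stands for any in-sample performance measure per model (for example relative usefulness), and using those scores as weights in the weighted average is an assumption.

```python
from typing import Dict, List, Optional

Probs = Dict[str, List[float]]  # model name -> probability per observation

def best_of(probs: Probs, scores: Dict[str, float]) -> List[float]:
    """Use only the output of the model with the highest score."""
    best = max(scores, key=scores.get)
    return probs[best]

def voting(probs: Probs, thresholds: Dict[str, float]) -> List[int]:
    """Signal (1) when a strict majority of models signal."""
    n_models = len(probs)
    n_obs = len(next(iter(probs.values())))
    signals = []
    for t in range(n_obs):
        votes = sum(probs[m][t] >= thresholds[m] for m in probs)
        signals.append(1 if 2 * votes > n_models else 0)
    return signals

def average(probs: Probs,
            weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Arithmetic mean of probabilities, or weighted mean if weights given."""
    if weights is None:
        weights = {m: 1.0 for m in probs}
    total = sum(weights[m] for m in probs)
    n_obs = len(next(iter(probs.values())))
    return [sum(weights[m] * probs[m][t] for m in probs) / total
            for t in range(n_obs)]
```

Voting aggregates binary signals, so it needs a threshold per model, whereas the averaging approaches aggregate probabilities and need a single threshold on the combined output.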

RiskLab Evaluation criterion
Apply the usefulness criterion (Alessi & Detken, '11; Sarlin, '13):

                       Actual class I_j
                       Crisis                 No crisis
Predicted class P_j
  Signal               True positive (TP)     False positive (FP)
  No signal            False negative (FN)    True negative (TN)

Find the threshold that minimizes a loss function that depends
on policymakers' preferences $\mu$ between Type I errors
($T_1 = FN/(FN + TP)$, missed crises) and Type II errors
($T_2 = FP/(TN + FP)$, false alarms) and the unconditional
probabilities of the events $P_1$ and $P_2$:
$L(\mu) = \mu T_1 P_1 + (1 - \mu) T_2 P_2$
Define absolute usefulness $U_a$ as the difference between the
loss of disregarding the model (available $U_a$) and the loss of
the model:
$U_a(\mu) = \min[\mu P_1, (1 - \mu) P_2] - L(\mu)$
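
The loss and absolute usefulness defined above can be computed directly from the four cells of the contingency table. The sketch below assumes that P1 and P2 are estimated as the sample shares of crisis and no-crisis observations; the function name is hypothetical.

```python
from typing import Tuple

def loss_and_usefulness(tp: int, fp: int, fn: int, tn: int,
                        mu: float) -> Tuple[float, float]:
    """Return (L(mu), Ua(mu)) for one contingency table."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present")
    n = tp + fp + fn + tn
    p1 = (tp + fn) / n          # unconditional probability of crisis
    p2 = (fp + tn) / n          # unconditional probability of no crisis
    t1 = fn / (fn + tp)         # Type I error rate: missed crises
    t2 = fp / (tn + fp)         # Type II error rate: false alarms
    loss = mu * t1 * p1 + (1 - mu) * t2 * p2
    available = min(mu * p1, (1 - mu) * p2)  # loss when ignoring the model
    return loss, available - loss

# Example: 10 crisis and 90 tranquil observations, mu = 0.8.
loss, ua = loss_and_usefulness(tp=8, fp=10, fn=2, tn=80, mu=0.8)
```

A negative value of the second element means the policymaker would do better by ignoring the model.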

RiskLab Evaluation & estimation strategies
Relative usefulness $U_r$ is the ratio of captured $U_a$ to available
$U_a$, given $\mu$ and $P_1$:
$U_r(\mu) = U_a(\mu) / \min[\mu P_1, (1 - \mu) P_2]$
Model selection to optimize free parameters via a grid search
Cross-validation exercise (repeated CV)
Assess generalization performance
10 folds
Real-time recursive exercise (bootstrapping)
Test prediction performance from 2006Q2 - 2014Q3
Use only data available at that specific point in time
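
Relative usefulness, and the search for the threshold that maximizes it on a given sample, can be sketched as below. The function names and the evenly spaced threshold grid are assumptions for illustration; both classes are assumed to be present in `labels`.

```python
from typing import List

def relative_usefulness(probs: List[float], labels: List[int],
                        threshold: float, mu: float) -> float:
    """Ur(mu) when signaling whenever the probability >= threshold."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        signal = p >= threshold
        if signal and y == 1:
            tp += 1
        elif signal and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    n = len(labels)
    p1, p2 = (tp + fn) / n, (fp + tn) / n
    t1, t2 = fn / (fn + tp), fp / (tn + fp)
    loss = mu * t1 * p1 + (1 - mu) * t2 * p2
    available = min(mu * p1, (1 - mu) * p2)
    return (available - loss) / available

def best_threshold(probs: List[float], labels: List[int],
                   mu: float, steps: int = 100) -> float:
    """Grid search over thresholds in [0, 1]; ties go to the lowest one."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda th: relative_usefulness(probs, labels, th, mu))
```

In a real-time recursive exercise, the threshold would be chosen on the data available at each point in time and then applied to the next out-of-sample quarter.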

RiskLab Cross-validated horse race
Rank(*)  Method          Ur(µ)   SE (Ur)  AUC     SE (AUC)
1(4)     KNN             92 %    0.016    0.987   0.006
2(7)     SVM             91 %    0.017    0.998   0.001
3(8)     Neural network  90 %    0.022    0.996   0.003
4(8)     ELM             88 %    0.023    0.991   0.005
5(8)     Weighted        88 %    0.012    0.995   0.0006
6(8)     Voting          88 %    0.017    0.947   0.008
7(11)    Best-of         84 %    0.030    0.991   0.005
8(11)    Non-weighted    83 %    0.010    0.992   0.0007
9(11)    Random forest   82 %    0.042    0.996   0.001
10(11)   QDA             79 %    0.024    0.984   0.001
11(13)   Classif. tree   64 %    0.027    0.882   0.018
12(13)   Naive Bayes     60 %    0.019    0.948   0.002
13(15)   Logit           54 %    0.018    0.933   0.008
14(15)   Logit LASSO     53 %    0.017    0.934   0.001
15(16)   LDA             48 %    0.022    0.927   0.002
16(-)    Signaling       4 %     0.014    0.712   0.000

RiskLab Recursive horse race
Rank(*)  Method          Ur(µ)   SE (Ur)  AUC    SE (AUC)
1(8)     Best-of         76 %    0.074    0.92   0.023
2(5)     Weighted        75 %    0.034    0.95   0.010
3(10)    Non-weighted    72 %    0.040    0.94   0.011
4(10)    KNN             66 %    0.047    0.97   0.016
5(10)    Voting          64 %    0.044    0.86   0.016
6(10)    Neural network  64 %    0.063    0.94   0.011
7(10)    QDA             61 %    0.071    0.97   0.008
8(10)    ELM             60 %    0.066    0.91   0.020
9(13)    SVM             52 %    0.122    0.84   0.069
10(16)   Logit           44 %    0.055    0.90   0.012
11(16)   Random forest   39 %    0.162    0.94   0.010
12(16)   Logit LASSO     37 %    0.054    0.87   0.010
13(16)   Naive Bayes     24 %    0.076    0.86   0.015
14(16)   LDA             23 %    0.064    0.83   0.013
15(16)   Classif. tree   22 %    0.108    0.75   0.059
16(-)    Signaling       -39 %   0.057    0.62   0.007

RiskLab Model output uncertainty
Probabilities for UK & SWE, real-time recursive exercise
Confidence bands for probabilities and thresholds
[Figure: two panels of real-time probabilities from method kknn. Panel 1: Country: United Kingdom, 2002-2014. Panel 2: Country: Sweden, 2004-2014. Vertical axis: probability, 0.0 to 1.0. Legend: probability, insignificant probability, threshold, crisis, pre-crisis.]

RiskLab Model output uncertainty
Rank  Method          All Ur(µ)  Sig Ur(µ)
1     KNN             92 %       93 %
2     SVM             91 %       100 %
3     Neural network  90 %       100 %
4     ELM             88 %       100 %
5     Random forest   82 %       100 %
6     Weighted        88 %       94 %
8     Best-of         84 %       97 %
9     Non-weighted    83 %       92 %
10    QDA             79 %       88 %
11    Classif. tree   64 %       82 %
12    Naive Bayes     60 %       75 %
13    Logit           54 %       56 %
14    Logit LASSO     53 %       58 %
15    LDA             48 %       55 %
16    Signaling       4 %        -7 %

RiskLab Conclusion
A three-fold contribution...
Objectively test many methods for early-warning analysis [1]
Introduce ensemble learning to early-warning analysis
Estimate model performance and output uncertainty
...and conclusion
Machine and ensemble learning approaches perform well
Aggregation decreases variation in model performance
Accounting for output uncertainty improves model performance

RiskLab
Thanks for your attention!

RiskLab Variables
Variable name | Definition | Transformation and additional information
House prices to income | Nominal house prices and nominal disposable income per head | Ratio, index based in 2010
Current account to GDP | Nominal current account balance and nominal GDP | Ratio
Government debt to GDP | Nominal general government consolidated gross debt and nominal GDP | Ratio
Debt to service ratio | Debt service costs and nominal income of households and non-financial corporations | Ratio
Loans to income | Nominal household loans and gross disposable income | Ratio
Credit to GDP | Nominal total credit to the private non-financial sector and nominal GDP | Ratio
Bond yield | Real long-term government bond yield | Level
GDP growth | Real gross domestic product | 1-year growth rate
Credit growth | Real total credit to private non-financial sector | 1-year growth rate
Inflation | Real consumer price index | 1-year growth rate
House price growth | Real residential property price index | 1-year growth rate
Stock price growth | Real stock price index | 1-year growth rate
Credit to GDP gap | Nominal bank credit to the private non-financial sector and nominal GDP | Absolute deviation from trend, λ = 400,000
House price gap | Deviation from trend of the real residential property price index | Relative deviation from trend, λ = 400,000

RiskLab Machine learning
Unsupervised learning
Exploring the past
Univariate, bivariate and multivariate
Supervised learning
Predicting the future
Regression and classification

RiskLab Predictive modelling
Examples of approaches for supervised learning:
linear discriminant analysis
logit analysis
decision trees
artificial neural networks
support vector machines
As well as ensembles of multiple models

RiskLab Bias vs. variance
Model fit: Opportunity and risk
ANNs are universal approximators for any continuous function
Logit analysis tends to be robust on any sample
Bias: error from erroneous assumptions in the learning
algorithm (underfit)
Variance: error from sensitivity to small fluctuations in the
training set (overfit)
Regularize complexity with model selection criteria
Cross-validation: partitioning into folds and testing on the fold
left out
but also leave-one-out CV, AIC, BIC etc
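
The cross-validation scheme described above (partition into folds, test on the fold left out, repeat with new partitions) can be sketched as follows. The `evaluate` callback is a hypothetical placeholder that would fit a model on the training indices and return a score, such as relative usefulness, on the test indices.

```python
import random
from typing import Callable, List

def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def repeated_cv(n: int, k: int, repeats: int,
                evaluate: Callable[[List[int], List[int]], float]) -> List[float]:
    """Run k-fold CV `repeats` times with different partitions.

    Returns one score per fold and repetition; the spread of these scores
    is an empirical resampling distribution of model performance.
    """
    scores = []
    for r in range(repeats):
        folds = k_fold_indices(n, k, seed=r)
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            scores.append(evaluate(train, test))
    return scores
```

Every observation is used for testing exactly once per repetition, and never in the same fit that it is tested on.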

RiskLab What is an ANN?
ANNs are composed of nodes connected by links
Layers of nodes: Input, hidden and output
Link weights are network parameters that are tuned iteratively
by a learning algorithm
Optimization to update network parameters
Commonly backpropagation to compute the actual gradients
Derivative of the cost function with respect to the weights
Update weights in a gradient-related direction
Optimization through gradient descent, Levenberg-Marquardt,
Gauss-Newton, ML, etc
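
To illustrate the iterative weight updates described above, the sketch below applies plain batch gradient descent to the simplest possible network: a single sigmoid output node with no hidden layer. This is a deliberate simplification, not backpropagation through hidden layers; the cross-entropy cost, learning rate and number of epochs are assumptions.

```python
import math
from typing import List, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_unit(xs: List[List[float]], ys: List[int],
               lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights w and intercept b of one sigmoid node by gradient descent."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # For cross-entropy cost, dCost/dz = prediction - target.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Move the parameters against the average gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w: List[float], b: float, x: List[float]) -> float:
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Example: learn a logical OR of two binary inputs.
w, b = train_unit([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 1, 1, 1])
```

With hidden layers, the same update rule applies, but the gradients for earlier layers are obtained by propagating the error backwards through the network.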

RiskLab Logit/LDA vs. ANN
Logit/LDA through ANNs
Input: $x_1, x_2, x_3$ (and intercept $b$)
Output: $h_{w,b}(x) = f(w^T x) = f\left(\sum_{i=1}^{3} w_i x_i + b\right)$
Let $f(\cdot)$ be a sigmoidal function: $f(z) = \frac{1}{1 + \exp(-z)}$
Or a step function with threshold $\theta$: $f(z) = 1$ if $z \geq \theta$, and $f(z) = 0$ if $z < \theta$
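
The single-node network above amounts to a weighted sum passed through an activation function. A minimal sketch, with hypothetical function names and arbitrary example weights:

```python
import math
from typing import Callable, List

def sigmoid(z: float) -> float:
    """Sigmoidal activation: maps any real z into (0, 1), as in logit."""
    return 1.0 / (1.0 + math.exp(-z))

def make_step(theta: float) -> Callable[[float], int]:
    """Step activation: 1 if z >= theta, else 0."""
    return lambda z: 1 if z >= theta else 0

def unit_output(x: List[float], w: List[float], b: float,
                f: Callable[[float], float]) -> float:
    """Compute f(sum_i w_i * x_i + b) for one node."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)

# Three inputs, arbitrary illustrative weights.
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.0]
p = unit_output(x, w, 0.0, sigmoid)              # probability-like output
signal = unit_output(x, w, 0.0, make_step(0.0))  # binary output
```

With the sigmoid, the node produces the same functional form as a logit model; with the step function, it produces a linear classifier with a hard 0/1 output.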

RiskLab What is a Random Forest?
Decision tree
Top-down approach by splitting data into two classes
Sequential signal extraction
Trees are grown as long as it benefits the classification
This might lead to overfitting: pruned via CV to generalize
Random Forest: Bagging of decision trees
Draw samples with replacement and m variables from data
Estimate decision tree models for each resampling
Use voting to combine model output
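
The bagging steps above can be sketched with very small trees. As a simplification, each tree here is a one-split decision stump, and the m variables are drawn once per tree rather than at every split; all names and defaults are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int, int]  # (feature, cut, class if <= cut, class if > cut)

def fit_stump(X: List[List[float]], y: List[int],
              features: Sequence[int]) -> Stump:
    """Pick the single split with the fewest misclassifications."""
    best, best_err = None, None
    for j in features:
        for cut in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                err = sum((left if row[j] <= cut else right) != t
                          for row, t in zip(X, y))
                if best_err is None or err < best_err:
                    best, best_err = (j, cut, left, right), err
    return best

def predict_stump(stump: Stump, row: List[float]) -> int:
    j, cut, left, right = stump
    return left if row[j] <= cut else right

def fit_forest(X: List[List[float]], y: List[int], n_trees: int = 25,
               m: int = 1, seed: int = 0) -> List[Stump]:
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        features = rng.sample(range(p), m)           # draw m variables
        forest.append(fit_stump([X[i] for i in rows],
                                [y[i] for i in rows], features))
    return forest

def predict_forest(forest: List[Stump], row: List[float]) -> int:
    """Combine the trees by majority vote."""
    votes = sum(predict_stump(s, row) for s in forest)
    return 1 if 2 * votes > len(forest) else 0
```

Because each tree sees a different resample and a different subset of variables, the trees' errors are less correlated, which is where the gain from voting comes from.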

RiskLab Ensemble learning
Simultaneous use of multiple statistical learning algorithms to
improve predictive performance
Often gains in accuracy, generalization and robustness
Gains from uncorrelated output/diversity
Bagging: aggregate (var/obs) resampled models into one
model output
Boosting: output from multiple models averaged with specified
weights
Stacking: another layer of models on top of individual model
output

RiskLab Model uncertainty
General procedure applied to model performance & output
Estimate SE from empirical resampling distributions
Find critical t values from the empirical distribution
Perform mean-comparison tests as overlapping confidence
intervals do not assure statistical significance
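
A minimal sketch of the mean-comparison step, assuming that each model's resampled scores (for example relative usefulness over repeated CV folds or bootstrap draws) are available as a list. The SE is taken as the standard deviation of the resampling distribution, the two distributions are treated as independent, and the critical value is supplied by the caller; these are assumptions, and the function names are hypothetical.

```python
import math
import statistics
from typing import List

def resampling_se(scores: List[float]) -> float:
    """Standard error estimated as the spread of the resampled scores."""
    return statistics.stdev(scores)

def mean_comparison_t(a: List[float], b: List[float]) -> float:
    """t statistic for the difference in mean scores of two models."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(resampling_se(a) ** 2 + resampling_se(b) ** 2)
    return diff / se

def is_significant(a: List[float], b: List[float], critical: float) -> bool:
    """Two-sided test against a caller-supplied critical value."""
    return abs(mean_comparison_t(a, b)) > critical
```

The same comparison can be applied to model output, by testing whether the resampled probabilities for a given country and quarter differ from the resampled thresholds.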

RiskLab Model selection
Method             Parameters
Signal extraction  Debt service ratio
LASSO              λ = 0.0012
KNN                k = 2; Distance = 1
Random forest      No. of trees = 180; No. of predictors sampled = 5
ANN                No. of hidden layer units = 8; Max no. of iterations = 200
ELM                No. of hidden layer units = 300; Activation function = Tan-sig
SVM                γ = 0.4; Cost = 1
