The document discusses risk management in commercial lending portfolios with small time series datasets. It aims to show that time series models are more accurate than expected loss models for forecasting portfolio losses. It also proposes a methodology to develop time series models with less than 50 observations. The methodology involves disaggregating quarterly loss data into simulated monthly observations to increase the dataset size. Time series models are then used to forecast monthly losses up to Q4 2015, which are aggregated to obtain quarterly and 12-month loss forecasts. The results are compared to expected loss model forecasts to evaluate accuracy.
1. RISK MANAGEMENT IN A COMMERCIAL LENDING PORTFOLIO WITH TIME
SERIES AND SMALL DATASETS
TANMOY GANGULI,
ASSISTANT MANAGER, FINANCIAL SERVICES ANALYTICS
GENPACT, KOLKATA
GLOBSYN MANAGEMENT CONFERENCE 2015
2. BACKGROUND OF THE WORK
• Forecasting losses for commercial portfolios using time series is a major challenge, given the non-availability of sufficient volumes of historical data.
• Businesses employ standard loss forecasting procedures such as the Net Flow Rate method, Vintage Loss curves, Score Distributions etc. when transaction-level data is available.
• Many international financial institutions provide consultants with data on “Next_12_months_loss_percentage”. This is a forward-looking measure of portfolio loss percentages.
• This variable condenses the dataset from transaction-level data to quarterly reported data. The number of data points shrinks to 20-25.
3. BACKGROUND OF THE WORK
• Most risk managers prefer to use the Expected Loss approach to estimate losses over the next 12-month window and to back-test it against the historical actual “Next_12_months_loss” values.
• The Expected Loss approach is a Basel-compliant approach which uses the Probability of Default (PD), Loss Given Default (LGD) and Exposure at Default (EAD) to compute the Expected Loss (EL): EL = PD * LGD * EAD.
• Two main limitations of the Expected Loss approach are: (1) a high coverage ratio compared to other loss forecasting models; (2) heavy reliance on historical portfolio information, so it does not incorporate the most recent changes in the portfolio or the macroeconomic environment.
• Time series models are an important alternative to the Expected Loss approach for forecasting portfolio losses.
4. OBJECTIVES OF THE WORK
THERE ARE TWO
IMPORTANT OBJECTIVES OF
THE WORK
First, to show that time series models are more accurate than the EL model for forecasting expected portfolio losses
Second, to propose an alternative methodology for developing a time series model on a small dataset with fewer than 50 observations
Does the time series model
perform better than EL model
during crisis periods?
Is the coverage ratio more
economical under the time series
model?
What are the methods of
developing time series
models in small datasets?
Why is the present model
most suited and what are its
steps of development?
5. DATA DESCRIPTION AND PORTFOLIO SYNOPSIS
HEALTHCARE FINANCIAL
SERVICES
LEVERAGED LOANS
NON-LEVERAGED
LOANS
ASSET BASED LOANS CASH FLOW LOANS
Cash flow loans are the focus of this analysis. Loss percentages are forecast for the cash flow loan segment.
6. DATA DESCRIPTION AND PORTFOLIO SYNOPSIS
NEXT 12 MONTHS LOSS
PERCENTAGE
A forward-looking measure of the actual loss percentage of a portfolio
It is calculated over a rolling period of four quarters. It shows the loss that the portfolio can incur over the next 12 months, standing at the ‘As on date’
Next_12_months_loss_percentage Q1 2008 = actual loss percentage Q1 2008 + actual loss percentage Q2 2008 + actual loss percentage Q3 2008 + actual loss percentage Q4 2008
ACTUAL LOSS PERCENTAGE
Actual loss = Life time net
write off from the defaulters
in the next 12 months
Actual loss = 12 months net
write off from the defaulters
in the next 12 months
EXPECTED LOSS
Losses expected to occur from the existing obligors (on-books obligors) over the next 12 months. As per Basel norms, Expected Losses are calculated as: PD * LGD * EAD
7. DATA DESCRIPTION AND PORTFOLIO SYNOPSIS
Quarter   Actual loss percentage   Next 12 months loss percentage
2008Q1    1%                       8.5%
2008Q2    2%                       11.5%
2008Q3    3%                       14.5%
2008Q4    2.5%                     13.5%
2009Q1    4%                       -
2009Q2    5%                       -
2009Q3    2%                       -
The next 12 months' actual loss percentage is based on a rolling four-quarter window. Blanks occur for the last quarters because the window cannot be rolled beyond the end of the data.
Under prediction of reserves and
bankruptcy
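The rolling calculation in the table above can be sketched in a few lines of Python. This is only an illustration of the definition given on the previous slide; the function name `next_12_months_loss` is an assumption made here, and the inputs are the illustrative percentages from the table.

```python
def next_12_months_loss(quarterly_loss):
    """Forward-looking sum of each quarter and the three quarters after it.

    Returns None where fewer than four quarters remain, which corresponds
    to the blank ("-") entries in the table.
    """
    window = 4
    result = []
    for i in range(len(quarterly_loss)):
        chunk = quarterly_loss[i:i + window]
        result.append(round(sum(chunk), 4) if len(chunk) == window else None)
    return result


quarters = ["2008Q1", "2008Q2", "2008Q3", "2008Q4", "2009Q1", "2009Q2", "2009Q3"]
actual_loss = [1.0, 2.0, 3.0, 2.5, 4.0, 5.0, 2.0]  # percent, from the table

for quarter, value in zip(quarters, next_12_months_loss(actual_loss)):
    print(quarter, "-" if value is None else f"{value}%")
```

The first four quarters reproduce the 8.5%, 11.5%, 14.5% and 13.5% shown in the table; the last three stay blank.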
8. DATA DESCRIPTION AND PORTFOLIO SYNOPSIS
Quarter      Next_12 Month Loss   Weighted PD   Weighted LGD   Expected Loss %
31-03-2008   0.74%                3.09%         12.23%         0.38%
30-06-2008   0.79%                3.26%         11.10%         0.36%
30-09-2008   0.71%                3.55%         11.41%         0.41%
31-12-2008   0.66%                8.21%         11.59%         0.95%
31-03-2009   0.68%                9.20%         11.00%         1.01%
30-06-2009   0.52%                8.67%         11.74%         1.02%
30-09-2009   0.55%                7.69%         16.84%         1.29%
31-12-2009   0.53%                7.01%         17.82%         1.25%
31-03-2010   0.38%                5.66%         18.57%         1.05%
30-06-2010   0.45%                4.25%         18.49%         0.79%
30-09-2010   0.60%                4.07%         17.15%         0.70%
31-12-2010   0.52%                3.44%         19.52%         0.67%
31-03-2011   0.76%                2.59%         18.86%         0.49%
30-06-2011   0.79%                2.52%         18.86%         0.48%
30-09-2011   0.50%                2.91%         18.08%         0.53%
31-12-2011   0.40%                3.00%         18.01%         0.54%
31-03-2012   0.26%                2.45%         17.58%         0.43%
30-06-2012   0.20%                2.23%         18.09%         0.40%
30-09-2012   0.18%                2.20%         17.66%         0.39%
31-12-2012   0.21%                2.16%         18.03%         0.39%
31-03-2013   0.34%                1.87%         17.24%         0.32%
30-06-2013   0.17%                1.78%         17.23%         0.31%
30-09-2013   -                    1.66%         17.86%         0.30%
31-12-2013   -                    1.88%         18.09%         0.34%
31-03-2014   -                    1.86%         18.07%         0.34%
30-06-2014   -                    2.21%         17.94%         0.40%
Next 12 month loss percentages and Expected loss from 2008-2014
PD = Probability of Default
LGD = Loss Given Default
EAD = Exposure at Default
Weight for the i-th obligor = EAD of the i-th obligor / sum of the EAD over the portfolio.
EL = Expected Loss % (Weighted PD * Weighted LGD)
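As a minimal sketch of the weighting described above, the snippet below computes EAD weights, the weighted PD and LGD, and the resulting Expected Loss percentage. The three obligors and their PD, LGD and EAD values are hypothetical and only illustrate the arithmetic; the last lines check the formula against the first row of the table (31-03-2008).

```python
def ead_weights(eads):
    """Weight of each obligor = its EAD / total portfolio EAD."""
    total = sum(eads)
    return [ead / total for ead in eads]


def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical obligor-level inputs (not from the portfolio in the slides)
pd_values = [0.02, 0.05, 0.03]
lgd_values = [0.10, 0.15, 0.12]
ead_values = [500.0, 300.0, 200.0]

weights = ead_weights(ead_values)
weighted_pd = weighted_average(pd_values, weights)
weighted_lgd = weighted_average(lgd_values, weights)
expected_loss_pct = weighted_pd * weighted_lgd
print(f"EL % = {expected_loss_pct:.4%}")

# Check against the table row for 31-03-2008: 3.09% * 12.23% is about 0.38%
row_el = 0.0309 * 0.1223
print(f"{row_el:.2%}")
```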
9. TIME SERIES V/S EXPECTED LOSS – A COMPARATIVE ANALYSIS
TIME SERIES MODELS
1. The main advantage of a time series based loss forecasting model is that it uses loss information from the most recent period back through a substantial portion of history (AR terms), the impact of past forecast errors (MA terms), and information on relevant exogenous variables.
2. A further advantage of time series models is that they use the actual realised values of a variable, so the most recent actual information can be used.
EXPECTED LOSS MODELS
1. The feeder PD, LGD and EAD models are based on portfolio information that is at least 12 months old, so the most recent portfolio characteristics are not captured, given the Basel requirement of a 12-month performance period. So for predicting the expected losses for the year 2015 using historical data from 2008Q1 to 2014Q4, information up to Q1 2014 can be used, at best.
2. The Expected Loss approach is over-conservative and has a coverage ratio of much more than 100%.
10. METHODOLOGY AND RESULTS
Disaggregate the Next_12_months_loss_percentage to obtain the quarterly data points
Simulate the monthly observations from the quarterly data points, using the quarterly mean and variance. The monthly values must add up to the loss percentage for the quarter
Estimate the monthly losses up to Q4 2015 using time series models, aggregate them to obtain quarterly loss estimates, and then aggregate those to obtain the Next_12_months_loss_percentage
This disaggregation is done to increase the number of data points. With the given number of data points it is not possible to develop a time series model: the Box-Jenkins criterion of 50 observations is not met.
11. METHODOLOGY AND RESULTS
DISAGGREGATING THE DATA TO QUARTERLY LEVEL FROM NEXT_12_MONTH_LOSS VARIABLE
Next 12 months loss (Q1 2008) = Loss percent in Q1 2008 + Loss percent in Q2 2008 + Loss percent in Q3 2008 + Loss percent in Q4 2008 (marked in blue in the table)
Next 12 months loss (Q2 2008) = Loss percent in Q2 2008 + Loss percent in Q3 2008 + Loss percent in Q4 2008 + Loss percent in Q1 2009 (marked in green in the table)
Subtracting the first equation from the second gives the year-on-year change: Loss percent in Q1 2009 - Loss percent in Q1 2008 = Next 12 months loss (Q2 2008) - Next 12 months loss (Q1 2008). Hence:
Actual loss percentage Q1 2009 = Actual loss percentage in Q1 2008 + year_on_year change in actual loss percentage
To obtain values for 2008 from this recursion relation, we would need values from 2007, which are not available.
So initial conditions must be assigned for Q1-Q4 2008. But how?
12. METHODOLOGY AND RESULTS
To assign the initial condition, there are two main steps:
1. Analyse the distribution of the Next_12_months_loss percentage
2. Take the Average of the Next_12_months_loss percentage at the reporting point from Q1
2008 – Q4 2008.
Quarter      Next_12 Month Loss   Average   Q1 Average Loss   Q2 Average Loss   Q3 Average Loss   Q4 Average Loss   Quarterly loss estimate for 2008 (G.M.)   Quarterly loss estimate for 2008 (A.M.)
3/31/2008    0.74%                0.00186   0.185%            -                 -                 -                 0.185%                                    0.185%
6/30/2008    0.79%                0.002     0.185%            0.198%            -                 -                 0.191%                                    0.191%
9/30/2008    0.71%                0.0018    0.185%            0.198%            0.178%            -                 0.186%                                    0.187%
12/31/2008   0.66%                0.0017    0.185%            0.198%            0.178%            0.165%            0.181%                                    0.181%
3/31/2009    0.68%                0.0017    -                 0.198%            0.178%            0.165%            -                                         -
6/30/2009    0.52%                0.0013    -                 -                 0.178%            0.165%            -                                         -
9/30/2009    0.55%                0.0014    -                 -                 -                 0.165%            0.74%                                     0.74%
The average can be justified if the distribution is at least approximately normal**
** Normality Results of the Next_12_months_loss percentages are reported in the next slide
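The initial conditions in the table can be reproduced with a short sketch, assuming (as the columns suggest) that each reporting point's Next_12_months_loss is first divided by four and that the 2008 quarterly estimate is the running geometric mean (G.M.) or arithmetic mean (A.M.) of those averages. The helper name `running_means` is introduced here for illustration only.

```python
import math

# Next_12_months_loss at the Q1-Q4 2008 reporting points, in percent
n12_2008 = [0.74, 0.79, 0.71, 0.66]

# Average loss per quarter implied by each reporting point (N12 / 4)
quarter_avg = [value / 4 for value in n12_2008]


def running_means(values):
    """Geometric and arithmetic mean of the first k values, for k = 1..n."""
    geometric, arithmetic = [], []
    for k in range(1, len(values) + 1):
        head = values[:k]
        geometric.append(math.exp(sum(math.log(v) for v in head) / k))
        arithmetic.append(sum(head) / k)
    return geometric, arithmetic


gm, am = running_means(quarter_avg)
for label, g, a in zip(["Q1", "Q2", "Q3", "Q4"], gm, am):
    print(f"{label} 2008: G.M. {g:.3f}%  A.M. {a:.3f}%")
```

Rounded to three decimals, the geometric means come out as 0.185, 0.191, 0.186 and 0.181, matching the G.M. column, and the arithmetic means as 0.185, 0.191, 0.187 and 0.181, matching the A.M. column.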
13. METHODOLOGY AND RESULTS
Tests for Normality
Test Statistic p Value
Shapiro-Wilk W 0.933012 Pr < W 0.1418
Kolmogorov-Smirnov D 0.10162 Pr > D >0.1500
Cramer-von Mises W-Sq 0.048849 Pr > W-Sq >0.2500
Anderson-Darling A-Sq 0.401074 Pr > A-Sq >0.2500
H0: The Next_12_month_loss is normally distributed
v/s
HA: The Next_12_month_loss is not normally distributed
All four p-values exceed 0.05, so H0 is not rejected at the 5% level.
Quarter      Actual loss %age   Next_12_month_loss
3/31/2008    0.185%             0.74%
6/30/2008    0.191%             0.79%
9/30/2008    0.186%             0.71%
12/31/2008   0.181%             0.66%
3/31/2009    0.235%             0.68%
6/30/2009    0.111%             0.52%
9/30/2009    0.136%             0.55%
12/31/2009   0.201%             0.53%
3/31/2010    0.075%             -
6/30/2010    0.141%             -
9/30/2010    0.116%             -
The sum of the quarterly loss values generated must equal the Next_12_month_loss percentage
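The quarterly values in the table above follow from the year-on-year recursion once the four 2008 initial conditions are fixed. The sketch below is one way to write that recursion; the function name `disaggregate` is an assumption made here, and the inputs are the Next_12_month_loss values and 2008 estimates taken from the table.

```python
def disaggregate(n12, initial_quarters):
    """Recover quarterly losses from rolling four-quarter sums.

    Because consecutive rolling sums share three quarters,
    loss[t + 4] = loss[t] + (n12[t + 1] - n12[t]).
    """
    loss = list(initial_quarters)
    for t in range(len(n12) - 1):
        loss.append(round(loss[t] + (n12[t + 1] - n12[t]), 3))
    return loss


# Next_12_month_loss (percent) at reporting points Q1 2008 to Q4 2009
n12 = [0.74, 0.79, 0.71, 0.66, 0.68, 0.52, 0.55, 0.53]
# Initial conditions for Q1-Q4 2008 (percent), from the table
initial = [0.185, 0.191, 0.186, 0.181]

quarterly = disaggregate(n12, initial)
print(quarterly)
```

With these inputs the recursion returns the 2009 and 2010 values listed in the table (0.235, 0.111, 0.136, 0.201, 0.075, 0.141, 0.116).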
14. METHODOLOGY AND RESULTS
OBTAINING MONTHLY DATA POINTS FROM THE QUARTERLY LOSS PERCENTAGES
To obtain the monthly data from the quarterly data points, the important steps are:
1. Analyse the distribution of the quarterly data points.
2. Identify the quarterly mean and variance. The variance of the quarterly losses has been obtained in discussion with the clients.
3. Using the quarterly mean and variance, 250 trials of random numbers have been generated, each trial containing 3 observations (since each quarter has three months).
4. The trials with sum equal to the quarterly loss percentage for a given time point are chosen. This filter had to be applied as the monthly loss percentages must add up to the quarterly actual loss percentage.
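The four steps can be sketched as follows. Several details here are assumptions, not the author's stated method: the draws are taken from a normal distribution, the monthly mean is set to one third of the quarterly loss, the standard deviation `monthly_sd` is a hypothetical stand-in for the client-provided variance, trials are accepted when their sum is within a small tolerance `tol` of the quarterly loss (an exact match is not practical with continuous draws), and the accepted trials are averaged and rescaled so the months add up exactly.

```python
import random


def simulate_monthly(quarter_loss, monthly_sd, n_trials=250, tol=0.005, seed=0):
    """Simulate three monthly losses that add up to the quarterly loss."""
    rng = random.Random(seed)
    monthly_mean = quarter_loss / 3
    trials = [[rng.gauss(monthly_mean, monthly_sd) for _ in range(3)]
              for _ in range(n_trials)]

    # Keep non-negative trials whose sum is close to the quarterly loss
    kept = [t for t in trials
            if min(t) >= 0 and abs(sum(t) - quarter_loss) <= tol]
    if not kept:
        # Fallback (assumption): use the single trial with the closest sum
        kept = [min(trials, key=lambda t: abs(sum(t) - quarter_loss))]

    # Average the accepted trials month by month, then rescale so the
    # three months sum exactly to the quarterly loss
    averaged = [sum(t[m] for t in kept) / len(kept) for m in range(3)]
    scale = quarter_loss / sum(averaged)
    return [value * scale for value in averaged]


# Q1 2008 quarterly loss of 0.185% with a hypothetical monthly s.d. of 0.01
months = simulate_monthly(0.185, monthly_sd=0.01)
print([round(m, 3) for m in months], round(sum(months), 3))
```

The rescaling step is a design choice to enforce the adding-up constraint; a tighter tolerance with more trials would serve the same purpose.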
Month Average monthly losses Quarterly loss
1/31/2008 0.062%
2/28/2008 0.068%
3/31/2008 0.055% 0.185%
4/30/2008 0.056%
5/31/2008 0.069%
6/30/2008 0.066% 0.191%
7/31/2008 0.069%
8/31/2008 0.064%
9/30/2008 0.053% 0.186%
10/31/2008 0.054%
11/30/2008 0.063%
12/31/2008 0.064% 0.181%
1/31/2009 0.066%
2/28/2009 0.106%
3/31/2009 0.064% 0.235%
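As a consistency check on the table above, the monthly values can be aggregated back to quarters and then to a four-quarter sum. This is a small sketch using the rounded figures shown on the slide; the helper name `to_quarters` is hypothetical.

```python
def to_quarters(monthly_values):
    """Sum consecutive groups of three months into quarterly values."""
    if len(monthly_values) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [round(sum(monthly_values[i:i + 3]), 3)
            for i in range(0, len(monthly_values), 3)]


# Average monthly losses (percent), January 2008 to March 2009, from the table
monthly = [0.062, 0.068, 0.055,
           0.056, 0.069, 0.066,
           0.069, 0.064, 0.053,
           0.054, 0.063, 0.064,
           0.066, 0.106, 0.064]

quarterly = to_quarters(monthly)
next_12_q1_2008 = round(sum(quarterly[:4]), 2)
print(quarterly, next_12_q1_2008)
```

The four 2008 quarters aggregate to 0.185, 0.191, 0.186 and 0.181, and their sum rounds to the 0.74% reported for Q1 2008. The rounded Q1 2009 months add up to 0.236 against the 0.235% in the table, presumably a rounding effect in the displayed monthly values.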
16. METHODOLOGY AND RESULTS
The time series model performs better than the Expected Loss model because:
1. It is a better predictor of losses during the crisis period.
2. It does not require firms to build up unnecessary reserves, so it is not over-conservative.
3. It gives a better prediction of losses than EL.
17. METHODOLOGY AND RESULTS
Metrics                                   Next_12_months_loss_ARIMA   Next_12_month_loss_EL   Next_12_months_loss_ARIMAX
Total Number of Quarters                  26                          26                      26
Mean Absolute Error (MAE)                 0.0011                      0.0030                  0.0009
Mean Absolute Percentage Error (MAPE)     57%                         97%                     51.6%
Number of quarters with underprediction   11                          6                       10
Average Extent of Underprediction         -0.07%                      -0.29%                  0.000719765
The results of the ARIMA and ARIMAX models are comparable. Both models perform better than the Expected Loss model. The ARIMAX model captures the loss percentages better for the financial crisis period.
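The back-testing metrics in the table can be computed along the following lines. This is a sketch, not the author's code: the function name `backtest_metrics` and the sign convention (error = forecast minus actual, so underprediction is negative) are assumptions, and the inputs are the 22 quarters of rounded Next_12 Month Loss and Expected Loss % values from the earlier data table for which an actual value is reported.

```python
def backtest_metrics(actual, forecast):
    """MAE, MAPE and underprediction statistics for a forecast series."""
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)
    under = [e for e in errors if e < 0]  # forecast below actual
    return {
        "n": n,
        "mae": sum(abs(e) for e in errors) / n,
        "mape": sum(abs(e) / a for e, a in zip(errors, actual)) / n,
        "n_under": len(under),
        "avg_under": sum(under) / len(under) if under else 0.0,
    }


# Percent values, 31-03-2008 to 30-06-2013
actual = [0.74, 0.79, 0.71, 0.66, 0.68, 0.52, 0.55, 0.53, 0.38, 0.45, 0.60,
          0.52, 0.76, 0.79, 0.50, 0.40, 0.26, 0.20, 0.18, 0.21, 0.34, 0.17]
expected_loss = [0.38, 0.36, 0.41, 0.95, 1.01, 1.02, 1.29, 1.25, 1.05, 0.79, 0.70,
                 0.67, 0.49, 0.48, 0.53, 0.54, 0.43, 0.40, 0.39, 0.39, 0.32, 0.31]

metrics = backtest_metrics(actual, expected_loss)
print(metrics)
```

On these rounded inputs the MAE is about 0.30 percentage points (0.0030 as a fraction) and six quarters are underpredicted, in line with the EL column. The remaining figures need not match the slide exactly, since the table values are rounded and cover fewer quarters than the 26 reported.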