SlideShare a Scribd company logo
1 of 30
Download to read offline
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Actuariat de l’Assurance Non-Vie # 9
A. Charpentier (UQAM & Université de Rennes 1)
ENSAE ParisTech, Octobre 2015 - Janvier 2016.
http://freakonometrics.hypotheses.org
@freakonometrics 1
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Fourre-Tout sur la Tarification
• modèle collectif vs. modèle individuel
• cas de la grande dimension
• choix de variables
• choix de modèles
@freakonometrics 2
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Modèle individuel ou modèle collectif ? La loi Tweedie
Consider a Tweedie distribution, with variance function power p ∈ (1, 2), mean µ
and scale parameter φ, then it is a compound Poisson model,
• N ∼ P(λ) with λ =
φµ2−p
2 − p
• Yi ∼ G(α, β) with α = −
p − 2
p − 1
and β =
φµ1−p
p − 1
Consversely, consider a compound Poisson model N ∼ P(λ) and Yi ∼ G(α, β),
then
• variance function power is p =
α + 2
α + 1
• mean is µ =
λα
β
• scale parameter is φ =
[λα]
α+2
α+1 −1
β2− α+2
α+1
α + 1
@freakonometrics 3
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Modèle individuel ou modèle collectif ? La régression Tweedie
In the context of regression
Ni ∼ P(λi) with λi = exp[XT
i βλ]
Yj,i ∼ G(µi, φ) with µi = exp[XT
i βµ]
Then Si = Y1,i + · · · + YN,i has a Tweedie distribution
• variance function power is p =
φ + 2
φ + 1
• mean is λiµi
• scale parameter is
λ
1
φ+1 −1
i
µ
φ
φ+1
i
φ
1 + φ
There are 1 + 2dim(X) degrees of freedom.
@freakonometrics 4
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Modèle individuel ou modèle collectif ? La régression Tweedie
Remark Note that the scale parameter should not depend on i.
A Tweedie regression is
• variance function power is p ∈ (1, 2)
• mean is µi = exp[XT
i βTweedie]
• scale parameter is φ
There are 2 + dim(X) degrees of freedom.
@freakonometrics 5
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Double Modèle Fr´quence - Coût Individuel
Considérons les bases suivantes, en RC, pour la fréquence
1 > freq = merge(contrat ,nombre_RC)
pour les coûts individuels
1 > sinistre_RC = sinistre [( sinistre$garantie =="1RC")&(sinistre$cout >0)
,]
2 > sinistre_RC = merge(sinistre_RC ,contrat)
et pour les co ûts agrégés par police
1 > agg_RC = aggregate(sinistre_RC$cout , by=list(sinistre_RC$nocontrat)
, FUN=’sum’)
2 > names(agg_RC)=c(’nocontrat ’,’cout_RC’)
3 > global_RC = merge(contrat , agg_RC , all.x=TRUE)
4 > global_RC$cout_RC[is.na(global_DO$cout_RC)]=0
@freakonometrics 6
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Double Modèle Fr´quence - Coût Individuel
1 > library(splines)
2 > reg_f = glm(nb_RC~zone+bs( ageconducteur )+carburant , offset=log(
exposition),data=freq ,family=poisson)
3 > reg_c = glm(cout~zone+bs( ageconducteur )+carburant , data=sinistre_RC
,family=Gamma(link="log"))
Simple Modèle Coût par Police
1 > library(tweedie)
2 > library(statmod)
3 > reg_a = glm(cout_RC~zone+bs( ageconducteur )+carburant , offset=log(
exposition),data=global_RC ,family=tweedie(var.power =1.5 , link.
power =0))
@freakonometrics 7
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Comparaison des primes
1 > freq2 = freq
2 > freq2$exposition = 1
3 > P_f = predict(reg_f,newdata=freq2 ,type="response")
4 > P_c = predict(reg_c,newdata=freq2 ,type="response")
5 prime1 = P_f*P_c
1 > k = 1.5
2 > reg_a = glm(cout_DO~zone+bs( ageconducteur )+carburant , offset=log(
exposition),data=global_DO ,family=tweedie(var.power=k, link.power
=0))
3 > prime2 = predict(reg_a,newdata=freq2 ,type="response")
1 > arrows (1:100 , prime1 [1:100] ,1:100 , prime2 [1:100] , length =.1)
@freakonometrics 8
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Impact du degré Tweedie sur les Primes Pures
@freakonometrics 9
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
Tweedie 1
0200400600800
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
Tweedie 1−0.20.00.20.40.6
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Impact du degré Tweedie sur les Primes Pures
Comparaison des primes pures, assurés no1, no2 et no 3 (DO)
@freakonometrics 10
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
‘Optimisation’ du Paramètre Tweedie
1 > dev = function(k){
2 + reg = glm(cout_RC~zone+bs( ageconducteur )+
carburant , data=global_RC , family=
tweedie(var.power=k, link.power =0) ,
offset=log(exposition))
3 + reg$deviance
4 + }
@freakonometrics 11
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Tarification et données massives (Big Data)
Problèmes classiques avec des données massives
• beaucoup de variables explicatives, k grand, XT
X peut-être non inversible
• gros volumes de données, e.g. données télématiques
• données non quantitatives, e.g. texte, localisation, etc.
@freakonometrics 12
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
La fascination pour les estimateurs sans biais
En statistique mathématique, on aime les estimateurs sans biais car ils ont
plusieurs propriétés intéressantes. Mais ne peut-on pas considérer des estimateurs
biaisés, potentiellement meilleurs ?
Consider a sample, i.i.d., {y1, · · · , yn} with distribution N(µ, σ2
). Define
θ = αY . What is the optimal α to get the best estimator of µ ?
• bias: bias θ = E θ − µ = (α − 1)µ
• variance: Var θ =
α2
σ2
n
• mse: mse θ = (α − 1)2
µ2
+
α2
σ2
n
The optimal value is α =
µ2
µ2 +
σ2
n
< 1.
@freakonometrics 13
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Linear Model
Consider some linear model yi = xT
i β + εi for all i = 1, · · · , n.
Assume that εi are i.i.d. with E(ε) = 0 (and finite variance). Write





y1
...
yn





y,n×1
=





1 x1,1 · · · x1,k
...
...
...
...
1 xn,1 · · · xn,k





X,n×(k+1)








β0
β1
...
βk








β,(k+1)×1
+





ε1
...
εn





ε,n×1
.
Assuming ε ∼ N(0, σ2
I), the maximum likelihood estimator of β is
β = argmin{ y − XT
β 2
} = (XT
X)−1
XT
y
... under the assumtption that XT
X is a full-rank matrix.
What if XT
i X cannot be inverted? Then β = [XT
X]−1
XT
y does not exist, but
βλ = [XT
X + λI]−1
XT
y always exist if λ > 0.
@freakonometrics 14
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Ridge Regression
The estimator β = [XT
X + λI]−1
XT
y is the Ridge estimate obtained as solution
of
β = argmin
β



n
i=1
[yi − β0 − xT
i β]2
+ λ β 2
1Tβ2



for some tuning parameter λ. One can also write
β = argmin
β; β 2 ≤s
{ Y − XT
β 2
}
Remark Note that we solve β = argmin
β
{objective(β)} where
objective(β) = L(β)
training loss
+ R(β)
regularization
@freakonometrics 15
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Going further on sparcity issues
In severall applications, k can be (very) large, but a lot of features are just noise:
βj = 0 for many j’s. Let s denote the number of relevent features, with s << k,
cf Hastie, Tibshirani & Wainwright (2015),
s = card{S} where S = {j; βj = 0}
The model is now y = XT
SβS + ε, where XT
SXS is a full rank matrix.
@freakonometrics 16
q
q
= . +
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Going further on sparcity issues
Define a 0 = 1(|ai| > 0). Ici dim(β) = s.
We wish we could solve
β = argmin
β; β 0 ≤s
{ Y − XT
β 2
}
Problem: it is usually not possible to describe all possible constraints, since
s
k
coefficients should be chosen here (with k (very) large).
Idea: solve the dual problem
β = argmin
β; Y −XTβ 2 ≤h
{ β 0
}
where we might convexify the 0 norm, · 0
.
@freakonometrics 17
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Regularization 0, 1 et 2
@freakonometrics 18
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Going further on sparcity issues
On [−1, +1]k
, the convex hull of β 0
is β 1
On [−a, +a]k
, the convex hull of β 0
is a−1
β 1
Hence,
β = argmin
β; β 1 ≤˜s
{ Y − XT
β 2
}
is equivalent (Kuhn-Tucker theorem) to the Lagragian optimization problem
β = argmin{ Y − XT
β 2 +λ β 1 }
@freakonometrics 19
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
LASSO Least Absolute Shrinkage and Selection Operator
β ∈ argmin{ Y − XT
β 2
+λ β 1
}
is a convex problem (several algorithms ), but not strictly convex (no unicity of
the minimum). Nevertheless, predictions y = xT
β are unique
MM, minimize majorization, coordinate descent Hunter (2003).
@freakonometrics 20
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Optimal LASSO Penalty
Use cross validation, e.g. K-fold,
β(−k)(λ) = argmin



i∈Ik
[yi − xT
i β]2
+ λ β



then compute the sum of the squared errors,
Qk(λ) =
i∈Ik
[yi − xT
i β(−k)(λ)]2
and finally solve
λ = argmin Q(λ) =
1
K
k
Qk(λ)
Note that this might overfit, so Hastie, Tibshiriani & Friedman (2009) suggest the
largest λ such that
Q(λ) ≤ Q(λ ) + se[λ ] with se[λ]2
=
1
K2
K
k=1
[Qk(λ) − Q(λ)]2
@freakonometrics 21
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
1 > freq = merge(contrat ,nombre_RC)
2 > freq = merge(freq ,nombre_DO)
3 > freq [ ,10]= as.factor(freq [ ,10])
4 > mx=cbind(freq[,c(4,5,6)],freq [ ,9]=="D",
freq [ ,3]%in%c("A","B","C"))
5 > colnames(mx)=c(names(freq)[c(4,5,6)],"
diesel","zone")
6 > for(i in 1: ncol(mx)) mx[,i]=( mx[,i]-mean(
mx[,i]))/sd(mx[,i])
7 > names(mx)
8 [1] puissance agevehicule ageconducteur
diesel zone
9 > library(glmnet)
10 > fit = glmnet(x=as.matrix(mx), y=freq [,11],
offset=log(freq [ ,2]), family = "poisson
")
11 > plot(fit , xvar="lambda", label=TRUE)
LASSO, Fréquence RC
−10 −9 −8 −7 −6 −5
−0.20−0.15−0.10−0.050.000.050.10
Log Lambda
Coefficients
4 4 4 4 3 1
1
3
4
5
@freakonometrics 22
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
LASSO, Fréquence RC
1 > plot(fit ,label=TRUE)
2 > cvfit = cv.glmnet(x=as.matrix(mx), y=freq
[,11], offset=log(freq [ ,2]),family = "
poisson")
3 > plot(cvfit)
4 > cvfit$lambda.min
5 [1] 0.0002845703
6 > log(cvfit$lambda.min)
7 [1] -8.16453
• Cross validation curve + error bars
0.0 0.1 0.2 0.3 0.4
−0.20−0.15−0.10−0.050.000.050.10
L1 Norm
Coefficients
0 2 3 4 4
1
3
4
5
−10 −9 −8 −7 −6 −5
0.2460.2480.2500.2520.2540.2560.258
log(Lambda)
PoissonDeviance
q
q
q
q
q
q
q
q
q
q
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 2 1
@freakonometrics 23
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
1 > freq = merge(contrat ,nombre_RC)
2 > freq = merge(freq ,nombre_DO)
3 > freq [ ,10]= as.factor(freq [ ,10])
4 > mx=cbind(freq[,c(4,5,6)],freq [ ,9]=="D",
freq [ ,3]%in%c("A","B","C"))
5 > colnames(mx)=c(names(freq)[c(4,5,6)],"
diesel","zone")
6 > for(i in 1: ncol(mx)) mx[,i]=( mx[,i]-mean(
mx[,i]))/sd(mx[,i])
7 > names(mx)
8 [1] puissance agevehicule ageconducteur
diesel zone
9 > library(glmnet)
10 > fit = glmnet(x=as.matrix(mx), y=freq [,12],
offset=log(freq [ ,2]), family = "poisson
")
11 > plot(fit , xvar="lambda", label=TRUE)
LASSO, Fréquence DO
−9 −8 −7 −6 −5 −4
−0.8−0.6−0.4−0.20.0
Log Lambda
Coefficients
4 4 3 2 1 1
1
2
4
5
@freakonometrics 24
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
LASSO, Fréquence DO
1 > plot(fit ,label=TRUE)
2 > cvfit = cv.glmnet(x=as.matrix(mx), y=freq
[,12], offset=log(freq [ ,2]),family = "
poisson")
3 > plot(cvfit)
4 > cvfit$lambda.min
5 [1] 0.0004744917
6 > log(cvfit$lambda.min)
7 [1] -7.653266
• Cross validation curve + error bars
0.0 0.2 0.4 0.6 0.8 1.0
−0.8−0.6−0.4−0.20.0
L1 Norm
Coefficients
0 1 1 1 2 4
1
2
4
5
−9 −8 −7 −6 −5 −4
0.2150.2200.2250.2300.235
log(Lambda)
PoissonDeviance
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
q
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
4 4 4 4 3 3 3 3 3 2 2 1 1 1 1 1 1 1 1
@freakonometrics 25
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Model Selection and Gini/Lorentz (on incomes)
Consider an ordered sample {y1, · · · , yn}, then Lorenz curve is
{Fi, Li} with Fi =
i
n
and Li =
i
j=1 yj
n
j=1 yj
The theoretical curve, given a distribution F, is
u → L(u) =
F −1
(u)
−∞
tdF(t)
+∞
−∞
tdF(t)
see Gastwirth (1972, econpapers.repec.org)
@freakonometrics 26
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Model Selection and Gini/Lorentz (on incomes)
1 > library(ineq)
2 > set.seed (1)
3 > (x<-sort(rlnorm (5,0,1)))
4 [1] 0.4336018 0.5344838 1.2015872 1.3902836
4.9297132
5 > Lc.sim <- Lc(x)
6 > plot(Lc.sim)
7 > points ((1:4)/5,( cumsum(x)/sum(x))[1:4] , pch
=19, col="blue")
8 > lines(Lc.lognorm , parameter =1,lty =2)
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Lorenz curve
p
L(p)
q
q
q
q
@freakonometrics 27
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Model Selection and Gini/Lorentz (on incomes)
Gini index is the ratio of the areas
A
A + B
. Thus,
G =
2
n(n − 1)x
n
i=1
i · xi:n −
n + 1
n − 1
=
1
E(Y )
∞
0
F(y)(1 − F(y))dy
1 > Gini(x)
2 [1] 0.4640003
q
q
0.0 0.2 0.4 0.6 0.8 1.0
0.00.20.40.60.81.0
p
L(p)
q
q
q
q
A
B
@freakonometrics 28
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Comparing Models
Consider an ordered sample {y1, · · · , yn} of incomes,
with y1 ≤ y2 ≤ · · · ≤ yn, then Lorenz curve is
{Fi, Li} with Fi =
i
n
and Li =
i
j=1 yj
n
j=1 yj
We have observed losses yi and premiums π(xi). Con-
sider an ordered sample by the model, see Frees, Meyers
& Cummins (2014), π(x1) ≥ π(x2) ≥ · · · ≥ π(xn), then
plot
{Fi, Li} with Fi =
i
n
and Li =
i
j=1 yj
n
j=1 yj
0 20 40 60 80 100
020406080100
Proportion (%)
Income(%)
poorest ← → richest
0 20 40 60 80 100
020406080100
Proportion (%)
Losses(%)
more risky ← → less risky
@freakonometrics 29
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Choix et comparaison de modèle en tarification
See Frees et al. (2010) or Tevet (2013).
@freakonometrics 30

More Related Content

What's hot (20)

HdR
HdRHdR
HdR
 
Slides ineq-3b
Slides ineq-3bSlides ineq-3b
Slides ineq-3b
 
Slides erasmus
Slides erasmusSlides erasmus
Slides erasmus
 
Slides toulouse
Slides toulouseSlides toulouse
Slides toulouse
 
Slides erm
Slides ermSlides erm
Slides erm
 
Quantile and Expectile Regression
Quantile and Expectile RegressionQuantile and Expectile Regression
Quantile and Expectile Regression
 
Proba stats-r1-2017
Proba stats-r1-2017Proba stats-r1-2017
Proba stats-r1-2017
 
Slides ensae 11bis
Slides ensae 11bisSlides ensae 11bis
Slides ensae 11bis
 
Lundi 16h15-copules-charpentier
Lundi 16h15-copules-charpentierLundi 16h15-copules-charpentier
Lundi 16h15-copules-charpentier
 
Slides ensae 10
Slides ensae 10Slides ensae 10
Slides ensae 10
 
Graduate Econometrics Course, part 4, 2017
Graduate Econometrics Course, part 4, 2017Graduate Econometrics Course, part 4, 2017
Graduate Econometrics Course, part 4, 2017
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 
Classification
ClassificationClassification
Classification
 
Pricing Game, 100% Data Sciences
Pricing Game, 100% Data SciencesPricing Game, 100% Data Sciences
Pricing Game, 100% Data Sciences
 
Slides guanauato
Slides guanauatoSlides guanauato
Slides guanauato
 
Berlin
BerlinBerlin
Berlin
 
Slides amsterdam-2013
Slides amsterdam-2013Slides amsterdam-2013
Slides amsterdam-2013
 
Slides univ-van-amsterdam
Slides univ-van-amsterdamSlides univ-van-amsterdam
Slides univ-van-amsterdam
 
Inequality #4
Inequality #4Inequality #4
Inequality #4
 
Slides econ-lm
Slides econ-lmSlides econ-lm
Slides econ-lm
 

Viewers also liked (16)

Slides ensae-2016-6
Slides ensae-2016-6Slides ensae-2016-6
Slides ensae-2016-6
 
Slides ensae-2016-5
Slides ensae-2016-5Slides ensae-2016-5
Slides ensae-2016-5
 
Slides ensae-2016-7
Slides ensae-2016-7Slides ensae-2016-7
Slides ensae-2016-7
 
Slides ensae-2016-4
Slides ensae-2016-4Slides ensae-2016-4
Slides ensae-2016-4
 
Slides networks-2017-2
Slides networks-2017-2Slides networks-2017-2
Slides networks-2017-2
 
Slides simplexe
Slides simplexeSlides simplexe
Slides simplexe
 
Slides ads ia
Slides ads iaSlides ads ia
Slides ads ia
 
IA-advanced-R
IA-advanced-RIA-advanced-R
IA-advanced-R
 
15 03 16_data sciences pour l'actuariat_f. soulie fogelman
15 03 16_data sciences pour l'actuariat_f. soulie fogelman15 03 16_data sciences pour l'actuariat_f. soulie fogelman
15 03 16_data sciences pour l'actuariat_f. soulie fogelman
 
Slides 2040-6
Slides 2040-6Slides 2040-6
Slides 2040-6
 
Slides inequality 2017
Slides inequality 2017Slides inequality 2017
Slides inequality 2017
 
Julien slides - séminaire quantact
Julien slides - séminaire quantactJulien slides - séminaire quantact
Julien slides - séminaire quantact
 
Slides arbres-ubuntu
Slides arbres-ubuntuSlides arbres-ubuntu
Slides arbres-ubuntu
 
So a webinar-2013-2
So a webinar-2013-2So a webinar-2013-2
So a webinar-2013-2
 
Slides act6420-e2014-ts-1
Slides act6420-e2014-ts-1Slides act6420-e2014-ts-1
Slides act6420-e2014-ts-1
 
Slides 2040-4
Slides 2040-4Slides 2040-4
Slides 2040-4
 

Similar to Slides ensae-2016-9

Hands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive ModelingHands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive ModelingArthur Charpentier
 
Bayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic ModelsBayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic ModelsColin Gillespie
 
Machine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceMachine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceArthur Charpentier
 
A kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem ResolvedA kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem ResolvedKaiju Capital Management
 
Options Portfolio Selection
Options Portfolio SelectionOptions Portfolio Selection
Options Portfolio Selectionguasoni
 
Slides sales-forecasting-session2-web
Slides sales-forecasting-session2-webSlides sales-forecasting-session2-web
Slides sales-forecasting-session2-webArthur Charpentier
 
Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Asma Ben Slimene
 
Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Asma Ben Slimene
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920Karl Rudeen
 

Similar to Slides ensae-2016-9 (20)

Hands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive ModelingHands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive Modeling
 
Slides ACTINFO 2016
Slides ACTINFO 2016Slides ACTINFO 2016
Slides ACTINFO 2016
 
Bayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic ModelsBayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic Models
 
Slides ub-2
Slides ub-2Slides ub-2
Slides ub-2
 
Machine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceMachine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & Insurance
 
A kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem ResolvedA kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem Resolved
 
Side 2019 #3
Side 2019 #3Side 2019 #3
Side 2019 #3
 
Slides ineq-4
Slides ineq-4Slides ineq-4
Slides ineq-4
 
Slides lyon-anr
Slides lyon-anrSlides lyon-anr
Slides lyon-anr
 
Slides ub-3
Slides ub-3Slides ub-3
Slides ub-3
 
Basic concepts and how to measure price volatility
Basic concepts and how to measure price volatility Basic concepts and how to measure price volatility
Basic concepts and how to measure price volatility
 
Options Portfolio Selection
Options Portfolio SelectionOptions Portfolio Selection
Options Portfolio Selection
 
kactl.pdf
kactl.pdfkactl.pdf
kactl.pdf
 
Slides mevt
Slides mevtSlides mevt
Slides mevt
 
Slides sales-forecasting-session2-web
Slides sales-forecasting-session2-webSlides sales-forecasting-session2-web
Slides sales-forecasting-session2-web
 
Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...
 
Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
 
Project Paper
Project PaperProject Paper
Project Paper
 
PhDretreat2011
PhDretreat2011PhDretreat2011
PhDretreat2011
 

More from Arthur Charpentier (20)

Family History and Life Insurance
Family History and Life InsuranceFamily History and Life Insurance
Family History and Life Insurance
 
ACT6100 introduction
ACT6100 introductionACT6100 introduction
ACT6100 introduction
 
Family History and Life Insurance (UConn actuarial seminar)
Family History and Life Insurance (UConn actuarial seminar)Family History and Life Insurance (UConn actuarial seminar)
Family History and Life Insurance (UConn actuarial seminar)
 
Control epidemics
Control epidemics Control epidemics
Control epidemics
 
STT5100 Automne 2020, introduction
STT5100 Automne 2020, introductionSTT5100 Automne 2020, introduction
STT5100 Automne 2020, introduction
 
Family History and Life Insurance
Family History and Life InsuranceFamily History and Life Insurance
Family History and Life Insurance
 
Reinforcement Learning in Economics and Finance
Reinforcement Learning in Economics and FinanceReinforcement Learning in Economics and Finance
Reinforcement Learning in Economics and Finance
 
Optimal Control and COVID-19
Optimal Control and COVID-19Optimal Control and COVID-19
Optimal Control and COVID-19
 
Slides OICA 2020
Slides OICA 2020Slides OICA 2020
Slides OICA 2020
 
Lausanne 2019 #3
Lausanne 2019 #3Lausanne 2019 #3
Lausanne 2019 #3
 
Lausanne 2019 #4
Lausanne 2019 #4Lausanne 2019 #4
Lausanne 2019 #4
 
Lausanne 2019 #2
Lausanne 2019 #2Lausanne 2019 #2
Lausanne 2019 #2
 
Lausanne 2019 #1
Lausanne 2019 #1Lausanne 2019 #1
Lausanne 2019 #1
 
Side 2019 #10
Side 2019 #10Side 2019 #10
Side 2019 #10
 
Side 2019 #11
Side 2019 #11Side 2019 #11
Side 2019 #11
 
Side 2019 #12
Side 2019 #12Side 2019 #12
Side 2019 #12
 
Side 2019 #9
Side 2019 #9Side 2019 #9
Side 2019 #9
 
Side 2019 #8
Side 2019 #8Side 2019 #8
Side 2019 #8
 
Side 2019 #7
Side 2019 #7Side 2019 #7
Side 2019 #7
 
Side 2019 #6
Side 2019 #6Side 2019 #6
Side 2019 #6
 

Recently uploaded

The Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdfThe Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdfGale Pooley
 
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Q3 2024 Earnings Conference Call and Webcast Slides
Q3 2024 Earnings Conference Call and Webcast SlidesQ3 2024 Earnings Conference Call and Webcast Slides
Q3 2024 Earnings Conference Call and Webcast SlidesMarketing847413
 
Instant Issue Debit Cards - School Designs
Instant Issue Debit Cards - School DesignsInstant Issue Debit Cards - School Designs
Instant Issue Debit Cards - School Designsegoetzinger
 
OAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptx
OAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptxOAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptx
OAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptxhiddenlevers
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
The Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfThe Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfGale Pooley
 
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Delhi Call girls
 
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikHigh Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Quarter 4- Module 3 Principles of Marketing
Quarter 4- Module 3 Principles of MarketingQuarter 4- Module 3 Principles of Marketing
Quarter 4- Module 3 Principles of MarketingMaristelaRamos12
 
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130Suhani Kapoor
 
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsHigh Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
The Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfThe Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfGale Pooley
 
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...ssifa0344
 
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
VIP Kolkata Call Girl Serampore 👉 8250192130 Available With Room
VIP Kolkata Call Girl Serampore 👉 8250192130  Available With RoomVIP Kolkata Call Girl Serampore 👉 8250192130  Available With Room
VIP Kolkata Call Girl Serampore 👉 8250192130 Available With Roomdivyansh0kumar0
 
Dividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptxDividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptxanshikagoel52
 
The Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfThe Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfGale Pooley
 
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure serviceCall US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

The Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdfThe Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdf
 
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Q3 2024 Earnings Conference Call and Webcast Slides
Q3 2024 Earnings Conference Call and Webcast SlidesQ3 2024 Earnings Conference Call and Webcast Slides
Q3 2024 Earnings Conference Call and Webcast Slides
 
Instant Issue Debit Cards - School Designs
Instant Issue Debit Cards - School DesignsInstant Issue Debit Cards - School Designs
Instant Issue Debit Cards - School Designs
 
OAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptx
OAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptxOAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptx
OAT_RI_Ep19 WeighingTheRisks_Apr24_TheYellowMetal.pptx
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
 
The Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdfThe Economic History of the U.S. Lecture 22.pdf
The Economic History of the U.S. Lecture 22.pdf
 
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
 
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
 
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikHigh Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
 
Quarter 4- Module 3 Principles of Marketing
Quarter 4- Module 3 Principles of MarketingQuarter 4- Module 3 Principles of Marketing
Quarter 4- Module 3 Principles of Marketing
 
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
 
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsHigh Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
The Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfThe Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdf
 
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
 
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
 
VIP Kolkata Call Girl Serampore 👉 8250192130 Available With Room
VIP Kolkata Call Girl Serampore 👉 8250192130  Available With RoomVIP Kolkata Call Girl Serampore 👉 8250192130  Available With Room
VIP Kolkata Call Girl Serampore 👉 8250192130 Available With Room
 
Dividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptxDividend Policy and Dividend Decision Theories.pptx
Dividend Policy and Dividend Decision Theories.pptx
 
The Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfThe Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdf
 
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure serviceCall US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
 

Slides ensae-2016-9

  • 1. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Actuariat de l’Assurance Non-Vie # 9 A. Charpentier (UQAM & Université de Rennes 1) ENSAE ParisTech, Octobre 2015 - Janvier 2016. http://freakonometrics.hypotheses.org @freakonometrics 1
  • 2. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Fourre-Tout sur la Tarification • modèle collectif vs. modèle individuel • cas de la grande dimension • choix de variables • choix de modèles @freakonometrics 2
  • 3. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Modèle individuel ou modèle collectif ? La loi Tweedie Consider a Tweedie distribution, with variance function power p ∈ (1, 2), mean µ and scale parameter φ, then it is a compound Poisson model, • N ∼ P(λ) with λ = φµ2−p 2 − p • Yi ∼ G(α, β) with α = − p − 2 p − 1 and β = φµ1−p p − 1 Consversely, consider a compound Poisson model N ∼ P(λ) and Yi ∼ G(α, β), then • variance function power is p = α + 2 α + 1 • mean is µ = λα β • scale parameter is φ = [λα] α+2 α+1 −1 β2− α+2 α+1 α + 1 @freakonometrics 3
  • 4. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Modèle individuel ou modèle collectif ? La régression Tweedie In the context of regression Ni ∼ P(λi) with λi = exp[XT i βλ] Yj,i ∼ G(µi, φ) with µi = exp[XT i βµ] Then Si = Y1,i + · · · + YN,i has a Tweedie distribution • variance function power is p = φ + 2 φ + 1 • mean is λiµi • scale parameter is λ 1 φ+1 −1 i µ φ φ+1 i φ 1 + φ There are 1 + 2dim(X) degrees of freedom. @freakonometrics 4
  • 5. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Modèle individuel ou modèle collectif ? La régression Tweedie Remark Note that the scale parameter should not depend on i. A Tweedie regression is • variance function power is p ∈ (1, 2) • mean is µi = exp[XT i βTweedie] • scale parameter is φ There are 2 + dim(X) degrees of freedom. @freakonometrics 5
  • 6. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Double Modèle Fr´quence - Coût Individuel Considérons les bases suivantes, en RC, pour la fréquence 1 > freq = merge(contrat ,nombre_RC) pour les coûts individuels 1 > sinistre_RC = sinistre [( sinistre$garantie =="1RC")&(sinistre$cout >0) ,] 2 > sinistre_RC = merge(sinistre_RC ,contrat) et pour les co ûts agrégés par police 1 > agg_RC = aggregate(sinistre_RC$cout , by=list(sinistre_RC$nocontrat) , FUN=’sum’) 2 > names(agg_RC)=c(’nocontrat ’,’cout_RC’) 3 > global_RC = merge(contrat , agg_RC , all.x=TRUE) 4 > global_RC$cout_RC[is.na(global_DO$cout_RC)]=0 @freakonometrics 6
  • 7. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Double Modèle Fr´quence - Coût Individuel 1 > library(splines) 2 > reg_f = glm(nb_RC~zone+bs( ageconducteur )+carburant , offset=log( exposition),data=freq ,family=poisson) 3 > reg_c = glm(cout~zone+bs( ageconducteur )+carburant , data=sinistre_RC ,family=Gamma(link="log")) Simple Modèle Coût par Police 1 > library(tweedie) 2 > library(statmod) 3 > reg_a = glm(cout_RC~zone+bs( ageconducteur )+carburant , offset=log( exposition),data=global_RC ,family=tweedie(var.power =1.5 , link. power =0)) @freakonometrics 7
  • 8. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Comparaison des primes 1 > freq2 = freq 2 > freq2$exposition = 1 3 > P_f = predict(reg_f,newdata=freq2 ,type="response") 4 > P_c = predict(reg_c,newdata=freq2 ,type="response") 5 prime1 = P_f*P_c 1 > k = 1.5 2 > reg_a = glm(cout_DO~zone+bs( ageconducteur )+carburant , offset=log( exposition),data=global_DO ,family=tweedie(var.power=k, link.power =0)) 3 > prime2 = predict(reg_a,newdata=freq2 ,type="response") 1 > arrows (1:100 , prime1 [1:100] ,1:100 , prime2 [1:100] , length =.1) @freakonometrics 8
  • 9. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Impact du degré Tweedie sur les Primes Pures @freakonometrics 9 qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq Tweedie 1 0200400600800 qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq Tweedie 1−0.20.00.20.40.6
  • 10. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Impact du degré Tweedie sur les Primes Pures Comparaison des primes pures, assurés no1, no2 et no 3 (DO) @freakonometrics 10
  • 11. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 ‘Optimisation’ du Paramètre Tweedie 1 > dev = function(k){ 2 + reg = glm(cout_RC~zone+bs( ageconducteur )+ carburant , data=global_RC , family= tweedie(var.power=k, link.power =0) , offset=log(exposition)) 3 + reg$deviance 4 + } @freakonometrics 11
  • 12. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Tarification et données massives (Big Data) Problèmes classiques avec des données massives • beaucoup de variables explicatives, k grand, XT X peut-être non inversible • gros volumes de données, e.g. données télématiques • données non quantitatives, e.g. texte, localisation, etc. @freakonometrics 12
  • 13. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 La fascination pour les estimateurs sans biais En statistique mathématique, on aime les estimateurs sans biais car ils ont plusieurs propriétés intéressantes. Mais ne peut-on pas considérer des estimateurs biaisés, potentiellement meilleurs ? Consider a sample, i.i.d., {y1, · · · , yn} with distribution N(µ, σ2 ). Define θ = αY . What is the optimal α to get the best estimator of µ ? • bias: bias θ = E θ − µ = (α − 1)µ • variance: Var θ = α2 σ2 n • mse: mse θ = (α − 1)2 µ2 + α2 σ2 n The optimal value is α = µ2 µ2 + σ2 n < 1. @freakonometrics 13
  • 14. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Linear Model Consider some linear model yi = xT i β + εi for all i = 1, · · · , n. Assume that εi are i.i.d. with E(ε) = 0 (and finite variance). Write      y1 ... yn      y,n×1 =      1 x1,1 · · · x1,k ... ... ... ... 1 xn,1 · · · xn,k      X,n×(k+1)         β0 β1 ... βk         β,(k+1)×1 +      ε1 ... εn      ε,n×1 . Assuming ε ∼ N(0, σ2 I), the maximum likelihood estimator of β is β = argmin{ y − XT β 2 } = (XT X)−1 XT y ... under the assumtption that XT X is a full-rank matrix. What if XT i X cannot be inverted? Then β = [XT X]−1 XT y does not exist, but βλ = [XT X + λI]−1 XT y always exist if λ > 0. @freakonometrics 14
  • 15. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Ridge Regression The estimator β = [XT X + λI]−1 XT y is the Ridge estimate obtained as solution of β = argmin β    n i=1 [yi − β0 − xT i β]2 + λ β 2 1Tβ2    for some tuning parameter λ. One can also write β = argmin β; β 2 ≤s { Y − XT β 2 } Remark Note that we solve β = argmin β {objective(β)} where objective(β) = L(β) training loss + R(β) regularization @freakonometrics 15
  • 16. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Going further on sparcity issues In severall applications, k can be (very) large, but a lot of features are just noise: βj = 0 for many j’s. Let s denote the number of relevent features, with s << k, cf Hastie, Tibshirani & Wainwright (2015), s = card{S} where S = {j; βj = 0} The model is now y = XT SβS + ε, where XT SXS is a full rank matrix. @freakonometrics 16 q q = . +
  • 17. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Going further on sparcity issues Define a 0 = 1(|ai| > 0). Ici dim(β) = s. We wish we could solve β = argmin β; β 0 ≤s { Y − XT β 2 } Problem: it is usually not possible to describe all possible constraints, since s k coefficients should be chosen here (with k (very) large). Idea: solve the dual problem β = argmin β; Y −XTβ 2 ≤h { β 0 } where we might convexify the 0 norm, · 0 . @freakonometrics 17
  • 18. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Regularization 0, 1 et 2 @freakonometrics 18
  • 19. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Going further on sparcity issues On [−1, +1]k , the convex hull of β 0 is β 1 On [−a, +a]k , the convex hull of β 0 is a−1 β 1 Hence, β = argmin β; β 1 ≤˜s { Y − XT β 2 } is equivalent (Kuhn-Tucker theorem) to the Lagragian optimization problem β = argmin{ Y − XT β 2 +λ β 1 } @freakonometrics 19
  • 20. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 LASSO Least Absolute Shrinkage and Selection Operator β ∈ argmin{ Y − XT β 2 +λ β 1 } is a convex problem (several algorithms ), but not strictly convex (no unicity of the minimum). Nevertheless, predictions y = xT β are unique MM, minimize majorization, coordinate descent Hunter (2003). @freakonometrics 20
  • 21. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Optimal LASSO Penalty Use cross validation, e.g. K-fold, β(−k)(λ) = argmin    i∈Ik [yi − xT i β]2 + λ β    then compute the sum of the squared errors, Qk(λ) = i∈Ik [yi − xT i β(−k)(λ)]2 and finally solve λ = argmin Q(λ) = 1 K k Qk(λ) Note that this might overfit, so Hastie, Tibshiriani & Friedman (2009) suggest the largest λ such that Q(λ) ≤ Q(λ ) + se[λ ] with se[λ]2 = 1 K2 K k=1 [Qk(λ) − Q(λ)]2 @freakonometrics 21
  • 22. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 1 > freq = merge(contrat ,nombre_RC) 2 > freq = merge(freq ,nombre_DO) 3 > freq [ ,10]= as.factor(freq [ ,10]) 4 > mx=cbind(freq[,c(4,5,6)],freq [ ,9]=="D", freq [ ,3]%in%c("A","B","C")) 5 > colnames(mx)=c(names(freq)[c(4,5,6)]," diesel","zone") 6 > for(i in 1: ncol(mx)) mx[,i]=( mx[,i]-mean( mx[,i]))/sd(mx[,i]) 7 > names(mx) 8 [1] puissance agevehicule ageconducteur diesel zone 9 > library(glmnet) 10 > fit = glmnet(x=as.matrix(mx), y=freq [,11], offset=log(freq [ ,2]), family = "poisson ") 11 > plot(fit , xvar="lambda", label=TRUE) LASSO, Fréquence RC −10 −9 −8 −7 −6 −5 −0.20−0.15−0.10−0.050.000.050.10 Log Lambda Coefficients 4 4 4 4 3 1 1 3 4 5 @freakonometrics 22
  • 23. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 LASSO, Fréquence RC 1 > plot(fit ,label=TRUE) 2 > cvfit = cv.glmnet(x=as.matrix(mx), y=freq [,11], offset=log(freq [ ,2]),family = " poisson") 3 > plot(cvfit) 4 > cvfit$lambda.min 5 [1] 0.0002845703 6 > log(cvfit$lambda.min) 7 [1] -8.16453 • Cross validation curve + error bars 0.0 0.1 0.2 0.3 0.4 −0.20−0.15−0.10−0.050.000.050.10 L1 Norm Coefficients 0 2 3 4 4 1 3 4 5 −10 −9 −8 −7 −6 −5 0.2460.2480.2500.2520.2540.2560.258 log(Lambda) PoissonDeviance q q q q q q q q q q qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 2 1 @freakonometrics 23
  • 24. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 1 > freq = merge(contrat ,nombre_RC) 2 > freq = merge(freq ,nombre_DO) 3 > freq [ ,10]= as.factor(freq [ ,10]) 4 > mx=cbind(freq[,c(4,5,6)],freq [ ,9]=="D", freq [ ,3]%in%c("A","B","C")) 5 > colnames(mx)=c(names(freq)[c(4,5,6)]," diesel","zone") 6 > for(i in 1: ncol(mx)) mx[,i]=( mx[,i]-mean( mx[,i]))/sd(mx[,i]) 7 > names(mx) 8 [1] puissance agevehicule ageconducteur diesel zone 9 > library(glmnet) 10 > fit = glmnet(x=as.matrix(mx), y=freq [,12], offset=log(freq [ ,2]), family = "poisson ") 11 > plot(fit , xvar="lambda", label=TRUE) LASSO, Fréquence DO −9 −8 −7 −6 −5 −4 −0.8−0.6−0.4−0.20.0 Log Lambda Coefficients 4 4 3 2 1 1 1 2 4 5 @freakonometrics 24
  • 25. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 LASSO, Fréquence DO 1 > plot(fit ,label=TRUE) 2 > cvfit = cv.glmnet(x=as.matrix(mx), y=freq [,12], offset=log(freq [ ,2]),family = " poisson") 3 > plot(cvfit) 4 > cvfit$lambda.min 5 [1] 0.0004744917 6 > log(cvfit$lambda.min) 7 [1] -7.653266 • Cross validation curve + error bars 0.0 0.2 0.4 0.6 0.8 1.0 −0.8−0.6−0.4−0.20.0 L1 Norm Coefficients 0 1 1 1 2 4 1 2 4 5 −9 −8 −7 −6 −5 −4 0.2150.2200.2250.2300.235 log(Lambda) PoissonDeviance q q q q q q q q q q q q q q q q qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq 4 4 4 4 3 3 3 3 3 2 2 1 1 1 1 1 1 1 1 @freakonometrics 25
  • 26. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Model Selection and Gini/Lorentz (on incomes) Consider an ordered sample {y1, · · · , yn}, then Lorenz curve is {Fi, Li} with Fi = i n and Li = i j=1 yj n j=1 yj The theoretical curve, given a distribution F, is u → L(u) = F −1 (u) −∞ tdF(t) +∞ −∞ tdF(t) see Gastwirth (1972, econpapers.repec.org) @freakonometrics 26
  • 27. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Model Selection and Gini/Lorentz (on incomes) 1 > library(ineq) 2 > set.seed (1) 3 > (x<-sort(rlnorm (5,0,1))) 4 [1] 0.4336018 0.5344838 1.2015872 1.3902836 4.9297132 5 > Lc.sim <- Lc(x) 6 > plot(Lc.sim) 7 > points ((1:4)/5,( cumsum(x)/sum(x))[1:4] , pch =19, col="blue") 8 > lines(Lc.lognorm , parameter =1,lty =2) 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Lorenz curve p L(p) q q q q @freakonometrics 27
  • 28. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Model Selection and Gini/Lorentz (on incomes) Gini index is the ratio of the areas A A + B . Thus, G = 2 n(n − 1)x n i=1 i · xi:n − n + 1 n − 1 = 1 E(Y ) ∞ 0 F(y)(1 − F(y))dy 1 > Gini(x) 2 [1] 0.4640003 q q 0.0 0.2 0.4 0.6 0.8 1.0 0.00.20.40.60.81.0 p L(p) q q q q A B @freakonometrics 28
  • 29. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Comparing Models Consider an ordered sample {y1, · · · , yn} of incomes, with y1 ≤ y2 ≤ · · · ≤ yn, then Lorenz curve is {Fi, Li} with Fi = i n and Li = i j=1 yj n j=1 yj We have observed losses yi and premiums π(xi). Con- sider an ordered sample by the model, see Frees, Meyers & Cummins (2014), π(x1) ≥ π(x2) ≥ · · · ≥ π(xn), then plot {Fi, Li} with Fi = i n and Li = i j=1 yj n j=1 yj 0 20 40 60 80 100 020406080100 Proportion (%) Income(%) poorest ← → richest 0 20 40 60 80 100 020406080100 Proportion (%) Losses(%) more risky ← → less risky @freakonometrics 29
  • 30. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 Choix et comparaison de modèle en tarification See Frees et al. (2010) or Tevet (2013). @freakonometrics 30