1. Approximate regeneration scheme for Markov Chains,
with applications to U-statistics and extreme values.
Patrice Bertail
MODAL’X and CREST
CREST, 30 January 2013
2. Outline
The renewal or regenerative approach
  Notations
  The regenerative approach (atomic case)
  Nummelin splitting trick
  Approximate regeneration scheme
U-statistics for Markovian data
  U-statistics
  U-statistics: a block Hoeffding decomposition
  Moment type conditions
Asymptotic results and the bootstrap
  CLT
  Variance estimation
  Berry-Esseen bounds
  Second order validity of the bootstrap?
Simulation results
P. Bertail (MODAL’X and CREST), U-statistics of Markovian data, 2012
3. Notations
$X = (X_n)_{n \in \mathbb{N}}$ denotes a $\psi$-irreducible, time-homogeneous Markov chain, valued in a measurable space $(E, \mathcal{E})$ ($\psi$ being a maximal irreducibility measure), with transition probability $\Pi(x, dy)$ and initial distribution $\nu$ (non-stationary case).
$P_\nu$ (respectively, $P_x$ for $x$ in $E$) denotes the probability measure such that $X_0 \sim \nu$ (resp., conditioned upon $X_0 = x$). For $A$ such that $\psi(A) > 0$, we denote by $P_A[.]$ the probability measure on the underlying space such that $X_0 \in A$ and by $E_A[.]$ the $P_A$-expectation.
$E_\nu[.]$ denotes the $P_\nu$-expectation (resp., $E_x[.]$ the $P_x$-expectation).
Hypothesis: the chain $X$ is Harris recurrent, i.e., for any subset $B \in \mathcal{E}$ such that $\psi(B) > 0$,
\[ P_x\Big( \sum_{n \ge 1} \mathbb{I}\{X_n \in B\} = \infty \Big) = 1, \quad \text{for all } x \in E. \]
6. The regenerative approach (Meyn and Tweedie, 1996)
Definition: A Markov chain is said to be regenerative if it possesses an accessible atom, i.e., a measurable set $A$ such that $\psi(A) > 0$ and $\Pi(x, .) = \Pi(y, .)$ for all $(x, y) \in A^2$.
Hypothesis: The chain is positive recurrent (if and only if the expected return time to the atom is finite, i.e. $E_A[\tau_A] < \infty$).
Invariant measure: for all $B \in \mathcal{E}$,
\[ \mu(B) = \frac{1}{E_A[\tau_A]} E_A\Big[ \sum_{i=1}^{\tau_A} \mathbb{I}\{X_i \in B\} \Big]. \]
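The regenerative representation of the invariant measure can be checked numerically on a toy chain. The sketch below is only an illustration: the three-state transition matrix `P`, the choice of state 0 as the atom and the function name `regenerative_invariant_estimate` are assumptions made for this example, not objects from the talk. It averages occupation counts over completed cycles between visits to the atom and compares the result with the exact stationary distribution.

```python
import numpy as np

def regenerative_invariant_estimate(P, atom, n_steps, rng):
    """Ratio estimate of mu({s}) for each state s, using cycles between visits to `atom`."""
    n_states = P.shape[0]
    x = atom                       # start in the atom so the first cycle is complete
    counts = np.zeros(n_states)    # occupation counts summed over completed cycles
    current = np.zeros(n_states)   # occupation counts of the cycle in progress
    total_len, cur_len = 0, 0
    for _ in range(n_steps):
        x = rng.choice(n_states, p=P[x])
        current[x] += 1
        cur_len += 1
        if x == atom:              # cycle (X_1, ..., X_{tau_A}) is complete
            counts += current
            total_len += cur_len
            current[:] = 0
            cur_len = 0
    # sum of cycle counts / sum of cycle lengths ~ E_A[sum I{X_i = s}] / E_A[tau_A]
    return counts / total_len

# Hypothetical three-state chain; state 0 plays the role of the atom A.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
rng = np.random.default_rng(0)
mu_hat = regenerative_invariant_estimate(P, atom=0, n_steps=20_000, rng=rng)

# Exact stationary distribution: left eigenvector of P for the eigenvalue 1.
w, v = np.linalg.eig(P.T)
mu = np.real(v[:, np.argmax(np.real(w))])
mu = mu / mu.sum()
print(mu_hat, mu)
```

Only completed cycles enter the estimate, so the observations after the last visit to the atom are discarded, in the spirit of the block construction.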
10. Regenerative Blocks of observations
Successive return times to an atom $A$:
\[ \tau_A = \tau_A(1) = \inf\{n \ge 1,\ X_n \in A\}, \qquad \tau_A(j) = \inf\{n > \tau_A(j-1),\ X_n \in A\}, \ \text{for } j \ge 2. \]
Regenerative blocks of observations between consecutive visits to the atom:
\[ B_0 = (X_1, ..., X_{\tau_A(1)}), \quad B_1 = (X_{\tau_A(1)+1}, ..., X_{\tau_A(2)}), \ \ldots, \ B_j = (X_{\tau_A(j)+1}, ..., X_{\tau_A(j+1)}), \ \ldots \]
The sequence $B_1, B_2, ..., B_j, ...$ is i.i.d. ($B_0$ is independent of the others but depends on $\nu$) by the strong Markov property.
$l_n$ = number of regenerations in a sequence of length $n$; it is a random variable (depending on the lengths $l(B_i)$ of the blocks), and
\[ \frac{l_n}{n} \to (E_A \tau_A)^{-1} \quad \text{a.s.} \]
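The block construction above can be sketched in a few lines. This is a minimal illustration, assuming the atom is described by a vectorized predicate `in_atom` (a hypothetical name); the function returns $B_0$, the complete blocks $B_1, B_2, \ldots$ and the trailing incomplete block.

```python
import numpy as np

def regeneration_blocks(x, in_atom):
    """Split x = (X_1, ..., X_n) into B_0, the complete blocks, and the last incomplete block."""
    x = np.asarray(x)
    # 0-based positions of the visits to the atom (position t holds X_{t+1})
    hits = np.flatnonzero(in_atom(x))
    if hits.size == 0:                 # no regeneration observed
        return x, [], x[:0]
    first = x[: hits[0] + 1]           # B_0 = (X_1, ..., X_{tau_A(1)})
    # B_j = (X_{tau_A(j)+1}, ..., X_{tau_A(j+1)})
    blocks = [x[a + 1 : b + 1] for a, b in zip(hits[:-1], hits[1:])]
    last = x[hits[-1] + 1 :]           # observations after the last visit
    return first, blocks, last

# Toy trajectory on {0, 1, 2}; the atom is the state 0 (hypothetical example).
x = np.array([2, 0, 1, 1, 0, 2, 0, 1])
first, blocks, last = regeneration_blocks(x, lambda v: v == 0)
print(first, [b.tolist() for b in blocks], last)
```

With $l_n$ visits to the atom, this yields $l_n - 1$ complete blocks; here $l_n = 3$ and two complete blocks are returned.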
11. Harris recurrent Markov chains (non atomic case)
Nummelin splitting technique (1978)
All recurrent Markov chains can be extended to be atomic using the Nummelin splitting technique.
Notion of small set and minorization condition
Definition: A set $S \in \mathcal{E}$ is said to be small for the chain if there exist $m \in \mathbb{N}^*$, $\delta > 0$ and a probability measure $\Phi$ supported by $S$ such that
\[ \forall (x, B) \in S \times \mathcal{E}, \quad \Pi^m(x, B) \ge \delta \Phi(B), \]
denoting by $\Pi^m$ the $m$-th iterate of the transition kernel $\Pi$.
We assume that $m = 1$ here and throughout, with no loss of generality.
Idea: a mixture with a component independent of $x$,
\[ \Pi(x, B) = \delta \Phi(B) + (1 - \delta) \frac{\Pi(x, B) - \delta \Phi(B)}{1 - \delta}, \quad \text{for all } B \subset S,\ x \in S. \]
12. Nummelin splitting trick
$Y = (Y_n)_{n \in \mathbb{N}}$ is a sequence of independent Bernoulli r.v.'s with parameter $\delta$ such that $(X, Y)$ is a bivariate Markov chain, referred to as the split chain, with state space $E \times \{0, 1\}$. The split chain is atomic with atom $A_S = S \times \{1\}$.
Reference measure $\lambda(dy)$ dominating $\{\Pi(x, dy);\ x \in E\}$:
\[ \Pi(x, dy) = \pi(x, y) \lambda(dy), \quad \Phi(dy) = \phi(y) \lambda(dy), \quad \forall (x, y) \in S^2,\ \pi(x, y) \ge \delta \phi(y). \]
Conditioned upon $X^{(n)} = (X_1, \ldots, X_n)$, the random variables $Y_1, \ldots, Y_n$ are mutually independent and, for all $i \in \{1, \ldots, n\}$, $Y_i$ is drawn from a Bernoulli distribution with parameter
\[ \delta\, \mathbb{I}\{X_i \notin S\} + \frac{\delta \phi(X_{i+1})}{\pi(X_i, X_{i+1})}\, \mathbb{I}\{X_i \in S\}. \]
13. Approximate regeneration scheme (Bertail and Clémençon, 2005-2007)
Suppose that an estimate $\hat\pi_n(x, y)$ of the transition density over $\hat S \times \hat S$, such that $\forall (x, y) \in \hat S^2$, $\hat\pi_n(x, y) \ge \hat\delta \hat\phi(y)$, is available (we may choose $\hat S = S$ and $\hat\delta = \delta$).
Conditioned upon $X^{(n)} = (X_1, \ldots, X_n)$, the random variables $\hat Y_1, \ldots, \hat Y_n$ are mutually independent and, for all $i \in \{1, \ldots, n\}$, $\hat Y_i$ is drawn from a Bernoulli distribution with parameter
\[ \hat\delta\, \mathbb{I}\{X_i \notin \hat S\} + \frac{\hat\delta \hat\phi(X_{i+1})}{\hat\pi_n(X_i, X_{i+1})}\, \mathbb{I}\{X_i \in \hat S\}. \]
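The Bernoulli draws of the approximate scheme can be sketched as follows. Everything model-specific here is an assumption made for illustration: the data are simulated from a Gaussian AR(1) model, the transition density estimate is a parametric plug-in fit of that model, the small set is an interval $[x_0 - \varepsilon, x_0 + \varepsilon]$ and $\hat\phi$ is the uniform density on it. The function name `approximate_regeneration_times` is hypothetical.

```python
import numpy as np

def gaussian_ar1_density(x, y, a, s):
    """Transition density of the (assumed) model X_{t+1} = a X_t + s N(0, 1)."""
    z = (y - a * x) / s
    return np.exp(-0.5 * z**2) / (s * np.sqrt(2.0 * np.pi))

def approximate_regeneration_times(x, x0, eps, rng):
    x = np.asarray(x, dtype=float)
    cur, nxt = x[:-1], x[1:]
    # Plug-in estimate of the transition density (least-squares AR(1) fit).
    a_hat = np.dot(cur, nxt) / np.dot(cur, cur)
    s_hat = np.std(nxt - a_hat * cur)
    lo, hi = x0 - eps, x0 + eps
    # The Gaussian density decreases in |y - a x|, a convex function whose
    # maximum over the square S x S is attained at one of the four corners.
    cx = np.array([lo, lo, hi, hi])
    cy = np.array([lo, hi, lo, hi])
    p_min = gaussian_ar1_density(cx, cy, a_hat, s_hat).min()
    delta_hat = 2.0 * eps * p_min          # with phi_hat uniform on S_hat
    in_s = (cur >= lo) & (cur <= hi)
    nxt_in_s = (nxt >= lo) & (nxt <= hi)
    p_hat = gaussian_ar1_density(cur, nxt, a_hat, s_hat)
    # Bernoulli parameter: delta_hat if X_i is outside S_hat, otherwise
    # delta_hat * phi_hat(X_{i+1}) / pi_hat(X_i, X_{i+1}) = p_min / p_hat on S_hat^2.
    prob = np.where(in_s, np.where(nxt_in_s, p_min / p_hat, 0.0), delta_hat)
    y_hat = rng.random(prob.size) < prob
    # Pseudo-regeneration at (0-based) position i when X_i is in S_hat and Y_hat_i = 1.
    return np.flatnonzero(in_s & y_hat), prob

rng = np.random.default_rng(1)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
times, prob = approximate_regeneration_times(x, x0=0.0, eps=1.0, rng=rng)
print(times.size, "pseudo-regeneration times")
```

The returned positions can then be used to cut the trajectory into approximate regeneration blocks, exactly as in the atomic case.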
14. Empirical choice of the small set
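Trade-off between the size of the small set and the number of regenerations (knowing the trajectory).
Choose a neighborhood $V_{x_0}(\varepsilon)$ of size $\varepsilon$ around a point $x_0$ (typically the mean):
\[ N_n(\varepsilon) = E\Big( \sum_{i=1}^{n} \mathbb{I}\{X_i \in V_{x_0}(\varepsilon),\ Y_i = 1\} \,\Big|\, X^{(n+1)} \Big) = \frac{\delta(\varepsilon)}{2\varepsilon} \sum_{i=1}^{n} \mathbb{I}\{(X_i, X_{i+1}) \in V_{x_0}(\varepsilon)^2\} / p(X_i, X_{i+1}). \]
Since the transition density $p$ and its minimum over $V_{x_0}(\varepsilon)^2$ are unknown, an empirical criterion $\hat N_n(\varepsilon)$ to optimize is obtained by replacing $p$ by an estimate $p_n$ and $\delta(\varepsilon)/2\varepsilon$ by a lower bound $\hat\delta_n(\varepsilon)/2\varepsilon$ for $p_n$ over $V_{x_0}(\varepsilon)^2$.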
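A possible way to evaluate the empirical criterion over a grid of $\varepsilon$ values is sketched below. The ingredients are assumptions chosen for the illustration: the density estimate `p_hat` is a fitted Gaussian AR(1) transition density, and the lower bound standing for $\hat\delta_n(\varepsilon)/2\varepsilon$ is approximated by the minimum of `p_hat` over a finite grid of the square. The function names are hypothetical.

```python
import numpy as np

def expected_regenerations(x, x0, eps, p_hat, grid_size=25):
    """Empirical criterion: lower bound of p_hat on V^2 times sum of 1 / p_hat over pairs in V^2."""
    x = np.asarray(x, dtype=float)
    cur, nxt = x[:-1], x[1:]
    lo, hi = x0 - eps, x0 + eps
    g = np.linspace(lo, hi, grid_size)
    gx, gy = np.meshgrid(g, g)
    lower = p_hat(gx, gy).min()    # stands for delta_n(eps) / (2 eps), grid approximation
    inside = (cur >= lo) & (cur <= hi) & (nxt >= lo) & (nxt <= hi)
    return lower * np.sum(1.0 / p_hat(cur[inside], nxt[inside]))

def choose_eps(x, x0, eps_grid, p_hat):
    values = np.array([expected_regenerations(x, x0, e, p_hat) for e in eps_grid])
    return eps_grid[int(np.argmax(values))], values

# Simulated Gaussian AR(1) data (assumed model, for illustration only).
rng = np.random.default_rng(2)
n = 3000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()

# Plug-in transition density estimate from a least-squares AR(1) fit.
a_hat = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
s_hat = np.std(x[1:] - a_hat * x[:-1])

def p_hat(u, v):
    z = (v - a_hat * u) / s_hat
    return np.exp(-0.5 * z**2) / (s_hat * np.sqrt(2.0 * np.pi))

eps_grid = np.linspace(0.1, 3.0, 30)
best_eps, values = choose_eps(x, x0=x.mean(), eps_grid=eps_grid, p_hat=p_hat)
print(best_eps, values.max())
```

Small values of $\varepsilon$ give few visits to the small set, while large values make the lower bound collapse, so the criterion is typically maximized at an intermediate $\varepsilon$.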
17. U-statistics
The parameter
\[ \mu(h) = \int_{(x, y) \in E^2} h(x, y) \mu(dx) \mu(dy) = \frac{1}{(E_A \tau_A)^2} E_A\Big[ \sum_{i=1}^{\tau_A(1)} \sum_{j=1+\tau_A(1)}^{\tau_A(2)} h(X_i, X_j) \Big], \]
where $h : \mathbb{R}^2 \to \mathbb{R}$ is a symmetric kernel.
U-statistics of degree 2:
\[ U_n(h) = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} h(X_i, X_j). \]
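For concreteness, a degree-2 U-statistic can be computed directly from its definition. The sketch below assumes a vectorized symmetric kernel `h` and uses small hypothetical samples; the function name `u_statistic` is not from the talk. It is quadratic in memory, so it is meant for moderate sample sizes only.

```python
import numpy as np

def u_statistic(x, h):
    """U_n(h) = 2 / (n (n - 1)) * sum over pairs i < j of h(X_i, X_j)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)   # all index pairs with i < j
    return 2.0 / (n * (n - 1)) * np.sum(h(x[i], x[j]))

# Gini-type kernel h(x, y) = |x - y| on a toy sample.
gini = u_statistic([1.0, 2.0, 4.0], lambda a, b: np.abs(a - b))

# Wilcoxon-type kernel h(x, y) = 2 I{x + y > 0} - 1 on a toy sample.
wilcoxon = u_statistic([-3.0, 1.0, 2.0], lambda a, b: 2.0 * (a + b > 0) - 1.0)
print(gini, wilcoxon)
```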
18. Examples
The Gini index $G_n = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} |X_i - X_j|$
The Wilcoxon statistic $W_n = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \{2\, \mathbb{I}\{X_i + X_j > 0\} - 1\}$
AUC (ROC curve)
The Takens estimator, linked to
\[ C_n(r) = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} \mathbb{I}\{\|X_i - X_j\| \le r\}, \qquad T_n = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} \log \frac{\|X_i - X_j\|}{r_0} \]
Regular part (second order) of Fréchet differentiable functionals
19. Block representation of the U-statistics
Four different contributions:
- the true U-statistic on blocks (center),
- the contribution of $B_0$,
- the contribution of $B_{l_n}$,
- the diagonal contribution of the incomplete blocks.
Main tool: Hoeffding decomposition of the U-statistic based on blocks.
20. Hoeffding decomposition on blocks
Notations:
\[ \omega_h(B_k, B_l) = \sum_{i=\tau_A(k)+1}^{\tau_A(k+1)} \sum_{j=\tau_A(l)+1}^{\tau_A(l+1)} \big( h(X_i, X_j) - \mu(h) \big) \]
U-statistics of regenerative blocks:
\[ R_L(h) = \frac{2}{L(L-1)} \sum_{1 \le k < l \le L} \omega_h(B_k, B_l), \]
\[ U_n(h) = \mu(h) + \frac{(l_n - 1)(l_n - 2)}{n(n-1)} R_{l_n - 1}(h) + W_n(h). \]
Hoeffding decomposition of the U-statistic on blocks:
\[ R_L(h) = 2 S_L(h) + D_L(h), \]
where
\[ S_L(h) = \frac{1}{L} \sum_{k=1}^{L} h_1(B_k) \quad \text{and} \quad D_L(h) = \frac{2}{L(L-1)} \sum_{1 \le k < l \le L} h_2(B_k, B_l). \]
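The slide does not restate how $h_1$ and $h_2$ are defined. Assuming the usual Hoeffding projections of the block kernel $\omega_h$ (an assumption here, consistent with the identity $R_L(h) = 2 S_L(h) + D_L(h)$ stated above), the decomposition can be verified in one line:

```latex
% Assumed (standard) Hoeffding projections of the block kernel \omega_h;
% these definitions are not restated on the slide.
h_1(b) = E_A\big[\omega_h(b, B_1)\big], \qquad
h_2(b_1, b_2) = \omega_h(b_1, b_2) - h_1(b_1) - h_1(b_2).

% Each block B_k appears in exactly L - 1 of the pairs k < l, hence
R_L(h) = \frac{2}{L(L-1)} \sum_{1 \le k < l \le L}
         \big[ h_1(B_k) + h_1(B_l) + h_2(B_k, B_l) \big]
       = \frac{2}{L} \sum_{k=1}^{L} h_1(B_k) + D_L(h)
       = 2 S_L(h) + D_L(h).
```

Under these definitions $S_L(h)$ is an average of i.i.d. centered terms, which drives the Gaussian limit, while $D_L(h)$ is a degenerate U-statistic of smaller order.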
24. Moment type conditions
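A0 (Block-length: moment assumption.) Let $q \ge 1$; we have $E_A[\tau_A^q] < \infty$.
A1 (Non-regenerative block.) Let $l \ge 1$; we have $E_\nu[\tau_A^l] < \infty$ as well as
\[ E_\nu\Big[ \Big( \sum_{i=1}^{\tau_A} \sum_{j=1}^{\tau_A} |h(X_i, X_j)| \Big)^l \Big] < \infty, \quad E_\nu\Big[ \Big( \sum_{i=1}^{\tau_A} \sum_{j=1+\tau_A}^{\tau_A(2)} |h(X_i, X_j)| \Big)^l \Big] < \infty. \]
A2 (Block-sums: moment assumptions.) Let $k \ge 1$; we have
\[ E_A\Big[ \Big( \sum_{i=1}^{\tau_A} \sum_{j=1}^{\tau_A} |h(X_i, X_j)| \Big)^k \Big] < \infty, \quad E_A\Big[ \Big( \sum_{i=1}^{\tau_A} \sum_{j=1+\tau_A}^{\tau_A(2)} |h(X_i, X_j)| \Big)^k \Big] < \infty. \]
A2bis (Uniform moment assumptions.) Let $p \ge 0$; we have
\[ \sup_{x \in S} E_\nu\Big[ \Big( \sum_{j=1}^{\tau_A} \bar h(x, X_j) \Big)^p \Big] < \infty, \quad \sup_{x \in S} E_A\Big[ \Big( \sum_{j=1}^{\tau_A} \bar h(x, X_j) \Big)^{p+2} \Big] < \infty. \]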
25. Additional conditions in the general case
B1. The MSE of $\hat\pi_n$ is of order $\alpha_n$ when the error is measured by the sup norm over $S^2$:
\[ E_\nu\Big[ \sup_{(x, y) \in S^2} |\hat\pi_n(x, y) - \pi(x, y)|^2 \Big] = O(\alpha_n), \]
where $(\alpha_n)$ denotes a sequence of nonnegative numbers decaying to zero at infinity.
B2. The parameters $S$ and $\phi$ are chosen so that $\inf_{x \in S} \phi(x) > 0$.
B3. We have $\sup_{(x, y) \in S^2} \pi(x, y) < \infty$ and $\sup_{n \in \mathbb{N}} \sup_{(x, y) \in S^2} \hat\pi_n(x, y) < \infty$ $P_\nu$-a.s.
26. Asymptotic results
Theorem 1 (Central Limit Theorem) Suppose that assumptions A0-A2 (or A2bis) with $q = k = l = 2$ are fulfilled. Then, we have the convergence in distribution under $P_\nu$:
\[ \sqrt{n}\, (U_n(h) - \mu(h)) \Rightarrow N(0, \sigma^2(h)), \quad \text{as } n \to \infty, \]
where $\sigma^2(h) = 4 E_A[h_1(B_1)^2] / \alpha^3$, with $\alpha = E_A[\tau_A]$. The bootstrap analog also holds.
27. Asymptotic bias
Theorem 2 (Asymptotic bias) Suppose that assumptions A0 with $q = 4 + \delta$ for some $\delta > 0$, A1 with $k = 2$ and A2 with $p = 2$ hold, together with a Cramér condition. Then, as $n \to \infty$, we have
\[ E_\nu[U_n(h)] = \mu(h) + \frac{2\Delta + 2\phi_\nu - 2\beta/\alpha + 2\gamma}{n} + O(n^{-3/2}), \]
where
\[ \phi_\nu = E_\nu\Big[ \sum_{i=1}^{\tau_A} h_0(X_i) \Big], \quad \gamma = E_A\Big[ \sum_{i=1}^{\tau_A} (\tau_A - i) h_0(X_i) \Big] / \alpha \]
(contribution of the first and last blocks),
\[ \Delta = E_A\Big[ \sum_{1 \le k < j \le \tau_A} h(X_k, X_j) \Big] / \alpha, \quad \beta = E_A\Big[ \tau_A \sum_{i=1}^{\tau_A} h_0(X_i) \Big] \]
(contribution of the incomplete blocks and of the randomness of $l_n$),
with $h_0(x) = \int_{y \in E} \{h(x, y) - \mu(h)\} \mu(dy)$ for all $x \in E$.
Proof: partitioning arguments, Malinovskii (1985).
28. Estimation of the variance
Jackknife-type estimator (Callaert and Veraverbeke, 1981):
\[ \hat h_{1,-j}(b) = \frac{1}{L-1} \sum_{k=1,\, k \ne j}^{L} \omega_h(b, B_k) - \frac{2}{L(L-1)} \sum_{1 \le k < l \le L} \omega_h(B_k, B_l), \]
\[ \hat s_L^2(h) = \frac{1}{L} \sum_{k=1}^{L} \hat h_{1,-k}^2(B_k). \]
With norming constant related to the U-statistic $U_n(h)$:
\[ \hat\sigma_n^2(h) = 4\, (l_n / n)^3\, \hat s_{l_n - 1}^2(h). \]
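The jackknife-type variance estimator can be sketched as follows. Two points are assumptions of this illustration rather than statements from the talk: the unknown $\mu(h)$ inside $\omega_h$ is replaced by the value passed as `mu_h` (here the U-statistic of the pooled block data), and the list `blocks` is taken to contain the $l_n - 1$ complete regeneration blocks. The toy blocks and function names are hypothetical.

```python
import numpy as np

def block_kernel_matrix(blocks, h, mu_h):
    """omega[k, l] = sum over i in B_k and j in B_l of (h(X_i, X_j) - mu_h)."""
    L = len(blocks)
    omega = np.zeros((L, L))
    for k in range(L):
        bk = np.asarray(blocks[k], dtype=float)[:, None]
        for l in range(L):
            bl = np.asarray(blocks[l], dtype=float)[None, :]
            omega[k, l] = np.sum(h(bk, bl) - mu_h)
    return omega

def jackknife_variance(blocks, h, mu_h, n):
    L = len(blocks)                         # L = l_n - 1 complete blocks
    omega = block_kernel_matrix(blocks, h, mu_h)
    off = omega - np.diag(np.diag(omega))   # keep only the terms with k != l
    r_L = off.sum() / (L * (L - 1))         # equals 2/(L(L-1)) * sum_{k<l} omega[k, l]
    h1_hat = off.sum(axis=1) / (L - 1) - r_L    # h1_hat[k] = hat h_{1,-k}(B_k)
    s2 = np.mean(h1_hat**2)
    l_n = L + 1
    return 4.0 * (l_n / n) ** 3 * s2, h1_hat

# Hypothetical blocks and the Gini-type kernel h(x, y) = |x - y|.
blocks = [[1.0, 2.0], [0.5], [3.0, 1.0, 2.0], [2.5, 0.0]]
h = lambda a, b: np.abs(a - b)
pooled = np.concatenate(blocks)
i, j = np.triu_indices(pooled.size, k=1)
u_n = h(pooled[i], pooled[j]).mean()        # plug-in value used for mu(h)
sigma2_hat, h1_hat = jackknife_variance(blocks, h, mu_h=u_n, n=pooled.size)
print(sigma2_hat)
```

By construction the values $\hat h_{1,-k}(B_k)$ sum to zero over $k$, which is a convenient sanity check on an implementation.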
Define similarly the bootstrap version of the variance, say $\hat\sigma_n^{*2}(h)$, on blocks drawn with replacement (either with $l_n$ fixed, or by drawing the blocks sequentially until the size of the bootstrap sample is $n$, in which case the number of blocks $l_n^*$ is random).
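The sequential variant of the block resampling can be sketched as below. This is a minimal illustration under stated assumptions: the blocks are hypothetical, the function name `sequential_block_bootstrap` is not from the talk, and the loop stops as soon as the cumulated length reaches or exceeds $n$, so the number of drawn blocks is random.

```python
import numpy as np

def sequential_block_bootstrap(blocks, n, rng):
    """Draw blocks with replacement, one at a time, until the total length is at least n."""
    drawn, total = [], 0
    while total < n:
        b = blocks[rng.integers(len(blocks))]   # uniform draw among the observed blocks
        drawn.append(b)
        total += len(b)
    return drawn

# Hypothetical regeneration blocks of unequal lengths.
blocks = [[1.0, 2.0], [0.5], [3.0, 1.0, 2.0], [2.5, 0.0]]
rng = np.random.default_rng(3)
boot_blocks = sequential_block_bootstrap(blocks, n=20, rng=rng)
boot_sample = np.concatenate(boot_blocks)
print(len(boot_blocks), boot_sample.size)
```

A bootstrap statistic is then obtained by recomputing the block U-statistic and its variance estimate on `boot_blocks`, and the procedure is repeated over many independent draws.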
29. Estimation of the variance and bootstrap
Theorem 3 (Variance estimation) Suppose that assumptions A0-A2 (or A2bis and B1-B3) are fulfilled with $q = 4$. Then, the statistic $\hat\sigma_n^2(h)$ is a strongly consistent estimator of $\sigma^2(h)$:
\[ \hat\sigma_n^2(h) \to \sigma^2(h) \quad P_\nu\text{-almost surely, as } n \to \infty, \]
and
\[ \hat\sigma_n^{*2}(h) - \hat\sigma_n^2(h) \xrightarrow{Pr} 0 \quad \text{a.s., as } n \to \infty. \]
30. Rate of convergence and the bootstrap
Theorem (A Berry-Esseen bound and a rate for the bootstrap). Under assumptions A0 with $q = 3 + \varepsilon$, $\varepsilon > 0$, A1 with $k = 2$, A2 (or A2bis) with $l = 3$, there exists an explicit constant $C(h)$ depending only on the moments involved in hypotheses A0-A2 such that, as $n \to \infty$,
\[ \sup_{x \in \mathbb{R}} \Big| P_\nu\big( \sqrt{n}\, \sigma(h)^{-1} (U_n(h) - \mu(h)) \le x \big) - \Phi(x) \Big| \le C(h)\, n^{-1/2}. \]
It follows that (if in addition B1-B3 hold for the general case)
\[ \sup_{x \in \mathbb{R}} \Big| P^*\big( \sqrt{n}\, \hat\sigma_n(h)^{-1} (U_n^*(h) - U_n(h)) \le x \big) - P_\nu\big( \sqrt{n}\, \sigma(h)^{-1} (U_n(h) - \mu(h)) \le x \big) \Big| = O(n^{-1/2}). \]
Proof: Stein's method + some simple partitioning arguments (improves over the results of Bolthausen (1986) for the mean).
35. Second order Validity of the Bootstrap?
DOES NOT HOLD if the U-statistic is constructed on all the observations!
DOES NOT HOLD if $l_n$ is held fixed in the bootstrap procedure.
The first blocks should be dropped even in the case of the mean. The approximate regeneration scheme gives a rule to eliminate the data from a "burn-in period".
Difficulty in proving the second order correctness (true in the case of the mean): partitioning arguments reduce to obtaining a local Edgeworth expansion for
\[ P\Big\{ \sum_{1 \le i \ne j \le m} \omega_h(B_i, B_j) / \sigma_h^2 \le y,\ \sum_{j=1}^{m} l(B_j) = k \Big\}. \]
Usual methods (Dubinskaite, 1985-86) are very technical...
Simultaneous control of the degenerate part of the U-statistic and the lattice part...
36. Simulation results
Graph panel: Gini index, EXP-AR(1) model with $\alpha_1 = 0.8$ and ...
37. Basic lemma for establishing a Berry-Esseen type of bound
Let $W_n$ be a r.v. such that $E W_n^2 = 1$ and such that, for some constant $C$, $W_n$ admits a Berry-Esseen type bound
\[ \| P(W_n < x) - \Phi(x) \|_\infty \le \frac{C}{n^{1/2}}; \]
then, for any random sequence $\Delta_n$, we have
\[ \| P(W_n + \Delta_n < x) - \Phi(x) \|_\infty \le \frac{C}{n^{1/2}} + 8 E(\Delta_n^2)^{1/2}. \]
Proof: the proof easily follows from Stein's method for establishing Berry-Esseen bounds: see for instance Shorack (2000), Lemma 1.3, p. 261.
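38.
\[ E_A\Big( |l_n - [n/\alpha]|^{-1/2} \sum_{j \in\, ]l_n - 1,\ [n/\alpha] - 1[} f(B_j) / \sigma(f) \Big)^2 = \sum_{l=1}^{n} \sum_{r=1}^{n} \sum_{s=1}^{n} a_{l,r,s} = I + II, \]
with
\[ I = \sum_{l=1}^{[n/\alpha] - 1} \sum_{r=1}^{n} \sum_{s=1}^{n} a_{l,r,s} \quad \text{and} \quad II = \sum_{l=[n/\alpha]+1}^{n} \sum_{r=1}^{n} \sum_{s=1}^{n} a_{l,r,s}, \]
with
\[ a_{l,r,s} = \int P_\nu(B_0 \in du,\ \tau_A = r)\, P_A(B_l \in dv,\ \tau_{l+1} > s) \int x^2\, P_A\Big( |l - [n/\alpha]|^{-1/2} \sum_{j \in\, ]l - 1,\ [n/\alpha] - 1[} f(B_j) / \sigma(f) \in dx,\ \sum_{i=1}^{l-1} l(B_i) = n \Big). \]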