SlideShare a Scribd company logo
1 of 32
Download to read offline
New tools from the bandit literature
to improve A/B Testing
Emilie Kaufmann,
joint work with Aur´elien Garivier & Olivier Capp´e
RecSys Meetup,
March 23rd, 2016
Emilie Kaufmann Improved A/B Testing
Motivation
Emilie Kaufmann Improved A/B Testing
A/B Testing
A way to do A/B Testing:
allocate nA users to page A and nB users to page B
perform a statistical test of “A better than B”
Emilie Kaufmann Improved A/B Testing
A/B Testing
A way to do A/B Testing:
allocate nA users to page A and nB users to page B
perform a statistical test of “A better than B”
A variant: fully adaptive A/B Testing
sequentially choose which version to allocate to each visitor
adaptively choose when to stop the experiment
§ multi-armed bandit model
Emilie Kaufmann Improved A/B Testing
A/B/C... testing as a Best Arm Identification problem
K arms = K probability distributions (νa has mean µa)
ν1 ν2 ν3 ν4 ν5
a∗
= argmax
a=1,...,K
µa
For the t-th user,
allocate a version (arm) At ∈ {1, . . . , K}
observe a feedback Xt ∼ νAt
Goal: design
a sequential sampling rule: At+1 = Ft(A1, X1, . . . , At, Xt),
a stopping rule τ
a recommendation rule ˆaτ
such that P(ˆaτ = a∗
) ≥ 1 − δ and τ is as small as possible.
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
PAC algorithms in one-parameter bandit models
P = {νµ, µ ∈ I} set of distributions parametrized by their mean
Example: Bernoulli, Poisson, Gaussian (known variance)
νµ1 , . . . , νµK
∈ PK
⇔ µ = (µ1, . . . , µK ) ∈ IK
S = µ ∈ IK
: ∃a ∈ {1, . . . , K} : µa > max
i=a
µi
A strategy is δ-PAC (on S) if
∀ν ∈ S, Pν(ˆaτ = a∗
) ≥ 1 − δ.
Emilie Kaufmann Improved A/B Testing
PAC algorithms in one-parameter bandit models
P = {νµ, µ ∈ I} set of distributions parametrized by their mean
Example: Bernoulli, Poisson, Gaussian (known variance)
νµ1 , . . . , νµK
∈ PK
⇔ µ = (µ1, . . . , µK ) ∈ IK
S = µ ∈ IK
: ∃a ∈ {1, . . . , K} : µa > max
i=a
µi
A strategy is δ-PAC (on S) if
∀ν ∈ S, Pν(ˆaτ = a∗
) ≥ 1 − δ.
§ What is the optimal sample complexity of a δ-PAC strategy?
inf
δ-PAC
Eµ[τ]?
Emilie Kaufmann Improved A/B Testing
The optimal sample complexity
To answer this question, we need
§ a lower bound on Eν[τ] for any δ-PAC strategy
§ a δ-PAC strategy such that Eν[τ] matches this bound
State-of-the-art: δ-PAC algorithms for which
Eµ[τ] = O H(µ) log
1
δ
, H(µ) =
1
(µ2 − µ1)2
+
K
a=2
1
(µa − µ1)2
[Even Dar et al. 2006, Kalyanakrishnan et al. 2012]
§ the optimal sample complexity is not identified...
Notation: Kullback-Leibler divergence
d(µ, µ ) := KL(νµ
, νµ
) = EX∼νµ log
dνµ
dνµ
(X)
is the KL-divergence between the distributions of mean µ and µ .
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
Lower bound
A first (easy to interpret) result
Theorem [Kaufmann, Capp´e, Garivier 2015]
For any δ-PAC algorithm,
Eµ[τ] ≥
1
d(µ1, µ2)
+
K
a=2
1
d(µa, µ1)
log
1
2.4δ
Emilie Kaufmann Improved A/B Testing
Lower bound
A first (easy to interpret) result
Theorem [Kaufmann, Capp´e, Garivier 2015]
For any δ-PAC algorithm,
Eµ[τ] ≥
1
d(µ1, µ2)
+
K
a=2
1
d(µa, µ1)
log
1
2.4δ
A tighter (non explicit) lower bound
Theorem [Kaufmann and Garivier, 2016]
Alt(µ) := {λ : a∗(λ) = a∗(µ)}. For any δ-PAC algorithm,
Eµ[τ] ≥ T∗
(µ) log
1
2.4δ
,
where
T∗
(µ)−1
= sup
w∈ΣK
inf
λ∈Alt(µ)
K
a=1
wad(µa, λa) .
Emilie Kaufmann Improved A/B Testing
A vector of optimal proportions
w∗
(µ) := argmax
w∈ΣK
inf
λ∈Alt(µ)
K
a=1
wad(µa, λa)
is unique and represents the optimal proportions of draws: a
strategy matching the lower bound should satisfy
∀a ∈ {1, . . . , K},
Eµ[Na(τ)]
Eµ[τ]
= w∗
a (µ).
Na(t) : number of draws of arm a up to time t
§ we propose an efficient algorithm to compute w∗(µ)
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
Sampling rule: Tracking the optimal proportions
ˆµ(t) = (ˆµ1(t), . . . , ˆµK (t)): vector of empirical means
Introducing
Ut = {a : Na(t) <
√
t},
the arm sampled at round t + 1 is
At+1 ∈



argmin
a∈Ut
Na(t) if Ut = ∅ (forced exploration)
argmax
1≤a≤K
[t w∗
a (ˆµ(t)) − Na(t)] (tracking)
Lemma
Under the Tracking sampling rule,
Pµ lim
t→∞
Na(t)
t
= w∗
a (µ) = 1.
Emilie Kaufmann Improved A/B Testing
Stopping rule: performing statistical tests
High values of the Generalized Likelihood Ratio
Za,b(t) := log
max{λ:λa≥λb} (X1, . . . , Xt; λ)
max{λ:λa≤λb} (X1, . . . , Xt; λ)
,
reject the hypothesis that (µa < µb).
We stop when one arm is accessed to be significantly larger than
all other arms, according to a GLR Test:
τδ = inf {t ∈ N : ∃a ∈ {1, . . . , K}, ∀b = a, Za,b(t) > β(t, δ)}
= inf t ∈ N : max
a∈{1,...,K}
min
b=a
Za,b(t) > β(t, δ)
Chernoff stopping rule [Chernoff 59]
Emilie Kaufmann Improved A/B Testing
Stopping rule: an alternative interpretation
One has Za,b(t) = −Zb,a(t) and, if ˆµa(t) ≥ ˆµb(t),
Za,b(t) = Na(t) d ˆµa(t), ˆµa,b(t) + Nb(t) d ˆµb(t), ˆµa,b(t) ,
where ˆµa,b(t) := Na(t)
Na(t)+Nb(t) ˆµa(t) + Nb(t)
Na(t)+Nb(t) ˆµb(t).
A link with the lower bound
max
a
min
b=a
Za,b(t) = t × inf
λ∈Alt(ˆµ(t))
K
a=1
Na(t)
t
d(ˆµa(t), λa)
t
T∗(µ)
under a “good” sampling strategy (for t large)
Emilie Kaufmann Improved A/B Testing
An asymptotically optimal algorithm
Theorem
The Track-and-Stop strategy, that uses
the Tracking sampling rule
the Chernoff stopping rule with β(t, δ) = log 2(K−1)t
δ
and recommends ˆaτ = argmax
a=1...K
ˆµa(τ)
is δ-PAC for every δ ∈]0, 1[ and satisfies
lim sup
δ→0
Eµ[τδ]
log(1/δ)
= T∗
(µ).
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
Optimal sample complexity
Two arms, B(µ1) and B(µ2)
Eµ[τ] ≥ T∗
(µ) log
1
2.4δ
,
with
T∗
(µ)−1
= sup
α∈[0,1]
[αd(µ1, αµ1 + (1 − α)µ2)+
(1 − α)d(µ2, αµ1 + (1 − α)µ2)]
= d∗(µ1, µ2),
d∗(µ1, µ2) = d(µ1, z∗
)
with z∗ defined by
d(µ1, z∗
) = d(µ2, z∗
)
0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
x yz*
Emilie Kaufmann Improved A/B Testing
Algorithms
Track-and-Stop
Sampling rule:
At+1 = argmax
a=1,2
d ˆµa(t),
N1(t)ˆµ1(t) + N2(t)ˆµ2(t)
N1(t) + N2(t)
Stopping rule: stop after t samples if
a=1,2
Na(t)d ˆµa(t),
N1(t)ˆµ1(t) + N2(t)ˆµ2(t)
N1(t) + N2(t)
> β(t, δ)
Eµ[τ]
1
d∗(µ1, µ2)
log
1
δ
Emilie Kaufmann Improved A/B Testing
Algorithms
Uniform sampling (and optimal stopping)
Sampling rule:
At+1 = t [2]
Stopping rule: stop after t samples if
a=1,2
Na(t)d ˆµa(t),
N1(t)ˆµ1(t) + N2(t)ˆµ2(t)
N1(t) + N2(t)
> β(t, δ)
Eµ[τ]
1
I∗(µ1, µ2)
log
1
δ
with
I∗(µ1, µ2) =
d µ1, µ1+µ2
2 + d µ1, µ1+µ2
2
2
.
Remark: I∗(µ1, µ2) very close to d∗(µ1, µ2)
§ uniform sampling is close to optimal
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
An optimal algorithm
Two arms, N(µ1, σ2
1) and N(µ2, σ2
2) σ1, σ2 known
Eµ[τ] ≥
2(σ2
1 + σ2
2)
(µ1 − µ2)2
log
1
2.4δ
and
w∗(µ) =
σ1
σ1 + σ2
;
σ2
σ1 + σ2
§ allocate the arms proportionaly to the standard deviations
(no uniform sampling if σ1 = σ2)
Optimal algorithm:
Sampling rule:
At+1 = 1 ⇔
N1(t)
t
<
σ1
σ1 + σ2
Stopping rule: stop after t samples if
|ˆµ1(t) − ˆµ2(t)| > 2
σ2
1
N1(t)
+
σ2
2
N2(t)
β(t, δ)
Emilie Kaufmann Improved A/B Testing
Outline
1 Optimal algorithms for best-arm identification
Lower bounds
The Track-and-Stop strategy
2 A/B Testing
Bernoulli distribution
Gaussian distribution
3 Practical performance
Emilie Kaufmann Improved A/B Testing
State-of-the-art algorithms
An algorithm based on confidence intervals : KL-LUCB
[K., Kalyanakrishnan 13]
ua(t) = max {q : Na(t)d(ˆµa(t), q) ≤ β(t, δ)}
la(t) = min {q : Na(t)d(ˆµa(t), q) ≤ β(t, δ)}
0
1
771 459 200 45 48 23
sampling rule: At+1 = argmax
a
ˆµa(t), Bt+1 = argmax
b=At+1
ub(t)
stopping rule: τ = inf{t ∈ N : lAt (t) > uBt (t)}
Emilie Kaufmann Improved A/B Testing
State-of-the-art algorithms
A Racing-type algorithm: KL-Racing [K., Kalyanakrishnan 13]
R = {1, . . . , K} set of remaining arms.
r = 0 current round
while |R| > 1
r=r+1
draw each a ∈ R, compute ˆµa,r , the empirical mean of the r
samples observed sofar
compute the empirical best and empirical worst arms:
br = argmax
a∈R
ˆµa,r wr = argmin
a∈R
ˆµa,r
Elimination step: if
lbr (r) > uwr (r),
eliminate wr : R = R{wr }
end
Outpout: ˆa the single element in R.
Emilie Kaufmann Improved A/B Testing
The Chernoff-Racing algorithm
R = {1, . . . , K} set of remaining arms.
r = 0 current round
while |R| > 1
r=r+1
draw each a ∈ R, compute ˆµa,r , the empirical mean of the r
samples observed sofar
compute the empirical best and empirical worst arms:
br = argmax
a∈R
ˆµa,r wr = argmin
a∈R
ˆµa,r
Elimination step: if (Zbr ,wr (r) > β(r, δ)), or
rd ˆµa,r ,
ˆµa,r + ˆµb,r
2
+ rd ˆµb,r ,
ˆµa,r + ˆµb,r
2
> β(r, δ),
eliminate wr : R = R{wr }
end
Outpout: ˆa the single element in R.
Emilie Kaufmann Improved A/B Testing
Numerical experiments
Experiments on two Bernoulli bandit models:
µ1 = [0.5 0.45 0.43 0.4], such that
w∗
(µ1) = [0.417 0.390 0.136 0.057]
µ2 = [0.3 0.21 0.2 0.19 0.18], such that
w∗
(µ2) = [0.336 0.251 0.177 0.132 0.104]
In practice, set the threshold to β(t, δ) = log log(t)+1
δ .
Track-and-Stop Chernoff-Racing KL-LUCB KL-Racing
µ1 4052 4516 8437 9590
µ2 1406 3078 2716 3334
Table : Expected number of draws Eµ[τδ] for δ = 0.1, averaged over
N = 3000 experiments.
Emilie Kaufmann Improved A/B Testing
Take-home message
Useful tools for sequential A/B Testing:
stop using Sequential Generalized Likelihood Ratio tests
sample the arms to match the optimal proportions w∗(µ)
... which can be approximated by uniform sampling for
Bernoulli distribution
Final remark:
Good algorithms for best arm identification are very different for
bandit algorithms designed for regret minimization
(UCB, Thompson Sampling)
Emilie Kaufmann Improved A/B Testing
References
This talk is based on
A. Garivier, E. Kaufmann.
Optimal Best Arm Identification with Fixed Confidence,
arXiv:1602.04589, 2016
E. Kaufmann, O. Capp´e, A. Garivier.
On the Complexity of A/B Testing. COLT, 2014
E. Kaufmann, O. Capp´e, A. Garivier.
On the Complexity of Best Arm Identification in Multi-Armed
Bandit Models. JMLR, 2015
Emilie Kaufmann Improved A/B Testing

More Related Content

What's hot

RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010
Christian Robert
 
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable FunctionsA Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
Matt Parker
 

What's hot (19)

Building Compatible Bases on Graphs, Images, and Manifolds
Building Compatible Bases on Graphs, Images, and ManifoldsBuilding Compatible Bases on Graphs, Images, and Manifolds
Building Compatible Bases on Graphs, Images, and Manifolds
 
Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)
 
Value Function Geometry and Gradient TD
Value Function Geometry and Gradient TDValue Function Geometry and Gradient TD
Value Function Geometry and Gradient TD
 
Gradient descent optimizer
Gradient descent optimizerGradient descent optimizer
Gradient descent optimizer
 
Vehicle Routing Problem using PSO (Particle Swarm Optimization)
Vehicle Routing Problem using PSO (Particle Swarm Optimization)Vehicle Routing Problem using PSO (Particle Swarm Optimization)
Vehicle Routing Problem using PSO (Particle Swarm Optimization)
 
RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010RSS discussion of Girolami and Calderhead, October 13, 2010
RSS discussion of Girolami and Calderhead, October 13, 2010
 
Understanding Dynamic Programming through Bellman Operators
Understanding Dynamic Programming through Bellman OperatorsUnderstanding Dynamic Programming through Bellman Operators
Understanding Dynamic Programming through Bellman Operators
 
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable FunctionsA Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
 
Differential Equations 4th Edition Blanchard Solutions Manual
Differential Equations 4th Edition Blanchard Solutions ManualDifferential Equations 4th Edition Blanchard Solutions Manual
Differential Equations 4th Edition Blanchard Solutions Manual
 
Bayesian Inference : Kalman filter 에서 Optimization 까지 - 김홍배 박사님
Bayesian Inference : Kalman filter 에서 Optimization 까지 - 김홍배 박사님Bayesian Inference : Kalman filter 에서 Optimization 까지 - 김홍배 박사님
Bayesian Inference : Kalman filter 에서 Optimization 까지 - 김홍배 박사님
 
Bathymetry smoothing in ROMS: A new approach
Bathymetry smoothing in ROMS: A new approachBathymetry smoothing in ROMS: A new approach
Bathymetry smoothing in ROMS: A new approach
 
Note chapter9 0708-edit-1
Note chapter9 0708-edit-1Note chapter9 0708-edit-1
Note chapter9 0708-edit-1
 
3 earth atmosphere
3 earth atmosphere3 earth atmosphere
3 earth atmosphere
 
Selaidechou
SelaidechouSelaidechou
Selaidechou
 
Sensor Fusion Study - Ch7. Kalman Filter Generalizations [김영범]
Sensor Fusion Study - Ch7. Kalman Filter Generalizations [김영범]Sensor Fusion Study - Ch7. Kalman Filter Generalizations [김영범]
Sensor Fusion Study - Ch7. Kalman Filter Generalizations [김영범]
 
Kalman filter for Beginners
Kalman filter for BeginnersKalman filter for Beginners
Kalman filter for Beginners
 
Physmed11 u1 1
Physmed11 u1 1Physmed11 u1 1
Physmed11 u1 1
 
Semi-automatic ABC: a discussion
Semi-automatic ABC: a discussionSemi-automatic ABC: a discussion
Semi-automatic ABC: a discussion
 
no U-turn sampler, a discussion of Hoffman & Gelman NUTS algorithm
no U-turn sampler, a discussion of Hoffman & Gelman NUTS algorithmno U-turn sampler, a discussion of Hoffman & Gelman NUTS algorithm
no U-turn sampler, a discussion of Hoffman & Gelman NUTS algorithm
 

Viewers also liked

Viewers also liked (16)

Tailor-made personalization and recommendation - Sailendra
Tailor-made personalization and recommendation - SailendraTailor-made personalization and recommendation - Sailendra
Tailor-made personalization and recommendation - Sailendra
 
Story of the algorithms behind Deezer Flow
Story of the algorithms behind Deezer FlowStory of the algorithms behind Deezer Flow
Story of the algorithms behind Deezer Flow
 
Recommendation @ PriceMinister-Rakuten - Road to personalization
Recommendation @ PriceMinister-Rakuten - Road to personalizationRecommendation @ PriceMinister-Rakuten - Road to personalization
Recommendation @ PriceMinister-Rakuten - Road to personalization
 
Rakuten Institute of Technology Paris
Rakuten Institute of Technology ParisRakuten Institute of Technology Paris
Rakuten Institute of Technology Paris
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
Recommendation @ Meetic
Recommendation @ MeeticRecommendation @ Meetic
Recommendation @ Meetic
 
Pulpix - Video Recommendation at Scale
Pulpix - Video Recommendation at ScalePulpix - Video Recommendation at Scale
Pulpix - Video Recommendation at Scale
 
What can bring library metadata to the web? Trust, links and love
What can bring library metadata to the web? Trust, links and loveWhat can bring library metadata to the web? Trust, links and love
What can bring library metadata to the web? Trust, links and love
 
Preference Elicitation in Mangaki: Is Your Taste Kinda Weird?
Preference Elicitation in Mangaki: Is Your Taste Kinda Weird?Preference Elicitation in Mangaki: Is Your Taste Kinda Weird?
Preference Elicitation in Mangaki: Is Your Taste Kinda Weird?
 
Sequential Learning in the Position-Based Model
Sequential Learning in the Position-Based ModelSequential Learning in the Position-Based Model
Sequential Learning in the Position-Based Model
 
Flexible recommender systems based on graphs
Flexible recommender systems based on graphsFlexible recommender systems based on graphs
Flexible recommender systems based on graphs
 
Recommendation @Deezer
Recommendation @DeezerRecommendation @Deezer
Recommendation @Deezer
 
Using Neural Networks to predict user ratings
Using Neural Networks to predict user ratingsUsing Neural Networks to predict user ratings
Using Neural Networks to predict user ratings
 
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task ...
 
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-InformationMeta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
 
RecsysFR: Criteo presentation
RecsysFR: Criteo presentationRecsysFR: Criteo presentation
RecsysFR: Criteo presentation
 

Similar to New tools from the bandit literature to improve A/B Testing

Increasing and decreasing functions ap calc sec 3.3
Increasing and decreasing functions ap calc sec 3.3Increasing and decreasing functions ap calc sec 3.3
Increasing and decreasing functions ap calc sec 3.3
Ron Eick
 
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Sean Meyn
 

Similar to New tools from the bandit literature to improve A/B Testing (20)

Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)
 
Time series Modelling Basics
Time series Modelling BasicsTime series Modelling Basics
Time series Modelling Basics
 
Cerutti -- TAFA2013
Cerutti -- TAFA2013Cerutti -- TAFA2013
Cerutti -- TAFA2013
 
Bayesian Inference and Uncertainty Quantification for Inverse Problems
Bayesian Inference and Uncertainty Quantification for Inverse ProblemsBayesian Inference and Uncertainty Quantification for Inverse Problems
Bayesian Inference and Uncertainty Quantification for Inverse Problems
 
Recursion Algorithms Derivation
Recursion Algorithms DerivationRecursion Algorithms Derivation
Recursion Algorithms Derivation
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-Likelihoods
 
Convergence of ABC methods
Convergence of ABC methodsConvergence of ABC methods
Convergence of ABC methods
 
Introducing Zap Q-Learning
Introducing Zap Q-Learning   Introducing Zap Q-Learning
Introducing Zap Q-Learning
 
Intro to ABC
Intro to ABCIntro to ABC
Intro to ABC
 
Zap Q-Learning - ISMP 2018
Zap Q-Learning - ISMP 2018Zap Q-Learning - ISMP 2018
Zap Q-Learning - ISMP 2018
 
Secure Domination in graphs
Secure Domination in graphsSecure Domination in graphs
Secure Domination in graphs
 
Increasing and decreasing functions ap calc sec 3.3
Increasing and decreasing functions ap calc sec 3.3Increasing and decreasing functions ap calc sec 3.3
Increasing and decreasing functions ap calc sec 3.3
 
Input analysis
Input analysisInput analysis
Input analysis
 
22 01 2014_03_23_31_eee_formula_sheet_final
22 01 2014_03_23_31_eee_formula_sheet_final22 01 2014_03_23_31_eee_formula_sheet_final
22 01 2014_03_23_31_eee_formula_sheet_final
 
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
MAPE regression, seminar @ QUT (Brisbane)
MAPE regression, seminar @ QUT (Brisbane)MAPE regression, seminar @ QUT (Brisbane)
MAPE regression, seminar @ QUT (Brisbane)
 
Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...
 
Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...
 

More from recsysfr

More from recsysfr (9)

Multi Task DPP for Basket Completion by Romain WARLOP, Fifty Five
Multi Task DPP for Basket Completion by Romain WARLOP, Fifty FiveMulti Task DPP for Basket Completion by Romain WARLOP, Fifty Five
Multi Task DPP for Basket Completion by Romain WARLOP, Fifty Five
 
Building a recommender system with Annoy and Word2Vec by Cristian PEREZ, Kern...
Building a recommender system with Annoy and Word2Vec by Cristian PEREZ, Kern...Building a recommender system with Annoy and Word2Vec by Cristian PEREZ, Kern...
Building a recommender system with Annoy and Word2Vec by Cristian PEREZ, Kern...
 
An Homophily-based Approach for Fast Post Recommendation in Microblogging Sys...
An Homophily-based Approach for Fast Post Recommendation in Microblogging Sys...An Homophily-based Approach for Fast Post Recommendation in Microblogging Sys...
An Homophily-based Approach for Fast Post Recommendation in Microblogging Sys...
 
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - TinycluesPredictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
Predictive quality metrics @ tinyclues - Artem Kozhevnikov - Tinyclues
 
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
Highlights on most interesting RecSys papers - Elena Smirnova, Lowik Chanusso...
 
Injecting semantic links into a graph-based recommender system
Injecting semantic links into a graph-based recommender systemInjecting semantic links into a graph-based recommender system
Injecting semantic links into a graph-based recommender system
 
Recommendations @ Rakuten Group
Recommendations @ Rakuten GroupRecommendations @ Rakuten Group
Recommendations @ Rakuten Group
 
Data-Driven Recommender Systems
Data-Driven Recommender SystemsData-Driven Recommender Systems
Data-Driven Recommender Systems
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 

Recently uploaded

Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
wsppdmt
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
vexqp
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
vexqp
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
vexqp
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
q6pzkpark
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
cnajjemba
 

Recently uploaded (20)

Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 

New tools from the bandit literature to improve A/B Testing

  • 1. New tools from the bandit literature to improve A/B Testing Emilie Kaufmann, joint work with Aur´elien Garivier & Olivier Capp´e RecSys Meetup, March 23rd, 2016 Emilie Kaufmann Improved A/B Testing
  • 3. A/B Testing A way to do A/B Testing: allocate nA users to page A and nB users to page B perform a statistical test of “A better than B” Emilie Kaufmann Improved A/B Testing
  • 4. A/B Testing A way to do A/B Testing: allocate nA users to page A and nB users to page B perform a statistical test of “A better than B” A variant: fully adaptive A/B Testing sequentially choose which version to allocate to each visitor adaptively choose when to stop the experiment § multi-armed bandit model Emilie Kaufmann Improved A/B Testing
  • 5. A/B/C... testing as a Best Arm Identification problem K arms = K probability distributions (νa has mean µa) ν1 ν2 ν3 ν4 ν5 a∗ = argmax a=1,...,K µa For the t-th user, allocate a version (arm) At ∈ {1, . . . , K} observe a feedback Xt ∼ νAt Goal: design a sequential sampling rule: At+1 = Ft(A1, X1, . . . , At, Xt), a stopping rule τ a recommendation rule ˆaτ such that P(ˆaτ = a∗ ) ≥ 1 − δ and τ is as small as possible. Emilie Kaufmann Improved A/B Testing
  • 6. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 7. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 8. PAC algorithms in one-parameter bandit models P = {νµ, µ ∈ I} set of distributions parametrized by their mean Example: Bernoulli, Poisson, Gaussian (known variance) νµ1 , . . . , νµK ∈ PK ⇔ µ = (µ1, . . . , µK ) ∈ IK S = µ ∈ IK : ∃a ∈ {1, . . . , K} : µa > max i=a µi A strategy is δ-PAC (on S) if ∀ν ∈ S, Pν(ˆaτ = a∗ ) ≥ 1 − δ. Emilie Kaufmann Improved A/B Testing
  • 9. PAC algorithms in one-parameter bandit models P = {νµ, µ ∈ I} set of distributions parametrized by their mean Example: Bernoulli, Poisson, Gaussian (known variance) νµ1 , . . . , νµK ∈ PK ⇔ µ = (µ1, . . . , µK ) ∈ IK S = µ ∈ IK : ∃a ∈ {1, . . . , K} : µa > max i=a µi A strategy is δ-PAC (on S) if ∀ν ∈ S, Pν(ˆaτ = a∗ ) ≥ 1 − δ. § What is the optimal sample complexity of a δ-PAC strategy? inf δ-PAC Eµ[τ]? Emilie Kaufmann Improved A/B Testing
  • 10. The optimal sample complexity To answer this question, we need § a lower bound on Eν[τ] for any δ-PAC strategy § a δ-PAC strategy such that Eν[τ] matches this bound State-of-the-art: δ-PAC algorithms for which Eµ[τ] = O H(µ) log 1 δ , H(µ) = 1 (µ2 − µ1)2 + K a=2 1 (µa − µ1)2 [Even Dar et al. 2006, Kalyanakrishnan et al. 2012] § the optimal sample complexity is not identified... Notation: Kullback-Leibler divergence d(µ, µ ) := KL(νµ , νµ ) = EX∼νµ log dνµ dνµ (X) is the KL-divergence between the distributions of mean µ and µ . Emilie Kaufmann Improved A/B Testing
  • 11. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 12. Lower bound A first (easy to interpret) result Theorem [Kaufmann, Capp´e, Garivier 2015] For any δ-PAC algorithm, Eµ[τ] ≥ 1 d(µ1, µ2) + K a=2 1 d(µa, µ1) log 1 2.4δ Emilie Kaufmann Improved A/B Testing
  • 13. Lower bound A first (easy to interpret) result Theorem [Kaufmann, Capp´e, Garivier 2015] For any δ-PAC algorithm, Eµ[τ] ≥ 1 d(µ1, µ2) + K a=2 1 d(µa, µ1) log 1 2.4δ A tighter (non explicit) lower bound Theorem [Kaufmann and Garivier, 2016] Alt(µ) := {λ : a∗(λ) = a∗(µ)}. For any δ-PAC algorithm, Eµ[τ] ≥ T∗ (µ) log 1 2.4δ , where T∗ (µ)−1 = sup w∈ΣK inf λ∈Alt(µ) K a=1 wad(µa, λa) . Emilie Kaufmann Improved A/B Testing
  • 14. A vector of optimal proportions w∗ (µ) := argmax w∈ΣK inf λ∈Alt(µ) K a=1 wad(µa, λa) is unique and represents the optimal proportions of draws: a strategy matching the lower bound should satisfy ∀a ∈ {1, . . . , K}, Eµ[Na(τ)] Eµ[τ] = w∗ a (µ). Na(t) : number of draws of arm a up to time t § we propose an efficient algorithm to compute w∗(µ) Emilie Kaufmann Improved A/B Testing
  • 15. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 16. Sampling rule: Tracking the optimal proportions ˆµ(t) = (ˆµ1(t), . . . , ˆµK (t)): vector of empirical means Introducing Ut = {a : Na(t) < √ t}, the arm sampled at round t + 1 is At+1 ∈    argmin a∈Ut Na(t) if Ut = ∅ (forced exploration) argmax 1≤a≤K [t w∗ a (ˆµ(t)) − Na(t)] (tracking) Lemma Under the Tracking sampling rule, Pµ lim t→∞ Na(t) t = w∗ a (µ) = 1. Emilie Kaufmann Improved A/B Testing
  • 17. Stopping rule: performing statistical tests High values of the Generalized Likelihood Ratio Za,b(t) := log max{λ:λa≥λb} (X1, . . . , Xt; λ) max{λ:λa≤λb} (X1, . . . , Xt; λ) , reject the hypothesis that (µa < µb). We stop when one arm is accessed to be significantly larger than all other arms, according to a GLR Test: τδ = inf {t ∈ N : ∃a ∈ {1, . . . , K}, ∀b = a, Za,b(t) > β(t, δ)} = inf t ∈ N : max a∈{1,...,K} min b=a Za,b(t) > β(t, δ) Chernoff stopping rule [Chernoff 59] Emilie Kaufmann Improved A/B Testing
  • 18. Stopping rule: an alternative interpretation One has Za,b(t) = −Zb,a(t) and, if ˆµa(t) ≥ ˆµb(t), Za,b(t) = Na(t) d ˆµa(t), ˆµa,b(t) + Nb(t) d ˆµb(t), ˆµa,b(t) , where ˆµa,b(t) := Na(t) Na(t)+Nb(t) ˆµa(t) + Nb(t) Na(t)+Nb(t) ˆµb(t). A link with the lower bound max a min b=a Za,b(t) = t × inf λ∈Alt(ˆµ(t)) K a=1 Na(t) t d(ˆµa(t), λa) t T∗(µ) under a “good” sampling strategy (for t large) Emilie Kaufmann Improved A/B Testing
  • 19. An asymptotically optimal algorithm Theorem The Track-and-Stop strategy, that uses the Tracking sampling rule the Chernoff stopping rule with β(t, δ) = log 2(K−1)t δ and recommends ˆaτ = argmax a=1...K ˆµa(τ) is δ-PAC for every δ ∈]0, 1[ and satisfies lim sup δ→0 Eµ[τδ] log(1/δ) = T∗ (µ). Emilie Kaufmann Improved A/B Testing
  • 20. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 21. Optimal sample complexity Two arms, B(µ1) and B(µ2) Eµ[τ] ≥ T∗ (µ) log 1 2.4δ , with T∗ (µ)−1 = sup α∈[0,1] [αd(µ1, αµ1 + (1 − α)µ2)+ (1 − α)d(µ2, αµ1 + (1 − α)µ2)] = d∗(µ1, µ2), d∗(µ1, µ2) = d(µ1, z∗ ) with z∗ defined by d(µ1, z∗ ) = d(µ2, z∗ ) 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 x yz* Emilie Kaufmann Improved A/B Testing
  • 22. Algorithms Track-and-Stop Sampling rule: At+1 = argmax a=1,2 d ˆµa(t), N1(t)ˆµ1(t) + N2(t)ˆµ2(t) N1(t) + N2(t) Stopping rule: stop after t samples if a=1,2 Na(t)d ˆµa(t), N1(t)ˆµ1(t) + N2(t)ˆµ2(t) N1(t) + N2(t) > β(t, δ) Eµ[τ] 1 d∗(µ1, µ2) log 1 δ Emilie Kaufmann Improved A/B Testing
  • 23. Algorithms Uniform sampling (and optimal stopping) Sampling rule: At+1 = t [2] Stopping rule: stop after t samples if a=1,2 Na(t)d ˆµa(t), N1(t)ˆµ1(t) + N2(t)ˆµ2(t) N1(t) + N2(t) > β(t, δ) Eµ[τ] 1 I∗(µ1, µ2) log 1 δ with I∗(µ1, µ2) = d µ1, µ1+µ2 2 + d µ1, µ1+µ2 2 2 . Remark: I∗(µ1, µ2) very close to d∗(µ1, µ2) § uniform sampling is close to optimal Emilie Kaufmann Improved A/B Testing
  • 24. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 25. An optimal algorithm Two arms, N(µ1, σ2 1) and N(µ2, σ2 2) σ1, σ2 known Eµ[τ] ≥ 2(σ2 1 + σ2 2) (µ1 − µ2)2 log 1 2.4δ and w∗(µ) = σ1 σ1 + σ2 ; σ2 σ1 + σ2 § allocate the arms proportionaly to the standard deviations (no uniform sampling if σ1 = σ2) Optimal algorithm: Sampling rule: At+1 = 1 ⇔ N1(t) t < σ1 σ1 + σ2 Stopping rule: stop after t samples if |ˆµ1(t) − ˆµ2(t)| > 2 σ2 1 N1(t) + σ2 2 N2(t) β(t, δ) Emilie Kaufmann Improved A/B Testing
  • 26. Outline 1 Optimal algorithms for best-arm identification Lower bounds The Track-and-Stop strategy 2 A/B Testing Bernoulli distribution Gaussian distribution 3 Practical performance Emilie Kaufmann Improved A/B Testing
  • 27. State-of-the-art algorithms An algorithm based on confidence intervals : KL-LUCB [K., Kalyanakrishnan 13] ua(t) = max {q : Na(t)d(ˆµa(t), q) ≤ β(t, δ)} la(t) = min {q : Na(t)d(ˆµa(t), q) ≤ β(t, δ)} 0 1 771 459 200 45 48 23 sampling rule: At+1 = argmax a ˆµa(t), Bt+1 = argmax b=At+1 ub(t) stopping rule: τ = inf{t ∈ N : lAt (t) > uBt (t)} Emilie Kaufmann Improved A/B Testing
  • 28. State-of-the-art algorithms A Racing-type algorithm: KL-Racing [K., Kalyanakrishnan 13] R = {1, . . . , K} set of remaining arms. r = 0 current round while |R| > 1 r=r+1 draw each a ∈ R, compute ˆµa,r , the empirical mean of the r samples observed sofar compute the empirical best and empirical worst arms: br = argmax a∈R ˆµa,r wr = argmin a∈R ˆµa,r Elimination step: if lbr (r) > uwr (r), eliminate wr : R = R{wr } end Outpout: ˆa the single element in R. Emilie Kaufmann Improved A/B Testing
  • 29. The Chernoff-Racing algorithm R = {1, . . . , K} set of remaining arms. r = 0 current round while |R| > 1 r=r+1 draw each a ∈ R, compute ˆµa,r , the empirical mean of the r samples observed sofar compute the empirical best and empirical worst arms: br = argmax a∈R ˆµa,r wr = argmin a∈R ˆµa,r Elimination step: if (Zbr ,wr (r) > β(r, δ)), or rd ˆµa,r , ˆµa,r + ˆµb,r 2 + rd ˆµb,r , ˆµa,r + ˆµb,r 2 > β(r, δ), eliminate wr : R = R{wr } end Outpout: ˆa the single element in R. Emilie Kaufmann Improved A/B Testing
  • 30. Numerical experiments Experiments on two Bernoulli bandit models: µ1 = [0.5 0.45 0.43 0.4], such that w∗ (µ1) = [0.417 0.390 0.136 0.057] µ2 = [0.3 0.21 0.2 0.19 0.18], such that w∗ (µ2) = [0.336 0.251 0.177 0.132 0.104] In practice, set the threshold to β(t, δ) = log log(t)+1 δ . Track-and-Stop Chernoff-Racing KL-LUCB KL-Racing µ1 4052 4516 8437 9590 µ2 1406 3078 2716 3334 Table : Expected number of draws Eµ[τδ] for δ = 0.1, averaged over N = 3000 experiments. Emilie Kaufmann Improved A/B Testing
  • 31. Take-home message Useful tools for sequential A/B Testing: stop using Sequential Generalized Likelihood Ratio tests sample the arms to match the optimal proportions w∗(µ) ... which can be approximated by uniform sampling for Bernoulli distribution Final remark: Good algorithms for best arm identification are very different for bandit algorithms designed for regret minimization (UCB, Thompson Sampling) Emilie Kaufmann Improved A/B Testing
  • 32. References This talk is based on A. Garivier, E. Kaufmann. Optimal Best Arm Identification with Fixed Confidence, arXiv:1602.04589, 2016 E. Kaufmann, O. Capp´e, A. Garivier. On the Complexity of A/B Testing. COLT, 2014 E. Kaufmann, O. Capp´e, A. Garivier. On the Complexity of Best Arm Identification in Multi-Armed Bandit Models. JMLR, 2015 Emilie Kaufmann Improved A/B Testing