Tutorial 11 (computational advertising)

Computational advertising
Kira Radinsky
Slides based on material from the paper
“Bandits for Taxonomies: A Model-based Approach” by
Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabarti,
Vanja Josifovski, in SDM 2007

The Content Match Problem
[Diagram: Advertisers, Ads DB, Ads]
Ad impression: Showing an ad to a user

[Diagram: Advertisers, Ads DB, Ads]
Ad click: A user click leads to revenue for the ad server and the content provider

[Diagram: Advertisers, Ads DB, Ads]
The Content Match Problem:
Match ads to pages to maximize clicks

[Diagram: Advertisers, Ads DB, Ads]
Maximizing the number of clicks means:
• For each webpage, find the ad with the best Click-Through Rate (CTR)
• but without wasting too many impressions in learning this.

Outline
• Problem
• Background: Multi-armed bandits
• Proposed Multi-level Policy
• Experiments
• Conclusions

Background: Bandits
Bandit “arms” with unknown payoff probabilities p_1, p_2, p_3
Pull arms sequentially so as to maximize the total expected reward
• Estimate payoff probabilities p_i
• Bias the estimation process towards better arms

Background: Bandit Solutions
• Try 1: Greedy solution:
– Compute the sample mean of an arm A by dividing the total reward received from the arm by the number of times the arm has been pulled. At each time step, choose the arm with the highest sample mean.
• Try 2: Naïve solution:
– Pull each arm an equal number of times.
• Epsilon-greedy strategy:
– The arm with the highest sample mean is selected for a proportion 1 − ε of the trials, and another arm is selected at random (with uniform probability) for a proportion ε.
• Many more strategies
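The greedy and epsilon-greedy strategies above can be sketched as a small simulation. This is only an illustrative sketch under stated assumptions: rewards are Bernoulli (click / no click), the function names and the payoff probabilities are hypothetical, unpulled arms are tried once before any sample mean is compared, and the exploration step draws uniformly over all arms (a common variant that may re-pick the current best arm, whereas the description above picks a different one).

```python
import random


def sample_mean(reward_sum, pulls):
    """Sample mean of an arm; unpulled arms get +inf so each is tried once."""
    return reward_sum / pulls if pulls > 0 else float("inf")


def epsilon_greedy(probs, n_pulls, epsilon, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms with payoff probabilities `probs`.

    epsilon = 0 gives the pure greedy policy (Try 1).
    Returns (total_reward, pulls_per_arm).
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    pulls = [0] * n_arms
    reward_sums = [0] * n_arms
    total_reward = 0
    for _ in range(n_pulls):
        if rng.random() < epsilon:
            # Explore: uniform over all arms (assumption: may re-pick the best arm).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: arm with the highest sample mean so far.
            arm = max(range(n_arms),
                      key=lambda a: sample_mean(reward_sums[a], pulls[a]))
        reward = 1 if rng.random() < probs[arm] else 0
        pulls[arm] += 1
        reward_sums[arm] += reward
        total_reward += reward
    return total_reward, pulls


if __name__ == "__main__":
    ctrs = [0.02, 0.05, 0.10]  # hypothetical payoff probabilities
    for eps in (0.0, 0.1):
        reward, pulls = epsilon_greedy(ctrs, n_pulls=10_000, epsilon=eps)
        print(f"epsilon={eps}: reward={reward}, pulls={pulls}")
```

With epsilon = 0 the policy can lock onto whichever arm happens to pay off first, which is one way to see why pure greedy is only a first try; a small positive epsilon keeps sampling the other arms at the cost of some wasted pulls.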

Ad matching as a bandit problem
[Diagram: Webpage 1, Webpage 2, Webpage 3, each shown as a bandit]
Bandit “arms” = ads
~10^6 ads
~10^9 pages

Ad matching as a bandit problem
Content Match = A matrix (webpages × ads)
• Each row is a bandit
• Each cell has an unknown CTR
[Diagram: one row highlighted as one instance of the MAB problem (1 bandit); one cell marked “Unknown CTR”]

Background: Bandits
Bandit Policy
1. Assign priority to each arm
2. “Pull” arm with max priority, and observe reward
3. Update priorities
Steps 1 and 2: Allocation
Step 3: Estimation
[Diagram: three arms labeled Priority 1, Priority 2, Priority 3]
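The three-step policy loop can be sketched in code. The slides do not say which priority rule is used, so this sketch assumes UCB1 (sample mean plus an uncertainty bonus) purely as an illustration; the function names and the Bernoulli reward model are also assumptions.

```python
import math
import random


def ucb1_priority(reward_sum, pulls, t):
    """UCB1 index: sample mean plus an uncertainty bonus; unpulled arms go first."""
    if pulls == 0:
        return float("inf")
    return reward_sum / pulls + math.sqrt(2.0 * math.log(t) / pulls)


def run_priority_policy(probs, n_pulls, seed=0):
    """Run the assign / pull / update loop on Bernoulli arms; returns pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(probs)
    pulls = [0] * n_arms
    reward_sums = [0.0] * n_arms
    for t in range(1, n_pulls + 1):
        # 1. Assign a priority to each arm (allocation).
        priorities = [ucb1_priority(reward_sums[a], pulls[a], t)
                      for a in range(n_arms)]
        # 2. Pull the arm with max priority and observe the reward (allocation).
        arm = max(range(n_arms), key=priorities.__getitem__)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        # 3. Update the statistics that drive the priorities (estimation).
        pulls[arm] += 1
        reward_sums[arm] += reward
    return pulls
```

Any rule that maps an arm's history to a number fits this skeleton; only `ucb1_priority` would change.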

Background: Bandits
Why not simply apply a bandit policy directly to the problem?
• Convergence is too slow: ~10^9 instances of the MAB problem (bandits), with ~10^6 arms per instance (bandit)
• Additional structure is available that can help → Taxonomies

Outline
• Problem
• Proposed Multi-level Policy
• Experiments
• Conclusions

Multi-level Policy
[Diagram: webpages × ads matrix, with classes defined over both the webpages and the ads]
Consider only two levels

Multi-level Policy
[Diagram: matrix whose rows and columns are grouped into the parent classes Apparel, Computers, and Travel; labels: Ad parent classes, Ad child classes, Block, One MAB problem instance (bandit)]
Consider only two levels

Multi-level Policy
[Diagram: matrix whose rows and columns are grouped into the parent classes Apparel, Computers, and Travel; labels: Ad parent classes, Ad child classes, Block, One MAB problem instance (bandit)]
Key idea: CTRs in a block are homogeneous

Multi-level Policy
• CTRs in a block are
homogeneous
– Used in allocation (picking ad for
each new page)
– Used in estimation (updating
priorities after each observation)

Multi-level Policy (Allocation)
[Diagram: a new page (?) goes through the page classifier; the matrix rows and columns are labeled with the classes A, C, T]
• Classify webpage → page class, parent page class
• Run bandit on ad parent classes → pick one ad parent class

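Multi-level Policy (Allocation)
[Diagram: a new page (?) goes through the page classifier; the matrix rows and columns are labeled with the classes A, C, T; one ad is selected]
• Classify webpage → page class, parent page class
• Run bandit on ad parent classes → pick one ad parent class
• Run bandit among cells → pick one ad class
• In general, continue from root to leaf → final ad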
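The top-down allocation steps can be sketched for a single page class as follows. This is a minimal sketch, not the paper's implementation: it assumes UCB1 at both levels, omits the page classifier (one such policy would be kept per page class), and uses a hypothetical taxonomy, hypothetical CTR values, and hypothetical class and function names.

```python
import math
import random


class UCB1:
    """Minimal UCB1 bandit over a fixed list of arm names (assumed priority rule)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.pulls = {a: 0 for a in self.arms}
        self.clicks = {a: 0 for a in self.arms}
        self.t = 0

    def select(self):
        self.t += 1

        def priority(arm):
            n = self.pulls[arm]
            if n == 0:
                return float("inf")  # try every arm once
            return self.clicks[arm] / n + math.sqrt(2.0 * math.log(self.t) / n)

        return max(self.arms, key=priority)

    def update(self, arm, click):
        self.pulls[arm] += 1
        self.clicks[arm] += click


class TwoLevelPolicy:
    """Pick an ad parent class first, then an ad child class inside it."""

    def __init__(self, taxonomy):
        # taxonomy: {ad parent class: [ad child classes]} (hypothetical structure)
        self.parent_bandit = UCB1(taxonomy.keys())
        self.child_bandits = {p: UCB1(c) for p, c in taxonomy.items()}

    def select(self):
        parent = self.parent_bandit.select()
        child = self.child_bandits[parent].select()
        return parent, child

    def update(self, parent, child, click):
        # The parent level sees clicks aggregated over all of its children.
        self.parent_bandit.update(parent, click)
        self.child_bandits[parent].update(child, click)


def simulate(policy, true_ctr, n_rounds, seed=0):
    """Drive the policy against hypothetical per-ad-class CTRs; returns total clicks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rounds):
        parent, child = policy.select()
        click = 1 if rng.random() < true_ctr[child] else 0
        policy.update(parent, child, click)
        total += click
    return total
```

Because the parent-level bandit has only as many arms as there are parent classes and pools the clicks of every child, it needs far fewer pulls to separate a good parent class from a bad one than a flat bandit over all child classes would.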

Multi-level Policy (Allocation)
[Diagram: page classifier; matrix labeled with the classes A, C, T; one ad is selected]
Bandits at higher levels
• use aggregated information
• have fewer bandit arms
→ Quickly figure out the best ad parent class

Multi-level Policy (Estimation)
• CTRs in a block are homogeneous
– Observations from one cell also give information about others in the block
– How can we model this dependence?

• Shrinkage Model
S_cell | CTR_cell ~ Bin(N_cell, CTR_cell)
CTR_cell ~ Beta(Params_block)
where S_cell = # clicks in cell and N_cell = # impressions in cell
All cells in a block come from the same distribution

• Intuitively, this leads to shrinkage of cell CTRs towards block CTRs
E[CTR] = α · Prior_block + (1 − α) · S_cell / N_cell
where E[CTR] is the estimated CTR, Prior_block is the Beta prior (“block CTR”), and S_cell / N_cell is the observed CTR
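The shrinkage formula can be made concrete with the standard Beta-Binomial posterior mean. Assuming the block prior is Beta(a, b), the posterior mean of a cell's CTR is (a + S_cell) / (a + b + N_cell), which matches the weighted form above with Prior_block = a / (a + b) and α = (a + b) / (a + b + N_cell). The sketch below only evaluates that identity; the names `shrunk_ctr`, `prior_a`, and `prior_b` are hypothetical, and how the block parameters are fitted is not covered here.

```python
def shrunk_ctr(clicks, impressions, prior_a, prior_b):
    """Posterior-mean CTR of a cell under an assumed Beta(prior_a, prior_b) block prior.

    Returns (estimate, alpha), where alpha is the weight on the block prior.
    """
    prior_mean = prior_a / (prior_a + prior_b)                    # "block CTR"
    alpha = (prior_a + prior_b) / (prior_a + prior_b + impressions)
    observed = clicks / impressions if impressions > 0 else 0.0  # S_cell / N_cell
    estimate = alpha * prior_mean + (1.0 - alpha) * observed
    return estimate, alpha


if __name__ == "__main__":
    # Hypothetical block prior with mean 0.02, and cells with growing evidence.
    for clicks, impressions in [(0, 0), (5, 100), (500, 10_000)]:
        est, alpha = shrunk_ctr(clicks, impressions, prior_a=2.0, prior_b=98.0)
        print(f"N={impressions}: alpha={alpha:.3f}, estimated CTR={est:.4f}")
```

A cell with no impressions falls back entirely on the block CTR (α = 1), and α shrinks towards 0 as the cell accumulates its own impressions, so well-observed cells are dominated by their observed CTR.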

Outline
• Problem
• Proposed Multi-level Policy
• Experiments
• Conclusions

Experiments [S. Pandey et al. 2007]
Taxonomy structure
Depth 0: Root
Depth 1: 20 nodes
Depth 2: 221 nodes
…
Depth 7: ~7000 leaves
[Figure annotation: use these 2 levels]

Experiments
• Data collected over a 1-day period
• Collected from only one server, under some
other ad-matching rules (not our bandit)
• ~229M impressions
• CTR values have been linearly transformed for
purposes of confidentiality

Experiments (Multi-level Policy)
Multi-level gives much higher #clicks
[Plot: Clicks vs. Number of pulls]

Experiments (Multi-level Policy)
Multi-level gives much better Mean-Squared Error → it has learnt more from its explorations
[Plot: Mean-Squared Error vs. Number of pulls]

Conclusions
• In a CTR-guided system, exploration is a key component
• The short-term penalty for exploration needs to be limited (exploration budget)
• Most exploration mechanisms use a weighted combination of the predicted CTR (average) and the CTR uncertainty (variance)
• Exploration in a reduced-dimensional space: class hierarchy
• Top-down traversal of the hierarchy to determine the class of the ad to show
