Scheduling advertisements on a web page to maximize revenue
1. Scheduling advertisements on a web page to maximize revenue
Speaker : Scott
Date : 17/6/2014
Subodha Kumar
Varghese S. Jacob
Chelliah Sriskandarajah
European Journal of Operational Research 173 (2006) 1067–1089
2. Introduction
• The number of Internet users is growing rapidly.
• Advertisement revenue
2002→$6 billion
2003→$7.3 billion
2006→$15.4 billion (prediction)
• Banner advertisements are the major form of web advertising
The most common type is rectangular
• Limited space raises the issue of maximizing revenue
• Three factors are considered
(1) time (2) number (3) size
• The problem is NP-hard.
3. Problem description
• A set of n ads $A = \{A_1, \dots, A_n\}$ competes for space in a given planning horizon.
• Time fraction, access fraction, and ad geometry determine the expected number of impressions of an ad.
Time fraction, $t_i$: the fraction of time for which $A_i$ is displayed.
Access fraction, $a_i$: $a_i = \frac{\text{number of visitors who see } A_i}{\text{total number of visitors}}$.
Geometry is specified by $l_i$, which may represent the length of $A_i$.
The width, W, of all ads is assumed to be the same.
• The length and the width of a rectangular slot are denoted by S and W, respectively.
• An instance, $I_1$, is given by $\{(a_i, t_i, l_i) \mid a_i > 0,\ t_i > 0,\ l_i > 0,\ A_i \in A\}$.
It can be transformed into $I_2$, given by $\{(s_i, w_i) \mid s_i > 0,\ w_i > 0,\ A_i \in A\}$.
Here $s_i$ is the size of $A_i$ and $w_i$ is its frequency; $w_i$ is unrelated to W, the slot width defined above.
• N represents the number of slots, each of size S.
4. Problem description
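• The fullness of any slot j is $f_j = \sum_{A_i \in B_j} s_i$, where $B_j \subseteq A$ is the set of ads assigned to slot j. A schedule is feasible only if $\max_j f_j \le S$.
• Three scenarios in which $I_1$ can be transformed into $I_2$:
Most accesses have very short duration.
Most accesses have long duration.
Each ad has the same geometry and only one ad is displayed at a time.
• The resulting problem is the MAXSPACE problem.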
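The fullness condition on this slide can be checked mechanically. The sketch below is only an illustration: the ad names, sizes and frequencies are made up, and the rules that a scheduled ad appears in exactly $w_i$ slots with at most one copy per slot are assumptions about MAXSPACE that the slide does not spell out.

```python
from collections import Counter

def fullness(slots, size):
    """Return f_j, the summed size of the ads placed in each slot."""
    return [sum(size[ad] for ad in slot) for slot in slots]

def is_feasible(slots, size, freq, S):
    """Check max_j f_j <= S plus the assumed copy rules."""
    if any(f > S for f in fullness(slots, size)):
        return False
    # Assumption: at most one copy of an ad per slot.
    if any(len(set(slot)) != len(slot) for slot in slots):
        return False
    # Assumption: a scheduled ad appears in exactly w_i slots.
    copies = Counter(ad for slot in slots for ad in slot)
    return all(copies[ad] == freq[ad] for ad in copies)

# Hypothetical instance: three ads, three slots of size S = 5.
size = {"A1": 3, "A2": 2, "A3": 4}
freq = {"A1": 2, "A2": 1, "A3": 1}
slots = [["A1", "A2"], ["A1"], ["A3"]]
print(fullness(slots, size))
print(is_feasible(slots, size, freq, S=5))
```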
5. Related literature review
• Yager (1997): a general framework for competitive selection
• Dreze and Zufryden (1997), intern.com Corp. (1998), Kohda and Endo (1996), Marx (1996), Risdel et al. (1998): increasing the effectiveness of web ads
• Aggarwal et al. (1998): optimization of advertisements on web servers
• Adler et al. (2002): SUBSET-LSLF
6. Heuristic algorithm: Integer programming formulation
$\max Z = \sum_{j=1}^{N} \sum_{i=1}^{n} s_i x_{ij}$
subject to
$\sum_{i=1}^{n} s_i x_{ij} \le S, \quad j = 1, 2, \dots, N$
$\sum_{j=1}^{N} x_{ij} = w_i y_i, \quad i = 1, 2, \dots, n$
$x_{ij} = 1$ if ad $A_i$ is assigned to slot $j$, and 0 otherwise
$y_i = 1$ if ad $A_i$ is selected, and 0 otherwise
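For very small instances the formulation above can be solved by exhaustive search, which makes the roles of $x_{ij}$ and $y_i$ concrete. This is a minimal sketch and not the authors' method: `solve_maxspace` is a hypothetical helper, and it relies on the fact that, under the second constraint, the objective equals the sum of $s_i w_i$ over the selected ads.

```python
from itertools import combinations

def solve_maxspace(sizes, freqs, N, S):
    """Exhaustively maximise sum_i s_i * w_i * y_i (tiny instances only)."""
    n = len(sizes)

    def place(ads, k, fill):
        # Try to put all w_i copies of ads[k:] into distinct slots.
        if k == len(ads):
            return True
        i = ads[k]
        free = [j for j in range(N) if fill[j] + sizes[i] <= S]
        for chosen in combinations(free, freqs[i]):
            for j in chosen:
                fill[j] += sizes[i]
            if place(ads, k + 1, fill):
                return True
            for j in chosen:  # undo and try another set of slots
                fill[j] -= sizes[i]
        return False

    best_value, best_set = 0, ()
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):  # candidate y vector
            value = sum(sizes[i] * freqs[i] for i in subset)
            if value > best_value and place(subset, 0, [0] * N):
                best_value, best_set = value, subset
    return best_value, best_set

# Hypothetical instance: 3 ads, N = 3 slots of size S = 5.
print(solve_maxspace([3, 2, 4], [2, 1, 1], N=3, S=5))
```

The search is exponential in n, which is consistent with the problem being NP-hard and with the slides turning to heuristics.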
7. Heuristic algorithm: SUBSET-LSLF
Input: $s_i$, $w_i$, $i = 1, 2, \dots, n$
N : number of slots
S : size of each slot
$s = \{A_i \mid s_i = S\}$
$\bar{s} = \{A_i \mid s_i < S\}$
$B_s = \sum_{A_i \in s} s_i w_i$
$B_{\bar{s}} = \sum_{A_i \in \bar{s}} s_i w_i$
If $B_s \ge B_{\bar{s}}$
Sort ads in $s$ in order of frequency
Sort ads in $\bar{s}$ by size
If $B_s < B_{\bar{s}}$
Sort ads in $\bar{s}$ by size
Sort ads in $s$ in order of frequency
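A rough sketch of the partition-and-sort step on this slide follows. Several details are assumptions rather than anything the slide states: both sorts are taken as descending, the set with the larger volume is processed first, and ads are then placed on the least full slots and skipped if a copy does not fit (a reading of the name "Largest Size Least Full").

```python
def subset_lslf(sizes, freqs, N, S):
    """Order ads as on the slide, then place them greedily (assumed rule)."""
    full = [i for i in range(len(sizes)) if sizes[i] == S]   # s_i = S
    part = [i for i in range(len(sizes)) if sizes[i] < S]    # s_i < S

    def volume(ads):
        return sum(sizes[i] * freqs[i] for i in ads)

    full.sort(key=lambda i: freqs[i], reverse=True)  # assumed: descending
    part.sort(key=lambda i: sizes[i], reverse=True)  # assumed: descending
    order = full + part if volume(full) >= volume(part) else part + full

    fill = [0] * N
    scheduled = []
    for i in order:
        # Assumed "least full" rule: try the w_i emptiest slots.
        target = sorted(range(N), key=lambda j: fill[j])[:freqs[i]]
        fits = all(fill[j] + sizes[i] <= S for j in target)
        if len(target) == freqs[i] and fits:
            for j in target:
                fill[j] += sizes[i]
            scheduled.append(i)
    return scheduled, fill

# Hypothetical instance: 4 ads, N = 3 slots of size S = 5.
print(subset_lslf([5, 3, 2, 2], [1, 2, 1, 2], N=3, S=5))
```

In this made-up instance the full-size ad is dropped because the smaller ads have the larger total volume and are placed first.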
8. Heuristic algorithm: Largest Size Most Fill (LSMF)
$C_L = \max\left\{ \frac{1}{N} \sum_{i=1}^{n} s_i,\ \max_{1 \le i \le n} s_i \right\}$
$C_U = 2 C_L$
I = 1, K = 10
$C = \frac{C_L(I) + C_U(I)}{2}$
If FFD(C) ≤ N
I = I + 1
CU(I) = C
CL(I) = CL(I-1)
ELSE
I = I + 1
CU(I) = CU(I-1)
CL(I) = C
If I ≤ K, recalculate C and repeat
Algorithm SUBSET-LSMF
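The capacity search on this slide is a bisection driven by First Fit Decreasing (FFD). The sketch below assumes that FFD(C) means the number of slots FFD needs when each slot has capacity C, and it treats every item as a single copy, so frequencies are ignored; `ffd_bins` and `search_capacity` are hypothetical names.

```python
def ffd_bins(sizes, capacity):
    """First Fit Decreasing: number of bins of the given capacity used."""
    bins = []
    for s in sorted(sizes, reverse=True):
        for j, load in enumerate(bins):
            if load + s <= capacity:
                bins[j] += s
                break
        else:  # no existing bin can take the item: open a new one
            bins.append(s)
    return len(bins)

def search_capacity(sizes, N, K=10):
    """Bisect between C_L and C_U = 2 * C_L for K iterations."""
    c_low = max(sum(sizes) / N, max(sizes))
    c_up = 2 * c_low
    for _ in range(K):
        c = (c_low + c_up) / 2
        if ffd_bins(sizes, c) <= N:
            c_up = c   # C suffices for N slots: tighten the upper bound
        else:
            c_low = c  # C is too small: raise the lower bound
    return c_up

# Hypothetical instance: five items, N = 2 slots.
print(search_capacity([4, 3, 3, 2, 2], N=2))
```

With K = 10 the gap between the two bounds shrinks by a factor of $2^{10}$, so the returned capacity is close to the smallest one for which FFD fits everything into N slots.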
9. Heuristic algorithm: Largest Size Most Fill (LSMF)
1 SUBSET-LSMF();
2 If C ≤ S End;
3 Calculate the values of $B_i$ for all ads; Sort the ads by $B_i$ in ascending order;
4 k=1; i=1; Schedule-={𝐵++𝑖(𝑠++𝑖)}; Discard(k) = 𝑠𝑖; SUBSET-LSMF();
5 If C ≤ S
6 If k = 1 End;
7 Else
8 Schedule+={Discard(k-1)}; SUBSET-LSMF();
9 If C > S Schedule-={Discard(k-1)}; Else k-=1;
10 Else
11 k+=1; Schedule-={𝐵++𝑖(𝑠++𝑖)}; Discard(k) = 𝑠𝑖; SUBSET-LSMF(); GOTO 5;
Algorithm LSMF
10. Heuristic algorithm: A genetic algorithm
• For MAXSPACE problems, GA views sequences of ads as chromosomes.
• A simple GA is usually composed of three operations.
Selection
Crossover
Mutation
• A design of experiments (DOE) approach was devised.
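The three operations can be illustrated on chromosomes that are sequences of ad indices. The slide does not name the operators, so the order crossover, swap mutation and elite selection below are assumptions chosen because they keep each chromosome a valid permutation; all function names are hypothetical.

```python
import random

def order_crossover(p1, p2, rng):
    """Copy a slice of p1, then fill the rest in the order ads appear in p2."""
    a, b = sorted(rng.sample(range(len(p1) + 1), 2))
    middle = p1[a:b]
    rest = [ad for ad in p2 if ad not in middle]
    return rest[:a] + middle + rest[a:]

def swap_mutation(seq, rng, p_m):
    """With probability p_m, swap two randomly chosen positions."""
    seq = list(seq)
    if rng.random() < p_m:
        i, j = rng.sample(range(len(seq)), 2)
        seq[i], seq[j] = seq[j], seq[i]
    return seq

def select_elite(population, fitness, eps):
    """Keep the best eps fraction of sequences (at least one)."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:max(1, int(eps * len(ranked)))]

rng = random.Random(0)  # fixed seed so the run is repeatable
child = order_crossover([0, 1, 2, 3, 4], [4, 3, 2, 1, 0], rng)
print(child, swap_mutation(child, rng, p_m=0.10))
```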
11. Heuristic algorithm: A genetic algorithm
1 Assign ads to any slots with the principle
2 Calculate fitness value for each sequence; Sort all the sequences in descending order of fitness
3 Select ε for reproduction
4 k=0; Select 2 parents and cross them over; k+=1;
5 Mutate the children
6 Estimate the fitness of the children
7 If $k < \frac{ps}{2} - 0.5$, GOTO Line 4 (without resetting k);
8 If i = 0, the overall best sequence = the current best sequence; GOTO Line 10;
9 If the overall best sequence < the current best sequence, the overall best sequence = the current best sequence;
10 i+=1; If i = $n_{gen}$, END; Else GOTO Line 2;
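Lines 1 and 2 of the pseudocode turn a sequence of ads into a schedule and score it. The slide does not say which placement principle is used, so the sketch below assumes a first-fit rule (an ad is kept only if all $w_i$ copies fit in distinct slots) and takes the fitness to be the total occupied space; `decode` and `fitness` are hypothetical names.

```python
def decode(sequence, sizes, freqs, N, S):
    """Place ads in sequence order; keep an ad only if all w_i copies fit."""
    fill = [0] * N
    for i in sequence:
        # Assumed first-fit rule: the first w_i slots with enough room.
        slots = [j for j in range(N) if fill[j] + sizes[i] <= S][:freqs[i]]
        if len(slots) == freqs[i]:
            for j in slots:
                fill[j] += sizes[i]
    return fill

def fitness(sequence, sizes, freqs, N, S):
    """Total occupied space of the decoded schedule."""
    return sum(decode(sequence, sizes, freqs, N, S))

# Hypothetical instance: 3 ads, N = 2 slots of size S = 5.
sizes, freqs = [3, 3, 2], [2, 1, 2]
print(fitness([0, 1, 2], sizes, freqs, N=2, S=5))
print(fitness([1, 0, 2], sizes, freqs, N=2, S=5))
```

The two sequences contain the same ads but decode to different schedules, which is why searching over sequences is meaningful.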
12. Heuristic algorithm: Hybrid GA
• The overall process is largely the same as the GA.
• The fitness value of each sequence is evaluated three times: with GA, LSMF, and SUBSET-LSLF.
13. Computational studies
• 190 randomly generated problems with limitations (40 small and 150 large)
• Four algorithms were programmed in C.
• Parameter settings chosen for the GA
Set #  No. of slots (N)  Elite fraction (ε)  Population size (ps)  Probability of crossover (p_c)  Probability of mutation (p_m)
1      10                0.25                75                    0.95                            0.10
2      25                0.25                75                    0.75                            0.05
3      50                0.25                75                    0.60                            0.05
4      75                0.25                200                   0.75                            0.01
5      100               0.25                200                   0.75                            0.01
14. Computational studies: Comparison of results for the small size problems
• 40 problems
Comparison of results for the small size problems with known optimal values
Prob.   No. of     Size of each  %SUBSET-LSLF gap    %LSMF gap          %GA gap            %Hybrid GA gap
Set #   slots (N)  slot (S)      Max    Avg   Min    Max   Avg  Min     Max  Avg   Min     Max  Avg  Min
1       5          5             13.04  1.72  0      24    7    0       0    0     0       0    0    0
2       5          10            15.79  6.39  0      28    13   0       0    0     0       0    0    0
3       10         10            16.00  3.40  0      8     3.1  0       0    0     0       0    0    0
4       10         15            14.09  3.99  0      11.3  5.0  1.3     3.4  0.81  0       0    0    0
15. Computational studies: Comparison of results for the small size problems
• 40 problems
Comparison of results for the small size problems with known optimal values
Prob.   No. of     Size of each  %Imp in Avg % gap over SUBSET-LSLF
Set #   slots (N)  slot (S)      LSMF     GA     Hybrid GA
1       5          5             -306.9   100    100
2       5          10            -103.4   100    100
3       10         10            8.82     100    100
4       10         15            -25.3    79.7   100
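The improvement figures appear to follow from the average gaps on the previous slide as $100 \times (g_{\text{SUBSET-LSLF}} - g_{\text{other}}) / g_{\text{SUBSET-LSLF}}$. The slides do not state this formula, so it is an inference; the short check below reproduces the table to within rounding. A negative value means the average gap got worse than SUBSET-LSLF's.

```python
def improvement(base_gap, new_gap):
    """Percentage improvement of new_gap over base_gap (assumed formula)."""
    return 100.0 * (base_gap - new_gap) / base_gap

# Average % gaps from the previous slide:
# set -> (SUBSET-LSLF, LSMF, GA, Hybrid GA)
avg_gap = {
    1: (1.72, 7.0, 0.0, 0.0),
    2: (6.39, 13.0, 0.0, 0.0),
    3: (3.40, 3.1, 0.0, 0.0),
    4: (3.99, 5.0, 0.81, 0.0),
}

for k, (lslf, lsmf, ga, hybrid) in avg_gap.items():
    row = [round(improvement(lslf, g), 2) for g in (lsmf, ga, hybrid)]
    print(k, row)
```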
16. Computational studies: Comparison of results for the large size problems
• 150 problems
• The results generated by the three algorithms are compared with the upper bounds calculated by CPLEX.
• For most of the test problems, GA and LSMF both provide improvements over SUBSET-LSLF.
17. Case study
• The dataset was obtained by observing the ads on ValuePay’s Piggy Adbar.
• Ads on the Adbar are updated periodically.
Ads change every 20 seconds
The planning horizon is 180 slots (one hour at 20 seconds per slot)
Two banners: one is 468×60, the other is 120×60
• Assuming unit size = 12
• 33 different ads were displayed during the hour.
• To bring the test closer to practice, 15 randomly generated ads were added to the existing list.
Four sets were generated.
• The price of an ad was determined by the CPM model.
• Total revenue is based on $\sum_{A_i \in A'} s_i w_i$, priced per 1000 under the CPM model.
19. Conclusions and future research directions
• Growing business on the Internet
• The optimal utilization of space
• Efficient heuristics were designed.
• The LSMF was proposed and the hybrid GA was designed.
• The hybrid GA provided the optimal solutions for all the small test problems with known optimal values.
• Revenue may increase in different situations
• Future work could extend the study to other emerging pricing models.
• Comparing different pricing models can also be considered.
20. Comment
• Similar symbols with different meanings (for example, w and W) are confusing.
• In section 6, the authors state that there will usually be many more ads competing for space, but they merely assert this rather than providing concrete supporting evidence.
• Many websites mentioned in the paper have changed the way they display web pages or have even been abandoned, such as ValuePay’s Piggy.