In our presentation, we'll explain the difference between simple content A/B testing (also known as a one-tailed test) and the on-page conversion rate optimisation method based on the multi-armed bandit algorithm (also known as a two-tailed test). We also describe the possibilities and principles of the automatic on-page conversion optimisation used in our service, Maxymizely.com
2. One-tailed test vs. Two-tailed test
● A one-tailed test (or simple A/B testing) allows you to determine whether one website variation is better or worse than another, but not both. A direction must be chosen before the test.
● A two-tailed test (or the multi-armed bandit algorithm) allows you to determine whether two website variations differ from one another. A direction does not have to be specified before the test: automatic conversion optimization takes into account the possibility of both a positive and a negative effect.
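To make the one-tailed vs. two-tailed distinction concrete, here is a minimal sketch (our illustration, not part of the Maxymizely service) of a two-proportion z-test in Python using only the standard library; the visitor and conversion counts are made-up example numbers:

```python
from math import sqrt
from statistics import NormalDist

def z_test_pvalues(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: one-tailed vs. two-tailed p-values.

    conv_*: number of conversions; n_*: number of visitors.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)   # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    norm = NormalDist()
    one_tailed = 1 - norm.cdf(z)               # H1: B is better than A (direction fixed upfront)
    two_tailed = 2 * (1 - norm.cdf(abs(z)))    # H1: B differs from A (either direction)
    return z, one_tailed, two_tailed

z, p1, p2 = z_test_pvalues(200, 1000, 250, 1000)
print(f"z = {z:.3f}, one-tailed p = {p1:.4f}, two-tailed p = {p2:.4f}")
```

Note that the two-tailed p-value is twice the one-tailed one here, which is exactly the price of not committing to a direction before the test.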
4. The limitations of simple A/B testing
Despite the fact that A/B testing (also known as a ‘one-armed bandit’ algorithm) is a well-known method of mitigating the impact of personal bias, it has some limitations, such as:
● While the test is running, users see the ‘bad version’ for 50% of the whole testing time.
● The method involves a human factor: deciding when to stop the test and which version is best.
● The method requires many samples and many testing rounds.
5. Two-tailed test (the multi-armed bandit principle)
This is the very multi-armed bandit algorithm used for automatic conversion optimization (also known as machine learning for on-page conversion optimization) :)
7. Machine learning for conversion optimization principle
Plainly speaking, it is a two-tailed testing process that consists of a multiplicity of repeated testing rounds.
Each testing round consists of an exploration phase and an exploitation phase, the combination of which helps to find the best working balance.
Practically speaking, this means that the best-performing page is shown as many times as possible, while the worst-performing pages are almost never shown.
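As an illustration of such exploration/exploitation rounds, here is a toy epsilon-greedy loop in Python (a sketch with made-up conversion rates, not Maxymizely's production algorithm):

```python
import random

def epsilon_greedy_rounds(true_rates, rounds=10000, epsilon=0.1, seed=42):
    """Toy exploration/exploitation loop over page variations.

    true_rates are hypothetical true conversion rates. With probability
    epsilon we explore (show a random variation); otherwise we exploit
    (show the variation with the best observed conversion rate so far).
    """
    rng = random.Random(seed)
    shows = [0] * len(true_rates)
    conversions = [0] * len(true_rates)
    for _ in range(rounds):
        if rng.random() < epsilon:                       # exploration phase
            arm = rng.randrange(len(true_rates))
        else:                                            # exploitation phase
            arm = max(range(len(true_rates)),
                      key=lambda i: conversions[i] / shows[i] if shows[i] else 1.0)
        shows[arm] += 1
        if rng.random() < true_rates[arm]:               # did the visitor convert?
            conversions[arm] += 1
    return shows, conversions

shows, conversions = epsilon_greedy_rounds([0.05, 0.06, 0.12])
print("impressions per variation:", shows)
```

With enough rounds, the variation with the highest true rate ends up receiving the bulk of the impressions, while the weaker ones are shown mostly during exploration.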
8. A short review of the various multi-armed bandit algorithms
● Iterative strategies (dynamic programming, Gittins indices, UCB1). Pros: provably optimal strategies. Cons: limited horizon (a short sequence of steps), computational complexity as the problem scales, very slow convergence.
● Linear reward-action. Pros: dynamic weight update, fast convergence. Cons: sensitive to the initial approximation.
● Heuristic methods (ε-greedy, Softmax, Exp3, etc.). Pros: not sensitive to an increase in the scale of the problem. Cons: more often lead to sub-optimal solutions.
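For a feel of how one of the heuristic methods above works, here is a sketch of Softmax (Boltzmann) traffic weighting in Python; the observed rates and the temperature value are made-up example numbers:

```python
import math

def softmax_weights(observed_rates, temperature=0.1):
    """Softmax (Boltzmann) selection weights over observed conversion rates.

    A lower temperature concentrates traffic on the best-looking variation;
    a higher temperature spreads traffic more evenly (more exploration).
    """
    exps = [math.exp(r / temperature) for r in observed_rates]
    total = sum(exps)
    return [e / total for e in exps]

weights = softmax_weights([0.04, 0.05, 0.08])
print([round(w, 3) for w in weights])  # → [0.278, 0.307, 0.415]
```

Note how the best variation gets the largest share of traffic while the others still receive some, which is what keeps the method exploring.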
9. The multi-armed bandit algorithm
principles used in Maxymizely
To improve the reliability of results, at Maxymizely we use a combination of the epsilon-greedy and linear reward-action methods.
1. The exploration phase. Equal traffic distribution (10% of the whole traffic). During this phase we use the retraining principles taken from the epsilon-greedy algorithm.
2. The exploitation phase. The most successful variations get most of the traffic. Our system detects changes in the conversion of each variation and adjusts its weight according to its probability of winning. We also take into account the speed of weight changes to compensate for errors. During this phase, the algorithm uses the linear reward-action method.
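The two phases above can be sketched as a single traffic-allocation step; the 10%/90% split comes from the slides, while the weight values and the function name are our own illustration:

```python
import random

def choose_variation(weights, explore_share=0.10, rng=random):
    """Two-phase traffic allocation sketch.

    With probability explore_share (the exploration phase), traffic is
    split equally between all variations; the remaining 90% (the
    exploitation phase) is distributed according to each variation's
    current weight, i.e. its estimated probability of winning.
    """
    if rng.random() < explore_share:
        return rng.randrange(len(weights))                    # equal split: exploration
    return rng.choices(range(len(weights)), weights=weights)[0]  # weighted: exploitation

# Hypothetical weights: variation 2 currently looks most likely to win.
counts = [0, 0, 0]
rng = random.Random(0)
for _ in range(10000):
    counts[choose_variation([0.1, 0.2, 0.7], rng=rng)] += 1
print(counts)
```

Over many visits, the strongest variation receives roughly its weight's share of the exploitation traffic plus an equal slice of the exploration traffic.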
10. The linear reward-action principles
The linear reward-action method is based on the principles of PID controllers and consists of two components: the differential component (D) defines the speed of change in the probability of winning, and the proportional-integral component (PI) is the current probability of winning in a test. To evaluate the probability of winning we use a Bayesian approach. A handy Bayesian calculator can help visualise the estimate of the probability that A beats B given the data you have. With a small number of attempts this estimate follows a beta distribution; on large samples it approaches a normal distribution.
Our algorithm retrains up to 4 times a day, providing optimal solutions for the maximization of gained profit. And if our algorithm considers that a variation has a 100% chance of winning, after a while that variation will get 100% of the weight in the 90% exploitation phase in order to maximize your profit.
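Here is a sketch of how such a ‘probability that B beats A’ can be estimated in the Bayesian way, under our own assumption of uniform Beta(1, 1) priors (not necessarily the exact model Maxymizely uses); the counts are made-up example data:

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100000, seed=1):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors.

    The posterior of each conversion rate is Beta(conversions + 1,
    misses + 1): on small samples this is visibly a beta distribution,
    and on large samples it approaches a normal distribution.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(conv_a + 1, n_a - conv_a + 1)
        rate_b = rng.betavariate(conv_b + 1, n_b - conv_b + 1)
        wins += rate_b > rate_a
    return wins / draws

print(prob_b_beats_a(200, 1000, 250, 1000))
```

A probability close to 1 means B is almost certainly the winner, and in a weighting scheme like the one above it would translate into B receiving nearly all of the exploitation traffic.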
11. The limitation of Maxymizely’s machine
learning algorithm
Generally, our multi-armed bandit exceeds the capabilities of plain A/B testing in terms of the speed of the conversion increase. The only limitation you may face when launching this method is the amount of your traffic: the minimum you have to provide for every arm of the multi-armed bandit is 500 unique visitors.
12. A/B testing vs. automatic conversion optimization
● Minimize losses: A/B tests send equal numbers of visitors to pages, no matter how well they perform. Maxymizely's bandit algorithm, on the other hand, keeps learning to send visitors to the best-performing page. In this sense, the bandit algorithm has an incontestable advantage over ordinary A/B tests. Bandits also allow you to distinguish relatively similar performance between two versions of a page.
● Sample size: Multi-armed bandits generally require fewer observations to reach a conclusion at the same level of confidence as an A/B test. The difference lies in the different approaches to defining the ‘winner’; both statistical approaches are equally reliable.
● Easy to set up: Certain conditions in setting up a bandit algorithm are difficult to implement, which is why this method was less popular before modern machine learning practices appeared. While A/B testing has been used for over a century, bandit algorithms were rarely used, mainly because their calculation time is much longer than that of an ordinary A/B test. Today, however, one iteration of a Bayesian bandit takes our computers less than a second, which is why we are seeing a return in popularity for this optimization method. Easy to set up and use!
13. Thank you for your attention!
For more information visit our
website:
http://maxymizely.com/