By Alexandra Kulachikova – Yandex
Machine learning is not magic; it is a tool that can deliver real results today. In this session, Alexandra will show a few cases of how real businesses use machine learning to find new clients across devices at each stage of their paths and to predict their conversions. During the presentation she will provide a step-by-step guide, so that everyone can try to build their own prediction models.
Machine Learning In Real Life: Gather. Unite. Predict
2. Machine learning in real life: gather, unite, predict
Alexandra Kulachikova, Head of Yandex.Metrica promotion
3. Leading web analytics product
› Millions of clients
› #2 analytics tool by domain share (w3techs.com, datanyze.com)
› International product
Top Countries
Russia
United States
Turkey
Ukraine
Germany
India
Brazil
17. ML-based predictions for a DIY online store
220 Volt – one of the top online stores in Russia
› predict the probability of conversion knowing all of the customer’s history
› use predictions for analysis and advertising
18. Say “no” to attribution modelling
› Prediction works on the user level regardless of attribution modelling
20. Tools
› extract raw data from Yandex.Metrica with the Logs API
› use ClickHouse to calculate customer features
› use machine learning algorithms to build a model
21. More than 60 characteristics
User characteristics: device, browser, region
Behavioral: traffic sources, revenue, last visit date, etc.
Example visitor profile (figure labels): designer, loves heavy metal, 28 years old, iPhone, likes coffee, Yandex.Browser, heavy site visitor, wants to buy a drill, woman
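The behavioral characteristics above are per-user aggregates over raw visits. The deck computes them in ClickHouse; the following is only a plain-Python stand-in for that step, and the record layout, field names, and sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical visit records: (user_id, visit_date, revenue, is_bounce).
# The layout and the values are made up for illustration only.
VISITS = [
    ("u1", date(2018, 3, 1), 0.0, True),
    ("u1", date(2018, 3, 10), 120.0, False),
    ("u2", date(2018, 2, 20), 0.0, False),
]


def build_features(visits, today):
    """Aggregate raw visits into one feature dict per user."""
    by_user = defaultdict(list)
    for user_id, visit_date, revenue, is_bounce in visits:
        by_user[user_id].append((visit_date, revenue, is_bounce))

    features = {}
    for user_id, rows in by_user.items():
        dates = [row[0] for row in rows]
        features[user_id] = {
            "days_since_last_visit": (today - max(dates)).days,
            "days_since_first_visit": (today - min(dates)).days,
            "visits": len(rows),
            "non_bounce_visits": sum(1 for row in rows if not row[2]),
            "revenue": sum(row[1] for row in rows),
        }
    return features


FEATURES = build_features(VISITS, today=date(2018, 3, 15))
```

Each resulting dict is one row of the training table; a real pipeline would add the remaining characteristics (device, browser, region, traffic sources) in the same way.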
23. Everything matters
Characteristic Importance
Days since last visit 0.1445
Days since first visit 0.1041
Number of items viewed 0.0777
Avg. time on site 0.0771
Avg. depth on site 0.0701
Revenue 0.0470
Number of non-bounce visits 0.0395
Region 0.0392
Number of visits 0.0340
Number of visits from advertising 0.0259
24. Never be in a hurry when you are training a model
› Best algorithm – XGBoost
› Quality metric: ROC AUC ~0.9
› Old data is new data
› Triple-check before you start
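For readers unfamiliar with the quality metric named on this slide, ROC AUC can be read as the probability that a randomly chosen buyer receives a higher score than a randomly chosen non-buyer. The sketch below computes it directly from that definition; it is an illustration with made-up labels and scores, not the speaker's evaluation code, and it does not involve XGBoost itself.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up data: 1 = the user purchased, scores = predicted probabilities.
LABELS = [1, 0, 1, 0, 0]
SCORES = [0.9, 0.2, 0.4, 0.5, 0.1]
AUC = roc_auc(LABELS, SCORES)  # 5 of the 6 pairs are ranked correctly
```

A value of 0.5 means the ranking is no better than chance and 1.0 means every buyer is ranked above every non-buyer. The pairwise loop is quadratic, so it is only suitable for small illustrative samples.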
25. Bitter truth
More than 80% of visitors have a purchase probability below 5%
Chart: % of users by probability to purchase
26. Divide visitors by probability
› Excellent: probability >= 50%
› Good: 15% <= probability < 50%
› Normal: 5% <= probability < 15%
› Bad: probability < 5%
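The four groups above translate into a simple threshold function. This is a minimal sketch; the function name is hypothetical and the probability is assumed to be a fraction between 0 and 1.

```python
def segment(probability):
    """Map a predicted purchase probability (0..1) to the slide's group name."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.50:
        return "Excellent"
    if probability >= 0.15:
        return "Good"
    if probability >= 0.05:
        return "Normal"
    return "Bad"


# Example: label a few made-up predictions.
GROUPS = [segment(p) for p in (0.72, 0.20, 0.07, 0.01)]
```

Checking the highest threshold first keeps each boundary value in the upper group, which matches the ">=" on the lower bound of every range on the slide.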
27. Daily update
› Recalculate and predict the probability for each user every day
› Use the user parameters data upload to keep the data up to date
28. Excellent conversion of excellent visitors
The CR of the Excellent group is about 5 times higher than the average CR
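The comparison above is a ratio of conversion rates. The sketch below shows the arithmetic; the visitor and purchase counts are invented so that the ratio comes out near 5 and are not the store's actual figures.

```python
def conversion_rate(purchases, visitors):
    """Share of visitors who made a purchase."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return purchases / visitors


# Invented counts per group: (purchases, visitors). Not data from the deck.
COUNTS = {
    "Excellent": (50, 500),
    "Everyone else": (150, 9500),
}

total_purchases = sum(p for p, _ in COUNTS.values())
total_visitors = sum(v for _, v in COUNTS.values())

EXCELLENT_CR = conversion_rate(*COUNTS["Excellent"])
AVERAGE_CR = conversion_rate(total_purchases, total_visitors)
LIFT = EXCELLENT_CR / AVERAGE_CR  # how many times the group CR exceeds the average
```

Note that the average CR here is computed over all visitors, including the Excellent group itself.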
29. Let’s run an experiment
› Remarketing
› Bid corrections
› A/B tests
31. Results: rise of revenue
Retargeting
› Revenue grew by 96%
› Conversion grew by 25%
› Costs grew by 26%
Advertising network
› Revenue grew by 31%
› Conversion hasn’t changed
› Costs haven’t changed
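One way to compare the two channels is to combine the revenue and cost changes into a single figure: the change in revenue earned per unit of cost. This derived metric is not on the slide; it is a sketch that only assumes the reported growth rates are relative to the same baseline period.

```python
def revenue_per_cost_change(revenue_growth, cost_growth):
    """Relative change in revenue per unit of cost.

    Both arguments are fractional growth rates, e.g. 0.96 for +96%.
    """
    return (1.0 + revenue_growth) / (1.0 + cost_growth) - 1.0


# Growth rates reported on the slide.
RETARGETING = revenue_per_cost_change(0.96, 0.26)  # 1.96 / 1.26 - 1
AD_NETWORK = revenue_per_cost_change(0.31, 0.0)    # 1.31 / 1.00 - 1
```

Under that assumption, retargeting earns roughly 56% more revenue per unit of cost than before and the advertising network roughly 31% more.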
32. Where to use
› Analysis: traffic source performance and user behavior
› Advertising: bids, remarketing, look-alike
› Direct marketing
33. The recipe for a magic ML potion
Big Data + Machine learning
› free tools only: YM Logs API + ClickHouse + Python + Pandas + XGBoost
Bright mind and skilled hands :)