2. cleverdata.ru | info@cleverdata.ru
International market business development since 2012
One of the three leading IT companies in Russia
43 branches in Russia and abroad
5,500+ employees
100K projects for 10K customers
Innovative data management platform (Data Exchange Service)
Cloud Service
In-house development
Internet advertising solutions
Data Management Platforms
Customer Base Management
Web Analytics
Marketing automation
Big Data
Data Mining
Digital Intelligence
Operational Intelligence
Low Latency and NoSQL
Cloud Computing
5. cleverdata.ru | info@cleverdata.ru
TRACKING DATA
publishers
COOKIE SYNCs
ACCESS LOGS
PARTNER’S DATA
3rd PARTY DATA
CLICK STREAMS
advertisers
SSP
DSP
DMP
Data Management Platform (DMP)
6. cleverdata.ru | info@cleverdata.ru
Typical data flows
Components: 3rd party data (three sources), Raw Data Store & Processing, RealTime Data Store, Relational Data Store
Flows: raw data, user profiles, aggregates
7. cleverdata.ru | info@cleverdata.ru
Typical data flows :: RTB
Components: 3rd party data (three sources), SSP, Exchange, RTB SRV, Raw Data Store & Processing, RealTime Data Store, Relational Data Store
Flows: bid req., bid resp., pixels :: impressions :: clicks, bid requests, raw data, user profiles, aggregates
8. cleverdata.ru | info@cleverdata.ru
1st-party data
Components: 1st-party data, 3rd party data (three sources), SSP, Exchange, RTB SRV, Raw Data Store & Processing, RealTime Data Store, Relational Data Store
Flows: bid req., bid resp., pixels :: impressions :: clicks, bid requests, raw data, user profiles, aggregates
10. cleverdata.ru | info@cleverdata.ru
Why monetize?
Find all users who
took part in the “Star Wars” advertising campaign [and]
saw one of the banners “Darth Vader” or “Luke Skywalker”
within the last 6 days [and]
clicked on that banner [and]
visited the purchase page for Darth Vader's lightsaber [and]
did not buy anything in the end
In order to
retarget them with a personalized banner offering
a 40% discount on the lightsaber
11. cleverdata.ru | info@cleverdata.ru
How to monetize?
find all users who have
taken part in campaign[s] “Star Wars” [and]
viewed banner[s] “Darth Vader” or “Luke Skywalker”
during [last] 6 day[s] [and]
clicked banner[s] “Darth Vader's lightsaber” [and]
visited buying area of “Darth Vader's lightsaber” [and]
not visited order confirmed area of “Darth Vader's lightsaber”
Event types: [impression], [click], [tr. pixel], [tr. pixel]; users: [cookies]
id | cookie | event_id | event_type | campaign_id | timestamp | …
1 | c1 | “Darth Vader” | impression | “Star Wars” | 2015-04-20 14:25:11.462 | …
2 | c1 | “Darth Vader's lightsaber” | click | “Star Wars” | 2015-04-21 06:31:12.157 | …
3 | c1 | “Darth Vader's lightsaber” | tr. pixel | “Star Wars” | 2015-04-22 18:57:19.628 | …
12. cleverdata.ru | info@cleverdata.ru
How to monetize?
find all users who have
(0) taken part in campaign[s] “Star Wars”
(1) viewed banner[s] “Darth Vader” or “Luke Skywalker” during [last] 6 day[s]
(2) clicked banner[s] “Darth Vader's lightsaber”
(3) visited buying area of “Darth Vader's lightsaber”
(4) not visited order confirmed area of “Darth Vader's lightsaber”
map: (c1, 0), (c1, 1), (c1, 2), (c1, 3), Ø
reduce: (c1, 0;1;2;3)
true(0) and true(1) and true(2) and true(3) and not false(4)
result: C1
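The map and reduce steps on this slide can be sketched in plain Python (no Hadoop or Spark involved). This is only an illustration under stated assumptions: the `area` field that separates the buying area from the order confirmed area, the `NOW` reference time, and the second cookie `c2` are hypothetical and do not come from the slides; the `c1` rows follow the example table.

```python
from collections import defaultdict
from datetime import datetime, timedelta

NOW = datetime(2015, 4, 23)   # assumed "current" time for the 6-day window
WINDOW = timedelta(days=6)
SABER = "Darth Vader's lightsaber"

def event(cookie, event_id, event_type, ts, area=None):
    # "area" is a hypothetical field; the slides do not show how areas are encoded.
    return {"cookie": cookie, "event_id": event_id, "event_type": event_type,
            "campaign_id": "Star Wars", "timestamp": ts, "area": area}

EVENTS = [
    # c1: the three rows of the example table.
    event("c1", "Darth Vader", "impression", datetime(2015, 4, 20, 14, 25, 11)),
    event("c1", SABER, "click", datetime(2015, 4, 21, 6, 31, 12)),
    event("c1", SABER, "tr. pixel", datetime(2015, 4, 22, 18, 57, 19), "buying"),
    # c2: hypothetical user who completed the order (illustration only).
    event("c2", "Luke Skywalker", "impression", datetime(2015, 4, 20, 10, 0, 0)),
    event("c2", SABER, "click", datetime(2015, 4, 21, 11, 0, 0)),
    event("c2", SABER, "tr. pixel", datetime(2015, 4, 21, 11, 5, 0), "buying"),
    event("c2", SABER, "tr. pixel", datetime(2015, 4, 21, 11, 10, 0), "order confirmed"),
]

# Conditions 0..4 of the query, each a predicate over a single event.
CONDITIONS = [
    lambda e: e["campaign_id"] == "Star Wars",
    lambda e: (e["event_type"] == "impression"
               and e["event_id"] in ("Darth Vader", "Luke Skywalker")
               and NOW - WINDOW <= e["timestamp"] <= NOW),
    lambda e: e["event_type"] == "click" and e["event_id"] == SABER,
    lambda e: (e["event_type"] == "tr. pixel" and e["event_id"] == SABER
               and e["area"] == "buying"),
    lambda e: (e["event_type"] == "tr. pixel" and e["event_id"] == SABER
               and e["area"] == "order confirmed"),
]

def map_event(e):
    """Map: emit (cookie, condition index) for every condition the event satisfies."""
    for index, predicate in enumerate(CONDITIONS):
        if predicate(e):
            yield e["cookie"], index

def reduce_cookie(matched):
    """Reduce: true(0) and true(1) and true(2) and true(3) and not (4)."""
    return {0, 1, 2, 3} <= matched and 4 not in matched

def find_segment(events):
    grouped = defaultdict(set)          # stands in for the shuffle phase
    for e in events:
        for cookie, index in map_event(e):
            grouped[cookie].add(index)
    return sorted(c for c, matched in grouped.items() if reduce_cookie(matched))

print(find_segment(EVENTS))
```

Only `c1` is selected: `c2` satisfies condition 4 as well, so the negated term fails for that cookie.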
24. cleverdata.ru | info@cleverdata.ru
MR vs Spark :: Secondary Sort
MR
setSortComparatorClass
setGroupingComparatorClass
setPartitionerClass
Spark
repartitionAndSortWithinPartitions
mapPartitions
The result of processing an entire partition
must fit in memory
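The secondary sort pattern itself can be illustrated without either framework. The sketch below is a plain Python emulation, not the Spark or Hadoop API: `partition_of` plays the role of the partitioner (setPartitionerClass), sorting by the composite key `(cookie, timestamp)` plays the role of the sort comparator (setSortComparatorClass), and grouping consecutive records by cookie plays the role of the grouping comparator (setGroupingComparatorClass), which is roughly what repartitionAndSortWithinPartitions followed by mapPartitions achieves. All function names and the integer timestamps are assumptions made for the example.

```python
from itertools import groupby
from zlib import crc32

def partition_of(cookie, num_partitions):
    # Partition by the natural key only, so all events of a cookie land together.
    return crc32(cookie.encode("utf-8")) % num_partitions

def repartition_and_sort(records, num_partitions):
    """Distribute (cookie, timestamp, event_type) records, then sort each partition."""
    partitions = [[] for _ in range(num_partitions)]
    for cookie, timestamp, event_type in records:
        key = (cookie, timestamp)       # composite key: natural key + sort field
        partitions[partition_of(cookie, num_partitions)].append((key, event_type))
    for partition in partitions:
        partition.sort(key=lambda kv: kv[0])
    return partitions

def event_sequences(partition):
    """Walk one sorted partition and yield each cookie with its time-ordered events."""
    for cookie, group in groupby(partition, key=lambda kv: kv[0][0]):
        yield cookie, [event_type for _, event_type in group]

def sequences_by_cookie(records, num_partitions=2):
    result = {}
    for partition in repartition_and_sort(records, num_partitions):
        result.update(event_sequences(partition))
    return result

records = [
    ("c1", 3, "tr. pixel"),
    ("c2", 1, "impression"),
    ("c1", 1, "impression"),
    ("c1", 2, "click"),
]
print(sequences_by_cookie(records))
```

Because each partition arrives already sorted, the per-cookie logic only needs a single pass over the records instead of sorting values inside a reducer.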
25. cleverdata.ru | info@cleverdata.ru
MR vs Spark :: Testing
MR
MRUnit
o.a.h.h.MiniDFSCluster
o.a.h.m.MiniMRCluster
o.a.h.y.s.MiniYARNCluster
o.a.h.m.v2.MiniMRYarnCluster
Spark
Local executor
26. cleverdata.ru | info@cleverdata.ru
What's next, and why Spark?
• Spark Streaming;
• Micro Batches;
• λ-architecture.
all without major surgical intervention
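The micro-batch idea can be sketched in plain Python as follows. This is not Spark Streaming code; it only illustrates, under assumptions, how a stream cut into small time-based batches lets the same batch function be reused unchanged. The names `micro_batches` and `count_event_types`, the integer second timestamps, and the 5-second interval are hypothetical.

```python
from collections import Counter

def count_event_types(batch):
    """Batch logic: works the same on a full data set or on one micro batch."""
    return Counter(event_type for _, event_type in batch)

def micro_batches(events, batch_seconds):
    """Cut a time-ordered stream of (timestamp, event_type) into fixed intervals."""
    batch, batch_start = [], None
    for timestamp, event_type in events:
        if batch_start is None:
            batch_start = timestamp - timestamp % batch_seconds
        # Close every interval that ended before this event (may yield empty batches).
        while timestamp >= batch_start + batch_seconds:
            yield batch_start, batch
            batch, batch_start = [], batch_start + batch_seconds
        batch.append((timestamp, event_type))
    if batch_start is not None:
        yield batch_start, batch

stream = [(0, "impression"), (1, "click"), (5, "impression"), (12, "tr. pixel")]
results = [(start, dict(count_event_types(batch)))
           for start, batch in micro_batches(stream, batch_seconds=5)]
print(results)
```

The same `count_event_types` function could be applied to the whole history in a batch layer and to each micro batch in a speed layer, which is the kind of code reuse a λ-architecture relies on.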