Content recommendations

Content Recommendations
with Redis

Torben Brodt
plista GmbH

28 February 2013

Recommender Systems
Stammtisch

http://recommenders.de

Introduction
● plista GmbH
○ recommendations & advertising
○ founded in 2008, Berlin [DE]
○ ~3k recommendations/second

● never batch = never Hadoop
● stream computing with an in-memory database

● we love

How to build recommendations?
welt.de/football/berlin_wins.html

We only have the URL?

To show recommendations, we are integrated on the website,
so "at least" we can count the hits.

Most popular
● ZINCRBY "p:welt.de" 1 berlin_wins
● ZREVRANGEBYSCORE "p:welt.de" +inf -inf

  p:welt.de
    berlin_wins       689 +1
    summer_is_coming  420
    plista_company    135
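
The counting scheme above can be tried without a Redis server. The snippet below is a minimal pure-Python sketch, assuming a dict of dicts in place of Redis sorted sets. The helper names `zincrby` and `top_n` are hypothetical and only mimic the semantics of ZINCRBY and a descending range query; they are not a Redis client API.

```python
from collections import defaultdict

# Stand-in for Redis: key -> {member: score}. Hypothetical, for illustration only.
store = defaultdict(dict)

def zincrby(key, increment, member):
    """Mimic ZINCRBY: add `increment` to the member's score (starting at 0)."""
    store[key][member] = store[key].get(member, 0) + increment
    return store[key][member]

def top_n(key, n):
    """Mimic a descending range query: members ordered by score, highest first."""
    ranked = sorted(store[key].items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]

# Seed with the counts shown on the slide, then record one more hit.
store["p:welt.de"] = {
    "berlin_wins": 689,
    "summer_is_coming": 420,
    "plista_company": 135,
}
zincrby("p:welt.de", 1, "berlin_wins")
most_popular = top_n("p:welt.de", 2)
```

Every page view is one increment (live write) and every recommendation request is one range query (live read), which is what makes the list real-time.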

Live Read
+ Live Write
= Real Time Recommendations

Most popular with timeseries
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION
○ "p:welt.de:1360007000"
○ "p:welt.de:1360006000"
○ "p:welt.de:1360005000"

[Figure: one sorted set per time bucket (p:welt.de:1360005000, p:welt.de:1360006000, p:welt.de:1360007000), each holding hit counts for berlin_wins, summer_is_coming and plista_best_company, merged into one ranking]
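
As a rough sketch of the time-bucketed keys, the Python below derives the key for a hit and lists the most recent buckets to union. It assumes buckets of 1000 seconds, inferred only from the spacing of the example keys; `bucket_key`, `recent_keys` and `zunion_sum` are hypothetical helpers, and `zunion_sum` merely imitates a union with SUM aggregation on plain dicts.

```python
BUCKET_SECONDS = 1000  # assumption: the example keys differ by 1000 seconds

def bucket_key(publisher, timestamp):
    """Build a key like 'p:welt.de:1360007000' by flooring the timestamp."""
    bucket = int(timestamp) - int(timestamp) % BUCKET_SECONDS
    return f"p:{publisher}:{bucket}"

def recent_keys(publisher, now, buckets=3):
    """Keys of the current bucket and the ones before it, newest first."""
    return [bucket_key(publisher, now - i * BUCKET_SECONDS) for i in range(buckets)]

def zunion_sum(sets):
    """Imitate a union with SUM aggregation over {member: score} dicts."""
    total = {}
    for scores in sets:
        for member, score in scores.items():
            total[member] = total.get(member, 0) + score
    return total

keys = recent_keys("welt.de", 1360007123)
```

Old buckets simply stop being included in the union, so popularity decays without any batch job.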

● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION ... WEIGHTS
○ "p:welt.de:1360007000" .. 4
○ "p:welt.de:1360006000" .. 2
○ "p:welt.de:1360005000" .. 1

[Figure: the same three time buckets merged with weights 4, 2 and 1; time axis from -1h to -8h]
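
To illustrate what the WEIGHTS option does to the ranking, here is a small pure-Python sketch. The hit counts are made up for the example (the counts on the slide are not recoverable), and `zunion_weighted` is a hypothetical helper that imitates a weighted union with SUM aggregation.

```python
def zunion_weighted(sets_with_weights):
    """Imitate a weighted union: each score is multiplied by its set's weight, then summed."""
    total = {}
    for scores, weight in sets_with_weights:
        for member, score in scores.items():
            total[member] = total.get(member, 0) + weight * score
    return total

# Hypothetical per-bucket hit counts, newest bucket first.
newest = {"berlin_wins": 30, "summer_is_coming": 2}
middle = {"berlin_wins": 5, "summer_is_coming": 20}
oldest = {"summer_is_coming": 40}

# Weights 4, 2, 1 as on the slide: recent hits count more.
scores = zunion_weighted([(newest, 4), (middle, 2), (oldest, 1)])
ranking = sorted(scores, key=scores.get, reverse=True)
```

With equal weights summer_is_coming would lead (62 hits against 35); with the weights above, the article that is popular right now comes first.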

Most popular to any context
● it's not only publisher, we use ~50 context attributes

context attributes:
● publisher
● weekday
● geolocation
● demographics
● ...

  publisher = welt.de
    berlin_wins      689 +1
    plista_company   135
  weekday = sunday
    berlin_wins      400 +1
    dortmund_wins    200
    ...              100
  geolocation = dortmund
    dortmund_wins    200
    berlin_wins      10 +1
    ...              5

Most popular to any context
● what it looks like in Redis

ZUNION ... WEIGHTS
  p:welt.de:1360007    4
  p:welt.de:1360006    2
  p:welt.de:1360005    1
  w:sunday:1360007     4
  w:sunday:1360006     2
  w:sunday:1360005     1
  g:dortmund:1360007   4
  g:dortmund:1360006   2
  g:dortmund:1360005   1

  publisher = welt.de
    berlin_wins      689 +1
    plista_company   135
  weekday = sunday
    berlin_wins      400
    dortmund_wins    200
    ...              100
  geolocation = dortmund
    dortmund_wins    200
    berlin_wins      10
    ...              5
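
The key and weight lists for such a request could be assembled along these lines. This is a sketch under assumptions: the prefixes `p`, `w`, `g` and the bucket weights come from the slide, while `context_keys`, `union_command` and the destination key `tmp:reco` are hypothetical names. The argument order follows the documented ZUNIONSTORE form (destination, numkeys, keys, then WEIGHTS).

```python
# (bucket, weight) pairs as shown on the slide, newest bucket first.
TIME_WEIGHTS = [(1360007, 4), (1360006, 2), (1360005, 1)]

def context_keys(context):
    """Expand [(prefix, value), ...] into parallel lists of keys and weights."""
    keys, weights = [], []
    for prefix, value in context:
        for bucket, weight in TIME_WEIGHTS:
            keys.append(f"{prefix}:{value}:{bucket}")
            weights.append(weight)
    return keys, weights

def union_command(destination, keys, weights):
    """Build the argument list for a weighted ZUNIONSTORE call."""
    return ["ZUNIONSTORE", destination, len(keys), *keys, "WEIGHTS", *weights]

keys, weights = context_keys([("p", "welt.de"), ("w", "sunday"), ("g", "dortmund")])
command = union_command("tmp:reco", keys, weights)  # "tmp:reco" is a made-up key
```

Adding another context attribute only means adding another (prefix, value) pair; the union itself stays a single command.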

Most popular with Effect size
● which context has an influence?

ZUNION ... WEIGHTS
  p:welt.de:1360007    4 * 70%
  p:welt.de:1360006    2 * 70%
  p:welt.de:1360005    1 * 70%
  w:sunday:1360007     4 * 10%
  w:sunday:1360006     2 * 10%
  w:sunday:1360005     1 * 10%
  g:dortmund:1360007   4 * 30%
  g:dortmund:1360006   2 * 30%
  g:dortmund:1360005   1 * 30%

Examples:
● small effect: weather
● big effect: publisher

Data with a small effect should not be taken into account, otherwise we get average results.

[Figure: Effect Size]
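
One way to combine the time weight with the effect size is sketched below. The percentages are the ones on the slide; the cut-off `MIN_EFFECT` and the helper `effective_weights` are assumptions made for this example, since the slides do not say how low-effect contexts are excluded.

```python
TIME_WEIGHTS = {1360007: 4, 1360006: 2, 1360005: 1}   # bucket -> weight (slide)
EFFECT_SIZE = {"p": 0.70, "w": 0.10, "g": 0.30}        # context prefix -> effect (slide)
MIN_EFFECT = 0.2  # hypothetical threshold below which a context is ignored

def effective_weights(context, min_effect=MIN_EFFECT):
    """Return {key: time_weight * effect_size}, skipping contexts with a small effect."""
    result = {}
    for prefix, value in context:
        effect = EFFECT_SIZE[prefix]
        if effect < min_effect:
            continue  # small effect: leave this context out of the union
        for bucket, weight in TIME_WEIGHTS.items():
            result[f"{prefix}:{value}:{bucket}"] = weight * effect
    return result

weights = effective_weights([("p", "welt.de"), ("w", "sunday"), ("g", "dortmund")])
```

Here the weekday context (10%) drops out entirely, while publisher and geolocation keys keep their scaled weights.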

Most popular with Significance
● some data has more significance/trust
● so we add a significance matrix

  publisher = welt.de        ×    sig:publisher = welt.de
    berlin_wins      689            berlin_wins        1
                                    summer_is_coming   1
    plista_company   135            plista_company     0.5

● Significance might depend on a common limit, like 200 (in the example)

Most popular with Significance
● some data has more significance/trust
● so we add a significance matrix
  Numerator:    Σ over all contexts ( hits × significance )
  Denominator:  Σ over all contexts ( significance )

  publisher = welt.de        ×    sig:publisher = welt.de
    berlin_wins      689            berlin_wins        1
                                    summer_is_coming   1
    plista_company   135            plista_company     0.5
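
Read as a significance-weighted average, the numerator and denominator could be computed as follows. This is one interpretation of the slide, not a confirmed formula: the publisher values are taken from the example, the weekday significance of 1.0 is invented for illustration, and `significance_weighted` is a hypothetical helper.

```python
def significance_weighted(hits_by_context, sig_by_context):
    """Per item: sum(hits * significance) / sum(significance) over all contexts."""
    numerator, denominator = {}, {}
    for context, hits in hits_by_context.items():
        sig = sig_by_context.get(context, {})
        for item, count in hits.items():
            s = sig.get(item, 0.0)  # no significance entry means no trust
            numerator[item] = numerator.get(item, 0.0) + count * s
            denominator[item] = denominator.get(item, 0.0) + s
    # Items whose total significance is zero are left out.
    return {i: numerator[i] / denominator[i] for i in numerator if denominator[i] > 0}

hits = {
    "publisher": {"berlin_wins": 689, "plista_company": 135},
    "weekday": {"berlin_wins": 400},
}
sig = {
    "publisher": {"berlin_wins": 1.0, "plista_company": 0.5},
    "weekday": {"berlin_wins": 1.0},  # assumed value, not from the slides
}
scores = significance_weighted(hits, sig)
```

Dividing by the summed significance keeps items comparable even when they appear in a different number of contexts.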

SUM over..
● timeseries
● different context
● previous hits of the user
● similar publisher knowledge

ZUNION ... WEIGHTS
  p:welt.de:1360007    4
  p:welt.de:1360006    2
  p:welt.de:1360005    1
  w:sunday:1360007     4
  w:sunday:1360006     2
  w:sunday:1360005     1
  g:dortmund:1360005   1

Σ publisher = welt.de
    berlin_wins      689
    plista_company   135

... redis can do it ;)

Even more Matrix Operations ;)
● Similarity Matrix
● Human Control Matrix
● Meta-learning Matrix
○ might be covered in next talk
○ cooperation with
○ aided from

[Figure: Σ and ∏ operators over the matrices]
Conclusions
● Redis is a perfect fit for simple operations
○ SUM + AGGREGATE + MIN + MAX
● In-Memory operations are pretty fast
● Real-time features feel better in a real-time
database (e.g. time series)
● We don't need batch

What else?
In Redis
● Incremental Collaborative Filtering
● More Recommender
● Live Statistics
At plista
● Semantics with Lucene
● Cloud Technologies
○ Scalability
○ Enterprise Service Bus
● Contest for Recommenders

Questions?

www.plista.com

torben.brodt@plista.com

@torbenbrodt

xing.com/profile/Torben_Brodt

http://goo.gl/pvXm5

http://lnkd.in/MUXXuv
