Real-time content, offer and ad targeting decisions must happen quickly. When a user requests information from a web application, a processing clock starts, requiring a decision in as little as 40 msec. Delays in targeting decisions lead to delays in responding to the user. These delays can lead to user dissatisfaction and, ultimately, loss of audience and revenue.
This session describes how AOL Advertising uses Hadoop to create sophisticated user profiles and NoSQL database technology from Couchbase to access those profiles in real-time, with sub-millisecond latency. This architecture leaves the bulk of the processing time budget for improved content, offer and ad targeting and even real-time content customization.
Making Millions with NoSQL
1. Simple. Fast. Elastic.
How AOL Advertising Uses NoSQL to Make Millions of Smart Targeting Decisions Every Hour
NoSQL Now! 2011
Matt Ingenthron
2. AD AND OFFER TARGETING
“AOL serves billions of impressions a day from our ad serving platforms, and any incremental improvement in processing time translates to huge benefits in our ability to more effectively serve the ads needed to meet our contractual commitments. Traditional databases lack the scalability required to support our goal of five milliseconds per read/write. Creating user profiles with Hadoop, then serving them from Couchbase, reduces profile read and write access to under a millisecond, leaving the bulk of the processing time budget for improved targeting and customization.”
— Pero Subasic, Chief Architect, AOL
3. Ad and offer targeting
40 milliseconds to respond with the decision.
(Diagram: 1 — profiles, campaigns; 2 — events; 3 — profiles, real-time campaign statistics)
4. Proven at small and extra-large scale
• Heroku: leading cloud service (PaaS) provider; over 150,000 hosted applications; Couchbase Server serving over 6,200 Heroku customers
• Zynga: social game leader (FarmVille, Mafia Wars, Café World); over 230 million monthly active users; Couchbase Server is the primary database behind key Zynga properties
5. Modern interactive software architecture
Application scales out: just add more commodity web servers.
Database scales up: get a bigger, more complex server.
– Expensive and disruptive sharding
– Doesn’t perform at large scale
6. Couchbase data layer scales like the application logic tier
The data layer now scales with linear cost and constant performance.
Application scales out: just add more commodity web servers.
Database scales out: just add more commodity data servers (Couchbase Servers).
Horizontally scalable, schema-less, auto-sharding, high performance at web scale.
Scaling out flattens the cost and performance curves.
7. Couchbase is a distributed database
(Diagram: application user → web application server → Couchbase Servers in the data center; the Couchbase Web Console runs on the administrator console)
8. Couchbase is Simple, Fast, Elastic NoSQL
• Simple to:
– Deploy (Membase ServerTemplate)
– Develop (memcached)
– Manage (UI and RESTful API)
• Fast:
– Predictable low latency
– Sub-ms response times
– Built-in memcached technology
• Zero-downtime elasticity:
– Spread I/O and data across instances
– Consistent performance with linear cost
– Dynamic rebalancing of a live cluster
10. Couchbase “write” data flow – application view
1. User action results in the need to change the VALUE of KEY
2. Application updates the key’s VALUE and performs a SET operation
3. Couchbase client hashes KEY and identifies KEY’s master server
4. SET request sent over the network to the master server
5. Couchbase replicates the KEY-VALUE pair, caches it in memory, and stores it to disk
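Step 3 above is purely client-side: the key is hashed into a fixed set of vBuckets, and a cluster map names each vBucket's master server. The sketch below illustrates that idea; the CRC32 hash, the small vBucket count, and the node names are simplifying assumptions for readability, not the exact Membase algorithm or AOL's configuration.

```python
import zlib

# Illustrative cluster map: vbucket id -> (master, [replicas]).
# Real Membase/Couchbase uses 1024 vBuckets; 8 keeps the sketch readable.
NUM_VBUCKETS = 8
VBUCKET_MAP = {vb: ("node%d" % (vb % 3), ["node%d" % ((vb + 1) % 3)])
               for vb in range(NUM_VBUCKETS)}

def vbucket_for(key: bytes) -> int:
    # Hash the key and fold it into the vBucket range (CRC32 here for brevity).
    return zlib.crc32(key) % NUM_VBUCKETS

def master_for(key: bytes) -> str:
    # The vBucket map, not the client, decides which server owns the key.
    master, _replicas = VBUCKET_MAP[vbucket_for(key)]
    return master

# A SET for a given key is always routed to the same master server.
print(vbucket_for(b"user:42"), master_for(b"user:42"))
```

Because the map is shared cluster-wide, every client routes a given key to the same master, and rebalancing only requires publishing an updated map.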
11. Couchbase data flow – under the hood
1. SET request arrives at KEY’s master server
2. Listener-sender replicates the KEY-VALUE pair to Replica Server 1 and Replica Server 2 for KEY (in RAM*)
3. Couchbase storage engine stores the value to disk
5. SET acknowledgement returned to the application
(Diagram: master server for KEY flanked by Replica Server 1 and Replica Server 2 for KEY, each with RAM and disks)
12. Couchbase Architecture
(Diagram: each node runs a Data Manager and a Cluster Manager.
Data Manager, on each node: protocol listener/sender on ports 11211 (memcapable 1.0, via moxi) and 11210 (memcapable 2.0); memcached with an engine interface; Couchbase storage engine backed by CouchDB.
Cluster Manager, Erlang/OTP: REST management API/Web UI; vBucket state and replication manager; configuration manager; node health monitor; process monitor; heartbeat — on each node; global singleton supervisor and rebalance orchestrator — one per cluster.
Ports: 8080 (HTTP), 4369 (erlang port mapper), 21100–21199 (distributed erlang).)
13. Couchbase Architecture
(Same diagram as the previous slide, with the management HTTP port changed from 8080 to 8091; ports 4369 and 21100–21199 unchanged.)
14. Data buckets are secure Couchbase “slices”
(Diagram: application user → web application server → Bucket 1 and Bucket 2, carved out of the aggregate cluster memory and disk capacity of the Couchbase data servers in the data center; buckets are managed from the administrator console.)
15. Elastic Rebalancing
Before:
• Adding Node 3
• Node 3 is in pending state
• Clients talk to Nodes 1 and 2 only
(Diagram: Node 1 holds vBuckets 1–6, Node 2 holds vBuckets 7–12, Node 3 is pending)
16. Elastic Rebalancing
Before:
• Adding Node 3; Node 3 is in pending state; clients talk to Nodes 1 and 2 only
During rebalancing:
• Rebalancing orchestrator recalculates the vBucket map (including replicas)
• vBucket migrator moves vBuckets to the new server
• Finalize migration
(Diagram: vBuckets 1–6 on Node 1 and 7–12 on Node 2, with vBucket migrators moving data toward Node 3)
17. Elastic Rebalancing
Before:
• Adding Node 3; Node 3 is in pending state; clients talk to Nodes 1 and 2 only
During rebalancing:
• Rebalancing orchestrator recalculates the vBucket map (including replicas)
• vBucket migrator moves vBuckets to the new server
• Finalize migration
After:
• Node 3 is balanced
• Clients are reconfigured to talk to Node 3
(Diagram: Node 1 holds vBuckets 1–4, Node 2 holds vBuckets 7–10, Node 3 holds vBuckets 5, 6, 11, 12)
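The orchestrator's recalculation in the "during" phase can be sketched as a minimal-move redistribution: vBuckets are peeled off the most-loaded nodes until every node holds its fair share, and only the moved vBuckets need migration. This is an illustrative sketch (assuming the vBucket count divides evenly among nodes), not the actual Membase rebalancing algorithm.

```python
def rebalance(vbucket_map, nodes):
    """Recalculate a vBucket map when nodes join.

    vbucket_map: dict vbucket_id -> owning node. Returns the new map and
    the (vbucket, src, dst) migrations the vBucket migrator must perform.
    """
    owned = {n: [vb for vb, owner in vbucket_map.items() if owner == n]
             for n in nodes}
    fair = len(vbucket_map) // len(nodes)   # target vBuckets per node
    moves = []
    donors = [n for n in nodes if len(owned[n]) > fair]
    takers = [n for n in nodes if len(owned[n]) < fair]
    for dst in takers:
        while len(owned[dst]) < fair:
            # Always peel from the currently most-loaded donor.
            src = max(donors, key=lambda n: len(owned[n]))
            vb = owned[src].pop()
            owned[dst].append(vb)
            moves.append((vb, src, dst))
    new_map = {vb: n for n, vbs in owned.items() for vb in vbs}
    return new_map, moves

# Slide scenario: 12 vBuckets on Nodes 1-2, Node 3 joins the cluster.
before = {vb: ("node1" if vb <= 6 else "node2") for vb in range(1, 13)}
after, moves = rebalance(before, ["node1", "node2", "node3"])
print({n: list(after.values()).count(n) for n in ("node1", "node2", "node3")})
print(len(moves))  # only 4 of the 12 vBuckets had to migrate
```

Untouched vBuckets keep their owner, which is why clients can keep talking to Nodes 1 and 2 until the final map is published.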
19. Online Advertising
(Diagram: Publishers ↔ AOL Advertising, the “match maker” ↔ Advertisers; served to Internet Users)

Advertiser constraints:
• Payment model – may pay per impression, click, or conversion
• Allowability – may restrict on what web sites the ad is served
• Targeting – may only want to be shown to internet users in a certain geo location, or from a specific demographic
• Frequency – may limit how often the same user is shown the ad
• Campaign delivery – the total ad budget may have to be delivered according to a plan; the served impressions may have to generate no less than a prescribed click-through or conversion rate

Publisher constraints:
• Payment model – may charge per impression, click, or conversion
• Allowability – may prohibit certain types of ads from being displayed

Terminology:
• CPM = Cost Per Mille, e.g. $1.50 per 1000 impressions
• CPC = Cost Per Click, e.g. $2 per click
• CPA = Cost Per Acquisition, e.g. $15 per acquisition/conversion
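To pick among campaigns priced under different payment models, ad servers commonly normalize each to an effective CPM (expected revenue per 1000 impressions). The sketch below reuses the slide's example prices; the click-through and conversion rates are made-up assumptions for illustration, not AOL figures.

```python
def ecpm_from_cpm(cpm):
    # CPM campaigns already pay per 1000 impressions.
    return cpm

def ecpm_from_cpc(cpc, ctr):
    # Expected revenue per impression is cpc * ctr; scale to 1000 impressions.
    return cpc * ctr * 1000

def ecpm_from_cpa(cpa, conv_rate):
    # Same idea with the conversion rate instead of the click-through rate.
    return cpa * conv_rate * 1000

# Slide prices: $1.50 CPM, $2 CPC, $15 CPA; assumed 0.2% CTR, 0.02% conv rate.
print(ecpm_from_cpm(1.50))           # 1.5
print(ecpm_from_cpc(2.00, 0.002))    # 4.0
print(ecpm_from_cpa(15.00, 0.0002))  # 3.0
```

Under these assumed rates the CPC campaign is the most valuable per impression, which is exactly the kind of per-request decision the 40 ms budget has to cover.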
20. Large-Scale Analytics
• Mission
• Team
• Data
– Ad serving logs, content, and 3rd-party data to be processed
• Research
• Technologies
– Cloudera: Hadoop, HDFS, Flume, Workflow Manager
– Distributed operational store: Couchbase
– Light DB: MySQL
– MPI for model building
• Constantly experimenting...
21. Large-scale Analytics
(Diagram: data from the Internet → Data Feeds → Flume Ingestion → large-scale analytics cluster ↔ Couchbase DB Cluster → actionable data to ad serving)
– CPU-intensive (MPI-based ML)
– Operational store highly cached in Couchbase
– Reporting and Insights
– Distributed search (Sphinx)
– DB support for Hadoop and MySQL
– Predictive Segments
– Contextual Analysis and Segmentation
22. Use Cases Today
• Data set enrichment: given a field in a data set stored on HDFS, enrich by adding related fields; the media → campaign → advertiser chain
• Blackboard for inter-process/job communication: contextual segmentation pipelines; predictive modeling can load per-campaign models to be used for large-scale scoring
• Larger map-side joins (where the Hadoop DistributedCache and in-memory process/task caches are insufficient)
• Aggregations with a large number of item lookups, e.g. user-level contextual profiles aggregated from visited-URL contextual profiles stored in memcache
• Flume integration for data flow reliability and recovery
• Segment generation, currently carried out through Hadoop pipelines and uploaded into server-side Membase for targeting
• But: a strong tendency to move closer to ad serving motivates thinking about new architectures to reduce segment generation time
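The enrichment use case in the first bullet amounts to chained key lookups against the operational store: each HDFS record carries a media id, and the store resolves media → campaign → advertiser. A minimal sketch, with an in-memory dict standing in for Couchbase; all keys, field names, and values here are hypothetical.

```python
# Stand-in for the Couchbase operational store; keys and fields are made up.
store = {
    "media:991": {"campaign_id": "camp:17"},
    "camp:17":   {"advertiser_id": "adv:3", "name": "spring-sale"},
    "adv:3":     {"name": "Acme Corp"},
}

def enrich(record):
    """Follow the media -> campaign -> advertiser chain, adding fields."""
    media = store[record["media_id"]]
    campaign = store[media["campaign_id"]]
    advertiser = store[campaign["advertiser_id"]]
    return {**record,
            "campaign": campaign["name"],
            "advertiser": advertiser["name"]}

row = {"media_id": "media:991", "impressions": 12}
print(enrich(row))  # original fields plus campaign and advertiser names
```

In the real pipeline these lookups run from Hadoop tasks against the shared store, which is what makes them viable where a per-task in-memory cache would not fit.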
23. RT Framework: Capture, Compute and Forward
(Diagram: Data Feeds → Flume Ingestion → CAPTURE: Couchbase Cluster (back-end) → COMPUTE: compute jobs and the Hadoop “big data loop” → FORWARD: Couchbase (front-end) and ad-serving logic)
24. RT Contextual Segmentation
(Diagram: Data Feeds → Flume Ingestion → Active Event Frame → User-ContentID Mapper and User-Segment Mapper; Membase/Couchbase holds the UC Map, the US Short-term Map, and the ContentID-Segment Map, feeding the ad-serving logic)
25. Rough Capacity Estimates
• Data volume calculation
– 60,000 events per second → 60,000 × 900 = 54 million events during a 15-minute burst
– 1 KB per event → ~55 GB for the staging frame + ~55 GB for the loading frame ≈ 110 GB; the rest (~800 GB) is for data output from processes
– 10 nodes at 128 GB ≈ 1.28 TB → more than enough (assuming one copy)
– exact calculations at http://wiki.membase.org/display/membase/Sizing+Guidelines
• Processing bandwidth
– assuming the cluster supports 200K ops per second (conservative)
– 60,000 operations/sec reserved for loading the current 15-minute frame
– the remaining 140K operations/sec is available for jobs
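The arithmetic behind these estimates is easy to check; a 15-minute burst at 60,000 events/s is 54 million events, which the slide rounds up to ~55 million (decimal GB used throughout for simplicity).

```python
events_per_sec = 60_000
burst_sec = 15 * 60                   # 900 s per 15-minute burst
events = events_per_sec * burst_sec   # 54,000,000 events (slide rounds to ~55M)

event_kb = 1
frame_gb = events * event_kb / 1e6    # ~54 GB per frame at 1 KB/event
both_frames_gb = 2 * frame_gb         # staging + loading frames ≈ 110 GB
cluster_gb = 10 * 128                 # 10 nodes at 128 GB each

# Processing bandwidth: 200K ops/s cluster, 60K reserved for frame loading.
jobs_ops = 200_000 - 60_000

print(events, round(both_frames_gb), cluster_gb - both_frames_gb, jobs_ops)
```

The remainder after the two frames is what the slide earmarks for process output, so the 10-node cluster comfortably covers the burst with a single copy of the data.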
27. Couchbase Architecture
• Requirements
– Support initially up to 1.2 billion keys (1 key per user in the system)
– Minimum of 10K writes per second
– Two clusters, one on each coast, to reduce latency
– Easily scalable: support an increasing number of keys and writes/reads per second, and seamlessly allow growth for the future
• Couchbase setup
– Initially 1.1 billion keys, now 650 million keys
– 250 to ~2K writes/second
– 1K to 7K reads/second
– 2 clusters, 10 nodes each
– Dual writing, once to each cluster
– 1.19 TB of RAM available, 124/128 GB allocated on each server; 200 GB in use at the moment
28. Lessons Learned

Issue: Replicating across data centers by writing to a local moxi dramatically reduces throughput.
Resolution: Do not use the local moxi. The membase client (Spy) should connect directly to an instance of a remote moxi to perform updates.

Issue: Membase needs 150 bytes per item for metadata. Ensure mem_high_wat is not exceeded, to prevent spillover to disk; if incoming data arrives faster than it can be written to disk, the system returns errors.
Resolution: Correctly sizing the membase cluster, based on the expected number of keys and the size of the data objects, is critical to membase operation.

Issue: Membase settings such as the memory high/low water marks modified via flushctl are reverted to defaults when the service restarts.
Resolution: Re-issue the flushctl command every time the Membase server restarts. Membase indicated that a better configuration system allowing permanent changes to settings is coming; until then, they recommend sticking with the default settings.
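The 150-bytes-per-item metadata overhead from the sizing lesson adds up quickly at the key counts described earlier (1.2 billion planned, 650 million current); a quick check of how much RAM metadata alone consumes, using decimal GB:

```python
META_BYTES_PER_ITEM = 150  # membase per-item metadata overhead (from the slide)

def metadata_gb(num_keys):
    # RAM consumed by item metadata alone, before any key or value bytes.
    return num_keys * META_BYTES_PER_ITEM / 1e9

print(round(metadata_gb(1_200_000_000)))  # planned key count
print(round(metadata_gb(650_000_000)))    # current key count
```

At 1.2 billion keys that is roughly 180 GB of RAM spent on metadata before storing a single value, which is why the lesson stresses sizing the cluster from the expected key count rather than the data volume alone.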