This document summarizes AcuityAds' experience using Apache Kafka and Storm to process over 10 billion ad impressions per day. It describes an architecture in which Kafka ingests bid request data from multiple sources into partitioned topics, while Storm topologies read from Kafka and compute metrics such as daily impressions by site. Early issues included unbalanced Kafka partition distribution and low Storm uptime caused by uncaught exceptions; planned improvements include version upgrades and added monitoring.
Target and Connect Intelligently with Kafka & Storm
1. Target and Connect Intelligently
Experience with Kafka & Storm
Otto Mok
Solution Architect, AcuityAds
April 30, 2014 – Toronto Hadoop User Group
2. 2
Agenda
• Background
– What does AcuityAds do?
• Use case
– What are we trying to do?
• High-level System Architecture
– How does the data flow?
• Kafka & Storm
– What did we do wrong?
4. 4
Background
• Digital Advertising
– Website banners, pre-roll video, free mobile apps
• Buy ad impressions in real time
– Respond within 50 ms to each auction
• Find the best match between people and ads
– Show ads that you care about
• Use machine learning algorithms to ‘learn’
– Data, data, data
5. 5
Use case
• 10+ billion daily impressions
• 30,000+ new sites daily
• How many daily impressions by site?
• How are the impressions distributed?
– Country, Province, Gender, Age Range, etc.
6. 6
High-level System Architecture
• 10+ billion daily bid requests
• Make up to 4 billion daily bids
• Serve millions of daily impressions
• 10+ TB of messages daily
• 300K+ messages / second
[Architecture diagram: Bidder, Adserver → Kafka → Storm → HBase/Hadoop]
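To make the ingest path concrete, here is a minimal sketch of a Kafka 0.8-style producer such as the bidder might run. The topic name BIDREQUEST comes from the partition paths shown later in the deck; the broker hostnames, key choice, and payload are assumptions.

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class BidRequestPublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Broker hostnames are assumptions; 0.8 producers bootstrap from this list.
            props.put("metadata.broker.list", "kafka01:9092,kafka02:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("request.required.acks", "1"); // ack from the partition leader only

            Producer<String, String> producer =
                    new Producer<String, String>(new ProducerConfig(props));
            // Keying by sourceId (an assumption) routes one source to one partition.
            producer.send(new KeyedMessage<String, String>(
                    "BIDREQUEST", "sourceId-123", "{ ...bid request JSON... }"));
            producer.close();
        }
    }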
11. 11
Kafka - Issues
• Issue 1 - Partitions
– 10 partitions
– Each partition grows by > 1 TB a day
– 100 TB of cluster storage for ~1 TB per partition – no problem!
• But each partition is stored in a single directory, i.e. on a single disk
– /disk05/kafka-logs/BIDREQUEST-09
– /disk09/kafka-logs/BIDREQUEST-03
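Because a partition directory cannot span disks, the usual mitigation is to list one log directory per physical disk; Kafka then places each new partition in the directory currently holding the fewest partitions. A sketch of server.properties with the disk layout assumed:

    # server.properties – one log directory per physical disk (layout assumed)
    log.dirs=/disk01/kafka-logs,/disk02/kafka-logs,/disk05/kafka-logs,/disk09/kafka-logs
    # Each partition still lives entirely inside one of these directories,
    # so partition count x retention must be sized to fit a single disk.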
12. 12
Kafka - Issues
• Issue 2 – Unbalanced partition distribution
– Some servers run out of disk space
– Some servers are not the “leader” for any partition
• A network glitch causes a server to drop out of the
cluster; after rejoining it is no longer a leader
• Fix: auto.leader.rebalance.enable=true
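With that flag set, the controller periodically moves leadership back to each partition's preferred replica. A sketch of the related broker settings (the values shown are the stock defaults, cited from memory), plus the manual fallback tool that ships with Kafka:

    # server.properties
    auto.leader.rebalance.enable=true
    leader.imbalance.check.interval.seconds=300
    leader.imbalance.per.broker.percentage=10

    # Manual alternative after a broker rejoins the cluster:
    bin/kafka-preferred-replica-election.sh --zookeeper zk01:2181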
13. 13
Lots of data – now what?
Source: http://bookriotcom.c.presscdn.com/wp-content/uploads/2013/03/server-farm-shot.jpg
14. 14
Use case - again
• 10+ billion daily impressions
• 30,000+ new sites daily
• How many daily impressions by site?
• How are the impressions distributed?
– Country, Province, Gender, Age Range, etc.
18. 18
Storm - Topology
• Spout reads each BidRequest from the Kafka topic
• Determines whether the inventory is new or existing,
emits tuples to different “streams” (sketched below)
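The deck has the spout itself making the new-vs-existing decision; with the stock KafkaSpout that decision more naturally lives in a small splitter bolt, so this sketch models it that way (the class, the field layout, and the new-inventory test are all assumptions; the two stream names come from slide 19).

    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    // Hypothetical splitter bolt: routes each BidRequest to one of two named streams.
    public class InventorySplitterBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // Field layout is an assumption about what the spout's scheme emits.
            String sourceId = input.getStringByField("sourceId");
            String domainName = input.getStringByField("domainName");
            String inventoryId = input.getStringByField("inventoryId");
            if (inventoryId == null || inventoryId.isEmpty()) {
                // Never seen before: no inventoryId assigned yet.
                collector.emit("NewInventory", new Values(sourceId, domainName));
            } else {
                collector.emit("ExistingInventory", new Values(inventoryId));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declareStream("NewInventory", new Fields("sourceId", "domainName"));
            declarer.declareStream("ExistingInventory", new Fields("inventoryId"));
        }
    }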
19. 19
Storm - Topology
• InsertInventoryBolt
– Processes tuples from the NewInventory stream
– Fields grouping on sourceId, domainName
– Tick tuple every 1 second
• UpdateInventoryBolt
– Processes tuples from the ExistingInventory stream
– Fields grouping on inventoryId
– Tick tuple every 1 second
(wiring sketched below)
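Slides 18–20 together imply a topology wiring like the following. This is a minimal sketch using the storm-kafka spout classes (ZkHosts/SpoutConfig/KafkaSpout from the storm-kafka-0.8-plus module); hostnames, component IDs, and parallelism hints are assumptions, the bolt names are the deck's own (the splitter is sketched above, UpdateInventoryBolt below), and LogInventoryBolt from slide 20 would be attached the same way.

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.tuple.Fields;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.ZkHosts;

    public class InventoryTopology {
        public static void main(String[] args) throws Exception {
            // Read the BIDREQUEST topic; ZooKeeper host and consumer id are assumptions.
            SpoutConfig spoutConfig = new SpoutConfig(
                    new ZkHosts("zk01:2181"), "BIDREQUEST", "/kafka-spout", "bidrequest-reader");

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("bidRequests", new KafkaSpout(spoutConfig), 10);
            builder.setBolt("splitter", new InventorySplitterBolt(), 20)
                   .shuffleGrouping("bidRequests");
            // Fields groupings from slide 19: the same key always reaches the same executor.
            builder.setBolt("insertInventory", new InsertInventoryBolt(), 10)
                   .fieldsGrouping("splitter", "NewInventory",
                                   new Fields("sourceId", "domainName"));
            builder.setBolt("updateInventory", new UpdateInventoryBolt(), 10)
                   .fieldsGrouping("splitter", "ExistingInventory",
                                   new Fields("inventoryId"));

            Config conf = new Config();
            conf.setNumWorkers(10); // slide 21 mentions 10 workers
            StormSubmitter.submitTopology("inventory", conf, builder.createTopology());
        }
    }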
20. 20
Storm - Topology
• LogInventoryBolt
– Processes tuples from the ExistingInventory stream
– Fields grouping on inventoryId
– Tick tuple every 10 seconds
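All three bolts use the tick-tuple pattern: buffer updates in memory and flush them on a periodic system tuple. Here is a minimal sketch of what UpdateInventoryBolt could look like; the in-memory counter and the HBase flush are assumptions, but Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS and the Constants check are the standard Storm 0.9 mechanism.

    import java.util.HashMap;
    import java.util.Map;
    import backtype.storm.Config;
    import backtype.storm.Constants;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class UpdateInventoryBolt extends BaseRichBolt {
        private OutputCollector collector;
        private Map<String, Long> pendingCounts;

        @Override
        public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.pendingCounts = new HashMap<String, Long>();
        }

        @Override
        public Map<String, Object> getComponentConfiguration() {
            // Ask Storm to deliver a system "tick" tuple to this bolt every second (slide 19).
            Map<String, Object> conf = new HashMap<String, Object>();
            conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 1);
            return conf;
        }

        private boolean isTick(Tuple tuple) {
            return Constants.SYSTEM_COMPONENT_ID.equals(tuple.getSourceComponent())
                    && Constants.SYSTEM_TICK_STREAM_ID.equals(tuple.getSourceStreamId());
        }

        @Override
        public void execute(Tuple tuple) {
            if (isTick(tuple)) {
                flush(); // write the batched counters downstream once per tick
            } else {
                String inventoryId = tuple.getStringByField("inventoryId");
                Long current = pendingCounts.get(inventoryId);
                pendingCounts.put(inventoryId, current == null ? 1L : current + 1);
            }
            collector.ack(tuple);
        }

        private void flush() {
            // HBase write omitted (an assumption); clear the in-memory batch afterwards.
            pendingCounts.clear();
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) { /* terminal bolt */ }
    }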
21. 21
Storm - Issues
• Issue – Low uptime
– 10 workers, 100 executors
– Not processing many tuples
– Process latency < 10 ms
• Bolts restart due to uncaught exceptions
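In Storm 0.9, an exception that escapes execute() kills the hosting worker JVM, which the supervisor then restarts; that is what drags worker uptime down. One common defensive pattern is a base class (the GuardedBolt name below is hypothetical) that catches per-tuple exceptions, reports them to the Storm UI, and fails the tuple for replay instead of crashing the worker.

    import java.util.Map;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    // Hypothetical base class: subclasses implement process() and inherit the guard.
    public abstract class GuardedBolt extends BaseRichBolt {
        protected OutputCollector collector;

        @Override
        public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        protected abstract void process(Tuple tuple) throws Exception;

        @Override
        public void execute(Tuple tuple) {
            try {
                process(tuple);
                collector.ack(tuple);
            } catch (Exception e) {
                collector.reportError(e); // visible in the Storm UI; worker keeps running
                collector.fail(tuple);    // the spout replays the tuple
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) { }
    }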
22. 22
Conclusion
• Cost
– Bleeding-edge technology bugs
– Support only via mailing lists
– Roll-your-own monitoring
– Dedicated operations personnel
• Benefit
– Near real-time data on site impression volume &
distribution by geo, demographics, etc.
23. 23
Forward Looking
• Kafka v0.8.1.1
– Allows specifying the broker hostname advertised to
producers & consumers
– Allows changing the number of partitions of a topic online
• Storm v0.9.1
– Faster pure-Java Netty transport
– View logs from each server in the Storm UI
– Tick tuples using floating-point seconds
– Storm on Hadoop (HDP 2.1)
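A sketch of the two Kafka 0.8.1 features named above, with hostnames and counts assumed. One caveat: adding partitions changes the key-to-partition mapping, so keyed producers will start routing existing keys to different partitions.

    # server.properties – advertise a client-resolvable address (0.8.1+)
    advertised.host.name=kafka01.example.com
    advertised.port=9092

    # Grow an existing topic online (topic name and partition count assumed)
    bin/kafka-topics.sh --zookeeper zk01:2181 --alter --topic BIDREQUEST --partitions 20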