Josh Glover, Software Engineer at Videoplaza, will introduce you to the domain of video advertising and show how Videoplaza uses Apache Cassandra as part of a system that solves the difficult problem of allowing clients to analyse the performance of their advertising campaigns in real time. Videoplaza needs to aggregate data for tens of thousands of combinations of dimensions and metrics for hundreds of clients from an incoming stream of thousands of requests per second, and do it fast enough that clients can see trends as they happen.
Time: hours, months, years, etc.
Event: views, clicks
Device: iPhone, PC, PS3
Demo: age, gender, income, interests
Location
Create report template -> aggregate ID
Tracker publishes to message broker (we use RabbitMQ). If you don’t know anything about messaging, definitely talk to me afterwards! For each event, you will increment a bunch of counters (e.g. ad and event, ad and time and event, etc.)
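A minimal sketch of the per-event fan-out described above, in Python. The event fields, the key layout, and the helper name counter_keys are illustrative assumptions, not Videoplaza's actual schema; an in-memory Counter stands in for the real counter store.

```python
from collections import Counter

# Hypothetical event as it might arrive from the message broker.
event = {"ad": "ad-42", "time": "2013-06-01T14", "event": "view", "device": "iPhone"}

# Combinations named in the notes: ad and event; ad and time and event.
dimension_sets = [("ad", "event"), ("ad", "time", "event")]

def counter_keys(event, dimension_sets):
    """Yield one counter key per required combination of dimensions."""
    for dims in dimension_sets:
        yield tuple((d, event[d]) for d in dims)

# In-memory stand-in for the counter store (assumption for illustration).
counters = Counter()
for key in counter_keys(event, dimension_sets):
    counters[key] += 1
```

Each incoming event touches one counter per combination, so the write load grows with the number of combinations, not just the number of events.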
Upgrading to 1.2 this week!
10 MB SSTables
Scala-style; key -> value
Explain transaction ID here
Adef tells us which combinations of dimensions are necessary.
Dimensions repository contains dimension values.
Read rows one at a time due to Thrift max message size.
After we upgrade, we can use the binary CQL client for better performance.
Java futures
Shard by hashing values of all dimensions except time
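One way the sharding rule could look, sketched in Python. The function name shard_for, the choice of MD5, the shard count, and the dimension names are all assumptions for illustration; the notes only say that every dimension except time goes into the hash.

```python
import hashlib

def shard_for(dimensions, num_shards):
    """Pick a shard from all dimension values except time."""
    # Sort by name so the same dimensions always hash the same way.
    parts = [f"{name}={value}"
             for name, value in sorted(dimensions.items())
             if name != "time"]
    digest = hashlib.md5("|".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Hypothetical dimension values; only the time differs between the two.
a = {"ad": "ad-42", "event": "view", "device": "iPhone", "time": "2013-06-01T14"}
b = dict(a, time="2013-06-02T09")
same_shard = shard_for(a, 16) == shard_for(b, 16)
```

Because time is left out of the hash, every time bucket for a given combination of the other dimensions lands on the same shard.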
We can replay data, but we can’t unplay it. No way to decrement a HyperLogLog counter.
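A toy sketch of why a HyperLogLog counter cannot be decremented. This is not a production implementation and not necessarily the one used in the system: the class name TinyHLL, the SHA-1 hash, and the register count are assumptions, and the cardinality estimate is left out. The point is that each register only keeps a maximum, so the contribution of any single item is not recoverable.

```python
import hashlib

class TinyHLL:
    """Registers only; the estimation step is omitted for brevity."""

    def __init__(self, p=4):
        self.p = p                         # 2**p registers
        self.registers = [0] * (1 << p)

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        idx = h >> (64 - self.p)           # top p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)
        # Position of the first 1 bit in the remaining 64 - p bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        # Only the maximum survives; earlier values are overwritten.
        self.registers[idx] = max(self.registers[idx], rank)

hll = TinyHLL()
hll.add("user-1")
snapshot = list(hll.registers)
hll.add("user-1")                          # replaying the same item changes nothing
replay_is_noop = hll.registers == snapshot
```

Adding an item twice leaves the registers unchanged, and nothing in a register records which items produced its value, so there is no operation that removes one.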