Digital Advertising on AWS summarizes trends in digital advertising and examples of how AWS solutions can help. The document discusses how the industry is fragmented with many intermediaries, and areas of high growth like mobile, real-time bidding (RTB), big data, and video. It provides examples of AWS solutions that help customers with RTB by optimizing compute scaling, data access, and networking. Another solution ingests 150TB of data daily for RTB data collection. The document advocates working with customers to identify "undifferentiated heavy lifting" that AWS can solve through industry solutions.
4. Goals for Industry Business Development
Better understand our customers' business and technical challenges.
Collect the data and draw conclusions about how AWS can help with solutions.
Evolve. Accelerate change and new product development at AWS through solutions.
6. Observations
1. Industry is very fragmented – barrier to entry is low
2. Value chain from Advertiser to Publisher is complex – too many intermediaries (they take about 60% of the advertising $$$)
3. Overall market is growing - $170B in 2015 to over $200B in 2017
4. Some areas grow much faster – Mobile, Video, RTB
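A quick back-of-the-envelope check on the figures above, as a minimal sketch. It assumes "over $200B" is exactly $200B (so the growth rate is a lower bound) and reads "intermediaries take about 60%" as leaving roughly 40 cents of each advertising dollar for the publisher; both are interpretations of the slide, not figures from it.

```python
# Figures from the slide. "Over $200B" is treated as exactly $200B,
# so the growth rate computed here is a lower bound.
market_2015 = 170e9
market_2017 = 200e9
years = 2017 - 2015

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (market_2017 / market_2015) ** (1 / years) - 1

# Assumption: the ~60% taken by intermediaries leaves the rest for the publisher.
intermediary_share = 0.60
publisher_per_dollar = 1.0 - intermediary_share

print(f"Implied CAGR 2015-2017: {cagr:.1%}")
print(f"Reaches the publisher per $1 spent: ${publisher_per_dollar:.2f}")
```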
10. Video
High growth, high eCPMs
Fraud issues
Ad blocking
Server side ad stitching
New VAST 3.0 has just been published
11. Industry Solutions
Dec 2006 Web 2.0 Summit – Tim O’Reilly interviewed Jeff Bezos and Jeff mentioned “undifferentiated heavy lifting”
Industry solutions – try to identify new “undifferentiated heavy lifting” and solve the problem with AWS.
12. Some Examples: Real Time Bidding
Differentiators: Algorithms for RTB, ML Models
Heavy Lifting: Scaling EC2 instances, Low latency NoSQL
Heavy Lifting: Data access libraries, Optimizing Networking
13. Example RTB – Cost Curve
Do I need co-located bidders?
Do I need a dedicated networking team?
Do I need to own networking equipment?
[Chart: Monthly RTB Fleet Spend vs. Roundtrip Exchange Latency (ms). Y-axis: USD, $0 to $600,000. X-axis: Time (ms), 1 to 10.]
14. Customer Perspective on RTB Solutions
Customers have very different perspectives on what their business considers a competitive advantage
What is your competitive advantage?
What do you consider shareable knowledge?
More advanced customers are sharing more and raising the HL (heavy lifting) bar
[Figure: HL bar shown for Customer 1, Customer 2, Customer 3]
15. “We run the RTB platform on more than 2,500
machines, approximately eight hours a day globally, at
a cost of less than $0.05 per day per machine...”
“Because we’re running on AWS, we’re able to focus
95 percent of our staff on new product development.
Using AWS allows us to focus on innovating our
platform and solving customer problems.”
Valentino Volonghi, CTO AdRoll
Example: Enabling Real Time Bidding
Advertiser Solutions
16. Example: RTB Data Collection
Improved speed (mins to secs), simplicity & cost reduction
Reducing data latency to seconds
Ingesting approximately 150TB daily
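To put the daily volume in perspective, here is a small sketch that converts it to an average sustained rate. It assumes decimal terabytes and perfectly even traffic over 24 hours; real ad traffic has peaks, so the peak rate would be higher.

```python
# Assumption: decimal units (1 TB = 10**12 bytes); the slide does not say TB vs. TiB.
TB = 10**12
daily_bytes = 150 * TB
seconds_per_day = 24 * 60 * 60

# Average rate if the load were spread evenly across the day.
avg_bytes_per_sec = daily_bytes / seconds_per_day
avg_gb_per_sec = avg_bytes_per_sec / 10**9
avg_gbit_per_sec = avg_bytes_per_sec * 8 / 10**9

print(f"Average ingest: {avg_gb_per_sec:.2f} GB/s ({avg_gbit_per_sec:.1f} Gbit/s)")
```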
17. Big Data
This business is all about data – competitive advantage
Everyone does it, but still easy to overspend
Apache Spark
Druid
Analytics on streaming data
18. Solution Example: Ad Exchanges Outside AWS
[Diagram: the Customer on AWS connects through DX Partners to Equinix AdIX in Ashburn and in New York; each AdIX site reaches multiple Ad Exchange Providers. Link labels: Private IP, Public IP.]
Partner provides: Channel on NNI (<1GB) or Dedicated port (>1GB)
Reduce AWS traffic spend ~25%
Predictable latency (vs. Internet)
Reduced latency
19. Solution Example: Druid + Spark
Everyone knows and loves Apache Spark
Druid is less popular but was “quietly” implemented in a number of leading advertising companies
Druid is a real-time OLAP engine and can be used together with Apache Spark
20. Druid Architecture
[Diagram: Real Time Data flows into Real Time Nodes, which hand off data segments to Historical Nodes; Broker Nodes expose the Query API.]
21. Druid + Spark Solution Technology
Improves query performance
Uses Druid index under RDD, no change to Spark user experience
Uses open source library by Sparkline Data to rewrite the queries and redirect to Druid.
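To illustrate what "rewrite the queries and redirect to Druid" can mean, here is a minimal sketch that turns a simple sum-by-dimension aggregate into a Druid native groupBy query. This is not the Sparkline Data library's logic; the function name `to_druid_group_by` and the table, dimension, and metric names are hypothetical, and only the JSON field names follow Druid's native query format.

```python
import json
from typing import Dict, List


def to_druid_group_by(data_source: str, dimensions: List[str],
                      sum_metrics: List[str], interval: str) -> Dict:
    """Build a Druid native groupBy query for SUM(metric) GROUP BY dimensions.

    Hypothetical helper for illustration only; a real rewriter would also
    handle filters, other aggregators, and time granularity.
    """
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": "all",
        "dimensions": dimensions,
        "aggregations": [
            {"type": "doubleSum", "name": f"sum_{m}", "fieldName": m}
            for m in sum_metrics
        ],
        "intervals": [interval],  # ISO 8601 interval string
    }


# Roughly: SELECT campaign_id, SUM(spend) FROM impressions
#          WHERE ts within 2015-06-01 GROUP BY campaign_id
query = to_druid_group_by("impressions", ["campaign_id"], ["spend"],
                          "2015-06-01/2015-06-02")
print(json.dumps(query, indent=2))
```

The resulting JSON is what a client would send to a Broker node; the Spark user keeps writing the aggregate as usual.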
23. Solutions Ideas and Feedback
1. RTB
2. Ad Exchange Connectivity
3. Low latency user stores
4. Data Science
• Advertiser problems on sample data – attribution examples, etc.
• Streaming data analytics (Kinesis + EMR/Spark)
5. Mobile (SDK IAB compliance, mobile services etc.)
6. Ad content delivery
7. Cost savings with EC2 Spot
If we look at the advertising value chain, that is, all the companies between the Marketers and the Content Publishers, we see that AWS is enabling clients with a wide variety of business models. On the advertiser side of the business (Advertiser Solution Providers) we see Ad Agencies like Razorfish and Demand Side Platforms like DataXu, and on the publisher side we see Publisher Solution Providers or Supply Side Platforms like Fiksu and Zedo. We also have a very active group of advertising networks and ad exchanges growing rapidly on AWS, especially in the Mobile and Video segments.
Our clients see the value of significantly improving the economics of their business by using AWS. In addition, having multiple companies co-exist in the same AWS Regions creates a number of new opportunities for B2B collaboration between them. The digital advertising business frequently requires low-latency communication as well as data analytics on petabyte-size data sets. These scenarios are becoming very popular and are easily deployable on AWS.
Customers find operating ranges that they are comfortable with and that make economic sense
More advanced customers are sharing more and elevating overall knowledge
You can see that in other industries too
NO Undifferentiated heavy lifting – Compare and contrast with traditional companies
A great example of stunning growth and success is a company from California called AdRoll. AdRoll is an Advertiser Solution Provider and a global leader in retargeting, with more than 10,000 active advertisers in over 100 countries. The company is razor-focused on delivering new business functionality to its clients and leaving the undifferentiated heavy lifting to AWS. What sets this company apart is a significant simplification of the infrastructure, achieved through the latest advancements in several AWS technologies. A good question to think about: how many people in your organization are focused on new product development vs. maintenance of the existing solutions?
Video (3 mins) https://aws.amazon.com/solutions/case-studies/adroll/
Here is an example of a large-scale data ingest solution implemented by AdRoll
The key benefit is reducing data latency, and therefore time to insight, from minutes to seconds. It uses a streaming service called Kinesis, which receives the data from multiple sources and makes it available to both batch and real-time applications.
Batching records to save $ in Kinesis
Removing large numbers of small files
Ingesting approximately 150TB daily
AdRoll Kinesis ingest
http://tech.adroll.com/blog/data/2015/06/26/kinesis.html
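The "batching records to save $" note can be sketched as follows. This is an illustration of the general idea, not AdRoll's implementation: it assumes puts are billed in fixed-size payload units (25 KB is used here as an assumed unit), that events contain no embedded newlines, and the helper names `batch_records` and `payload_units` are hypothetical.

```python
import math
from typing import Iterable, List


def batch_records(records: Iterable[bytes], max_batch_bytes: int) -> List[bytes]:
    """Greedily pack newline-delimited records into blobs of at most max_batch_bytes."""
    batches: List[bytes] = []
    current: List[bytes] = []
    current_size = 0
    for record in records:
        line = record + b"\n"
        if len(line) > max_batch_bytes:
            raise ValueError("single record exceeds max_batch_bytes")
        # Start a new blob when the next record would overflow the current one.
        if current and current_size + len(line) > max_batch_bytes:
            batches.append(b"".join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += len(line)
    if current:
        batches.append(b"".join(current))
    return batches


def payload_units(blobs: Iterable[bytes], unit_bytes: int) -> int:
    """Count billing units if every blob is rounded up to a whole unit."""
    return sum(math.ceil(len(b) / unit_bytes) for b in blobs)


UNIT = 25 * 1024  # assumed billing unit size in bytes
events = [b"x" * 499 for _ in range(1000)]  # 1,000 hypothetical small events

unbatched_units = payload_units(events, UNIT)  # one unit per tiny event
batched = batch_records(events, UNIT)
batched_units = payload_units(batched, UNIT)

print(f"Unbatched: {unbatched_units} units, batched: {batched_units} units")
```

Packing many small events into each put also reduces the number of small objects written downstream, which relates to the "removing large numbers of small files" point.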
Stream is coming in
Real time nodes – buffer the incoming data and periodically hand it off to historical nodes
Brokers – you query them – they know how to locate the data
Real time node – periodically persists its buffer to disk
Real time node queries into
Segment – is sent to the Deep Storage/historical node
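The flow in these notes can be modelled with a toy in-memory sketch. It is not Druid code: the class names mirror the node roles from the slide, segments are simple fixed time windows, and deep storage is folded into the historical node for brevity.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, payload)


class HistoricalNode:
    """Holds immutable segments that real time nodes have handed off."""

    def __init__(self) -> None:
        self.segments: Dict[int, List[Event]] = {}  # segment start -> events


class RealTimeNode:
    """Buffers incoming events by time window and hands off closed windows."""

    def __init__(self, segment_seconds: int) -> None:
        self.segment_seconds = segment_seconds
        self.buffer: Dict[int, List[Event]] = {}  # segment start -> events

    def ingest(self, event: Event) -> None:
        start = event[0] - event[0] % self.segment_seconds
        self.buffer.setdefault(start, []).append(event)

    def hand_off(self, now: int, historical: HistoricalNode) -> None:
        # Only windows that have fully closed are moved to the historical node.
        for start in sorted(self.buffer):
            if start + self.segment_seconds <= now:
                historical.segments[start] = self.buffer.pop(start)


class Broker:
    """Answers queries by consulting both historical and real time nodes."""

    def __init__(self, realtime: RealTimeNode, historical: HistoricalNode) -> None:
        self.realtime = realtime
        self.historical = historical

    def query(self, start: int, end: int) -> List[Event]:
        results: List[Event] = []
        for store in (self.historical.segments, self.realtime.buffer):
            for events in store.values():
                results.extend(e for e in events if start <= e[0] < end)
        return sorted(results)


historical = HistoricalNode()
realtime = RealTimeNode(segment_seconds=60)
for ev in [(5, "a"), (59, "b"), (61, "c")]:
    realtime.ingest(ev)
realtime.hand_off(now=60, historical=historical)  # window [0, 60) is closed

broker = Broker(realtime, historical)
print(broker.query(0, 120))
```

The caller only talks to the broker and does not need to know which node currently holds a given time range.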
Availability – how is it handled (failure scenarios)
Two cases: losing the process, and losing both the process and the disk
Losing the process – bring it back up and it recovers what it has seen from the disk
Losing the process and the disk – that data did not make it to the deep storage
Features of the message bus – how do you deliver the message to the bus – must be able to replay the messages
Replicate the feed onto multiple machines.
Need to read the data back from Kafka
Configure multiple machines to handle the same data – to do HA on the real time node
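The two failure scenarios and the role of a replayable message bus can be sketched with a toy model. This is an illustration under simplifying assumptions, not Druid or Kafka code: `ReplayableLog` stands in for one partition of the bus, and the node derives its resume offset from whatever state survived the failure.

```python
from typing import List


class ReplayableLog:
    """Stand-in for one partition of a replayable message bus (e.g. Kafka)."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def append(self, message: str) -> None:
        self.messages.append(message)

    def read_from(self, offset: int) -> List[str]:
        return self.messages[offset:]


class RealTimeNode:
    def __init__(self, log: ReplayableLog, deep_storage: List[str]) -> None:
        self.log = log
        self.deep_storage = deep_storage  # survives any node failure
        self.disk: List[str] = []         # survives process loss only
        self.memory: List[str] = []       # lost whenever the process dies

    def consume(self) -> None:
        # Resume from the first message not covered by surviving state.
        offset = len(self.deep_storage) + len(self.disk) + len(self.memory)
        self.memory.extend(self.log.read_from(offset))

    def persist(self) -> None:
        self.disk.extend(self.memory)
        self.memory.clear()

    def hand_off(self) -> None:
        self.deep_storage.extend(self.disk)
        self.disk.clear()

    def lose_process(self) -> None:
        self.memory.clear()

    def lose_process_and_disk(self) -> None:
        self.memory.clear()
        self.disk.clear()

    def visible(self) -> List[str]:
        return self.deep_storage + self.disk + self.memory


log = ReplayableLog()
node = RealTimeNode(log, deep_storage=[])

for m in ["m0", "m1"]:
    log.append(m)
node.consume(); node.persist(); node.hand_off()   # m0, m1 reach deep storage

for m in ["m2", "m3"]:
    log.append(m)
node.consume(); node.persist()                    # m2, m3 are on disk only

log.append("m4")
node.consume()                                    # m4 is in memory only

node.lose_process()
node.consume()                                    # replays m4 from the log
after_process_loss = node.visible()

node.lose_process_and_disk()
node.consume()                                    # replays m2, m3, m4
after_disk_loss = node.visible()
print(after_process_loss, after_disk_loss)
```

Without a replayable bus, the second failure would permanently lose m2 to m4; replicating the feed to a second node, as the notes suggest, additionally keeps the data queryable while the failed node recovers.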