May 12, 2014
Ad Yield Optimization @ Spotify
I’m Kinshuk Mishra
•  Work on distributed systems and data science problems
•  Lead architecture for ads backend platform at Spotify
•  You can find me @_kinshukmishra
What is Spotify?
•  Started in 2006
•  Currently has over 24 million users
•  6 million paying users
•  Available in 28 countries
•  Over 300 engineers, of whom 100 are in NYC
Why are Ads important?
•  getFreeTierUsers() / getAllUsers() > 0.70
•  getSpotifyPayoutToMusicLabels() = $$$
•  Great medium for promotions and announcements
Native Ads
The problem
How do we optimize the ad yield on the Spotify platform?
The type of questions we have
What are the total available audio ad impressions on the iOS platform
between 9/12/2013 and 9/13/2013 in the NYC metro area for male
users in the 18-35 age group who typically listen to the hip-hop
music genre?
What is unique about us?
•  Rules triggering ad breaks are unique
•  We also log user activity and audio streaming data
Different approaches
•  Simulate ad delivery by replaying user events and
triggering ad breaks
•  Pre-compute impression aggregates for different
dimensions and build a complex model to combine those
•  Use a subset of impression data, then filter and extrapolate it
using a simple model
Our Hadoop infrastructure
700 nodes in our Hadoop cluster
Some constraints
•  Fast real-time lookup service
•  Consistent results
•  Ability to handle additional targeting
•  Ability to scale
The solution
Use a subset of impression data, then filter and extrapolate it
using a simple model in a service
But how?
Now begins the fun part…
Let's dive deeper to solve this problem
What was the big picture going to be like?
[Diagram: Hadoop (ad impression log) and Postgres DB (booked campaigns), the forecasting engine, and the forecast query]
High level forecasting engine algorithm
[Flow diagram]
•  Load data cache: log data (daily), campaign data (once a minute)
•  Submit forecast query / wait for query
•  Apply filter criteria to dataset
•  Count available impressions
•  Apply growth and other extrapolation factors
Some challenges…
•  Organic growth in inventory
•  Cold start
•  Seasonality
Organic growth in inventory
Ad impression inventory in a growing market
Organic growth in inventory?
Ad impression inventory in a market with high conversion to premium
Cold start
Ad impression inventory in a newly launched market
Seasonality
Ad impression inventory dip in early Q1
Volume of data
•  Billions of ad impressions per month
•  Terabytes of relevant forecasting data
Data overload?
Sampling
Caching
[Diagram: log data (loaded daily) and campaign data (loaded once a minute) feed the data cache; dates shown: 9/07/2013 through 9/12/2013, then 9/13/2013 and 9/14/2013]
Optimizing data retrieval
•  We analyzed our data access pattern and found over 75% of
our campaigns are targeted by age and location.
•  So we mapped location to a list of users sorted by age using
SortedSetMultimap
•  Optimized user lookup by location and age group from the typical
O(kN) to O(k log N), where
N : total users for a location
k : constant
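
The slide names Guava's SortedSetMultimap; the same idea can be sketched in Python with a sorted list per location and binary search. This is a minimal illustration under stated assumptions, not the production code: the class name `LocationAgeIndex`, its methods, and the `(age, user_id)` layout are hypothetical.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class LocationAgeIndex:
    """Hypothetical index: location -> list of (age, user_id) kept sorted by age."""

    def __init__(self):
        self._by_location = defaultdict(list)

    def add(self, location, age, user_id):
        # insort keeps the per-location list sorted as users are added.
        insort(self._by_location[location], (age, user_id))

    def users_in_age_range(self, location, min_age, max_age):
        """Return user ids at `location` with min_age <= age <= max_age."""
        entries = self._by_location.get(location, [])
        # (min_age,) sorts before every (min_age, user_id), so `lo` is the
        # first entry whose age is at least min_age.
        lo = bisect_left(entries, (min_age,))
        # Ages are integers, so the first entry with age >= max_age + 1
        # marks the end of the range.
        hi = bisect_left(entries, (max_age + 1,))
        return [user_id for _, user_id in entries[lo:hi]]


index = LocationAgeIndex()
index.add("NYC", 25, "a")
index.add("NYC", 17, "b")
index.add("NYC", 35, "c")
index.add("NYC", 36, "d")
index.add("SF", 30, "e")
nyc_18_35 = index.users_in_age_range("NYC", 18, 35)
```

Finding the two range boundaries takes O(log N) comparisons per location instead of a scan over all N users; collecting the matching users is then proportional to the number of matches.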
[Chart: Growth by Day of the Month (axis ticks 1 to 32)]
How to find available inventory for sample population?
1.  Take all user ad impressions by applying “day of the month”
substitution
2.  Apply filters by ad-type, location, age, gender, platform, etc.
3.  Count the total impressions for all the users who match
4.  Read booked impressions for similar target criteria from
the cache
5.  Inventory available = total impressions – booked
impressions
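
Steps 2 to 5 above can be sketched as follows. This is a minimal illustration, not the engine itself: the `Impression` and `Target` record layouts and the function names are hypothetical, the "day of the month" substitution from step 1 is assumed to have been applied to the input already, and the booked count is passed in rather than read from a cache.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Impression:
    # Hypothetical record for one sampled ad impression.
    user_id: str
    ad_type: str
    location: str
    age: int
    gender: str
    platform: str


@dataclass(frozen=True)
class Target:
    # Hypothetical targeting criteria for a forecast query.
    ad_type: str
    location: str
    min_age: int
    max_age: int
    gender: str
    platform: str


def matches(imp: Impression, target: Target) -> bool:
    """Step 2: apply the filters to a single impression."""
    return (
        imp.ad_type == target.ad_type
        and imp.location == target.location
        and target.min_age <= imp.age <= target.max_age
        and imp.gender == target.gender
        and imp.platform == target.platform
    )


def available_sample_inventory(
    impressions: Iterable[Impression], target: Target, booked_impressions: int
) -> int:
    # Step 3: count impressions for all users who match.
    total = sum(1 for imp in impressions if matches(imp, target))
    # Step 5: inventory available = total impressions - booked impressions.
    return total - booked_impressions
```

Because the count is a plain filter over a fixed dataset, the same query always returns the same number, which fits the deck's preference for deterministic algorithms.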
Growth Factor
Keep it simple
Extrapolation
•  Population (15 million) -> Sample (150,000)
•  Scaling factor is 100
•  Total Available inventory = scaling factor * available inventory for sample
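
The extrapolation arithmetic can be written out directly. The population and sample sizes below come from the slide; the function names and the sample inventory figure of 1,200 are made up for illustration.

```python
def scaling_factor(population_size: int, sample_size: int) -> float:
    """Ratio of the full user population to the sampled users."""
    return population_size / sample_size


def total_available_inventory(
    sample_available: int, population_size: int, sample_size: int
) -> int:
    """Scale the sample's available inventory up to the whole population."""
    return round(sample_available * scaling_factor(population_size, sample_size))


# Slide figures: 15 million users, 150,000 sampled, so the factor is 100.
factor = scaling_factor(15_000_000, 150_000)
# Hypothetical sample result of 1,200 available impressions.
forecast = total_available_inventory(1_200, 15_000_000, 150_000)
```

A single multiplicative factor assumes the sample is representative of the population for the targeted segment; the deck does not say how the sample was drawn.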
Other features
•  Ad Frequency capping
•  Day of the week and time of the day filtering
•  View per user (VPU) capping
What worked for us?
1.  Fast lookups
2.  Simple models scaled well
3.  Deterministic algorithms easier to debug
4.  Adding new targeting features was easy
5.  Forecasting engine agnostic to changes in ad server
What didn’t work that well?
1.  Campaign level forecasts difficult without simulation
2.  Cold start is a real problem when there is no proxy dataset
3.  Forecasting inventory for new ad types can be challenging
What we’ve learnt
•  Think data volume
•  Consider Sampling
•  Choose appropriate time window
•  Analyze data access patterns and optimize for it
•  Use deterministic algorithms
•  Analyze data trends and factor those in computation
•  Simple models scale well
Email - Kinshuk@spotify.com
https://twitter.com/Spotifyjobs
Thanks!

Ad Yield Optimization @ Spotify - DataGotham 2013
