Claudia Perlich
Chief Scientist, Dstillery
Adjunct Professor, Stern (NYU)
@claudia_perlich
Tales from data trenches of display advertising
Ad Exchange
Shopping at one of our campaign sites
cookies
10 million URLs
200 million browsers
0.0001% to 1% base rate (conversion)
10 billion auctions per day
Where should we advertise and at what price?
Does the ad have a causal effect?
What data should we pay for?
Attribution?
Who should we target for a marketer?
What requests are fraudulent?
Our Browser Data: Agnostic
I do not want to ‘understand’ who you are …
A consumer’s online/mobile activity gets recorded like this:
The Non-Branded Web
Browsing History
Hashed URLs:
date1 abkcc
date2 kkllo
date3 88iok
date4 7uiol
…
The Branded Web
Brand Event
Encoded
date1 3012L20
date2 4199L30
…
date n 3075L50
Targeting Model
Bidding Model
Fraud
Causal Analysis
Analytical Decomposition
The Heart and Soul
• Predictive modeling on hashed browsing history
• 10 million dimensions for URLs (binary indicators)
• Extremely sparse data
• Positives are extremely rare
Targeting Model: P(Buy | URL, inventory, ad)
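A minimal sketch of how hashed browsing history can become sparse binary indicators. The hash function (SHA-256), the modulo mapping, and the example URLs are assumptions made for illustration; the slides do not say how the URLs are actually hashed.

```python
import hashlib

# Assumed size of the URL feature space, taken from the "10 Million" on the slide.
N_DIMS = 10_000_000


def url_to_index(url: str, n_dims: int = N_DIMS) -> int:
    """Map a URL to a stable column index (hypothetical hashing scheme)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_dims


def browser_features(urls):
    """Sparse binary representation: only the indices that are 1 are stored."""
    return sorted({url_to_index(u) for u in urls})


# Hypothetical browsing history; a repeated visit still yields a single indicator.
history = ["example.com/a", "example.org/b", "example.com/a"]
features = browser_features(history)
```

Storing only the active indices keeps each browser's record proportional to its history length rather than to the 10 million columns.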
How can we learn from 10M features with no/few positives?
• We cheat.
In ML, cheating is called “Transfer Learning”
The Heart and Soul
• Has to deal with the 10 million URLs
• Need to find more positives!
Targeting Model: P(Buy | URL, inventory, ad)
Experiment
Data
• Randomized targeting across 58 different large display ad campaigns.
• Served ads to users with active, stable cookies.
• Targeted ~5000 random users per day for each marketer. Campaigns ran for 1 to 5 months, with between 100K and 4MM impressions per campaign.
• Observed outcomes: clicks on ads, post-impression (PI) purchases (conversions).
Targeting
• Optimize targeting using Click and PI Purchase.
• Technographic info and web history as input variables.
• Evaluate each separately trained model on its ability to rank-order users for PI Purchase, using AUC (Mann-Whitney-Wilcoxon statistic).
• Each model is trained/evaluated using Logistic Regression.
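The AUC used for evaluation can be read directly as the Mann-Whitney-Wilcoxon statistic: the fraction of (positive, negative) pairs in which the positive gets the higher score. The sketch below illustrates that definition on made-up scores and labels; it is not the evaluation code behind the slides.

```python
def auc_mann_whitney(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties count as half a win. This is the quadratic-time definition,
    which is fine for illustration but not for billions of rows.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up model scores and purchase labels (1 = PI purchase).
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
auc = auc_mann_whitney(scores, labels)  # 5 of the 6 pairs are ordered correctly
```

Because AUC depends only on rank order, it lets a model trained on clicks or site visits be judged on purchases even though its scores are not calibrated purchase probabilities.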
Predictive performance* (AUC) for purchase learning [Dalessandro et al. 2012]
[Figure: AUC (axis 0.2 to 0.8), Train on Click vs. Train on Purchase]
Predictive performance* (AUC) for click learning [Dalessandro et al. 2012]
[Figure: AUC (axis 0.2 to 0.8), Train on Click vs. Train on Purchase]
Evaluated on predicting purchases (AUC in the target domain)
Predictive performance* (AUC) for Site Visit learning [Dalessandro et al. 2012]
[Figure: AUC distribution (axis 0.2 to 1), Train on Clicks vs. Train on Site Visits vs. Train on Purchase]
Significantly better targeting training on source task
Evaluated on predicting purchases (AUC in the target domain)
*Restricted feature set used for these modeling results; qualitative conclusions gener…
Why is learning the wrong thing better???
Transfer: Navigating Bias-Variance
Predictive performance* (AUC) across 58 different display ad campaigns [Dalessandro et al. 2012]
[Figure: AUC distribution (axis 0.2 to 1), Train on Clicks vs. Train on Site Visits vs. Train on Purchase]
Significantly better targeting training on source task
• Train on Purchase: high cost, high correlation, high variance
• Train on Clicks: low cost, low correlation, high bias
• Train on Site Visits: low cost, high correlation, low bias and variance
*Restricted feature set used for these modeling results; qualitative conclusions gener…
The Heart and Soul
• Has to deal with the 10 million URLs
• Transfer learning:
  • Use all kinds of site visits instead of new purchases
  • Biased sample in every possible way to reduce variance
  • Negatives are ‘everything else’
  • Pre-campaign without impression
• Stacking for transfer learning
Targeting Model
Organic: P(SiteVisit | URLs)
P(Buy | URL, inventory, ad)
MLJ 2014
Logistic regression in 10 million dimensions
• Stochastic Gradient Descent
• L1 and L2 constraints
• Automatic estimation of optimal learning rates
• Bayesian empirical industry priors
• Streaming updates of the models
• Fully automated: ~10000 models per week
KDD 2014
Targeting Model: p(sv|urls) =
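To make the ingredients listed above concrete, here is a small sketch of streaming logistic regression on sparse binary features with SGD and L1/L2 penalties. It is a generic textbook formulation, not the production system: the fixed learning rate, the soft-threshold handling of L1, the lazy regularization of only the active features, and the toy data are all assumptions, and the learning-rate estimation and industry priors from the slide are left out.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, active):
    """Score one browser; `active` lists the feature indices equal to 1."""
    return sigmoid(bias + sum(weights.get(i, 0.0) for i in active))


def sgd_step(weights, bias, active, label, lr=0.1, l1=0.0, l2=0.0):
    """One streaming update on a single example; returns the new bias.

    Only weights of active features are touched (lazy regularization),
    so the cost per example follows the history length, not 10M columns.
    """
    err = predict(weights, bias, active) - label  # d(log loss)/d(logit)
    shrink = lr * l1
    for i in active:
        w = weights.get(i, 0.0)
        w -= lr * (err + l2 * w)  # gradient step with L2 penalty
        # L1 penalty applied as a soft-threshold (proximal) step.
        if w > shrink:
            w -= shrink
        elif w < -shrink:
            w += shrink
        else:
            w = 0.0
        if w != 0.0:
            weights[i] = w
        else:
            weights.pop(i, None)  # keep the model sparse
    return bias - lr * err


# Toy stream: features 1 and 4 occur only with site visits, 3 and 5 only without.
data = [([1, 2], 1), ([2, 3], 0), ([1, 4], 1), ([3, 5], 0)] * 50
weights, bias = {}, 0.0
for active, label in data:
    bias = sgd_step(weights, bias, active, label, lr=0.1, l1=1e-4, l2=1e-3)
```

Keeping the weights in a dictionary means a model only pays for URLs it has actually seen, which is one way such a setup could scale to thousands of models.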
Real-time Scoring of a User
[Figure: a user's ProspectRank over time against a Threshold, across OBSERVATION and ENGAGEMENT phases; markers for ads served, site visits with positive correlation, site visits with negative correlation, and a purchase]
Some prospects fall out of favor once their in-market indicators decline.
Lift over baseline
Lift over random for 66 campaigns for online display ad prospecting
[Figure: NN lift over RON (axis 0 to 25) vs. total impressions (axis 0 to 6.0M); median lift = 5x]
Note: the top prospects are consistently rated as excellent compared to alternatives by advertising clients’ internal measures, and when measured by their analysis partners (e.g., Nielsen): high ROI, low cost-per-acquisition, etc.
<snip>
The Poker Face
Bidding Model: P(SiteVisit | Prospect Rank, Inventory, ad)
KDD 2012 Best Paper
Marginal Inventory Score:
Convert into bid price:
Inventory for Hotel Campaign
[Figure: lift by inventory]
Measuring causal effect?
A/B Testing
Practical concerns
Estimate causal effects from observational data
• Using targeted maximum likelihood (TMLE) to estimate causal impact
• Can be done ex-post for different questions
• Need to control for confounding
• Data has to be ‘rich’ and cover all combinations of confounding and treatment
ADKDD 2011
E[Y_{A=ad}] - E[Y_{A=no ad}]
An important decision…
I think she is hot!
Hmm, so what should I write to her to get her number?
Source: OK Trends
Hardships of causality.
Beauty is Confounding
It determines both the probability of getting the number and the probability that James will say “You are beautiful.”
We need to control for the actual beauty, or it can appear that making compliments is a bad idea.
Hardships of causality.
Targeting is Confounding
We only show ads to people we know are more likely to convert (ad or not)
[Figure: conversion rates, DID NOT SEE AD vs. SAW AD]
Need to control for confounding
Data has to be ‘rich’ and cover all combinations of confounding and treatment
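A small worked example of why targeting confounds the comparison. It uses plain stratification (standardization over a single confounder), which is a much simpler stand-in for TMLE, and every count below is invented for illustration.

```python
# (stratum, saw_ad) -> (browsers, conversions). Made-up counts in which
# targeting sends most ads to the browsers already likely to convert.
counts = {
    ("high_intent", True): (900, 90),
    ("high_intent", False): (100, 8),
    ("low_intent", True): (100, 2),
    ("low_intent", False): (900, 9),
}


def naive_effect(counts):
    """Raw difference in conversion rate: saw ad minus did not see ad."""
    def rate(saw_ad):
        n = sum(v[0] for k, v in counts.items() if k[1] == saw_ad)
        c = sum(v[1] for k, v in counts.items() if k[1] == saw_ad)
        return c / n
    return rate(True) - rate(False)


def adjusted_effect(counts):
    """Per-stratum differences, weighted by each stratum's population share."""
    strata = sorted({k[0] for k in counts})
    total = sum(n for n, _ in counts.values())
    effect = 0.0
    for s in strata:
        n_t, c_t = counts[(s, True)]
        n_c, c_c = counts[(s, False)]
        if n_t == 0 or n_c == 0:
            # The data must cover every confounder/treatment combination.
            raise ValueError(f"stratum {s!r} lacks treated or untreated browsers")
        effect += (n_t + n_c) / total * (c_t / n_t - c_c / n_c)
    return effect


naive = naive_effect(counts)        # 0.092 - 0.017 = 0.075
adjusted = adjusted_effect(counts)  # 0.5 * 0.02 + 0.5 * 0.01 = 0.015
```

In this invented data the naive comparison overstates the ad's effect fivefold, and the `ValueError` branch is the code-level version of the requirement that the data be ‘rich’.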
Observational Causal Methods: TMLE
Negative Test: wrong ad
Positive Test: A/B comparison
Some creatives do not work…
The Police
Fraud
• Tracking artificial co-visitation patterns
• Blacklist inventory in the exchanges
• Ignore the browser
KDD 2013
Unreasonable Performance Increase, Spring 12
[Figure: Performance Index over time, annotated “2 weeks” and “2x”]
Oddly predictive websites?
36% of traffic is Non-Intentional
2011: 6%
2012: 36%
Traffic patterns are ‘non-human’
[Figure: website 1 and website 2, 50% overlap]
Data from Bid Requests in Ad-Exchanges
Node: hostname
Edge: 50% co-visitation
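The node and edge definition above can be sketched as a small graph-building routine. Treating the edge as directed (from a to b when at least 50% of the browsers seen on a were also seen on b) is an assumption, as are the input format and the hostnames; the slide only states "50% co-visitation".

```python
from collections import defaultdict
from itertools import permutations


def covisitation_edges(visits, threshold=0.5):
    """Build co-visitation edges from (browser_id, hostname) pairs.

    Returns the set of directed edges (a, b) such that at least
    `threshold` of the browsers seen on a were also seen on b.
    """
    browsers = defaultdict(set)
    for browser_id, host in visits:
        browsers[host].add(browser_id)
    edges = set()
    # Quadratic in the number of hostnames; enough for a sketch.
    for a, b in permutations(browsers, 2):
        shared = len(browsers[a] & browsers[b])
        if shared / len(browsers[a]) >= threshold:
            edges.add((a, b))
    return edges


# Hypothetical bid-request log: sites a and b share nearly all their browsers.
visits = [
    (1, "a.example"), (2, "a.example"), (3, "a.example"), (4, "a.example"),
    (1, "b.example"), (2, "b.example"), (3, "b.example"),
    (4, "c.example"), (5, "c.example"), (6, "c.example"), (7, "c.example"),
]
edges = covisitation_edges(visits)
```

Unrelated sites rarely share half of their audience, so dense clusters of such edges are the kind of ‘non-human’ pattern the slides point to.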
[Figure: co-visitation networks for WWW 2010 and WWW 2012; labeled nodes: BostonHerald, womenshealthbase?]
Now it is also coming to brands
• ‘Cookie Stuffing’ increases the value of the ad for retargeting
• Messes up Web analytics …
• Messes up my models because a botnet is easier to predict than a human
Fraud pollutes my models
• Don’t show ads on those sites
• Don’t show ads to a hijacked browser
• Need to remove the visits to the fraud sites
• Need to remove the fraudulent brand visits
When we see a browser caught up in fraudulent activity: send it to the penalty box, where we ignore all its actions
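The penalty box idea can be sketched as a filter over the event log. This version assumes that a single visit to a flagged hostname is enough to box a browser and that all of its actions are dropped, both before and after that visit; the slides do not specify the trigger or any release rule, and the hostnames are made up.

```python
def apply_penalty_box(events, fraud_hosts):
    """Drop every action of any browser seen on a flagged hostname.

    events: list of (browser_id, hostname) pairs.
    fraud_hosts: set of hostnames flagged as fraudulent (hypothetical input).
    """
    boxed = {browser for browser, host in events if host in fraud_hosts}
    return [(browser, host) for browser, host in events if browser not in boxed]


# Browser 2 touches a flagged site, so its brand visit is ignored as well.
events = [
    (1, "news.example"),
    (2, "bot-farm.example"),
    (2, "brand.example"),
    (3, "brand.example"),
]
clean = apply_penalty_box(events, {"bot-farm.example"})
```

Filtering by browser rather than by visit removes the fraudulent brand visits along with the visits to the fraud sites, which covers the last two bullets above.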
Using the penalty box: all back to normal
[Figure: Performance Index, 3 more weeks in spring 2012]
On a personal note
claudia.perlich@gmail.com
Some References
1. B. Dalessandro, F. Provost, R. Hook. Audience Selection for On-Line Brand Advertising: Privacy Friendly Social Network Targeting. KDD 2009.
2. O. Stitelman, B. Dalessandro, C. Perlich, F. Provost. Estimating the Effect of Online Display Advertising on Browser Conversion. ADKDD 2011.
3. C. Perlich, O. Stitelman, B. Dalessandro, T. Raeder, F. Provost. Bid Optimizing and Inventory Scoring in Targeted Online Advertising. KDD 2012 (Best Paper Award).
4. T. Raeder, O. Stitelman, B. Dalessandro, C. Perlich, F. Provost. Design Principles of Massive, Robust Prediction Systems. KDD 2012.
5. B. Dalessandro, O. Stitelman, C. Perlich, F. Provost. Causally Motivated Attribution for Online Advertising. In Proceedings of KDD, ADKDD 2012.
6. B. Dalessandro, R. Hook, C. Perlich, F. Provost. Transfer Learning for Display Advertising. MLJ 2014.
7. T. Raeder, C. Perlich, B. Dalessandro, O. Stitelman, F. Provost. Scalable Supervised Dimensionality Reduction Using Clustering. KDD 2013.
8. O. Stitelman, C. Perlich, B. Dalessandro, R. Hook, T. Raeder, F. Provost. Using Co-visitation Networks for Classifying Non-Intentional Traffic. KDD 2013.

MLconf NYC Claudia Perlich

  • 1. Claudia Perlich Chief Scientist, Dstillery Adjunct Professor, Stern (NYU) @claudia_perlich Talesfromdatatrenchesof displayadvertising
  • 2. Ad Exchange Shopping at one of our campaign sites cookies 10 Million URL’s 200 Million browsers 0.0001% to 1% baserate 10 Billions of auctions per day conversion Where should we advertise and at what price? Does the ad have causal effect? What data should we pay for? Attribution? Who should we target for a marketer? What requests are fraudulent?
  • 3. The Non-Branded Web A consumer’s online/mobile activity The Branded Web gets recorded like this: Our BrowserData:Agnostic I do not want to ‘understand’ who you are … Browsing History Hashed URL’s: date1 abkcc date2 kkllo date3 88iok date4 7uiol … Browsing History Hashed URL’s: date1 abkcc date2 kkllo date3 88iok date4 7uiol … Brand Event Encoded date1 3012L20 date 2 4199L30 … date n 3075L50 Brand Event Encoded date1 3012L20 date 2 4199L30 … date n 3075L50
  • 5. TheHeartandSoul  Predictive modeling on hashed browsing history  10 Million dimensions for URL’s (binary indicators)  extremely sparse data  positives are extremely rare Targeting Model P(Buy|URL,inventory,ad)
  • 6. Howcanwelearnfrom10Mfeatureswith no/fewpositives?  We cheat. In ML, cheating is called “Transfer Learning”
  • 7. Theheartand soul  Has to deal with the 10 Million URL’s  Need to find more positives! Targeting Model P(Buy|URL,inventory,ad)
  • 8. Experiment. Randomized targeting across 58 different large display ad campaigns. Served ads to users with active, stable cookies. Targeted ~5000 random users per day for each marketer. Campaigns ran for 1 to 5 months, with between 100K and 4MM impressions per campaign. Observed outcomes: clicks on ads and post-impression (PI) purchases (conversions). Data and targeting: optimize targeting using Click and PI Purchase, with technographic info and web history as input variables. Evaluate each separately trained model on its ability to rank-order users for PI Purchase, using AUC (the Mann-Whitney-Wilcoxon statistic). Each model is trained/evaluated using logistic regression.
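The evaluation metric named on this slide, AUC as the Mann-Whitney-Wilcoxon statistic, can be sketched in a few lines. This is a minimal illustration, not the evaluation code behind the slides; the function name `auc_mann_whitney` and the toy scores are assumptions made here.

```python
def auc_mann_whitney(scores, labels):
    """AUC computed as the normalized Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive is scored above
    a randomly chosen negative, counting ties as one half.
    """
    # Sort indices by score, then give tied scores their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # U = (rank sum of positives) - (smallest possible rank sum of positives).
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)
```

Because the statistic depends only on the ordering of scores, a model trained on one outcome (clicks, site visits) can be compared directly with one trained on purchases, as long as both are scored against purchase labels.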
  • 9. [Figure: Predictive performance* (AUC) for purchase learning. AUC axis; series: Train on Click, Train on Purchase.] [Dalessandro et al. 2012] *Restricted feature set used for these modeling results; qualitative conclusions gener…
  • 10. [Figure: Predictive performance* (AUC) for click learning, evaluated on predicting purchases (AUC in the target domain). AUC axis; series: Train on Click, Train on Purchase.] [Dalessandro et al. 2012] *Restricted feature set used for these modeling results; qualitative conclusions gener…
  • 11. [Figure: Predictive performance* (AUC) for site-visit learning, evaluated on predicting purchases (AUC in the target domain). AUC distribution; series: Train on Clicks, Train on Site Visits, Train on Purchase.] Significantly better targeting when training on the source task. [Dalessandro et al. 2012] *Restricted feature set used for these modeling results; qualitative conclusions gener…
  • 14. [Figure: Predictive performance* (AUC) across 58 different display ad campaigns. AUC distribution; series: Train on Clicks, Train on Site Visits, Train on Purchase.] Significantly better targeting when training on the source task. Annotations: high cost, high correlation, high variance; low cost, low correlation, high bias; low cost, high correlation, low bias and variance. [Dalessandro et al. 2012] *Restricted feature set used for these modeling results; qualitative conclusions gener…
  • 15. The heart and soul. Has to deal with the 10 million URLs. Transfer learning: use all kinds of site visits instead of new purchases; biased sample in every possible way to reduce variance; negatives are ‘everything else’; pre-campaign, without impression; stacking for transfer learning. Targeting Model, organic: P(SiteVisit|URLs), then P(Buy|URL,inventory,ad). MLJ 2014
  • 16. Logistic regression in 10 million dimensions. Stochastic gradient descent. L1 and L2 constraints. Automatic estimation of optimal learning rates. Bayesian empirical industry priors. Streaming updates of the models. Fully automated: ~10,000 models per week. KDD 2014. Targeting Model: p(sv|urls) =
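A minimal sketch of what sparse logistic regression with stochastic gradient descent and L1/L2 penalties can look like on hashed URL indicators. Everything here is an assumption for illustration: the bucket count, the MD5 hashing, the fixed learning rate, and all function names. The automatic learning-rate estimation and the empirical Bayesian priors listed on the slide are not reproduced, and the standard logistic form is assumed for p(sv|urls).

```python
import hashlib
import math
import random

NUM_BUCKETS = 2 ** 20  # assumed size of the hashed feature space


def url_to_index(url, num_buckets=NUM_BUCKETS):
    """Map a URL to a stable bucket index (one binary indicator per bucket)."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def predict(weights, bias, active):
    """p(sv | urls) under a logistic model; `active` is a set of indices."""
    z = bias + sum(weights.get(i, 0.0) for i in active)
    z = max(-35.0, min(35.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(weights, bias, active, label, lr=0.1, l1=0.0, l2=0.0):
    """One SGD step on log loss; returns the updated bias.

    Only weights of URLs present in this browser's history are touched,
    which keeps each update cheap when the data is extremely sparse.
    """
    error = predict(weights, bias, active) - label  # gradient w.r.t. the logit
    for i in active:
        w = weights.get(i, 0.0)
        w -= lr * (error + l2 * w)  # log-loss gradient plus L2 shrinkage
        # L1 via soft-thresholding (a proximal step): small weights become 0.
        w = math.copysign(max(abs(w) - lr * l1, 0.0), w)
        if w == 0.0:
            weights.pop(i, None)  # keep the weight map sparse
        else:
            weights[i] = w
    return bias - lr * error


def train(examples, epochs=20, seed=0, **step_args):
    """examples: iterable of (list_of_urls, label) with label in {0, 1}."""
    rng = random.Random(seed)
    examples = list(examples)
    weights, bias = {}, 0.0
    for _ in range(epochs):
        rng.shuffle(examples)
        for urls, label in examples:
            active = {url_to_index(u) for u in urls}
            bias = sgd_step(weights, bias, active, label, **step_args)
    return weights, bias
```

Storing weights in a dictionary keyed by bucket index means memory grows with the number of non-zero weights rather than with the full feature space, and `sgd_step` can be called on one new example at a time, which is one way to read the slide's "streaming updates".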
  • 17. Real-time Scoring of a User. [Figure labels: Ad, Observation, Engagement, Purchase, Prospect Rank, Threshold, site visit with positive correlation, site visit with negative correlation.] Some prospects fall out of favor once their in-market indicators decline.
  • 18. Lift over baseline. [Figure: lift over random for 66 campaigns for online display ad prospecting. Axes: NN lift over RON, total impressions. Median lift = 5x.] Note: the top prospects are consistently rated as being excellent compared to alternatives by advertising clients’ internal measures, and when measured by their analysis partners (e.g., Nielsen): high ROI, low cost-per-acquisition, etc. <snip>
  • 19. The Poker Face. Bidding Model: P(SiteVisit|Prospect Rank, Inventory, ad). KDD 2012 Best Paper. Marginal Inventory Score: Convert into bid price:
  • 21. Measuring causal effect? A/B testing: practical concerns. Estimate causal effects from observational data: use targeted maximum likelihood estimation (TMLE) to estimate causal impact; can be done ex post for different questions; need to control for confounding; data has to be ‘rich’ and cover all combinations of confounding and treatment. ADKDD 2011. E[Y_{A=ad}] – E[Y_{A=no ad}]
  • 22. An important decision… I think she is hot! Hmm – so what should I write to her to get her number?
  • 24. Hardships of causality. Beauty is confounding: it determines both the probability of getting the number and the probability that James will say it. We need to control for the actual beauty, or it can appear that making compliments is a bad idea. “You are beautiful.”
  • 25. Hardships of causality. Targeting is confounding: we only show ads to people we know are more likely to convert (ad or not). [Figure: conversion rates, SAW AD vs. DID NOT SEE AD.] Need to control for confounding. Data has to be ‘rich’ and cover all combinations of confounding and treatment.
  • 26. Observational Causal Methods: TMLE. Negative test: wrong ad. Positive test: A/B comparison.
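To make the confounding argument concrete, here is a small sketch that contrasts the naive difference in conversion rates with a stratified (back-door adjusted) estimate of E[Y_{A=ad}] – E[Y_{A=no ad}]. Stratification is swapped in here as a simpler stand-in; it is not TMLE, and the function names and the single discrete `stratum` variable (for example, a targeting-score bucket) are assumptions of this illustration.

```python
from collections import defaultdict


def naive_effect(rows):
    """Conversion rate of browsers that saw the ad minus those that did not.

    rows: list of (stratum, saw_ad, converted), saw_ad and converted in {0, 1}.
    """
    def rate(treated):
        outcomes = [y for _, a, y in rows if a == treated]
        return sum(outcomes) / len(outcomes)
    return rate(1) - rate(0)


def stratified_effect(rows):
    """Average the within-stratum differences, weighted by stratum size.

    Every stratum must contain both treated and untreated browsers, which is
    the 'data has to cover all combinations' requirement in miniature.
    """
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for stratum, saw_ad, converted in rows:
        by_stratum[stratum][saw_ad].append(converted)

    total = sum(len(g[0]) + len(g[1]) for g in by_stratum.values())
    effect = 0.0
    for stratum, g in by_stratum.items():
        if not g[0] or not g[1]:
            raise ValueError(f"stratum {stratum!r} lacks treated or untreated rows")
        diff = sum(g[1]) / len(g[1]) - sum(g[0]) / len(g[0])
        effect += diff * (len(g[0]) + len(g[1])) / total
    return effect
```

In a toy data set where high-scoring browsers convert at 10% and low-scoring browsers at 5% regardless of the ad, but ads go mostly to the high-scoring group, `naive_effect` reports a positive lift while `stratified_effect` returns zero.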
  • 28. The Police: Fraud. Tracking artificial co-visitation patterns. Blacklist inventory in the exchanges. Ignore the browser. KDD 2013
  • 32. Traffic patterns are ‘non-human’. [Figure labels: website 1, website 2, 50%.] Data from bid requests in ad exchanges.
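One way to read "artificial co-visitation patterns" is as an unusually large share of one site's browsers also appearing on another site. The sketch below computes that share from (browser, site) pairs. The 50% threshold echoes the figure on the slide, but its exact role there is an assumption, as are the `min_browsers` cutoff and the function name.

```python
from collections import defaultdict
from itertools import combinations


def covisitation_edges(visits, threshold=0.5, min_browsers=10):
    """Return directed edges (src, dst, share).

    share is the fraction of src's browsers that were also seen on dst.
    visits: iterable of (browser_id, site) pairs, e.g. parsed from bid requests.
    """
    browsers_by_site = defaultdict(set)
    for browser_id, site in visits:
        browsers_by_site[site].add(browser_id)

    edges = []
    for a, b in combinations(sorted(browsers_by_site), 2):
        shared = len(browsers_by_site[a] & browsers_by_site[b])
        for src, dst in ((a, b), (b, a)):
            audience = len(browsers_by_site[src])
            # Skip tiny audiences, where a high share is not informative.
            if audience >= min_browsers and shared / audience >= threshold:
                edges.append((src, dst, shared / audience))
    return edges
```

Sites that end up densely connected in the resulting graph are candidates for review. The pairwise loop is quadratic in the number of sites, so at the scale described in the talk an inverted index from browsers to sites would be needed instead.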
  • 41. Unreasonable performance increase, spring 2012. [Figure labels: Performance Index, 2 weeks, 2x.]
  • 42. Now it is also coming to brands. ‘Cookie stuffing’ increases the value of the ad for retargeting. It messes up Web analytics … It messes up my models, because a botnet is easier to predict than a human.
  • 43. Fraud pollutes my models. Don’t show ads on those sites. Don’t show ads to a hijacked browser. Need to remove the visits to the fraud sites. Need to remove the fraudulent brand visits. When we see a browser caught up in fraudulent activity, we send it to the penalty box, where we ignore all its actions.
  • 44. Using the penalty box: all back to normal. [Figure labels: Performance Index, 3 more weeks in spring 2012.]
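The penalty-box rule can be sketched as a simple filter over event logs. The tuple layout and the function name are assumptions, and whether a browser is ever released from the box is not stated on the slide, so this version never releases one.

```python
def apply_penalty_box(events, fraud_sites):
    """Drop every action of any browser that was seen on a flagged site.

    events: list of (browser_id, site, action) tuples.
    fraud_sites: set of sites flagged as fraudulent.
    Returns (clean_events, penalized_browsers).
    """
    penalized = {browser for browser, site, _ in events if site in fraud_sites}
    clean = [event for event in events if event[0] not in penalized]
    return clean, penalized
```

Filtering by browser rather than by site removes the browser's visits to brand sites as well, which matters because those visits would otherwise be used as positives when training the targeting models.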
  • 46. Some References
      1. B. Dalessandro, F. Provost, R. Hook. Audience Selection for On-Line Brand Advertising: Privacy Friendly Social Network Targeting. KDD 2009
      2. O. Stitelman, B. Dalessandro, C. Perlich, F. Provost. Estimating The Effect Of Online Display Advertising On Browser Conversion. ADKDD 2011
      3. C. Perlich, O. Stitelman, B. Dalessandro, T. Raeder, F. Provost. Bid Optimizing and Inventory Scoring in Targeted Online Advertising. KDD 2012 (Best Paper Award)
      4. T. Raeder, O. Stitelman, B. Dalessandro, C. Perlich, F. Provost. Design Principles of Massive, Robust Prediction Systems. KDD 2012
      5. B. Dalessandro, O. Stitelman, C. Perlich, F. Provost. Causally Motivated Attribution for Online Advertising. In Proceedings of KDD, ADKDD 2012
      6. B. Dalessandro, R. Hook, C. Perlich, F. Provost. Transfer Learning for Display Advertising. MLJ 2014
      7. T. Raeder, C. Perlich, B. Dalessandro, O. Stitelman, F. Provost. Scalable Supervised Dimensionality Reduction Using Clustering. KDD 2013
      8. O. Stitelman, C. Perlich, B. Dalessandro, R. Hook, T. Raeder, F. Provost. Using Co-visitation Networks For Classifying Non-Intentional Traffic. KDD 2013