Copyright © 2014 Criteo
Making Advertising Personal
Large Scale Real-Time Product Recommendation at Criteo
Olivier Koch & Romain Lerallut
May 19th, 2015
Performance Advertising
We buy
• Inventory! (ad spaces)
• Billions of times a day
• All over the Internet
• For 95% of the population
Where is the need for tech?
We sell
• Clicks!
• (that convert)
• (that convert a lot)
We take the risk
You pay only for what you get
International infrastructure
Key figures
Traffic
550k HTTP requests/sec (peak activity)
23,000 impressions/sec (peak activity)
180k requests/sec on RTB (average)
Less than 10 ms to process an RTB request
Figures as of May 2013
Physical infrastructure
6 data centers on 3 continents, designed and operated in-house
~12,000 servers, largest Hadoop cluster in Europe
Availability / uptime > 99.95%
More than 20 PB of Big Data storage
Global Engine Architecture
Arbitrage
• Should we bid?
• At which price?
Recommendation
• Which products should we display?
Graphical optimization
• Big image vs. small image
• Background color, ...
Prediction
• Generic prediction engine
• Specific models trained on TBs of data
Recommendation for Advertising
Building ads with personalized content in real time
2.5 billion products
~8 ms response time
Data Sources
• Catalog data
• Feed provided by the merchants
• User behavior data
• Large scale intent data
• All visits to merchant websites
• Page views, basket, sales events
• Ad display data
• Displayed and clicked ads
Recommendation execution flow
Candidate Generation
• Get candidates from all sources using the user's historical products and best-ofs
Candidate Aggregation
• Remove duplicates
• Aggregate features
Preselection
• Call a degraded prediction model to reduce the number of candidates
Scoring
• Call the full prediction model to score each candidate
Winner Selector
• Select N products by score, with randomization
Glup Logging
• Log recommended products and prediction variables
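The execution flow above can be sketched as a chain of small functions. This is only an illustrative sketch: the names (Candidate, generate, aggregate, preselect, select_winners, recommend), the feature dictionaries, the additive jitter and the `log` callback standing in for Glup are all assumptions, not the production code described in the talk.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Candidate:
    product_id: str
    features: dict = field(default_factory=dict)


def generate(sources, user_history):
    # Each (hypothetical) source maps the user's historical products to candidates.
    return [c for source in sources for c in source(user_history)]


def aggregate(candidates):
    # Remove duplicates and merge the features reported by each source.
    merged = {}
    for c in candidates:
        kept = merged.setdefault(c.product_id, Candidate(c.product_id))
        kept.features.update(c.features)
    return list(merged.values())


def preselect(candidates, cheap_score, keep):
    # Degraded (cheap) prediction: keep only the most promising candidates.
    return sorted(candidates, key=cheap_score, reverse=True)[:keep]


def select_winners(candidates, full_score, n, rng, jitter=0.0):
    # Full prediction on the survivors, then top N with optional random jitter.
    scored = [(full_score(c) + rng.uniform(0.0, jitter), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:n]]


def recommend(sources, user_history, cheap_score, full_score, n,
              keep=50, jitter=0.0, rng=None, log=print):
    rng = rng or random.Random()
    candidates = aggregate(generate(sources, user_history))
    shortlist = preselect(candidates, cheap_score, keep)
    winners = select_winners(shortlist, full_score, n, rng, jitter)
    # Stand-in for the logging step: record what was recommended.
    log({"recommended": [c.product_id for c in winners]})
    return winners
```

Each stage only sees the output of the previous one, so the expensive `full_score` callable runs on at most `keep` candidates.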
Issue #1: Retrieving user-specific products
• The need: storing [user → interesting products] vectors
• Difficult to store and retrieve at scale (900+ million users)
• Hard to keep up to date
• Using seen products as a proxy
• Store the [user → viewed products] vectors
• Easier to maintain
• Store [viewed product → interesting products] vectors
• Based on aggregated user behavior data
• Computable offline
• Final ranking by an ML model
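A minimal sketch of the two-table idea, assuming the similarities are derived from co-views within a session. The slides only say they are based on aggregated user behavior data, so the co-view counting, the function names and the in-memory dictionaries are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def build_similarities(sessions, top_k=3):
    # Offline step: count how often two products are viewed in the same session,
    # then keep the top_k most co-viewed products for each product.
    co_views = {}
    for viewed in sessions:
        for a, b in combinations(sorted(set(viewed)), 2):
            co_views.setdefault(a, Counter())[b] += 1
            co_views.setdefault(b, Counter())[a] += 1
    return {p: [q for q, _ in counts.most_common(top_k)]
            for p, counts in co_views.items()}


def candidates_for(user_id, viewed_by_user, similar_to):
    # Online step: two cheap lookups instead of one huge per-user table.
    seen = viewed_by_user.get(user_id, [])
    found = []
    for product in seen:
        for other in similar_to.get(product, []):
            if other not in seen and other not in found:
                found.append(other)
    return found
```

The per-user table stays small (recently viewed products only), while the product-to-product table can be rebuilt in batch without touching user data.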
Issue #2: Scoring products
• Need to fuse data from several sources
• Product-specific
• User-specific
• User-product interactions
• Display-specific
• 1st solution: regression model
• Predict P(product click then sale)
• Easy to evaluate
• 2nd solution: ranking model
• More appropriate for our needs
• Still maximizing post-click sales
• Can be evaluated only on multi-product banners
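The difference between the two solutions can be illustrated with their loss functions. The slides do not name the models, so the linear scorer, the log loss and the pairwise logistic loss below are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(weights, features):
    return sum(w * x for w, x in zip(weights, features))


def pointwise_loss(weights, features, label):
    # Regression view: log loss on P(click then sale) for one displayed product.
    p = sigmoid(dot(weights, features))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def pairwise_loss(weights, better, worse):
    # Ranking view: only the score difference between two products
    # shown in the same banner matters.
    margin = dot(weights, better) - dot(weights, worse)
    return math.log(1.0 + math.exp(-margin))
```

The pairwise loss needs two products from the same display to compare, which is one way to see why a ranking model can be evaluated only on multi-product banners.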
Issue #3: Picking the right products
• Several questions:
• User-product fatigue
• Independent product choice assumption
• Explore / exploit
• Solution 1: Randomization in the banners
• Keep independent products assumption
• Separate optimization process to shuffle displayed items
• Solution 2: A better scoring model
• Score a full banner, not independent products
• Store all product display counts
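One possible reading of "randomization in the banners" combined with stored display counts is sketched below. The exponential fatigue discount, the softmax sampling and every name here are hypothetical; the talk does not specify how the shuffle or the fatigue handling is implemented.

```python
import math
import random


def fatigue_penalty(score, display_count, decay=0.5):
    # Hypothetical discount: each prior display of this product to this user
    # multiplies its score by `decay`.
    return score * (decay ** display_count)


def sample_banner(scores, n, temperature=1.0, rng=None):
    # Draw n distinct products; higher scores are more likely but not certain,
    # which leaves room for exploration.
    rng = rng or random.Random()
    remaining = dict(scores)
    banner = []
    while remaining and len(banner) < n:
        products = list(remaining)
        weights = [math.exp(remaining[p] / temperature) for p in products]
        pick = rng.choices(products, weights=weights, k=1)[0]
        banner.append(pick)
        del remaining[pick]
    return banner
```

Lowering `temperature` moves the selection toward pure exploitation (always the top scores); raising it moves toward uniform exploration.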
Issue #4: 8 ms response time! Tweaking for performance
• CTR / Sales prediction takes ~40 µs per candidate using an in-house library
• 2-step prediction:
• A fast pass to remove most of the candidates
• A slow pass to accurately score the remaining candidates
And the technical fine print:
• All real-time code in C#
• Async I/O for better efficiency
• HAProxy to scale the front-end
• Memcached to store all required data in memory
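The figures above imply a hard cap on how many candidates can be scored. As a rough sketch, assuming the ~40 µs cost applies to the full model and that scoring runs serially: 8 ms / 40 µs = 200 candidates if the entire budget went to scoring. The fast-pass cost and candidate counts in the usage line are hypothetical numbers, not figures from the talk.

```python
def max_candidates(budget_ms, cost_us_per_candidate):
    # How many candidates fit if the whole budget went to scoring.
    return int(budget_ms * 1000 // cost_us_per_candidate)


def two_pass_cost_us(n_candidates, n_kept, fast_us, slow_us):
    # Fast pass on everything, slow pass only on the survivors.
    return n_candidates * fast_us + n_kept * slow_us


print(max_candidates(8, 40))               # 200
print(two_pass_cost_us(1000, 50, 2, 40))   # 4000 µs with a hypothetical 2 µs fast pass
```

Under these assumed numbers, scoring 1,000 candidates with the full model alone would take 40 ms, while the two-pass version takes 4 ms, which leaves room for I/O within the 8 ms budget.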
Upcoming challenges
• Long(er)-term user profiles
• More and better product information (images, semantics, NLP)
• Vertical-optimized engine
• Classifieds (catalog-free recommendation?)
• Travel
• Instant update of similarities
• (because batch computation is soooo last year)
Fancy a try?
With us!
http://www.criteo.com/careers/
On your own:
• Our 1st public dataset is online: http://bit.ly/1vgw2XC
• 4 GB of display and click data, Kaggle challenge in 2014
• NEW: 1 TB dataset released a few weeks ago
• Hosted on Microsoft Azure, just waiting for you
Questions?
Thank you!
o.koch@criteo.com
r.lerallut@criteo.com

Making advertising personal, 4th NL Recommenders Meetup

  • 1. Making Advertising Personal: Large Scale Real-Time Product Recommendation at Criteo. Olivier Koch & Romain Lerallut, May 19th, 2015. Copyright © 2014 Criteo
  • 2. Performance Advertising. We buy • Inventory! (ad spaces) • Billions of times a day • All over the Internet • For 95% of the population. Where is the need for tech? We sell • Clicks! • (that convert) • (that convert a lot). We take the risk: you pay only for what you get.
  • 3. International infrastructure: key figures (as of May 2013). Traffic • 550k HTTP requests/sec (peak activity) • 23,000 impressions/sec (peak activity) • 180k requests/sec on RTB (average) • Less than 10 ms to process an RTB request. Physical infrastructure • 6 data centers on 3 continents, designed and operated in-house • ~12,000 servers, largest Hadoop cluster in Europe • Availability / uptime > 99.95% • More than 20 PB of storage (Big Data)
  • 4. Global Engine Architecture. Arbitrage • Should we bid? • At which price? Recommendation • Which products should we display? Graphical optimization • Big image vs small image • Background color, ... Prediction • Generic prediction engine • Specific models trained on TBs of data
  • 5. Recommendation for Advertising: building ads with personalized content in real time • 2.5 billion products • ~8 ms response time
  • 6. Data Sources • Catalog data • Feed provided by the merchants • User behavior data • Large scale intent data • All visits to merchant websites • Page views, basket, sales events • Ad display data • Displayed and clicked ads
  • 7. Recommendation execution flow. Candidates Generation • Get candidates from all sources using the user's historical products and best-ofs. Candidates Aggregation • Remove duplicates • Aggregate features. Preselection • Call a degraded (cheaper) prediction to decrease the number of candidates. Scoring • Call the full prediction model to score each candidate. Winner Selector • Select N products on score, with randomization. Glup Logging • Log recommended products and prediction variables
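
The execution flow on slide 7 can be sketched in a few lines of Python. This is only an illustration of the six stages, not Criteo's implementation (the slides state the real-time code is C#). The names `Candidate`, `aggregate`, `recommend`, `fast_score` and `full_score` are hypothetical, and the randomization step is assumed here to be a simple shuffle of the winners, since the slide does not say how it is done.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Candidate:
    product_id: str
    features: dict = field(default_factory=dict)


def aggregate(candidates):
    """Remove duplicates, merging the features reported by each source."""
    merged = {}
    for c in candidates:
        if c.product_id in merged:
            merged[c.product_id].features.update(c.features)
        else:
            merged[c.product_id] = Candidate(c.product_id, dict(c.features))
    return list(merged.values())


def recommend(sources, fast_score, full_score, keep, n, rng=None):
    rng = rng or random.Random()
    # 1. Candidates generation: collect candidates from every source.
    raw = [c for source in sources for c in source]
    # 2. Candidates aggregation: deduplicate and merge features.
    pool = aggregate(raw)
    # 3. Preselection: a cheap (degraded) score keeps only `keep` candidates.
    pool = sorted(pool, key=fast_score, reverse=True)[:keep]
    # 4. Scoring: the full model scores each surviving candidate.
    scored = sorted(((full_score(c), c) for c in pool),
                    key=lambda pair: pair[0], reverse=True)
    # 5. Winner selection: top n on score, then shuffled (assumed randomization).
    winners = scored[:n]
    rng.shuffle(winners)
    # 6. Logging: return the records a logging system would persist.
    log = [{"product_id": c.product_id, "score": s, "features": c.features}
           for s, c in winners]
    return [c.product_id for _, c in winners], log
```

Returning the log records instead of writing them keeps the sketch free of I/O; a real system would send them to its logging pipeline.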
  • 8. Issue #1: Retrieving user-specific products • The need: storing [user → interesting products] vectors • Difficult to store and retrieve at scale (900+ million users) • Hard to keep up to date • Using seen products as a proxy • Store the [user → viewed products] vectors • Easier to maintain • Store [viewed product → interesting products] vectors • Based on aggregated user behavior data • Computable offline • Final ranking by an ML model
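
The two-table lookup on slide 8 can be illustrated as follows. The slide only says the [viewed product → interesting products] vectors are based on aggregated user behavior data and computable offline; counting co-views is an assumption made for this sketch, and `build_similar_products` and `candidates_for` are hypothetical names.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_similar_products(user_views, top_k=2):
    """Offline step: for each product, keep the top_k most co-viewed products."""
    co_views = defaultdict(Counter)
    for viewed in user_views.values():
        # Every ordered pair of distinct products seen by the same user.
        for a, b in permutations(set(viewed), 2):
            co_views[a][b] += 1
    return {p: [q for q, _ in counts.most_common(top_k)]
            for p, counts in co_views.items()}


def candidates_for(user, user_views, similar):
    """Online step: expand the user's viewed products via the offline table."""
    seen = user_views.get(user, [])
    out = []
    for p in seen:
        for q in similar.get(p, []):
            if q not in seen and q not in out:
                out.append(q)
    return out
```

Only the small [user → viewed products] vector has to be kept fresh per user; the product-to-product table is shared by all users and can be rebuilt in batch. The candidates returned here would still go through the final ML ranking.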
  • 9. Issue #2: Scoring products • Need to fuse data from several sources • Product-specific • User-specific • User-product interactions • Display-specific • 1st solution: regression model • Predict P(product click then sale) • Easy to evaluate • 2nd solution: ranking model • More appropriate for our needs • Still maximizing post-click sales • Can be evaluated only on multi-product banners
  • 10. Issue #3: Picking the right products • Several questions: • User-product fatigue • Independent product choice assumption • Explore / exploit • Solution 1: Randomization in the banners • Keep independent products assumption • Separate optimization process to shuffle displayed items • Solution 2: A better scoring model • Score a full banner, not independent products • Store all product display counts
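
One way to picture "Solution 1" on slide 10 is a selection step that runs after independent per-product scoring. The slides do not describe the actual mechanism, so everything below is an assumption for illustration: a geometric fatigue discount based on stored display counts, an epsilon-greedy swap for exploration, and a final shuffle of the displayed items. `pick_banner`, `fatigue` and `epsilon` are hypothetical names.

```python
import random


def pick_banner(scored, n, display_counts, fatigue=0.8, epsilon=0.1, rng=None):
    """Pick n products from {product_id: score}, then shuffle their order."""
    rng = rng or random.Random()
    # Fatigue: discount a product each time this user has already seen it.
    adjusted = {p: s * fatigue ** display_counts.get(p, 0)
                for p, s in scored.items()}
    ranked = sorted(adjusted, key=adjusted.get, reverse=True)
    banner, rest = ranked[:n], ranked[n:]
    # Explore: with probability epsilon, swap a slot with a non-selected product.
    for slot in range(len(banner)):
        if rest and rng.random() < epsilon:
            swap = rng.randrange(len(rest))
            banner[slot], rest[swap] = rest[swap], banner[slot]
    # Shuffle display order, kept separate from the scoring itself.
    rng.shuffle(banner)
    return banner
```

Products are still scored independently here, which matches the "keep independent products assumption" bullet; scoring a full banner at once ("Solution 2") would need a different model.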
  • 11. Issue #4: 8 ms response time! Tweaking for performance • CTR / Sales prediction takes ~40 µs per candidate using an in-house library • 2-step prediction: • A fast pass to remove most of the candidates • A slow pass to accurately score the remaining candidates. And the technical fine print: • All real-time code in C# • Async I/O for better efficiency • HAProxy to scale the front-end • Memcached to store all required data in memory
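
The figures on slide 11 explain why a fast pass is needed. If full predictions ran sequentially at ~40 µs each and used the entire 8 ms budget, at most 8000 µs / 40 µs = 200 candidates could be scored, and fewer once I/O and the other stages take their share. The sketch below does that arithmetic and shows a generic two-pass filter. The sequential, single-threaded assumption, the `share` parameter and the function names are illustrative, not taken from the slides.

```python
import heapq

BUDGET_MS = 8          # response-time target from the slide
PER_CANDIDATE_US = 40  # prediction cost per candidate from the slide


def max_full_scores(budget_ms=BUDGET_MS, per_candidate_us=PER_CANDIDATE_US,
                    share=1.0):
    """Upper bound on sequential full predictions within `share` of the budget."""
    return int(budget_ms * 1000 * share // per_candidate_us)


def two_step(candidates, fast_score, slow_score, keep):
    """Fast pass keeps `keep` candidates; slow pass ranks only the survivors."""
    survivors = heapq.nlargest(keep, candidates, key=fast_score)
    return sorted(survivors, key=slow_score, reverse=True)
```

For example, `max_full_scores(share=0.25)` gives the bound when only a quarter of the budget is left for the slow pass, which is one way to choose `keep`.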
  • 12. Upcoming challenges • Long(er)-term user profiles • More and better product information (images, semantic, NLP) • Vertical-optimized engine • Classifieds (catalog-free recommendation?) • Travel • Instant update of similarities • (because batch computation is soooo last year)
  • 13. Fancy a try? On your own: • Our 1st public dataset is online: http://bit.ly/1vgw2XC • 4 GB of display and click data, Kaggle challenge in 2014 • NEW: 1 TB dataset released a few weeks ago • Hosted on Microsoft Azure, just waiting for you. With us! http://www.criteo.com/careers/
  • 14. Questions?
  • 15. Thank you! o.koch@criteo.com r.lerallut@criteo.com