Software Engineer @Criteo AI Lab
Gilles LEGOUX at Grenoble INP - Ensimag
2020-11-10 <g.legoux@criteo.com>
2 •
What is Criteo?
The leading advertising platform for the open internet
Open Internet AI* Engine E-commerce
Dataset
Criteo was founded in 2005
+2700 employees with +650 in R&D
See more details
AMERICAS EMEA
APAC
Publisher
Access
Advertiser
Platform
*: Artificial Intelligence
Source Criteo in 2019
30 locations in the world
with Paris (FR), Grenoble (FR), Ann-Arbor (USA)
as R&D offices
See more details
3 •
" I am a Software Engineer @Criteo R&D*
in the Criteo AI Lab, more precisely in the UC** Team "
https://ailab.criteo.com
*: Research & Development
**: Universal Catalog
4 •
2016
2014
2017
2020
Information Systems
Engineering specialization
Startup experience
as Web Software Engineer
Post master's degree in data
science and big data
Software Reliability Engineer
Software Engineer
at Criteo AI Lab
5 •
Ad Online World
go, Go, GO!
6 •
Criteo demo
Ad choices
" Here an online ad "
Publisher website
Advertiser website
7 •
Who are the top players in the online ad world?
go, Go, GO!
Source SimilarTech* online data
*: Bing is Microsoft, DoubleClick is Google, Taboola bought Outbrain; Amazon and other new players should be present
for 10K sites (data viewed in 2020-10, but probably dating from ~2018)
Ads market share
8 •
How do online ads work?
source ad-exchange.fr
DSP: Demand-Side Platforms
SSP: Supply-Side Platforms
Cash flow
go, Go, GO!
CTR* 1%
x2 compared with the competitors' average
*: Click Through Rate
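As a minimal illustration of the metric, CTR is the number of clicks divided by the number of displays. The function name and the sample counts below are hypothetical, chosen only to reproduce the 1% figure.

```python
def click_through_rate(clicks: int, displays: int) -> float:
    """Return clicks / displays, or 0.0 when nothing was displayed."""
    if displays <= 0:
        return 0.0
    return clicks / displays


# Hypothetical counts: 50 clicks over 5,000 displays.
ctr = click_through_rate(clicks=50, displays=5_000)
print(f"CTR = {ctr:.1%}")
```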
Cash flow
9 •
What products does Criteo provide?
https://marketing.criteo.com
Advertiser
Platform
Criteo is a full DSP, our main business
partners are advertisers:
• Import products
• Manage campaigns
with budget & audience rules
• Analyze results
• Create ads
https://pmc.criteo.com
Publisher
Access
Our business partners are
publishers, but Criteo can use other
SSPs to provide ads.
10 •
Criteo datasets
User Events Advertiser
Configuration
E-commerce
Dataset
AI* Engine
Universal
Catalog
Advertiser catalogs
11 •
How do users interact with online ads?
RTB: Real Time Bidding
CAS: Criteo Ads Server
CAT: Criteo Ads Targeting
CRITEO
INTERNET
Billing
Views
Displays Clicks
Events
View, List, Basket, Sale
Auctions &
Biddings
Loading script
Browsing
Open Internet
Won auction
static.criteo.net
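The auction step of this flow can be sketched as follows. This is only an illustration: the slide does not state the auction rules, so the sketch assumes the simplest one (the highest bid wins), and `Bid`, `run_auction` and the bidder names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bid:
    bidder: str       # hypothetical DSP name
    price_eur: float  # amount offered for this single display


def run_auction(bids: List[Bid]) -> Optional[Bid]:
    """Return the highest bid, or None when nobody bids.

    Assumption: highest bid wins. Real ad exchanges may apply other rules.
    """
    if not bids:
        return None
    return max(bids, key=lambda b: b.price_eur)


bids = [Bid("dsp_a", 0.0012), Bid("dsp_b", 0.0015), Bid("dsp_c", 0.0009)]
winner = run_auction(bids)
print(winner.bidder if winner else "no bid")
```

Only the winner of the auction goes on to render and display its ad.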
12 •
How is the universal catalog used?
Publisher Direct Access
& SSPs
RTB
Render
Ads Creator Reco*
Universal Catalog
+ 12B products
Audience Budget
*: Recommendation
User Web Client
Arbitrage
CAS CAT
Campaign
Internal Criteo Network
The Internet
Advertiser
13 •
Criteo AI* Lab
go, Go, GO!
*: Artificial Intelligence
14 •
What and who is the Criteo AI Lab?
• R&D department
• Machine Learning (ML)
• Researchers & Software Engineers
Infrastructure
Product Engineering
Site Reliability Engineering
Product
Engineering
Engineering
Program
Management
Research & Development
Product
Engineering
15 •
How is the Criteo AI Lab organized, and why?
• 4 groups of teams
• Provide state-of-the-art ML for Criteo
• Academic contributions & visibility
Criteo AI Lab Structure
Product
Engineering
Research CAML**
ML Platform
Recommendation
**: Criteo Applied Machine Learning
*: Universal Catalog
UC* Team
16 •
A yearly kick-off for the Criteo strategy. We have a 9-month
plan, several Objectives and Key Results (OKRs) per
quarter, two-week Scrum sprints, and daily tasks.
Organization of a team
" Every team owns its daily organization, within a common culture "
Team members
EPM*
Manager
Team lead
*: Engineering Program Manager
Software Engineer
Product
Owner
17 •
Workday
8h-10h Start
• Development: Single/Pair/Mob programming for maintenance,
tech debt, features, hot fixes
• Meeting: Demo, Sharing Knowledge, Brainstorming, Project,
1:1 team lead or manager
• Communication: Email/Slack Questions, News
• Documentation: User/Developer/Design/Code/Organization
• Event: Social, Conference, CAIL/R&D All Hands, CTF*, Hackathon
• Learning: Online courses, blog articles reading, competition
• Break: coffee, lunch
17h-21h End
*: Capture The Flag
18 •
Used Tools
Instant messaging Code versioning
Presentation, Email,
Calendar management
Programming language
Online meetings
platform
Ticketing
management
Documentation
management
Feedback platform
Award platform
Integrated
Development
Environment
19 •
Software Engineer Skills
Feedback process twice a year (mid-year and end of year),
given by your peers, using a matrix of levels (junior, senior,
staff, senior staff, principal, …) based on these 10 skills.
Hard skills
Soft skills
20 •
Interactions
Research
*: Engineering Program Manager
Software Engineer
Data scientist
Software Engineer
Site Reliability Engineer
Product Analyst
Manager
EPM* Product Owner
Users
21 •
Universal Catalog
go, Go, GO!
22 •
Our mission?
Outcome
Universal Catalog
+ 12B enriched products
Advertiser catalogs
+30K catalogs
Merge and unify all advertiser catalogs into a universal catalog.
23 •
How to build this universal catalog?
Product Model Prediction
Enriched
Product
Simple processing
Build the universal catalog for Criteo business
with machine learning and data processing algorithms.
24 •
What are the features of an enriched product?
Outcome
Provided features
vendor
id
title
description
category
brand
price
universal brand
universal category
gender
price in euros
price range
Product Enriched Product
vendor
id
title
description
category
brand
price
Enrichments
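The two records on this slide can be sketched as data structures. The field names follow the slide; the types, the exchange-rate argument and the price-range thresholds are assumptions made for illustration. The sketch also assumes that "price in euros" and "price range" come from simple processing, and leaves the other enrichments (universal brand, universal category, gender) as empty placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    vendor: str
    id: str
    title: str
    description: str
    category: str
    brand: str
    price: float  # in the advertiser's own currency (assumption)


@dataclass(frozen=True)
class EnrichedProduct:
    product: Product
    price_in_euros: float
    price_range: str
    # Enrichments not computed in this sketch.
    universal_brand: Optional[str] = None
    universal_category: Optional[str] = None
    gender: Optional[str] = None


def price_range(price_in_euros: float) -> str:
    """Bucket a price; the thresholds are hypothetical."""
    if price_in_euros < 20:
        return "low"
    if price_in_euros < 200:
        return "medium"
    return "high"


def enrich(product: Product, rate_to_eur: float) -> EnrichedProduct:
    """Apply only the simple-processing enrichments."""
    euros = round(product.price * rate_to_eur, 2)
    return EnrichedProduct(product, euros, price_range(euros))


shoe = Product("shop-1", "42", "Running shoe", "Light trail shoe",
               "Shoes", "Acme", 100.0)
print(enrich(shoe, rate_to_eur=0.5))
```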
25 •
What is our data?
Universal Catalog
+ 30K products catalogs
+ 12B products
12 languages
Outcome
Product Universal
Categories
+5K
Product Universal
Brands
+60K
E-commerce
Dataset
26 •
What is the universal category model?
AI Engine
Deep Learning model
title
description
Product
Predicted universal
leaf category
Supervised model for classification with K classes
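To make "classification with K classes" concrete, here is a toy sketch of the last step of such a model: one score per class is turned into probabilities with a softmax, and the most probable leaf category is returned. The category names and scores are invented, and the deep learning model that would produce the scores from the title and description is not shown.

```python
import math
from typing import Dict, Tuple


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict_leaf_category(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the most probable class and its probability."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best, probs[best]


# Hypothetical scores for K = 3 leaf categories.
scores = {"Shoes > Running": 2.0, "Shoes > Hiking": 1.0, "Bags": -1.0}
label, prob = predict_leaf_category(scores)
print(label, round(prob, 3))
```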
27 •
What is the technical environment?
annotate
products
Import catalogs
meta store
ML labs & experiments
models
metadata
deploy model
sample
products
enrich the products
with predictions
or simple processes
feed
data sets
get data sets
Annotation API & UI
Jobs scheduler
Advertiser catalogs
data sets
AI Engine
data warehouse
28 •
What are our components?
• Scheduler with a Spark job
• Web Application
• Machine learning lab
29 •
Build
Tools Server
CI/CD*
Server
Review
Server
Gerrit server
Artifact
stores
Deployment
Server
Container
platform
*: Continuous Integration/Continuous Delivery
Workstation
What is the development cycle and the pipeline for “go to (pre-)production”?
Preprod or prod?
Datacenter(s)?
.pex
.jar
30 •
What is the production environment?
Container
platform
meta store
Container
platform
models
data warehouse
metadata
universal catalog
databases
Spark job
enricher
Jobs
scheduler
31 •
What is the technical stack?
Jobs
Web applications
ML labs & experiments
Analytics Monitoring
Container platforms Storages
Thank you!
go, Go, GO!
33 •
" We are recruiting! "
Already +20 graduates here
" Join us ✌️ "
Criteo Tech blog
Criteo Open Positions
criteo.com
Q & A
go, Go, GO!
g.legoux@criteo.com
@gilleslegoux
35 •
Criteoers* regularly create and contribute to
open source projects, but we keep some projects
internal to stay ahead of our competitors!
*: name for the employees of Criteo
Criteo GitHub
Open source projects
See more details
Criteo Gitlab
Experiment internal projects
Criteo Gerrit
Production internal projects
What about open source?
" We love Open Source projects "
36 •
The situation varies by location, but remote work is "strongly
advised" until June 2021 for Paris and Grenoble. Covid-19 has
had only a small impact on our business.
What is happening with Covid-19?
" Everyone is safe, business is good "
Covid-19 vs Criteo
See more details
37 •
Each team has a part of this common tech stack,
and can use any tech for experiments.
What about your technical stack?
" It depends on your team and mission, but
we have a common tech stack! "
Criteo Tech Stack
See more details
38 •
We have 1 kickoff, 1 hackathon (3 days) and 2 conferences per
year, onboarding with a datacenter visit, paid external or
internal training, tool licenses, level matrices (SRE,
SDE, ML ENG, ...), 3 voyager programs, and peer feedback every 6
months with a promotion process … See working in R&D to join us
What about career and working life at Criteo?
" Become a complete happy engineer! "
Criteo Experience life
See more details
39 •
What are the voyager programs?
Annexes
go, Go, GO!
41 •
" We are sensitive to these questions "
Criteo is also a societal project, not only a company
for the open internet! See our values and cares.
Save the environment
See more details
Respect private data
See more details
42 •
" Criteo in digits? "
See more details
The development team of the future at Criteo
Here are a few figures, because we like data, yes indeed we do:
• 15 datacenters (9 with computing capacity + 6 dedicated to network connectivity)
across US, EU, APAC
• More than 35K servers, running a mix of Linux and Windows
• One of the largest Hadoop clusters in Europe with close to 171 PB of storage and 42,000 cores
• 250B HTTP requests and close to 4B unique banners displayed per day
• 130Gbps of bandwidth, half of it through peering exchanges
• Respond to bids in 80ms or less, 24/7
• Close to 4M HTTP requests per second handled during peak times
• Less than 10ms on average to select optimal campaign
• 10ms to find best product in catalogue of hundreds of millions of products
• Tens of TB of new data stored daily
• Largest public Machine Learning Dataset in the world with over 4 billion lines and over 1TB in size
• Technologies: Hadoop, Couchbase, Redis, Mesos, Kafka, Storm, Cassandra, Spark, Vertica, Druid, …
Source Criteo in 2019
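A quick back-of-the-envelope check relates two of these figures: 250B requests per day is an average of roughly 2.9M requests per second, which is consistent with the quoted peak of close to 4M. The variable names are ours; only the two input figures come from the list.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = 250e9  # "250B HTTP requests ... per day"
peak_per_second = 4e6     # "close to 4M HTTP requests per second" at peak

average_per_second = requests_per_day / SECONDS_PER_DAY
peak_to_average = peak_per_second / average_per_second

print(f"average: {average_per_second / 1e6:.2f}M requests/s")
print(f"peak / average: {peak_to_average:.2f}")
```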
43 •
" What are the Criteo datacenters? "
Source Criteo in 2020
44
" How is a data center installed at Criteo? "
You can visit it!
go, Go, GO!

Tech Job Conference: Software Engineer @Criteo

  • 1. Software Engineer @Criteo AI Lab Gilles LEGOUX at Grenoble INP - Ensimag 2020-11-10 <g.legoux@criteo.com>
  • 2. What is Criteo? The leading advertising platform for the open internet. Criteo was founded in 2005; 2,700+ employees, including 650+ in R&D. 30 locations worldwide (Americas, EMEA, APAC), with R&D offices in Paris (FR), Grenoble (FR), and Ann Arbor (USA). Diagram labels: Open Internet, AI (Artificial Intelligence) Engine, E-commerce Dataset, Publisher Access, Advertiser Platform. Source: Criteo, 2019.
  • 3. "I am a Software Engineer in Criteo R&D (Research & Development), in the Criteo AI Lab, more precisely in the UC (Universal Catalog) team." https://ailab.criteo.com
  • 4. Timeline (years shown: 2014, 2016, 2017, 2020): Information Systems Engineering specialization; startup experience as Web Software Engineer; post-master's degree in data science and big data; Software Reliability Engineer; Software Engineer at Criteo AI Lab.
  • 5. The Online Ad World. go, Go, GO!
  • 6. Criteo demo. "Here is an online ad." Labels: Ad choices, Publisher website, Advertiser website.
  • 7. Who are the top companies in the online ad world? Ads market share for 10K sites. Source: SimilarTech online data (viewed in 2020-10, but probably dating from ~2018). Notes: Bing is Microsoft, DoubleClick is Google, Taboola bought Outbrain, and newer actors such as Amazon should be present.
  • 8. How do online ads work? DSP: Demand-Side Platform. SSP: Supply-Side Platform. Diagram label: Cash flow. CTR (Click-Through Rate): 1%, twice the competitors' average. Source: ad-exchange.fr.
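The click-through rate quoted on slide 8 is conventionally the number of clicks divided by the number of displays. A minimal sketch follows; the function name and the counts are hypothetical, chosen only to reproduce the 1% figure, and are not Criteo data.

```python
def click_through_rate(clicks: int, displays: int) -> float:
    """Return clicks divided by displays, or 0.0 when nothing was displayed."""
    if displays <= 0:
        return 0.0
    return clicks / displays


# Hypothetical counts, picked so that the result matches the slide's 1%.
ctr = click_through_rate(clicks=50, displays=5_000)
print(f"CTR = {ctr:.1%}")
```

Under the slide's "twice the competitors' average" claim, the same number of displays would yield about half as many clicks for an average competitor.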
  • 9. What products does Criteo provide? Advertiser Platform (https://marketing.criteo.com): Criteo is a full DSP, and our main business partners are advertisers, who can import products, manage campaigns with budget and audience rules, analyze results, and create ads. Publisher Access (https://pmc.criteo.com): our business partners are publishers, but Criteo can also use other SSPs to provide ads.
  • 10. Criteo datasets. Diagram labels: User Events, Advertiser Configuration, E-commerce Dataset, AI (Artificial Intelligence) Engine, Universal Catalog, Advertiser catalogs.
  • 11. How do users interact with online ads? RTB: Real-Time Bidding. CAS: Criteo Ads Server. CAT: Criteo Ads Targeting. Diagram labels: Criteo, Internet, Billing, Views, Displays, Clicks, Events (View, List, Basket, Sale), Auctions & Biddings, Loading script, Browsing, Open Internet, Won auction, static.criteo.net.
  • 12. How is the universal catalog used? Diagram labels: Publisher, Direct Access & SSPs, RTB, Render, Ads Creator, Reco (Recommendation), Universal Catalog (12B+ products), Audience, Budget, User, Web Client, Arbitrage, CAS, CAT, Campaign, Internal Criteo Network, The Internet, Advertiser.
  • 13. Criteo AI (Artificial Intelligence) Lab. go, Go, GO!
  • 14. What and who is the Criteo AI Lab? An R&D department; Machine Learning (ML); Researchers & Software Engineers. Org chart labels: Infrastructure, Product Engineering (shown three times), Site Reliability Engineering, Engineering Program Management, Research & Development.
  • 15. How is the Criteo AI Lab organized, and why? Four groups of teams; provide state-of-the-art ML for Criteo; academic contributions and visibility. Criteo AI Lab structure (labels): Product Engineering, Research, CAML (Criteo Applied Machine Learning), ML Platform, Recommendation, UC (Universal Catalog) team.
  • 16. Organization of a team. A yearly kick-off sets the Criteo strategy. We have a 9-month plan, several Objectives and Key Results (OKRs) per quarter, biweekly Scrum sprints, and daily tasks. "Every team owns its daily organization, within a common culture." Team members: Software Engineer, Team lead, Manager, Product Owner, EPM (Engineering Program Manager).
  • 17. Workday (start: 8h-10h; end: 17h-21h):
      • Development: single/pair/mob programming for maintenance, tech debt, features, hot fixes
      • Meeting: demo, knowledge sharing, brainstorming, project, 1:1 with team lead or manager
      • Communication: email/Slack questions, news
      • Documentation: user/developer/design/code/organization
      • Event: social, conference, CAIL/R&D All Hands, CTF (Capture The Flag), hackathon
      • Learning: online courses, reading blog articles, competitions
      • Break: coffee, lunch
  • 18. Tools used, by category: instant messaging; code versioning; presentation, email, and calendar management; programming language; online meetings platform; ticketing management; documentation management; feedback platform; award platform; Integrated Development Environment.
  • 19. Software Engineer skills: hard skills and soft skills. Feedback process: twice a year (mid-year and end of year), by your peers, against a matrix of levels (junior, senior, staff, senior staff, principal, …) based on these 10 skills.
  • 20. Interactions. Roles shown: Research, Software Engineer, Data Scientist, Software Engineer, Site Reliability Engineer, Product Analyst, Manager, EPM (Engineering Program Manager), Product Owner, Users.
  • 22. Our mission? Merge and unify all advertiser catalogs into a universal catalog. Advertiser catalogs: 30K+ catalogs. Outcome: Universal Catalog with 12B+ enriched products.
  • 23. How do we build this universal catalog? We build the universal catalog for Criteo's business with machine learning and data processing algorithms. Diagram labels: Product, Model, Prediction, Simple processing, Enriched Product.
  • 24. What are the features of an enriched product? Provided features (Product): vendor id, title, description, category, brand, price. Enrichments (outcome: Enriched Product): universal brand, universal category, gender, price in euros, price range.
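The two feature groups on slide 24 map naturally onto a pair of record types. The sketch below is only an illustration: the field names follow the slide, but the class names, the `enrich` helper, the exchange-rate parameter, and the price-range thresholds are all hypothetical and do not describe Criteo's actual schema or pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    """Features provided by the advertiser, as listed on the slide."""
    vendor_id: str
    title: str
    description: str
    category: str
    brand: str
    price: float  # expressed in the advertiser's own currency (assumption)


@dataclass(frozen=True)
class EnrichedProduct:
    """Provided features plus the enrichments listed on the slide."""
    product: Product
    universal_brand: Optional[str]
    universal_category: Optional[str]
    gender: Optional[str]
    price_in_euros: float
    price_range: str


def bucket_price(price_in_euros: float) -> str:
    # Hypothetical thresholds, used only to illustrate a "simple processing" step.
    if price_in_euros < 20:
        return "low"
    if price_in_euros < 100:
        return "medium"
    return "high"


def enrich(
    product: Product,
    rate_to_eur: float,
    universal_brand: Optional[str] = None,
    universal_category: Optional[str] = None,
    gender: Optional[str] = None,
) -> EnrichedProduct:
    """Combine model predictions (passed in) with simple price processing."""
    euros = round(product.price * rate_to_eur, 2)
    return EnrichedProduct(
        product=product,
        universal_brand=universal_brand,
        universal_category=universal_category,
        gender=gender,
        price_in_euros=euros,
        price_range=bucket_price(euros),
    )
```

Keeping the original `Product` nested inside `EnrichedProduct` leaves the advertiser's data untouched, so an enrichment can be recomputed without losing the provided features.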
  • 25. What is our data? E-commerce Dataset: 30K+ product catalogs, 12B+ products, 12 languages. Outcome: Universal Catalog, with 5K+ product universal categories and 60K+ product universal brands.
  • 26. What is the universal category model? A supervised classification model with K classes: the AI Engine's deep learning model takes a product's title and description and predicts its universal leaf category.
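Slide 26 describes classification over K classes from a product's title and description. The toy sketch below shows the general shape of such a classifier: one score per class, a softmax to turn scores into probabilities, and an argmax to pick the predicted leaf category. It deliberately swaps the deep learning model for a hand-written keyword-weight table; the categories, keywords, and weights are invented for illustration and are not Criteo's model or taxonomy.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn K raw scores into K probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical leaf categories (K = 3) and per-token class weights.
CATEGORIES = ["Shoes", "Laptops", "Sofas"]
WEIGHTS: Dict[str, List[float]] = {
    "sneaker": [2.0, 0.0, 0.0],
    "running": [1.0, 0.0, 0.0],
    "laptop": [0.0, 2.5, 0.0],
    "ssd": [0.0, 1.0, 0.0],
    "sofa": [0.0, 0.0, 2.0],
}


def predict_category(title: str, description: str) -> Tuple[str, float]:
    """Return the most probable category and its probability."""
    tokens = f"{title} {description}".lower().split()
    scores = [0.0] * len(CATEGORIES)
    for token in tokens:
        for k, weight in enumerate(WEIGHTS.get(token, [])):
            scores[k] += weight
    probabilities = softmax(scores)
    best = max(range(len(CATEGORIES)), key=lambda k: probabilities[k])
    return CATEGORIES[best], probabilities[best]


label, confidence = predict_category("Running sneaker", "light sneaker for road")
print(label, round(confidence, 3))
```

In a supervised setting the weights would be learned from annotated products rather than written by hand; only the scoring, softmax, and argmax steps carry over.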
  • 27. What is the technical environment? Diagram labels: annotate products; import catalogs; meta store; ML labs & experiments; models; metadata; deploy model; sample products; enrich the products with predictions or simple processes; feed data sets; get data sets; Annotation API & UI; Jobs scheduler; Advertiser catalogs; data sets; AI Engine; data warehouse.
  • 28. What are our components? A scheduler with a Spark job; a web application; a machine learning lab.
  • 29. What is the development cycle and the pipeline for "go to (pre-)production"? Diagram labels: Workstation, Build Tools Server, CI/CD (Continuous Integration/Continuous Delivery) Server, Review Server, Gerrit server, Artifact stores, Deployment Server, Container platform, .pex, .jar. Preprod or prod? Datacenter(s)?
  • 30. What is the production environment? Diagram labels: Container platform (shown twice), meta store, models, data warehouse, metadata, universal catalog databases, Spark job enricher, Jobs scheduler.
  • 31. What is the technical stack? Categories: Jobs, Web applications, ML labs & experiments, Analytics, Monitoring, Container platforms, Storages.
  • 33. "We are recruiting!" Already 20+ graduates here. "Join us ✌️" Links: Criteo Tech blog, Criteo Open Positions, criteo.com.
  • 34. Q & A go, Go, GO! g.legoux@criteo.com @gilleslegoux
  • 35. What about open source? "We love open source projects." Criteoers (the name for Criteo employees) regularly contribute to and create open source projects, but we keep some projects internal to stay ahead of our competitors. Criteo GitHub: open source projects. Criteo GitLab: internal experimental projects. Criteo Gerrit: internal production projects.
  • 36. What is happening with Covid-19? "Everyone is safe, business is good." The situation varies by location, but remote work is "strongly advised" until June 2021 for Paris and Grenoble. Covid-19 has had a small impact on our business.
  • 37. What about your technical stack? "It depends on your team and mission, but we have a common tech stack!" Each team uses part of this common tech stack and can use any technology for experiments.
  • 38. What about professional career and life at Criteo? "Become a complete, happy engineer!" We have 1 kickoff, 1 hackathon (3 days), and 2 conferences per year; onboarding with a datacenter visit; paid external training or internal training; tool licenses; level matrices (SRE, SDE, ML ENG, ...); 3 voyager programs; and peer feedback every 6 months with a promotion process … See "Working in R&D" to join us.
  • 39. What are the voyager programs?
  • 41. "We are sensitive to these questions." Criteo is also a social project, not only a company for the open internet! See our values and cares. Topics: protecting the environment; respecting private data.
  • 42. "Criteo in numbers?" From "The development team of the future at Criteo": here are a few figures, because we like data, yes indeed we do (source: Criteo, 2019):
      • 15 datacenters (9 with computing capacity + 6 dedicated to network connectivity) across the US, EU, and APAC
      • More than 35K servers, running a mix of Linux and Windows
      • One of the largest Hadoop clusters in Europe, with close to 171 PB of storage and 42,000 cores
      • 250B HTTP requests and close to 4B unique banners displayed per day
      • 130 Gbps of bandwidth, half of it through peering exchanges
      • Responses to bids in 80 ms or less, 24/7
      • Close to 4M HTTP requests per second handled during peak times
      • Less than 10 ms on average to select the optimal campaign
      • 10 ms to find the best product in a catalogue of hundreds of millions of products
      • Tens of TB of new data stored daily
      • Largest public machine learning dataset in the world, with over 4 billion lines and over 1 TB in size
      • Technologies: Hadoop, Couchbase, Redis, Mesos, Kafka, Storm, Cassandra, Spark, Vertica, Druid, …
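Two of the figures on slide 42 can be cross-checked with simple arithmetic: 250B HTTP requests per day averages out to roughly 2.9 million per second, so the quoted peak of close to 4M per second is about 1.4 times the daily average. The sketch below assumes both figures count the same kind of request, which the slide does not state.

```python
# Figures quoted on the slide (source: Criteo, 2019).
REQUESTS_PER_DAY = 250e9
PEAK_REQUESTS_PER_SECOND = 4e6

SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: the daily total and the peak rate measure the same traffic.
average_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY
peak_to_average = PEAK_REQUESTS_PER_SECOND / average_per_second

print(f"average: {average_per_second / 1e6:.2f}M requests/s")
print(f"peak / average: {peak_to_average:.2f}")
```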
  • 43. "What are the Criteo datacenters?" Source: Criteo, 2020.
  • 44. "How is a data center installed at Criteo?" You can visit it! go, Go, GO!