Presentation given to students to show how SREs work at Criteo. We discuss DevOps, Agile, the purpose of SRE, and the specialities that exist within it.
Video available here: https://www.youtube.com/watch?v=JmurOpNMVyo&feature=youtu.be
1. Frédéric Boismenu - Titouan Chary
Core Services @Criteo
From Intern to SRE
Infrastructure at scale
2. About us
Titouan Chary
Currently: SRE @ Criteo for 1 year
Previously: Student at Université Lyon 1
Frédéric Boismenu
Currently: SRE @ Criteo for 2 years
Previously: Dev, Ops, stuff in a hedge fund (even Delphi!)
3. Agenda
(1) Introduction
(2) From Workstation to CD
(3) At Scale
(4) Virtualize your Infra
5. Who is Criteo
A global technology company
French company created in 2005
3,000 employees
30 offices worldwide
R&D in Paris, Ann Arbor and Palo Alto
Managing its own infrastructure
140 Site Reliability Engineers
8 datacenters on 3 continents
35,000 servers
Providing targeted online advertising
1.2B exposed consumers per month
3B ads displayed daily
6. Let’s throw some buzzwords
● What is "Production"?
- On-call, incident management
● Best practices for development, operations, and more!
● What are Agile, DevOps, and SRE?
● Software lifecycle in a “real-world” company (CI/CD)
● Grow your application to web scale
- Discovery, load balancing, elasticity, health checks
- Containers
- Docker, Mesos, Kubernetes, etc.
- Observability
● Big Data (possible extension)
● Artificial Intelligence (possible extension)
17. Buy a Bigger Server
EATING SOUP MAKES YOU GROW
Advantages:
+ Straightforward improvements
+ Monoliths are simple
Disadvantages:
- $$$$
- Monoliths are bad
22. Application is still too slow
● SOA
● Microservices
● Front
● Middleware
● Back
23. Scale Horizontally
Increase the number of instances, for (at least) 2 possible reasons:
- Handle more load (load balancing)
- Improve availability (HA)
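The two reasons above can be illustrated together with a minimal sketch, which is not taken from the talk: a round-robin balancer that spreads requests over several instances (load) and skips instances marked unhealthy (availability). The class name `RoundRobinBalancer` and the instance names `app-1` to `app-3` are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Spread requests over instances, skipping those marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._ring = cycle(self.instances)

    def mark_down(self, instance):
        # Called when a health check fails (the check itself is out of scope here).
        self.healthy.discard(instance)

    def mark_up(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def pick(self):
        # Look at each instance at most once per call.
        for _ in range(len(self.instances)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instance available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [balancer.pick() for _ in range(3)]    # every instance gets traffic
balancer.mark_down("app-2")                          # simulate a failed instance
after_failure = [balancer.pick() for _ in range(4)]  # traffic avoids app-2
```

With a single instance, one failure means an outage; with three, the remaining two keep serving while the failed one is repaired.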
44. Ongoing Projects @ Criteo / SRE
● Container Orchestration as a Service
● GPU experimentation platform for AI researchers
● Migration of 20k+ Windows servers to .NET Core apps on Mesos
● VIP-as-a-service layer 4 (example: SSH)
● SDN (Software Defined Networking)
● Automated Canary and End-to-End Testing
● SRE Infrastructure Agent Helper
● Automatic outlier detection in monitoring
● Auto-Scaling / Auto-Shaping
● Migration to Hadoop 3
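As an illustration of what automatic outlier detection in monitoring can look like, here is a minimal sketch that flags hosts whose metric is far from the fleet median, using the median absolute deviation (MAD). The slides do not say which method Criteo uses; the function name `find_outliers`, the host names, the latency values and the 3.5 threshold are all assumptions made for the example.

```python
from statistics import median


def find_outliers(latency_ms, threshold=3.5):
    """Return the hosts whose metric deviates strongly from the fleet median."""
    values = list(latency_ms.values())
    med = median(values)
    # Median absolute deviation: a spread estimate that one bad host cannot skew.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # all hosts (nearly) identical: nothing to flag
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return sorted(
        host
        for host, v in latency_ms.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


fleet = {"web-1": 12.0, "web-2": 11.0, "web-3": 13.0, "web-4": 12.5, "web-5": 95.0}
outliers = find_outliers(fleet)  # only web-5 is far from the rest of the fleet
```

Comparing each host to its peers, rather than to a fixed threshold, is one way to avoid maintaining per-service alert limits by hand.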
45.
● Build a complete platform
● Hyperparameter tuning
● A/B testing of different strategies
● TensorFlow on GPUs
● Research :)
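To make the hyperparameter tuning item concrete, here is a minimal sketch of an exhaustive grid search, one simple tuning strategy among many; the slides do not say which strategy the platform uses. The function `grid_search`, the `toy_loss` objective (a stand-in for training and evaluating a real model) and the search space are hypothetical.

```python
from itertools import product


def grid_search(objective, space):
    """Evaluate every combination in `space` and return the one with the lowest loss."""
    names = list(space)
    best_params, best_score = None, float("inf")
    for combo in product(*(space[name] for name in names)):
        params = dict(zip(names, combo))
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_loss(params):
    # Stand-in for "train a model, return its validation loss".
    return (params["learning_rate"] - 0.01) ** 2 + (params["batch_size"] - 64) ** 2 * 1e-6


space = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
best_params, best_score = grid_search(toy_loss, space)
```

Each combination is independent of the others, which is why this kind of workload spreads well over a shared pool of GPUs.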