SlideShare une entreprise Scribd logo
1  sur  11
Télécharger pour lire hors ligne
Load Balancing
in the SRE way
Ke Zhu
@shawnzhu
Site (Un)Reliability Engineer at IBM
For What?
• GitHub Enterprise Cluster
• On Internet
• Zero downtime
• 100M+ HTTP requests per week
• 30k+ attacks per week
• 26k+ git clone per hour
(https://help.github.com/enterprise/2.8/admin/guides/installation/maintenance-mode/)
Design Goals
• Scripting Platform
• traffic conducting via code
• do social coding
• Observable
• Blue/Green deployment
• High performance
• Security from day one
Software Stack
&& 🎩 magic kernel parameters in /etc/sysctl.conf 🐰
Scripting Platform
• OpenResty (Nginx + Lua) - https://openresty.org/en/
(Example: customized request rate limiting)
Blue/Green Deployment
• Can not terminate any TCP connection
• Two stacks:
• load-balancer-green
• load-balancer-blue (for experiment)
• Cloud DNS
• Switching A record + short TTL (~5m)
• Simple/Weighted Routing policy
• Run experiment by using docker image tags
• Real time metrics collection by librato.com
• Test docker images
• RSpec + Serverspec
• Travis CI
• Test docker host
• RSpec + Serverspec
• Test Kitchen
Test Driven for Container
❤vault
• Secret mgmt via API - https://www.vaultproject.io/
• retrieve all secrets for provisioning load balancer
via a single token with TTL 5min
Blocking mode in Production
• Signal Sciences - https://signalsciences.net/
Summary
• Conducting HTTPS traffic via Lua code
• Blue-green deployment of Load balancer via DNS
• Testing docker with RSpec + Serverspec
• SignalSciences

Contenu connexe

Tendances

Tendances (20)

Scaling application servers for efficiency
Scaling application servers for efficiencyScaling application servers for efficiency
Scaling application servers for efficiency
 
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
 
Hyperloglog Lightning Talk
Hyperloglog Lightning TalkHyperloglog Lightning Talk
Hyperloglog Lightning Talk
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
Gnocchi v4 - past and present
Gnocchi v4 - past and presentGnocchi v4 - past and present
Gnocchi v4 - past and present
 
Introduction to InfluxDB and TICK Stack
Introduction to InfluxDB and TICK StackIntroduction to InfluxDB and TICK Stack
Introduction to InfluxDB and TICK Stack
 
Federated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation TherapyFederated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation Therapy
 
Breaking Prometheus (Promcon Berlin '16)
Breaking Prometheus (Promcon Berlin '16)Breaking Prometheus (Promcon Berlin '16)
Breaking Prometheus (Promcon Berlin '16)
 
OpenContrail Implementations
OpenContrail ImplementationsOpenContrail Implementations
OpenContrail Implementations
 
Time Series Database and Tick Stack
Time Series Database and Tick StackTime Series Database and Tick Stack
Time Series Database and Tick Stack
 
MongoDB-Migration-Strategies
MongoDB-Migration-StrategiesMongoDB-Migration-Strategies
MongoDB-Migration-Strategies
 
Seastar @ SF/BA C++UG
Seastar @ SF/BA C++UGSeastar @ SF/BA C++UG
Seastar @ SF/BA C++UG
 
Lightning Talk: MongoDB Migration Strategies
Lightning Talk: MongoDB Migration StrategiesLightning Talk: MongoDB Migration Strategies
Lightning Talk: MongoDB Migration Strategies
 
Highly Available Docker Networking With BGP
Highly Available Docker Networking With BGPHighly Available Docker Networking With BGP
Highly Available Docker Networking With BGP
 
Using eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthUsing eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster Health
 
CloudAnts - Kubernetes
CloudAnts - KubernetesCloudAnts - Kubernetes
CloudAnts - Kubernetes
 
Developing Ansible Dynamic Inventory Script - Nov 2017
Developing Ansible Dynamic Inventory Script - Nov 2017Developing Ansible Dynamic Inventory Script - Nov 2017
Developing Ansible Dynamic Inventory Script - Nov 2017
 
InfluxDB & Kubernetes
InfluxDB & KubernetesInfluxDB & Kubernetes
InfluxDB & Kubernetes
 
Distributed scheduler hell (MicroXChg 2017 Berlin)
Distributed scheduler hell (MicroXChg 2017 Berlin)Distributed scheduler hell (MicroXChg 2017 Berlin)
Distributed scheduler hell (MicroXChg 2017 Berlin)
 
Object Compaction in Cloud for High Yield
Object Compaction in Cloud for High YieldObject Compaction in Cloud for High Yield
Object Compaction in Cloud for High Yield
 

En vedette

CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4
Michael Kehoe
 
Script up your application with Lua! -- RyanE -- OpenWest 2014
Script up your application with Lua! -- RyanE -- OpenWest 2014Script up your application with Lua! -- RyanE -- OpenWest 2014
Script up your application with Lua! -- RyanE -- OpenWest 2014
ryanerickson
 

En vedette (20)

Perl在nginx里的应用
Perl在nginx里的应用Perl在nginx里的应用
Perl在nginx里的应用
 
The ROLE SRE Approach - Getting more concrete
The ROLE SRE Approach - Getting more concreteThe ROLE SRE Approach - Getting more concrete
The ROLE SRE Approach - Getting more concrete
 
The Social Requirements Engineering (SRE) Approach to Developing a Large-scal...
The Social Requirements Engineering (SRE) Approach to Developing a Large-scal...The Social Requirements Engineering (SRE) Approach to Developing a Large-scal...
The Social Requirements Engineering (SRE) Approach to Developing a Large-scal...
 
Sre con16 tier 1 metamorphosis
Sre con16   tier 1 metamorphosisSre con16   tier 1 metamorphosis
Sre con16 tier 1 metamorphosis
 
Couchbase Meetup Jan 2016
Couchbase Meetup Jan 2016Couchbase Meetup Jan 2016
Couchbase Meetup Jan 2016
 
SRECon USA 2016: Growing your Entry Level Talent
SRECon USA 2016: Growing your Entry Level TalentSRECon USA 2016: Growing your Entry Level Talent
SRECon USA 2016: Growing your Entry Level Talent
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4
 
SRE in Startup
SRE in StartupSRE in Startup
SRE in Startup
 
Couchbase Connect 2016: Monitoring Production Deployments The Tools – LinkedIn
Couchbase Connect 2016: Monitoring Production Deployments The Tools – LinkedInCouchbase Connect 2016: Monitoring Production Deployments The Tools – LinkedIn
Couchbase Connect 2016: Monitoring Production Deployments The Tools – LinkedIn
 
SouthBay SRE Meetup Jan 2016
SouthBay SRE Meetup Jan 2016SouthBay SRE Meetup Jan 2016
SouthBay SRE Meetup Jan 2016
 
Couchbase Connect 2016
Couchbase Connect 2016Couchbase Connect 2016
Couchbase Connect 2016
 
APRICOT 2017: Trafficshifting: Avoiding Disasters & Improving Performance at ...
APRICOT 2017: Trafficshifting: Avoiding Disasters & Improving Performance at ...APRICOT 2017: Trafficshifting: Avoiding Disasters & Improving Performance at ...
APRICOT 2017: Trafficshifting: Avoiding Disasters & Improving Performance at ...
 
Software reliability tools and common software errors
Software reliability tools and common software errorsSoftware reliability tools and common software errors
Software reliability tools and common software errors
 
Scio Saa S Readiness Evaluation Sre V1.0
Scio Saa S Readiness Evaluation Sre V1.0Scio Saa S Readiness Evaluation Sre V1.0
Scio Saa S Readiness Evaluation Sre V1.0
 
Stephen McHenry - Chanecellor of Site Reliability Engineering, Google
Stephen McHenry - Chanecellor of Site Reliability Engineering, GoogleStephen McHenry - Chanecellor of Site Reliability Engineering, Google
Stephen McHenry - Chanecellor of Site Reliability Engineering, Google
 
Works of site reliability engineer
Works of site reliability engineerWorks of site reliability engineer
Works of site reliability engineer
 
Using SaltStack to Auto Triage and Remediate Production Systems
Using SaltStack to Auto Triage and Remediate Production SystemsUsing SaltStack to Auto Triage and Remediate Production Systems
Using SaltStack to Auto Triage and Remediate Production Systems
 
Script up your application with Lua! -- RyanE -- OpenWest 2014
Script up your application with Lua! -- RyanE -- OpenWest 2014Script up your application with Lua! -- RyanE -- OpenWest 2014
Script up your application with Lua! -- RyanE -- OpenWest 2014
 
Nginx+lua在阿里巴巴的使用
Nginx+lua在阿里巴巴的使用Nginx+lua在阿里巴巴的使用
Nginx+lua在阿里巴巴的使用
 
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
Cloud Native, Microservices and SRE/Chaos Engineering: The new Rules of The G...
 

Similaire à Load balancing in the SRE way

Similaire à Load balancing in the SRE way (20)

Rex gke-clustree
Rex gke-clustreeRex gke-clustree
Rex gke-clustree
 
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
 
DockerCon Europe 2018 Monitoring & Logging Workshop
DockerCon Europe 2018 Monitoring & Logging WorkshopDockerCon Europe 2018 Monitoring & Logging Workshop
DockerCon Europe 2018 Monitoring & Logging Workshop
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014
 
6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production 6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production
 
Docker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsDocker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud Applications
 
Implementing an Automated Staging Environment
Implementing an Automated Staging EnvironmentImplementing an Automated Staging Environment
Implementing an Automated Staging Environment
 
Devoxx France 2023 - 1,2,3 Quarkus.pdf
Devoxx France 2023 - 1,2,3 Quarkus.pdfDevoxx France 2023 - 1,2,3 Quarkus.pdf
Devoxx France 2023 - 1,2,3 Quarkus.pdf
 
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptxBSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
 
Common Pitfalls of Functional Programming and How to Avoid Them: A Mobile Gam...
Common Pitfalls of Functional Programming and How to Avoid Them: A Mobile Gam...Common Pitfalls of Functional Programming and How to Avoid Them: A Mobile Gam...
Common Pitfalls of Functional Programming and How to Avoid Them: A Mobile Gam...
 
Safe deployments with Blue-Green and Spinnaker
Safe deployments with Blue-Green and SpinnakerSafe deployments with Blue-Green and Spinnaker
Safe deployments with Blue-Green and Spinnaker
 
OSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius Schumacher
OSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius SchumacherOSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius Schumacher
OSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius Schumacher
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017
 
HOW TO DRONE.IO IN CI/CD WORLD
HOW TO DRONE.IO IN CI/CD WORLDHOW TO DRONE.IO IN CI/CD WORLD
HOW TO DRONE.IO IN CI/CD WORLD
 
Cavity Data
Cavity DataCavity Data
Cavity Data
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Monitoring your API
Monitoring your APIMonitoring your API
Monitoring your API
 
OSDC 2016 - Continous Integration in Data Centers - Further 3 Years later by ...
OSDC 2016 - Continous Integration in Data Centers - Further 3 Years later by ...OSDC 2016 - Continous Integration in Data Centers - Further 3 Years later by ...
OSDC 2016 - Continous Integration in Data Centers - Further 3 Years later by ...
 
Gradle - From minutes to seconds: minimizing build times
Gradle - From minutes to seconds: minimizing build timesGradle - From minutes to seconds: minimizing build times
Gradle - From minutes to seconds: minimizing build times
 
Docker & ECS: Secure Nearline Execution
Docker & ECS: Secure Nearline ExecutionDocker & ECS: Secure Nearline Execution
Docker & ECS: Secure Nearline Execution
 

Dernier

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 

Dernier (20)

Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 

Load balancing in the SRE way

  • 1. Load Balancing in the SRE way Ke Zhu @shawnzhu Site (Un)Reliability Engineer at IBM
  • 2. For What? • GitHub Enterprise Cluster • On Internet • Zero downtime • 100M+ HTTP requests per week • 30k+ attacks per week • 26k+ git clone per hour (https://help.github.com/enterprise/2.8/admin/guides/installation/maintenance-mode/)
  • 3.
  • 4. Design Goals • Scripting Platform • traffic conducting via code • do social coding • Observable • Blue/Green deployment • High performance • Security from day one
  • 5. Software Stack && 🎩 magic kernel parameters in /etc/sysctl.conf 🐰
  • 6. Scripting Platform • OpenResty (Nginx + Lua) - https://openresty.org/en/ (Example: customized request rate limiting)
  • 7. Blue/Green Deployment • Can not terminate any TCP connection • Two stacks: • load-balancer-green • load-balancer-blue (for experiment) • Cloud DNS • Switching A record + short TTL (~5m) • Simple/Weighted Routing policy • Run experiment by using docker image tags • Real time metrics collection by librato.com
  • 8. • Test docker images • RSpec + Serverspec • Travis CI • Test docker host • RSpec + Serverspec • Test Kitchen Test Driven for Container
  • 9. ❤vault • Secret mgmt via API - https://www.vaultproject.io/ • retrieve all secrets for provisioning load balancer via a single token with TTL 5min
  • 10. Blocking mode in Production • Signal Sciences - https://signalsciences.net/
  • 11. Summary • Conducting HTTPS traffic via Lua code • Blue-green deployment of Load balancer via DNS • Testing docker with RSpec + Serverspec • SignalSciences