SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
Winston
Diagnostic and Remediation Engineering (DaRE)
Vinay Shah & Jean-Sebastien Jeannotte
● Introduction
● Internals - How it works?
● Demo - See it in action!
● Learnings and challenges
● Metrics & Road ahead
● Additional resources
Topics
Introduction
Landscape
Operational load vs.
new features
Scale and Growth Availability
Application or
Service
Monitoring
Alerting
Pagerduty Email Winston
● Reduce MTTR
● Reduce risk of human errors
● Reduce pager fatigue, provide tier 1 support
● Don’t worry about infrastructure, focus on your business logic
● Best practice for runbook lifecycle management
Business goals
Winston is an event driven runbook
automation platform. It is designed to host
and execute runbooks in response to
operational events.
Internals
Howisitdeployed?
Execution Flow
● One stop portal for all things Winston
● Supports Create, Read, Update, Delete, Execute and Diagnose functionality
● Implements best practises
○ Compliance/Auditing
○ Persistence
○ Security (Authentication/Authorization)
● Self serve & scalable
Winston Studio
● Pack
A group of related automations typically organized around a discreet
service or product
● Action
Set of steps to help with diagnostics or remediations written as code
● Event & event source
External services that are the source of events that trigger a runbook
Terminology
Demo
Winston Studio
DEMO
● False positives
○ Cassandra ring health
● Diagnostics - correlation could point towards causation - e.g:
○ Querying Chronos events
○ Querying dependencies upstream and downstream for anomalous behaviour
● Remediation
○ Clean up disk space
○ Restart Kafka process
Sample use cases
Learnings &
challenges
Common patterns
● Usage
○ Culture of automating the manual and repeatable
○ Noisy signals become more interesting
○ Lesser the control more the opportunity
● Product
○ Safety is crucial
○ Usability is important
○ Resiliency
Insights
● Don’t reinvent the wheel
● Start simple and iterate
● Allow experimentation
● Pay special care to usability of your product
● Push for changing the culture - usage will follow
● Talk to us/others who have gone through some of the pains and learnings
Recommendations to get started
Metrics and
Road ahead
● Adoption. Adoption. Adoption.
● Usability
○ Polyglot support (Groovy based actions)
○ Deeper Integrations
● Safety
○ Resource isolation (Containers)
○ Rate limiting
The road ahead
● Introducing Winston:
http://techblog.netflix.com/2016/08/introducing-winston-event-driven.html
● Stackstorm: https://docs.stackstorm.com/
● Reach out: vshah@netflix.com or jjeannotte@netflix.com
We are hiring
Senior Software Engineer - https://jobs.netflix.com/jobs/860752
Links & resources
Thank you.

Contenu connexe

Tendances

Alta Disponibilidad con PostgreSQL
Alta Disponibilidad con PostgreSQLAlta Disponibilidad con PostgreSQL
Alta Disponibilidad con PostgreSQLCarlos Gustavo Ruiz
 
Advantages of Cloud Computing for Business
Advantages of Cloud Computing for BusinessAdvantages of Cloud Computing for Business
Advantages of Cloud Computing for BusinessGrazitti Interactive
 
MySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & GrafanaMySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & GrafanaYoungHeon (Roy) Kim
 
Running Microsoft SharePoint On AWS - Smartronix and AWS - Webinar
Running Microsoft SharePoint On AWS - Smartronix and AWS - WebinarRunning Microsoft SharePoint On AWS - Smartronix and AWS - Webinar
Running Microsoft SharePoint On AWS - Smartronix and AWS - WebinarAmazon Web Services
 
AWS Neptune - A Fast and reliable Graph Database Built for the Cloud
AWS Neptune - A Fast and reliable Graph Database Built for the CloudAWS Neptune - A Fast and reliable Graph Database Built for the Cloud
AWS Neptune - A Fast and reliable Graph Database Built for the CloudAmazon Web Services
 
Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...
Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...
Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...Amazon Web Services
 
Exploiting IAM in the google cloud platform - dani_goland_mohsan_farid
Exploiting IAM in the google cloud platform - dani_goland_mohsan_faridExploiting IAM in the google cloud platform - dani_goland_mohsan_farid
Exploiting IAM in the google cloud platform - dani_goland_mohsan_faridCloudVillage
 
Best practices for choosing identity solutions for applications + workloads -...
Best practices for choosing identity solutions for applications + workloads -...Best practices for choosing identity solutions for applications + workloads -...
Best practices for choosing identity solutions for applications + workloads -...Amazon Web Services
 
From airflow to google cloud composer
From airflow to google cloud composerFrom airflow to google cloud composer
From airflow to google cloud composerBruce Kuo
 
Networking Challenges for the Next Decade
Networking Challenges for the Next DecadeNetworking Challenges for the Next Decade
Networking Challenges for the Next DecadeOpen Networking Summit
 
Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...
Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...
Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...Amazon Web Services
 
OSMC 2021 | Introduction into OpenSearch
OSMC 2021 | Introduction into OpenSearchOSMC 2021 | Introduction into OpenSearch
OSMC 2021 | Introduction into OpenSearchNETWAYS
 
Visualizing large datasets with js using deckgl
Visualizing large datasets with js using deckglVisualizing large datasets with js using deckgl
Visualizing large datasets with js using deckglMarko Letic
 

Tendances (20)

Alta Disponibilidad con PostgreSQL
Alta Disponibilidad con PostgreSQLAlta Disponibilidad con PostgreSQL
Alta Disponibilidad con PostgreSQL
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
Advantages of Cloud Computing for Business
Advantages of Cloud Computing for BusinessAdvantages of Cloud Computing for Business
Advantages of Cloud Computing for Business
 
MySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & GrafanaMySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & Grafana
 
Running Microsoft SharePoint On AWS - Smartronix and AWS - Webinar
Running Microsoft SharePoint On AWS - Smartronix and AWS - WebinarRunning Microsoft SharePoint On AWS - Smartronix and AWS - Webinar
Running Microsoft SharePoint On AWS - Smartronix and AWS - Webinar
 
AWS Neptune - A Fast and reliable Graph Database Built for the Cloud
AWS Neptune - A Fast and reliable Graph Database Built for the CloudAWS Neptune - A Fast and reliable Graph Database Built for the Cloud
AWS Neptune - A Fast and reliable Graph Database Built for the Cloud
 
Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...
Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...
Introduction to Amazon Route 53 Resolver for Hybrid Cloud (NET215) - AWS re:I...
 
Machine Learning on AWS
Machine Learning on AWSMachine Learning on AWS
Machine Learning on AWS
 
Dns
DnsDns
Dns
 
Exploiting IAM in the google cloud platform - dani_goland_mohsan_farid
Exploiting IAM in the google cloud platform - dani_goland_mohsan_faridExploiting IAM in the google cloud platform - dani_goland_mohsan_farid
Exploiting IAM in the google cloud platform - dani_goland_mohsan_farid
 
Best practices for choosing identity solutions for applications + workloads -...
Best practices for choosing identity solutions for applications + workloads -...Best practices for choosing identity solutions for applications + workloads -...
Best practices for choosing identity solutions for applications + workloads -...
 
From airflow to google cloud composer
From airflow to google cloud composerFrom airflow to google cloud composer
From airflow to google cloud composer
 
AWS Cloud Watch
AWS Cloud WatchAWS Cloud Watch
AWS Cloud Watch
 
Introduction to Microservices
Introduction to MicroservicesIntroduction to Microservices
Introduction to Microservices
 
AWS Simple Storage Service (s3)
AWS Simple Storage Service (s3) AWS Simple Storage Service (s3)
AWS Simple Storage Service (s3)
 
Networking Challenges for the Next Decade
Networking Challenges for the Next DecadeNetworking Challenges for the Next Decade
Networking Challenges for the Next Decade
 
Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...
Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...
Best Practices for Running Oracle Databases on Amazon RDS (DAT317) - AWS re:I...
 
OSMC 2021 | Introduction into OpenSearch
OSMC 2021 | Introduction into OpenSearchOSMC 2021 | Introduction into OpenSearch
OSMC 2021 | Introduction into OpenSearch
 
Visualizing large datasets with js using deckgl
Visualizing large datasets with js using deckglVisualizing large datasets with js using deckgl
Visualizing large datasets with js using deckgl
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 

Similaire à Winston - Netflix's event driven auto remediation and diagnostics tool

The Final Frontier, Automating Dynamic Security Testing
The Final Frontier, Automating Dynamic Security TestingThe Final Frontier, Automating Dynamic Security Testing
The Final Frontier, Automating Dynamic Security TestingMatt Tesauro
 
Webinar: Estrategias para optimizar los costos de testing
Webinar: Estrategias para optimizar los costos de testingWebinar: Estrategias para optimizar los costos de testing
Webinar: Estrategias para optimizar los costos de testingFederico Toledo
 
Agile Testing Analytics
Agile Testing AnalyticsAgile Testing Analytics
Agile Testing AnalyticsQASymphony
 
Testing in a continuous delivery environment
Testing in a continuous delivery environmentTesting in a continuous delivery environment
Testing in a continuous delivery environmentStefan Verhoeff
 
Demise of test scripts rise of test ideas
Demise of test scripts rise of test ideasDemise of test scripts rise of test ideas
Demise of test scripts rise of test ideasRichard Robinson
 
Managing software projects & teams effectively
Managing software projects & teams effectivelyManaging software projects & teams effectively
Managing software projects & teams effectivelyAshutosh Agarwal
 
Dscrum
DscrumDscrum
Dscrumsilexc
 
AWS Well Architected Framework in Summary
AWS Well Architected Framework in SummaryAWS Well Architected Framework in Summary
AWS Well Architected Framework in SummaryEwere Diagboya
 
Agile Development Practices May 2017
Agile Development Practices May 2017Agile Development Practices May 2017
Agile Development Practices May 2017Jaroslav Gergic
 
Using SaltStack to DevOps the enterprise
Using SaltStack to DevOps the enterpriseUsing SaltStack to DevOps the enterprise
Using SaltStack to DevOps the enterpriseChristian McHugh
 
Services, tools & practices for a software house
Services, tools & practices for a software houseServices, tools & practices for a software house
Services, tools & practices for a software houseParis Apostolopoulos
 
Metrics 4 faster feedback
Metrics 4 faster feedbackMetrics 4 faster feedback
Metrics 4 faster feedbackKris Buytaert
 
Drools5 Community Training Module 5 Drools BLIP Architectural Overview + Demos
Drools5 Community Training Module 5 Drools BLIP Architectural Overview + DemosDrools5 Community Training Module 5 Drools BLIP Architectural Overview + Demos
Drools5 Community Training Module 5 Drools BLIP Architectural Overview + DemosMauricio (Salaboy) Salatino
 
Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...
Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...
Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...Ontico
 
Usa prácticas de integración continua y sobrevive para luchar otro día.
 Usa prácticas de integración continua y sobrevive para luchar otro día. Usa prácticas de integración continua y sobrevive para luchar otro día.
Usa prácticas de integración continua y sobrevive para luchar otro día.Software Guru
 
3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of Agile3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of AgileNeotys
 
CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...
CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...
CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...CodeScience
 

Similaire à Winston - Netflix's event driven auto remediation and diagnostics tool (20)

The Final Frontier, Automating Dynamic Security Testing
The Final Frontier, Automating Dynamic Security TestingThe Final Frontier, Automating Dynamic Security Testing
The Final Frontier, Automating Dynamic Security Testing
 
Webinar: Estrategias para optimizar los costos de testing
Webinar: Estrategias para optimizar los costos de testingWebinar: Estrategias para optimizar los costos de testing
Webinar: Estrategias para optimizar los costos de testing
 
Agile Testing Analytics
Agile Testing AnalyticsAgile Testing Analytics
Agile Testing Analytics
 
Agile testing (n)
Agile testing (n)Agile testing (n)
Agile testing (n)
 
Testing in a continuous delivery environment
Testing in a continuous delivery environmentTesting in a continuous delivery environment
Testing in a continuous delivery environment
 
Demise of test scripts rise of test ideas
Demise of test scripts rise of test ideasDemise of test scripts rise of test ideas
Demise of test scripts rise of test ideas
 
Managing software projects & teams effectively
Managing software projects & teams effectivelyManaging software projects & teams effectively
Managing software projects & teams effectively
 
Dscrum
DscrumDscrum
Dscrum
 
AWS Well Architected Framework in Summary
AWS Well Architected Framework in SummaryAWS Well Architected Framework in Summary
AWS Well Architected Framework in Summary
 
Agile Development Practices May 2017
Agile Development Practices May 2017Agile Development Practices May 2017
Agile Development Practices May 2017
 
Pusheando en master, que es gerundio
Pusheando en master, que es gerundioPusheando en master, que es gerundio
Pusheando en master, que es gerundio
 
Using SaltStack to DevOps the enterprise
Using SaltStack to DevOps the enterpriseUsing SaltStack to DevOps the enterprise
Using SaltStack to DevOps the enterprise
 
Services, tools & practices for a software house
Services, tools & practices for a software houseServices, tools & practices for a software house
Services, tools & practices for a software house
 
Software management for tech startups
Software management for tech startupsSoftware management for tech startups
Software management for tech startups
 
Metrics 4 faster feedback
Metrics 4 faster feedbackMetrics 4 faster feedback
Metrics 4 faster feedback
 
Drools5 Community Training Module 5 Drools BLIP Architectural Overview + Demos
Drools5 Community Training Module 5 Drools BLIP Architectural Overview + DemosDrools5 Community Training Module 5 Drools BLIP Architectural Overview + Demos
Drools5 Community Training Module 5 Drools BLIP Architectural Overview + Demos
 
Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...
Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...
Making Sites Reliable (как сделать систему надежной) (Павел Уваров, Андрей Та...
 
Usa prácticas de integración continua y sobrevive para luchar otro día.
 Usa prácticas de integración continua y sobrevive para luchar otro día. Usa prácticas de integración continua y sobrevive para luchar otro día.
Usa prácticas de integración continua y sobrevive para luchar otro día.
 
3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of Agile3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of Agile
 
CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...
CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...
CodeScience Webinar - Automated Testing for Your Salesforce App — Tips and Tr...
 

Dernier

WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in sowetomasabamasaba
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024VictoriaMetrics
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxAnnaArtyushina1
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationJuha-Pekka Tolvanen
 
tonesoftg
tonesoftgtonesoftg
tonesoftglanshi9
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...masabamasaba
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 

Dernier (20)

WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 

Winston - Netflix's event driven auto remediation and diagnostics tool

  • 1. Winston Diagnostic and Remediation Engineering (DaRE) Vinay Shah & Jean-Sebastien Jeannotte
  • 2. ● Introduction ● Internals - How it works? ● Demo - See it in action! ● Learnings and challenges ● Metrics & Road ahead ● Additional resources Topics
  • 4. Landscape Operational load vs. new features Scale and Growth Availability
  • 6. ● Reduce MTTR ● Reduce risk of human errors ● Reduce pager fatigue, provide tier 1 support ● Don’t worry about infrastructure, focus on your business logic ● Best practice for runbook lifecycle management Business goals
  • 7. Winston is an event driven runbook automation platform. It is designed to host and execute runbooks in response to operational events.
  • 11. ● One stop portal for all things Winston ● Supports Create, Read, Update, Delete, Execute and Diagnose functionality ● Implements best practises ○ Compliance/Auditing ○ Persistence ○ Security (Authentication/Authorization) ● Self serve & scalable Winston Studio
  • 12. ● Pack A group of related automations typically organized around a discreet service or product ● Action Set of steps to help with diagnostics or remediations written as code ● Event & event source External services that are the source of events that trigger a runbook Terminology
  • 13. Demo
  • 15. ● False positives ○ Cassandra ring health ● Diagnostics - correlation could point towards causation - e.g: ○ Querying Chronos events ○ Querying dependencies upstream and downstream for anomalous behaviour ● Remediation ○ Clean up disk space ○ Restart Kafka process Sample use cases
  • 18. ● Usage ○ Culture of automating the manual and repeatable ○ Noisy signals become more interesting ○ Lesser the control more the opportunity ● Product ○ Safety is crucial ○ Usability is important ○ Resiliency Insights
  • 19. ● Don’t reinvent the wheel ● Start simple and iterate ● Allow experimentation ● Pay special care to usability of your product ● Push for changing the culture - usage will follow ● Talk to us/others who have gone through some of the pains and learnings Recommendations to get started
  • 21. ● Adoption. Adoption. Adoption. ● Usability ○ Polyglot support (Groovy based actions) ○ Deeper Integrations ● Safety ○ Resource isolation (Containers) ○ Rate limiting The road ahead
  • 22. ● Introducing Winston: http://techblog.netflix.com/2016/08/introducing-winston-event-driven.html ● Stackstorm: https://docs.stackstorm.com/ ● Reach out: vshah@netflix.com or jjeannotte@netflix.com We are hiring Senior Software Engineer - https://jobs.netflix.com/jobs/860752 Links & resources