Slides from my talk at London Go User Group.
The RED Method defines three key metrics you should measure for every microservice in your architecture; inspired by the USE Method from Brendan Gregg, it gives developers a template for instrumenting their services and building dashboards in a consistent, repeatable fashion.
In this talk we will discuss patterns of application instrumentation, where and when they are applicable, and how they can be implemented with Prometheus. We’ll cover Google’s Four Golden Signals, the RED Method, the USE Method, and Dye Testing. We’ll also discuss why consistency is an important approach for reducing cognitive load. Finally we’ll talk about the limitations of these approaches and what can be done to overcome them.
The RED Method: How To Instrument Your Services
1. The RED Method
Patterns for instrumentation & monitoring.
Tom Wilkie, London Go Meetup
tom@kausal.co @tom_wilkie
github.com/tomwilkie
3. • Founder Kausal, “transforming observability”
• Prometheus developer
• Home brewer
Previously:
• Worked on Kubernetes & Prometheus at Weaveworks
• SRE for Google Analytics
• Founder/CTO at Acunu, worked on Cassandra
4. Introduction
Why does this matter?
The RED Method
Request Rate, Errors, Duration
The USE Method
Utilisation, Saturation, Errors
The Four Golden Signals
RED + Saturation
7. The RED Method
For every service, monitor request:
• Rate - number of requests per second
• Errors - the number of those requests that are failing
• Duration - the amount of time those requests take
16. More Details
• “Monitoring Microservices” - Weaveworks (slides)
• “The RED Method: key metrics for microservices architecture” - Weaveworks
• “Monitoring and Observability with USE and RED” - VividCortex
• “RED Method for Prometheus – 3 Key Metrics for Monitoring” - Rancher Labs
• “Logs and Metrics” - Cindy Sridharan
• “Logging v. instrumentation”, “Go best practices, six years in” - Peter Bourgon
18. For every resource, monitor:
• Utilisation: % time that the resource was busy
• Saturation: amount of work resource has to do, often queue length
• Errors: the count of error events
USE Method: http://www.brendangregg.com/usemethod.html
25. The Four Golden Signals
For each service, monitor:
• Latency - time taken to serve a request
• Traffic - how much demand is placed on your system
• Errors - rate of requests that are failing
• Saturation - how “full” your service is