Brian Brazil
Founder
Counting with Prometheus
Who am I?
Engineer passionate about running software reliably in production.
Core developer of Prometheus
Studied Computer Science in Trinity College Dublin.
Google SRE for 7 years, working on high-scale reliable systems.
Contributor to many open source projects, including Prometheus, Ansible,
Python, Aurora and Zookeeper.
Founder of Robust Perception, provider of commercial support and consulting
What is Prometheus?
Prometheus is a metrics-based time series database, designed for whitebox
monitoring.
It supports labels (dimensions/tags).
Alerting and graphing are unified, using the same language.
Architecture
Counting
Counting is easy right?
Just 1, 2, 3?
Counting… to the Extreme!!!
What if we're counting monitoring-related events for a metrics system, though?
We're usually sampling data over the network => potential data loss.
What happens with the data we transfer at the other end?
Counters and Gauges
There are two base metric types.
Gauges are a snapshot of current state, such as memory usage or temperature.
They can go up and down.
Counters are the other base type.
To explain them, we need to go on a "small" detour.
Events are key
Events are the thing we want to track with Counters.
An event might be an HTTP request hitting our server, a function call being made or
an error being returned.
An event logging system would record each event individually.
A metrics-based system like Prometheus or Graphite has events aggregated
across time before they get to the TSDB. Therein lies the rub.
Approach #1: Resetting Count
There are a few common approaches to providing this aggregate over time.
The first and simplest is the resetting count.
You start at 0, and every time there's an event you increment the count.
On a regular interval you transfer the current count, and reset to 0.
Counter Reset, Normal Operation
Approach #1: Resetting Count Problems
If a transfer fails, you've lost that data.
Can't presume this effect will be random and unbiased, e.g. if a big spike in traffic
also saturates network links used for monitoring.
Doesn't work if you want to transfer data to more than one place for redundancy.
Each would get 1/n of the data.
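A minimal sketch of the resetting count and its failure mode, assuming a single receiver. The names `ResettingCount` and `collect` are made up for illustration and are not from any library.

```python
class ResettingCount:
    """Client-side count that is zeroed on every transfer (Approach #1)."""

    def __init__(self):
        self.count = 0

    def inc(self):
        self.count += 1

    def transfer(self):
        """Return the current count and reset it to 0."""
        value, self.count = self.count, 0
        return value


def collect(events_per_interval, failed_intervals=()):
    """Total seen by the receiver when some transfers are lost in transit."""
    client = ResettingCount()
    received = 0
    for i, events in enumerate(events_per_interval):
        for _ in range(events):
            client.inc()
        value = client.transfer()      # the client resets regardless
        if i not in failed_intervals:  # a failed transfer drops the value
            received += value
    return received
```

With `collect([5, 20, 5])` the receiver sees all 30 events. If the transfer during the spike fails (`failed_intervals={1}`), it sees only 10 and the spike is gone for good.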
Counter Reset, Failed Transfer
Approach #2: Exponential moving average
A number of instrumentation libraries offer this, such as DropWizard's Meter.
Basically the same way Unix load averages work:
result(t) = (1 - w) * result(t-1) + (w) * events_this_period
Where t is the tick, and w is a weighting factor.
The weighting factor determines how quickly new data is incorporated.
Dropwizard evaluates the above every 5 seconds.
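The formula above as a small sketch. The function names and the starting value of 0 are assumptions for illustration, not Dropwizard's implementation.

```python
def ema_update(previous, events_this_period, w):
    """One tick: result(t) = (1 - w) * result(t-1) + w * events_this_period."""
    return (1 - w) * previous + w * events_this_period


def ema_series(events_per_tick, w, initial=0.0):
    """Apply the update once per tick and return the result after each tick."""
    result = initial
    out = []
    for events in events_per_tick:
        result = ema_update(result, events, w)
        out.append(result)
    return out
```

A larger `w` lets new data dominate sooner. A smaller `w` smooths more but reacts more slowly.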
Exponential Moving Average, Normal Operation
Approach #2: Exponential moving average Problems
Events aren't uniformly considered. If you're transferring data every 10s, then the
most recent 5s matter more.
Thus reconstructing what happened is hard for debugging, unless you get every
5s update.
You're bound to the 1m, 5m and 15m weightings that the implementation has
chosen.
Also means that it's not particularly resilient to missing a scrape.
Aside: Graphite's summarize() function
Summarize() returns events during e.g. the last hour.
Some have a belief that summarize is accurate. It isn't.
Problem is that with say 15m granularity, data point at 13:02 will include data from
13 minutes before 13:00-14:00 and similarly at the end.
If you want this accurately, need to use logs.
No metrics based system can report this accurately in the general case.
Graphite's summarize() and non-aligned data
Aliasing
Depending on the exact time offsets between the process start, metric
initialisation, data transfers and when the user makes a query you can get
different results.
A second in either direction can make a big difference to the patterns you see in
your graphs.
This is an expected signal processing effect, be aware of it.
Expressive Power
Both previous solutions are reasonable if the monitoring system is a fairly dumb
data store, often with little math beyond addition (if even that).
Losing data or having no redundancy are better than having nothing at all.
What if you have the option for your monitoring system to do math?
What if you control both ends?
Approach #3: Prometheus Counters
Like Approach #1, we have a counter that starts at 0 and increments at each event.
This is transferred regularly to Prometheus.
It's not reset on every transfer though, keeps on increasing.
Rate() function in Prometheus takes this in, and calculates how quickly it
increased over the given time period.
Prometheus rate(), Normal Operation
Approach #3: Prometheus Counters
Resilient to failed transfers (lose resolution, not data)
Can handle multiple places to transfer to
Can choose the time period you want to calculate over in monitoring system, thus
choose your level of smoothing e.g. rate(my_metric[5m]) or rate(my_metric[10m])
Uniform consideration of data
Easy to implement on client side
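A toy illustration of "lose resolution, not data", assuming no process restarts. `scraped_samples` and `simple_increase` are hypothetical helpers, and `simple_increase` ignores the reset handling and extrapolation that the real rate() performs.

```python
def scraped_samples(events_per_interval, failed_intervals=()):
    """Samples the server would store: (interval index, cumulative count)."""
    total = 0
    samples = []
    for i, events in enumerate(events_per_interval):
        total += events                # never reset on the client
        if i not in failed_intervals:  # a failed scrape stores nothing
            samples.append((i, total))
    return samples


def simple_increase(samples):
    """Increase between the first and last stored sample."""
    return samples[-1][1] - samples[0][1]
```

Dropping the scrape at interval 1 removes one data point, but the next successful scrape still carries the events from the missed interval, so the increase over the whole window is unchanged.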
Prometheus rate(), Failed Transfer
Prometheus Counters: Rate()
There are many details to getting the rate() function right.
Processes might restart
Scrapes might be missed
Time periods rarely align perfectly
Time series stop and start
Prometheus Counters: Resets
Counters can be reset to 0, such as if a process restarts or a network switch gets
rebooted.
If we see the value decreasing, then that's a counter reset so presume it went to 0.
So seeing 0 -> 10 -> 5 is 10 + 5 = 15 of an increase.
Graphite/InfluxDB's nonNegativeDerivative() function would ignore the drop, report
based on just the increase 10.
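The 0 -> 10 -> 5 example as a sketch. Both functions are simplified stand-ins written for this illustration. They are not the Prometheus, Graphite or InfluxDB source.

```python
def increase_with_resets(values):
    """Sum of increases, treating any decrease as a reset to 0."""
    total = 0
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            total += cur         # restarted from 0 and has since reached cur
        else:
            total += cur - prev
    return total


def non_negative_increase(values):
    """Sum of increases, simply ignoring any step that goes down."""
    return sum(max(cur - prev, 0) for prev, cur in zip(values, values[1:]))
```

For `[0, 10, 5]` the first function returns 15 and the second returns 10.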
Prometheus Counters: Missed scrapes
If we miss a scrape in the middle of a time period, no problem as we still have the
data points around it.
Little more complicated around the edges of the time period we're looking at
though.
Prometheus Counters: Alignment
It is rare that the user will request data exactly on the same cycle as the scrapes.
Especially when you're monitoring multiple servers with staggered scrapes.
Or given that timestamps are millisecond resolution, and the endpoint that graphs use
accepts only second-granularity input.
Thus we need to extrapolate out to the end of the rate()'s range.
increase() can return non-integers on integral data
This is why one of the more surprising behaviours of increase() happens.
So if we have data which is:
t= 1 10
t= 6 12
t=11 13
Request an increase() from t=0 for 15s, and you'll get an increase of 3 over 10s.
Extrapolating over the 15s, that's a result of 4.5.
This is the correct result on average. If you want exact answers, use logs.
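The arithmetic for this example, as a sketch that only scales the observed increase to the requested range. `scaled_increase` is a made-up name, and it leaves out the edge-case handling covered in the slides that follow.

```python
def scaled_increase(samples, range_seconds):
    """Scale the observed increase from the time covered by samples to the full range."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    observed = v_last - v_first  # 13 - 10 = 3
    covered = t_last - t_first   # 11 - 1 = 10 seconds
    return observed * range_seconds / covered


result = scaled_increase([(1, 10), (6, 12), (11, 13)], 15)  # 3 * 15 / 10 = 4.5
```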
Non-integral increase due to extrapolation
Prometheus Counters: Time series lifecycle
Time series are created and destroyed. If we always extrapolated out to the edge
of the rate() range we'd get much bigger results than we should.
So we detect that. We calculate the average interval between data points.
If the first/last data point is within 110% of the average interval from the start/end
of the range, then we extrapolate to the start/end. Allows for failed scrapes.
Otherwise we extrapolate 50% of the average interval.
We also know counters can't go negative, so don't extrapolate before the point
they'd be 0 at.
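A sketch of the extrapolation rules as described on this slide, assuming at least two samples sorted by time and no counter reset inside the range. The function name and variable names are illustrative. The real logic lives in promql/functions.go and may differ in detail.

```python
def extrapolated_increase(samples, range_start, range_end):
    """samples: list of (timestamp_seconds, value) inside [range_start, range_end]."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    increase = v_last - v_first
    sampled = t_last - t_first
    avg_interval = sampled / (len(samples) - 1)
    threshold = avg_interval * 1.1  # the 110% rule

    to_start = t_first - range_start
    to_end = range_end - t_last

    # Counters can't be negative: don't extrapolate back past the zero point.
    if increase > 0 and v_first >= 0:
        to_zero = sampled * v_first / increase
        to_start = min(to_start, to_zero)

    # Close to the edge: extrapolate to it. Otherwise assume the series
    # started/stopped there and extrapolate only half an average interval.
    extrapolate_to = sampled
    extrapolate_to += to_start if to_start < threshold else avg_interval / 2
    extrapolate_to += to_end if to_end < threshold else avg_interval / 2
    return increase * extrapolate_to / sampled
```

For samples at t=1, 6, 11 with values 10, 12, 13 and a range of 0 to 15, both edges are within the threshold, so the result is 3 * 15 / 10 = 4.5. If the same three samples sat at t=100, 105, 110 in a range of 0 to 115, the start would be too far away, and the result would be 3 * 17.5 / 10 = 5.25.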
rate() extrapolation
Problem: Timeseries not always existing
The previous logic handles all the edge cases around counter resets, process
restarts and rolling restarts, on average.
What if a counter appears with the value 1 though long after the process has
started and doesn't increase again?
No increase in the history, so rate() doesn't see it. Can't tell when the increase
happened. Prometheus is designed to be available, not catch 100% of events.
Solution: Logs, or make sure all your counters are being initialised on process
start so it goes 0->1. Will only miss it prior to the first scrape then.
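A toy illustration of why initialising to 0 matters. `observed_increase` is a hypothetical helper that looks only at stored sample values and ignores extrapolation.

```python
def observed_increase(values):
    """Increase across stored samples, treating any drop as a reset to 0."""
    total = 0
    for prev, cur in zip(values, values[1:]):
        total += cur if cur < prev else cur - prev
    return total


# Series that first appears with the value 1 and never changes again:
late_series = observed_increase([1, 1, 1])        # no increase is visible

# Same event, but the counter was exported as 0 from process start:
initialised = observed_increase([0, 0, 1, 1, 1])  # the 0 -> 1 step is visible
```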
Problem: Lag
All these solutions produce results that lag the actual data - already seen with
summarize().
A 5m Prometheus rate() at a given time, is really the average from 5 minutes ago
to now. Similarly with resetting counters.
Exponential moving averages more complicated to explain, same issue though.
Always compare like with like, stick to one range for your rate()s.
Client Implications
The Prometheus Counter is very easy to implement, only need to increment a
single number.
Concurrency handling varies by language. Mutexes are the slowest, then atomics,
then per-processor values - which the Prometheus Java client approximates.
Dropwizard Meter has to increment 4 numbers and do the decay logic, so about
6x slower per benchmarks.
Dropwizard Counter (which is really a Gauge, as it can go down) is as fast as
Prometheus Counter.
Other performance considerations
Values for each label value (called a "Child") are in a map in each metric.
That map lookup can be relatively expensive (~100ns), keep a pointer to the Child
if that could matter. Need to know the labels you'll be using in advance though.
Similarly, don't create a map from metric names to metric objects. Store metric
objects as pointers in simple variables after you create them.
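The pattern sketched with toy classes. `ToyCounter` and `ToyChild` are hypothetical stand-ins written for this example, not a real client library API. Real clients expose a similar labels-then-increment shape, but check their documentation for the exact calls.

```python
class ToyChild:
    """Holds the value for one combination of label values."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount


class ToyCounter:
    """A metric with a map from label values to children."""

    def __init__(self):
        self._children = {}

    def labels(self, *label_values):
        # This is the map lookup that can be relatively expensive.
        child = self._children.get(label_values)
        if child is None:
            child = self._children[label_values] = ToyChild()
        return child


# The metric object lives in a plain variable, not in a name -> metric map.
REQUESTS = ToyCounter()
# Label values known in advance: look the child up once and keep the reference.
GET_REQUESTS = REQUESTS.labels("GET")


def handle_get():
    GET_REQUESTS.inc()  # no map lookup on the hot path
```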
Best Practices
Use seconds for timing. Prometheus values are all floats, so developers don't
need to choose and deal with a mix of ns, us, ms, s and minutes.
increase() function handy for display, but similarly for consistency only use it for
display. Use rate() for recording rules.
increase() is only syntactic sugar for rate().
irate(): The other rate function
Prometheus also has irate().
This looks at the last two points in the given range, and returns how fast they're
increasing per second.
Great for seeing very up to date data.
Can be hard to read if data is very spiky.
Need to be careful you're not missing data.
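A sketch of the idea, assuming at least two samples in the range. The function is written for illustration and is not the Prometheus source. Treating a drop as a reset to 0 follows the counter reset rule described earlier.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A decrease means the counter reset, so the increase is the new value.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)
```

For samples `[(0, 0), (10, 10), (20, 40)]` only the last two points count, giving (40 - 10) / 10 = 3.0 per second. Everything before the second-to-last sample is ignored, which is why a large step between evaluations can miss data.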
irate(), Normal Operation
irate(), Failed Transfer
Steps and rate durations
The query_range HTTP endpoint has a step parameter, this is the resolution of
the data returned.
If you have a 10m step and 5m rate, you're going to be ignoring half your data.
To avoid this, make sure your rate range is at least your step for graphs.
For irate(), your step should be no more than your sample resolution.
Compound Types: Summary
How to track average latency? With two counters!
One for total requests (_count), one for total latency of those requests (_sum).
Take the rates, divide and you have average latency.
This is how the compound Summary metric works. It's a more convenient API over
doing the above by hand.
Some clients also offer quantiles. Beware, slow and unaggregatable.
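The two-counter calculation as a sketch. `average_latency` is a made-up helper that assumes both series were sampled over the same window with no resets, so dividing the increases gives the same answer as dividing the rates.

```python
def average_latency(sum_samples, count_samples):
    """Average latency over a window from _sum and _count counter samples."""
    requests = count_samples[-1] - count_samples[0]
    if requests == 0:
        return float("nan")  # no requests in the window, no average
    latency_total = sum_samples[-1] - sum_samples[0]
    return latency_total / requests


# _sum went from 10s to 16s while _count went from 100 to 130 requests:
avg = average_latency([10.0, 16.0], [100, 130])  # 6s / 30 requests = 0.2s
```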
Compound Types: Histogram
Histogram also includes the _count and _sum.
The main purpose is calculating quantiles in Prometheus.
The histogram has buckets, which are counters. You can take the rate() of these,
aggregate them and then use histogram_quantile() to calculate arbitrary quantiles.
Be wary of cardinality explosion, use sparingly.
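A sketch of how a quantile can be estimated from cumulative buckets by linear interpolation. It is written for illustration and assumes at least two buckets, sorted by upper bound, ending with a +Inf bucket, and a lowest bucket starting at 0. The real histogram_quantile() handles more edge cases.

```python
import math


def histogram_quantile(q, buckets):
    """buckets: list of (upper_bound, cumulative_count), sorted, last bound is +Inf."""
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # No upper bound to interpolate to: report the highest finite bound.
        return buckets[-2][0]
    lower = buckets[i - 1][0] if i > 0 else 0.0
    prev_count = buckets[i - 1][1] if i > 0 else 0.0
    if count == prev_count:
        return lower
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)
```

In practice the counts passed in would be the per-bucket rate() values after aggregation rather than raw counter values. The interpolation step is the same either way, and its accuracy depends on how well the bucket bounds match the data.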
Resources
Official Project Website: prometheus.io
User Mailing List: prometheus-users@googlegroups.com
Developer Mailing List: prometheus-developers@googlegroups.com
Source code:
https://github.com/prometheus/prometheus/blob/master/promql/functions.go
Robust Perception Blog: www.robustperception.io/blog

More Related Content

What's hot

[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기
[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기
[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기Ji-Woong Choi
 
Thanos: Global, durable Prometheus monitoring
Thanos: Global, durable Prometheus monitoringThanos: Global, durable Prometheus monitoring
Thanos: Global, durable Prometheus monitoringBartłomiej Płotka
 
Troubleshooting containerized triple o deployment
Troubleshooting containerized triple o deploymentTroubleshooting containerized triple o deployment
Troubleshooting containerized triple o deploymentSadique Puthen
 
SymfonyCon 2019: Head first into Symfony Cache, Redis & Redis Cluster
SymfonyCon 2019:   Head first into Symfony Cache, Redis & Redis ClusterSymfonyCon 2019:   Head first into Symfony Cache, Redis & Redis Cluster
SymfonyCon 2019: Head first into Symfony Cache, Redis & Redis ClusterAndré Rømcke
 
Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Ji-Woong Choi
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Brian Brazil
 
ASP.NET CoreとAzure AD B2Cを使ったサクっと認証
ASP.NET CoreとAzure AD B2Cを使ったサクっと認証ASP.NET CoreとAzure AD B2Cを使ったサクっと認証
ASP.NET CoreとAzure AD B2Cを使ったサクっと認証Yuta Matsumura
 
XSSフィルターを利用したXSS攻撃 by Masato Kinugawa
XSSフィルターを利用したXSS攻撃 by Masato KinugawaXSSフィルターを利用したXSS攻撃 by Masato Kinugawa
XSSフィルターを利用したXSS攻撃 by Masato KinugawaCODE BLUE
 
쿠키런 1년, 서버개발 분투기
쿠키런 1년, 서버개발 분투기쿠키런 1년, 서버개발 분투기
쿠키런 1년, 서버개발 분투기Brian Hong
 
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...Thomas Riley
 
Alphorm.com Formation Kubernetes : Installation et Configuration
Alphorm.com Formation Kubernetes : Installation et ConfigurationAlphorm.com Formation Kubernetes : Installation et Configuration
Alphorm.com Formation Kubernetes : Installation et ConfigurationAlphorm
 
[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOSAkihiro Suda
 
TRICK 2022 Results
TRICK 2022 ResultsTRICK 2022 Results
TRICK 2022 Resultsmametter
 
MariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システムMariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システムKouhei Sutou
 
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...Tokuhiro Matsuno
 
실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기
실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기
실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기Kee Hoon Lee
 
웹을 지탱하는 기술
웹을 지탱하는 기술웹을 지탱하는 기술
웹을 지탱하는 기술JungHyuk Kwon
 
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Flink Forward
 
ASP.NET MVC Performance
ASP.NET MVC PerformanceASP.NET MVC Performance
ASP.NET MVC Performancerudib
 

What's hot (20)

[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기
[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기
[오픈소스컨설팅] 프로메테우스 모니터링 살펴보고 구성하기
 
Thanos: Global, durable Prometheus monitoring
Thanos: Global, durable Prometheus monitoringThanos: Global, durable Prometheus monitoring
Thanos: Global, durable Prometheus monitoring
 
Troubleshooting containerized triple o deployment
Troubleshooting containerized triple o deploymentTroubleshooting containerized triple o deployment
Troubleshooting containerized triple o deployment
 
SymfonyCon 2019: Head first into Symfony Cache, Redis & Redis Cluster
SymfonyCon 2019:   Head first into Symfony Cache, Redis & Redis ClusterSymfonyCon 2019:   Head first into Symfony Cache, Redis & Redis Cluster
SymfonyCon 2019: Head first into Symfony Cache, Redis & Redis Cluster
 
Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
 
ASP.NET CoreとAzure AD B2Cを使ったサクっと認証
ASP.NET CoreとAzure AD B2Cを使ったサクっと認証ASP.NET CoreとAzure AD B2Cを使ったサクっと認証
ASP.NET CoreとAzure AD B2Cを使ったサクっと認証
 
XSSフィルターを利用したXSS攻撃 by Masato Kinugawa
XSSフィルターを利用したXSS攻撃 by Masato KinugawaXSSフィルターを利用したXSS攻撃 by Masato Kinugawa
XSSフィルターを利用したXSS攻撃 by Masato Kinugawa
 
쿠키런 1년, 서버개발 분투기
쿠키런 1년, 서버개발 분투기쿠키런 1년, 서버개발 분투기
쿠키런 1년, 서버개발 분투기
 
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
 
Alphorm.com Formation Kubernetes : Installation et Configuration
Alphorm.com Formation Kubernetes : Installation et ConfigurationAlphorm.com Formation Kubernetes : Installation et Configuration
Alphorm.com Formation Kubernetes : Installation et Configuration
 
[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS
 
TRICK 2022 Results
TRICK 2022 ResultsTRICK 2022 Results
TRICK 2022 Results
 
MariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システムMariaDBとMroongaで作る全言語対応超高速全文検索システム
MariaDBとMroongaで作る全言語対応超高速全文検索システム
 
Uml速習会
Uml速習会Uml速習会
Uml速習会
 
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
 
실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기
실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기
실시간 이상탐지를 위한 머신러닝 모델에 Druid _ Imply 활용하기
 
웹을 지탱하는 기술
웹을 지탱하는 기술웹을 지탱하는 기술
웹을 지탱하는 기술
 
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
 
ASP.NET MVC Performance
ASP.NET MVC PerformanceASP.NET MVC Performance
ASP.NET MVC Performance
 

Similar to Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)

Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Brian Brazil
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Brian Brazil
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)Brian Brazil
 
My Postdoctoral Research
My Postdoctoral ResearchMy Postdoctoral Research
My Postdoctoral ResearchPo-Ting Wu
 
Algorithm Analysis.pdf
Algorithm Analysis.pdfAlgorithm Analysis.pdf
Algorithm Analysis.pdfMemMem25
 
Stevens-Benchmarking Sorting Algorithms
Stevens-Benchmarking Sorting AlgorithmsStevens-Benchmarking Sorting Algorithms
Stevens-Benchmarking Sorting AlgorithmsJames Stevens
 
QConSF 2014 talk on Netflix Mantis, a stream processing system
QConSF 2014 talk on Netflix Mantis, a stream processing systemQConSF 2014 talk on Netflix Mantis, a stream processing system
QConSF 2014 talk on Netflix Mantis, a stream processing systemDanny Yuan
 
Mantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemMantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemC4Media
 
Introduction to Algorithms
Introduction to AlgorithmsIntroduction to Algorithms
Introduction to AlgorithmsVenkatesh Iyer
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Brian Brazil
 
TIME EXECUTION OF DIFFERENT SORTED ALGORITHMS
TIME EXECUTION   OF  DIFFERENT SORTED ALGORITHMSTIME EXECUTION   OF  DIFFERENT SORTED ALGORITHMS
TIME EXECUTION OF DIFFERENT SORTED ALGORITHMSTanya Makkar
 
Six Sigma Dfss Application In Data Accarucy
Six Sigma Dfss Application In Data AccarucySix Sigma Dfss Application In Data Accarucy
Six Sigma Dfss Application In Data Accarucyxyhfun
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum JapanBrian Brazil
 
Using Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the EdgeUsing Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the EdgeDataWorks Summit
 
Process Synchronization -1.ppt
Process Synchronization -1.pptProcess Synchronization -1.ppt
Process Synchronization -1.pptjayverma27
 
Ch7 OS
Ch7 OSCh7 OS
Ch7 OSC.U
 

Similar to Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017) (20)

Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)
 
My Postdoctoral Research
My Postdoctoral ResearchMy Postdoctoral Research
My Postdoctoral Research
 
Algorithm Analysis.pdf
Algorithm Analysis.pdfAlgorithm Analysis.pdf
Algorithm Analysis.pdf
 
Stevens-Benchmarking Sorting Algorithms
Stevens-Benchmarking Sorting AlgorithmsStevens-Benchmarking Sorting Algorithms
Stevens-Benchmarking Sorting Algorithms
 
QConSF 2014 talk on Netflix Mantis, a stream processing system
QConSF 2014 talk on Netflix Mantis, a stream processing systemQConSF 2014 talk on Netflix Mantis, a stream processing system
QConSF 2014 talk on Netflix Mantis, a stream processing system
 
Mantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemMantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing System
 
Introduction to Algorithms
Introduction to AlgorithmsIntroduction to Algorithms
Introduction to Algorithms
 
Algo and flowchart
Algo and flowchartAlgo and flowchart
Algo and flowchart
 
Analyzing algorithms
Analyzing algorithmsAnalyzing algorithms
Analyzing algorithms
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
 
TIME EXECUTION OF DIFFERENT SORTED ALGORITHMS
TIME EXECUTION   OF  DIFFERENT SORTED ALGORITHMSTIME EXECUTION   OF  DIFFERENT SORTED ALGORITHMS
TIME EXECUTION OF DIFFERENT SORTED ALGORITHMS
 
Six Sigma Dfss Application In Data Accarucy
Six Sigma Dfss Application In Data AccarucySix Sigma Dfss Application In Data Accarucy
Six Sigma Dfss Application In Data Accarucy
 
Prometheus - Open Source Forum Japan
Prometheus  - Open Source Forum JapanPrometheus  - Open Source Forum Japan
Prometheus - Open Source Forum Japan
 
Using Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the EdgeUsing Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
Using Apache Pulsar to Provide Real-Time IoT Analytics on the Edge
 
Is this normal?
Is this normal?Is this normal?
Is this normal?
 
Process Synchronization -1.ppt
Process Synchronization -1.pptProcess Synchronization -1.ppt
Process Synchronization -1.ppt
 
Major ppt
Major pptMajor ppt
Major ppt
 
Ch7 OS
Ch7 OSCh7 OS
Ch7 OS
 

More from Brian Brazil

OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)Brian Brazil
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Brian Brazil
 
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)Brian Brazil
 
Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)Brian Brazil
 
Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)Prometheus for Monitoring Metrics (Fermilab 2018)
Prometheus for Monitoring Metrics (Fermilab 2018)Brian Brazil
 
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)Evolving Prometheus for the Cloud Native World (FOSDEM 2018)
Evolving Prometheus for the Cloud Native World (FOSDEM 2018)Brian Brazil
 
Prometheus for Monitoring Metrics (Percona Live Europe 2017)
Prometheus for Monitoring Metrics (Percona Live Europe 2017)Prometheus for Monitoring Metrics (Percona Live Europe 2017)
Prometheus for Monitoring Metrics (Percona Live Europe 2017)Brian Brazil
 
Evolution of the Prometheus TSDB (Percona Live Europe 2017)
Evolution of the Prometheus TSDB  (Percona Live Europe 2017)Evolution of the Prometheus TSDB  (Percona Live Europe 2017)
Evolution of the Prometheus TSDB (Percona Live Europe 2017)Brian Brazil
 
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)Staleness and Isolation in Prometheus 2.0 (PromCon 2017)
Staleness and Isolation in Prometheus 2.0 (PromCon 2017)Brian Brazil
 
Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)Rule 110 for Prometheus (PromCon 2017)
Rule 110 for Prometheus (PromCon 2017)Brian Brazil
 
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)Prometheus:  From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)
Prometheus: From Berlin to Bonanza (Keynote CloudNativeCon+Kubecon Europe 2017)Brian Brazil
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)Brian Brazil
 
Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Brian Brazil
 
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...
Monitoring What Matters: The Prometheus Approach to Whitebox Monitoring (Berl...Brian Brazil
 
So You Want to Write an Exporter
So You Want to Write an ExporterSo You Want to Write an Exporter
So You Want to Write an ExporterBrian Brazil
 
An Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQLAn Exploration of the Formal Properties of PromQL
An Exploration of the Formal Properties of PromQLBrian Brazil
 
Life of a Label (PromCon2016, Berlin)
Life of a Label (PromCon2016, Berlin)Life of a Label (PromCon2016, Berlin)
Life of a Label (PromCon2016, Berlin)Brian Brazil
 
Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)Prometheus (Monitorama 2016)
Prometheus (Monitorama 2016)Brian Brazil
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
 
Ansible at FOSDEM (Ansible Dublin, 2016)
Ansible at FOSDEM (Ansible Dublin, 2016)Ansible at FOSDEM (Ansible Dublin, 2016)
Ansible at FOSDEM (Ansible Dublin, 2016)Brian Brazil
 

More from Brian Brazil (20)

OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)
 
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
Evaluating Prometheus Knowledge in Interviews (PromCon 2018)
 
Anatomy of a Prometheus Client Library (PromCon 2018)
Anatomy of a Prometheus Client Library (PromCon 2018)Anatomy of a Prometheus Client Library (PromCon 2018)


Counting with Prometheus (CloudNativeCon+Kubecon Europe 2017)

  • 2. Who am I? Engineer passionate about running software reliably in production. Core developer of Prometheus. Studied Computer Science in Trinity College Dublin. Google SRE for 7 years, working on high-scale reliable systems. Contributor to many open source projects, including Prometheus, Ansible, Python, Aurora and Zookeeper. Founder of Robust Perception, provider of commercial support and consulting.
  • 3. What is Prometheus? Prometheus is a metrics-based time series database, designed for whitebox monitoring. It supports labels (dimensions/tags). Alerting and graphing are unified, using the same language.
  • 5. Counting Counting is easy right? Just 1, 2, 3?
  • 6. Counting… to the Extreme!!! What if we're counting monitoring-related events for a metrics system, though? We're usually sampling data over the network => potential data loss. What happens with the data we transfer at the other end?
  • 7. Counters and Gauges There are two base metric types. Gauges are a snapshot of current state, such as memory usage or temperature. They can go up and down. Counters are the other base type. To explain them, we need to go on a "small" detour.
  • 8. Events are key Events are the thing we want to track with Counters. An event might be an HTTP request hitting our server, a function call being made or an error being returned. An event logging system would record each event individually. A metrics-based system like Prometheus or Graphite has events aggregated across time before they get to the TSDB. Therein lies the rub.
  • 9. Approach #1: Resetting Count There are a few common approaches to providing this aggregate over time. The first and simplest is the resetting count. You start at 0, and every time there's an event you increment the count. On a regular interval you transfer the current count, and reset to 0.
  • 11. Approach #1: Resetting Count Problems If a transfer fails, you've lost that data. Can't presume this effect will be random and unbiased, e.g. if a big spike in traffic also saturates network links used for monitoring. Doesn't work if you want to transfer data to more than one place for redundancy. Each would get 1/n of the data.
  • 13. Approach #2: Exponential moving average A number of instrumentation libraries offer this, such as Dropwizard's Meter. Basically the same way Unix load averages work: result(t) = (1 - w) * result(t-1) + w * events_this_period, where t is the tick and w is a weighting factor. The weighting factor determines how quickly new data is incorporated. Dropwizard evaluates the above every 5 seconds.
  • 14. Exponential Moving Average, Normal Operation
  • 15. Approach #2: Exponential moving average Problems Events aren't uniformly considered. If you're transferring data every 10s, then the most recent 5s matter more. Thus reconstructing what happened is hard for debugging, unless you get every 5s update. You're bound to the 1m, 5m and 15m weightings that the implementation has chosen. Also means that it's not particularly resilient to missing a scrape.
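The update rule from slide 13 can be sketched in a few lines of Python. This is an illustration only, not Dropwizard's code: the function name `ema_tick`, the weighting factor of 0.5 and the event counts are all assumptions chosen to keep the arithmetic easy to follow.

```python
def ema_tick(previous, events_this_period, w):
    """Apply one tick of result(t) = (1 - w) * result(t-1) + w * events_this_period."""
    return (1 - w) * previous + w * events_this_period


# Illustrative run: a steady 10 events per tick, then silence.
# w=0.5 is a made-up weighting factor, not one any library uses.
result = 0.0
history = []
for events in [10, 10, 10, 0, 0]:
    result = ema_tick(result, events, w=0.5)
    history.append(result)
```

Each value in `history` is dominated by the most recent tick, so a reader who only samples every other tick gets a skewed picture of what happened in between. That is the non-uniform weighting the slide above warns about.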
  • 16. Aside: Graphite's summarize() function Summarize() returns events during e.g. the last hour. Some have a belief that summarize is accurate. It isn't. Problem is that with say 15m granularity, data point at 13:02 will include data from 13 minutes before 13:00-14:00 and similarly at the end. If you want this accurately, need to use logs. No metrics based system can report this accurately in the general case.
  • 17. Graphite's summarize() and non-aligned data
  • 18. Aliasing Depending on the exact time offsets between the process start, metric initialisation, data transfers and when the user makes a query you can get different results. A second in either direction can make a big difference to the patterns you see in your graphs. This is an expected signal processing effect, be aware of it.
  • 19. Expressive Power Both previous solutions are reasonable if the monitoring system is a fairly dumb data store, often with little math beyond addition (if even that). Losing data or having no redundancy are better than having nothing at all. What if you have the option for your monitoring system to do math? What if you control both ends?
  • 20. Approach #3: Prometheus Counters Like Approach #1, we have a counter that starts at 0 and increments at each event. This is transferred regularly to Prometheus. It's not reset on every transfer though, it keeps on increasing. The rate() function in Prometheus takes this in, and calculates how quickly it increased over the given time period.
  • 22. Approach #3: Prometheus Counters Resilient to failed transfers (lose resolution, not data). Can handle multiple places to transfer to. Can choose the time period you want to calculate over in the monitoring system, thus choose your level of smoothing, e.g. rate(my_metric[5m]) or rate(my_metric[10m]). Uniform consideration of data. Easy to implement on client side.
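To make "lose resolution, not data" concrete, here is a small Python simulation comparing what a receiver ends up with under Approach #1 and Approach #3 when one transfer fails. The function name, the event counts and the assumption that an initial scrape saw the counter at 0 are all hypothetical.

```python
def receiver_totals(events_per_interval, transfer_ok):
    """Return (resetting_total, cumulative_total) as seen by the receiver.

    events_per_interval: events that happened in each transfer interval.
    transfer_ok: whether the transfer at the end of each interval succeeded.
    """
    resetting_total = 0  # sum of the resetting counts that actually arrived
    counter = 0          # client-side counter that is never reset
    last_seen = 0        # assumption: an initial scrape saw the counter at 0
    for events, ok in zip(events_per_interval, transfer_ok):
        counter += events
        if ok:
            resetting_total += events  # Approach #1: this interval's count arrives
            last_seen = counter        # Approach #3: the running total arrives
        # On failure, Approach #1 has already reset to 0, so the count is gone.
        # Approach #3 simply delivers the larger total on the next success.
    return resetting_total, last_seen


events = [5, 7, 3, 4]
ok = [True, False, True, True]  # the second transfer fails
resetting, cumulative = receiver_totals(events, ok)
```

With these inputs the resetting count loses the 7 events from the failed interval, while the cumulative counter still accounts for all 19. The only thing lost is knowing how those events were split across the two intervals around the failure.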
  • 24. Prometheus Counters: Rate() There are many details to getting the rate() function right. Processes might restart. Scrapes might be missed. Time periods rarely align perfectly. Time series stop and start.
  • 25. Prometheus Counters: Resets Counters can be reset to 0, such as if a process restarts or a network switch gets rebooted. If we see the value decreasing, then that's a counter reset so presume it went to 0. So seeing 0 -> 10 -> 5 is 10 + 5 = 15 of an increase. Graphite/InfluxDB's nonNegativeDerivative() function would ignore the drop, report based on just the increase 10.
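The 0 -> 10 -> 5 example can be checked with a short Python sketch. Both functions are illustrative stand-ins written for this comparison, not code from Prometheus, Graphite or InfluxDB, and their names are made up.

```python
def increase_with_resets(values):
    """Total increase across samples, treating any drop as a reset to 0."""
    total = 0.0
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            total += cur         # counter restarted at 0 and climbed to cur
        else:
            total += cur - prev  # normal monotonic increase
    return total


def increase_ignoring_drops(values):
    """Total increase that skips any drop, in the style the slide attributes
    to nonNegativeDerivative()."""
    total = 0.0
    for prev, cur in zip(values, values[1:]):
        if cur >= prev:
            total += cur - prev
    return total


samples = [0, 10, 5]
with_resets = increase_with_resets(samples)        # 10 + 5
ignoring_drops = increase_ignoring_drops(samples)  # just the 10
```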
  • 26. Prometheus Counters: Missed scrapes If we miss a scrape in the middle of a time period, no problem as we still have the data points around it. Little more complicated around the edges of the time period we're looking at though.
  • 27. Prometheus Counters: Alignment It is rare that the user will request data exactly on the same cycle as the scrapes. Especially when you're monitoring multiple servers with staggered scrapes. Or given that timestamps are millisecond resolution, and the endpoint that graphs use accepts only second-granularity input. Thus we need to extrapolate out to the end of the rate()'s range.
  • 28. increase() can return non-integers on integral data This is why one of the more surprising behaviours of increase() happens. So if we have data which is: (t=1, 10), (t=6, 12), (t=11, 13). Request an increase() from t=0 for 15s, and you'll get an increase of 3 over 10s. Extrapolating over the 15s, that's a result of 4.5. This is the correct result on average. If you want exact answers, use logs.
  • 29. Non-integral increase due to extrapolation
  • 30. Prometheus Counters: Time series lifecycle Time series are created and destroyed. If we always extrapolated out to the edge of the rate() range we'd get much bigger results than we should. So we detect that. We calculate the average interval between data points. If the first/last data point is within 110% of the average interval from the start/end of the range, then we extrapolate to the start/end. Allows for failed scrapes. Otherwise we extrapolate 50% of the average interval. We also know counters can't go negative, so don't extrapolate before the point they'd be 0 at.
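The extrapolation rules on this slide can be sketched in Python as follows. This is a simplified reading of the slide, not the Prometheus source: the function name is hypothetical, counter resets are deliberately left out, and samples are assumed to be sorted, distinct in time and inside the range.

```python
def extrapolated_increase(samples, range_start, range_end):
    """Estimate a counter's increase over [range_start, range_end].

    samples: list of (timestamp_seconds, value) pairs, sorted by time.
    Returns None when there are too few points to see any increase.
    """
    if len(samples) < 2:
        return None
    (first_t, first_v), (last_t, last_v) = samples[0], samples[-1]
    raw = last_v - first_v     # increase actually observed
    sampled = last_t - first_t # time covered by the samples
    if sampled <= 0:
        return None
    avg_interval = sampled / (len(samples) - 1)

    to_start = first_t - range_start
    to_end = range_end - last_t

    # Counters can't go negative: don't extrapolate back past the point
    # where the counter would have been 0.
    if raw > 0 and first_v >= 0:
        to_zero = sampled * (first_v / raw)
        to_start = min(to_start, to_zero)

    # Within 110% of the average interval: extrapolate to the range edge.
    # Otherwise assume the series started/ended half an interval away.
    threshold = avg_interval * 1.1
    covered = sampled
    covered += to_start if to_start < threshold else avg_interval / 2
    covered += to_end if to_end < threshold else avg_interval / 2

    return raw * covered / sampled


# The data from slide 28: an observed increase of 3 over 10s, range of 15s.
slide_28 = extrapolated_increase([(1, 10), (6, 12), (11, 13)], 0, 15)
```

On the slide 28 data both edges are within 110% of the 5s average interval, so the observed increase of 3 is scaled from 10s to the full 15s, giving the non-integral 4.5.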
  • 32. Problem: Timeseries not always existing The previous logic handles all the edge cases around counter resets, process restarts and rolling restarts, on average. What if a counter appears with the value 1 though long after the process has started and doesn't increase again? No increase in the history, so rate() doesn't see it. Can't tell when the increase happened. Prometheus is designed to be available, not catch 100% of events. Solution: Logs, or make sure all your counters are being initialised on process start so it goes 0->1. Will only miss it prior to the first scrape then.
  • 33. Problem: Lag All these solutions produce results that lag the actual data - already seen with summarize(). A 5m Prometheus rate() at a given time, is really the average from 5 minutes ago to now. Similarly with resetting counters. Exponential moving averages more complicated to explain, same issue though. Always compare like with like, stick to one range for your rate()s.
  • 34. Client Implications The Prometheus Counter is very easy to implement, only need to increment a single number. Concurrency handling varies by language. Mutexes are the slowest, then atomics, then per-processor values - which the Prometheus Java client approximates. Dropwizard Meter has to increment 4 numbers and do the decay logic, so about 6x slower per benchmarks. Dropwizard Counter (which is really a Gauge, as it can go down) is as fast as Prometheus Counter.
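As a rough illustration of how little a counter needs on the client side, here is a minimal Python sketch guarded by a mutex. The `Counter` class below is hypothetical and is not the API of any Prometheus client library. It uses the simplest (and, per the slide, slowest) concurrency option rather than atomics or per-processor values.

```python
import threading


class Counter:
    """A minimal monotonic counter: a single number plus a lock."""

    def __init__(self):
        self._value = 0.0
        self._lock = threading.Lock()

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        with self._lock:
            self._value += amount

    def get(self):
        with self._lock:
            return self._value


requests_total = Counter()  # hypothetical metric name


def worker():
    for _ in range(1000):
        requests_total.inc()


# Four threads incrementing concurrently: no increments are lost.
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```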
  • 35. Other performance considerations Values for each label value (called a "Child") are in a map in each metric. That map lookup can be relatively expensive (~100ns), keep a pointer to the Child if that could matter. Need to know the labels you'll be using in advance though. Similarly, don't create a map from metric names to metric objects. Store metric objects as pointers in simple variables after you create them.
  • 36. Best Practices Use seconds for timing. Prometheus values are all floats, so developers don't need to choose and deal with a mix of ns, us, ms, s and minutes. increase() function handy for display, but similarly for consistency only use it for display. Use rate() for recording rules. increase() is only syntactic sugar for rate().
  • 37. irate(): The other rate function Prometheus also has irate(). This looks at the last two points in the given range, and returns how fast they're increasing per second. Great for seeing very up to date data. Can be hard to read if data is very spiky. Need to be careful you're not missing data.
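A sketch of the "last two points" idea in Python, under the assumption that a drop between those two points is treated as a counter reset in the same way as for rate(). The function name mirrors the PromQL function for readability only. This is not the Prometheus implementation.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples."""
    if len(samples) < 2:
        return None
    (prev_t, prev_v), (last_t, last_v) = samples[-2], samples[-1]
    elapsed = last_t - prev_t
    if elapsed <= 0:
        return None
    # A drop means the counter reset, so the increase is the new value itself.
    increase = last_v if last_v < prev_v else last_v - prev_v
    return increase / elapsed


# Only the last two points matter: (130 - 100) / 10s.
recent = irate([(0, 0), (10, 100), (20, 130)])
```

Everything before the final two samples is ignored, which is why the result is very current but can hide spikes that happened earlier in the range.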
  • 40. Steps and rate durations The query_range HTTP endpoint has a step parameter, this is the resolution of the data returned. If you have a 10m step and 5m rate, you're going to be ignoring half your data. To avoid this, make sure your rate range is at least your step for graphs. For irate(), your step should be no more than your sample resolution.
  • 41. Compound Types: Summary How to track average latency? With two counters! One for total requests (_count), one for total latency of those requests (_sum). Take the rates, divide and you have average latency. This is how the compound Summary metric works. It's a more convenient API over doing the above by hand. Some clients also offer quantiles. Beware, slow and unaggregatable.
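The "take the rates, divide" step works out as below in Python. The function name and the sample numbers are made up for illustration. Each argument is a pair of (timestamp_seconds, value) samples from the _sum and _count counters over the same window, and counter resets are not handled.

```python
def average_latency(sum_samples, count_samples):
    """Average latency over a window from _sum and _count counter samples.

    Each argument is ((t0, v0), (t1, v1)). Returns None if no requests
    completed in the window.
    """
    (s_t0, s_v0), (s_t1, s_v1) = sum_samples
    (c_t0, c_v0), (c_t1, c_v1) = count_samples
    rate_sum = (s_v1 - s_v0) / (s_t1 - s_t0)    # seconds of latency per second
    rate_count = (c_v1 - c_v0) / (c_t1 - c_t0)  # requests per second
    if rate_count == 0:
        return None
    return rate_sum / rate_count


# 30 requests took 6 seconds in total over a 60s window.
avg = average_latency(((0, 12.0), (60, 18.0)), ((0, 100), (60, 130)))
```

In PromQL the same calculation is the rate() of the _sum series divided by the rate() of the _count series over the same range.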
  • 42. Compound Types: Histogram Histogram also includes the _count and _sum. The main purpose is calculating quantiles in Prometheus. The histogram has buckets, which are counters. You can take the rate() of these, aggregate them and then use histogram_quantile() to calculate arbitrary quantiles. Be wary of cardinality explosion, use sparingly.
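To show why cumulative bucket counters are enough to estimate a quantile, here is a simplified Python sketch that interpolates linearly inside the bucket containing the requested rank. It assumes observations are spread evenly within a bucket and that the lowest bucket starts at 0. The function name and bucket values are hypothetical, and it is not a port of histogram_quantile().

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper bound,
    ending with an upper bound of math.inf.
    """
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must end with an infinite upper bound")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if upper == math.inf:
        # No upper edge to interpolate towards: fall back to the highest
        # finite bound, if there is one.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    lower = buckets[i - 1][0] if i > 0 else 0.0
    below = buckets[i - 1][1] if i > 0 else 0.0
    in_bucket = count - below
    if in_bucket == 0:
        return lower
    return lower + (upper - lower) * (rank - below) / in_bucket


# Illustrative latency buckets in seconds.
buckets = [(0.1, 10), (0.5, 30), (1.0, 40), (math.inf, 40)]
median = estimate_quantile(0.5, buckets)
```

Because the buckets are plain counters, the inputs here could equally be per-bucket rates summed across many servers, which is what makes histograms aggregatable where client-side quantiles are not.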
  • 43. Resources Official Project Website: prometheus.io; User Mailing List: prometheus-users@googlegroups.com; Developer Mailing List: prometheus-developers@googlegroups.com; Source code: https://github.com/prometheus/prometheus/blob/master/promql/functions.go; Robust Perception Blog: www.robustperception.io/blog