SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
What it is, what is new, and what’s coming
Prometheus
Julien Pivotto and Richard “RichiH” Hartmann
What it is
Prometheus
| 2
| 3
Prometheus 101
● Inspired by Google's Borgmon
● Time series database
● unit64 millisecond timestamp, float64 value
● Over 1000 community-created instrumentation & exporters
● Metrics, not logs
● Dashboarding via Grafana
| 4
Main selling points
● Highly dynamic, built-in service discovery
● No hierarchical model, n-dimensional label set
● PromQL: for processing, graphing, alerting, and export
● Simple operation
● Highly efficient
| 5
Main selling points
● Prometheus is a pull-based system
● Black-box monitoring: Looking at a service from the outside (Does
the server answer to HTTP requests?)
● White-box monitoring: Instrumenting code from the inside (How
much time does this subroutine take?)
● Every service should have its own metrics endpoint
● Hard API commitments within major versions
| 6
Time series
● Time series are recorded values that change over time
● Individual events are usually merged into counters and/or histograms
● Changing values are recorded as gauges
● Typical examples
○ Access rates to a web server (counter)
○ Temperatures in a data center (gauge)
○ Service latency (histograms)
| 7
Super easy to emit, parse & read
http_requests_total{env="prod",method="post",code="200"} 1027
http_requests_total{env="prod",method="post",code="400"} 3
http_requests_total{env="prod",method="post",code="500"} 12
http_requests_total{env="prod",method="get",code="200"} 20
http_requests_total{env="test",method="post",code="200"} 372
http_requests_total{env="test",method="post",code="400"} 75
| 8
Scale
● Kubernetes is ~Borg
● Prometheus is ~Borgmon, but with Monarch APIs
● Google couldn't have run Borg without Borgmon (and Omega and
Monarch)
● Kubernetes & Prometheus are designed and written with each other
in mind
| 9
Scale
● 2,500,000+ samples/second/instance
● 60,000+ samples/second/core
● 16 bytes/sample compressed to 1.36 bytes/sample
The highest we saw in production on a single Prometheus instance were
125,000,000 active times series at once!
| 10
Long-term storage
● Two long-term storage solutions have Prometheus-team members
working on them
○ Thanos
■ Historically easier to run, but slower
■ Scales storage horizontally
○ Cortex
■ Easy to run these days
■ Scales storage, ingester, and querier horizontally
● Both converge on tech again; I have annoyed people with “Corthanos”
for years
What is new?
Prometheus
| 11
Service discoveries
| 12
● In the last year, we added 5 new service discoveries
○ DigitalOcean
○ Scaleway
○ Hetzner
○ Eureka
○ Docker
○ Docker Swarm
| 13
Basic Authentication / TLS
● Prometheus has gained support for TLS/basic auth server side
● A new “exporter toolkit” has been created for the Go exporters
● TLS/Basic auth is being added to more and more exporters
PromQL
| 14
● New functions, like last_over_time
● @ modifier
○ rate(container_cpu_usage) and
topk(4, rate(container_cpu_usage[5m])) @ end()
● Negative offsets
○ up offset -5m
● Composite durations, e.g. 1h30m.
Remote Write receiver
| 15
● Prometheus can receive metrics from Remote Write
● Writing from one Prometheus server to another
● Enable new use cases, like Prometheus “on the edge”
UI
| 16
● Prometheus has switched to the React UI by default
● New fully-featured editor, with labels autocompletion, snippets
● Dark theme
Exemplars
| 17
http_request_seconds{le=”5.0”} 9036.32 # {trace_id="KOO5S4vxi0o"} 0.67
● Attach external data to metrics (trace ID)
● Easily jump from metrics to traces
● Grafana supports them
● Trace & span ID format taken from W3C tracing specs
Alertmanager
| 18
● Time-based muting
○ Do not send alerts in weekends / Out of business hours
○ Controlled per route
● Negative matchers
○ Silence alerts that do not match certain labels
What is coming
Prometheus
| 19
| 20
Aggressively open
● Historically, Prometheus has been conservative even with features
marked EXPERIMENTAL
○ We treated them as stable
● Revisiting a lot of old assumptions, and enabling more use cases
● Make our code more modular, easier to re-use
● Lots of work on Agents and data pipelines; support all deployment
and operating models in upstream https://github.com/prometheus
● Exporter toolkit to ease writing and operating them
○ Mixins out of the box
| 21
Aggressively open
● Current design docs
○ Prometheus Agent
○ Configuration handling in exporters and Prometheus 3.x
○ Improved joins in PromQL across different label names
Imitation is the sincerest form of flattery
Oscar Wilde
| 22
| 23
CNCF End User Survey on Observability
| 24
“PromQL-style syntax”
Jul 2020
“added support for PromQL”
Sept 2020
“Creating the PromQL Transpiler for Flux”
Jun 2019
“support the Prometheus
query language, PromQL”
Jun 2020
“Promscale supports ... PromQL”
Oct 2020
“AWS Introduces ... Amazon Managed
Service for Prometheus”
Jan 2021
Working towards full compatibility
Mar 2021
| 25
Tests, compliance, and compatibility
● CNCF, vendors, and projects have asked us for support
○ OpenMetrics specification
○ Prometheus Remote Write Specification v0.1
● Test suites
○ PromQL Compliance
○ Remote Write Compliance
○ OpenMetrics Compliance
○ Potentially TSDB & data correctness
| 26
Share your thoughts!
● All dev summits recorded and open to join
○ Prometheus dev summit, rolling document
○ Prometheus Community Meeting
○ Prometheus Storage Working Group
○ Prometheus Contributor Office Hours
● All public meetings in one calendar
● Prometheus Monitoring YouTube Channel
Thank you!
https://twitter.com/roidelapluie
https://twitter.com/TwitchiH

Contenu connexe

Tendances

Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
Eric D. Schabell
 

Tendances (20)

MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Introduction to Open Telemetry as Observability Library
Introduction to Open  Telemetry as Observability LibraryIntroduction to Open  Telemetry as Observability Library
Introduction to Open Telemetry as Observability Library
 
Distributed tracing 101
Distributed tracing 101Distributed tracing 101
Distributed tracing 101
 
OpenTelemetry For Architects
OpenTelemetry For ArchitectsOpenTelemetry For Architects
OpenTelemetry For Architects
 
Prometheus + Grafana = Awesome Monitoring
Prometheus + Grafana = Awesome MonitoringPrometheus + Grafana = Awesome Monitoring
Prometheus + Grafana = Awesome Monitoring
 
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)
 
Introduction to Grafana Loki
Introduction to Grafana LokiIntroduction to Grafana Loki
Introduction to Grafana Loki
 
Prometheus workshop
Prometheus workshopPrometheus workshop
Prometheus workshop
 
Infrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using PrometheusInfrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using Prometheus
 
Getting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaGetting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and Grafana
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For Operators
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
Monitoring microservices with Prometheus
Monitoring microservices with PrometheusMonitoring microservices with Prometheus
Monitoring microservices with Prometheus
 
Prometheus and Grafana
Prometheus and GrafanaPrometheus and Grafana
Prometheus and Grafana
 
Exploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesExploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on Kubernetes
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
 
Cloud Monitoring tool Grafana
Cloud Monitoring  tool Grafana Cloud Monitoring  tool Grafana
Cloud Monitoring tool Grafana
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
MySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & GrafanaMySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & Grafana
 

Similaire à Prometheus: What is is, what is new, what is coming

DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and Projects
Fedir RYKHTIK
 

Similaire à Prometheus: What is is, what is new, what is coming (20)

OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
 
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
 
Go at uber
Go at uberGo at uber
Go at uber
 
Kubernetes 1.12 Update and Container Security with Liz Rice
Kubernetes 1.12 Update and Container Security with Liz RiceKubernetes 1.12 Update and Container Security with Liz Rice
Kubernetes 1.12 Update and Container Security with Liz Rice
 
Free GitOps Workshop
Free GitOps WorkshopFree GitOps Workshop
Free GitOps Workshop
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basics
 
Apache Airflow in Production
Apache Airflow in ProductionApache Airflow in Production
Apache Airflow in Production
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3
 
Netflix Architecture and Open Source
Netflix Architecture and Open SourceNetflix Architecture and Open Source
Netflix Architecture and Open Source
 
Free GitOps Workshop (with Intro to Kubernetes & GitOps)
Free GitOps Workshop (with Intro to Kubernetes & GitOps)Free GitOps Workshop (with Intro to Kubernetes & GitOps)
Free GitOps Workshop (with Intro to Kubernetes & GitOps)
 
Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015
 
Intro to open source observability with grafana, prometheus, loki, and tempo(...
Intro to open source observability with grafana, prometheus, loki, and tempo(...Intro to open source observability with grafana, prometheus, loki, and tempo(...
Intro to open source observability with grafana, prometheus, loki, and tempo(...
 
Kubernetes - how to orchestrate containers
Kubernetes - how to orchestrate containersKubernetes - how to orchestrate containers
Kubernetes - how to orchestrate containers
 
Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)Prometheus (Microsoft, 2016)
Prometheus (Microsoft, 2016)
 
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentWebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
 
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
 
Integrating Puppet and Gitolite for sysadmins cooperations
Integrating Puppet and Gitolite for sysadmins cooperationsIntegrating Puppet and Gitolite for sysadmins cooperations
Integrating Puppet and Gitolite for sysadmins cooperations
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and Projects
 
Promise of DevOps
Promise of DevOpsPromise of DevOps
Promise of DevOps
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps Workshop
 

Plus de Julien Pivotto

Plus de Julien Pivotto (20)

The O11y Toolkit
The O11y ToolkitThe O11y Toolkit
The O11y Toolkit
 
What's New in Prometheus and Its Ecosystem
What's New in Prometheus and Its EcosystemWhat's New in Prometheus and Its Ecosystem
What's New in Prometheus and Its Ecosystem
 
What's new in Prometheus?
What's new in Prometheus?What's new in Prometheus?
What's new in Prometheus?
 
Why you should revisit mgmt
Why you should revisit mgmtWhy you should revisit mgmt
Why you should revisit mgmt
 
Observing the HashiCorp Ecosystem From Prometheus
Observing the HashiCorp Ecosystem From PrometheusObserving the HashiCorp Ecosystem From Prometheus
Observing the HashiCorp Ecosystem From Prometheus
 
Monitoring in a fast-changing world with Prometheus
Monitoring in a fast-changing world with PrometheusMonitoring in a fast-changing world with Prometheus
Monitoring in a fast-changing world with Prometheus
 
5 tips for Prometheus Service Discovery
5 tips for Prometheus Service Discovery5 tips for Prometheus Service Discovery
5 tips for Prometheus Service Discovery
 
Prometheus and TLS - an Introduction
Prometheus and TLS - an IntroductionPrometheus and TLS - an Introduction
Prometheus and TLS - an Introduction
 
Powerful graphs in Grafana
Powerful graphs in GrafanaPowerful graphs in Grafana
Powerful graphs in Grafana
 
YAML Magic
YAML MagicYAML Magic
YAML Magic
 
HAProxy as Egress Controller
HAProxy as Egress ControllerHAProxy as Egress Controller
HAProxy as Egress Controller
 
Improved alerting with Prometheus and Alertmanager
Improved alerting with Prometheus and AlertmanagerImproved alerting with Prometheus and Alertmanager
Improved alerting with Prometheus and Alertmanager
 
SIngle Sign On with Keycloak
SIngle Sign On with KeycloakSIngle Sign On with Keycloak
SIngle Sign On with Keycloak
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaboration
 
Incident Resolution as Code
Incident Resolution as CodeIncident Resolution as Code
Incident Resolution as Code
 
Monitor your CentOS stack with Prometheus
Monitor your CentOS stack with PrometheusMonitor your CentOS stack with Prometheus
Monitor your CentOS stack with Prometheus
 
Monitor your CentOS stack with Prometheus
Monitor your CentOS stack with PrometheusMonitor your CentOS stack with Prometheus
Monitor your CentOS stack with Prometheus
 
An introduction to Ansible
An introduction to AnsibleAn introduction to Ansible
An introduction to Ansible
 
Jsonnet
JsonnetJsonnet
Jsonnet
 
Cfgmgmt Challenges aren't technical anymore
Cfgmgmt Challenges aren't technical anymoreCfgmgmt Challenges aren't technical anymore
Cfgmgmt Challenges aren't technical anymore
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Prometheus: What is is, what is new, what is coming

  • 1. What it is, what is new, and what’s coming Prometheus Julien Pivotto and Richard “RichiH” Hartmann
  • 3. | 3 Prometheus 101 ● Inspired by Google's Borgmon ● Time series database ● unit64 millisecond timestamp, float64 value ● Over 1000 community-created instrumentation & exporters ● Metrics, not logs ● Dashboarding via Grafana
  • 4. | 4 Main selling points ● Highly dynamic, built-in service discovery ● No hierarchical model, n-dimensional label set ● PromQL: for processing, graphing, alerting, and export ● Simple operation ● Highly efficient
  • 5. | 5 Main selling points ● Prometheus is a pull-based system ● Black-box monitoring: Looking at a service from the outside (Does the server answer to HTTP requests?) ● White-box monitoring: Instrumenting code from the inside (How much time does this subroutine take?) ● Every service should have its own metrics endpoint ● Hard API commitments within major versions
  • 6. | 6 Time series ● Time series are recorded values that change over time ● Individual events are usually merged into counters and/or histograms ● Changing values are recorded as gauges ● Typical examples ○ Access rates to a web server (counter) ○ Temperatures in a data center (gauge) ○ Service latency (histograms)
  • 7. | 7 Super easy to emit, parse & read http_requests_total{env="prod",method="post",code="200"} 1027 http_requests_total{env="prod",method="post",code="400"} 3 http_requests_total{env="prod",method="post",code="500"} 12 http_requests_total{env="prod",method="get",code="200"} 20 http_requests_total{env="test",method="post",code="200"} 372 http_requests_total{env="test",method="post",code="400"} 75
  • 8. | 8 Scale ● Kubernetes is ~Borg ● Prometheus is ~Borgmon, but with Monarch APIs ● Google couldn't have run Borg without Borgmon (and Omega and Monarch) ● Kubernetes & Prometheus are designed and written with each other in mind
  • 9. | 9 Scale ● 2,500,000+ samples/second/instance ● 60,000+ samples/second/core ● 16 bytes/sample compressed to 1.36 bytes/sample The highest we saw in production on a single Prometheus instance were 125,000,000 active times series at once!
  • 10. | 10 Long-term storage ● Two long-term storage solutions have Prometheus-team members working on them ○ Thanos ■ Historically easier to run, but slower ■ Scales storage horizontally ○ Cortex ■ Easy to run these days ■ Scales storage, ingester, and querier horizontally ● Both converge on tech again; I have annoyed people with “Corthanos” for years
  • 12. Service discoveries | 12 ● In the last year, we added 5 new service discoveries ○ DigitalOcean ○ Scaleway ○ Hetzner ○ Eureka ○ Docker ○ Docker Swarm
  • 13. | 13 Basic Authentication / TLS ● Prometheus has gained support for TLS/basic auth server side ● A new “exporter toolkit” has been created for the Go exporters ● TLS/Basic auth is being added to more and more exporters
  • 14. PromQL | 14 ● New functions, like last_over_time ● @ modifier ○ rate(container_cpu_usage) and topk(4, rate(container_cpu_usage[5m])) @ end() ● Negative offsets ○ up offset -5m ● Composite durations, e.g. 1h30m.
  • 15. Remote Write receiver | 15 ● Prometheus can receive metrics from Remote Write ● Writing from one Prometheus server to another ● Enable new use cases, like Prometheus “on the edge”
  • 16. UI | 16 ● Prometheus has switched to the React UI by default ● New fully-featured editor, with labels autocompletion, snippets ● Dark theme
  • 17. Exemplars | 17 http_request_seconds{le=”5.0”} 9036.32 # {trace_id="KOO5S4vxi0o"} 0.67 ● Attach external data to metrics (trace ID) ● Easily jump from metrics to traces ● Grafana supports them ● Trace & span ID format taken from W3C tracing specs
  • 18. Alertmanager | 18 ● Time-based muting ○ Do not send alerts in weekends / Out of business hours ○ Controlled per route ● Negative matchers ○ Silence alerts that do not match certain labels
  • 20. | 20 Aggressively open ● Historically, Prometheus has been conservative even with features marked EXPERIMENTAL ○ We treated them as stable ● Revisiting a lot of old assumptions, and enabling more use cases ● Make our code more modular, easier to re-use ● Lots of work on Agents and data pipelines; support all deployment and operating models in upstream https://github.com/prometheus ● Exporter toolkit to ease writing and operating them ○ Mixins out of the box
  • 21. | 21 Aggressively open ● Current design docs ○ Prometheus Agent ○ Configuration handling in exporters and Prometheus 3.x ○ Improved joins in PromQL across different label names
  • 22. Imitation is the sincerest form of flattery Oscar Wilde | 22
  • 23. | 23 CNCF End User Survey on Observability
  • 24. | 24 “PromQL-style syntax” Jul 2020 “added support for PromQL” Sept 2020 “Creating the PromQL Transpiler for Flux” Jun 2019 “support the Prometheus query language, PromQL” Jun 2020 “Promscale supports ... PromQL” Oct 2020 “AWS Introduces ... Amazon Managed Service for Prometheus” Jan 2021 Working towards full compatibility Mar 2021
  • 25. | 25 Tests, compliance, and compatibility ● CNCF, vendors, and projects have asked us for support ○ OpenMetrics specification ○ Prometheus Remote Write Specification v0.1 ● Test suites ○ PromQL Compliance ○ Remote Write Compliance ○ OpenMetrics Compliance ○ Potentially TSDB & data correctness
  • 26. | 26 Share your thoughts! ● All dev summits recorded and open to join ○ Prometheus dev summit, rolling document ○ Prometheus Community Meeting ○ Prometheus Storage Working Group ○ Prometheus Contributor Office Hours ● All public meetings in one calendar ● Prometheus Monitoring YouTube Channel