SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
Time Series in Prometheus
Fabian Reinartz – Engineer, SoundCloud Ltd.
prometheus.io
...
http_requests_total{status="200",method="GET"} @1434317560938 94355
http_requests_total{status="200",method="GET"} @1434317561287 94934
http_requests_total{status="200",method="GET"} @1434317562344 96483
http_requests_total{status="404",method="GET"} @1434317560938 38473
http_requests_total{status="404",method="GET"} @1434317561249 38544
http_requests_total{status="404",method="GET"} @1434317562588 38663
http_requests_total{status="200",method="POST"} @1434317560885 4748
http_requests_total{status="200",method="POST"} @1434317561483 4795
http_requests_total{status="200",method="POST"} @1434317562589 4833
http_requests_total{status="404",method="POST"} @1434317560939 122
...
Prometheus Metrics
Metric name Labels Timestamp Sample Value
● 1 million time series
● 10 second sample resolution
● 64bit timestamp + 64bit value
Requirements
100,000 samples/sec
time [~weeks]
series
[~millions]
Writes
The Fundamental Problem
Orthogonal write and read patterns.
Reads
...
http_requests_total{status="200",method="GET"} @1434317560938 94355
http_requests_total{status="200",method="GET"} @1434317561287 94934
http_requests_total{status="200",method="GET"} @1434317562344 96483
http_requests_total{status="404",method="GET"} @1434317560938 38473
http_requests_total{status="404",method="GET"} @1434317561249 38544
http_requests_total{status="404",method="GET"} @1434317562588 38663
http_requests_total{status="200",method="POST"} @1434317560885 4748
http_requests_total{status="200",method="POST"} @1434317561483 4795
http_requests_total{status="200",method="POST"} @1434317562589 4833
http_requests_total{status="404",method="POST"} @1434317560939 122
...
Prometheus Metrics
Key-Value store (with BigTable semantics) seems suitable.
Metric name Labels Timestamp Sample Value
VALUEKEY
Ingestion
PromQL
Storage
in-memory data
append(series, time, value)
series iterators
HDD / SSD
LevelDB
Encode
Decode
Compress Decompress
http_requests_total{status="200",method="GET"}
Prometheus Metrics
Metric name Labels
{__name__="http_requests_total",status="200",method="GET"}
Labels
Prometheus Metrics
Learning the hard way
__name__ = http_requests_total
status = 200
method = GET
fnv(sort( )
fnv( __name__ = http_requests_total )
fnv( status = 200 )
fnv( method = GET )
⊕
⊕
1KB chunks
chunk in memory
[complete and immutable]
head chunk
[incomplete]
Sample
Ingestion
append(series, time, value)
memory
disk
evictable chunks (LRU)
chunk on disk
[complete and immutable]
PromQL
series iterator
one file per time series
series hash:
Series maintenance
memory
disk
older than retention time
Chunk preloading
memory
disk
PromQL
series iterator
basetime
Anatomy of a chunk [v0]5bytesheader
basevalue
valuetime
valuetime
valuetime
valuetime
... (one per timestamp)
... (one per value)
1000000 1441558420098
1001050 1441558432221
1002040 1441558444311
1002040
1441558444311
1000000
1441558420098
1001050
1441558432221
basetime
Anatomy of a chunk [v1]5bytesheader
basevalue
valuetime
valuetime
valuetime
valuetime
... (one per timestamp)
... (one per value)
1000000 1441558420098
1050 12123
2040 24213
3100 36313
4250 48500
1002040
1441558444311
1000000
1441558420098
1001050
1441558432221
+12123
+1050
+12090
+1000
basetime
Anatomy of a chunk [v2]5bytesheader
basevalue
valuetime
valuetime
valuetime
valuetime
... (one per timestamp)
... (one per value)
1002040
1441558444311
1000000
1441558420098
1001050
1441558432221
1000000 1441558420098
1050 12123
-60 -33
-50 -56
+50 -8
+12123
+1050
+12090
+1000
basetime
Anatomy of a chunk [v2]5bytesheader
basevalue
valuetime
valuetime
valuetime
valuetime
... (one per timestamp)
... (one per value)
13:14 < nostrovsk> Hey guys, Looking for a sanity check here
13:15 < nostrovsk> 500 machines per server, each running node and jmx exporters, for 1 week is
only 30gb of data?
13:36 <@ bbrazil> what's your scrape rate and how heavy are those jmx exporters?
13:37 <@ bbrazil> doesn't sound implausible to me
13:42 <@ bbrazil> we're 25GB/two weeks with ~5k samples/s
13:45 <@ beorn7> Compression, it works... ;)
13:53 < fish_> beorn7: nothing says better 'good job' than people coming to this channel
because they can't believe that things are soo good :)
rate(prometheus_local_storage_ingested_samples_total[1m])
Checkpointing
On shutdown and regularly to limit data loss in case of a crash.
memory
disk
checkpoint file
Prometheus Storage

Contenu connexe

Tendances

Tendances (20)

Terraform: An Overview & Introduction
Terraform: An Overview & IntroductionTerraform: An Overview & Introduction
Terraform: An Overview & Introduction
 
Terraform
TerraformTerraform
Terraform
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
Terraform Introduction
Terraform IntroductionTerraform Introduction
Terraform Introduction
 
End to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max IndenEnd to-end monitoring with the prometheus operator - Max Inden
End to-end monitoring with the prometheus operator - Max Inden
 
Infrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using PrometheusInfrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using Prometheus
 
Comprehensive Terraform Training
Comprehensive Terraform TrainingComprehensive Terraform Training
Comprehensive Terraform Training
 
How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?
 
Observability: Beyond the Three Pillars with Spring
Observability: Beyond the Three Pillars with SpringObservability: Beyond the Three Pillars with Spring
Observability: Beyond the Three Pillars with Spring
 
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
Infrastructure-as-Code (IaC) Using Terraform (Advanced Edition)
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With Prometheus
 
MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as Code
 
MySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & GrafanaMySQL Monitoring using Prometheus & Grafana
MySQL Monitoring using Prometheus & Grafana
 
Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)
Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)
Monitoring Hadoop with Prometheus (Hadoop User Group Ireland, December 2015)
 
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
Using HashiCorp’s Terraform to build your infrastructure on AWS - Pop-up Loft...
 
Kyma: Extending Business systems with Kubernetes, Istio and <fill the blank>.
Kyma: Extending Business systems with Kubernetes, Istio and <fill the blank>.Kyma: Extending Business systems with Kubernetes, Istio and <fill the blank>.
Kyma: Extending Business systems with Kubernetes, Istio and <fill the blank>.
 
Prometheus – a next-gen Monitoring System
Prometheus – a next-gen Monitoring SystemPrometheus – a next-gen Monitoring System
Prometheus – a next-gen Monitoring System
 
Monitoring with prometheus
Monitoring with prometheusMonitoring with prometheus
Monitoring with prometheus
 

Similaire à Prometheus Storage

Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.
David Tollmyr
 
Chapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab FunctionsChapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab Functions
Siva Gopal
 
Introduction to Apache Spark / PUT 06.2014
Introduction to Apache Spark / PUT 06.2014Introduction to Apache Spark / PUT 06.2014
Introduction to Apache Spark / PUT 06.2014
bbogacki
 
Chapter 5 - Plotting
Chapter 5 - PlottingChapter 5 - Plotting
Chapter 5 - Plotting
Siva Gopal
 

Similaire à Prometheus Storage (20)

Prometheus – Storage
Prometheus – StoragePrometheus – Storage
Prometheus – Storage
 
Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.Gotcha! Ruby things that will come back to bite you.
Gotcha! Ruby things that will come back to bite you.
 
Debugging TV Frame 0x24
Debugging TV Frame 0x24Debugging TV Frame 0x24
Debugging TV Frame 0x24
 
DB2 Workload Manager Histograms
DB2 Workload Manager HistogramsDB2 Workload Manager Histograms
DB2 Workload Manager Histograms
 
Chapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab FunctionsChapter 3 -Built-in Matlab Functions
Chapter 3 -Built-in Matlab Functions
 
Shrimp: A Rather Practical Example Of Application Development With RESTinio a...
Shrimp: A Rather Practical Example Of Application Development With RESTinio a...Shrimp: A Rather Practical Example Of Application Development With RESTinio a...
Shrimp: A Rather Practical Example Of Application Development With RESTinio a...
 
Bayside pv gearhead_brochure
Bayside pv gearhead_brochureBayside pv gearhead_brochure
Bayside pv gearhead_brochure
 
Czzawk
CzzawkCzzawk
Czzawk
 
Introduction to Apache Spark / PUT 06.2014
Introduction to Apache Spark / PUT 06.2014Introduction to Apache Spark / PUT 06.2014
Introduction to Apache Spark / PUT 06.2014
 
Event-Driven Microservices: Back to the Basics
Event-Driven Microservices: Back to the BasicsEvent-Driven Microservices: Back to the Basics
Event-Driven Microservices: Back to the Basics
 
How and Why Prometheus' New Storage Engine Pushes the Limits of Time Series D...
How and Why Prometheus' New Storage Engine Pushes the Limits of Time Series D...How and Why Prometheus' New Storage Engine Pushes the Limits of Time Series D...
How and Why Prometheus' New Storage Engine Pushes the Limits of Time Series D...
 
6. Vectors – Data Frames
6. Vectors – Data Frames6. Vectors – Data Frames
6. Vectors – Data Frames
 
제 6회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 6회 엑셈 수요 세미나 자료 연구컨텐츠팀제 6회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 6회 엑셈 수요 세미나 자료 연구컨텐츠팀
 
Data Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes backData Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes back
 
Performance and stability testing \w Gatling
Performance and stability testing \w GatlingPerformance and stability testing \w Gatling
Performance and stability testing \w Gatling
 
Chapter 5 - Plotting
Chapter 5 - PlottingChapter 5 - Plotting
Chapter 5 - Plotting
 
Varnish Caching
Varnish CachingVarnish Caching
Varnish Caching
 
OSMC 2016 - Alerting with Time Series by Fabian Reinartz
OSMC 2016 - Alerting with Time Series by Fabian ReinartzOSMC 2016 - Alerting with Time Series by Fabian Reinartz
OSMC 2016 - Alerting with Time Series by Fabian Reinartz
 
OSMC 2016 | Alerting with Time Series by Fabian Reinartz
OSMC 2016 | Alerting with Time Series by Fabian ReinartzOSMC 2016 | Alerting with Time Series by Fabian Reinartz
OSMC 2016 | Alerting with Time Series by Fabian Reinartz
 
OpenWorld 2018 - Common Application Developer Disasters
OpenWorld 2018 - Common Application Developer DisastersOpenWorld 2018 - Common Application Developer Disasters
OpenWorld 2018 - Common Application Developer Disasters
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 

Prometheus Storage

  • 1. Time Series in Prometheus Fabian Reinartz – Engineer, SoundCloud Ltd.
  • 3.
  • 4. ... http_requests_total{status="200",method="GET"} @1434317560938 94355 http_requests_total{status="200",method="GET"} @1434317561287 94934 http_requests_total{status="200",method="GET"} @1434317562344 96483 http_requests_total{status="404",method="GET"} @1434317560938 38473 http_requests_total{status="404",method="GET"} @1434317561249 38544 http_requests_total{status="404",method="GET"} @1434317562588 38663 http_requests_total{status="200",method="POST"} @1434317560885 4748 http_requests_total{status="200",method="POST"} @1434317561483 4795 http_requests_total{status="200",method="POST"} @1434317562589 4833 http_requests_total{status="404",method="POST"} @1434317560939 122 ... Prometheus Metrics Metric name Labels Timestamp Sample Value
  • 5. ● 1 million time series ● 10 second sample resolution ● 64bit timestamp + 64bit value Requirements 100,000 samples/sec
  • 6. time [~weeks] series [~millions] Writes The Fundamental Problem Orthogonal write and read patterns. Reads
  • 7. ... http_requests_total{status="200",method="GET"} @1434317560938 94355 http_requests_total{status="200",method="GET"} @1434317561287 94934 http_requests_total{status="200",method="GET"} @1434317562344 96483 http_requests_total{status="404",method="GET"} @1434317560938 38473 http_requests_total{status="404",method="GET"} @1434317561249 38544 http_requests_total{status="404",method="GET"} @1434317562588 38663 http_requests_total{status="200",method="POST"} @1434317560885 4748 http_requests_total{status="200",method="POST"} @1434317561483 4795 http_requests_total{status="200",method="POST"} @1434317562589 4833 http_requests_total{status="404",method="POST"} @1434317560939 122 ... Prometheus Metrics Key-Value store (with BigTable semantics) seems suitable. Metric name Labels Timestamp Sample Value VALUEKEY
  • 8.
  • 9. Ingestion PromQL Storage in-memory data append(series, time, value) series iterators HDD / SSD LevelDB Encode Decode Compress Decompress
  • 10.
  • 11. http_requests_total{status="200",method="GET"} Prometheus Metrics Metric name Labels {__name__="http_requests_total",status="200",method="GET"} Labels
  • 12. Prometheus Metrics Learning the hard way __name__ = http_requests_total status = 200 method = GET fnv(sort( ) fnv( __name__ = http_requests_total ) fnv( status = 200 ) fnv( method = GET ) ⊕ ⊕
  • 13. 1KB chunks chunk in memory [complete and immutable] head chunk [incomplete] Sample Ingestion append(series, time, value) memory disk evictable chunks (LRU) chunk on disk [complete and immutable] PromQL series iterator one file per time series series hash:
  • 16. basetime Anatomy of a chunk [v0]5bytesheader basevalue valuetime valuetime valuetime valuetime ... (one per timestamp) ... (one per value) 1000000 1441558420098 1001050 1441558432221 1002040 1441558444311 1002040 1441558444311 1000000 1441558420098 1001050 1441558432221
  • 17. basetime Anatomy of a chunk [v1]5bytesheader basevalue valuetime valuetime valuetime valuetime ... (one per timestamp) ... (one per value) 1000000 1441558420098 1050 12123 2040 24213 3100 36313 4250 48500 1002040 1441558444311 1000000 1441558420098 1001050 1441558432221 +12123 +1050 +12090 +1000
  • 18. basetime Anatomy of a chunk [v2]5bytesheader basevalue valuetime valuetime valuetime valuetime ... (one per timestamp) ... (one per value) 1002040 1441558444311 1000000 1441558420098 1001050 1441558432221 1000000 1441558420098 1050 12123 -60 -33 -50 -56 +50 -8 +12123 +1050 +12090 +1000
  • 19. basetime Anatomy of a chunk [v2]5bytesheader basevalue valuetime valuetime valuetime valuetime ... (one per timestamp) ... (one per value) 13:14 < nostrovsk> Hey guys, Looking for a sanity check here 13:15 < nostrovsk> 500 machines per server, each running node and jmx exporters, for 1 week is only 30gb of data? 13:36 <@ bbrazil> what's your scrape rate and how heavy are those jmx exporters? 13:37 <@ bbrazil> doesn't sound implausible to me 13:42 <@ bbrazil> we're 25GB/two weeks with ~5k samples/s 13:45 <@ beorn7> Compression, it works... ;) 13:53 < fish_> beorn7: nothing says better 'good job' than people coming to this channel because they can't believe that things are soo good :)
  • 20.
  • 22. Checkpointing On shutdown and regularly to limit data loss in case of a crash. memory disk checkpoint file