SlideShare a Scribd company logo
1 of 48
Download to read offline
Build Fast, Scalable
App Monitoring with
Open Source
Robert Hodges - Altinity
Roman Khavronenko - VictoriaMetrics
1
Let’s make some introductions
2
Robert Hodges
Database geek with 30+ years
on DBMS systems. Day job:
CEO at Altinity
Roman Khavronenko
Distributed systems and
monitoring engineer. Day job:
SE at VictoriaMetrics
What is
application
monitoring?
3
Monitoring is for answering questions
● Why users are getting errors?
● When it started?
● How many users are affected?
● Which service is failing?
4
To get an answer to the question you need 3 things
1. The question
2. The information to process
3. The respondent
5
6
7
8
9
10
Using
VictoriaMetrics
11
VictoriaMetrics - Open Source Time Series Database & Monitoring Solution
● Vertically and horizontally scalable
● Operational simplicity
● Cost-efficient
● Prometheus compatible
● Free forever
12
VictoriaMetrics - Open Source Time Series Database & Monitoring Solution
● Kubernetes monitoring
● Hardware and infrastructure monitoring
● Application Performance Monitoring (APM)
● IoT
● Edge computing
● Alerting
13
14
Metric is a numeric measure or observation of something:
● Number of served requests
● Requests latency
● CPU or memory usage
● Occupied or free disk space
What is a metric?
15
Metrics structure
16
Storage for metrics
17
● VictoriaMetrics data model is schemaless
● No need to define metric names or their labels in advance
● User is free to add or change ingested metrics anytime.
Storage for metrics
18
OSA Con 2021: How ClickHouse Inspired Us
to Build a High Performance TSDB
● VictoriaMetrics is specialized solution for time series data
● Compression reaches 0.4 Bytes per sample
● Ingestions speed 300k samples/s per CPU core
● Scanning speed 50Mil samples/s per CPU core
19
> curl https://my.application/metrics
requests_total{path="/",code="200"}10
requests_total{path="/",code="240300"}1
20
> curl -d "requests_total{path="/",code="200"} 10" -X POST
http://victoriametrics/api/v1/import/prometheus
21
More than one protocol for metrics
● Prometheus remote write API.
● Prometheus text exposition format.
● DataDog protocol.
● InfluxDB line protocol over HTTP, TCP and UDP.
● Graphite plaintext protocol with tags.
● OpenTSDB put message.
● HTTP OpenTSDB /api/put requests.
● JSON line format.
● Arbitrary CSV data.
22
Querying via MetricsQL
23
Querying via MetricsQL
24
Demo time!
● Run VictoriaMetrics
● Write some metrics
● Execute read queries
25
Frequently asked questions
● Can I monitor MySQL Server, Postgres, MongoDB, ClickHouse?
○ Yes, there are plenty of exporters, dashboards and alerting rules there.
● Can I monitor my applications?
○ Yes, there are libraries for multiple programming languages to instrument the application with
metrics.
● How expensive monitoring is?
○ With VictoriaMetrics, cost of storing metrics from 100 instances, each instance emits 1000
metrics every 30s for the total cost will be:
■ 100GB of disk space $0.045 per GB-month: 100*0.045*12 = $54
■ One t3.medium instance, $0.0418 per hour: 0.0418*730*12 = $366
■ Total: $420 per year for monitoring 100 instances.
● Can I run it in Kubernetes?
○ Sure! We have k8s operator and helm charts for VictoriaMetrics!
26
Using
ClickHouse
27
ClickHouse: a real-time analytic database
It understands SQL
It’s Apache 2.0
It handles many use cases beyond monitoring
It also handles time series data very well
28
ClickHouse optimizes for fast response on large datasets
29
Highly compressed column
storage with indexing
Automatic replication
between nodes
SELECT host, avg(idle)
FROM vmstat GROUP BY host
Parallelized/vectorized
query
Table replica
ClickHouse can load millions of events per second
30
Unaggregated
event data
Source data table(s)
Parallel load
Event
Queue
(Kafka)
Custom
Application
Data Lake
(S3, HDFS)
Precomputed
aggregates
Precomputed
aggregates
Precomputed
aggregates
Materialized views
Instantly queryable
…And supports [many] dozens of input formats
31
INSERT INTO some_table Format <format>
TabSeparated
TabSeparatedWithNames
CSV
CSVWithNames
CustomSeparated
Values
JSON
JSONEachRow
Protobuf
Parquet
...
There are many ways to store and manipulate time data
Date -- Precision to day
DateTime -- Precision to second
DateTime64 -- Precision to
nanosecond
toYear(), toMonth(), toWeek(),
toDayOfWeek, toDay(), toHour(), ...
toStartOfYear(), toStartOfQuarter(),
toStartOfMonth(), toStartOfHour(),
toStartOfMinute(), …, toStartOfInterval()
toYYYYMM()
toYYYYMMDD()
toYYYYMMDDhhmmsss()
And many more!
32
BI tools like Grafana like
DateTime values
Let’s build a simple host monitoring system
33
$ vmstat 1 -n
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 166912 2645740 36792 3360652 0 0 3 101 1 1 2 1 98 0 0
1 0 166912 2645360 36792 3360652 0 0 0 0 1182 3986 7 1 93 0 0
ClickHouse
Grafana
Dashboard
Step 1: Generate vmstat data
34
#!/usr/bin/env python3
import datetime, json, socket, subprocess
host = socket.gethostname()
with subprocess.Popen(['vmstat', '-n', '1'], stdout=subprocess.PIPE) as proc:
proc.stdout.readline() # discard first line
header_names = proc.stdout.readline().decode().split()
values = proc.stdout.readline().decode()
while values != '' and proc.poll() is None:
dict = {}
dict['timestamp'] = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
dict['host'] = host
for (header, value) in zip(header_names, values.split()):
dict[header] = int(value)
print(json.dumps(dict), flush=True)
values = proc.stdout.readline().decode()
Here’s the output
35
{"timestamp": "2023-01-22 18:13:16", "host": "logos3", "r": 0, "b":
0, "swpd": 166912, "free": 2523688, "buff": 41412, "cache": 3408292,
"si": 0, "so": 0, "bi": 3, "bo": 101, "in": 1, "cs": 0, "us": 2,
"sy": 1, "id": 98, "wa": 0, "st": 0}
{"timestamp": "2023-01-22 18:13:17", "host": "logos3", "r": 0, "b":
0, "swpd": 166912, "free": 2523696, "buff": 41412, "cache": 3408316,
"si": 0, "so": 0, "bi": 0, "bo": 216, "in": 1214, "cs": 4320, "us":
1, "sy": 1, "id": 98, "wa": 0, "st": 0}
{"timestamp": "2023-01-22 18:13:18", "host": "logos3", "r": 0, "b":
0, "swpd": 166912, "free": 2527120, "buff": 41412, "cache": 3408572,
"si": 0, "so": 0, "bi": 0, "bo": 0, "in": 1172, "cs": 4162, "us": 2,
"sy": 1, "id": 98, "wa": 0, "st": 0}
Step 2: Design a ClickHouse table to hold data
36
CREATE TABLE monitoring.vmstat (
timestamp DateTime,
day UInt32 default toYYYYMMDD(timestamp),
host String,
r UInt64, b UInt64, -- procs
swpd UInt64, free UInt64, buff UInt64, cache UInt64, -- memory
si UInt64, so UInt64, -- swap
bi UInt64, bo UInt64, -- io
in UInt64, cs UInt64, -- system
us UInt64, sy UInt64, id UInt64, wa UInt64, st UInt64 -- cpu
) ENGINE=MergeTree
PARTITION BY day
ORDER BY (host, timestamp)
Dimensions
Measurements
Step 3: Load data into ClickHouse
37
INSERT INTO vmstat Format JSONEachRow
E.g.
INSERT='INSERT%20INTO%20vmstat%20Format%20JSONEachRow'
cat vmstat.dat | curl -X POST --data-binary @- 
"http://logos3:8123/?database=monitoring&query=${INSERT}"
(Or a Python script)
Step 4: Build a Grafana dashboard to show results
38
ClickHouse data source for Grafana Altinity plugin for ClickHouse
After loading you can go crazy with analytical queries
39
SELECT host, count() AS loaded_minutes
FROM (
SELECT
toStartOfMinute(timestamp) AS minute, host, avg(100 - id) AS load
FROM monitoring.vmstat
WHERE timestamp > (now() - toIntervalDay(1))
GROUP BY minute, host HAVING load > 25
)
GROUP BY host ORDER BY loaded_minutes DESC
┌─host───┬─loaded_minutes─┐
│ logos3 │ 6 │
│ logos2 │ 5 │
└────────┴────────────────┘
2 hosts had > 25% load for at least
a minute in the last 24 hours
40
DEMO TIME!
Can ClickHouse store data in a “schemaless” way?
{{"timestamp":
"2023-01-23
19:53:14",
"host": "logos3",
...}
SQL Table
JSON
String
JSON String (“blob”) with
derived header values
One table can handle
many entity types!
41
More schemaless ways to store data
SQL Table
Array
of
Keys
Arrays: Header values
with key-value pairs
Array
of
Values
SQL Table
Map
with
Key/Values
Map: Header values &
key value pairs
SQL Table
JSON
Data
Type
JSON data type mapped to
column storage
42
Where is the software to build monitoring?
43
Event streaming
● Apache Kafka
● Apache Pulsar
● Vectorized Redpanda
ELT
● Apache Airflow
● Rudderstack
Rendering/Display
● Apache Superset
● Cube.js
● Grafana
Client Libraries
● C++ - ClickHouse CPP
● Golang - ClickHouse Go
● Java - ClickHouse JDBC
● Javascript/Node.js - Apla
● ODBC - ODBC Driver for ClickHouse
● Python - ClickHouse Driver, ClickHouse
SQLAlchemy
More client library links HERE
Kubernetes
● Altinity Operator for ClickHouse
Where can I find out more about ClickHouse?
ClickHouse official docs – https://clickhouse.com/docs/
Altinity Blog – https://altinity.com/blog/
Altinity Youtube Channel –
https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA
Altinity Knowledge Base – https://kb.altinity.com/
Meetups, other blogs, and external resources. Use your powers of Search!
44
Wrap-up
45
Comparing VictoriaMetrics and ClickHouse databases
VictoriaMetrics
Talks MetricsQL, PromQL, Graphite QL
Stores time series data
No explicit schema
Easy to load data using simple clients
Can pull data from Prometheus exporters and Kafka
Time-series specific functions and transformations
Integrates with any BI tool that speaks PromQL
Extremely fast and scalable
ClickHouse
Talks SQL
Stores any kind of data
Uses tables; many ways to represent data
Easy to load data using simple clients
Can pull data from Kafka and object storage
Versatile queries including JOIN and aggregation
Most BI tools have ClickHouse adapters
Extremely fast and scalable
46
Help for building monitoring systems that work
VictoriaMetrics Inc.
VictoriaMetrics Community
VictoriaMetrics Enterprise
VictoriaMetrics Managed platform
Altinity Inc.
Altinity.Cloud managed ClickHouse platform
Enterprise support for ClickHouse
Altinity Developer Academy classes
Altinity Stable Builds for ClickHouse
Altinity Kubernetes Operator for ClickHouse
47
Thank you!
Questions?
https://altinity.com
Contact Altinity
48
https://victoriametrics.com
Contact VictoriaMetrics

More Related Content

What's hot

patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
hyeongchae lee
 

What's hot (20)

Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouse
 
Understanding the architecture of MariaDB ColumnStore
Understanding the architecture of MariaDB ColumnStoreUnderstanding the architecture of MariaDB ColumnStore
Understanding the architecture of MariaDB ColumnStore
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Deep dive into Kubernetes Networking
Deep dive into Kubernetes NetworkingDeep dive into Kubernetes Networking
Deep dive into Kubernetes Networking
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
Monitoring MySQL with Prometheus and Grafana
Monitoring MySQL with Prometheus and GrafanaMonitoring MySQL with Prometheus and Grafana
Monitoring MySQL with Prometheus and Grafana
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
 
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HA
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 

Similar to Application Monitoring using Open Source: VictoriaMetrics - ClickHouse

Similar to Application Monitoring using Open Source: VictoriaMetrics - ClickHouse (20)

Metarhia: Node.js Macht Frei
Metarhia: Node.js Macht FreiMetarhia: Node.js Macht Frei
Metarhia: Node.js Macht Frei
 
Presto GeoSpatial @ Strata New York 2017
Presto GeoSpatial @ Strata New York 2017Presto GeoSpatial @ Strata New York 2017
Presto GeoSpatial @ Strata New York 2017
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
 
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
 
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic Stack
 
KubeCon EU 2016 Keynote: Pushing Kubernetes Forward
KubeCon EU 2016 Keynote: Pushing Kubernetes ForwardKubeCon EU 2016 Keynote: Pushing Kubernetes Forward
KubeCon EU 2016 Keynote: Pushing Kubernetes Forward
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction
 
Time series denver an introduction to prometheus
Time series denver   an introduction to prometheusTime series denver   an introduction to prometheus
Time series denver an introduction to prometheus
 
Developing and Deploying Apps with the Postgres FDW
Developing and Deploying Apps with the Postgres FDWDeveloping and Deploying Apps with the Postgres FDW
Developing and Deploying Apps with the Postgres FDW
 
ClickHouse Analytical DBMS. Introduction and usage, by Alexander Zaitsev
ClickHouse Analytical DBMS. Introduction and usage, by Alexander ZaitsevClickHouse Analytical DBMS. Introduction and usage, by Alexander Zaitsev
ClickHouse Analytical DBMS. Introduction and usage, by Alexander Zaitsev
 
Siddhi - cloud-native stream processor
Siddhi - cloud-native stream processorSiddhi - cloud-native stream processor
Siddhi - cloud-native stream processor
 
OpenTelemetry For Architects
OpenTelemetry For ArchitectsOpenTelemetry For Architects
OpenTelemetry For Architects
 
HBaseCon 2015: OpenTSDB and AsyncHBase Update
HBaseCon 2015: OpenTSDB and AsyncHBase UpdateHBaseCon 2015: OpenTSDB and AsyncHBase Update
HBaseCon 2015: OpenTSDB and AsyncHBase Update
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ Criteo
 

More from VictoriaMetrics

What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
VictoriaMetrics
 
WEDOS & VictoriaMetrics
WEDOS & VictoriaMetricsWEDOS & VictoriaMetrics
WEDOS & VictoriaMetrics
VictoriaMetrics
 

More from VictoriaMetrics (17)

VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
VictoriaMetrics December 2023 Meetup: Community Update
VictoriaMetrics December 2023 Meetup: Community UpdateVictoriaMetrics December 2023 Meetup: Community Update
VictoriaMetrics December 2023 Meetup: Community Update
 
VictoriaMetrics for the Atlas Cluster
VictoriaMetrics for the Atlas ClusterVictoriaMetrics for the Atlas Cluster
VictoriaMetrics for the Atlas Cluster
 
WEDOS & VictoriaMetrics
WEDOS & VictoriaMetricsWEDOS & VictoriaMetrics
WEDOS & VictoriaMetrics
 
VictoriaMetrics December 2023 Meetup: Anomaly Detection
VictoriaMetrics December 2023 Meetup: Anomaly DetectionVictoriaMetrics December 2023 Meetup: Anomaly Detection
VictoriaMetrics December 2023 Meetup: Anomaly Detection
 
VictoriaMetrics December 2023 Meetup: Managed VictoriaMetrics Update
VictoriaMetrics December 2023 Meetup: Managed VictoriaMetrics UpdateVictoriaMetrics December 2023 Meetup: Managed VictoriaMetrics Update
VictoriaMetrics December 2023 Meetup: Managed VictoriaMetrics Update
 
December 2024 Meetup: Welcome & VictoriaMetrics Updates
December 2024 Meetup: Welcome & VictoriaMetrics UpdatesDecember 2024 Meetup: Welcome & VictoriaMetrics Updates
December 2024 Meetup: Welcome & VictoriaMetrics Updates
 
What’s new in VictoriaLogs (Q3 2023)
What’s new in VictoriaLogs (Q3 2023)What’s new in VictoriaLogs (Q3 2023)
What’s new in VictoriaLogs (Q3 2023)
 
Q3 Meet Up '23 - Community Update
Q3 Meet Up '23 - Community UpdateQ3 Meet Up '23 - Community Update
Q3 Meet Up '23 - Community Update
 
Managed VictoriaMetrics: Intro & Update
Managed VictoriaMetrics: Intro & UpdateManaged VictoriaMetrics: Intro & Update
Managed VictoriaMetrics: Intro & Update
 
VM Anomaly Detection: Introduction
VM Anomaly Detection: IntroductionVM Anomaly Detection: Introduction
VM Anomaly Detection: Introduction
 
Q3 2023 Meet Up: What's New in VictoriaMetrics
Q3 2023 Meet Up: What's New in VictoriaMetricsQ3 2023 Meet Up: What's New in VictoriaMetrics
Q3 2023 Meet Up: What's New in VictoriaMetrics
 
VictoriaMetrics: Welcome to the Virtual Meet Up March 2023
VictoriaMetrics: Welcome to the Virtual Meet Up March 2023VictoriaMetrics: Welcome to the Virtual Meet Up March 2023
VictoriaMetrics: Welcome to the Virtual Meet Up March 2023
 
VictoriaMetrics 15/12 Meet Up: Updates on Managed VictoriaMetrics
VictoriaMetrics 15/12 Meet Up: Updates on Managed VictoriaMetricsVictoriaMetrics 15/12 Meet Up: Updates on Managed VictoriaMetrics
VictoriaMetrics 15/12 Meet Up: Updates on Managed VictoriaMetrics
 
VictoriaMetrics 2023 Roadmap
VictoriaMetrics 2023 RoadmapVictoriaMetrics 2023 Roadmap
VictoriaMetrics 2023 Roadmap
 

Recently uploaded

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
anilsa9823
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 

Recently uploaded (20)

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

Application Monitoring using Open Source: VictoriaMetrics - ClickHouse

  • 1. Build Fast, Scalable App Monitoring with Open Source Robert Hodges - Altinity Roman Khavronenko - VictoriaMetrics 1
  • 2. Let’s make some introductions 2 Robert Hodges Database geek with 30+ years on DBMS systems. Day job: CEO at Altinity Roman Khavronenko Distributed systems and monitoring engineer. Day job: SE at VictoriaMetrics
  • 4. Monitoring is for answering questions ● Why users are getting errors? ● When it started? ● How many users are affected? ● Which service is failing? 4
  • 5. To get an answer to the question you need 3 things 1. The question 2. The information to process 3. The respondent 5
  • 6. 6
  • 7. 7
  • 8. 8
  • 9. 9
  • 10. 10
  • 12. VictoriaMetrics - Open Source Time Series Database & Monitoring Solution ● Vertically and horizontally scalable ● Operational simplicity ● Cost-efficient ● Prometheus compatible ● Free forever 12
  • 13. VictoriaMetrics - Open Source Time Series Database & Monitoring Solution ● Kubernetes monitoring ● Hardware and infrastructure monitoring ● Application Performance Monitoring (APM) ● IoT ● Edge computing ● Alerting 13
  • 14. 14
  • 15. Metric is a numeric measure or observation of something: ● Number of served requests ● Requests latency ● CPU or memory usage ● Occupied or free disk space What is a metric? 15
  • 17. Storage for metrics 17 ● VictoriaMetrics data model is schemaless ● No need to define metric names or their labels in advance ● User is free to add or change ingested metrics anytime.
  • 19. OSA Con 2021: How ClickHouse Inspired Us to Build a High Performance TSDB ● VictoriaMetrics is specialized solution for time series data ● Compression reaches 0.4 Bytes per sample ● Ingestions speed 300k samples/s per CPU core ● Scanning speed 50Mil samples/s per CPU core 19
  • 21. > curl -d "requests_total{path="/",code="200"} 10" -X POST http://victoriametrics/api/v1/import/prometheus 21
  • 22. More than one protocol for metrics ● Prometheus remote write API. ● Prometheus text exposition format. ● DataDog protocol. ● InfluxDB line protocol over HTTP, TCP and UDP. ● Graphite plaintext protocol with tags. ● OpenTSDB put message. ● HTTP OpenTSDB /api/put requests. ● JSON line format. ● Arbitrary CSV data. 22
  • 25. Demo time! ● Run VictoriaMetrics ● Write some metrics ● Execute read queries 25
  • 26. Frequently asked questions ● Can I monitor MySQL Server, Postgres, MongoDB, ClickHouse? ○ Yes, there are plenty of exporters, dashboards and alerting rules there. ● Can I monitor my applications? ○ Yes, there are libraries for multiple programming languages to instrument the application with metrics. ● How expensive monitoring is? ○ With VictoriaMetrics, cost of storing metrics from 100 instances, each instance emits 1000 metrics every 30s for the total cost will be: ■ 100GB of disk space $0.045 per GB-month: 100*0.045*12 = $54 ■ One t3.medium instance, $0.0418 per hour: 0.0418*730*12 = $366 ■ Total: $420 per year for monitoring 100 instances. ● Can I run it in Kubernetes? ○ Sure! We have k8s operator and helm charts for VictoriaMetrics! 26
  • 28. ClickHouse: a real-time analytic database It understands SQL It’s Apache 2.0 It handles many use cases beyond monitoring It also handles time series data very well 28
  • 29. ClickHouse optimizes for fast response on large datasets 29 Highly compressed column storage with indexing Automatic replication between nodes SELECT host, avg(idle) FROM vmstat GROUP BY host Parallelized/vectorized query Table replica
  • 30. ClickHouse can load millions of events per second 30 Unaggregated event data Source data table(s) Parallel load Event Queue (Kafka) Custom Application Data Lake (S3, HDFS) Precomputed aggregates Precomputed aggregates Precomputed aggregates Materialized views Instantly queryable
  • 31. …And supports [many] dozens of input formats 31 INSERT INTO some_table Format <format> TabSeparated TabSeparatedWithNames CSV CSVWithNames CustomSeparated Values JSON JSONEachRow Protobuf Parquet ...
  • 32. There are many ways to store and manipulate time data Date -- Precision to day DateTime -- Precision to second DateTime64 -- Precision to nanosecond toYear(), toMonth(), toWeek(), toDayOfWeek, toDay(), toHour(), ... toStartOfYear(), toStartOfQuarter(), toStartOfMonth(), toStartOfHour(), toStartOfMinute(), …, toStartOfInterval() toYYYYMM() toYYYYMMDD() toYYYYMMDDhhmmsss() And many more! 32 BI tools like Grafana like DateTime values
  • 33. Let’s build a simple host monitoring system 33 $ vmstat 1 -n procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 166912 2645740 36792 3360652 0 0 3 101 1 1 2 1 98 0 0 1 0 166912 2645360 36792 3360652 0 0 0 0 1182 3986 7 1 93 0 0 ClickHouse Grafana Dashboard
  • 34. Step 1: Generate vmstat data 34 #!/usr/bin/env python3 import datetime, json, socket, subprocess host = socket.gethostname() with subprocess.Popen(['vmstat', '-n', '1'], stdout=subprocess.PIPE) as proc: proc.stdout.readline() # discard first line header_names = proc.stdout.readline().decode().split() values = proc.stdout.readline().decode() while values != '' and proc.poll() is None: dict = {} dict['timestamp'] = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S") dict['host'] = host for (header, value) in zip(header_names, values.split()): dict[header] = int(value) print(json.dumps(dict), flush=True) values = proc.stdout.readline().decode()
  • 35. Here’s the output 35 {"timestamp": "2023-01-22 18:13:16", "host": "logos3", "r": 0, "b": 0, "swpd": 166912, "free": 2523688, "buff": 41412, "cache": 3408292, "si": 0, "so": 0, "bi": 3, "bo": 101, "in": 1, "cs": 0, "us": 2, "sy": 1, "id": 98, "wa": 0, "st": 0} {"timestamp": "2023-01-22 18:13:17", "host": "logos3", "r": 0, "b": 0, "swpd": 166912, "free": 2523696, "buff": 41412, "cache": 3408316, "si": 0, "so": 0, "bi": 0, "bo": 216, "in": 1214, "cs": 4320, "us": 1, "sy": 1, "id": 98, "wa": 0, "st": 0} {"timestamp": "2023-01-22 18:13:18", "host": "logos3", "r": 0, "b": 0, "swpd": 166912, "free": 2527120, "buff": 41412, "cache": 3408572, "si": 0, "so": 0, "bi": 0, "bo": 0, "in": 1172, "cs": 4162, "us": 2, "sy": 1, "id": 98, "wa": 0, "st": 0}
  • 36. Step 2: Design a ClickHouse table to hold data 36 CREATE TABLE monitoring.vmstat ( timestamp DateTime, day UInt32 default toYYYYMMDD(timestamp), host String, r UInt64, b UInt64, -- procs swpd UInt64, free UInt64, buff UInt64, cache UInt64, -- memory si UInt64, so UInt64, -- swap bi UInt64, bo UInt64, -- io in UInt64, cs UInt64, -- system us UInt64, sy UInt64, id UInt64, wa UInt64, st UInt64 -- cpu ) ENGINE=MergeTree PARTITION BY day ORDER BY (host, timestamp) Dimensions Measurements
  • 37. Step 3: Load data into ClickHouse 37 INSERT INTO vmstat Format JSONEachRow E.g. INSERT='INSERT%20INTO%20vmstat%20Format%20JSONEachRow' cat vmstat.dat | curl -X POST --data-binary @- "http://logos3:8123/?database=monitoring&query=${INSERT}" (Or a Python script)
  • 38. Step 4: Build a Grafana dashboard to show results 38 ClickHouse data source for Grafana Altinity plugin for ClickHouse
  • 39. After loading you can go crazy with analytical queries 39 SELECT host, count() AS loaded_minutes FROM ( SELECT toStartOfMinute(timestamp) AS minute, host, avg(100 - id) AS load FROM monitoring.vmstat WHERE timestamp > (now() - toIntervalDay(1)) GROUP BY minute, host HAVING load > 25 ) GROUP BY host ORDER BY loaded_minutes DESC ┌─host───┬─loaded_minutes─┐ │ logos3 │ 6 │ │ logos2 │ 5 │ └────────┴────────────────┘ 2 hosts had > 25% load for at least a minute in the last 24 hours
  • 41. Can ClickHouse store data in a “schemaless” way? {{"timestamp": "2023-01-23 19:53:14", "host": "logos3", ...} SQL Table JSON String JSON String (“blob”) with derived header values One table can handle many entity types! 41
  • 42. More schemaless ways to store data SQL Table Array of Keys Arrays: Header values with key-value pairs Array of Values SQL Table Map with Key/Values Map: Header values & key value pairs SQL Table JSON Data Type JSON data type mapped to column storage 42
  • 43. Where is the software to build monitoring? 43 Event streaming ● Apache Kafka ● Apache Pulsar ● Vectorized Redpanda ELT ● Apache Airflow ● Rudderstack Rendering/Display ● Apache Superset ● Cube.js ● Grafana Client Libraries ● C++ - ClickHouse CPP ● Golang - ClickHouse Go ● Java - ClickHouse JDBC ● Javascript/Node.js - Apla ● ODBC - ODBC Driver for ClickHouse ● Python - ClickHouse Driver, ClickHouse SQLAlchemy More client library links HERE Kubernetes ● Altinity Operator for ClickHouse
  • 44. Where can I find out more about ClickHouse? ClickHouse official docs – https://clickhouse.com/docs/ Altinity Blog – https://altinity.com/blog/ Altinity Youtube Channel – https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA Altinity Knowledge Base – https://kb.altinity.com/ Meetups, other blogs, and external resources. Use your powers of Search! 44
  • 46. Comparing VictoriaMetrics and ClickHouse databases VictoriaMetrics Talks MetricsQL, PromQL, Graphite QL Stores time series data No explicit schema Easy to load data using simple clients Can pull data from Prometheus exporters and Kafka Time-series specific functions and transformations Integrates with any BI tool that speaks PromQL Extremely fast and scalable ClickHouse Talks SQL Stores any kind of data Uses tables; many ways to represent data Easy to load data using simple clients Can pull data from Kafka and object storage Versatile queries including JOIN and aggregation Most BI tools have ClickHouse adapters Extremely fast and scalable 46
  • 47. Help for building monitoring systems that work VictoriaMetrics Inc. VictoriaMetrics Community VictoriaMetrics Enterprise VictoriaMetrics Managed platform Altinity Inc. Altinity.Cloud managed ClickHouse platform Enterprise support for ClickHouse Altinity Developer Academy classes Altinity Stable Builds for ClickHouse Altinity Kubernetes Operator for ClickHouse 47