Roman Vynar, Tim Vaillancourt
Percona
Open Source Monitoring for MySQL and MongoDB with
Grafana and Prometheus
Agenda
2
This is a hands-on tutorial on setting up monitoring and graphing for MySQL and MongoDB
servers using the Prometheus monitoring system and time-series database together with Grafana,
a feature-rich metrics dashboard.
• Prometheus overview
• Prometheus metric exporters
• Queries and expressions on Prometheus DB
• Grafana overview
• Creating graphs and dashboards in Grafana
• MySQL graphing capabilities
• MongoDB graphing capabilities
• Creating alerts in Prometheus
• Using Alertmanager for getting notifications
• Working with Prometheus HTTP API
• Using InfluxDB with Prometheus as a long-term storage option
Virtualbox preparation
3
There is an appliance containing two pre-installed virtual machines:
• db1.vm - monitor and master db server
• db2.vm - slave db server
Copy the files from the provided USB stick to your laptop
Double-click the .OVA file to import the appliance into Virtualbox
Virtualbox network
4
Each instance is configured with 2 network adapters:
• Host-only adapter
• NAT
Configure host-only network from the main menu:
Virtualbox > Preferences > Network > Host-only Networks > “vboxnet0” or “Virtualbox Host-Only Ethernet Adapter” > edit and set: 192.168.56.1 / 255.255.255.0
Windows users only: open Settings > Network and click OK to re-save the host-only network adapter.
Starting VMs
5
Internal static IP addresses assigned:
• db1.vm - 192.168.56.201
• db2.vm - 192.168.56.202
Both instances are running CentOS 7 and have all the necessary packages pre-installed.
Unix and MySQL root password: PerconaLive_123
Start both machines
Verify network connectivity
IMPORTANT! The system time should be in sync:
systemctl restart ntpd.service
Pre-installed packages
6
Percona YUM repo and database packages:
rpm -Uvh http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
yum install Percona-Server-server-57 Percona-Server-client-57 Percona-Server-shared-57
yum install Percona-Server-MongoDB
yum install sysbench
Grafana:
yum install initscripts fontconfig
yum install https://grafanarel.s3.amazonaws.com/builds/grafana-2.6.0-1.x86_64.rpm
InfluxDB:
yum install https://s3.amazonaws.com/influxdb/influxdb-0.10.0-1.x86_64.rpm
pip install influxdb pyyaml
Prometheus software
7
Prometheus and Alertmanager tarballs:
• https://github.com/prometheus/prometheus/releases/download/0.17.0/prometheus-0.17.0.linux-amd64.tar.gz
• https://github.com/prometheus/alertmanager/releases/download/0.1.1/alertmanager-0.1.1.linux-amd64.tar.gz
Pre-compiled exporters from the sources:
• https://github.com/prometheus/node_exporter
• https://github.com/prometheus/mysqld_exporter
• https://github.com/Percona-Lab/prometheus_mongodb_exporter
Prometheus overview
8
Prometheus is an open-source monitoring system and time series database.
Main features:
• a multi-dimensional data model (time series identified by metric name and key/value pairs)
• a flexible query language to leverage this dimensionality
• no reliance on distributed storage; single server nodes are autonomous
• time series collection happens via a pull model over HTTP
• pushing time series is supported via an intermediary gateway
• targets are discovered via service discovery or static configuration
• multiple modes of graphing and dashboarding support
Prometheus architecture
9
Prometheus metric exporters
10
Official:
• Node/system metrics exporter
• AWS CloudWatch exporter
• Blackbox exporter
• Collectd exporter
• Consul exporter
• Graphite exporter
• HAProxy exporter
• InfluxDB exporter
• JMX exporter
• Mesos task exporter
• MySQL server exporter
• SNMP exporter
• StatsD exporter
Third-party:
• Apache exporter
• BIND exporter
• Django exporter
• Jenkins exporter
• Memcached exporter
• Minecraft exporter module
• MongoDB exporter
• New Relic exporter
• Nginx metric library
• PostgreSQL exporter
• RabbitMQ exporter
• Redis exporter
• … many more …
Start Prometheus
11
Most of the actions will be run on db1, which is the monitor server.
Let’s review Prometheus config prepared for this tutorial:
cat prometheus.yml
Extract binaries:
tar zxf prometheus-0.17.0.linux-amd64.tar.gz
Check out the startup script:
cat start.sh
Start Prometheus:
./start.sh prometheus
tail -f /var/log/prometheus.log
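The prometheus.yml reviewed on this slide is not reproduced in the deck. As a rough sketch only, a configuration for the two VMs might resemble the file written below. It assumes the target_groups key used by Prometheus 0.x releases of this era (later releases renamed it static_configs), the job names linux and mongodb (only the mysql job name appears elsewhere in this tutorial), a 5s scrape interval, and the sample file name. None of these are confirmed by the tutorial's actual file.

```shell
# Write a sample config; the file name is hypothetical so the real
# prometheus.yml shipped with the tutorial is not overwritten.
cat > prometheus.sample.yml <<'EOF'
global:
  scrape_interval: 5s
  evaluation_interval: 5s

scrape_configs:
  # One job per exporter type. The shared "alias" label ties together
  # the different exporters that run on the same host.
  - job_name: linux
    target_groups:
      - targets: ['192.168.56.201:9100']
        labels:
          alias: db1
      - targets: ['192.168.56.202:9100']
        labels:
          alias: db2
  - job_name: mysql
    target_groups:
      - targets: ['192.168.56.201:9104']
        labels:
          alias: db1
      - targets: ['192.168.56.202:9104']
        labels:
          alias: db2
  - job_name: mongodb
    target_groups:
      - targets: ['192.168.56.201:9105']
        labels:
          alias: db1
      - targets: ['192.168.56.202:9105']
        labels:
          alias: db2
EOF
```

Giving every exporter on a host the same alias value is what lets a single Grafana template variable select system, MySQL and MongoDB graphs for that host at once.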
Access web interface
12
Go to http://192.168.56.201:9090
Querying Prometheus DB
13
Prometheus provides a functional expression language that lets the user select and aggregate
time series data in real time.
The result of an expression can either be shown as a graph, viewed as tabular data in
Prometheus's expression browser, or consumed by external systems via the HTTP API.
Examples:
• http_requests_total
• http_requests_total{job="prometheus", handler="static"}
• {__name__=~"process_.+"}
• scrape_duration_seconds
• scrape_duration_seconds + 2
PromQL functions
14
Functions:
•abs()
•absent()
•bottomk()
•ceil()
•changes()
•clamp_max()
•clamp_min()
•count_scalar()
•delta()
•deriv()
•drop_common_labels()
•exp()
•floor()
•histogram_quantile()
•increase()
•irate()
•label_replace()
•ln()
•log2()
•log10()
•predict_linear()
•rate()
•resets()
•round()
•scalar()
•sort()
•sort_desc()
•sqrt()
•time()
•topk()
•vector()
•<aggregation>_over_time()
Grafana overview
15
Grafana is an open source, feature rich metrics dashboard and graph editor for Graphite, Elasticsearch,
OpenTSDB, Prometheus and InfluxDB.
Main features:
• User-friendly interface
• Rich graphing, flexible scaling
• Mixed styling
• Themes
• Template variables
• Scripted dashboards
• Repeating graphs and panels
• Authentication, LDAP support
• Annotations
• Snapshot sharing
Start using Grafana
16
Login to Grafana http://192.168.56.201:3000 using admin/admin credentials.
Add datasource
17
Patch Grafana 2.6.0
18
It is important to apply the following patch to Grafana 2.6.0 so that the interval template
variable can be used to get properly zoomable graphs. The fix simply allows a variable in the Step
field on the Grafana graph editor page. For more information, see Grafana’s GitHub
PR#3757 and PR#4257. We hope the fix will be released in the next Grafana version.
sed -i 's/step_input:""/step_input:c.target.step/; s/ HH:MM/ HH:mm/; s/,function(c)/,"templateSrv",function(c,g)/; s/expr:c.target.expr/expr:g.replace(c.target.expr,c.panel.scopedVars)/' /usr/share/grafana/public/app/plugins/datasource/prometheus/query_ctrl.js
sed -i 's/h=a.interval/h=g.replace(a.interval, c.scopedVars)/' /usr/share/grafana/public/app/plugins/datasource/prometheus/datasource.js
Percona Grafana dashboards
19
Open-source and available @ https://github.com/percona/grafana-dashboards
This is a set of Grafana dashboards to be used with Prometheus and InfluxDB datasources for
MySQL and system monitoring. MongoDB dashboard to be shared separately.
MySQL:
• MySQL InnoDB Metrics
• MySQL MyISAM Metrics
• MySQL Overview
• MySQL Performance Schema
• MySQL Query Response Time
• MySQL Replication
• MySQL Table Statistics
• MySQL User Statistics
• Galera Graphs
• TokuDB Graphs
System:
• System Overview
• Disk Space
• Disk Performance
Mixed:
• Cross Server Graphs
• Summary Dashboard
• Trends Dashboard
• Prometheus
• [InfluxDB] 5m downsample
• [InfluxDB] 1h downsample
Install dashboards
20
Copy dashboard files:
cp -r grafana-dashboards/dashboards/ /var/lib/grafana/
Enable JSON dashboards by adding these lines to /etc/grafana/grafana.ini:
[dashboards.json]
enabled = true
path = /var/lib/grafana/dashboards
Restart Grafana:
systemctl restart grafana-server.service
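Before restarting Grafana it can help to confirm that the setting really ended up inside the [dashboards.json] section and is not commented out. The helper below is a minimal sketch for that check. The function name json_dashboards_enabled is made up for illustration, and it only recognizes the exact "enabled = true" form shown above.

```shell
# json_dashboards_enabled INI_FILE
# Succeeds only if an uncommented "enabled = true" line appears
# inside the [dashboards.json] section of the given ini file.
json_dashboards_enabled() {
    awk '
        # Every section header switches the flag on or off.
        /^\[/ { in_section = ($0 == "[dashboards.json]") }
        in_section && /^enabled[ \t]*=[ \t]*true[ \t]*$/ { found = 1 }
        END { exit !found }
    ' "$1"
}
```

Possible usage: `json_dashboards_enabled /etc/grafana/grafana.ini && systemctl restart grafana-server.service`.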
Creating and using dashboards
21
node_exporter collectors
22
Enabled in this tutorial:
• diskstats
• filesystem
• loadavg
• meminfo
• netdev
• stat
• time
• uname
• vmstat
Other available collectors:
• conntrack
• cpu
• entropy
• filefd
• mdadm
• netstat
• textfile
• version
• bonding
• devstat
• gmond
• interrupts
• ipvs
• ksmd
• lastlogin
• megacli
• meminfo_numa
• ntp
• runit
• supervisord
• systemd
• tcpstat
mysqld_exporter collectors
23
Enabled in this tutorial:
-collect.global_status
-collect.global_variables
-collect.slave_status
-collect.info_schema.tables
-collect.binlog_size
-collect.info_schema.processlist
-collect.auto_increment.columns
-collect.info_schema.tablestats
-collect.info_schema.userstats
-collect.info_schema.query_response_time
-collect.info_schema.innodb_metrics
-collect.perf_schema.file_events
-collect.perf_schema.eventsstatements
-collect.perf_schema.indexiowaits
-collect.perf_schema.tableiowaits
-collect.perf_schema.eventswaits
Other collectors:
-collect.engine_tokudb_status
-collect.perf_schema.tablelocks
Running exporters
24
Let’s start the exporters on both nodes.
Start node_exporter:
./start.sh node_exporter
tail -20f /var/log/node_exporter.log
Start mysqld_exporter:
./start.sh mysqld_exporter
tail -f /var/log/mysqld_exporter.log
Start mongo instances and mongodb_exporters:
cd ~/grafana_mongodb_dashboards/examples
./start-example-cluster.sh
./start-example-exporters.sh
tail -f example/log/*/mongodb_exporter*
MySQL access for mysqld_exporter
25
mysqld_exporter requires MySQL credentials to connect to MySQL.
There are a few options:
• command-line argument: -config.my-cnf=<path>/.my.cnf
Note, if you use tilde to specify user’s homedir it may not always expand to the actual path.
• using environment variables:
export DATA_SOURCE_NAME='user:pass@(localhost:3306)/'
export DATA_SOURCE_NAME='user:pass@unix(/var/lib/mysql/mysql.sock)/'
export DATA_SOURCE_NAME='user:pass@tcp(localhost:3306)/'
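For the option-file route, the sketch below writes a minimal client option file with owner-only permissions. The function name write_exporter_cnf, the user name exporter and the password are placeholders, not values from this tutorial. It assumes the exporter reads user and password from the [client] section.

```shell
# write_exporter_cnf FILE USER PASSWORD
# Creates a MySQL client option file. The umask is changed inside a
# subshell so a newly created file is readable only by its owner and
# the caller's umask is left untouched.
write_exporter_cnf() {
    cnf_file=$1
    (
        umask 077
        printf '[client]\nuser=%s\npassword=%s\n' "$2" "$3" > "$cnf_file"
    )
}
```

Possible usage: `write_exporter_cnf "$HOME/.my.cnf" exporter 'secret'`, then pass `-config.my-cnf="$HOME/.my.cnf"` to the exporter. Spelling the path with $HOME avoids relying on tilde expansion in the middle of an argument.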
Check exporters status
26
db1, in the terminal:
curl http://localhost:9100/metrics
curl http://localhost:9104/metrics
curl http://localhost:9105/metrics
db2, via web browser:
http://192.168.56.202:9100/metrics
http://192.168.56.202:9104/metrics
http://192.168.56.202:9105/metrics
Prometheus endpoints status:
http://192.168.56.201:9090/status
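The /metrics pages are long, so a quick count of exported samples is often enough to see that an exporter is alive and collecting. The helper below is a small sketch. The name count_samples is made up, and it simply counts every line of the text format that is neither a comment nor blank.

```shell
# count_samples
# Reads Prometheus text-format metrics on stdin and prints the number
# of sample lines, skipping "# HELP" / "# TYPE" comments and blank lines.
count_samples() {
    grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}
```

Possible usage on db1: `curl -s http://localhost:9100/metrics | count_samples`, repeated for ports 9104 and 9105.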
Prometheus targets
27
At this point, you should see a picture like this:
Monitoring system metrics
28
Monitoring disk performance
29
MySQL graphing capabilities
30
Let’s generate some MySQL activity by running OLTP test with sysbench:
./sysbench.sh
Observe MySQL dashboards
MongoDB Dashboards
31
cp -r /root/grafana_mongodb_dashboards/dashboards/* /var/lib/grafana/dashboards/
Restart grafana (systemctl restart grafana-server.service)
MongoDB graphing capabilities - Before
32
1. Started from ‘dcu/mongodb_exporter’
2. Server Status output ‘db.serverStatus()’
   1. Uptime
   2. Asserts
   3. Durability
   4. BackgroundFlushing
   5. Connections
   6. ExtraInfo
   7. GlobalLock
   8. IndexCounter
   9. Locks
   10. Network
   11. Opcounters
   12. OpcountersRepl
   13. Memory
   14. Metrics
   15. Cursors
MongoDB graphing capabilities - After
33
1. Server Status output ‘db.serverStatus()’
   1. Uptime
   <trimmed>
   15. Cursors
2. Replica Set Status Output ‘rs.status()’
   1. Replica Set State
   2. Replica Set Optime
   3. Replica Set Node-to-Node Ping
   4. Replica Set Elections
3. Replica Set Oplog Info
   1. Oplog head/tail timestamp
   2. Oplog size bytes
   3. Oplog item count
MongoDB graphing capabilities - After
34
4. Sharding Info (mongos)
   1. Balancer Locks and Lock Updates
   2. Is Cluster Balanced?
   3. # of Shards, DBs, Collections, Chunks
   4. # of Mongos processes
   5. # of Balancer, Split and Sharding events
5. WiredTiger storage-engine (experimental)
   1. Cache Usage
   2. Block Usage
   3. Transactions
   4. Etc
MongoDB Exporter Metric Summary
36
Per-collection Summary:
1. 60 x DB-level MongoDB metrics on ‘mongos’ nodes w/1-shard
• roughly 5-8 more metrics per shard added
2. 157 x DB-level MongoDB metrics on ‘mongod’ replica set nodes w/2 x members
• roughly 5-8 more metrics per shard added
3. 676 x OS-level metrics on recent Linux 3.x+
Total metrics: 893+ per collection (at minimum)!
Total MongoDB MMS metrics: “400 per ping packet”
Reference: http://www.slideshare.net/mongodb/using-the-mongodb-monitoring-service-mms
Per-collection size:
• Raw: 35 KB mongod replica set w/1-node, 17 KB mongos w/1-shard, 91 KB Linux node_exporter
• Estimated Snappy compression (used in LevelDB) is about 80%
Recommended fetch interval:
• 5 sec if possible and there is enough disk space (possibly less?)
• 10 sec (default) if not
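To get a feel for what the fetch interval means for data volume, the raw per-collection sizes above can be multiplied out over a day. The sketch below does only that arithmetic. The function name raw_kb_per_day is made up, the host is assumed to run one mongod replica set node plus node_exporter, and the result is the raw scrape payload, not the compressed on-disk size.

```shell
# raw_kb_per_day INTERVAL_SECONDS KB_PER_SCRAPE
# Prints the uncompressed volume collected per day, in KB.
raw_kb_per_day() {
    echo $(( 86400 / $1 * $2 ))
}

# Assumed host: one mongod replica set node (35 KB) plus
# node_exporter (91 KB) per collection.
per_scrape_kb=$(( 35 + 91 ))
echo "5s interval:  $(raw_kb_per_day 5 "$per_scrape_kb") KB/day raw"
echo "10s interval: $(raw_kb_per_day 10 "$per_scrape_kb") KB/day raw"
```

Halving the interval doubles the raw volume, which is why the 5 sec recommendation is tied to having enough disk space.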
Prometheus Metric Grouping with Labels
37
• Metrics level labels vs Target level labels
• Target-level labels can combine multiple exporters together
[Screenshot: ‘Mongo Node’ selector provided by Grafana templating]
MongoDB graphing capabilities
38
MongoDB graphing capabilities
39
MongoDB graphing capabilities
40
Prometheus Auto-discovery (Future)
41
<- Consul
<- Prometheus
=
New WiredTiger Metrics and the Future
42
WiredTiger Supported:
• Cache
• BlockManager
• Transaction
• ConcurrentTransaction
• Log (coming soon!)
Future Metrics:
• PerconaFT engine metrics
• RocksDB engine metrics
• Profiler metrics
Making a Go-based Prometheus Exporter
43
Overall Steps:
1. Metric definition:
2. Function to “collect” the data (most of the logic):
Making a Go-based Prometheus Exporter
44
Overall Steps:
3. Function to “export” the data:
4. Function to “describe” the data:
Making a Go-based Prometheus Exporter
45
• Tips / Advice
• Always try to use incrementing total values
• Everything is a float64 - store what provides value
• Do “math” operations on values in Grafana
• Vector labels are for high-cardinality, be conservative
• Not everything needs to be a graph / Prometheus query interface is powerful
Alerting with Prometheus
46
Alerting with Prometheus is separated into two parts. Alerting rules in Prometheus servers send
alerts to an Alertmanager.
The Alertmanager then manages those alerts, including silencing, inhibition, aggregation and
sending out notifications via methods such as email, PagerDuty, HipChat, Slack, Pushover.
The main steps to setting up alerting and notifications are:
• Create alerting rules in Prometheus
• Set up and configure the Alertmanager
• Configure Prometheus to talk to the Alertmanager with the -alertmanager.url flag
Prometheus alerts
47
ALERT ExporterDown
  IF up == 0
  FOR 1m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "{{$labels.alias}}: exporter down",
    description = "Exporter on job '{{$labels.job}}' is not responding"
  }
ALERT SystemMemory
  IF round((node_memory_MemAvailable OR (node_memory_MemFree + node_memory_Buffers + node_memory_Cached)) / node_memory_MemTotal * 100) < 5
  FOR 1m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "{{$labels.alias}}: low memory",
    description = "Free {{$value}}% of memory"
  }
Configuring alerts in Prometheus
48
Let’s review alert definitions prepared for this tutorial:
cat alerting.rules
Include alerting rules into prometheus.yml:
rule_files:
- alerting.rules
Reload prometheus:
kill -HUP `pidof prometheus`
Alerts in Prometheus web UI
49
Using Alertmanager
50
Let’s review Alertmanager config prepared for this tutorial:
cat alertmanager.yml
Edit it with the appropriate email addresses for testing.
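The alertmanager.yml shipped with the tutorial is not shown on the slide. For orientation only, an email-only configuration might look roughly like the file written below. The field names follow the Alertmanager YAML format as understood for this release line and should be checked against the shipped file. The SMTP host, both addresses, the receiver name and the sample file name are all placeholders.

```shell
# Write a sample Alertmanager config; the file name is hypothetical so
# the tutorial's own alertmanager.yml is left untouched.
cat > alertmanager.sample.yml <<'EOF'
global:
  # Placeholder SMTP relay and sender address.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'

route:
  # Every alert goes to the single receiver below, grouped by alert name.
  receiver: 'team-email'
  group_by: ['alertname']

receivers:
  - name: 'team-email'
    email_configs:
      - to: 'you@example.org'
EOF
```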
Start Alertmanager
51
Extract binaries:
tar zxf alertmanager-0.1.1.linux-amd64.tar.gz
Start Alertmanager:
./start.sh alertmanager
Uncomment ALERTMANAGER line in start.sh
Restart Prometheus:
kill `pidof prometheus`
./start.sh prometheus
Alertmanager web UI
52
Go to Alertmanager web interface http://192.168.56.201:9093
Alert paged by email
53
Prometheus recording rules
54
Let’s review recording rules prepared for this tutorial:
cat recording.rules
Include recording rules into prometheus.yml:
rule_files:
- recording.rules
Reload prometheus:
kill -HUP `pidof prometheus`
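The contents of recording.rules are not shown on the slide. In the rule syntax of this Prometheus generation a recording rule has the form `<new series name> = <expression>`, so a file could look like the sketch below. Both rule names and the sample file name are invented for illustration; the tutorial's real rules may differ.

```shell
# Write two illustrative recording rules to a hypothetical sample file.
# Each rule precomputes an expression and stores it under a new series name.
cat > recording.sample.rules <<'EOF'
node_load1:avg_5m = avg_over_time(node_load1[5m])
mysql_global_status_questions:rate_5m = rate(mysql_global_status_questions[5m])
EOF
```

After a reload, the new series can be queried by name like any other metric, which keeps dashboard queries short and cheap.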
Query for newly created metrics
55
Working with Prometheus HTTP API
56
Instant queries (at a single point in time) and range queries (over a time range):
curl -sg 'http://localhost:9090/api/v1/query?query=up{job="mysql"}' | python -m json.tool
curl -sg 'http://localhost:9090/api/v1/query?query=ALERTS{alertstate="firing"}' | python -m json.tool
curl -sg "http://localhost:9090/api/v1/query_range?query=node_load1&start=`expr $(date +%s) - 3600`&end=`date +%s`&step=5m" | python -m json.tool
Label values across the whole DB:
curl http://localhost:9090/api/v1/label/alias/values
List of series matching the expression:
curl -sg 'http://localhost:9090/api/v1/series?match[]=node_filesystem_size{fstype!~"rootfs|selinuxfs|autofs|rpc_pipefs|tmpfs"}' | python -m json.tool
Delete series:
curl -g -X DELETE 'http://localhost:9090/api/v1/series?match[]={alias="db2"}'
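Quoting PromQL inside a URL gets awkward quickly, so the range query above can be wrapped in a small helper that lets curl do the URL encoding. This is a sketch under assumptions: the function names range_window and prom_query_range and the PROM_URL variable are made up here, and only the query_range endpoint and its query, start, end and step parameters come from the examples above.

```shell
# range_window WINDOW_SECONDS [NOW_EPOCH]
# Prints "START END" as Unix timestamps for a window that ends now
# (or at NOW_EPOCH when given).
range_window() {
    end=${2:-$(date +%s)}
    echo "$(( end - $1 )) $end"
}

# prom_query_range EXPR WINDOW_SECONDS STEP
# Runs a range query; -G sends the url-encoded fields as a GET query string.
prom_query_range() {
    window=$(range_window "$2")
    curl -sG "${PROM_URL:-http://localhost:9090}/api/v1/query_range" \
        --data-urlencode "query=$1" \
        --data-urlencode "start=${window% *}" \
        --data-urlencode "end=${window#* }" \
        --data-urlencode "step=$3"
}
```

Possible usage: `prom_query_range 'node_load1{alias="db1"}' 3600 5m | python -m json.tool`.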
InfluxDB overview
57
InfluxDB is an open source time series database. It's useful for recording metrics, events, and
performing analytics.
Web interface http://192.168.56.201:8083
Why InfluxDB?
• Currently, one of a few available remote storage options for Prometheus to use as a long-term solution
• Multiple retention policies
• Easy to use
• Grafana support
• Clustering
Configure Prometheus with InfluxDB
58
Create prometheus db in InfluxDB:
influx
create database prometheus;
Uncomment INFLUXDB line in start.sh
Restart Prometheus:
kill `pidof prometheus`
./start.sh prometheus
Load continuous queries to downsample data:
python grafana-dashboards/influxdb_cq.py
Using InfluxDB
59
Browse data:
influx
use prometheus;
show measurements;
show continuous queries;
select * from node_load1;
use trending;
show retention policies on trending;
select * from trending."5m".node_load1;
show shards;
Add InfluxDB datasource to Grafana
60
What’s next?
61
• Grafana 3.0 release: pie charts, more functionality, improved Prometheus datasource?
• More long-term storage options for Prometheus
• Alertmanager production-ready status?
• InfluxDB or not InfluxDB?
Thank you!
62
Questions?

OSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius SchumacherOSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius Schumacher
OSDC 2018 | Highly Available Cloud Foundry on Kubernetes by Cornelius Schumacher
 
Drone CI/CD 自動化測試及部署
Drone CI/CD 自動化測試及部署Drone CI/CD 自動化測試及部署
Drone CI/CD 自動化測試及部署
 

Monitoring_with_Prometheus_Grafana_Tutorial

  • 1. Roman Vynar, Tim Vaillancourt Percona Open Source Monitoring for MySQL and MongoDB with Grafana and Prometheus
  • 2. Agenda 2 This is a hands-on tutorial on setting up the monitoring and graphing for MySQL and MongoDB servers using Prometheus monitoring system and time-series database with Grafana feature rich metrics dashboard. • Prometheus overview • Prometheus metric exporters • Queries and expressions on Prometheus DB • Grafana overview • Creating graphs and dashboards in Grafana • MySQL graphing capabilities • MongoDB graphing capabilities • Creating alerts in Prometheus • Using Alertmanager for getting notifications • Working with Prometheus HTTP API • Using InfluxDB with Prometheus as a long-term storage option
  • 3. Virtualbox preparation 3 There is an appliance containing two pre-installed virtual machines: • db1.vm - monitor and master db server • db2.vm - slave db server Copy the files from USB stick provided to your laptop Double-click on the .OVA file to import appliance into Virtualbox
  • 4. Virtualbox network 4 Each instance is configured with 2 network adapters: • Host-only adapter • NAT Configure host-only network from the main menu: Virtualbox > Preferences > Network > Host-only Networks > “vboxnet0” or “Virtualbox Host-Only Ethernet Adapter” > edit and set: 192.168.56.1 / 255.255.255.0 Windows users only: open Settings > Network and click OK to re-save the host-only network adapter.
  • 5. Starting VMs 5 Internal static IP addresses assigned: • db1.vm - 192.168.56.201 • db2.vm - 192.168.56.202 Both instances are running CentOS 7 and have all the necessary packages pre-installed. Unix and MySQL root password: PerconaLive_123 Start both machines Verify network connectivity IMPORTANT! The system time should be in sync: systemctl restart ntpd.service
  • 6. Pre-installed packages 6 Percona YUM repo and database packages: rpm -Uvh http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm yum install Percona-Server-server-57 Percona-Server-client-57 Percona-Server-shared-57 yum install Percona-Server-MongoDB yum install sysbench Grafana: yum install initscripts fontconfig yum install https://grafanarel.s3.amazonaws.com/builds/grafana-2.6.0-1.x86_64.rpm InfluxDB: yum install https://s3.amazonaws.com/influxdb/influxdb-0.10.0-1.x86_64.rpm pip install influxdb pyyaml
  • 7. Prometheus software 7 Prometheus and Alertmanager tarballs: • https://github.com/prometheus/prometheus/releases/download/0.17.0/prometheus-0.17.0.linux-amd64.tar.gz • https://github.com/prometheus/alertmanager/releases/download/0.1.1/alertmanager-0.1.1.linux-amd64.tar.gz Pre-compiled exporters from the sources: • https://github.com/prometheus/node_exporter • https://github.com/prometheus/mysqld_exporter • https://github.com/Percona-Lab/prometheus_mongodb_exporter
  • 8. Prometheus overview 8 Prometheus is an open-source monitoring system and time series database. Main features: • a multi-dimensional data model (time series identified by metric name and key/value pairs) • a flexible query language to leverage this dimensionality • no reliance on distributed storage; single server nodes are autonomous • time series collection happens via a pull model over HTTP • pushing time series is supported via an intermediary gateway • targets are discovered via service discovery or static configuration • multiple modes of graphing and dashboarding support
  • 10. Prometheus metric exporters 10 Official: • Node/system metrics exporter • AWS CloudWatch exporter • Blackbox exporter • Collectd exporter • Consul exporter • Graphite exporter • HAProxy exporter • InfluxDB exporter • JMX exporter • Mesos task exporter • MySQL server exporter • SNMP exporter • StatsD exporter Third-party: • Apache exporter • BIND exporter • Django exporter • Jenkins exporter • Memcached exporter • Minecraft exporter module • MongoDB exporter • New Relic exporter • Nginx metric library • PostgreSQL exporter • RabbitMQ exporter • Redis exporter • … many more …
  • 11. Start Prometheus 11 Most of the actions will be run on db1, which is the monitor server. Let’s review the Prometheus config prepared for this tutorial: cat prometheus.yml Extract binaries: tar zxf prometheus-0.17.0.linux-amd64.tar.gz Check out the startup script: cat start.sh Start Prometheus: ./start.sh prometheus tail -f /var/log/prometheus.log
  • 12. Access web interface 12 Go to http://192.168.56.201:9090
  • 13. Querying Prometheus DB 13 Prometheus provides a functional expression language that lets the user select and aggregate time series data in real time. The result of an expression can either be shown as a graph, viewed as tabular data in Prometheus's expression browser, or consumed by external systems via the HTTP API. Examples: • http_requests_total • http_requests_total{job="prometheus", handler="static"} • {__name__=~"process_.+"} • scrape_duration_seconds • scrape_duration_seconds + 2
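The selector examples above can be read as label filters over a set of time series. The following is a minimal Python sketch of that idea, not Prometheus code: the sample series are made up for illustration and `select` is a hypothetical helper. It treats the metric name as the reserved `__name__` label and uses `re.fullmatch` because Prometheus anchors regex matchers at both ends.

```python
import re

# Hypothetical sample series; each one is a dict of labels, with the metric
# name stored under the reserved "__name__" label as Prometheus does.
SERIES = [
    {"__name__": "http_requests_total", "job": "prometheus", "handler": "static"},
    {"__name__": "http_requests_total", "job": "prometheus", "handler": "query"},
    {"__name__": "process_cpu_seconds_total", "job": "prometheus"},
    {"__name__": "scrape_duration_seconds", "job": "mysql"},
]


def select(series, equal=None, regex=None):
    """Return the series whose labels satisfy every equality and regex matcher."""
    equal = equal or {}
    regex = regex or {}
    matched = []
    for labels in series:
        # A missing label is treated as the empty string.
        ok_equal = all(labels.get(k, "") == v for k, v in equal.items())
        # fullmatch mimics the implicit ^...$ anchoring of =~ matchers.
        ok_regex = all(re.fullmatch(p, labels.get(k, "")) for k, p in regex.items())
        if ok_equal and ok_regex:
            matched.append(labels)
    return matched


# Roughly: http_requests_total{job="prometheus", handler="static"}
static_only = select(
    SERIES,
    equal={"__name__": "http_requests_total", "job": "prometheus", "handler": "static"},
)

# Roughly: {__name__=~"process_.+"}
process_metrics = select(SERIES, regex={"__name__": "process_.+"})
```

Because of the anchoring, a pattern such as `process_` alone matches nothing here; the trailing `.+` is what lets it cover whole metric names.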
  • 15. Grafana overview 15 Grafana is an open source, feature rich metrics dashboard and graph editor for Graphite, Elasticsearch, OpenTSDB, Prometheus and InfluxDB. Main features: • User-friendly interface • Rich graphing, flexible scaling • Mixed styling • Themes • Template variables • Scripted dashboards • Repeating graphs and panels • Authentication, LDAP support • Annotations • Snapshot sharing
  • 16. Start using Grafana 16 Login to Grafana http://192.168.56.201:3000 using admin/admin credentials.
  • 18. Patch Grafana 2.6.0 18 It is important to apply the following patch to your Grafana in order to use the interval template variable and get good zoomable graphs. The fix simply allows a variable in the Step field on the Grafana graph editor page. For more information, see Grafana’s GitHub PR#3757 and PR#4257. We hope the fix will be released in the next Grafana version. sed -i 's/step_input:""/step_input:c.target.step/; s/ HH:MM/ HH:mm/; s/,function(c)/,"templateSrv",function(c,g)/; s/expr:c.target.expr/expr:g.replace(c.target.expr,c.panel.scopedVars)/' /usr/share/grafana/public/app/plugins/datasource/prometheus/query_ctrl.js sed -i 's/h=a.interval/h=g.replace(a.interval, c.scopedVars)/' /usr/share/grafana/public/app/plugins/datasource/prometheus/datasource.js
  • 19. Percona Grafana dashboards 19 Open-source and available @ https://github.com/percona/grafana-dashboards This is a set of Grafana dashboards to be used with Prometheus and InfluxDB datasources for MySQL and system monitoring. MongoDB dashboard to be shared separately. MySQL: • MySQL InnoDB Metrics • MySQL MyISAM Metrics • MySQL Overview • MySQL Performance Schema • MySQL Query Response Time • MySQL Replication • MySQL Table Statistics • MySQL User Statistics • Galera Graphs • TokuDB Graphs System: • System Overview • Disk Space • Disk Performance Mixed: • Cross Server Graphs • Summary Dashboard • Trends Dashboard • Prometheus • [InfluxDB] 5m downsample • [InfluxDB] 1h downsample
  • 20. Install dashboards 20 Copy dashboard files: cp -r grafana-dashboards/dashboards/ /var/lib/grafana/ Enable JSON dashboards by adding those lines to /etc/grafana/grafana.ini: [dashboards.json] enabled = true path = /var/lib/grafana/dashboards Restart Grafana: systemctl restart grafana-server.service
  • 21. Creating and using dashboards 21
  • 22. node_exporter collectors 22 Enabled in this tutorial: • diskstats • filesystem • loadavg • meminfo • netdev • stat • time • uname • vmstat Other available collectors: • conntrack • cpu • entropy • filefd • mdadm • netstat • textfile • version • bonding • devstat • gmond • interrupts • ipvs • ksmd • lastlogin • megacli • meminfo_numa • ntp • runit • supervisord • systemd • tcpstat
  • 23. mysqld_exporter collectors 23 Enabled in this tutorial: -collect.global_status -collect.global_variables -collect.slave_status -collect.info_schema.tables -collect.binlog_size -collect.info_schema.processlist -collect.auto_increment.columns -collect.info_schema.tablestats -collect.info_schema.userstats -collect.info_schema.query_response_time -collect.info_schema.innodb_metrics -collect.perf_schema.file_events -collect.perf_schema.eventsstatements -collect.perf_schema.indexiowaits -collect.perf_schema.tableiowaits -collect.perf_schema.eventswaits Other collectors: -collect.engine_tokudb_status -collect.perf_schema.tablelocks
  • 24. Running exporters 24 Let’s start the exporters on both nodes. Start node_exporter: ./start.sh node_exporter tail -20f /var/log/node_exporter.log Start mysqld_exporter: ./start.sh mysqld_exporter tail -f /var/log/mysqld_exporter.log Start mongo instances and mongodb_exporters: cd ~/grafana_mongodb_dashboards/examples ./start-example-cluster.sh ./start-example-exporters.sh tail -f example/log/*/mongodb_exporter*
  • 25. MySQL access for mysqld_exporter 25 mysqld_exporter requires MySQL credentials to connect to MySQL. There are a few options: • command-line argument: -config.my-cnf=<path>/.my.cnf Note, if you use tilde to specify user’s homedir it may not always expand to the actual path. • using environment variables: export DATA_SOURCE_NAME='user:pass@(localhost:3306)/' export DATA_SOURCE_NAME='user:pass@unix(/var/lib/mysql/mysql.sock)/' export DATA_SOURCE_NAME='user:pass@tcp(localhost:3306)/'
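As a small illustration of the two options above, the following Python sketch builds a `DATA_SOURCE_NAME` value in the `user:pass@protocol(address)/` form shown on the slide, and expands a leading tilde explicitly before passing a path to `-config.my-cnf`. The helper names `build_dsn`, `exporter_env` and `my_cnf_flag` are hypothetical; the sketch does no escaping, so it assumes the credentials contain no special characters.

```python
import os


def build_dsn(user, password, address, protocol=""):
    """Build a DATA_SOURCE_NAME string such as user:pass@tcp(localhost:3306)/."""
    return f"{user}:{password}@{protocol}({address})/"


def exporter_env(dsn):
    """Copy the current environment and add DATA_SOURCE_NAME for a child process."""
    env = dict(os.environ)
    env["DATA_SOURCE_NAME"] = dsn
    return env


def my_cnf_flag(path):
    """Build the -config.my-cnf argument with any leading ~ expanded up front."""
    return "-config.my-cnf=" + os.path.expanduser(path)


tcp_dsn = build_dsn("user", "pass", "localhost:3306", protocol="tcp")
socket_dsn = build_dsn("user", "pass", "/var/lib/mysql/mysql.sock", protocol="unix")
```

Expanding the tilde in the caller avoids relying on the shell, which may leave `~` untouched when it appears after the `=` in a flag.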
  • 26. Check exporters status 26 db1, in the terminal: curl http://localhost:9100/metrics curl http://localhost:9104/metrics curl http://localhost:9105/metrics db2, via web browser: http://192.168.56.202:9100/metrics http://192.168.56.202:9104/metrics http://192.168.56.202:9105/metrics Prometheus endpoints status: http://192.168.56.201:9090/status
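The `/metrics` endpoints checked above return plain text, one sample per line, with `# HELP` and `# TYPE` comment lines in between. The following is a simplified Python sketch of reading that output; `parse_metrics` is a hypothetical helper, the sample text is made up, and label sets are kept as raw strings rather than parsed.

```python
import re

# One sample line: name, optional {labels}, value, optional timestamp.
SAMPLE_LINE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)'  # metric name
    r'(\{.*\})?'                    # optional raw label set
    r'\s+(\S+)'                     # sample value
    r'(?:\s+-?\d+)?$'               # optional timestamp
)


def parse_metrics(text):
    """Map (metric name, raw label string) to a float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match:
            name, labels, value = match.groups()
            samples[(name, labels or "")] = float(value)
    return samples


# Made-up example of what an exporter might return.
EXAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_filesystem_size{device="/dev/sda1",fstype="xfs"} 1.0e+10
"""

parsed = parse_metrics(EXAMPLE)
```

In practice the text would come from the `curl` calls above; the parsing step is the same.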
  • 27. Prometheus targets 27 At this point, you should see a picture like this
  • 30. MySQL graphing capabilities 30 Let’s generate some MySQL activity by running OLTP test with sysbench: ./sysbench.sh Observe MySQL dashboards
  • 31. MongoDB Dashboards 31 cp -r /root/grafana_mongodb_dashboards/dashboards/* /var/lib/grafana/dashboards/ Restart grafana (systemctl restart grafana-server.service)
  • 32. MongoDB graphing capabilities - Before 32 1. Beginning on ‘dcu/mongodb_exporter’ 2. Server Status output ‘db.serverStatus()’ 1. Uptime 2. Asserts 3. Durability 4. BackgroundFlushing 5. Connections 6. ExtraInfo 7. GlobalLock 8. IndexCounter 9. Locks 10.Network 11.Opcounters 12.OpcountersRepl 13.Memory 14.Metrics 15.Cursors
  • 33. MongoDB graphing capabilities - After 33 1. Server Status output ‘db.serverStatus()’ 1. Uptime <trimmed> 15. Cursors 2. Replica Set Status Output ‘rs.status()’ 1. Replica Set State 2. Replica Set Optime 3. Replica Set Node-to-Node Ping 4. Replica Set Elections 3. Replica Set Oplog Info 1. Oplog head/tail timestamp 2. Oplog size bytes 3. Oplog item count
  • 34. MongoDB graphing capabilities - After 34 4. Sharding Info (mongos) 1. Balancer Locks and Lock Updates 2. Is Cluster Balanced? 3. # of Shards, DBs, Collections, Chunks 4. # of Mongos processes 5. # of Balancer, Split and Sharding events 5. WiredTiger storage-engine (experimental) 6. Cache Usage 7. Block Usage 8. Transactions 9. Etc
  • 35. MongoDB graphing capabilities - After 35 1. Server Status output ‘db.serverStatus()’ 1. Uptime <trimmed> 15. Cursors 2. Replica Set Status Output ‘rs.status()’ 1. Replica Set State 2. Replica Set Optime 3. Replica Set Node-to-Node Ping 4. Replica Set Elections 3. Replica Set Oplog Info 1. Oplog head/tail timestamp 2. Oplog size bytes 3. Oplog item count
  • 36. MongoDB Exporter Metric Summary 36 Per-collection Summary: 1. 60 x DB-level MongoDB metrics on ‘mongos’ nodes w/1-shard • +~5-8 metrics per shard added 2. 157 x DB-level MongoDB metrics on ‘mongod’ replica set nodes w/2 x members • +~5-8 metrics per shard added 3. 676 x OS-level metrics on recent Linux 3.x+ Total metrics: 893+ per Collection (at minimum)! Total MongoDB MMS metrics: “400 per ping packet” Reference: http://www.slideshare.net/mongodb/using-the-mongodb-monitoring-service-mms Per-collection size: • Raw: 35 KB Mongod Replset w/1-node, 17 KB Mongos w/1-shard, 91 KB Linux node_exporter • Estimated Snappy compression (used in LevelDB) is about 80% Recommended fetch interval: • 5 sec if possible and there is enough disk space (possibly less?) • 10 sec (default) if not
  • 37. Prometheus Metric Grouping with Labels 37 • Metrics level labels vs Target level labels • Target-level labels can combine multiple exporters together Mongo Node <- Grafana Templating
  • 42. New WiredTiger Metrics and the Future 42 WiredTiger Supported: • Cache • BlockManager • Transaction • ConcurrentTransaction • Log (coming soon!) Future Metrics: • PerconaFT engine metrics • RocksDB engine metrics • Profiler metrics
  • 43. Making a Go-based Prometheus Exporter 43 Overall Steps: 1. Metric definition: 2. Function to “collect” the data (most of the logic):
  • 44. Making a Go-based Prometheus Exporter 44 Overall Steps: 3. Function to “export” the data: 4. Function to “describe” the data:
  • 45. Making a Go-based Prometheus Exporter 45 • Tips / Advice • Always try to use incremented total values • Everything is a float64 - store what provides value • Do “math” operations on values in Grafana • Vector labels are for high-cardinality, be conservative • Not everything needs to be a graph / Prometheus query interface is powerful
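The advice to export incremented totals and do the math at query time can be illustrated with a short sketch. The Python code below (not Go, and not the exporter's own code) derives a per-second rate from raw counter samples and tolerates a counter reset. It is a simplification: the real Prometheus `rate()` function also extrapolates to the window boundaries. The function names and sample values are hypothetical.

```python
def counter_increase(samples):
    """Total increase across (timestamp, value) samples, tolerating resets."""
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # A drop means the process restarted and the counter began again
            # from zero, so the whole current value counts as new increase.
            increase += cur
    return increase


def per_second_rate(samples):
    """Average per-second rate between the first and last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("need at least two samples with increasing timestamps")
    return counter_increase(samples) / elapsed


# Hypothetical scrapes of a total counter every 10 seconds; the exporter
# restarts between t=10 and t=20, so the counter drops from 150 to 20.
scrapes = [(0, 100.0), (10, 150.0), (20, 20.0), (30, 50.0)]
total = counter_increase(scrapes)
rate = per_second_rate(scrapes)
```

Had the exporter published a precomputed rate instead, the activity between scrapes and across the restart could not be recovered afterwards.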
  • 46. Alerting with Prometheus 46 Alerting with Prometheus is separated into two parts. Alerting rules in Prometheus servers send alerts to an Alertmanager. The Alertmanager then manages those alerts, including silencing, inhibition, aggregation and sending out notifications via methods such as email, PagerDuty, HipChat, Slack, Pushover. The main steps for setting up alerting and notifications are: • Create alerting rules in Prometheus • Set up and configure the Alertmanager • Configure Prometheus to talk to the Alertmanager with the -alertmanager.url flag
  • 47. Prometheus alerts 47 ALERT ExporterDown IF up == 0 FOR 1m LABELS { severity = "page" } ANNOTATIONS { summary = "{{$labels.alias}}: exporter down", description = "Exporter on job '{{$labels.job}}' is not responding" } ALERT SystemMemory IF round((node_memory_MemAvailable OR (node_memory_MemFree + node_memory_Buffers + node_memory_Cached)) / node_memory_MemTotal * 100) < 5 FOR 1m LABELS { severity = "page" } ANNOTATIONS { summary = "{{$labels.alias}}: low memory", description = "Free {{$value}}% of memory" }
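The arithmetic behind the SystemMemory expression can be worked through in a few lines. This Python sketch mirrors the `OR` fallback (use `MemAvailable` when the kernel reports it, otherwise `MemFree + Buffers + Cached`) and the `< 5` threshold. It does not model the `FOR 1m` hold period. The function names and the numbers are made up for illustration.

```python
import math


def available_memory_percent(mem_total, mem_free, buffers, cached, mem_available=None):
    """Rounded percentage of memory considered available."""
    if mem_available is not None:
        available = mem_available
    else:
        # Fallback for kernels that do not expose MemAvailable.
        available = mem_free + buffers + cached
    # Prometheus round() resolves ties by rounding up; Python's built-in
    # round() does not, so floor(x + 0.5) is used instead.
    return math.floor(available / mem_total * 100 + 0.5)


def low_memory(percent, threshold=5):
    """True when the alert condition (percent < threshold) holds."""
    return percent < threshold


# Hypothetical host without MemAvailable: 300 of 8000 units are available.
pct_old_kernel = available_memory_percent(8000, 100, 50, 150)

# Hypothetical host that reports MemAvailable directly.
pct_new_kernel = available_memory_percent(8000, 100, 50, 150, mem_available=1000)
```

With these numbers the first host evaluates to 4% and satisfies the condition, while the second evaluates to 13% and does not.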
  • 48. Configuring alerts in Prometheus 48 Let’s review alert definitions prepared for this tutorial: cat alerting.rules Include alerting rules into prometheus.yml: rule_files: - alerting.rules Reload prometheus: kill -HUP `pidof prometheus`
  • 49. Alerts in Prometheus web UI 49
  • 50. Using Alertmanager 50 Let’s review Alertmanager config prepared for this tutorial: cat alertmanager.yml Edit it with the appropriate email addresses for testing.
  • 51. Start Alertmanager 51 Extract binaries: tar zxf alertmanager-0.1.1.linux-amd64.tar.gz Start Alertmanager: ./start.sh alertmanager Uncomment ALERTMANAGER line in start.sh Restart Prometheus: kill `pidof prometheus` ./start.sh prometheus
  • 52. Alertmanager web UI 52 Go to Alertmanager web interface http://192.168.56.201:9093
  • 53. Alert paged by email 53
  • 54. Prometheus recording rules 54 Let’s review recording rules prepared for this tutorial: cat recording.rules Include recording rules into prometheus.yml: rule_files: - recording.rules Reload prometheus: kill -HUP `pidof prometheus`
  • 55. Query for newly created metrics 55
  • 56. Working with Prometheus HTTP API 56 Instant and range queries, at a single point in time or range: curl -sg 'http://localhost:9090/api/v1/query?query=up{job="mysql"}' | python -m json.tool curl -sg 'http://localhost:9090/api/v1/query?query=ALERTS{alertstate="firing"}' | python -m json.tool curl -sg "http://localhost:9090/api/v1/query_range?query=node_load1&start=`expr $(date +%s) - 3600`&end=`date +%s`&step=5m" | python -m json.tool Label values across the whole DB: curl http://localhost:9090/api/v1/label/alias/values List of series matching the expression: curl -sg 'http://localhost:9090/api/v1/series?match[]=node_filesystem_size{fstype!~"rootfs|selinuxfs|autofs|rpc_pipefs|tmpfs"}' | python -m json.tool Delete series: curl -g -X DELETE 'http://localhost:9090/api/v1/series?match[]={alias="db2"}'
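The range query above can also be assembled and consumed from a script. The Python sketch below builds the same `query_range` URL for the last hour and unpacks a matrix response into numeric pairs. The helper names are hypothetical and the response body is a hand-written illustration of the `status` / `data` / `result` layout, not output captured from the tutorial VMs.

```python
import json
import time
from urllib.parse import urlencode


def query_range_url(base, expr, seconds_back, step, now=None):
    """Build a /api/v1/query_range URL covering the last `seconds_back` seconds."""
    end = int(time.time() if now is None else now)
    params = {"query": expr, "start": end - seconds_back, "end": end, "step": step}
    return f"{base}/api/v1/query_range?{urlencode(params)}"


def matrix_values(body):
    """Turn a matrix response into {sorted label tuple: [(timestamp, value), ...]}."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError("query failed: %s" % payload.get("error"))
    series_map = {}
    for series in payload["data"]["result"]:
        key = tuple(sorted(series["metric"].items()))
        # Sample values arrive as strings and are converted to floats here.
        series_map[key] = [(float(ts), float(v)) for ts, v in series["values"]]
    return series_map


# Illustrative response body, written by hand for this sketch.
EXAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [{
            "metric": {"__name__": "node_load1", "alias": "db1"},
            "values": [[1000, "0.5"], [1300, "0.7"]],
        }],
    },
})

url = query_range_url("http://localhost:9090", "node_load1", 3600, "5m", now=7200)
values = matrix_values(EXAMPLE_BODY)
```

Passing `now` explicitly keeps the URL reproducible; leaving it out uses the current time, as the `date +%s` calls do in the curl version.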
  • 57. InfluxDB overview 57 InfluxDB is an open source time series database. It's useful for recording metrics, events, and performing analytics. Web interface http://192.168.56.201:8083 Why InfluxDB? • Currently, one of a few available remote storage options for Prometheus to use as a long-term solution • Multiple retention policies • Easy to use • Grafana support • Clustering
  • 58. Configure Prometheus with InfluxDB 58 Create prometheus db in InfluxDB: influx create database prometheus; Uncomment INFLUXDB line in start.sh Restart Prometheus: kill `pidof prometheus` ./start.sh prometheus Load continuous queries to downsample data: python grafana-dashboards/influxdb_cq.py
  • 59. Using InfluxDB 59 Browse data: influx use prometheus; show measurements; show continuous queries; select * from node_load1; use trending; show retention policies on trending; select * from trending."5m".node_load1; show shards;
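The `trending."5m"` data browsed above is a downsampled copy of the raw series. The continuous queries themselves live in `influxdb_cq.py` and are not shown here, so the following Python sketch only illustrates the general idea under an assumption: that each 5-minute bucket keeps the mean of its raw points. The function name and sample points are hypothetical.

```python
from collections import defaultdict


def downsample_mean(points, bucket_seconds=300):
    """Group (timestamp, value) points into fixed buckets and average each one."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical raw node_load1 points, timestamps in seconds.
raw = [(0, 1.0), (100, 3.0), (300, 5.0), (599, 7.0), (600, 9.0)]
five_minute = downsample_mean(raw)
```

Five raw points collapse into three here, one per 5-minute bucket, which is the trade of resolution for storage that makes long-term retention practical.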
  • 60. Add InfluxDB datasource to Grafana 60
  • 61. What’s next? 61 • Grafana 3.0 release: pie charts, more functionality, improved Prometheus datasource? • More long-term storage options for Prometheus • Alertmanager production-ready status? • InfluxDB or not InfluxDB?