SlideShare une entreprise Scribd logo
1  sur  33
Télécharger pour lire hors ligne
Build Full Stack
Monitoring and Notification 

with Prometheus
1
Jazz Yao-Tsung Wang
Initiator of Taiwan Data Engineering Association
Co-Founder of Taiwan Hadoop User Group
Shared at 2018-02-10 <TDEA Workshop 2018 Q1>
Hello!
I am Jazz Wang
Co-Founder of Hadoop.TW
Initiator of Taiwan Data Engineering Association (TDEA)
Hadoop Evangelist since 2008.
Open Source Promoter. System Admin (Ops).
- 11 years (2002/08 ~ 2014/02) Researcher in HPC field.
- 2 years (2014/03 ~ 2016/04) Assistant Vice President (AVP),
Product Management of ‘Big Data Platform Management Product’
- 1.8 years (2016/04 ~ Now) Data Architect of Real-Time Bidding
You can find me at @jazzwang_tw or

https://fb.com/groups/dataengineering.tw 

https://slideshare.net/jazzwang
2
1.
/ /
Why do I need Full Stack Monitoring and Notification ?
Let’s start with Jazz’s Jobs / Pains / Gains
3
AWS
Hybrid ….
4
VM
Azure
GCP
5
NetAdmin
Research
Developer
Security
Cloud Ops
SysAdmin
Data Engineer
6
NetAdmin
Research
Developer
Security
Cacti
NewRelic 

Server
OpsCenter
Kafka Manager
NewRelic 

Synthetic / APM
Status Cake
++ ++ DataDog
Pain
▷ Data Fragments
▷
▷
▷ Data Retention
▷ 7
▷ Black Box
▷ (Metrics)
▷ Metrics
▷ Vendor Lock-in
▷
7
Gain —
▷ Centralized Time-serious Database
▷
▷ Support Alert Notification
▷ Slack, E-mail, SMS …
▷ Self-defined Data Retention Rate
▷
▷ White Box
▷ Metrics = (Metrics)
▷ Self-defined Dashboard
▷ Ex. Data Pipeline
8
( ) …. Inspired by Outlier …
https://www.outlyer.com/
~~ ~~
9
2.
/ /
Introduction to Prometheus Ecosystem
Features / Pain Relievers / Gain Creators
10
11
Concepts
Common Building Blocks
12
Target
Collector
Exporter
Time-Series
Database
Rule
Dashboard
Alert Message
Collector
Exporter
Exporter
Dashboard
Dashboard
TargetTarget
Rule
Rule
Alert Message
Annotation
Push
Pull
Ranking of Time Series DBMS
13https://db-engines.com/en/ranking/time+series+dbms
Comparison of Common Monitor and Notification System
14
Target / Exporter DBMS
Dashboard
Alert
snmpd
Pull
Cacti — Device
( snmpwalk )
RRDTool Cacti — Graph Plugin*
gmond
Pull
Ganglia
gmetad
RRDTool Ganglia Nagios
newrelic-agent
Push (?) NewRelic ?? NewRelic NewRelic Alert
statsD
Push Carbon / whisper Graphite Grafana Grafana
Telegraf
Push Telegraf InfluxDB Grafana Grafana
Pull
Push*
snmp_expoter
node_exporter
jmx_exporter …
Prometheus Grafana AlertManager
15
About Prometheus
▷ https://prometheus.io/
▷ 2012 11 SoundCloud
▷ Go Apache 2.0
▷ 2016 Cloud Native Computing Foundation

Kubernates K8S Prometheus
▷ v1.0.0 / 2016-07-18 v2.0.0 / 2017-11-08
▷ PromQL
▷ Grafana
▷ AlertManager
▷ v2.0
16
Components of Prometheus
Push
Pull
Query
Comparison of Time-Series DBMS
17
Prometheus
HA
Prometheus
Data Model
Client Libraries
18
▷ Official Prometheus client library
▷ Go
▷ Java or Scala
▷ Python
▷ Ruby
▷ Unofficial 3rd-party client library
▷ Bash
▷ C++
▷ Common Lisp
▷ Elixir
▷ Erlang
▷ Haskell
▷ Lua for Nginx
▷ Lua for Tarantool
▷ .NET / C#
▷ Node.js
▷ PHP
▷ Rust
19
3.
Docker Compose
Full Stack
Show me the source code!!
○ https://github.com/jazzwang/prometheus-labs
○ Docker Compose
○
20
— Data Pipeline
21
in_dummy Fluentd out_kafka
Kafka
in_kafka_group Fluentd
out_file
Network Layer
▷ snmp_exporter
○ https://github.com/prometheus/snmp_exporter
○ snmp Metrics
○ MIB OID
○ 

snmp_exporter generator
snmp.yml
▷ blackbox_exporter
○ https://github.com/prometheus/blackbox_exporter
○ HTTP, HTTPS, DNS, TCP ICMP
○ 

Web Service SSH DNS
Ping blackbox_exporter
22
System Layer
▷ node_exporter
○ https://github.com/prometheus/node_exporter
○ OS Level Metrics
23
Middleware Layer
▷ jmx_exporter
○ https://github.com/prometheus/jmx_exporter
○ Java YAML
Prometheus Metrics
○
■ Apache Kafka
■ Apache Cassandra
■ Apache Flink
■ Apache Spark
■ Apache Tomcat
■ Apache ZooKeeper
■ Apache ActiveMQ Artemis 2.x
■ WebLogic
■ WildFly 10
24
Kafka
▷ `jmx_exporter` Kafka Cassandra
○ Docker - https://github.com/RobustPerception/docker_examples
▷ kafka_topic_exporter
○ Java Jetty
○ https://github.com/ogibayashi/kafka-topic-exporter
▷ kafka_zookeeper_exporter
○ ZK topic_partition
○ https://github.com/cloudflare/kafka_zookeeper_exporter
▷ prometheus-kafka-consumer-group-exporter
○ Python Metrics consumer_group_offset topic_highwater
Lag
○ https://github.com/braedon/prometheus-kafka-consumer-group-exporter
▷ burrow_exporter
○ LinkedIn Kafka Lag Burrow (Go ,
sliding window )
○ https://github.com/jirwin/burrow_exporter
25
Kafka
▷ kafka-consumer-group-exporter
○ Go kafka-consumer-groups.sh
○ https://github.com/kawamuray/prometheus-kafka-consumer-group-
exporter
▷ kafka-prometheus-exporter
○ Go consumergoup_lag metrics
○ Kafka 0.8 (ZK)
○ https://github.com/ogibayashi/kafka-topic-exporter
▷ kafka_zookeeper_exporter
○ Go Metrics
○ Kafka 0.9 (KF)
○ https://github.com/danielqsj/kafka_exporter
26
Fluentd
▷ fluent-agent-lite_exporter
○ Tagamoris fluent-agent-lite [1]
○ https://github.com/matsumana/fluent-agent-lite_exporter
○ [1] https://github.com/tagomoris/fluent-agent-lite
▷ fluent-plugin-prometheus
○ fluentd → monitor_agent → fluent-plugin-prometheus
○ http://prometheus:9090/metrics → `fluent-plugin-prometheus` → fluentd
○ https://github.com/fluent/fluent-plugin-prometheus
▷ fluentd_exporter
○ Release,
○ https://github.com/wyukawa/fluentd_exporter
▷ fluentd_exporter
○ http://fluentd:9224/metrics → `fluentd_exporter` (by V3ckt0r) → prometheus
○ https://github.com/wyukawa/fluentd_exporter
27
Application Layer
28
▷ https://prometheus.io/docs/instrumenting/clientlibs/
Application Layer
29
▷ http://metrics.dropwizard.io/4.0.0/
30
4.
Lesson Learned
Lesson Learned
▷ Lesson #1



Prometheus 

▷ Lesson #2





Metrics exporter 

○ exporter

https://prometheus.io/docs/instrumenting/exporters/
○ Port

https://github.com/prometheus/prometheus/wiki/Default-port-allocations
○ exporter Metrics
31
Lesson Learned
▷
○ github
○ exporter Metrics
○ http://prometheus:9090/graph
○ Grafana Dashboard
○ Grafana Alert
32
33
Thanks!
Any questions?
You can find me at @jazzwang_tw or

https://fb.com/groups/dataengineering.tw 

https://slideshare.net/jazzwang
https://github.com/jazzwang
Github *^__^*

Contenu connexe

Tendances

Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...
Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...
Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...Sergey Lukjanov
 
High Performance Python on Apache Spark
High Performance Python on Apache SparkHigh Performance Python on Apache Spark
High Performance Python on Apache SparkWes McKinney
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...Spark Summit
 
A Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and AnalyticsA Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and AnalyticsDataWorks Summit
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksSlim Baltagi
 
Spark,Hadoop,Presto Comparition
Spark,Hadoop,Presto ComparitionSpark,Hadoop,Presto Comparition
Spark,Hadoop,Presto ComparitionSandish Kumar H N
 
Sparkler - Spark Crawler
Sparkler - Spark Crawler Sparkler - Spark Crawler
Sparkler - Spark Crawler Thamme Gowda
 
Accelerating Hive with Alluxio on S3
Accelerating Hive with Alluxio on S3Accelerating Hive with Alluxio on S3
Accelerating Hive with Alluxio on S3Alluxio, Inc.
 
Hadoop summit 2010 frameworks panel elephant bird
Hadoop summit 2010 frameworks panel elephant birdHadoop summit 2010 frameworks panel elephant bird
Hadoop summit 2010 frameworks panel elephant birdKevin Weil
 
Hadoop in the Cloud: Real World Lessons from Enterprise Customers
Hadoop in the Cloud: Real World Lessons from Enterprise CustomersHadoop in the Cloud: Real World Lessons from Enterprise Customers
Hadoop in the Cloud: Real World Lessons from Enterprise CustomersDataWorks Summit/Hadoop Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Teradata Partners Conference Oct 2014 Big Data Anti-Patterns
Teradata Partners Conference Oct 2014   Big Data Anti-PatternsTeradata Partners Conference Oct 2014   Big Data Anti-Patterns
Teradata Partners Conference Oct 2014 Big Data Anti-PatternsDouglas Moore
 

Tendances (20)

Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...
Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...
Atlanta OpenStack Summit: Technical Deep Dive: Big Data Computations Using El...
 
High Performance Python on Apache Spark
High Performance Python on Apache SparkHigh Performance Python on Apache Spark
High Performance Python on Apache Spark
 
Empower Data-Driven Organizations
Empower Data-Driven OrganizationsEmpower Data-Driven Organizations
Empower Data-Driven Organizations
 
Cassandra: Now and the Future @ Yahoo! JAPAN
Cassandra: Now and the Future @ Yahoo! JAPANCassandra: Now and the Future @ Yahoo! JAPAN
Cassandra: Now and the Future @ Yahoo! JAPAN
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
Treasure Data and Fluentd
Treasure Data and FluentdTreasure Data and Fluentd
Treasure Data and Fluentd
 
Powering a Virtual Power Station with Big Data
Powering a Virtual Power Station with Big DataPowering a Virtual Power Station with Big Data
Powering a Virtual Power Station with Big Data
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
 
A Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and AnalyticsA Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
 
Spark,Hadoop,Presto Comparition
Spark,Hadoop,Presto ComparitionSpark,Hadoop,Presto Comparition
Spark,Hadoop,Presto Comparition
 
Big Data A La Carte Menu
Big Data A La Carte MenuBig Data A La Carte Menu
Big Data A La Carte Menu
 
Sparkler - Spark Crawler
Sparkler - Spark Crawler Sparkler - Spark Crawler
Sparkler - Spark Crawler
 
Accelerating Hive with Alluxio on S3
Accelerating Hive with Alluxio on S3Accelerating Hive with Alluxio on S3
Accelerating Hive with Alluxio on S3
 
hotdog a TD tool for DD
hotdog a TD tool for DDhotdog a TD tool for DD
hotdog a TD tool for DD
 
Hadoop summit 2010 frameworks panel elephant bird
Hadoop summit 2010 frameworks panel elephant birdHadoop summit 2010 frameworks panel elephant bird
Hadoop summit 2010 frameworks panel elephant bird
 
Hadoop in the Cloud: Real World Lessons from Enterprise Customers
Hadoop in the Cloud: Real World Lessons from Enterprise CustomersHadoop in the Cloud: Real World Lessons from Enterprise Customers
Hadoop in the Cloud: Real World Lessons from Enterprise Customers
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Teradata Partners Conference Oct 2014 Big Data Anti-Patterns
Teradata Partners Conference Oct 2014   Big Data Anti-PatternsTeradata Partners Conference Oct 2014   Big Data Anti-Patterns
Teradata Partners Conference Oct 2014 Big Data Anti-Patterns
 

Similaire à Full Stack Monitoring with Prometheus and Grafana

Presto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop MeetupPresto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop MeetupWojciech Biela
 
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31Timothy Spann
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Wojciech Biela
 
Real time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrReal time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrTimothy Spann
 
Using apache mx net in production deep learning streaming pipelines
Using apache mx net in production deep learning streaming pipelinesUsing apache mx net in production deep learning streaming pipelines
Using apache mx net in production deep learning streaming pipelinesTimothy Spann
 
ApacheCon 2021: Apache NiFi 101- introduction and best practices
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practicesTimothy Spann
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-PipelinesTimothy Spann
 
Samsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of PythonSamsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of PythonInsuk (Chris) Cho
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101Timothy Spann
 
ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300Timothy Spann
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsTimothy Spann
 
Apache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open SourceApache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open SourceTimothy Spann
 
Monitoring Kafka w/ Prometheus
Monitoring Kafka w/ PrometheusMonitoring Kafka w/ Prometheus
Monitoring Kafka w/ Prometheuskawamuray
 
BigDataFest Building Modern Data Streaming Apps
BigDataFest  Building Modern Data Streaming AppsBigDataFest  Building Modern Data Streaming Apps
BigDataFest Building Modern Data Streaming Appsssuser73434e
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flinkdatamantra
 
Cytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis ToolsCytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis ToolsKeiichiro Ono
 

Similaire à Full Stack Monitoring with Prometheus and Grafana (20)

Presto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop MeetupPresto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop Meetup
 
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
 
Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.Presto - Analytical Database. Overview and use cases.
Presto - Analytical Database. Overview and use cases.
 
Apache Deep Learning 201
Apache Deep Learning 201Apache Deep Learning 201
Apache Deep Learning 201
 
Real time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrReal time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solr
 
Using apache mx net in production deep learning streaming pipelines
Using apache mx net in production deep learning streaming pipelinesUsing apache mx net in production deep learning streaming pipelines
Using apache mx net in production deep learning streaming pipelines
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
ApacheCon 2021: Apache NiFi 101- introduction and best practices
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practices
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines
 
Samsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of PythonSamsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of Python
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
 
ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
Apache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open SourceApache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open Source
 
Monitoring Kafka w/ Prometheus
Monitoring Kafka w/ PrometheusMonitoring Kafka w/ Prometheus
Monitoring Kafka w/ Prometheus
 
BigDataFest Building Modern Data Streaming Apps
BigDataFest  Building Modern Data Streaming AppsBigDataFest  Building Modern Data Streaming Apps
BigDataFest Building Modern Data Streaming Apps
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
Cytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis ToolsCytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis Tools
 
解讀雲端大數據新趨勢
解讀雲端大數據新趨勢解讀雲端大數據新趨勢
解讀雲端大數據新趨勢
 
FLiP Into Trino
FLiP Into TrinoFLiP Into Trino
FLiP Into Trino
 

Dernier

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 

Dernier (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Full Stack Monitoring with Prometheus and Grafana

  • 1. Build Full Stack Monitoring and Notification 
 with Prometheus 1 Jazz Yao-Tsung Wang Initiator of Taiwan Data Engineering Association Co-Founder of Taiwan Hadoop User Group Shared at 2018-02-10 <TDEA Workshop 2018 Q1>
  • 2. Hello! I am Jazz Wang Co-Founder of Hadoop.TW Initiator of Taiwan Data Engineering Association (TDEA) Hadoop Evangelist since 2008. Open Source Promoter. System Admin (Ops). - 11 years (2002/08 ~ 2014/02) Researcher in HPC field. - 2 years (2014/03 ~ 2016/04) Assistant Vice President (AVP), Product Management of ‘Big Data Platform Management Product’ - 1.8 years (2016/04 ~ Now) Data Architect of Real-Time Bidding You can find me at @jazzwang_tw or
 https://fb.com/groups/dataengineering.tw 
 https://slideshare.net/jazzwang 2
  • 3. 1. / / Why do I need Full Stack Monitoring and Notification ? Let’s start with Jazz’s Jobs / Pains / Gains 3
  • 7. Pain ▷ Data Fragments ▷ ▷ ▷ Data Retention ▷ 7 ▷ Black Box ▷ (Metrics) ▷ Metrics ▷ Vendor Lock-in ▷ 7
  • 8. Gain — ▷ Centralized Time-serious Database ▷ ▷ Support Alert Notification ▷ Slack, E-mail, SMS … ▷ Self-defined Data Retention Rate ▷ ▷ White Box ▷ Metrics = (Metrics) ▷ Self-defined Dashboard ▷ Ex. Data Pipeline 8
  • 9. ( ) …. Inspired by Outlier … https://www.outlyer.com/ ~~ ~~ 9
  • 10. 2. / / Introduction to Prometheus Ecosystem Features / Pain Relievers / Gain Creators 10
  • 12. Common Building Blocks 12 Target Collector Exporter Time-Series Database Rule Dashboard Alert Message Collector Exporter Exporter Dashboard Dashboard TargetTarget Rule Rule Alert Message Annotation Push Pull
  • 13. Ranking of Time Series DBMS 13https://db-engines.com/en/ranking/time+series+dbms
  • 14. Comparison of Common Monitor and Notification System 14 Target / Exporter DBMS Dashboard Alert snmpd Pull Cacti — Device ( snmpwalk ) RRDTool Cacti — Graph Plugin* gmond Pull Ganglia gmetad RRDTool Ganglia Nagios newrelic-agent Push (?) NewRelic ?? NewRelic NewRelic Alert statsD Push Carbon / whisper Graphite Grafana Grafana Telegraf Push Telegraf InfluxDB Grafana Grafana Pull Push* snmp_expoter node_exporter jmx_exporter … Prometheus Grafana AlertManager
  • 15. 15 About Prometheus ▷ https://prometheus.io/ ▷ 2012 11 SoundCloud ▷ Go Apache 2.0 ▷ 2016 Cloud Native Computing Foundation
 Kubernates K8S Prometheus ▷ v1.0.0 / 2016-07-18 v2.0.0 / 2017-11-08 ▷ PromQL ▷ Grafana ▷ AlertManager ▷ v2.0
  • 17. Comparison of Time-Series DBMS 17 Prometheus HA Prometheus Data Model
  • 18. Client Libraries 18 ▷ Official Prometheus client library ▷ Go ▷ Java or Scala ▷ Python ▷ Ruby ▷ Unofficial 3rd-party client library ▷ Bash ▷ C++ ▷ Common Lisp ▷ Elixir ▷ Erlang ▷ Haskell ▷ Lua for Nginx ▷ Lua for Tarantool ▷ .NET / C# ▷ Node.js ▷ PHP ▷ Rust
  • 20. Show me the source code!! ○ https://github.com/jazzwang/prometheus-labs ○ Docker Compose ○ 20
  • 21. — Data Pipeline 21 in_dummy Fluentd out_kafka Kafka in_kafka_group Fluentd out_file
  • 22. Network Layer ▷ snmp_exporter ○ https://github.com/prometheus/snmp_exporter ○ snmp Metrics ○ MIB OID ○ 
 snmp_exporter generator snmp.yml ▷ blackbox_exporter ○ https://github.com/prometheus/blackbox_exporter ○ HTTP, HTTPS, DNS, TCP ICMP ○ 
 Web Service SSH DNS Ping blackbox_exporter 22
  • 23. System Layer ▷ node_exporter ○ https://github.com/prometheus/node_exporter ○ OS Level Metrics 23
  • 24. Middleware Layer ▷ jmx_exporter ○ https://github.com/prometheus/jmx_exporter ○ Java YAML Prometheus Metrics ○ ■ Apache Kafka ■ Apache Cassandra ■ Apache Flink ■ Apache Spark ■ Apache Tomcat ■ Apache ZooKeeper ■ Apache ActiveMQ Artemis 2.x ■ WebLogic ■ WildFly 10 24
  • 25. Kafka ▷ `jmx_exporter` Kafka Cassandra ○ Docker - https://github.com/RobustPerception/docker_examples ▷ kafka_topic_exporter ○ Java Jetty ○ https://github.com/ogibayashi/kafka-topic-exporter ▷ kafka_zookeeper_exporter ○ ZK topic_partition ○ https://github.com/cloudflare/kafka_zookeeper_exporter ▷ prometheus-kafka-consumer-group-exporter ○ Python Metrics consumer_group_offset topic_highwater Lag ○ https://github.com/braedon/prometheus-kafka-consumer-group-exporter ▷ burrow_exporter ○ LinkedIn Kafka Lag Burrow (Go , sliding window ) ○ https://github.com/jirwin/burrow_exporter 25
  • 26. Kafka ▷ kafka-consumer-group-exporter ○ Go kafka-consumer-groups.sh ○ https://github.com/kawamuray/prometheus-kafka-consumer-group- exporter ▷ kafka-prometheus-exporter ○ Go consumergoup_lag metrics ○ Kafka 0.8 (ZK) ○ https://github.com/ogibayashi/kafka-topic-exporter ▷ kafka_zookeeper_exporter ○ Go Metrics ○ Kafka 0.9 (KF) ○ https://github.com/danielqsj/kafka_exporter 26
  • 27. Fluentd ▷ fluent-agent-lite_exporter ○ Tagamoris fluent-agent-lite [1] ○ https://github.com/matsumana/fluent-agent-lite_exporter ○ [1] https://github.com/tagomoris/fluent-agent-lite ▷ fluent-plugin-prometheus ○ fluentd → monitor_agent → fluent-plugin-prometheus ○ http://prometheus:9090/metrics → `fluent-plugin-prometheus` → fluentd ○ https://github.com/fluent/fluent-plugin-prometheus ▷ fluentd_exporter ○ Release, ○ https://github.com/wyukawa/fluentd_exporter ▷ fluentd_exporter ○ http://fluentd:9224/metrics → `fluentd_exporter` (by V3ckt0r) → prometheus ○ https://github.com/wyukawa/fluentd_exporter 27
  • 31. Lesson Learned ▷ Lesson #1
 
 Prometheus 
 ▷ Lesson #2
 
 
 Metrics exporter 
 ○ exporter
 https://prometheus.io/docs/instrumenting/exporters/ ○ Port
 https://github.com/prometheus/prometheus/wiki/Default-port-allocations ○ exporter Metrics 31
  • 32. Lesson Learned ▷ ○ github ○ exporter Metrics ○ http://prometheus:9090/graph ○ Grafana Dashboard ○ Grafana Alert 32
  • 33. 33 Thanks! Any questions? You can find me at @jazzwang_tw or
 https://fb.com/groups/dataengineering.tw 
 https://slideshare.net/jazzwang https://github.com/jazzwang Github *^__^*