SlideShare a Scribd company logo
1 of 49
Download to read offline
Performance Monitoring
for Docker environments
Monitoring Docker
Anomaly detection
Live demo
About me
@coscale
www.coscale.com
@spolfliet
stijn.polfliet@coscale.com
• Scale & dynamic behavior:
Number of containers >> number of servers
Containers come and go at a much faster pace
Container monitoring challenges
• Diversity
Different application technologies
Overload of metrics to monitor and alert on
Monolithic application monitoring
(Virtualized) OS
Application
End user
System / Infrastructure monitoring
Application performance monitoring (APM)
Real user monitoring (RUM)
Microservices monitoring
(Virtualized) OS
End user
System / Infrastructure monitoring
Container monitoring +
In-container application monitoring
Real user monitoring (RUM)
Container
Application
component
Container
Application
component
Container
Application
component
Hosts (CPU, memory, disk)
Orchestrator (services, volumes, replication controllers, …)
Containers (cpu, memory, disk, network, ...)
Container internals (application, database, caching, etc.)
Impact on user and application performance
What to monitor?
Lightweight monitoring for lightweight microservices environment
Docker stats API
$ docker stats
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
1285939c1fd3 0.07% 796 KiB / 64 MiB 1.21% 788 B / 648 B 3.568 MB / 512 KB
9c76f7834ae2 0.07% 2.746 MiB / 64 MiB 4.29% 1.266 KB / 648 B 12.4 MB / 0 B
d1ea048f04e4 0.03% 4.583 MiB / 64 MiB 6.30% 2.854 KB / 648 B 27.7 MB / 0 B
Docker API
docker run 
--volume=/:/rootfs:ro 
--volume=/var/run:/var/run:rw 
--volume=/sys:/sys:ro 
--volume=/var/lib/docker/:/var/lib/docker:ro 
--publish=8080:8080 
--detach=true 
--name=cadvisor 
google/cadvisor:latest
open http://<your-hostname>:8080/
CAdvisor
agent runs in 1 container or on host
container resource usage
basic application monitoring
15$ / month / server
Datadog
datadoghq.com
kernel module captures system calls
container resource usage
basic application monitoring
Sysdig
sysdig.com
Heavyweight, deep application monitoring
Designed for monolithic application in specific programming language
Too many dynamic metrics to handle with static alerts
Putting an agent inside a container is an anti-pattern
100+$ / month / server
APM vendors
● Extra work in setting up, maintaining, and supporting
● Generic tools, no specific container or cluster visualizations
● No Real User Monitoring
● No out-of-the-box anomaly detection and predictive analytics
Prometheus
Open source
Performance Monitoring
for Docker environments
Anomaly detection
Anomaly: definition
Static alerts
TODO : more realistic business examples
!
!
!
?
seasonality
correlations
changing or dynamic
environment
Static alert limitations
Challenges
statistical significance relevance
⇏
Simple technique: 3- rule
Exponential smoothing: α=0.03, z=3
Does not work with seasonal data
Holt-Winters
● seasonal exponential smoothing
● works quite well on ‘laboratory data’
● calculation of prediction intervals
relies on normal distribution after
removal of seasonality
● => on our real world seasonal data
generates too many false positives
Sliding window approach
model
evaluation
of new
data
Local outlier factor
Existing instance based machine learning technique (lazy,
~kNN)
Based on concept of local density
local outlier factor(A) =
density at point A
average density of kNN of point A
LOF >> 1 ⇒ outlier
en.wikipedia.org/wiki/Local_outlier_factor
Load balance detector
Compare multiple signals (mean + variance) in load-balanced environment
Anomaly detection @ service level
Lightweight agent
• Server metrics from OS
• Container and cluster metrics from Kubernetes and Docker APIs
• Application metrics from log files and management interfaces
• Business & custom metrics from various sources
Contextual events
• Container lifecycle
• Deployments & software releases
• Infrastructure changes
• Custom events
CoScale approach
Scalable Architecture
APP
APP
APP
APP
APP
API
APP
APP
RUM
Postgresql
Metadata
Cassandra
Metric data
Elasticsearch
Event data
HaProxy
Loadbalancer
HTTPS handling
Analysis workers
Alerting workers
Data workers
RUM
Boomerang.js
Agent
Log & api parsing
DEMO
Questions?
or contact me at
stijn.polfliet@coscale.com
@spolfliet
Backup slides
Local outlier factor, no strong model assumption
heavy
process
Local outlier factor, no strong model assumption
Local outlier factor, no free lunch
Scaling: comparing apples and oranges
scale ⇒ distance ⇒ density ⇒ LOF-score
Autoscaling? (Mahalanobis distance) => enlarges
dimensions with low variance
“Curse of dimensionality”
dimensionality reduction preprocessing (e.g. PCA), but don’t throw
away the anomalies with the bathwater
Choosing cross-sections of data to analyze together, e.g.
different metric on same container
same metric on different containers

More Related Content

What's hot

Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ
Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ
Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ Docker, Inc.
 
Containerizing Distributed Pipes
Containerizing Distributed PipesContainerizing Distributed Pipes
Containerizing Distributed Pipesinside-BigData.com
 
Kubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerKubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerAkshay Mathur
 
Container orchestration overview
Container orchestration overviewContainer orchestration overview
Container orchestration overviewWyn B. Van Devanter
 
Kubernetes fundamentals
Kubernetes fundamentalsKubernetes fundamentals
Kubernetes fundamentalsVictor Morales
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Docker, Inc.
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginerYousun Jeong
 
[Draft] Fast Prototyping with DPDK and eBPF in Containernet
[Draft] Fast Prototyping with DPDK and eBPF in Containernet[Draft] Fast Prototyping with DPDK and eBPF in Containernet
[Draft] Fast Prototyping with DPDK and eBPF in ContainernetAndrew Wang
 
Akka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsAkka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsLightbend
 
"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014
"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014
"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014Piyush Kumar
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudDatadog
 
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesDetecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesLightbend
 
OSMC 2021 | Handling 250K flows per second with OpenNMS: a case study
OSMC 2021 | Handling 250K flows per second with OpenNMS: a case studyOSMC 2021 | Handling 250K flows per second with OpenNMS: a case study
OSMC 2021 | Handling 250K flows per second with OpenNMS: a case studyNETWAYS
 
DotNext 2020 - When and How to Use the Actor Model and Akka.NET
DotNext 2020 - When and How to Use the Actor Model and Akka.NETDotNext 2020 - When and How to Use the Actor Model and Akka.NET
DotNext 2020 - When and How to Use the Actor Model and Akka.NETpetabridge
 
AWS Elastic Container Service (ECS) with a CI Pipeline Overview
AWS Elastic Container Service (ECS) with a CI Pipeline OverviewAWS Elastic Container Service (ECS) with a CI Pipeline Overview
AWS Elastic Container Service (ECS) with a CI Pipeline OverviewWyn B. Van Devanter
 
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...NETWAYS
 
DSD-INT 2021 Towards a Deltares cloud computing service - Elzinga
DSD-INT 2021 Towards a Deltares cloud computing service - ElzingaDSD-INT 2021 Towards a Deltares cloud computing service - Elzinga
DSD-INT 2021 Towards a Deltares cloud computing service - ElzingaDeltares
 

What's hot (20)

Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ
Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ
Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ
 
GW Tester
GW TesterGW Tester
GW Tester
 
Containerizing Distributed Pipes
Containerizing Distributed PipesContainerizing Distributed Pipes
Containerizing Distributed Pipes
 
Kubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning ControllerKubernetes as Orchestrator for A10 Lightning Controller
Kubernetes as Orchestrator for A10 Lightning Controller
 
Container orchestration overview
Container orchestration overviewContainer orchestration overview
Container orchestration overview
 
Kubernetes fundamentals
Kubernetes fundamentalsKubernetes fundamentals
Kubernetes fundamentals
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginer
 
[Draft] Fast Prototyping with DPDK and eBPF in Containernet
[Draft] Fast Prototyping with DPDK and eBPF in Containernet[Draft] Fast Prototyping with DPDK and eBPF in Containernet
[Draft] Fast Prototyping with DPDK and eBPF in Containernet
 
K8S in prod
K8S in prodK8S in prod
K8S in prod
 
Akka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsAkka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed Applications
 
"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014
"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014
"In love with Open Source : Past, Present and Future" : Keynote OSDConf 2014
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
 
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesDetecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
 
OSMC 2021 | Handling 250K flows per second with OpenNMS: a case study
OSMC 2021 | Handling 250K flows per second with OpenNMS: a case studyOSMC 2021 | Handling 250K flows per second with OpenNMS: a case study
OSMC 2021 | Handling 250K flows per second with OpenNMS: a case study
 
DotNext 2020 - When and How to Use the Actor Model and Akka.NET
DotNext 2020 - When and How to Use the Actor Model and Akka.NETDotNext 2020 - When and How to Use the Actor Model and Akka.NET
DotNext 2020 - When and How to Use the Actor Model and Akka.NET
 
AWS Elastic Container Service (ECS) with a CI Pipeline Overview
AWS Elastic Container Service (ECS) with a CI Pipeline OverviewAWS Elastic Container Service (ECS) with a CI Pipeline Overview
AWS Elastic Container Service (ECS) with a CI Pipeline Overview
 
Developer workflow with docker
Developer workflow with dockerDeveloper workflow with docker
Developer workflow with docker
 
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
 
DSD-INT 2021 Towards a Deltares cloud computing service - Elzinga
DSD-INT 2021 Towards a Deltares cloud computing service - ElzingaDSD-INT 2021 Towards a Deltares cloud computing service - Elzinga
DSD-INT 2021 Towards a Deltares cloud computing service - Elzinga
 

Viewers also liked

How to turn your social media into your HR recruiter robot.
How to turn your social media into your HR recruiter robot.How to turn your social media into your HR recruiter robot.
How to turn your social media into your HR recruiter robot.Thibaut Vanderhofstadt
 
Monitoring Docker containers - Docker NYC Feb 2015
Monitoring Docker containers - Docker NYC Feb 2015Monitoring Docker containers - Docker NYC Feb 2015
Monitoring Docker containers - Docker NYC Feb 2015Datadog
 
Measuring Micro-services. Richard Rodger
Measuring Micro-services. Richard RodgerMeasuring Micro-services. Richard Rodger
Measuring Micro-services. Richard RodgerFuture Insights
 
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...Nagios
 
Docker Indy Meetup Monitoring 30-Aug-2016
Docker Indy Meetup Monitoring 30-Aug-2016Docker Indy Meetup Monitoring 30-Aug-2016
Docker Indy Meetup Monitoring 30-Aug-2016Matt Bentley
 
Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)
Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)
Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)Martin Etmajer
 
Docker and Windows: The State of the Union
Docker and Windows: The State of the UnionDocker and Windows: The State of the Union
Docker and Windows: The State of the UnionElton Stoneman
 

Viewers also liked (7)

How to turn your social media into your HR recruiter robot.
How to turn your social media into your HR recruiter robot.How to turn your social media into your HR recruiter robot.
How to turn your social media into your HR recruiter robot.
 
Monitoring Docker containers - Docker NYC Feb 2015
Monitoring Docker containers - Docker NYC Feb 2015Monitoring Docker containers - Docker NYC Feb 2015
Monitoring Docker containers - Docker NYC Feb 2015
 
Measuring Micro-services. Richard Rodger
Measuring Micro-services. Richard RodgerMeasuring Micro-services. Richard Rodger
Measuring Micro-services. Richard Rodger
 
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
Nagios Conference 2014 - Spenser Reinhardt - Detecting Security Breaches With...
 
Docker Indy Meetup Monitoring 30-Aug-2016
Docker Indy Meetup Monitoring 30-Aug-2016Docker Indy Meetup Monitoring 30-Aug-2016
Docker Indy Meetup Monitoring 30-Aug-2016
 
Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)
Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)
Monitoring Microservices at Scale on OpenShift (OpenShift Commons Briefing #52)
 
Docker and Windows: The State of the Union
Docker and Windows: The State of the UnionDocker and Windows: The State of the Union
Docker and Windows: The State of the Union
 

Similar to Performance monitoring for Docker - Lucerne meetup

Performance Monitoring for Docker Environments - Docker Amsterdam June Meetup
Performance Monitoring for Docker Environments - Docker Amsterdam June MeetupPerformance Monitoring for Docker Environments - Docker Amsterdam June Meetup
Performance Monitoring for Docker Environments - Docker Amsterdam June MeetupStijn Polfliet
 
Proactive ops for container orchestration environments
Proactive ops for container orchestration environmentsProactive ops for container orchestration environments
Proactive ops for container orchestration environmentsDocker, Inc.
 
Intelligent Monitoring
Intelligent MonitoringIntelligent Monitoring
Intelligent MonitoringIntelie
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
 
PreMonR - A Reactive Platform To Monitor Reactive Application
PreMonR - A Reactive Platform To Monitor Reactive ApplicationPreMonR - A Reactive Platform To Monitor Reactive Application
PreMonR - A Reactive Platform To Monitor Reactive ApplicationKnoldus Inc.
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSAmazon Web Services
 
Debs 2011 tutorial on non functional properties of event processing
Debs 2011 tutorial  on non functional properties of event processingDebs 2011 tutorial  on non functional properties of event processing
Debs 2011 tutorial on non functional properties of event processingOpher Etzion
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsApache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsThomas Weise
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Comsysto Reply GmbH
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
Interpreting Performance Test Results
Interpreting Performance Test ResultsInterpreting Performance Test Results
Interpreting Performance Test ResultsEric Proegler
 
Azure Monitoring Overview
Azure Monitoring OverviewAzure Monitoring Overview
Azure Monitoring Overviewgjuljo
 
Vulnerability Discovery in the Cloud
Vulnerability Discovery in the CloudVulnerability Discovery in the Cloud
Vulnerability Discovery in the CloudDevOps.com
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseBig Data Spain
 
IOT model to Unified Communication Events in SDN
IOT model to Unified Communication  Events in SDNIOT model to Unified Communication  Events in SDN
IOT model to Unified Communication Events in SDNChandrashekhar Rao
 
Performance eng prakash.sahu
Performance eng prakash.sahuPerformance eng prakash.sahu
Performance eng prakash.sahuDr. Prakash Sahu
 
Monitoring in Motion: Monitoring Containers and Amazon ECS
Monitoring in Motion: Monitoring Containers and Amazon ECSMonitoring in Motion: Monitoring Containers and Amazon ECS
Monitoring in Motion: Monitoring Containers and Amazon ECSAmazon Web Services
 
Making Runtime Data Useful for Incident Diagnosis: An Experience Report
Making Runtime Data Useful for Incident Diagnosis: An Experience ReportMaking Runtime Data Useful for Incident Diagnosis: An Experience Report
Making Runtime Data Useful for Incident Diagnosis: An Experience ReportQAware GmbH
 
Ginsbourg.com - Performance and Load Test Report Template LTR 1.5
Ginsbourg.com - Performance and Load Test Report Template LTR 1.5Ginsbourg.com - Performance and Load Test Report Template LTR 1.5
Ginsbourg.com - Performance and Load Test Report Template LTR 1.5Shay Ginsbourg
 

Similar to Performance monitoring for Docker - Lucerne meetup (20)

Performance Monitoring for Docker Environments - Docker Amsterdam June Meetup
Performance Monitoring for Docker Environments - Docker Amsterdam June MeetupPerformance Monitoring for Docker Environments - Docker Amsterdam June Meetup
Performance Monitoring for Docker Environments - Docker Amsterdam June Meetup
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
 
Proactive ops for container orchestration environments
Proactive ops for container orchestration environmentsProactive ops for container orchestration environments
Proactive ops for container orchestration environments
 
Intelligent Monitoring
Intelligent MonitoringIntelligent Monitoring
Intelligent Monitoring
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
PreMonR - A Reactive Platform To Monitor Reactive Application
PreMonR - A Reactive Platform To Monitor Reactive ApplicationPreMonR - A Reactive Platform To Monitor Reactive Application
PreMonR - A Reactive Platform To Monitor Reactive Application
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECS
 
Debs 2011 tutorial on non functional properties of event processing
Debs 2011 tutorial  on non functional properties of event processingDebs 2011 tutorial  on non functional properties of event processing
Debs 2011 tutorial on non functional properties of event processing
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsApache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
Interpreting Performance Test Results
Interpreting Performance Test ResultsInterpreting Performance Test Results
Interpreting Performance Test Results
 
Azure Monitoring Overview
Azure Monitoring OverviewAzure Monitoring Overview
Azure Monitoring Overview
 
Vulnerability Discovery in the Cloud
Vulnerability Discovery in the CloudVulnerability Discovery in the Cloud
Vulnerability Discovery in the Cloud
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
 
IOT model to Unified Communication Events in SDN
IOT model to Unified Communication  Events in SDNIOT model to Unified Communication  Events in SDN
IOT model to Unified Communication Events in SDN
 
Performance eng prakash.sahu
Performance eng prakash.sahuPerformance eng prakash.sahu
Performance eng prakash.sahu
 
Monitoring in Motion: Monitoring Containers and Amazon ECS
Monitoring in Motion: Monitoring Containers and Amazon ECSMonitoring in Motion: Monitoring Containers and Amazon ECS
Monitoring in Motion: Monitoring Containers and Amazon ECS
 
Making Runtime Data Useful for Incident Diagnosis: An Experience Report
Making Runtime Data Useful for Incident Diagnosis: An Experience ReportMaking Runtime Data Useful for Incident Diagnosis: An Experience Report
Making Runtime Data Useful for Incident Diagnosis: An Experience Report
 
Ginsbourg.com - Performance and Load Test Report Template LTR 1.5
Ginsbourg.com - Performance and Load Test Report Template LTR 1.5Ginsbourg.com - Performance and Load Test Report Template LTR 1.5
Ginsbourg.com - Performance and Load Test Report Template LTR 1.5
 

Recently uploaded

Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 

Recently uploaded (20)

Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 

Performance monitoring for Docker - Lucerne meetup

  • 1. Performance Monitoring for Docker environments Monitoring Docker Anomaly detection Live demo
  • 3.
  • 4.
  • 5. • Scale & dynamic behavior: Number of containers >> number of servers Containers come and go at a much faster pace Container monitoring challenges • Diversity Different application technologies Overload of metrics to monitor and alert on
  • 6. Monolithic application monitoring (Virtualized) OS Application End user System / Infrastructure monitoring Application performance monitoring (APM) Real user monitoring (RUM)
  • 7. Microservices monitoring (Virtualized) OS End user System / Infrastructure monitoring Container monitoring + In-container application monitoring Real user monitoring (RUM) Container Application component Container Application component Container Application component
  • 8. Hosts (CPU, memory, disk) Orchestrator (services, volumes, replication controllers, …) Containers (cpu, memory, disk, network, ...) Container internals (application, database, caching, etc.) Impact on user and application performance What to monitor? Lightweight monitoring for lightweight microservices environment
  • 9. Docker stats API $ docker stats CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O 1285939c1fd3 0.07% 796 KiB / 64 MiB 1.21% 788 B / 648 B 3.568 MB / 512 KB 9c76f7834ae2 0.07% 2.746 MiB / 64 MiB 4.29% 1.266 KB / 648 B 12.4 MB / 0 B d1ea048f04e4 0.03% 4.583 MiB / 64 MiB 6.30% 2.854 KB / 648 B 27.7 MB / 0 B Docker API
  • 10. docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --detach=true --name=cadvisor google/cadvisor:latest open http://<your-hostname>:8080/ CAdvisor
  • 11. agent runs in 1 container or on host container resource usage basic application monitoring 15$ / month / server Datadog datadoghq.com
  • 12. kernel module captures system calls container resource usage basic application monitoring Sysdig sysdig.com
  • 13. Heavyweight, deep application monitoring Designed for monolithic application in specific programming language Too many dynamic metrics to handle with static alerts Putting an agent inside a container is an anti-pattern 100+$ / month / server APM vendors
  • 14. ● Extra work in setting up, maintaining, and supporting ● Generic tools, no specific container or cluster visualizations ● No Real User Monitoring ● No out-of-the-box anomaly detection and predictive analytics Prometheus Open source
  • 15. Performance Monitoring for Docker environments Anomaly detection
  • 17. Static alerts TODO : more realistic business examples ! ! !
  • 20. Simple technique: 3- rule Exponential smoothing: α=0.03, z=3
  • 21.
  • 22.
  • 23. Does not work with seasonal data
  • 24. Holt-Winters ● seasonal exponential smoothing ● works quite well on ‘laboratory data’ ● calculation of prediction intervals relies on normal distribution after removal of seasonality ● => on our real world seasonal data generates too many false positives
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31. Local outlier factor Existing instance based machine learning technique (lazy, ~kNN) Based on concept of local density local outlier factor(A) = density at point A average density of kNN of point A LOF >> 1 ⇒ outlier en.wikipedia.org/wiki/Local_outlier_factor
  • 32. Load balance detector Compare multiple signals (mean + variance) in load-balanced environment
  • 33. Anomaly detection @ service level
  • 34.
  • 35.
  • 36. Lightweight agent • Server metrics from OS • Container and cluster metrics from Kubernetes and Docker APIs • Application metrics from log files and management interfaces • Business & custom metrics from various sources Contextual events • Container lifecycle • Deployments & software releases • Infrastructure changes • Custom events CoScale approach
  • 37. Scalable Architecture APP APP APP APP APP API APP APP RUM Postgresql Metadata Cassandra Metric data Elasticsearch Event data HaProxy Loadbalancer HTTPS handling Analysis workers Alerting workers Data workers RUM Boomerang.js Agent Log & api parsing
  • 38. DEMO
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46. Questions? or contact me at stijn.polfliet@coscale.com @spolfliet
  • 48. Local outlier factor, no strong model assumption heavy process Local outlier factor, no strong model assumption
  • 49. Local outlier factor, no free lunch Scaling: comparing apples and oranges scale ⇒ distance ⇒ density ⇒ LOF-score Autoscaling? (Mahalanobis distance) => enlarges dimensions with low variance “Curse of dimensionality” dimensionality reduction preprocessing (e.g. PCA), but don’t throw away the anomalies with the bathwater Choosing cross-sections of data to analyze together, e.g. different metric on same container same metric on different containers