A Practical Guide to Anomaly Detection for DevOps
Anomaly Detection: 2 categories
• Log analysis: identify suspicious event patterns in log files
• Metric analysis: identify misbehaving time-series metrics
Why is anomaly detection worth our time?
1. It reveals dangerous patterns that previously were undetected
2. The static nature of rule-based and threshold-based alerts encourages
a) false positives during peak times
b) false negatives during quieter times
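
The difference between a static threshold and a learned baseline can be sketched in a few lines of Python. This is a minimal illustration, not the method of any tool in this guide: the traffic numbers, the threshold of 100, and the three-standard-deviation limit are all assumed values chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical requests/minute observed at the same hour on four previous days.
history = {
    3: [10, 12, 9, 11],         # quiet hour (03:00)
    12: [118, 122, 125, 119],   # peak hour (12:00)
}
# Hypothetical values observed today at those hours.
today = {3: 60, 12: 124}

STATIC_THRESHOLD = 100  # assumed fixed alert threshold


def static_alert(value, threshold=STATIC_THRESHOLD):
    """Rule-based check: fire whenever the value exceeds a fixed number."""
    return value > threshold


def baseline_alert(value, past, z_limit=3.0):
    """Baseline check: fire when the value is more than z_limit standard
    deviations away from what was seen at this hour in the past."""
    mu = mean(past)
    sigma = max(stdev(past), 1e-9)  # guard against a perfectly flat history
    return abs(value - mu) > z_limit * sigma


for hour in sorted(today):
    print(hour, static_alert(today[hour]), baseline_alert(today[hour], history[hour]))
```

In this toy data the static rule stays silent at 03:00 even though traffic is several times its usual level (a false negative), and fires at 12:00 on an ordinary peak-hour value (a false positive). The per-hour baseline behaves the other way around.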
Weapons of mass anomaly detection
Anomaly Detective by Prelert 
• Product: Anomaly Detective for Splunk 
• Pricing: $0-$225/month (quote-based pricing above 10 GB) 
• Setup: On-premise (OS X, Windows, Linux & SunOS) 
• Installation: Easy (with Splunk Enterprise) 
• Main Datatype: Log lines
Anomaly Detective by Prelert 
Highlights: 
• Capable of consuming any stream of machine-data 
• Can identify rare or unusual messages. 
• A robust REST API, which can process almost any data feed 
• Offers an out-of-the-box app for Splunk Enterprise 
• Extends the Splunk search language with verbs tailored for anomaly 
detection
Sumo Logic 
• Pricing: Quote-based 
• Setup: SaaS (+ on-premise data collectors) 
• Ease of Installation: Average (deploy Sumo Logic's full solution) 
• Main Datatype: Log lines
Sumo Logic 
Highlights: 
• LogReduce: a useful log crunching capability which consolidates 
thousands of log lines into just a few items by detecting recurring patterns. 
• Sumo Logic scans your historical data to evaluate a baseline of normal 
data rates. Then it focuses on the last few minutes and looks for rates 
above or below the baseline. 
• Anomaly detection will work even if the log lines are not exactly identical.
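
The two ideas above (collapsing near-identical log lines into patterns, then comparing recent rates with a historical baseline) can be sketched as follows. This is a simplified illustration and not Sumo Logic's actual algorithm: the masking rules, the function names `signature`, `consolidate`, and `rate_anomalies`, the sample log lines, and the factor of 3 are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed masking rules: IPs, hex ids, and numbers are treated as variable parts.
# IPs are masked first so their octets are not replaced one by one as numbers.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def signature(line):
    """Replace variable tokens so that similar lines share one pattern."""
    for pattern, token in _MASKS:
        line = pattern.sub(token, line)
    return line.strip()


def consolidate(lines):
    """Collapse many log lines into a count per pattern."""
    return Counter(signature(line) for line in lines)


def rate_anomalies(baseline, recent, baseline_windows, factor=3.0):
    """Flag patterns whose count in the recent window is far above or below
    their average per-window count in the historical data."""
    flagged = {}
    for sig in set(baseline) | set(recent):
        expected = baseline.get(sig, 0) / baseline_windows
        observed = recent.get(sig, 0)
        too_high = observed > factor * max(expected, 1.0)
        too_low = observed < expected / factor
        if too_high or too_low:
            flagged[sig] = (expected, observed)
    return flagged


# Hypothetical history covering 10 time windows.
historical = [f"user {i} logged in from 10.0.0.{i}" for i in range(1, 51)]
historical += [f"timeout after {ms} ms" for ms in range(30, 40)]

# Hypothetical most recent window: normal logins, but a burst of timeouts.
latest = [f"user {i} logged in from 192.168.1.{i}" for i in range(1, 6)]
latest += [f"timeout after {ms} ms" for ms in range(100, 112)]

patterns = consolidate(historical)
flagged = rate_anomalies(patterns, consolidate(latest), baseline_windows=10)
print(patterns)
print(flagged)
```

Because every line is reduced to a pattern before counting, the sixty historical lines collapse into two items, and lines that differ only in user id, address, or duration are still counted together.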
Grok 
• Pricing: $219/month for 200 instances & custom metrics 
• Setup: Dedicated AWS instance 
• Ease of Installation: Easy 
• Main Datatype: System Metrics
Grok 
Highlights: 
• Designed to monitor AWS (works with EC2, EBS, ELB, RDS). 
• Grok API for custom metrics (it’s fairly easy to process data from statsd). 
• Warns you in real time. 
• Customizable alerts for email or mobile notifications. 
• Grok uses its Android mobile app as its main UI. 
• Installation requires a dedicated Grok instance in your cloud environment.
Skyline 
• Pricing: Open source 
• Setup: On-premise 
• Ease of Installation: Average (requires Python, Redis and Graphite) 
• Main Datatype: System Metrics
Skyline 
Highlights: 
• Etsy’s minimalist web UI lists anomalies & visualizes underlying graphs. 
• Horizon accepts time-series data via TCP & UDP inputs. 
• Stream Graphite metrics into Horizon. Horizon uploads data to a Redis 
instance, where it is processed by Analyzer, a Python daemon that helps find 
time series that are behaving abnormally. 
• Oculus, the other half of the Kale stack, is a search engine for graphs. Input 
one graph then locate other graphs that behave like it. Detect an anomaly 
using Skyline, then use Oculus to search for graphs that are suspiciously 
correlated to the offending graph.
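
The "find graphs that behave like this one" step can be illustrated with a small sketch. It is not how Oculus works internally; plain Pearson correlation is used here as a simple stand-in, and the metric names, sample values, and the 0.9 cut-off are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def similar_graphs(offender, candidates, min_corr=0.9):
    """Return (name, correlation) pairs for series that move with the
    offending series, strongest correlation first."""
    scored = [(name, pearson(offender, series)) for name, series in candidates.items()]
    matches = [(name, r) for name, r in scored if r >= min_corr]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical series: the metric flagged as anomalous, plus other metrics.
offender = [1, 1, 1, 9, 9, 1]
candidates = {
    "db.latency": [2, 2, 2, 18, 18, 2],
    "cache.hits": [5, 6, 5, 6, 5, 6],
    "queue.depth": [0, 0, 1, 8, 9, 1],
}

matches = similar_graphs(offender, candidates)
print(matches)
```

Here `db.latency` and `queue.depth` spike together with the offending series and are returned as candidates for investigation, while `cache.hits` is ignored.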
But detecting anomalies is only half the battle...
BigPanda + Anomaly Detection 
BigPanda uses an algorithmic, data science approach to simplify & automate incident management.
Come take a look at what 
BigPanda is building! 
http://bigpanda.io 
Follow us 
online!
