Large scale predictive analytics for anomaly detection - Nicolas Hohn
1. © 2014 Guavus, Inc. All rights reserved.
Nicolas Hohn
Director of Analytics
Guavus
LARGE SCALE PREDICTIVE ANALYTICS
FOR ANOMALY DETECTION
2015
2. © 2015 Guavus, Inc. All rights reserved.
Guavus Applications Focus
Planning / Engineering
• Network Analytics
• Capacity Management
• Trending & Forecasting
• Value-based Network Planning
Care
• Quality of Experience
• Complaints Mitigation
• Proactive Care
• Revenue Assurance
• Self Care Usage Portal
Network Operations
• Service Management
• QoS Management
• Performance Monitoring
• Proactive Service Assurance
Marketing
• Subscriber Profiling
• Personalization & Targeting
• CSP Data Monetization
Service Assurance
Customer Experience
3.
Anomaly Detection
• Anomaly: something that is unusual or unexpected
• Detection: extraction of particular information from a larger stream of information without specific cooperation from or synchronization with the sender
Implementation
• Rule based: manual thresholds
• Automated: thresholds set by machine learning
Operational Intelligence
Service Degrading Problem → Service Anomalies Identification → Root-Cause Analysis → Problem Resolution
Quantify how ‘unusual’ a signal value is
Unsupervised learning to send a trigger when the signal is ‘unexpected enough’
DOES NOT SCALE
[Figure: three example anomaly shapes (spike, step, slope), each shown as KPI versus time]
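One possible way to "quantify how unusual a signal value is" is an empirical tail probability: the fraction of past values at least as large as the new one. The sketch below is an illustration only, not the method used in the talk; the function names, the sample history, and the `alpha` cut-off are all assumptions.

```python
from bisect import bisect_left


def tail_probability(history, value):
    """Smoothed fraction of historical values that are >= value."""
    ordered = sorted(history)
    n_greater_or_equal = len(ordered) - bisect_left(ordered, value)
    # +1 smoothing keeps the estimate strictly positive for unseen extremes.
    return (n_greater_or_equal + 1) / (len(ordered) + 1)


def is_unexpected(history, value, alpha=0.05):
    """Trigger when the value is rarer than alpha (hypothetical cut-off)."""
    return tail_probability(history, value) < alpha


# Made-up KPI history: 50 values between 9 and 13.
HISTORY = [10, 12, 11, 9, 13, 10, 11, 12, 10, 11] * 5

print(is_unexpected(HISTORY, 11))  # a typical value
print(is_unexpected(HISTORY, 25))  # far above anything seen before
```

Because the score is a probability, the same `alpha` can be applied to signals with very different units, which is what removes the need for one manual threshold per signal.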
4.
Anomaly detection
Event Arrival Times
2014-09-16 00:00:06
2014-09-16 00:00:09
2014-09-16 00:00:40
2014-09-16 00:00:42
2014-09-16 00:00:45
2014-09-16 00:01:00
2014-09-16 00:01:09
2014-09-16 00:01:11
2014-09-16 00:01:20
2014-09-16 00:02:09
……
5
4
Define KPI and time scale
Predict conditional baseline (black line) and probability density given historical data
KPI value (green line)
Trigger alert (red dot) if data point is significantly above baseline, i.e. outside confidence interval (gray bands)
• 4 step process
[Figure: #events versus time, annotated with steps 1 to 4]
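The process on this slide can be sketched roughly as follows: bin the raw event arrival times into a KPI (events per minute), estimate a baseline and an upper band from history, and raise an alert when a new value falls above the band. This is a minimal sketch under stated assumptions: the one-minute time scale, the mean plus `k` standard deviations band, and the `HISTORY` values are made up for illustration and are not taken from the talk.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, stdev

# Event arrival times copied from the slide (the list there is truncated).
EVENTS = [
    "2014-09-16 00:00:06", "2014-09-16 00:00:09", "2014-09-16 00:00:40",
    "2014-09-16 00:00:42", "2014-09-16 00:00:45", "2014-09-16 00:01:00",
    "2014-09-16 00:01:09", "2014-09-16 00:01:11", "2014-09-16 00:01:20",
    "2014-09-16 00:02:09",
]


def events_per_minute(timestamps):
    """Turn raw arrival times into a KPI: number of events per minute."""
    counts = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(second=0)
        for ts in timestamps
    )
    return dict(sorted(counts.items()))


def detect(history, value, k=3.0):
    """Return (baseline, upper band, alert flag) for a new KPI value."""
    baseline = mean(history)
    upper = baseline + k * stdev(history)
    return baseline, upper, value > upper


KPI = events_per_minute(EVENTS)
print(list(KPI.values()))  # [5, 4, 1]; the last bin is partial

HISTORY = [5, 4, 6, 5, 4, 5, 6, 4]  # hypothetical past counts per minute
print(detect(HISTORY, 12))  # well above the band, so the flag is set
print(detect(HISTORY, 5))   # inside the band, no alert
```

A production baseline would be conditional on context such as time of day; the constant mean here only keeps the example short.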
5.
[Figure: KPI versus time, labeled EASIER]
Challenges
• Predict distribution of current value based on past values
– Univariate/multivariate time series analysis
• Unify uncertainty metric across all types of input signals to build a global ranking of alarms
• Scale on limited hardware footprint. Real time monitoring of potentially millions of time series
• Keep customer happy (no alarm flooding, limit false positives, rank alarms by severity)
[Figure: KPI versus time, labeled HARDER; three further labels read HARD]
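As an illustration of the "unify uncertainty metric" challenge, one option is to map every signal's latest value to an upper-tail probability and rank alarms by `-log10(p)`, so that signals with different units become comparable. The sketch below assumes each signal is roughly normal around its historical mean, which is a simplification chosen for brevity; the signal names and numbers are hypothetical.

```python
from math import erfc, log10, sqrt
from statistics import mean, stdev


def surprise(history, value):
    """-log10 of the upper-tail probability under a normal fit to history."""
    z = (value - mean(history)) / stdev(history)
    p = 0.5 * erfc(z / sqrt(2))  # P(Z >= z) for a standard normal
    return -log10(max(p, 1e-300))  # clamp to avoid log10(0)


def rank_alarms(signals):
    """Sort signals from most to least surprising."""
    scored = [(name, surprise(hist, val)) for name, (hist, val) in signals.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical signals in different units: (history, latest value).
SIGNALS = {
    "throughput_mbps": ([100, 105, 98, 102, 101, 99], 104),
    "dropped_call_rate": ([0.010, 0.012, 0.011, 0.009, 0.010, 0.011], 0.020),
    "active_users": ([5000, 5200, 4900, 5100, 5050, 4950], 5600),
}

for name, score in rank_alarms(SIGNALS):
    print(f"{name}: {score:.1f}")
```

Using `erfc` rather than `1 - cdf` keeps very small tail probabilities from rounding to zero, which matters when the ranking has to separate extreme alarms from each other.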
6.
Solutions
[Figure: #events and anomaly indicator versus time]
• Data Science:
– Robust to past anomalies
• No guarantee that ‘training’ data is anomaly free
– Adapt to changes
• Retrain model
– Cannot rely on labeled data:
• Understand customer ‘utility function’, business impact of anomalies
• Set thresholds automatically
• Quantify cost of false positives and false negatives
• Engineering:
– Intelligent caching
– Compression
– Scalable system
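The two data science requirements above, robustness to past anomalies and adaptation to change, can be illustrated with a sliding-window median/MAD baseline. This is one common technique, swapped in here as an example rather than the approach described in the talk; the class name, window size, and threshold `k` are assumptions.

```python
from collections import deque
from statistics import median


class RobustBaseline:
    """Sliding-window median/MAD detector (illustrative sketch)."""

    def __init__(self, window=50, k=5.0, warmup=5):
        # Old points fall out of the window, so the baseline adapts over time.
        self.values = deque(maxlen=window)
        self.k = k
        self.warmup = warmup

    def update(self, value):
        """Score value against the current window, then store it."""
        anomalous = False
        if len(self.values) >= self.warmup:
            med = median(self.values)
            mad = median(abs(v - med) for v in self.values)
            # 1.4826 * MAD estimates the standard deviation for normal data.
            scale = max(1.4826 * mad, 1e-9)
            anomalous = abs(value - med) > self.k * scale
        self.values.append(value)
        return anomalous


# Made-up series with two spikes; the first spike stays in the history.
SERIES = [10, 11, 9, 10, 12, 10, 11, 100, 10, 9, 11, 10, 12, 10, 60]
detector = RobustBaseline()
FLAGS = [detector.update(v) for v in SERIES]
print([i for i, flag in enumerate(FLAGS) if flag])  # [7, 14]
```

The second spike is still detected even though the first one was never removed from the window, because the median and MAD are barely moved by a single outlier. A mean and standard deviation baseline would have been inflated by it.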
7.
Use case: Network analytics
• Monitor KPIs, such as dropped call rate on each base station in a 4G network
• Detect anomalies
• Infer root cause by analyzing 1000s of other KPIs available on each cell of the network
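A very simple way to shortlist root-cause candidates is to rank the other KPIs of a cell by how strongly they move together with the anomalous KPI. The sketch below uses absolute Pearson correlation; this is an assumption made for illustration, since the talk does not say how root causes are inferred, and the KPI names and values are hypothetical. Correlation only suggests candidates and does not establish causation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_candidate_causes(target, other_kpis, top=3):
    """Return the KPIs most correlated (in absolute value) with the target."""
    scores = {name: abs(pearson(target, series)) for name, series in other_kpis.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Hypothetical per-cell series over eight time bins.
DROPPED_CALL_RATE = [1, 1, 1, 1, 5, 6, 5, 1]
OTHER_KPIS = {
    "cpu_load": [30, 31, 29, 30, 30, 31, 29, 30],
    "interference": [2, 2, 2, 2, 9, 10, 9, 2],
    "handover_failures": [3, 2, 3, 3, 4, 3, 2, 3],
}

for name, score in rank_candidate_causes(DROPPED_CALL_RATE, OTHER_KPIS):
    print(f"{name}: {score:.2f}")
```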
8.
Architecture of the solution
[Diagram: architecture components]
• Data streams
• Collector
• Adapter 1
• Custom Adapter 2
• Data fusion
• Aggregation
• Compute Cluster
• Analytics Cluster
• Intelligent Cache
• Columnar Storage
• Time series analytics
• Anomaly Detection
• Rules/alerts framework
• User Interface
• M2M Interface with customer system
9.
Conclusion
Lessons learned
• No silver bullet, but multiple methods each with their own pros/cons
• Simple and scalable solution
• Adapt to:
– data changes
– customer needs
• API design:
– black-box approach: hide complexity from developers
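To make the black-box idea concrete, the sketch below shows what such an interface might look like: the caller picks a coarse sensitivity level and feeds values, while the statistics stay hidden. The class, method, and parameter names are hypothetical and are not Guavus's actual API; the mean and standard deviation rule inside is only a placeholder for whatever model sits behind the interface.

```python
from statistics import mean, stdev


class AnomalyDetector:
    """Hypothetical facade: feed values in, get a yes/no verdict back."""

    _LEVELS = {"low": 4.0, "medium": 3.0, "high": 2.0}

    def __init__(self, sensitivity="medium", warmup=10):
        self._k = self._LEVELS[sensitivity]
        self._warmup = warmup
        self._history = []

    def observe(self, value):
        """Record a new value and report whether it looks anomalous."""
        is_anomaly = False
        if len(self._history) >= self._warmup:
            mu, sigma = mean(self._history), stdev(self._history)
            is_anomaly = sigma > 0 and abs(value - mu) > self._k * sigma
        self._history.append(value)
        return is_anomaly


detector = AnomalyDetector(sensitivity="medium")
for v in [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]:  # made-up warm-up data
    detector.observe(v)
print(detector.observe(50))  # True
```

The model behind `observe` could be replaced without touching calling code, which is the point of hiding the complexity from developers.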
10.
Q&A
Nicolas Hohn, Director of Analytics
nicolas.hohn@guavus.com
2015