3. Contents
● Real Data and Ideal Models
● Load Testing (Tuning)
● Production Monitoring
● Correlation
● Tools
4. Real Data vs. Ideal Models
● noise (human actions)
● outliers
● missing data
● different resolutions
● counter update frequencies
● quantization
● not Gaussian and not random walk
● what is normal for the system?
35. 2-Sample Tests: Good
Kolmogorov–Smirnov, Cramér–von Mises
● good for request size and latency (unaggregated)
● work on periodic data
● outlier resistant
● good for data exploration
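
As a rough illustration of the points above, the two-sample Kolmogorov–Smirnov statistic can be sketched in plain Python. The function names `ks_statistic` and `ks_critical` are made up for this example, and the critical value uses the usual large-sample approximation, so treat it as a sketch rather than a reference implementation.

```python
from bisect import bisect_right
import math


def ks_statistic(a, b):
    """Largest vertical gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both ECDFs are step functions, so the largest gap occurs at a data point.
    for x in set(a_sorted) | set(b_sorted):
        fa = bisect_right(a_sorted, x) / n  # fraction of a that is <= x
        fb = bisect_right(b_sorted, x) / m  # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d


def ks_critical(n, m, alpha=0.05):
    """Approximate large-sample threshold: reject 'same distribution' if D exceeds it."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


# Hypothetical latency windows: identical except for one extreme outlier.
baseline = list(range(100))
with_outlier = list(range(99)) + [10**9]
d = ks_statistic(baseline, with_outlier)
print(d < ks_critical(len(baseline), len(with_outlier)))
```

A single extreme value moves the statistic by at most 1/n, which is one way to read "outlier resistant": the test compares the shapes of the two distributions, not their means.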
36. 2-Sample Tests: Bad
Kolmogorov–Smirnov, Cramér–von Mises
● false positives on trends and seasonal changes
● need many unique values
● computational complexity
● bad for alerting
41. Clustering
● non-euclidean (ultrametric) space
● many small clusters
● local clustering around events
● false positives
  – cron jobs (log rotation)
  – human actions (restarts, reconfigurations)
  – cache expirations
  – …
47. Skyline Algorithms
● median absolute deviation
● mean subtraction cumulation
● grubbs
● least squares
● first hour average
● histogram bins
● stddev from average
● ks test
● stddev from moving average
● second order anomalies
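
To make the first item in the list concrete, here is a minimal sketch of a median-absolute-deviation check on the latest datapoint of a series. It is not Skyline's actual code: the function name `mad_is_anomalous` and the threshold of 6 are assumptions chosen for illustration.

```python
from statistics import median


def mad_is_anomalous(values, threshold=6.0):
    """Flag the last value if it lies more than `threshold` MADs from the median.

    `threshold` is an assumed tuning knob, not a value taken from the slides.
    """
    if len(values) < 3:
        return False  # too little data to estimate a robust spread
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)  # median absolute deviation
    if mad == 0:
        return False  # constant (or heavily quantized) series: ratio undefined
    return deviations[-1] / mad > threshold


# Hypothetical metric samples: a steady series, then the same series ending in a spike.
steady = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
spiky = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
print(mad_is_anomalous(steady), mad_is_anomalous(spiky))
```

Because both the center and the spread are medians, a few earlier outliers in the window barely shift the baseline, unlike the "stddev from average" style of check.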