4. Skymind
We take deep learning models to production on premise
Using Scala (think Python for production)
Java Virtual Machine stack (e.g. first-class access to big data
systems) connected to C++ for native compute
We make SKIL (Skymind Intelligence Layer): a production system
for building deep learning applications
5. What’s an “Anomaly?”
Abnormal Patterns in Data
Fraud detection - “bad” credit card transactions
Also fraud detection - detecting fake locations with call detail
records
Network Intrusion - Abnormal Activity in a network
Broken Computers in a data center
6. Brief Case Studies - eg: Why am I up here?
Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/
Network Infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/
7. Network Infra - Save time and money by automatically
migrating workloads before they break
8. Why Deep Learning?
Learns well from lots of data
Learns its own feature representation: robust to noise and allows
learning cross-domain patterns
Already applied in ads: Google itself invests heavily in this same
kind of pattern recognition (targeting/relevance)
9. Techniques
Unsupervised - Use autoencoder reconstruction error and moving averages with
dropout over a set time window
Supervised - RNNs learn from yes/no labels attached to a time series. Given a
series of time steps, they can predict when an anomaly is about to occur.
Use streaming/minibatches (all neural nets can learn like this)
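The unsupervised technique above can be sketched in a few lines. This is only an illustration under stated assumptions: the reconstruction errors are taken as already computed by some autoencoder (not shown), and the function name `flag_anomalies`, the window size and the factor `k` are hypothetical choices, not values from the talk. Each new error is compared against a moving average of the previous window, so it works one event at a time on a stream.

```python
from collections import deque
from statistics import mean, pstdev

def flag_anomalies(errors, window=5, k=3.0):
    """Flag errors that exceed the moving average of the previous
    `window` errors by more than k standard deviations."""
    history = deque(maxlen=window)  # only the most recent errors are kept
    flags = []
    for err in errors:
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # A small floor on sigma avoids flagging tiny fluctuations
            # when the recent window is almost constant.
            flags.append(err > mu + k * max(sigma, 1e-6))
        else:
            flags.append(False)  # not enough history yet
        history.append(err)
    return flags

# Hypothetical reconstruction errors; the seventh value is a clear spike.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.95, 0.10]
flags = flag_anomalies(errors)
print(flags)
```

Because the threshold follows the recent average, slow drift in the error level is tolerated while sudden jumps are flagged.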
11. Recurrent Net Anomalies
Learn a softmax over time series:
Given a fixed window of the sequence, the goal is to predict the
probability of an anomaly occurring
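As a minimal sketch of the softmax step: assume the recurrent net emits two raw scores per window, one for "normal" and one for "anomaly" (the scores below are made up for illustration). The softmax turns them into probabilities that sum to one.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output-layer scores for one window: [normal, anomaly]
logits = [0.5, 2.5]
p_normal, p_anomaly = softmax(logits)
print(f"P(anomaly) = {p_anomaly:.3f}")
```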
12. Sequences Time Series/Windows with RNNs
See: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
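One way to turn a labelled time series into fixed windows for an RNN, sketched under assumptions: `make_windows` is a hypothetical helper, and labelling each window by the step that follows it is one possible reading of "predict when an anomaly is about to occur".

```python
def make_windows(series, labels, window):
    """Pair each window of `window` consecutive values with the
    yes/no label of the step immediately after the window."""
    examples = []
    for start in range(len(series) - window):
        end = start + window
        examples.append((series[start:end], labels[end]))
    return examples

# Hypothetical sensor readings with one labelled anomaly (the 5.0 spike).
series = [1.0, 1.1, 0.9, 1.0, 5.0, 1.0]
labels = [0, 0, 0, 0, 1, 0]
examples = make_windows(series, labels, window=3)
for values, label in examples:
    print(values, "->", label)
```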
13. Some definitions
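Reconstruction Error: Autoencoders learn through unsupervised
pretraining how to reconstruct their input. The reconstruction
error is the difference between the input and its reconstruction.
Minimize KL divergence (a measure of how much one probability
distribution differs from another)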
RNN/Time Series: See http://deeplearning4j.org/usingrnns
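For the KL divergence mentioned above, a small worked sketch for discrete distributions. The example distributions are made up, and the function assumes `q` is non-zero wherever `p` is non-zero.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists.
    Terms with p_i == 0 contribute nothing to the sum."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p))  # identical distributions give 0.0
print(kl_divergence(p, q))
print(kl_divergence(q, p))
```

The last two values differ: KL divergence is not symmetric, so it is a measure of mismatch rather than a true distance.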
16. Reference Architecture for Anomaly Detection
[Diagram: pipeline stages]
External world
Ingest from the external world with NiFi
Send to Kafka
Make a prediction about the data
Index the prediction in Elasticsearch with Logstash
Render the data with Kibana
Store raw events in Cassandra
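The flow of the architecture can be sketched with in-memory stand-ins. Everything here is hypothetical: the deque and lists replace Kafka, Cassandra and Elasticsearch, no real client library is used, and `score` is a placeholder threshold rule rather than a trained network. It only shows how the stages hand data to each other.

```python
from collections import deque

kafka_topic = deque()   # stand-in for a Kafka topic
cassandra_events = []   # stand-in for a Cassandra table of raw events
search_index = []       # stand-in for an Elasticsearch index

def ingest(event):
    """Ingest step: push an external event onto the queue."""
    kafka_topic.append(event)

def score(event, threshold=100.0):
    """Prediction step: placeholder rule instead of a real model."""
    return {"id": event["id"], "anomaly": event["value"] > threshold}

def run_pipeline():
    """Drain the queue: store each raw event, then index its prediction."""
    while kafka_topic:
        event = kafka_topic.popleft()
        cassandra_events.append(event)
        search_index.append(score(event))

for e in [{"id": 1, "value": 12.0}, {"id": 2, "value": 250.0}]:
    ingest(e)
run_pipeline()
print(search_index)
```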
17. Summary
Real ML pipeline
Cassandra for storing raw data and results
ELK (Elasticsearch, Logstash, Kibana) stack for alerting and
visualization
Kafka for model ingestion
Lagom for serving model predictions
NiFi for designing data pipelines