We thoroughly enjoyed sharing some early strategies for performing security analysis on neural networks (deep learning/machine learning models) at Shopify.
The field is still young, and many more advances are needed before we can build enterprise-grade scanners.
Our discussion was recorded, and your comments and opinions will help drive the field forward. To the best of our knowledge, this talk is a first of its kind on YouTube.
Video Link: https://lnkd.in/erP9tUE
Securing Neural Networks
1. [TensorFuzz] Debugging Neural Networks with Coverage-Guided Fuzzing
Authors: Augustus Odena, Ian Goodfellow
Presenter: Tahseen Shabab
Facilitators: Susan Shu, Serena McDonnell
Date: 26 August 2019
Cybersecurity AI
2. Speakers
Tahseen Shabab (Presenter), CEO, Bibu Labs
Susan Shu (Facilitator), Data Scientist, Bell
Serena McDonnell (Facilitator), Senior Data Scientist, Delphia
4. We Are Growing!
Prof. Hassan Khan, Chief Scientist, Bibu Labs
Prof. Kate Larson, Advisor - AI, Bibu Labs
Prof. Larry Smith, Advisor - Strategy, Bibu Labs
9. Cylance Hack: Reverse Engineer Model
Pipeline: 7,000 feature vectors → neural network → post-processing → added filter (white/black list)
10. Cylance Hack: Exploit Model Bias
● Researchers found bias in the model
○ A small set of features has a significant effect on the outcome
● The “added filter” uses clusters with specific names to whitelist files, one being a famous game
● Researchers added strings from the game's executable to a real malicious file
● Game over!
12. Lowd & Meek (2005) and Wittel & Wu (2004)
● Attacks against statistical spam filters
○ Add “good words”
○ Words the filter considers indicative of non-spam
● Append words that appear often in ham emails and rarely in spam to a spam email
● Spam filter fooled!
14. Source of Blind Spots
● Traditional software
○ Devs directly specify the logic of the system
● ML system
○ NN learns rules automatically
○ Developers can only indirectly modify decision logic by manipulating:
■ Training data
■ Feature selection
■ Model architecture
○ NN's underlying rules are mostly unknown to developers!
https://arxiv.org/pdf/1705.06640.pdf
16. Adaptive Nature of Hackers
● Hackers take the path of least resistance
● If a patch is deployed for one vulnerability, they simply move on to the next (Vulnerability 1 → Vulnerability 2 → Vulnerability 3)
18. Data Poisoning
● Hackers strategically insert attack data
● Model retrains periodically
● Decision boundary is altered
secml.github.io
21. Attack: Induce Specific Output
● Add noise
● Classifier misclassifies the object
● Model learns differently than humans
“Explaining and Harnessing Adversarial Examples”, Ian Goodfellow
22. Attack: Expose Model Attributes
● Submit queries, observe responses to infer:
○ Training data
○ Architecture
○ Optimization procedures
“Towards Reverse Engineering Black Box Neural Networks”, Seong Joon Oh et al.
23. Taxonomy of Attacks Against ML Systems
● Influence
○ Causative: influences training and test data
○ Exploratory: influences test data only
● Security Violation
○ Confidentiality: goal is to uncover training data
○ Integrity: goal is false negatives (FNs)
○ Availability: goal is false positives (FPs)
● Specificity
○ Targeted: influence predictions of particular test instances
○ Indiscriminate: influence predictions of all test instances
Adversarial Machine Learning, Joseph, Nelson, Rubinstein, and Tygar, 2019
24. Exploratory Attacks Against a Trained Classifier
● Attacker doesn't have access to training data
● Most known detection techniques are susceptible to blind spots
● How difficult is it for an adversary to discover the blind spots that are most advantageous to them?
25. How Can We Find These Blind Spots?
https://www.theemotionmachine.com/listen-to-family-and-friends-how-to-protect-yourself-from-blind-spots/
26. DeepXplore: White-Box Testing
● Checks erroneous corner cases
● Input: unlabeled test inputs
● Objective: generate test data that
○ activates a large number of neurons
○ forces DNNs to behave differently
● Joint optimization problem: maximize
○ differential behaviour
○ neuron coverage
(schematic objective below)
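In schematic form (notation mine, not the paper's exact formulation, with λ a hyperparameter trading off the two terms):

$$\max_{x}\ \mathrm{obj}(x) = \mathrm{diff\_behaviour}(x) + \lambda \cdot \mathrm{neuron\_coverage}(x)$$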
27. DeepXplore: Example
● Perform gradient-guided local search
○ Start from a seed input
○ Find new inputs that maximize the desired goal
● Similar to backpropagation, but:
○ Inputs: variable
○ Weights: constant
(one search step is sketched below)
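A minimal sketch of one gradient-guided search step, assuming two hypothetical Keras models f1 and f2 with the same input shape. This illustrates the idea rather than DeepXplore's actual code; the neuron-coverage term is omitted for brevity:

```python
import tensorflow as tf

def deepxplore_step(x, f1, f2, lr=0.01):
    """One gradient-ascent step on the *input*; model weights stay frozen."""
    x = tf.Variable(x)
    with tf.GradientTape() as tape:
        # Differential behaviour: how much the two models disagree on x.
        objective = tf.reduce_sum(tf.abs(f1(x) - f2(x)))
    grad = tape.gradient(objective, x)
    # Like backpropagation, but the gradient flows to the input, not the weights.
    return x + lr * tf.sign(grad)
```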
28. Bayesian NN: Modelling Uncertainty
● Bayesian neural network
● Adding dropout before every weight layer approximates a Gaussian process
○ During both training and test
● Dropout during test:
○ Different outputs for the same input, e.g. [4, 5, 1, 2, 3, 6]
○ Equivalent to MC sampling
○ High variance = high uncertainty
https://www.cs.ox.ac.uk/people/yarin.gal/website/blog_2248.html
(MC-dropout sketch below)
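A minimal MC-dropout sketch, assuming a hypothetical Keras model that contains dropout layers (passing training=True keeps dropout active at inference time):

```python
import numpy as np

def mc_dropout_predict(model, x, n_samples=50):
    """Run the same input through the model n_samples times with dropout on."""
    preds = np.stack([model(x, training=True).numpy() for _ in range(n_samples)])
    # Mean approximates the predictive distribution; high variance = high uncertainty.
    return preds.mean(axis=0), preds.var(axis=0)
```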
30. TensorFuzz
● Open Source Tool
● Discovers errors which occur only for rare inputs (Blind Spots)
● Key Techniques:
○ Coverage Guided Fuzzing
○ Property Based Testing
○ Approximate Nearest Neighbor
31. TensorFuzz key technique 1 (recap): Coverage-Guided Fuzzing
32. Coverage-Guided Fuzzing (AFL)
● Instrument the program for coverage
○ Add instructions to the code that allow the fuzzer to detect code paths
● Feed random inputs into the program
● Continue to mutate inputs that exercised a new part of the program
○ Genetic algorithm
● Identify bugs
(toy fuzzing loop below)
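A toy coverage-guided fuzzing loop in the spirit of AFL. Names like get_coverage are hypothetical stand-ins for real instrumentation:

```python
import random

def mutate(data: bytes) -> bytes:
    """Simplest possible mutation: overwrite one random byte."""
    if not data:
        return bytes([random.randrange(256)])
    i = random.randrange(len(data))
    return data[:i] + bytes([random.randrange(256)]) + data[i + 1:]

def fuzz(program, seeds, get_coverage, n_iters=10_000):
    """AFL-style loop: keep any mutant that exercises new coverage."""
    corpus, seen, bugs = list(seeds), set(), []
    for _ in range(n_iters):
        candidate = mutate(random.choice(corpus))
        try:
            cov = get_coverage(program, candidate)  # set of branch edges hit
        except Exception:
            bugs.append(candidate)                  # crash found: record input
            continue
        if not cov <= seen:                         # exercised a new code path?
            seen |= cov
            corpus.append(candidate)                # keep for further mutation
    return bugs
```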
33. AFL: Branch (Edge) Coverage
● Aids the discovery of subtle fault conditions in the underlying code
● Security vulnerabilities are often associated with unexpected or incorrect state transitions
AFL Documentation
34. AFL: Hit Count
● Identifies potentially interesting control-flow changes
○ E.g., a block of code being executed twice when it was normally hit only once
AFL Documentation
35. AFL: Mutation Strategy
● Sequential bit flips with varying lengths and stepovers
● Sequential addition and subtraction of small integers
● Sequential insertion of known interesting integers (0, 1, INT_MAX, etc.)
(sketch of these deterministic stages below)
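A rough sketch of those deterministic stages; the interesting-values list is a small sample of what AFL actually uses, and AFL also works on 16- and 32-bit words:

```python
INTERESTING = [0, 1, 255, 2**31 - 1]  # 0, 1, INT_MAX-style values

def deterministic_mutations(data: bytes):
    buf = bytearray(data)
    # Sequential bit flips.
    for i in range(len(buf) * 8):
        b = bytearray(buf); b[i // 8] ^= 1 << (i % 8)
        yield bytes(b)
    # Sequential +/- of small integers on each byte.
    for i in range(len(buf)):
        for delta in (1, -1):
            b = bytearray(buf); b[i] = (b[i] + delta) % 256
            yield bytes(b)
    # Overwrite each byte with known interesting values.
    for i in range(len(buf)):
        for v in INTERESTING:
            b = bytearray(buf); b[i] = v & 0xFF
            yield bytes(b)
```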
36. TensorFuzz key technique 2 (recap): Property-Based Testing
37. Property-Based Testing
● Verifies that a function or program abides by a property
● Properties check for useful characteristics that must hold in the output
https://medium.com/criteo-labs/introduction-to-property-based-testing-f5236229d237
(example below)
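A small example using the Hypothesis library for Python; the round-trip property is my illustrative choice:

```python
from hypothesis import given, strategies as st

# Property: encoding then decoding any string returns the original value.
@given(st.text())
def test_utf8_roundtrip(s):
    assert s.encode("utf-8").decode("utf-8") == s
```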
38. Advantages
● Covers the scope of all possible inputs
○ Does not restrict the generated inputs
● Shrinks the input in case of failure
○ On failure, the framework tries to reduce the input to a smaller failing input
● Reproducible and replayable
○ Each property-test run produces a seed, so the test can be re-run on the same data
https://medium.com/criteo-labs/introduction-to-property-based-testing-f5236229d237
39. TensorFuzz key technique 3 (recap): Approximate Nearest Neighbour
43. Traditional Software Workflow
● Coverage metrics:
○ Lines of code executed
○ Which branches have been taken
https://arxiv.org/pdf/1705.06640.pdf
44. Neural Network Workflow
● The software implementation may contain many branching statements
○ Based on architecture
○ Mostly independent of input
● Different inputs will often execute the same lines of code and take the same branches
● But they will produce interesting variations in behaviour
https://arxiv.org/pdf/1705.06640.pdf
48. TensorFuzz
2. Valid neural network inputs are fed instead of a big array of bytes.
E.g., if inputs are sequences of characters, only allow characters that are in the vocabulary extracted from the training set.
49. TensorFuzz
3. The input chooser intelligently chooses elements from the input corpus. The following heuristic is used (reconstructed from the paper's description):
$p(c_k, t) = \frac{e^{t_k - t}}{\sum_j e^{t_j - t}}$
where $p(c_k, t)$ is the probability of choosing corpus element $c_k$ at time $t$, and $t_k$ is the time when $c_k$ was added to the corpus.
Intuition: recently sampled inputs are more likely to yield useful new coverage when mutated, but the advantage decays over time. (sampling sketch below)
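A minimal sketch of that sampling rule, assuming all times are measured in the same units (helper names are mine):

```python
import math
import random

def choose_input(corpus, add_times, now):
    """Sample c_k with probability proportional to exp(t_k - now)."""
    weights = [math.exp(t_k - now) for t_k in add_times]
    return random.choices(corpus, weights=weights, k=1)[0]  # recent entries favoured
```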
50. TensorFuzz
4. The mutator modifies the input in a controlled manner.
For text input, mutation occurs according to the following policy. Uniformly at random, perform one of these operations at a random location:
- Delete a random character
- Add a random character
- Substitute a random character
(sketch below)
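A sketch of that text-mutation policy, restricted to a hypothetical vocabulary as slide 48 suggests:

```python
import random
import string

def mutate_text(s: str, vocab: str = string.ascii_lowercase) -> str:
    """Uniformly at random: delete, add, or substitute one character."""
    op = random.choice(["delete", "add", "substitute"])
    if op == "add" or not s:                        # always possible, even on ""
        i = random.randrange(len(s) + 1)
        return s[:i] + random.choice(vocab) + s[i:]
    i = random.randrange(len(s))
    if op == "delete":
        return s[:i] + s[i + 1:]
    return s[:i] + random.choice(vocab) + s[i + 1:]  # substitute
```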
51. Diving Deeper
5. Mutated inputs are fed to the neural network, and the following are extracted from the NN:
- A set of coverage arrays
- Enables computation of coverage
- A set of metadata arrays
- Fed as input to the objective function
52. TensorFuzz
5.a Objective function
- Encodes the desired outcome, e.g. an error or a crash
The output metadata arrays are fed into the objective function, and inputs causing the system to reach the goal of the objective function are flagged. (example objective below)
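A minimal example objective that flags NaNs/Infs in the metadata arrays; this is my illustration of the interface, not TensorFuzz's code:

```python
import numpy as np

def nan_objective(metadata_arrays) -> bool:
    """Return True if any metadata array (e.g. loss, logits) is non-finite."""
    return any(not np.all(np.isfinite(a)) for a in metadata_arrays)
```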
53. TensorFuzz
5.b Coverage analyzer
The core part of the product: it reads arrays from the TensorFlow runtime, turns them into Python objects representing coverage, and checks whether that coverage is new.
54. Desired Properties of the Coverage Analyzer
● Check if the neural network is in a new state
○ Enables detection of misbehaviour
● The check has to be fast
● Should work with many different computation graphs
○ Remove manual intervention as much as possible
● Exercising all of the coverage should be hard
○ Or else we won't cover much of the possible behaviours
55. Use Fast Approximate Nearest Neighbours
● Determine if two sets of NN activations are meaningfully different from each other
● Provides a coverage metric that produces useful results for neural networks
○ Even if the underlying software implementation of the neural network does not make use of many data-dependent branches
57. Coverage Analyzer: Details
● On a new activation vector:
a. Use an approximate nearest neighbours algorithm
b. Look up the nearest neighbour
c. Check the Euclidean distance between the current vector and its nearest neighbour
d. Add the input to the corpus if the distance is greater than L
https://medium.com/@erikhallstrm/backpropagation-from-the-beginning-77356edf427d
(sketch below)
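A sketch of the coverage check; for clarity this uses a brute-force exact nearest neighbour where TensorFuzz uses an approximate index:

```python
import numpy as np

def is_new_coverage(activation, corpus_activations, L):
    """New coverage iff no stored activation vector lies within distance L."""
    dists = np.linalg.norm(corpus_activations - activation, axis=1)
    return dists.min() > L  # Euclidean distance to nearest neighbour
```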
58. Coverage Analyzer: Details
● Note: often, good results are achieved only by looking at the logits, or the layer before the logits
https://medium.com/@erikhallstrm/backpropagation-from-the-beginning-77356edf427d
59. TensorFuzz
6. The mutated input is:
- Added to the corpus if new coverage is achieved
- Added to the list of test cases if the objective function is satisfied
62. Experiment: Finding NaNs
● NaNs consistently cause trouble for researchers and practitioners, but they are hard to track down
● A bad loss function is “fault-injected” into a neural network
● TensorFuzz could find NaNs substantially faster than a baseline random search
63. Experiment: Finding NaNs
● Left: coverage over time for 10 different random restarts
● Right: an example of a random image that causes the neural network to produce a NaN
64. Experiment: Quantization Errors
● We often want to quantize neural networks
● How do we test for accuracy?
● We can look at disagreements on the test set, but often few show up
● Instead, we can fuzz for inputs that surface differences (example objective below)
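As an illustration, a disagreement check between a hypothetical full-precision model f_full and its quantized counterpart f_quant could serve as the fuzzing objective:

```python
import numpy as np

def quantization_objective(f_full, f_quant, x) -> bool:
    """Flag inputs where the two models' predicted classes disagree."""
    return np.argmax(f_full(x)) != np.argmax(f_quant(x))
```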
65. Experiment: Quantization Errors
● Left: coverage over time for 10 different random restarts. Note that 3 runs fail
● Right: an example of an image correctly classified by the original neural network but incorrectly classified by the quantized network
67. Discussion Points
● How do we embed security testing into the ML Solution development
lifecycle?
● Can explainable inference help to detect blind spots?
● Can we use multiple classifiers in parallel to reduce the implications of an
attack on a specific model?