Center for Bioinformatics Tübingen

       Understanding the Risk Factors of
      Learning in Adversarial Environments

           Blaine Nelson¹, Battista Biggio² and Pavel Laskov¹

                   (1) Cognitive Systems Group
                   Wilhelm Schickard Institute for Computer Science
                   University of Tübingen, Germany

                   (2) Pattern Recognition and Applications Group
                   Department of Electrical and Electronic Engineering (DIEE)
                   University of Cagliari, Italy
Robust Classification for Security Applications

   • Machine Learning can be used in security applications
         • E.g., spam filtering, fraud/intrusion detection
         • Benefits: adaptability, scalability & sound inference


   • ML in security domains is susceptible to attacks
         • Classification relies on stationarity
         • Learner can be manipulated by adversary


   • Security requires a sound theoretical foundation
     to provide assurances against adversaries [1,6]


Background: Robust Estimation

   • Core idea of robustness: small
     perturbations should have small
     impact on estimator [5]

   [Figure, two builds: mean and median location estimates on a sample;
    after a perturbation of the data, the mean shifts noticeably while
    the median remains stable]
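To make the mean/median contrast concrete, here is a minimal Python sketch (not part of the original deck; the data are made up) showing how a single contaminating point moves each estimator:

    import numpy as np

    rng = np.random.default_rng(0)
    clean = rng.normal(loc=0.0, scale=1.0, size=100)  # clean sample
    contaminated = np.append(clean, 1000.0)           # one adversarial outlier

    # The mean can be dragged arbitrarily far by a single point,
    # while the median barely moves.
    print(np.mean(clean), np.mean(contaminated))      # mean jumps by ~10
    print(np.median(clean), np.median(contaminated))  # median nearly unchanged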
Background: Influence Function

   • Influence function (IF) is the response of
     estimator to infinitesimal contamination at x [4]

   [Figure: IF of mean (unbounded) vs. IF of median (bounded)]

   • IF shows quantitative effect of contamination
         • Bounded IF is an indication of robustness
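A finite-sample analogue of the IF is the sensitivity curve: add a single point at x and measure the rescaled change in the estimate. A minimal sketch, assuming NumPy (illustrative, not from the slides):

    import numpy as np

    def sensitivity_curve(estimator, sample, xs):
        # Finite-sample analogue of the influence function:
        # (n + 1) * (T(sample + {x}) - T(sample)) for contamination at x.
        n = len(sample)
        base = estimator(sample)
        return np.array([(n + 1) * (estimator(np.append(sample, x)) - base)
                         for x in xs])

    rng = np.random.default_rng(0)
    sample = rng.normal(size=200)
    xs = np.linspace(-50.0, 50.0, 11)

    print(sensitivity_curve(np.mean, sample, xs))    # grows linearly in x (unbounded)
    print(sensitivity_curve(np.median, sample, xs))  # levels off (bounded)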
Extending to Classifiers

   • Influence Function approach extends to
     regression, statistical testing, & other settings

   • Classification presents challenges:
         • Classifiers are bounded functions
         • Robustness must measure change in decision over
           space


   • Approach via influence function measures
     change in classifier’s parameters.

Problem with IF Approach

   • IF approach intuitive but has strange
     implications:
         • Every classifier is robust if space is bounded
         • Every classifier with bounded params is robust

   [Figure: contamination of the training data and its effect on the
    learned decision boundary]
Rotational Intuition

   • For hyperplanes, robustness should capture a
     notion of rotational invariance under
     contamination

   • Is there a general principle behind this intuition?
   • Main Result: we connect this intuition to
     empirical risk minimization; cf. [2]
Empirical Risk Framework

   • Learners seek to minimize risk (i.e., avg. loss):

     $\min_{f \in \mathcal{F}} \mathbb{E}_{D \sim P}[R_P(f)]$, where $R_P(f) = \mathbb{E}_{(x,y) \sim P}[\ell(y, f(x))]$

   [Figure: the optimal classifier $f^*$ over the space of all classifiers]
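For concreteness, the empirical counterpart of this risk is just an average loss over a sample; a minimal sketch with a hypothetical threshold classifier (all names here are illustrative):

    import numpy as np

    def empirical_risk(f, X, y):
        # Empirical counterpart of R_P(f) = E[loss(y, f(x))], using 0-1 loss.
        return np.mean(f(X) != y)

    rng = np.random.default_rng(0)
    X = rng.normal(size=500)
    y = np.where(X + 0.3 * rng.normal(size=500) >= 0, 1, -1)  # noisy labels
    f = lambda X: np.where(X >= 0, 1, -1)                     # candidate classifier
    print(empirical_risk(f, X, y))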
Empirical Risk Framework

   • Approximation error due to limited hypothesis space $\mathcal{F}$:

     $\varepsilon_{\mathrm{apprx}} = \mathbb{E}_P[R_P(f^\dagger) - R_P(f^*)]$

   [Figure: $f^\dagger$, the best classifier within $\mathcal{F}$, at distance $\varepsilon_{\mathrm{apprx}}$ from the optimum $f^*$]
Empirical Risk Framework

   • Estimation error due to limited dataset $D$:

     $\varepsilon_{\mathrm{est}} = \mathbb{E}_P[R_P(f_N) - R_P(f^\dagger)]$

   [Figure: the learned classifier $f_N \in \mathcal{F}$ at distance $\varepsilon_{\mathrm{est}}$ from $f^\dagger$]
Empirical Risk Framework

   • Modeling contamination gives notion of stability:

     $\varepsilon_{\mathrm{rbst}} = \mathbb{E}_P[R_P(\hat{f}_N) - R_P(f_N)]$

   [Figure: $\hat{f}_N$, learned from contaminated data, at distance $\varepsilon_{\mathrm{rbst}}$ from $f_N$]
Empirical Risk Framework

   • Expected risk decomposed into 3 components:

     $\mathbb{E}_P[R_P(\hat{f}_N) - R_P(f^*)] = \varepsilon_{\mathrm{rbst}} + \varepsilon_{\mathrm{est}} + \varepsilon_{\mathrm{apprx}}$

   [Figure: the chain $f^* \to f^\dagger \to f_N \to \hat{f}_N$ with the three error terms]
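The decomposition is a telescoping sum of the three differences defined on the preceding slides; written out in full:

    \mathbb{E}_P[R_P(\hat{f}_N) - R_P(f^*)]
      = \underbrace{\mathbb{E}_P[R_P(\hat{f}_N) - R_P(f_N)]}_{\varepsilon_{\mathrm{rbst}}}
      + \underbrace{\mathbb{E}_P[R_P(f_N) - R_P(f^\dagger)]}_{\varepsilon_{\mathrm{est}}}
      + \underbrace{\mathbb{E}_P[R_P(f^\dagger) - R_P(f^*)]}_{\varepsilon_{\mathrm{apprx}}}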
Bounding Robustness Component

   • For classifiers, we consider the 0-1 loss

   • Classifiers are of form: $f(x) = 2\,\mathbb{I}[g(x) \ge 0] - 1$
         • i.e., a threshold on a decision function $g$
   • Algorithmic stability can thus be bounded:

     $\varepsilon_{\mathrm{rbst}} \le \mathbb{E}_P[\Pr_{x \sim P}[\hat{g}_N(x)\, g_N(x) < 0]]$

   • Robustness is related to disagreement between
     decision functions learned from clean and
     contaminated data
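The disagreement probability in this bound is straightforward to estimate by Monte Carlo once the two decision functions are in hand; a minimal sketch with hypothetical linear decision functions:

    import numpy as np

    rng = np.random.default_rng(0)
    w_clean = np.array([1.0, 0.5])    # weights of g_N (hypothetical)
    w_contam = np.array([1.0, -0.2])  # weights of g^_N after contamination (hypothetical)

    # Monte Carlo estimate of Pr_{x~P}[ g^_N(x) * g_N(x) < 0 ]:
    # the probability that the two decision functions assign different signs.
    X = rng.normal(size=(100_000, 2))  # x ~ P, here taken to be standard Gaussian
    disagreement = np.mean((X @ w_clean) * (X @ w_contam) < 0)
    print(disagreement)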
Distributional Independence

   • Problem: bound depends on distribution of X…

   [Figure: the region where the two decision functions disagree falls
    outside the support of X]

     $\varepsilon_{\mathrm{rbst}} \le \mathbb{E}_P[\Pr_{x \sim P}[\hat{g}_N(x)\, g_N(x) < 0]] = 0$
Distributional Independence

   • Problem: bound depends on distribution of X…

     $\varepsilon_{\mathrm{rbst}} \le \mathbb{E}_P[\Pr_{x \sim P}[\hat{g}_N(x)\, g_N(x) < 0]] = 1$
Distributional Independence

   • Problem: bound depends on distribution of X…

   • Measure against uniform distribution
     as measure of change over space!

     $\mathbb{E}_P[\Pr_{x \sim U}[\hat{g}_N(x)\, g_N(x) < 0]]$
Case of Hyperplanes

   • For hyperplane $w$ (through origin), uniform
     measure yields expected angular change:

     $\varepsilon_{\mathrm{rbst}}^{U} = \frac{1}{\pi}\, \mathbb{E}_P\!\left[\cos^{-1}\!\left(\frac{\hat{w}_N^{T} w_N}{\|\hat{w}_N\|\, \|w_N\|}\right)\right]$

         • Result from Dasgupta et al. [3]
         • Expectation is over datasets and their resulting
           transformation by adversary
         • Robustness component is bounded between 0 (no
           change) and 1 (complete rotation)
   • This measure gives an intuitive way to compare
     (linear) learning algorithms
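For a single pair of hyperplanes this measure is one line of code; the expectation over datasets would be taken outside. A minimal sketch (the weight vectors are hypothetical):

    import numpy as np

    def angular_robustness(w, w_hat):
        # (1/pi) * arccos of the cosine similarity between the clean and
        # contaminated weight vectors: 0 = no change, 1 = complete rotation.
        cos_sim = (w @ w_hat) / (np.linalg.norm(w) * np.linalg.norm(w_hat))
        return np.arccos(np.clip(cos_sim, -1.0, 1.0)) / np.pi

    w = np.array([1.0, 1.0])
    print(angular_robustness(w, np.array([2.0, 2.0])))    # 0.0: same direction
    print(angular_robustness(w, np.array([1.0, -1.0])))   # 0.5: rotated 90 degrees
    print(angular_robustness(w, np.array([-1.0, -1.0])))  # 1.0: fully reversed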
Discussion

   • Incorporation of rotational stability is needed for
     robust classification

   • Feasibility of estimating $\varepsilon_{\mathrm{rbst}}$ under realistic
     contamination

   • Development of algorithms based on tradeoffs
     between $\varepsilon_{\mathrm{rbst}}$ and other error terms
References

   1)  M. Barreno, B. Nelson, A. D. Joseph, and J. D. Tygar. The Security
       of Machine Learning. Machine Learning, 81(2):121-148, 2010.
   2)  L. Bottou and O. Bousquet. The Tradeoffs of Large Scale Learning.
       In NIPS, volume 20, pages 161-168, 2008.
   3)  S. Dasgupta, A. T. Kalai, and C. Monteleoni. Analysis of
       Perceptron-based Active Learning. JMLR, 10:281-299, 2009.
   4)  F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel.
       Robust Statistics: The Approach Based on Influence Functions.
       John Wiley and Sons, 1986.
   5)  P. Huber. Robust Statistics. John Wiley and Sons, 1981.
   6)  P. Laskov and M. Kloft. A Framework for Quantitative Security
       Analysis of Machine Learning. In AISec Workshop, pages 1-4, 2009.
Bound Derivations



Bound based on triangle inequality: $\ell(x, y) \le \ell(x, z) + \ell(z, y)$

   $\varepsilon_{\mathrm{rbst}} = \mathbb{E}_P[R_P(\hat{f}_N) - R_P(f_N)]
      = \mathbb{E}_P\big[\mathbb{E}_{(x,y) \sim P}\, \ell(\hat{f}_N(x), y) - \mathbb{E}_{(x,y) \sim P}\, \ell(f_N(x), y)\big]$
   $= \mathbb{E}_P\big[\mathbb{E}_{(x,y) \sim P}[\ell(\hat{f}_N(x), y) - \ell(f_N(x), y)]\big]$
   $\le \mathbb{E}_P\big[\mathbb{E}_{(x,y) \sim P}[\ell(\hat{f}_N(x), f_N(x))]\big]$

Bound using alternative 0-1 loss: $\ell_{0\text{-}1}(x, y) = \frac{1}{2}(1 - x\, y)$

   $\varepsilon_{\mathrm{rbst}} \le \mathbb{E}_P\big[\mathbb{E}_{x \sim P}\, \ell(\hat{f}_N(x), f_N(x))\big]
      = \frac{1}{2}\big(1 - \mathbb{E}_P[\mathbb{E}_{x \sim P}[\hat{f}_N(x)\, f_N(x)]]\big)$
Classifiers of form: $f(x) = 2\,\mathbb{I}[g(x) \ge 0] - 1$

  * Product of pair given by $g_1$ and $g_2$:

    $f_1(x)\, f_2(x) = 4\,\mathbb{I}[g_1(x) \ge 0]\,\mathbb{I}[g_2(x) \ge 0] - 2\,(\mathbb{I}[g_1(x) \ge 0] + \mathbb{I}[g_2(x) \ge 0]) + 1$
    $= 4\,\mathbb{I}[g_1(x) \ge 0 \text{ and } g_2(x) \ge 0] - 2\,(\mathbb{I}[g_1(x) \ge 0] + \mathbb{I}[g_2(x) \ge 0]) + 1$

    (the indicator sum is 0 if $g_1(x) < 0$ and $g_2(x) < 0$; 1 if $g_1(x) \ge 0$ xor $g_2(x) \ge 0$; 2 if $g_1(x) \ge 0$ and $g_2(x) \ge 0$)

    $= -2\,\mathbb{I}[g_1(x) \ge 0 \text{ xor } g_2(x) \ge 0] + 1$
    $= 2\,\mathbb{I}[\lnot(g_1(x) \ge 0 \text{ xor } g_2(x) \ge 0)] - 1$
    $= 2\,\mathbb{I}[g_1(x)\, g_2(x) \ge 0] - 1$

  * Bound becomes:

    $\varepsilon_{\mathrm{rbst}} \le \frac{1}{2} - \frac{1}{2}\,\mathbb{E}_P[\mathbb{E}_{x \sim P}[\hat{f}_N(x)\, f_N(x)]]$
    $= \frac{1}{2} - \frac{1}{2}\,\mathbb{E}_P[\mathbb{E}_{x \sim P}[2\,\mathbb{I}[\hat{g}_N(x)\, g_N(x) \ge 0] - 1]]$
    $= 1 - \mathbb{E}_P[\Pr_{x \sim P}[\hat{g}_N(x)\, g_N(x) \ge 0]] = \mathbb{E}_P[\Pr_{x \sim P}[\hat{g}_N(x)\, g_N(x) < 0]]$
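A quick numeric sanity check of the two identities used above (the 0-1 loss rewriting and the product-of-classifiers form), evaluated over the four sign patterns; purely illustrative:

    import numpy as np

    def f(g):
        # Thresholded classifier f(x) = 2*I[g(x) >= 0] - 1, applied to decision values g.
        return 2 * (g >= 0).astype(int) - 1

    g1 = np.array([ 1.0,  1.0, -1.0, -1.0])
    g2 = np.array([ 1.0, -1.0,  1.0, -1.0])
    y1, y2 = f(g1), f(g2)

    # 0-1 loss between +/-1 labels equals (1 - y1*y2) / 2.
    assert np.array_equal((y1 != y2).astype(int), (1 - y1 * y2) // 2)

    # f1(x)*f2(x) = 2*I[g1(x)*g2(x) >= 0] - 1 (for nonzero decision values).
    assert np.array_equal(y1 * y2, 2 * (g1 * g2 >= 0).astype(int) - 1)
    print("identities hold on these sign patterns")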
Models of Adversary Capabilities

   • Outlier Injection
         • Adversary arbitrarily alters some data (fixed size)


   • Data Perturbation
         • Adversary manipulates all data (limited degree)


   • Label Flipping
         • Adversary only changes data labels (fixed number)


   • Feature-Constrained Changes
         • Adversary only alters fixed set of features

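As one concrete instance of these capability models, here is a hedged sketch of the label-flipping adversary (flip a fixed number of labels in {-1,+1}); the flipping strategy shown (random choice) is a placeholder, since a real adversary would choose flips adversarially:

    import numpy as np

    def flip_labels(y, n_flips, rng):
        # Label-flipping adversary: change a fixed number of labels.
        # Random choice here is only an illustrative strategy.
        y = y.copy()
        idx = rng.choice(len(y), size=n_flips, replace=False)
        y[idx] = -y[idx]
        return y

    rng = np.random.default_rng(0)
    y = rng.choice([-1, 1], size=20)
    print(y)
    print(flip_labels(y, n_flips=3, rng=rng))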
Classical Risk Minimization

   • Learners seek to minimize risk (i.e., avg. loss):

     $\min_{f \in \mathcal{F}} \mathbb{E}_{D \sim P}[R_P(f)]$, where $R_P(f_N) = \mathbb{E}_{(x,y) \sim P}[\ell(y, f_N(x))]$

   • Risk is classically decomposed into 2 components:

     $\mathbb{E}_P[R_P(f_N) - R_P(f^*)] = \underbrace{\mathbb{E}_P[R_P(f_N) - R_P(f^\dagger)]}_{\varepsilon_{\mathrm{est}}} + \underbrace{\mathbb{E}_P[R_P(f^\dagger) - R_P(f^*)]}_{\varepsilon_{\mathrm{approx}}}$

         • $\varepsilon_{\mathrm{est}}$ is estimation error due to finite data
         • $\varepsilon_{\mathrm{approx}}$ is approximation error from hyp. space
