What is cross-entropy?
       From Riemann to Monte-Carlo
           Cross-Entropy techniques
                Cross-Entropy tricks
                           Questions




Using cross-entropy techniques for rare event
        simulation and optimization

                         Arthur Breitman

                   NYC Machine learning meetup


                         August 18, 2011




                    Arthur Breitman    crossentropy for rare event simulation and optimization



Outline
   What is cross-entropy?
      Entropy
      Kullback-Leibler divergence
   From Riemann to Monte-Carlo
      Riemann integration
      Monte-Carlo integration
      Importance sampling
   Cross-Entropy techniques
      Analytical expressions
      Simulation of rare events
      Optimization
      Fitting parameters
   Cross-Entropy tricks
      Multiple maxima
      Slow convergence
   Questions



Information entropy


   Definition of information entropy
     ◮   Entropy measures disorder of a physical system
     ◮   Entropy measures information (Shannon)
     ◮   Entropy measures ignorance (E.T. Jaynes)
     ◮   Formally:
         \[ H = -\sum_{x \in \Omega} p(x) \ln p(x) \]







The continuous case



   In the continuous case, for a random variable X with p.d.f. p(x),
   entropy is defined as

   \[ H(X) = -\int_{\Omega} p(x) \ln p(x) \, dx \]

   Simple, right?







The entropy of a probability distribution is meaningless



   Wrong!
     ◮   Not invariant under a change of variable
     ◮   Can even be negative!
     ◮   Not an extension of Shannon’s entropy.
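
   A short worked case, assuming X is uniform on [0, a] (an example
   chosen here for illustration, not taken from the slides), makes the
   first two points concrete. The entropy is ln a, which is negative
   whenever a < 1, and merely rescaling the variable to Y = 2X shifts it
   by ln 2:

```latex
\[ H(X) = -\int_0^a \frac{1}{a} \ln\frac{1}{a} \, dx = \ln a,
   \qquad
   H(2X) = \ln(2a) = H(X) + \ln 2 \]
```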







E.T. Jaynes to the rescue


   E.T. Jaynes adjusted the definition. Consider a sequence of
   discrete points in Ω that becomes dense in Ω; the density of
   these points must approach a limiting density m. Set

   \[ H(X) = -\int_{\Omega} p(x) \ln \frac{p(x)}{m(x)} \, dx \]

   N.B. m is not necessarily a probability distribution, just a density,
   so improper priors are O.K.







Definition of KL divergence

   Kullback-Leibler divergence: entropy of a probability distribution p
   relative to a probability distribution q, with the sign flipped so
   that it is non-negative

   \[ D_{KL}(P \| Q) = \int_{\Omega} p(x) \ln \frac{p(x)}{q(x)} \, dx \]

     ◮   Similar to, but distinct from, entropy.
     ◮   Expected number of extra nats (or bits) needed to encode data
         drawn from P with a code optimized for Q.
     ◮   Not symmetric!
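
   The asymmetry is easy to check numerically. The following is a
   minimal sketch for the discrete case; the function name
   `kl_divergence` and the two example distributions are made up for
   illustration.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions given as lists."""
    # Terms with p_i == 0 contribute nothing (0 * ln 0 is taken as 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two arbitrary distributions over three outcomes (illustrative values).
p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

print(kl_divergence(p, q))  # divergence of P from Q
print(kl_divergence(q, p))  # a different number: not symmetric
print(kl_divergence(p, p))  # 0.0
```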






Why code length matters



    ◮   All ML problems ⇔ fitting a probability distribution
    ◮   KL divergence measures how concise your description is
    ◮   Relates to MDL and Solomonoff induction
    ◮   PAC-learning patches against a lack of epistemology







Likelihood of parameters and Cross-Entropy



   Given a sample {q_i} of Q and a family {P_θ}, θ ∈ Θ,

   \[ LL(\theta \mid \{q_i\}) = H(P_\theta) + D_{KL}\left( P_\theta \,\Big\|\, \frac{1}{N} \sum_i \delta_{q_i} \right) \]

   The likelihood of θ is the KL-divergence of P_θ w.r.t. a Dirac comb.







Riemann integration



   How does one compute the integral of a function? Rectangle
   method:

   \[ \int_a^b f(x) \, dx \approx \frac{b-a}{N} \sum_{i=0}^{N-1} f\left( a + (b-a) \frac{i}{N} \right) \]

   Linear convergence: the error shrinks as O(1/N).
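
   A minimal sketch of the rectangle method, assuming the left endpoint
   of each cell is used and taking exp on [0, 1] as an arbitrary test
   integrand. Multiplying N by 10 should divide the error by roughly 10.

```python
import math

def rectangle_rule(f, a, b, n):
    """Left-rectangle approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + i * width) for i in range(n))

exact = math.e - 1.0  # integral of exp over [0, 1]
for n in (100, 1000):
    approx = rectangle_rule(math.exp, 0.0, 1.0, n)
    print(n, approx, abs(approx - exact))
```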







The curse of dimensionality



   Multiple dimensions?

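   \[ \int_{a_1}^{b_1} \cdots \int_{a_m}^{b_m} f(x) \, dx \approx \frac{\prod_{j=1}^{m} (b_j - a_j)}{N^m} \sum_{i_1=0}^{N-1} \cdots \sum_{i_m=0}^{N-1} f\left( a + \frac{1}{N} \, i \circ (b - a) \right) \]

   Computation is exponential in m: the grid has N^m points.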







Monte-Carlo integration



   If P is a probability distribution over Ω, draw {x_i} from P:

   \[ \int_{\Omega} f(x) \, dx \approx \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)} \]

   Very simple to implement; often p is simply uniform (p ∝ 1).
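
   A minimal sketch of the estimator with a uniform sampling
   distribution on [a, b]. The helper name `mc_integrate`, the seed and
   the test integrand are illustrative choices.

```python
import math
import random

def mc_integrate(f, a, b, n, rng):
    """Monte-Carlo estimate of the integral of f over [a, b], uniform p."""
    density = 1.0 / (b - a)  # p(x) for the uniform distribution on [a, b]
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += f(x) / density
    return total / n

rng = random.Random(0)
estimate = mc_integrate(math.exp, 0.0, 1.0, 100_000, rng)
print(estimate, math.e - 1.0)  # estimate next to the exact value
```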







Monte-Carlo convergence



    ◮   Let the random variable X_p = f(x)/p(x), with x drawn from P
    ◮   If var(X_p) < ∞, the error is O(N^{-1/2}) by the central-limit
        theorem!
    ◮   If m > 2, Monte-Carlo becomes attractive.







Problems with MC




    ◮   If the mass of f is concentrated in a small region, convergence
        can be very slow.
    ◮   also a problem with Riemann integration...







Importance sampling


    ◮   Preferentially sample the regions of interest by picking p to
        minimize the variance of f/p
    ◮   In Riemann world, equivalent to an irregular grid
    ◮   Ideal sampling distribution (if f > 0) is $f / \int f$, but we
        don't know $\int f$!
    ◮   Best convergence when the χ² divergence of f w.r.t. p is minimized
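
   As a sketch of the idea, consider estimating the tail probability
   P(X > 4) for a standard normal X. The choice of problem and of the
   shifted sampling distribution N(4, 1) are assumptions made for this
   example. Plain sampling would almost never land in the tail, whereas
   the shifted distribution puts about half of its draws there and
   corrects with the likelihood ratio.

```python
import math
import random

def tail_prob_is(threshold, n, rng):
    """Estimate P(X > threshold), X ~ N(0, 1), sampling from N(threshold, 1)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n

rng = random.Random(0)
estimate = tail_prob_is(4.0, 100_000, rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
print(estimate, exact)  # importance sampling estimate next to the exact tail
```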







Adaptive importance sampling




    ◮   What if we don’t know the shape of f ?
    ◮   Learn it adaptively from the sampling.
    ◮   Iteratively improve the importance sampling function.







Vegas algorithm and cross-entropy




    ◮   VEGAS algorithm: use histograms, treating each variable separately
    ◮   Cross-entropy algorithm: pick p from a family of distributions
        to minimize cross-entropy to the sample







Why cross-entropy?



   In many cases, the distribution minimizing cross-entropy has an
   analytical expression that is computationally cheap to evaluate, e.g.
     ◮   the uniform distribution
     ◮   the categorical distribution (finite, discrete)
     ◮   every natural exponential family







The natural exponential family



   \[ f_X(x \mid \theta) = h(x) \exp\left( \theta^{\top} x - A(\theta) \right) \]


     ◮   x is the sufficient statistic; θ is the natural parameter
     ◮   maximum-entropy distribution given θ, w.r.t. the base measure dH
     ◮   Examples: normal, multivariate normal, gamma, binomial,
         multinomial, negative binomial
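
   For these families the minimum cross-entropy fit to a weighted sample
   reduces to matching weighted moments. A minimal sketch for a
   univariate normal follows; the function name `fit_normal_weighted`
   and the data are illustrative.

```python
import math

def fit_normal_weighted(xs, ws):
    """Weighted maximum-likelihood (minimum cross-entropy) fit of a normal."""
    total = sum(ws)
    mean = sum(w * x for x, w in zip(xs, ws)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(xs, ws)) / total
    return mean, math.sqrt(var)

xs = [1.0, 2.0, 3.0, 4.0]
ws = [1.0, 1.0, 1.0, 1.0]  # equal weights reduce to the ordinary fit
mean, std = fit_normal_weighted(xs, ws)
print(mean, std)
```

   The same closed form is what makes each cross-entropy update cheap:
   no numerical optimizer is needed inside the loop.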







Beta distribution
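   Not analytical! To fit, start with approximate values from the
   method of moments

   \[ \alpha = \bar{X} \left( \frac{\bar{X}(1 - \bar{X})}{S^2} - 1 \right), \quad \beta = (1 - \bar{X}) \left( \frac{\bar{X}(1 - \bar{X})}{S^2} - 1 \right) \]

   The log-likelihood is given by

   \[ n \left( \ln\Gamma(\alpha + \beta) - \ln\Gamma(\alpha) - \ln\Gamma(\beta) \right) + (\alpha - 1) \sum_{i=1}^{n} \ln X_i + (\beta - 1) \sum_{i=1}^{n} \ln(1 - X_i) \]

   Its first and second derivatives involve the digamma and trigamma
   functions, available in the GSL. Newton's method using the Jacobian
   of the gradient converges in a couple of iterations. Very useful to
   model bounded variables.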
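
   The fit can be sketched in pure Python as follows. This is an
   illustration under stated assumptions, not the author's
   implementation: instead of the GSL, `digamma` and `trigamma` are
   approximated here with the standard recurrence plus asymptotic
   series, and the data are synthetic draws from Beta(2, 5).

```python
import math
import random

def digamma(x):
    """Digamma via upward recurrence plus an asymptotic series (approximate)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252)))

def trigamma(x):
    """Trigamma via upward recurrence plus an asymptotic series (approximate)."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + 1.0 / x + 0.5 * inv2
            + (1.0 / x) * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42)))

def fit_beta(xs, iterations=20):
    """Maximum-likelihood Beta fit: moment start, then Newton steps."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    common = mean * (1.0 - mean) / var - 1.0
    a, b = mean * common, (1.0 - mean) * common  # method of moments
    mean_log_x = sum(math.log(x) for x in xs) / n
    mean_log_1mx = sum(math.log(1.0 - x) for x in xs) / n
    for _ in range(iterations):
        # Gradient of the per-sample log-likelihood.
        g_a = digamma(a + b) - digamma(a) + mean_log_x
        g_b = digamma(a + b) - digamma(b) + mean_log_1mx
        # Entries of the 2x2 Hessian.
        t = trigamma(a + b)
        h_aa, h_bb, h_ab = t - trigamma(a), t - trigamma(b), t
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: subtract the inverse Hessian times the gradient.
        a -= (h_bb * g_a - h_ab * g_b) / det
        b -= (h_aa * g_b - h_ab * g_a) / det
    return a, b

rng = random.Random(1)
data = [rng.betavariate(2.0, 5.0) for _ in range(20_000)]
alpha, beta = fit_beta(data)
print(alpha, beta)  # should land near the true parameters (2, 5)
```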



Surviving the zombie hordes




              Figure: Electric fences, the horde and you




Simulating zombie breakouts



    ◮   Each fence (Ui , λi ) delivers u ∼ max(Ui − Exp(λi ), 0) volts.
    ◮   Crossing a fence deals u damage to a zombie
    ◮   Zombies come from everywhere and can take 5 damage hits
        each.
    ◮   Zombie outbreaks are very rare!







Mere integration fails!




     ◮   We can estimate this probability by sampling the random
         voltages and finding a shortest path.
     ◮   Each Monte-Carlo sample has variance poutbreak (1 − poutbreak ), so
         the relative error scales as 1/sqrt(N poutbreak): too slow!







Cross-Entropy to the rescue



    ◮   We want to approximate the multivariate power distribution
        conditional on an outbreak occurring!
    ◮   Approximate the shape by changing the parameters Ui and λi
        for each fence
    ◮   Generate samples, fit Ui and λi on the samples inducing an
        outbreak







The elite sample

   What if the probability is so low that we don’t observe any
   outbreak in our sample?
     ◮   Generate n samples using the sampling distribution
     ◮   If more than e samples are outbreaks, fit to those samples
         and stop iterating
     ◮   Otherwise, fit on the e best samples, the elite sample
     ◮   Iterate
     ◮   Finally, generate a sample, weight each point by its importance
         sampling weight, and estimate the probability
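
   The loop above can be sketched on a toy stand-in for the fence model.
   The problem is an assumption made for this example: estimate the
   probability that the sum of d independent Exp(1) variables exceeds a
   threshold `gamma`, which has a closed form to compare against. The
   elite level is taken as a sample quantile (parameter `rho`), and the
   fit uses likelihood-ratio weights as in the usual textbook
   formulation of the method; the slides do not spell out these details.

```python
import math
import random

def ce_rare_event(gamma, d=5, n=2000, rho=0.1, n_final=20_000, seed=0):
    """Estimate P(X_1 + ... + X_d >= gamma) for i.i.d. X_j ~ Exp(mean 1)."""
    rng = random.Random(seed)
    u = [1.0] * d  # nominal means
    v = list(u)    # means of the current sampling distribution

    def draw():
        return [rng.expovariate(1.0 / vj) for vj in v]

    def log_weight(x):
        # log of prod_j f(x_j; u_j) / f(x_j; v_j) for exponential densities
        return sum(math.log(vj / uj) - xj * (1.0 / uj - 1.0 / vj)
                   for xj, uj, vj in zip(x, u, v))

    for _ in range(50):  # safety cap on the number of levels
        samples = [draw() for _ in range(n)]
        scores = sorted(sum(x) for x in samples)
        level = min(scores[int((1.0 - rho) * n)], gamma)
        elite = [x for x in samples if sum(x) >= level]
        weights = [math.exp(log_weight(x)) for x in elite]
        total = sum(weights)
        # Weighted mean per coordinate: the analytic cross-entropy fit.
        v = [sum(w * x[j] for w, x in zip(weights, elite)) / total
             for j in range(d)]
        if level >= gamma:
            break

    # Final importance sampling run with the fitted distribution.
    estimate = 0.0
    for _ in range(n_final):
        x = draw()
        if sum(x) >= gamma:
            estimate += math.exp(log_weight(x))
    return estimate / n_final, v

gamma = 25.0
estimate, fitted_means = ce_rare_event(gamma)
# Closed form for the tail of a sum of 5 unit exponentials.
exact = math.exp(-gamma) * sum(gamma ** k / math.factorial(k) for k in range(5))
print(estimate, exact, fitted_means)
```

   Plain Monte-Carlo with the same budget would most likely see no
   event at all, since the target probability is of order 1e-7.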






Other examples




    ◮   Modeling rare events for any complex probability distribution,
        e.g. Bayesian networks.
    ◮   Estimating tails for the sum of fat-tailed distributions







From integration to optimization

   Using an elite sample to help convergence is a trick that does a
   form of hill climbing of a smooth function approximating the
   indicator function of the rare event.
     ◮   Interesting even if not interested in integrating f .
     ◮   Keep iterating based on an elite sample to converge towards
         one global maximum.
     ◮   variance of the sampling distribution follows the curvature of
         f.
     ◮   e.g. using a multivariate normal allows the covariance to
         reflect the differential
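
   A minimal sketch of the optimization loop, assuming independent
   normal sampling per coordinate (a diagonal covariance rather than a
   full multivariate normal) and a hypothetical quadratic `objective`
   with its maximum at (1, -2).

```python
import random

def ce_maximize(f, mean, std, n=200, n_elite=20, iterations=30, seed=0):
    """Cross-entropy maximization with independent normal sampling."""
    rng = random.Random(seed)
    dim = len(mean)
    for _ in range(iterations):
        samples = [[rng.gauss(mean[j], std[j]) for j in range(dim)]
                   for _ in range(n)]
        samples.sort(key=f, reverse=True)  # best samples first
        elite = samples[:n_elite]
        # Refit the mean and standard deviation of each coordinate.
        mean = [sum(x[j] for x in elite) / n_elite for j in range(dim)]
        std = [(sum((x[j] - mean[j]) ** 2 for x in elite) / n_elite) ** 0.5
               for j in range(dim)]
    return mean, std

def objective(x):
    # Hypothetical smooth test function, maximal at (1, -2).
    return -(x[0] - 1.0) ** 2 - (x[1] + 2.0) ** 2

best, spread = ce_maximize(objective, mean=[0.0, 0.0], std=[5.0, 5.0])
print(best, spread)
```

   The standard deviations shrink as the elite sample concentrates, so
   the sampling distribution narrows around the maximum it has found.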





Combinatorial optimization



   One classical example is combinatorial optimization. To solve a
   TSP with Cross-Entropy:
     ◮   Assume the travel is a Markov chain on the graph nodes.
     ◮   Generate travels by coercing them to be permutations.
     ◮   Update transition probabilities from the elite sample.
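
   The three steps can be sketched as follows, on six cities placed on
   a regular hexagon so that the optimal tour length is known to be 6.
   The smoothing factor `alpha`, the sample sizes and the helper names
   are assumptions made for this example.

```python
import math
import random

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def sample_tour(trans, rng):
    """Walk the Markov chain from city 0, never revisiting a city."""
    tour, unvisited = [0], list(range(1, len(trans)))
    while unvisited:
        weights = [trans[tour[-1]][j] for j in unvisited]
        nxt = rng.choices(unvisited, weights=weights)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def ce_tsp(pts, n_samples=500, n_elite=50, iterations=20, alpha=0.7, seed=0):
    rng = random.Random(seed)
    n = len(pts)
    # Start from uniform transition probabilities (no self-loops).
    trans = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)]
             for i in range(n)]
    best = None
    for _ in range(iterations):
        tours = sorted((sample_tour(trans, rng) for _ in range(n_samples)),
                       key=lambda t: tour_length(t, pts))
        if best is None or tour_length(tours[0], pts) < tour_length(best, pts):
            best = tours[0]
        # Count the transitions used by the elite (shortest) tours.
        counts = [[0.0] * n for _ in range(n)]
        for t in tours[:n_elite]:
            for a, b in zip(t, t[1:]):
                counts[a][b] += 1.0
        for i in range(n):
            row = sum(counts[i])
            if row > 0:
                # Smoothed update keeps off-diagonal probabilities positive.
                trans[i] = [alpha * c / row + (1 - alpha) * p
                            for c, p in zip(counts[i], trans[i])]
    return best, tour_length(best, pts)

cities = [(math.cos(2 * math.pi * k / 6), math.sin(2 * math.pi * k / 6))
          for k in range(6)]
best_tour, best_len = ce_tsp(cities)
print(best_tour, best_len)
```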







Clustering


   CE does clustering too!
     ◮   Assign probabilities of membership to classes for each point
         (the sampling distribution).
     ◮   Sample random membership assignments.
     ◮   Use average distance to centroids to find an elite sample.
     ◮   Slower than K-means but much less sensitive to initial choice
         of centroids.







A form of global optimization



   Is it global optimization?
     ◮   If the sampling distribution is bounded below by a distribution
         that covers the global maximum, yes, with probability 1!
     ◮   In practice we may never see one maximum and converge to
         another local maximum.







Fitting model parameters with CE


   Cross-Entropy techniques generally work very well for finding the ML
   parameters of a model. Why?
     ◮   Models often have different sensitivities to different
         parameters; CE reflects that.
     ◮   With a covariance structure, it does a form of gradient ascent.
     ◮   But it can deal with discrete parameters at the same time!
     ◮   It does not tend to get trapped in local maxima.
     ◮   Well suited for high-dimensional parameter spaces.
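
As a hedged illustration, here is CE used to fit the mean and standard deviation of a normal model by maximizing its log-likelihood. The model, the starting sampler and the sample sizes are assumptions for the example. For brevity the sampler uses independent normals per parameter rather than a full covariance structure, so it only shows the per-parameter sensitivity point.

```python
import math
import random
import statistics


def log_likelihood(data, mu, log_sigma):
    """Normal log-likelihood of the data, up to an additive constant."""
    sigma = math.exp(log_sigma)
    return sum(-log_sigma - 0.5 * ((x - mu) / sigma) ** 2 for x in data)


def ce_fit_normal(data, n_samples=200, n_elite=20, iters=40, seed=0):
    rng = random.Random(seed)
    # Sampling distribution over (mu, log_sigma): independent normals (assumed).
    means, stds = [0.0, 0.0], [5.0, 2.0]
    for _ in range(iters):
        thetas = [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n_samples)]
        thetas.sort(key=lambda t: log_likelihood(data, t[0], t[1]), reverse=True)
        elite = thetas[:n_elite]
        # Each parameter's spread shrinks at its own rate, following its sensitivity.
        means = [statistics.fmean([t[j] for t in elite]) for j in range(2)]
        stds = [statistics.pstdev([t[j] for t in elite]) for j in range(2)]
    return means[0], math.exp(means[1])


data_rng = random.Random(1)
data = [data_rng.gauss(3.0, 2.0) for _ in range(200)]
mu_hat, sigma_hat = ce_fit_normal(data)
```

For this model the maximum-likelihood answer is known in closed form (sample mean and population standard deviation), which gives a direct check on the CE result.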




Forgetting maxima



   Some maxima can be "forgotten"
    ◮   Smooth changes in the sampling function.
    ◮   Expand the sampling function (equivalent to applying a prior
        or "shrinkage").
    ◮   Keep the entire sample.
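
The first two remedies can be sketched for a one-dimensional normal sampler as below. The names smooth and update_sampler, the factor alpha and the floor on sigma are hypothetical; the floor is just one simple way to keep the sampler from collapsing and stands in for the "expand" step.

```python
import statistics


def smooth(old, new, alpha):
    """Move only a fraction alpha of the way from the old value to the new fit."""
    return alpha * new + (1.0 - alpha) * old


def update_sampler(mu, sigma, elite, alpha=0.5, sigma_floor=0.05):
    # Plain CE would jump straight to the elite fit (alpha = 1, no floor).
    fit_mu = statistics.fmean(elite)
    fit_sigma = statistics.pstdev(elite)
    new_mu = smooth(mu, fit_mu, alpha)
    # Keep the sampler from narrowing below an assumed minimum width.
    new_sigma = max(smooth(sigma, fit_sigma, alpha), sigma_floor)
    return new_mu, new_sigma
```

With alpha below 1 the sampler keeps some mass where it was before, so a region that produced no elite points in one iteration is not discarded immediately.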




Not converging to a maximum



  Multiple maxima may prevent the variance of the sampling
  distribution from decreasing.
    ◮   Mixtures of multivariate normals can deal with this.
    ◮   They can be introduced dynamically.
    ◮   Fit with EM.
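
A minimal sketch of the idea in one dimension: fit a two-component normal mixture to an elite sample with EM, then draw from it. The fixed number of components, the initialization at the sample extremes and the floor on sigma are assumptions for the example; the slide's dynamic introduction of components is not shown.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def em_two_normals(xs, iters=50, sigma_floor=1e-3):
    # Start the two components at the extremes of the elite sample.
    mus = [min(xs), max(xs)]
    spread = (max(xs) - min(xs)) / 4 or 1.0
    sigmas = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted refit of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(xs)
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigmas[k] = max(math.sqrt(var), sigma_floor)
    return weights, mus, sigmas


def sample_mixture(rng, weights, mus, sigmas):
    """Pick a component by weight, then draw from its normal."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(mus[k], sigmas[k])
```

With an elite sample split between two maxima, a single normal would need a large variance to cover both, whereas each mixture component can narrow around its own maximum.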




Independent variables




   If the sampling distribution is separable, convergence can be sped
   up by sampling over one dimension at a time.
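
One possible reading of this, sketched under assumptions: with a sampler that is a product of independent normals, update one coordinate per step while holding the others at their current means. The toy objective, the function name and the sample sizes are made up for the example.

```python
import random
import statistics


def objective(x, y):
    """Toy objective with its maximum at (1, -2)."""
    return -((x - 1.0) ** 2) - (y + 2.0) ** 2


def ce_one_dimension_at_a_time(n_samples=100, n_elite=10, sweeps=30, seed=0):
    rng = random.Random(seed)
    mu, sigma = [0.0, 0.0], [3.0, 3.0]
    for _ in range(sweeps):
        for d in range(2):
            candidates = []
            for _ in range(n_samples):
                point = list(mu)  # other coordinates held at their current means
                point[d] = rng.gauss(mu[d], sigma[d])
                candidates.append(point)
            candidates.sort(key=lambda p: objective(*p), reverse=True)
            elite = [p[d] for p in candidates[:n_elite]]
            # Refit only the marginal of the dimension that was sampled.
            mu[d], sigma[d] = statistics.fmean(elite), statistics.pstdev(elite)
    return mu


best = ce_one_dimension_at_a_time()
```

Each step is a one-dimensional problem, so a small sample is enough to locate the elite region for that coordinate.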




Questions




   Questions?





Simulation of rare events and optimisation with the cross-entropy method

  • 1. What is cross-entropy? From Riemann to Monte-Carlo Cross-Entropy techniques Cross-Entropy tricks Questions Using cross-entropy techniques for rare event simulation and optimization Arthur Breitman NYC Machine learning meetup August 18, 2011 Arthur Breitman crossentropy for rare event simulation and optimization
  • 2. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 3. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions Information entropy definition of information entropy ◮ Entropy measures disorder of a physical system ◮ Entropy measures information (Shannon) ◮ Entropy measures ignorance (E.T. Jaynes) ◮ Formally: H=− p(x) ln(p(x)) x∈Ω Arthur Breitman crossentropy for rare event simulation and optimization
  • 4. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions The continuous case In the continuous case, for a random variable X with p.d.f p(x) entropy is defined as H(X ) = − P(x) ln(p(x))dx Ω Simple, right? Arthur Breitman crossentropy for rare event simulation and optimization
  • 5. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions The entropy of a probability distribution is meaningless Wrong! ◮ Not invariant under a change of variable ◮ Can even be negative! ◮ Not an extension of Shannon’s entropy. Arthur Breitman crossentropy for rare event simulation and optimization
  • 6. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions E.T. Jaynes to the rescue E.T. Jaynes, adjusted the definition. Consider a sequence of discrete values in Ω dense in Ω, it must a approach a distribution m. Set p(x) H(X ) = − P(x) ln dx Ω m(x) N.B. m is not necessarily a probability distribution, just a density, so improper priors are O.K. Arthur Breitman crossentropy for rare event simulation and optimization
  • 7. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 8. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions Definition of KL divergence Kullback-Leibler divergence: entropy of a probability distribution p relative to probability distribution q p(x) DKL (P||Q) = − P(x) ln dx Ω q(x) ◮ Similar but distinct from entropy. ◮ Expected number of nats (or bits) to encode data drawn from Q assuming it is drawn from P. ◮ Not symmetric! Arthur Breitman crossentropy for rare event simulation and optimization
  • 9. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions Why code length matter ◮ All ML problems ⇔ fitting a probability distribution ◮ KL divergence measures how concise your description is ◮ Relates to MDL and Solomonoff induction ◮ PAC-learning patches against a lack of epistemology Arthur Breitman crossentropy for rare event simulation and optimization
  • 10. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks Questions Likelihood of parameters and Cross-Entropy Given a sample {q}i of Q, and {P}θ∈Θ , 1 LL(θ|{q}i ) = H(Pθ ) + DKL Pθ δqi N i The likelihood of θ is the KL-divergence of Pθ w.r.t a Dirac comb. Arthur Breitman crossentropy for rare event simulation and optimization
  • 11. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 12. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Riemann integration How does one compute the integral of a function? Rectangle method: b N−1 1 i f (x)dx → f a + (b − a) a N N i=0 Linear convergence. Arthur Breitman crossentropy for rare event simulation and optimization
  • 13. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions The curse of dimensionality Multiple dimensions? b1 bm N−1 N−1 1 1 ··· f (x)dx → m ··· f a+ i ◦ (b − a) a1 am N N i1 =0 im =0 Computation is exponential in m. Arthur Breitman crossentropy for rare event simulation and optimization
  • 14. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 15. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Monte-Carlo integration If P is a probability distribution over Ω, draw {x}i from P: N 1 f (xi ) f (x)dx ∼ Ω N p(xi ) i=1 Very simple to implement, often p ∼ 1 Arthur Breitman crossentropy for rare event simulation and optimization
  • 16. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Monte-Carlo convergence ◮ Let random variable Xp = f (x)/p(x) ◮ If var(Xp ) < ∞, convergence is O(N 1/2 ) by the central-limit theorem! ◮ If m > 2, Monte-Carlo becomes attractive. Arthur Breitman crossentropy for rare event simulation and optimization
  • 17. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Problems with MC ◮ If the mass of f is concentrated in a small region, convergence can be very slow. ◮ also a problem with Riemann integration... Arthur Breitman crossentropy for rare event simulation and optimization
  • 18. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 19. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Importance sampling ◮ Sample preferably the regions of interest by picking p to minimize the variance of f /p ◮ In Riemann world, equivalent to an irregular grid f ◮ Ideal sampling distribution (if f > 0) is f , but we don’t know f! ◮ Best convergence when χ2 of f w.r.t p is minimized Arthur Breitman crossentropy for rare event simulation and optimization
  • 20. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Adaptive importance sampling ◮ What if we don’t know the shape of f ? ◮ Learn it adaptively from the sampling. ◮ Iteratively improve the importance sampling function. Arthur Breitman crossentropy for rare event simulation and optimization
  • 21. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling Questions Vegas algorithm and cross-entropy ◮ Vegas algorithm, use histograms and separate variables ◮ Cross-entropy algorithm, pick p from a family of distributions to minimize cross-entropy to the sample Arthur Breitman crossentropy for rare event simulation and optimization
  • 22. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 23. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Why cross-entropy? In many cases, the expression is analytical and computationally cheap to derive, e.g. ◮ the uniform distribution ◮ the categorical distribution (finite, discrete) ◮ all the natural exponential family Arthur Breitman crossentropy for rare event simulation and optimization
  • 24. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions The natural exponential distribution? fX (x|θ) = h(x) exp (θ∗ x − A(θ)) ◮ theta is the sufficient statistic ◮ maximum cross-entropy distribution given θ w.r.t dH ◮ Examples: normal, multivariate normal, gamma, binomial, multinomial, negative binomial Arthur Breitman crossentropy for rare event simulation and optimization
  • 25. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Beta distribution Not analytical! To fit, start with approximate values from the moment’s method ¯ ¯ X (1 − X ¯ ¯ X (1 − X ¯ α=X ¯ − 1 , β = (1 − X ) −1 S2 S2 The likelihood is given by n n n(ln(Γ(α+β)−ln(Γ(α)−ln(Γ(β))+(α−1) ln(Xi )+(β−1) ln(1−Xi ) i=0 i=0 The first and second derivatives are the digamma and trigamma function, available in the gsl. Newton’s method using the Jacobian converges in a couple iterations. Very useful to model bounded variables. Arthur Breitman crossentropy for rare event simulation and optimization
  • 26. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 27. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Surviving the zombie hordes Figure: Electric fences, the horde and you Arthur Breitman crossentropy for rare event simulation and optimization
  • 28. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Simulating zombie breakouts ◮ Each fence (Ui , λi ) delivers u ∼ max(Ui − Exp(λi ), 0) volts. ◮ Crossing a fence deals u damage to a zombie ◮ Zombies come from everywhere and can take 5 damage hits each. ◮ Zombies outbreaks are very rare! Arthur Breitman crossentropy for rare event simulation and optimization
  • 29. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Mere integration fails! ◮ We can estimate this probability by sampling the random voltages and finding a shortest path. ◮ Speed of Monte-Carlo proportional to poutbreak (1 − poutbreak ), too slow! Arthur Breitman crossentropy for rare event simulation and optimization
  • 30. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Cross-Entropy to the rescue ◮ We want to approximate the multivariate power distribution conditional on an outbreak occurring! ◮ Approximate the shape by changing the parameters Ui and λi for each fence ◮ Generate samples, fit Ui and λi on the samples inducing an outbreak Arthur Breitman crossentropy for rare event simulation and optimization
  • 31. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions The elite sample What if the probability is so low that we don’t observe any outbreak in our sample? ◮ Generate n samplings using the sampling distribution ◮ If more than e samples are outbreaks, fit to those samples, break ◮ Otherwise, fit on the e best sample, the elite sample. ◮ Iterate ◮ Generate a sample, weight each points by the importance sampling weight, estimate probability Arthur Breitman crossentropy for rare event simulation and optimization
  • 32. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Other examples ◮ Modeling rare event for any complex probability distribution, e.g. Bayesian networks. ◮ Estimating tails for the sum of fat-tailed distributions Arthur Breitman crossentropy for rare event simulation and optimization
  • 33. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions Outline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
  • 34. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters Questions From integration to optimization Using an elite sample to help convergence is a trick that does a form of hill climbing of a smooth function approximating the indicator function of the rare event. ◮ Interesting even if not interested in integrating f . ◮ Keep iterating based on an elite sample to converge towards one global maximum. ◮ variance of the sampling distribution follows the curvature of f. ◮ e.g. using a multivariate normal allows the covariance to reflect the differential Arthur Breitman crossentropy for rare event simulation and optimization
  • 35. Combinatorial optimization
    One classical example is combinatorial optimization. To solve a TSP with Cross-Entropy:
    ◮ Assume the travel is a Markov chain on the graph nodes.
    ◮ Generate travels by coercing them to be permutations.
    ◮ Update transition probabilities from the elite sample.
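The three steps above can be sketched as follows. This is an illustrative toy, not the speaker's code: the cities are placed on a unit circle so the optimal tour is known, and the smoothing factor `alpha`, the small constant added to the weights, and all function names are assumptions of the sketch.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    total = 0.0
    for i in range(len(tour)):
        (ax, ay), (bx, by) = pts[tour[i]], pts[tour[(i + 1) % len(tour)]]
        total += math.hypot(ax - bx, ay - by)
    return total


def sample_tour(P, rng):
    """Run the Markov chain from node 0, restricted to unvisited nodes."""
    tour, unvisited = [0], list(range(1, len(P)))
    while unvisited:
        cur = tour[-1]
        # Coerce the path into a permutation: only unvisited nodes are eligible.
        weights = [P[cur][j] + 1e-12 for j in unvisited]
        nxt = rng.choices(unvisited, weights=weights)[0]
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def ce_tsp(pts, n=400, e=40, iterations=25, alpha=0.7, seed=0):
    rng = random.Random(seed)
    m = len(pts)
    # Start from uniform transition probabilities.
    P = [[0.0 if i == j else 1.0 / (m - 1) for j in range(m)] for i in range(m)]
    best = None
    for _ in range(iterations):
        tours = sorted((sample_tour(P, rng) for _ in range(n)),
                       key=lambda t: tour_length(t, pts))
        elite = tours[:e]
        if best is None or tour_length(elite[0], pts) < tour_length(best, pts):
            best = elite[0]
        # Count the transitions used by the elite tours.
        counts = [[0] * m for _ in range(m)]
        for t in elite:
            for i in range(m):
                counts[t[i]][t[(i + 1) % m]] += 1
        # Smoothed update of the transition matrix from the elite frequencies.
        for i in range(m):
            for j in range(m):
                P[i][j] = alpha * counts[i][j] / e + (1 - alpha) * P[i][j]
    return best, tour_length(best, pts)


cities = [(math.cos(2 * math.pi * k / 7), math.sin(2 * math.pi * k / 7)) for k in range(7)]
best_tour, best_len = ce_tsp(cities)
print(best_tour, best_len)
```

The sketch keeps the best tour seen so far, since the transition matrix itself only encodes a distribution over tours.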
  • 36. Clustering
    CE does clustering too!
    ◮ Assign probabilities of membership to classes for each point (the sampling distribution).
    ◮ Sample random membership assignments.
    ◮ Use average distance to centroids to find an elite sample.
    ◮ Slower than K-means but much less sensitive to the initial choice of centroids.
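A small sketch of the clustering recipe above, on made-up one-dimensional data. The names `score` and `ce_cluster`, the smoothing factor `alpha`, and the choice to return the best assignment seen (rather than the final membership probabilities, which are only defined up to a relabeling of the clusters) are assumptions of this example.

```python
import random


def score(labels, xs, k):
    """Average distance from each point to the centroid of its assigned cluster."""
    groups = [[x for x, l in zip(xs, labels) if l == c] for c in range(k)]
    if any(not g for g in groups):
        return float("inf")  # reject assignments that leave a cluster empty
    cents = [sum(g) / len(g) for g in groups]
    return sum(abs(x - cents[l]) for x, l in zip(xs, labels)) / len(xs)


def ce_cluster(xs, k=2, n=200, e=20, iterations=15, alpha=0.7, seed=0):
    rng = random.Random(seed)
    # Sampling distribution: one membership probability row per point.
    p = [[1.0 / k] * k for _ in xs]
    best, best_score = None, float("inf")
    for _ in range(iterations):
        draws = [[rng.choices(range(k), weights=[w + 1e-12 for w in row])[0]
                  for row in p] for _ in range(n)]
        draws.sort(key=lambda d: score(d, xs, k))
        elite = draws[:e]
        s = score(elite[0], xs, k)
        if s < best_score:
            best, best_score = elite[0], s
        # Refit membership probabilities to the elite frequencies, with smoothing.
        for i in range(len(xs)):
            for c in range(k):
                freq = sum(1 for d in elite if d[i] == c) / e
                p[i][c] = alpha * freq + (1 - alpha) * p[i][c]
    return best, best_score


points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels, avg_dist = ce_cluster(points)
print(labels, avg_dist)
```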
  • 37. A form of global optimization
    Is it global optimization?
    ◮ If the sampling distribution is bounded below by a distribution that covers the global maximum, yes, with probability 1!
    ◮ In practice, the sample may never hit one maximum, and the method then converges to another, local maximum.
  • 39. Fitting model parameters with CE
    Cross-Entropy techniques generally work very well for finding the maximum-likelihood (ML) parameters of a model. Why?
    ◮ Models often have different sensitivities to different parameters; CE reflects that.
    ◮ With a covariance structure, it does a form of gradient ascent.
    ◮ But it can deal with discrete parameters at the same time!
    ◮ It does not tend to get trapped in local maxima.
    ◮ Well suited for high-dimensional parameter spaces.
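As an illustration of maximum-likelihood fitting with CE, the sketch below fits the mean and standard deviation of a normal model to a handful of made-up observations, where the answer is also available in closed form for comparison. The data, the parameterization by log sigma, and the names `log_likelihood` and `ce_fit` are assumptions of this example; the sampling distribution over parameters is again a product of independent normals.

```python
import math
import random
import statistics

DATA = [2.1, 1.9, 3.2, 2.8, 2.5, 1.7, 3.0, 2.4]  # made-up observations


def log_likelihood(theta):
    """Normal log-likelihood of DATA, up to an additive constant."""
    mu, log_sigma = theta
    var = math.exp(2.0 * log_sigma)
    return -len(DATA) * log_sigma - sum((x - mu) ** 2 for x in DATA) / (2.0 * var)


def ce_fit(n=300, e=30, iterations=40, seed=0):
    rng = random.Random(seed)
    mean, std = [0.0, 0.0], [3.0, 2.0]  # sampling distribution over (mu, log sigma)
    for _ in range(iterations):
        thetas = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(n)]
        elite = sorted(thetas, key=log_likelihood, reverse=True)[:e]
        for d in range(2):
            column = [t[d] for t in elite]
            mean[d], std[d] = statistics.mean(column), statistics.pstdev(column)
    return mean[0], math.exp(mean[1])


mu_hat, sigma_hat = ce_fit()
# Closed-form ML estimates for comparison.
print(mu_hat, statistics.mean(DATA))
print(sigma_hat, statistics.pstdev(DATA))
```

Each parameter gets its own spread in the sampling distribution, which is one way to read the remark that CE reflects different sensitivities to different parameters.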
  • 41. Forgetting maxima
    Some maxima can be "forgotten". Possible remedies:
    ◮ Smooth the changes in the sampling function.
    ◮ Expand the sampling function (equivalent to applying a prior or "shrinkage").
    ◮ Keep the entire sample.
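The first two remedies can be written as small post-processing steps applied after each fit. This is a sketch under assumptions: the helper names `smoothed_update` and `expand`, and the default values of `alpha`, `factor` and `floor`, are illustrative and not taken from the talk.

```python
def smoothed_update(old, fitted, alpha=0.3):
    """Move only a fraction alpha of the way from the old parameters to the fitted ones."""
    return [alpha * f + (1.0 - alpha) * o for o, f in zip(old, fitted)]


def expand(sigma, factor=1.5, floor=1e-3):
    """Widen the fitted spread and keep it above a floor so it cannot collapse."""
    return [max(factor * s, floor) for s in sigma]


# Example: the elite sample suggests mu = 10, but the sampling mean only moves part of the way.
print(smoothed_update([0.0], [10.0]))
print(expand([2.0, 0.0]))
```

Both steps slow down the contraction of the sampling distribution, which leaves more time for regions around other maxima to be sampled.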
  • 42. Not converging to a maximum
    Multiple maxima may prevent the variance of the sampling distribution from decreasing.
    ◮ Mixtures of multivariate normals can deal with this.
    ◮ Mixture components can be introduced dynamically.
    ◮ Fit the mixture with EM.
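A minimal sketch of the EM fit, assuming a one-dimensional elite sample and a fixed number of two components (the dynamic introduction of components is not shown). The function name `em_two_gaussians`, the initialization at the extremes of the sample, and the variance floor are assumptions of this example; the elite sample here is simulated around two made-up maxima at -3 and 3.

```python
import math
import random


def em_two_gaussians(xs, iterations=50, min_var=1e-6):
    """Fit a two-component 1D Gaussian mixture to xs with EM."""
    mu = [min(xs), max(xs)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [pi[k] * math.exp(-(x - mu[k]) ** 2 / (2.0 * var[k]))
                    / math.sqrt(2.0 * math.pi * var[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted maximum-likelihood update of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, min_var)
            pi[k] = nk / len(xs)
    return pi, mu, var


rng = random.Random(1)
elite = ([rng.gauss(-3.0, 0.5) for _ in range(100)]
         + [rng.gauss(3.0, 0.5) for _ in range(100)])
weights, means, variances = em_two_gaussians(elite)
print(weights, means, variances)
```

A single normal fitted to this sample would keep a large variance, whereas each mixture component can contract around its own maximum.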
  • 44. Independent variables
    If the sampling distribution is separable, convergence can be sped up by sampling over one dimension at a time.
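One possible reading of this trick, sketched under assumptions: with independent normals as the sampling distribution, each sweep perturbs a single coordinate while the others stay at their current means, and only that coordinate's mean and spread are refitted. The function name `ce_coordinatewise` and the separable test objective are hypothetical.

```python
import random
import statistics


def ce_coordinatewise(f, mu, sigma, n=100, e=10, sweeps=10, seed=0):
    """Cross-entropy updates applied to one dimension at a time."""
    rng = random.Random(seed)
    mu, sigma = list(mu), list(sigma)
    for _ in range(sweeps):
        for d in range(len(mu)):
            candidates = []
            for _ in range(n):
                x = list(mu)  # other coordinates stay at their current mean
                x[d] = rng.gauss(mu[d], sigma[d])
                candidates.append(x)
            elite = sorted(candidates, key=f, reverse=True)[:e]
            column = [x[d] for x in elite]
            mu[d], sigma[d] = statistics.mean(column), statistics.pstdev(column)
    return mu


TARGET = [1.0, -2.0, 3.0]  # made-up optimum of a separable objective


def separable(x):
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


print(ce_coordinatewise(separable, mu=[0.0, 0.0, 0.0], sigma=[4.0, 4.0, 4.0]))
```

Each one-dimensional search needs far fewer samples than a joint search over all coordinates, which is where the speed-up would come from when the problem really is separable.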
  • 45. Questions?