DERIVATIVE-FREE OPTIMIZATION

                       http://www.lri.fr/~teytaud/dfo.pdf
                           (or Quentin's web page ?)

Olivier Teytaud
Inria Tao, visiting the beautiful city of Liège           also using slides from A. Auger
The next slide is the most important of all.




In case of trouble, interrupt me.

In case of trouble, interrupt me.

         Further discussion needed:
            - R82A, Montefiore institute
            - olivier.teytaud@inria.fr
            - or after the lessons (the 25th, not the 18th)
I. Optimization and DFO
  II. Evolutionary algorithms
  III. From math. programming
  IV. Using machine learning
  V. Conclusions

Derivative-free optimization of f




Derivative-free optimization of f




Derivative-free optimization of f




                                                              No gradient !
                                                    Only depends on the x's and f(x)'s
Derivative-free optimization of f




      Why derivative-free optimization ?
Derivative-free optimization of f




      Why derivative-free optimization ?
       Ok, it's slower
Derivative-free optimization of f




      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
Derivative-free optimization of f




      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
       It's simpler (by far) ==> fewer bugs
Derivative-free optimization of f




      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
       It's simpler (by far)
       It's more robust (to noise, to strange functions...)
Derivative-free optimization of f

          Optimization algorithms:
           ==> Newton optimization ?
           ==> Quasi-Newton (BFGS)
           ==> Gradient descent
           ==> ...

      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
       It's simpler (by far)
       It's more robust (to noise, to strange functions...)
Derivative-free optimization of f

          Optimization algorithms
            Derivative-free optimization
               (don't need gradients)

      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
       It's simpler (by far)
       It's more robust (to noise, to strange functions...)
Derivative-free optimization of f

          Optimization algorithms
            Derivative-free optimization
               Comparison-based optimization
                      (coming soon),
                      just needing comparisons,
                      including evolutionary algorithms

      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
       It's simpler (by far)
       It's more robust (to noise, to strange functions...)
I. Optimization and DFO
  II. Evolutionary algorithms
  III. From math. programming
  IV. Using machine learning
  V. Conclusions

II. Evolutionary algorithms
          a. Fundamental elements
          b. Algorithms
          c. Math. analysis

Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
- Markov chains
   ==> for theoretical analysis
Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
- Markov chains
K exp( - p(x) ) with
                      - p(x) a degree-2 polynomial (negative dominant coefficient)
                      - K a normalization constant



Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
- Markov chains
K exp( - p(x) ) with
                  - p(x) a degree-2 polynomial (negative dominant coefficient)
                  - K a normalization constant
                  (μ: translation of the Gaussian; σ: size of the Gaussian)

Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
- Markov chains
Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
- Markov chains

  Isotropic case:
==> general case: density = K exp( - ||x - μ||² / (2σ²) )
==> level sets are rotationally invariant
==> completely defined by μ and σ
       (do you understand why K is fixed by σ ?)
==> “isotropic” Gaussian
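
As a minimal sketch (Python with numpy; the function name is ours, just for this illustration), sampling such an isotropic Gaussian is one line: μ plus σ times a standard normal vector.

import numpy as np

def sample_isotropic_gaussian(mu, sigma, n):
    """Draw n points from the isotropic Gaussian N(mu, sigma^2 I)."""
    return mu + sigma * np.random.randn(n, len(mu))

# Level sets are rotationally invariant: points concentrate on spheres
# around mu, at typical distance roughly sigma * sqrt(d).
mu = np.zeros(3)
points = sample_isotropic_gaussian(mu, 0.5, 1000)
print(np.mean(np.linalg.norm(points - mu, axis=1)))  # roughly 0.5 * sqrt(3)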
Preliminaries:
- Gaussian distribution
- Multivariate Gaussian distribution
- Non-isotropic Gaussian distribution
Step-size different on each axis



       K exp( - p(x) ) with
- p(x) a quadratic form (→ +infinity as ||x|| → infinity)
- K a normalization constant
Notions that we will see:
- Evolutionary algorithm
- Cross-over
- Truncation selection / roulette wheel
- Linear / log-linear convergence
- Estimation of Distribution Algorithm
- EMNA
- Self-adaptation
- (1+1)-ES with 1/5th rule
- Voronoi representation
- Non-isotropy
Comparison-based optimization



     Observation: we want robustness w.r.t. that:

     Opt is comparison-based if it uses the f-values only through their
     comparisons (the signs of f(x(i)) - f(x(j))), not through the
     values themselves.

Comparison-based optimization

                                                            y(i) = f(x(i))

     Opt is comparison-based if it uses the y(i) only through their
     comparisons.

Population-based comparison-based algorithms ?




  x(1) = ( x(1,1), x(1,2), ..., x(1,λ) ) = Opt()
  x(2) = ( x(2,1), x(2,2), ..., x(2,λ) ) = Opt(x(1),
                                      signs of diff)
           …             …            ...
  x(n) = ( x(n,1), x(n,2), ..., x(n,λ) ) = Opt(x(n-1),
                                      signs of diff)
    ==> let's write it for λ=2.

Population-based comparison-based algorithms ?



  x(1) = (x(1,1), x(1,2)) = Opt()
  x(2) = (x(2,1), x(2,2)) = Opt(x(1),
                               sign(y(1,1)-y(1,2)) )
           …           …           ...
  x(n) = (x(n,1), x(n,2)) = Opt(x(n-1),
                               sign(y(n-1,1)-y(n-1,2)) )

                   with y(i,j) = f ( x(i,j) )

Population-based comparison-based algorithms ?




  Abstract notations: x(i) is a population

  x(1) = Opt()
  x(2) = Opt(x(1), sign(y(1,1)-y(1,2)) )
          …           …            ...
  x(n) = Opt(x(n-1), sign(y(n-1,1)-y(n-1,2)) )



Population-based comparison-based algorithms ?



  Abstract notations: x(i) is a population, I(i) is an
                   internal state of the algorithm.

  x(1),I(1) = Opt()
  x(2),I(2) = Opt(x(1), sign(y(1,1)-y(1,2)), I(1) )
            …          …           ...
  x(n),I(n) = Opt(x(n-1), sign(y(n-1,1)-y(n-1,2)), I(n-1) )



Population-based comparison-based algorithms ?




  Abstract notations: x(i) is a population, I(i) is an
                   internal state of the algorithm.

  x(1),I(1) = Opt()
  x(2),I(2) = Opt(x(1), c(1), I(1) )
            …          …             ...
  x(n),I(n) = Opt(x(n-1), c(n-1), I(n-1) )

  (where c(i) abbreviates the comparison results, e.g. sign(y(i,1)-y(i,2)))


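As a minimal sketch of this abstraction (Python; the names Opt and run are ours, not from the slides), here is a λ=2 comparison-based loop in dimension 1: the optimizer receives only the sign of the fitness difference, never the fitness values themselves.

import random

def Opt(x_pair, comparison, internal_state):
    """One abstract step: new population from (previous population,
    comparison outcome, internal state).  Here the internal state is
    just a fixed step-size sigma."""
    sigma = internal_state
    # Keep the winner of the comparison, propose lambda=2 new points.
    winner = x_pair[0] if comparison <= 0 else x_pair[1]
    new_pair = [winner + sigma * random.gauss(0, 1),
                winner + sigma * random.gauss(0, 1)]
    return new_pair, sigma

def run(f, n=200, sigma=0.3):
    x_pair, state = [random.gauss(0, 1), random.gauss(0, 1)], sigma
    for _ in range(n):
        y1, y2 = f(x_pair[0]), f(x_pair[1])
        comparison = (y1 > y2) - (y1 < y2)   # sign(y1 - y2): all Opt sees
        x_pair, state = Opt(x_pair, comparison, state)
    return x_pair

print(run(lambda x: (x - 3) ** 2))  # both points end up near 3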
Comparison-based optimization

           ==> Same behavior on many functions
                    (e.g. on f and on g ∘ f for increasing g)

     Opt is comparison-based if it uses the f-values only through
     their comparisons.

Comparison-based optimization

           ==> Same behavior on many functions
                    (e.g. on f and on g ∘ f for increasing g)

     Opt is comparison-based if it uses the f-values only through
     their comparisons.

         Quasi-Newton methods very poor on this.
Why comparison-based algorithms ?
         ==> more robust
         ==> this can be mathematically formalized:
                  comparison-based optimization is slow
                  ( d · log ||x(n) - x*|| / n ~ constant )
                  but robust (optimal for some worst-case analysis)
II. Evolutionary algorithms
          a. Fundamental elements
          b. Algorithms
          c. Math. analysis

Basic schema of an Evolution Strategy

   Parameters: x, σ

                       Generate λ points around x
                      ( x + σ N where N is a standard Gaussian)
Basic schema of an Evolution Strategy

   Parameters: x, σ

                       Generate λ points around x
                      ( x + σ N where N is a standard Gaussian)

                      Compute their λ fitness values
Basic schema of an Evolution Strategy

   Parameters: x, σ

                       Generate λ points around x
                      ( x + σ N where N is a standard Gaussian)

                      Compute their λ fitness values

                             Select the μ best
Basic schema of an Evolution Strategy

   Parameters: x, σ

                       Generate λ points around x
                      ( x + σ N where N is a standard Gaussian)

                      Compute their λ fitness values

                             Select the μ best

                      Let x = average of these μ best
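
As a minimal sketch (Python/numpy; the step-size σ is kept fixed here, adaptation comes later in the slides), the whole scheme fits in a few lines:

import numpy as np

def basic_es(f, x, sigma, lam=12, mu=3, iterations=100):
    """Basic (mu/mu, lambda) Evolution Strategy with fixed step-size."""
    for _ in range(iterations):
        # Generate lambda points around x: x + sigma * N, N standard Gaussian.
        points = x + sigma * np.random.randn(lam, len(x))
        # Compute their lambda fitness values (minimization here).
        fitness = np.array([f(p) for p in points])
        # Select the mu best; let x = average of these mu best.
        x = points[np.argsort(fitness)[:mu]].mean(axis=0)
    return x

print(basic_es(lambda p: np.sum(p ** 2), x=np.ones(5), sigma=0.3))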
Obviously parallel

    Parameters: x, σ

                   Generate λ points around x
                   ( x + σ N where N is a standard Gaussian)

                   Compute their λ fitness values
                   ==> Multi-cores, Clusters, Grids...

                          Select the μ best

                   Let x = average of these μ best
Obviously parallel. Really simple.

    Parameters: x, σ

                      Generate λ points around x
                      ( x + σ N where N is a standard Gaussian)

                      Compute their λ fitness values

                             Select the μ best

                      Let x = average of these μ best
Obviously parallel. Really simple.
Not a negligible advantage.

    Parameters: x, σ

                      Generate λ points around x
                      ( x + σ N where N is a standard Gaussian)

                      Compute their λ fitness values

                             Select the μ best

                      Let x = average of these μ best

                             When I accessed, for the first time,
                             a crucial industrial code of an important
                             company, I believed that it would be
                             clean and bug-free. (I was young.)
The (1+1)-ES with 1/5th rule

  Parameters: x, σ

                            Generate 1 point x' around x
                           ( x + σ N where N is a standard Gaussian)

                              Compute its fitness value

                               Keep the best (x or x'):
                           x = best(x, x')

                           σ = 2 σ if x' is best
                           σ = 0.84 σ otherwise
This is x...
I generate λ=6 points
I select the μ=3 best points
x = average of these μ=3 best points
Ok.
Choosing an initial x is as in any algorithm.
But how do I choose sigma ?
Ok.
Choosing x is as in any algorithm.
But how do I choose sigma ?


Sometimes by human guess.
But for a large number of iterations,
we can do better.
log || x(n) – x* || ~ – C n

Usually termed “linear convergence”,
      ==> but it's in log-scale:
     log || x(n) – x* || ~ – C n
Examples of evolutionary algorithms




Estimation of Multivariate Normal Algorithm




Estimation of Multivariate Normal Algorithm




Estimation of Multivariate Normal Algorithm




Estimation of Multivariate Normal Algorithm




EMNA is usually non-isotropic




EMNA is usually non-isotropic




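The EMNA slides above are figures; as a hedged summary (a minimal sketch, not the exact code behind the figures): sample from a multivariate Gaussian, select the μ best, and re-estimate the mean and the full covariance from the selected points. The full covariance is what makes the search distribution non-isotropic.

import numpy as np

def emna(f, mean, cov, lam=50, mu=12, iterations=60):
    """Estimation of Multivariate Normal Algorithm (minimal sketch)."""
    for _ in range(iterations):
        points = np.random.multivariate_normal(mean, cov, size=lam)
        fitness = np.array([f(p) for p in points])
        selected = points[np.argsort(fitness)[:mu]]
        # Re-estimate the search distribution from the mu best points;
        # the full covariance adapts to the shape of the level sets.
        mean = selected.mean(axis=0)
        cov = np.cov(selected, rowvar=False) + 1e-10 * np.eye(len(mean))
    return mean

print(emna(lambda p: p[0] ** 2 + 100 * p[1] ** 2, np.ones(2) * 5, np.eye(2)))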
Self-adaptation (works in many frameworks)




Self-adaptation (works in many frameworks)




                                              Can be used for non-isotropic
                                                   multivariate Gaussian
                                                       distributions.


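As a minimal sketch of self-adaptation (the log-normal mutation of sigma used here is the classical rule, an assumption rather than something shown on the slide): each offspring carries its own step-size, which is mutated before the point itself, and selection acts on the (point, step-size) pairs.

import numpy as np

def self_adaptive_es(f, x, sigma, lam=12, mu=3, iterations=200):
    """(mu/mu, lambda)-ES with self-adapted step-size."""
    d = len(x)
    tau = 1.0 / np.sqrt(d)  # classical learning rate for log(sigma)
    for _ in range(iterations):
        # Each offspring first mutates the step-size, then the point.
        sigmas = sigma * np.exp(tau * np.random.randn(lam))
        points = x + sigmas[:, None] * np.random.randn(lam, d)
        best = np.argsort([f(p) for p in points])[:mu]
        # Selection acts on pairs: recombine points and step-sizes.
        x = points[best].mean(axis=0)
        sigma = np.exp(np.log(sigmas[best]).mean())
    return x, sigma

x, sigma = self_adaptive_es(lambda p: np.sum(p ** 2), np.ones(5), 1.0)
print(x, sigma)  # sigma shrinks as x approaches the optimum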
Let's generalize.

 We have seen algorithms which work as follows:

  - we keep one search point in memory
    (and one step-size)
  - we generate individuals
  - we evaluate these individuals
  - we regenerate a search point and a step-size

Maybe we could keep more than one search point ?
Let's generalize.

  We have seen algorithms which work as follows:

  - we keep one search point in memory (and one step-size)
       ==> mu search points
  - we generate individuals
       ==> lambda generated individuals
  - we evaluate these individuals
  - we regenerate a search point and a step-size

Maybe we could keep more than one search point ?
Obviously parallel. Really simple.

  Parameters: x1, ..., xμ

                           Generate λ points
                           around x1, ..., xμ
                      e.g. each x randomly generated
                             from two points

                      Compute their λ fitness values

                             Select the μ best

                             Don't average...
Generate λ points
       around x1, ..., xμ
e.g. each x randomly generated
       from two points
              ==> This is a cross-over
Example of procedure for generating a point:

  - Randomly draw k parents x1, ..., xk
       (truncation selection: randomly in selected individuals)

  - For generating the i-th coordinate of new individual z:
                           u = random(1,k)
                           z(i) = x(u)(i)
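
As a minimal sketch of this procedure (Python/numpy; the function name is ours, just for this illustration):

import numpy as np

def generate_point(selected, k=2):
    """Generate one new individual z from k randomly drawn parents:
    each coordinate is copied from a uniformly chosen parent."""
    # Randomly draw k parents x1,...,xk among the selected individuals.
    parents = selected[np.random.randint(len(selected), size=k)]
    d = parents.shape[1]
    # For the i-th coordinate: u = random(1,k); z(i) = x(u)(i).
    u = np.random.randint(k, size=d)
    return parents[u, np.arange(d)]

selected = np.array([[0., 0., 0.], [1., 1., 1.], [2., 2., 2.]])
print(generate_point(selected, k=2))  # coordinates mixed from two parents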
Let's summarize:
We have seen a general scheme for optimization:
 - generate a population (e.g. from some distribution, or from
             a set of search points)
 - select the best = new search points


==> Small difference between an
    Evolutionary Algorithm (EA) and an
    Estimation of Distribution Algorithm (EDA).
==> Some EAs (older than the EDA acronym) are EDAs.
Let's summarize:
We have seen a general scheme for optimization:
 - generate a population (e.g. from some distribution ==> EDA,
             or from a set of search points ==> EA)
 - select the best = new search points

==> Small difference between an
    Evolutionary Algorithm (EA) and an
    Estimation of Distribution Algorithm (EDA).
==> Some EAs (older than the EDA acronym) are EDAs.
Gives a lot of freedom:
 - choose your representation
        and operators (depending on the problem)
 - if you have a step-size, choose the adaptation rule
 - choose your population-size λ (depending on your
                       computer/grid)

 - choose μ (carefully), e.g. μ = min(dimension, λ/4)
Gives a lot of freedom:
 - choose your operators (depending on the problem)
 - if you have a step-size, choose the adaptation rule
 - choose your population-size λ (depending on your
                        computer/grid)

 - choose μ (carefully), e.g. μ = min(dimension, λ/4)


Can handle strange things:
  - optimize a physical structure ?
  - structure represented as a Voronoi diagram
  - cross-over makes sense, benefits from local structure
  - not so many algorithms can work on that
Voronoi representation:
  - a family of points
  - their labels
==> cross-over makes sense
==> you can optimize a shape
Voronoi representation:
                        - a family of points
                           - their labels
                  ==> cross-over makes sense
                  ==> you can optimize a shape
                  ==> not that mathematical;
                          but really useful


Mutations: each label is changed with probability 1/n
Cross-over: each point/label is randomly drawn from one of
      the two parents
Voronoi representation:
                        - a family of points
                           - their labels
                  ==> cross-over makes sense
                  ==> you can optimize a shape
                   ==> not that mathematical;
                           but really useful


Mutations: each label is changed with probability 1/n
Cross-over: randomly pick one split in the representation:
                 - left part from parent 1
                 - right part from parent 2
               ==> related to biology
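
As a minimal sketch of these two operators on the Voronoi representation (Python; an individual is a pair (points, labels); evaluating the resulting shape is problem-specific and left out):

import random

def mutate(individual, n_labels=2):
    """Each label is changed with probability 1/n."""
    points, labels = individual
    n = len(labels)
    new_labels = [random.randrange(n_labels) if random.random() < 1.0 / n else lab
                  for lab in labels]
    return points, new_labels

def crossover(parent1, parent2):
    """Randomly pick one split: left part from parent 1, right part from parent 2."""
    n = len(parent1[1])
    s = random.randrange(1, n)
    return (parent1[0][:s] + parent2[0][s:],
            parent1[1][:s] + parent2[1][s:])

# An individual: the sites of a Voronoi diagram + one 0/1 label per cell.
ind1 = ([(0.1, 0.2), (0.5, 0.5), (0.9, 0.1)], [0, 1, 0])
ind2 = ([(0.3, 0.8), (0.6, 0.2), (0.7, 0.9)], [1, 1, 0])
print(crossover(mutate(ind1), ind2))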
Gives a lot of freedom:
 - choose your operators (depending on the problem)
 - if you have a step-size, choose the adaptation rule
 - choose your population-size λ (depending on your
                        computer/grid)

 - choose μ (carefully), e.g. μ = min(dimension, λ/4)


Can handle strange things:
  - optimize a physical structure ?
  - structure represented as a Voronoi diagram
  - cross-over makes sense, benefits from local structure
  - not so many algorithms can work on that
II. Evolutionary algorithms
          a. Fundamental elements
          b. Algorithms
          c. Math. analysis

Consider the (1+1)-ES.
  x(n) = x(n-1) or x(n-1) + σ(n-1) N
  We want to maximize:
              - E log || x(n) - x* ||



Consider the (1+1)-ES.
  x(n) = x(n-1) or x(n-1) + σ(n-1) N
  We want to maximize:
              - E log || x(n) - x* ||
           --------------------------
            - E log || x(n-1) – x* ||
Consider the (1+1)-ES.
  x(n) = x(n-1) or x(n-1) + σ(n-1) N
  We want to maximize:
              - E log || x(n) - x* ||
           --------------------------
            - E log || x(n-1) – x* ||

         We don't know x*. How can we optimize this ?
         We will observe the acceptance rate,
         and we will deduce if σ is too large or too small.
  - E log || x(n) - x* ||
 --------------------------
  - E log || x(n-1) – x* ||                         ON THE NORM FUNCTION

(figure: progress rate curve, with rejected mutations on one side
and accepted mutations on the other)
  - E log || x(n) - x* ||                           For each step-size,
 --------------------------                 evaluate this “expected progress rate”
  - E log || x(n-1) – x* ||                    and evaluate “P(acceptance)”

(figure: progress rate curve, with rejected mutations on one side
and accepted mutations on the other)
(figure: progress rate vs. acceptance rate; the rejected-mutations
region is marked)
(figure: progress rate vs. acceptance rate; “we want to be here!”
marks the peak; we observe (approximately) the acceptance rate)
(figure: progress rate vs. acceptance rate; a big step-size
sits at a low acceptance rate)
(figure: progress rate vs. acceptance rate; a small step-size
sits at a high acceptance rate)
(figure: progress rate vs. acceptance rate)
Small acceptance rate ==> decrease sigma
(figure: progress rate vs. acceptance rate)
Big acceptance rate ==> increase sigma
The 1/5th rule


                                Based on maths showing
                                     that good step-size
                                <==> success rate ~ 1/5

I. Optimization and DFO
  II. Evolutionary algorithms
  III. From math. programming
  IV. Using machine learning
  V. Conclusions

III. From math. programming
  ==> pattern search method

                                                    Comparison with ES:
                                                    - code more complicated
                                                    - same rate
                                                    - deterministic
                                                    - less robust




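As a minimal sketch of a pattern search method (Python/numpy; this simple coordinate-direction variant with step halving is an assumption for illustration, not necessarily the exact variant behind the slide):

import numpy as np

def pattern_search(f, x, step=1.0, tol=1e-8):
    """Deterministic pattern search along +/- coordinate directions."""
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for s in (step, -step):
                candidate = x.copy()
                candidate[i] += s
                fc = f(candidate)
                if fc < fx:              # accept an improving move
                    x, fx, improved = candidate, fc, True
        if not improved:
            step /= 2.0                  # no improvement: shrink the pattern
    return x, fx

print(pattern_search(lambda p: np.sum((p - 2) ** 2), np.zeros(4)))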
III. From math. programming

             Also:
             - Nelder-Mead algorithm (similar to pattern search,
                  better constant in the rate)




III. From math. programming

             Also:
             - Nelder-Mead algorithm (similar to pattern search,
                  better constant in the rate)
             - NEWUOA (using value functions and
                   not only comparisons)




I. Optimization and DFO
  II. Evolutionary algorithms
  III. From math. programming
  IV. Using machine learning
  V. Conclusions

IV. Using machine learning


    What if computing f takes days ?
==> parallelism
==> and “learn” an approximation of f

IV. Using machine learning



Statistical tools: f'(x) = approximation
                             ( x, x1, f(x1), x2, f(x2), … , xn, f(xn) )
                y(n+1) = f'( x(n+1) )


        e.g. f' = the quadratic function closest to f on the x(i)'s.
IV. Using machine learning


 ==> keyword “surrogate models”
 ==> use f' instead of f
 ==> periodically, re-use the real f
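
As a minimal sketch of such a surrogate (Python/numpy, 1-D for brevity; in dimension d one would fit a full quadratic form):

import numpy as np

def fit_quadratic_surrogate(xs, ys):
    """f' = the quadratic function closest (least squares) to f on the x(i)'s."""
    A = np.vstack([xs ** 2, xs, np.ones_like(xs)]).T
    a, b, c = np.linalg.lstsq(A, ys, rcond=None)[0]
    return lambda x: a * x ** 2 + b * x + c

f = lambda x: np.sin(x) + x ** 2            # stands for the expensive f
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # points already evaluated
f_surrogate = fit_quadratic_surrogate(xs, f(xs))
# Use f' instead of f; periodically re-check with the real f.
print(f_surrogate(0.5), f(0.5))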
I. Optimization and DFO
  II. Evolutionary algorithms
  III. From math. programming
  IV. Using machine learning
  V. Conclusions

Derivative-free optimization is fun.


==> nice maths
==> nice applications + easily parallel algorithms
==> can handle really complicated domains
   (mixed continuous / integer, optimization
   on sets of programs)


Yet,
often suboptimal on highly structured problems (when
       BFGS is easy to use, thanks to fast gradients)
Keywords, readings


==> cross-entropy (so close to evolution strategies)
==> genetic programming (evolutionary algorithms for
               automatically building programs)
==> H.-G. Beyer's book on ES = good starting point
==> many resources on the web
==> keep in mind that representation / operators are
   often the key
==> we only considered isotropic algorithms; sometimes not
   a good idea at all

Contenu connexe

Tendances

11.solution of linear and nonlinear partial differential equations using mixt...
11.solution of linear and nonlinear partial differential equations using mixt...11.solution of linear and nonlinear partial differential equations using mixt...
11.solution of linear and nonlinear partial differential equations using mixt...Alexander Decker
 
Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...Pierre Jacob
 
Spectral Learning Methods for Finite State Machines with Applications to Na...
  Spectral Learning Methods for Finite State Machines with Applications to Na...  Spectral Learning Methods for Finite State Machines with Applications to Na...
Spectral Learning Methods for Finite State Machines with Applications to Na...LARCA UPC
 
Approximate Bayesian Computation on GPUs
Approximate Bayesian Computation on GPUsApproximate Bayesian Computation on GPUs
Approximate Bayesian Computation on GPUsMichael Stumpf
 
Neural Processes
Neural ProcessesNeural Processes
Neural ProcessesSangwoo Mo
 
Influence of the sampling on Functional Data Analysis
Influence of the sampling on Functional Data AnalysisInfluence of the sampling on Functional Data Analysis
Influence of the sampling on Functional Data Analysistuxette
 
A current perspectives of corrected operator splitting (os) for systems
A current perspectives of corrected operator splitting (os) for systemsA current perspectives of corrected operator splitting (os) for systems
A current perspectives of corrected operator splitting (os) for systemsAlexander Decker
 
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...Alexander Decker
 
Artificial Intelligence and Optimization with Parallelism
Artificial Intelligence and Optimization with ParallelismArtificial Intelligence and Optimization with Parallelism
Artificial Intelligence and Optimization with ParallelismOlivier Teytaud
 
Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...
Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...
Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...Alex (Oleksiy) Varfolomiyev
 
Nonlinear Manifolds in Computer Vision
Nonlinear Manifolds in Computer VisionNonlinear Manifolds in Computer Vision
Nonlinear Manifolds in Computer Visionzukun
 
Prévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAMPrévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAMCdiscount
 
Machine learning of structured outputs
Machine learning of structured outputsMachine learning of structured outputs
Machine learning of structured outputszukun
 
simplex.pdf
simplex.pdfsimplex.pdf
simplex.pdfgrssieee
 

Tendances (17)

11.solution of linear and nonlinear partial differential equations using mixt...
11.solution of linear and nonlinear partial differential equations using mixt...11.solution of linear and nonlinear partial differential equations using mixt...
11.solution of linear and nonlinear partial differential equations using mixt...
 
Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...
 
Spectral Learning Methods for Finite State Machines with Applications to Na...
  Spectral Learning Methods for Finite State Machines with Applications to Na...  Spectral Learning Methods for Finite State Machines with Applications to Na...
Spectral Learning Methods for Finite State Machines with Applications to Na...
 
Approximate Bayesian Computation on GPUs
Approximate Bayesian Computation on GPUsApproximate Bayesian Computation on GPUs
Approximate Bayesian Computation on GPUs
 
Neural Processes
Neural ProcessesNeural Processes
Neural Processes
 
Influence of the sampling on Functional Data Analysis
Influence of the sampling on Functional Data AnalysisInfluence of the sampling on Functional Data Analysis
Influence of the sampling on Functional Data Analysis
 
A current perspectives of corrected operator splitting (os) for systems
A current perspectives of corrected operator splitting (os) for systemsA current perspectives of corrected operator splitting (os) for systems
A current perspectives of corrected operator splitting (os) for systems
 
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
 
Artificial Intelligence and Optimization with Parallelism
Artificial Intelligence and Optimization with ParallelismArtificial Intelligence and Optimization with Parallelism
Artificial Intelligence and Optimization with Parallelism
 
Rouviere
RouviereRouviere
Rouviere
 
Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...
Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...
Optimal Finite Difference Grids for Elliptic and Parabolic PDEs with Applicat...
 
Nonlinear Manifolds in Computer Vision
Nonlinear Manifolds in Computer VisionNonlinear Manifolds in Computer Vision
Nonlinear Manifolds in Computer Vision
 
Prévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAMPrévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAM
 
YSC 2013
YSC 2013YSC 2013
YSC 2013
 
Machine learning of structured outputs
Machine learning of structured outputsMachine learning of structured outputs
Machine learning of structured outputs
 
simplex.pdf
simplex.pdfsimplex.pdf
simplex.pdf
 
Fdtd
FdtdFdtd
Fdtd
 

En vedette

Undecidability in partially observable deterministic games
Undecidability in partially observable deterministic gamesUndecidability in partially observable deterministic games
Undecidability in partially observable deterministic gamesOlivier Teytaud
 
reinforcement learning for difficult settings
reinforcement learning for difficult settingsreinforcement learning for difficult settings
reinforcement learning for difficult settingsOlivier Teytaud
 
France presented to Taiwanese people
France presented to Taiwanese peopleFrance presented to Taiwanese people
France presented to Taiwanese peopleOlivier Teytaud
 
Complexity of multiobjective optimization
Complexity of multiobjective optimizationComplexity of multiobjective optimization
Complexity of multiobjective optimizationOlivier Teytaud
 
Derandomized evolution strategies (quasi-random)
Derandomized evolution strategies (quasi-random)Derandomized evolution strategies (quasi-random)
Derandomized evolution strategies (quasi-random)Olivier Teytaud
 
Presentación1 1
Presentación1 1Presentación1 1
Presentación1 1Jhavi17
 
Machine learning 2016: deep networks and Monte Carlo Tree Search
Machine learning 2016: deep networks and Monte Carlo Tree SearchMachine learning 2016: deep networks and Monte Carlo Tree Search
Machine learning 2016: deep networks and Monte Carlo Tree SearchOlivier Teytaud
 
Why power system studies (and many others!) should be open data / open source
Why power system studies (and many others!) should be open data / open sourceWhy power system studies (and many others!) should be open data / open source
Why power system studies (and many others!) should be open data / open sourceOlivier Teytaud
 

En vedette (11)

Plantilla trazos
Plantilla trazosPlantilla trazos
Plantilla trazos
 
Undecidability in partially observable deterministic games
Undecidability in partially observable deterministic gamesUndecidability in partially observable deterministic games
Undecidability in partially observable deterministic games
 
reinforcement learning for difficult settings
reinforcement learning for difficult settingsreinforcement learning for difficult settings
reinforcement learning for difficult settings
 
France presented to Taiwanese people
France presented to Taiwanese peopleFrance presented to Taiwanese people
France presented to Taiwanese people
 
Complexity of multiobjective optimization
Complexity of multiobjective optimizationComplexity of multiobjective optimization
Complexity of multiobjective optimization
 
An introduction to SVN
An introduction to SVNAn introduction to SVN
An introduction to SVN
 
Derandomized evolution strategies (quasi-random)
Derandomized evolution strategies (quasi-random)Derandomized evolution strategies (quasi-random)
Derandomized evolution strategies (quasi-random)
 
Presentación1 1
Presentación1 1Presentación1 1
Presentación1 1
 
Machine learning 2016: deep networks and Monte Carlo Tree Search
Machine learning 2016: deep networks and Monte Carlo Tree SearchMachine learning 2016: deep networks and Monte Carlo Tree Search
Machine learning 2016: deep networks and Monte Carlo Tree Search
 
Why power system studies (and many others!) should be open data / open source
Why power system studies (and many others!) should be open data / open sourceWhy power system studies (and many others!) should be open data / open source
Why power system studies (and many others!) should be open data / open source
 
Batchal slides
Batchal slidesBatchal slides
Batchal slides
 

Similaire à Derivative Free Optimization

Theories of continuous optimization
Theories of continuous optimizationTheories of continuous optimization
Theories of continuous optimizationOlivier Teytaud
 
Dynamic Programming
Dynamic ProgrammingDynamic Programming
Dynamic ProgrammingSahil Kumar
 
Noisy optimization --- (theory oriented) Survey
Noisy optimization --- (theory oriented) SurveyNoisy optimization --- (theory oriented) Survey
Noisy optimization --- (theory oriented) SurveyOlivier Teytaud
 
Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms Xin-She Yang
 
Q-Metrics in Theory and Practice
Q-Metrics in Theory and PracticeQ-Metrics in Theory and Practice
Q-Metrics in Theory and PracticeMagdi Mohamed
 
Q-Metrics in Theory And Practice
Q-Metrics in Theory And PracticeQ-Metrics in Theory And Practice
Q-Metrics in Theory And Practiceguest3550292
 
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...npinto
 
Metaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical AnalysisMetaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical AnalysisXin-She Yang
 
Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingScientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingEnthought, Inc.
 
Open GL 04 linealgos
Open GL 04 linealgosOpen GL 04 linealgos
Open GL 04 linealgosRoziq Bahtiar
 
Cuckoo Search Algorithm: An Introduction
Cuckoo Search Algorithm: An IntroductionCuckoo Search Algorithm: An Introduction
Cuckoo Search Algorithm: An IntroductionXin-She Yang
 
Introduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchIntroduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchAhmed BESBES
 
Introduction
IntroductionIntroduction
Introductionbutest
 
BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...
BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...
BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...Laurent Duval
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习AdaboostShocky1
 
Swift for tensorflow
Swift for tensorflowSwift for tensorflow
Swift for tensorflow규영 허
 

Similaire à Derivative Free Optimization (20)

Theories of continuous optimization
Theories of continuous optimizationTheories of continuous optimization
Theories of continuous optimization
 
Dynamic Programming
Dynamic ProgrammingDynamic Programming
Dynamic Programming
 
Noisy optimization --- (theory oriented) Survey
Noisy optimization --- (theory oriented) SurveyNoisy optimization --- (theory oriented) Survey
Noisy optimization --- (theory oriented) Survey
 
Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms
 
numdoc
numdocnumdoc
numdoc
 
Q-Metrics in Theory and Practice
Q-Metrics in Theory and PracticeQ-Metrics in Theory and Practice
Q-Metrics in Theory and Practice
 
Q-Metrics in Theory And Practice
Q-Metrics in Theory And PracticeQ-Metrics in Theory And Practice
Q-Metrics in Theory And Practice
 
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Googl...
 
Metaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical AnalysisMetaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical Analysis
 
Automatic bayesian cubature
Automatic bayesian cubatureAutomatic bayesian cubature
Automatic bayesian cubature
 
Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve FittingScientific Computing with Python Webinar 9/18/2009:Curve Fitting
Scientific Computing with Python Webinar 9/18/2009:Curve Fitting
 
Open GL 04 linealgos
Open GL 04 linealgosOpen GL 04 linealgos
Open GL 04 linealgos
 
Cuckoo Search Algorithm: An Introduction
Cuckoo Search Algorithm: An IntroductionCuckoo Search Algorithm: An Introduction
Cuckoo Search Algorithm: An Introduction
 
Introduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchIntroduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from Scratch
 
Introduction
IntroductionIntroduction
Introduction
 
Randomized algorithms ver 1.0
Randomized algorithms ver 1.0Randomized algorithms ver 1.0
Randomized algorithms ver 1.0
 
BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...
BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...
BEADS : filtrage asymétrique de ligne de base (tendance) et débruitage pour d...
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习Adaboost
 
Alg1
Alg1Alg1
Alg1
 
Swift for tensorflow
Swift for tensorflowSwift for tensorflow
Swift for tensorflow
 

Dernier

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

Derivative Free Optimization

  • 1. DERIVATIVE-FREE OPTIMIZATION http://www.lri.fr/~teytaud/dfo.pdf (or Quentin's web page ?) Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège using also Slides from A. Auger
  • 2. The next slide is the most important of all. Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 3. In case of trouble, Interrupt me. Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 4. In case of trouble, Interrupt me. Further discussion needed: - R82A, Montefiore institute - olivier.teytaud@inria.fr - or after the lessons (the 25 th , not the 18th) Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 5.
  • 6. I. Optimization and DFO II. Evolutionary algorithms III. From math. programming IV. Using machine learning V. Conclusions Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 7. Derivative-free optimization of f Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 8. Derivative-free optimization of f Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 9. Derivative-free optimization of f No gradient ! Only depends on the x's and f(x)'s Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 10. Derivative-free optimization of f Why derivative free optimization ?
  • 11. Derivative-free optimization of f Why derivative free optimization ? Ok, it's slower
  • 12. Derivative-free optimization of f Why derivative free optimization ? Ok, it's slower But sometimes you have no derivative
  • 13. Derivative-free optimization of f Why derivative free optimization ? Ok, it's slower But sometimes you have no derivative It's simpler (by far) ==> less bugs
  • 14. Derivative-free optimization of f Why derivative free optimization ? Ok, it's slower But sometimes you have no derivative It's simpler (by far) It's more robust (to noise, to strange functions...)
  • 15. Derivative-free optimization of f Optimization algorithms ==> Newton optimization ? Why derivative free ==> Quasi-Newton (BFGS) Ok, it's slower But sometimes you have no derivative ==> Gradient descent It's simpler (by far) ==> ...robust (to noise, to strange functions...) It's more
  • 16. Derivative-free optimization of f Optimization algorithms Why derivative free optimization ? Ok, it's slower Derivative-free optimization But sometimes you have no derivative (don't need gradients) It's simpler (by far) It's more robust (to noise, to strange functions...)
  • 17. Derivative-free optimization of f Optimization algorithms Why derivative free optimization ? Derivative-free optimization Ok, it's slower But sometimes you have no derivative Comparison-based optimization (coming soon), It's simpler (by far)comparisons, just needing It's more robust (to noise, to strange functions...) incuding evolutionary algorithms
  • 18. I. Optimization and DFO II. Evolutionary algorithms III. From math. programming IV. Using machine learning V. Conclusions Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 19. II. Evolutionary algorithms a. Fundamental elements b. Algorithms c. Math. analysis Olivier Teytaud Inria Tao, en visite dans la belle ville de Liège
  • 20. Preliminaries: - Gaussian distribution - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution - Markov chains ==> for theoretical analysis
  • 21. Preliminaries: - Gaussian distribution - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution - Markov chains
  • 22. K exp( - p(x) ) with - p(x) a degree 2 polynomial (neg. dom coef) - K a normalization constant Preliminaries: - Gaussian distribution - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution - Markov chains
  • 23. K exp( - p(x) ) with - p(x) a degree 2 polynomial (neg. dom coef) - K a normalization constant Translation of the Preliminaries:Gaussian Sze of the - Gaussian distribution Gaussian - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution - Markov chains
  • 24. Preliminaries: - Gaussian distribution - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution
• 25. Preliminaries: - Gaussian distribution - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution - Markov chains. Isotropic case: density = K exp( - || x - μ ||² / (2σ²) ) ==> level sets are rotationally invariant ==> completely defined by μ and σ (do you understand why K is fixed by σ ?) ==> "isotropic" Gaussian (general case on the next slides)
  • 26. Preliminaries: - Gaussian distribution - Multivariate Gaussian distribution - Non-isotropic Gaussian distribution
• 27. Step-size different on each axis: K exp( - p(x) ) with - p(x) a quadratic form tending to +∞ - K a normalization constant
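As a concrete illustration of these preliminaries, here is a minimal NumPy sketch (ours, not from the slides; the numbers are arbitrary): an isotropic Gaussian has one global σ, the axis-wise case has one step-size per coordinate, and the general non-isotropic case has a full covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
d, mu = 2, np.zeros(2)

# isotropic: x = mu + sigma * N(0, I); level sets are spheres
sigma = 0.5
x_iso = mu + sigma * rng.standard_normal(d)

# step-size different on each axis: level sets are axis-aligned ellipsoids
sigmas = np.array([0.5, 2.0])
x_axis = mu + sigmas * rng.standard_normal(d)

# general non-isotropic: full covariance C; level sets are arbitrary ellipsoids
C = np.array([[1.0, 0.8], [0.8, 1.0]])
x_full = rng.multivariate_normal(mu, C)
```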
  • 28. Notions that we will see: - Evolutionary algorithm - Cross-over - Truncation selection / roulette wheel - Linear / log-linear convergence - Estimation of Distribution Algorithm - EMNA - Self-adaptation - (1+1)-ES with 1/5th rule - Voronoi representation - Non-isotropy
• 29. Comparison-based optimization. Observation: we want robustness w.r.t. monotone transformations of the fitness values. Opt is comparison-based if its iterates depend on f only through comparisons of fitness values (formalized on the next slides).
• 30. Comparison-based optimization. y(i) = f(x(i)); Opt is comparison-based if it uses the y(i) only through their pairwise comparisons.
• 31. Population-based comparison-based algorithms ? X(1) = ( x(1,1), x(1,2), ..., x(1,λ) ) = Opt(); X(2) = ( x(2,1), x(2,2), ..., x(2,λ) ) = Opt(X(1), signs of diff); ... X(n) = ( x(n,1), x(n,2), ..., x(n,λ) ) = Opt(X(n-1), signs of diff) ==> let's write it for λ = 2.
• 32. Population-based comparison-based algorithms ? x(1) = (x(1,1), x(1,2)) = Opt(); x(2) = (x(2,1), x(2,2)) = Opt(x(1), sign(y(1,1)-y(1,2)) ); ... x(n) = (x(n,1), x(n,2)) = Opt(x(n-1), sign(y(n-1,1)-y(n-1,2)) ), with y(i,j) = f( x(i,j) ).
• 33. Population-based comparison-based algorithms ? Abstract notations: x(i) is a population. x(1) = Opt(); x(2) = Opt(x(1), sign(y(1,1)-y(1,2)) ); ... x(n) = Opt(x(n-1), sign(y(n-1,1)-y(n-1,2)) ).
• 34. Population-based comparison-based algorithms ? Abstract notations: x(i) is a population, I(i) is an internal state of the algorithm. x(1), I(1) = Opt(); x(2), I(2) = Opt(x(1), sign(y(1,1)-y(1,2)), I(1) ); ... x(n), I(n) = Opt(x(n-1), sign(y(n-1,1)-y(n-1,2)), I(n-1) ).
• 35. Population-based comparison-based algorithms ? Abstract notations, writing c(i) for the vector of comparison signs: x(1), I(1) = Opt(); x(2), I(2) = Opt(x(1), c(1), I(1) ); ... x(n), I(n) = Opt(x(n-1), c(n-1), I(n-1) ).
• 36. Comparison-based optimization ==> same behavior on many functions: a comparison-based Opt behaves identically on f and on g∘f for any increasing g.
• 37. Comparison-based optimization ==> same behavior on many functions. Quasi-Newton methods are very poor on such transformed functions.
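A minimal sketch of the invariance this buys (ours, not from the slides; pure random search is used here as the simplest comparison-based method): run with the same random seed, it produces exactly the same iterates on f and on any increasing transform g∘f, because only the comparisons are used.

```python
import numpy as np

def random_search(f, iters=200, seed=42):
    """Pure random search: f enters only through comparisons."""
    rng = np.random.default_rng(seed)
    best = rng.standard_normal(3)
    for _ in range(iters):
        cand = rng.standard_normal(3)
        if f(cand) < f(best):   # only a comparison, never the value itself
            best = cand
    return best

f = lambda x: float(np.sum(x ** 2))
g_of_f = lambda x: np.exp(f(x))                 # increasing transform of f
print(random_search(f))
print(random_search(g_of_f))                    # identical: same seed, same comparisons
```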
• 38. Why comparison-based algorithms ? ==> more robust ==> this can be mathematically formalized: comparison-based optimizers are slow ( d · log || x(n) - x* || / n ~ constant, i.e. the convergence rate degrades linearly with the dimension d ) but robust (optimal for some worst-case analyses).
• 39. II. Evolutionary algorithms a. Fundamental elements b. Algorithms c. Math. analysis
• 40. Basic schema of an Evolution Strategy. Parameters: x, σ. Generate λ points around x ( x + σN, where N is a standard Gaussian).
• 41. Basic schema of an Evolution Strategy. Parameters: x, σ. Generate λ points around x ( x + σN ). Compute their λ fitness values.
• 42. Basic schema of an Evolution Strategy. Parameters: x, σ. Generate λ points around x ( x + σN ). Compute their λ fitness values. Select the μ best.
• 43. Basic schema of an Evolution Strategy. Parameters: x, σ. Generate λ points around x ( x + σN ). Compute their λ fitness values. Select the μ best. Let x = average of these μ best.
• 44. Basic schema of an Evolution Strategy. Parameters: x, σ. Generate λ points around x ( x + σN ). Compute their λ fitness values. Select the μ best. Let x = average of these μ best.
• 45. Obviously parallel: multi-cores, clusters, grids... Generate λ points around x. Compute their λ fitness values. Select the μ best. Let x = average of these μ best.
• 46. Obviously parallel. Really simple. Generate λ points around x. Compute their λ fitness values. Select the μ best. Let x = average of these μ best.
• 47. Not a negligible advantage. When I got access, for the first time, to a crucial industrial code of an important company, I believed that it would be clean and bug-free. (I was young.)
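The basic schema above, as a minimal Python sketch (a toy implementation under our own parameter choices; σ is kept fixed here, step-size adaptation comes on the following slides):

```python
import numpy as np

def basic_es(f, x0, sigma, lam=20, mu=5, iters=200, seed=0):
    """Minimal (mu/mu, lambda)-ES: sample lambda points around x,
    keep the mu best, set x to their average."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        pop = x + sigma * rng.standard_normal((lam, x.size))  # lambda offspring
        fitness = np.apply_along_axis(f, 1, pop)              # lambda evaluations (parallelizable)
        best = pop[np.argsort(fitness)[:mu]]                  # truncation selection
        x = best.mean(axis=0)                                 # recombination by averaging
    return x

print(basic_es(lambda x: np.sum(x ** 2), x0=np.ones(5), sigma=0.3))
```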
• 48. The (1+1)-ES with 1/5th rule. Parameters: x, σ. Generate 1 point x' around x ( x' = x + σN, where N is a standard Gaussian). Compute its fitness value. Keep the best (x or x'): x = best(x, x'), with σ = 2σ if x' is best, σ = 0.84σ otherwise.
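The same slide as a Python sketch; the 2 / 0.84 step-size multipliers are the slide's, the rest (loop length, test function) is our own:

```python
import numpy as np

def one_plus_one_es(f, x0, sigma, iters=500, seed=0):
    """(1+1)-ES with the slide's step-size update: sigma doubles after a
    success, shrinks by 0.84 after a failure (a 1/5th-rule variant)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(iters):
        xp = x + sigma * rng.standard_normal(x.size)  # one offspring
        fxp = f(xp)
        if fxp < fx:                                  # x' is best: accept, enlarge sigma
            x, fx, sigma = xp, fxp, 2.0 * sigma
        else:                                         # failure: keep x, shrink sigma
            sigma *= 0.84
    return x

print(one_plus_one_es(lambda x: np.sum(x ** 2), x0=np.ones(5), sigma=1.0))
```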
• 51. I select the μ = 3 best points
• 52. x = average of these μ = 3 best points
  • 53. Ok. Choosing an initial x is as in any algorithm. But how do I choose sigma ?
• 54. Ok. Choosing x is as in any algorithm. But how do I choose sigma ? Sometimes by human guess. But for large numbers of iterations, we can do better.
  • 58. log || xn – x* || ~ - C n
  • 59. Usually termed “linear convergence”, ==> but it's in log-scale. log || xn – x* || ~ - C n
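To spell out why this is called "linear" (a one-line restatement of the slide, not an extra result): linear decrease of the log-distance means geometric decrease of the distance itself.

```latex
\log \|x_n - x^*\| \sim -Cn
\quad\Longleftrightarrow\quad
\|x_n - x^*\| \approx \|x_0 - x^*\| \, e^{-Cn}.
```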
• 60. Examples of evolutionary algorithms
• 61. Estimation of Multivariate Normal Algorithm
• 62. Estimation of Multivariate Normal Algorithm
• 63. Estimation of Multivariate Normal Algorithm
• 64. Estimation of Multivariate Normal Algorithm
• 65. EMNA is usually non-isotropic
• 66. EMNA is usually non-isotropic
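A minimal sketch of EMNA (our own toy implementation; the regularization term is an assumption added for numerical stability, not from the slides): fit a Gaussian with mean and full covariance to the μ best points, then resample. The fitted covariance is exactly why EMNA is usually non-isotropic.

```python
import numpy as np

def emna(f, x0, lam=50, mu=10, iters=100, seed=0):
    """EMNA sketch: truncation selection, then re-estimate mean + covariance."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    cov = np.eye(mean.size)
    for _ in range(iters):
        pop = rng.multivariate_normal(mean, cov, size=lam)   # sample lambda points
        fitness = np.apply_along_axis(f, 1, pop)
        best = pop[np.argsort(fitness)[:mu]]                 # keep the mu best
        mean = best.mean(axis=0)                             # refit the Gaussian
        cov = np.cov(best, rowvar=False) + 1e-10 * np.eye(mean.size)  # regularize
    return mean

print(emna(lambda x: np.sum(x ** 2), x0=np.ones(4)))
```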
• 67. Self-adaptation (works in many frameworks)
• 68. Self-adaptation (works in many frameworks) Can be used for non-isotropic multivariate Gaussian distributions.
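A sketch of mutative self-adaptation (our own minimal variant; the choice τ = 1/sqrt(2d) and the geometric-mean inheritance of the step-size are common conventions, assumed here rather than taken from the slides): each offspring first mutates its own step-size, then its position, and the step-sizes of the selected offspring are inherited.

```python
import numpy as np

def sa_es(f, x0, sigma0=1.0, lam=20, mu=5, iters=200, seed=0):
    """Self-adaptive ES: sigma_i = sigma * exp(tau * N) per offspring."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    sigma, tau = sigma0, 1.0 / np.sqrt(2.0 * d)
    for _ in range(iters):
        sigmas = sigma * np.exp(tau * rng.standard_normal(lam))    # mutate step-sizes first
        pop = x + sigmas[:, None] * rng.standard_normal((lam, d))  # then mutate positions
        idx = np.argsort(np.apply_along_axis(f, 1, pop))[:mu]      # truncation selection
        x = pop[idx].mean(axis=0)
        sigma = np.exp(np.mean(np.log(sigmas[idx])))               # inherit selected sigmas
    return x, sigma

print(sa_es(lambda x: np.sum(x ** 2), np.ones(5)))
```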
  • 69. Let's generalize. We have seen algorithms which work as follows: - we keep one search point in memory (and one step-size) - we generate individuals - we evaluate these individuals - we regenerate a search point and a step-size Maybe we could keep more than one search point ?
• 70. Let's generalize. We have seen algorithms which work as follows: - we keep one search point in memory (and one step-size) ==> μ search points - we generate individuals ==> λ generated individuals - we evaluate these individuals - we regenerate a search point and a step-size. Maybe we could keep more than one search point ?
• 71. Parameters: x1,...,xμ, σ. Generate λ points around x1,...,xμ, e.g. each new x randomly generated from two points. Compute their λ fitness values. Select the μ best. Don't average... (Obviously parallel. Really simple.)
• 72. Generate λ points around x1,...,xμ, e.g. each new x randomly generated from two points.
• 73. Generate λ points around x1,...,xμ, e.g. each new x randomly generated from two points. This is a cross-over.
• 74. Generate λ points around x1,...,xμ, e.g. each new x randomly generated from two points. This is a cross-over. Example of procedure for generating a point (see the sketch below): - Randomly draw k parents x1,...,xk (truncation selection: randomly among selected individuals) - For generating the ith coordinate of the new individual z: u = random(1,k), z(i) = x(u)(i).
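The generation procedure of slide 74 as a short sketch (array shapes and names are ours):

```python
import numpy as np

def uniform_crossover(parents, rng):
    """One child: each coordinate comes from a uniformly chosen parent."""
    k, d = parents.shape
    u = rng.integers(0, k, size=d)        # u = random(1, k), per coordinate
    return parents[u, np.arange(d)]       # z(i) = x(u)(i)

rng = np.random.default_rng(0)
parents = rng.standard_normal((2, 6))     # two selected individuals
print(uniform_crossover(parents, rng))
```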
• 75. Let's summarize: We have seen a general scheme for optimization: - generate a population (e.g. from some distribution, or from a set of search points) - select the best = new search points ==> Small difference between an Evolutionary Algorithm (EA) and an Estimation of Distribution Algorithm (EDA). ==> Some EAs (older than the EDA acronym) are EDAs.
• 76. Let's summarize: We have seen a general scheme for optimization: - generate a population (EDA: from some distribution; EA: from a set of search points) - select the best = new search points ==> Small difference between an Evolutionary Algorithm (EA) and an Estimation of Distribution Algorithm (EDA). ==> Some EAs (older than the EDA acronym) are EDAs.
• 77. Gives a lot of freedom: - choose your representation and operators (depending on the problem) - if you have a step-size, choose an adaptation rule - choose your population size λ (depending on your computer/grid) - choose μ (carefully), e.g. μ = min(dimension, λ/4)
• 78. Gives a lot of freedom: - choose your operators (depending on the problem) - if you have a step-size, choose an adaptation rule - choose your population size λ (depending on your computer/grid) - choose μ (carefully), e.g. μ = min(dimension, λ/4). Can handle strange things: - optimize a physical structure ? - structure represented as a Voronoi diagram - cross-over makes sense, benefits from local structure - not so many algorithms can work on that
  • 79. Voronoi representation: - a family of points
  • 80. Voronoi representation: - a family of points
  • 81. Voronoi representation: - a family of points - their labels
  • 82. Voronoi representation: - a family of points - their labels ==> cross-over makes sense ==> you can optimize a shape
  • 83. Voronoi representation: - a family of points - their labels ==> cross-over makes sense ==> you can optimize a shape ==> not that mathematical; but really useful Mutations: each label is changed with proba 1/n Cross-over: each point/label is randomly drawn from one of the two parents
  • 84. Voronoi representation: - a family of points - their labels ==> cross-over makes sense ==> you can optimize a shape ==> not that mathematical; but really useful Mutations: each label is changed with proba 1/n Cross-over: randomly pick one split in the representation: - left part from parent 1 - right part from parent 2 ==> related to biology
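A sketch of the Voronoi representation and its operators (hypothetical sizes and names; the operators follow slides 83-84: each label mutated with probability 1/n, and one-point cross-over taking the left part from parent 1 and the right part from parent 2):

```python
import numpy as np

rng = np.random.default_rng(0)

# A Voronoi-represented shape: n sites in [0,1]^2, each with a binary label
# (material / no material); a point of the domain takes the label of its
# nearest site.
n = 10
sites = rng.random((n, 2))
labels = rng.integers(0, 2, size=n)

def mutate(sites, labels, rng):
    """Each label is flipped with probability 1/n."""
    flip = rng.random(labels.size) < 1.0 / labels.size
    return sites.copy(), np.where(flip, 1 - labels, labels)

def crossover(parent1, parent2, rng):
    """One-point cross-over: left part from parent 1, right from parent 2."""
    (s1, l1), (s2, l2) = parent1, parent2
    cut = rng.integers(1, l1.size)
    return (np.vstack([s1[:cut], s2[cut:]]),
            np.concatenate([l1[:cut], l2[cut:]]))

child = crossover((sites, labels), mutate(sites, labels, rng), rng)
```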
• 85. Gives a lot of freedom: - choose your operators (depending on the problem) - if you have a step-size, choose an adaptation rule - choose your population size λ (depending on your computer/grid) - choose μ (carefully), e.g. μ = min(dimension, λ/4). Can handle strange things: - optimize a physical structure ? - structure represented as a Voronoi diagram - cross-over makes sense, benefits from local structure - not so many algorithms can work on that
• 86. II. Evolutionary algorithms a. Fundamental elements b. Algorithms c. Math. analysis
• 87. Consider the (1+1)-ES: x(n) = x(n-1) or x(n-1) + σ(n-1)N. We want to maximize: - E log || x(n) - x* ||.
• 88. Consider the (1+1)-ES: x(n) = x(n-1) or x(n-1) + σ(n-1)N. We want to maximize the expected progress: - E log ( || x(n) - x* || / || x(n-1) - x* || ).
• 89. But we don't know x*, so how can we optimize this ? We will observe the acceptance rate, and we will deduce whether σ is too large or too small.
• 90. - E log ( || x(n) - x* || / || x(n-1) - x* || ) ON THE NORM FUNCTION. (Figure: accepted vs. rejected mutations around the current point.)
• 91. For each step-size, evaluate this "expected progress rate" and evaluate "P(acceptance)". (Figure: accepted vs. rejected mutations.)
• 92. (Figure: progress rate as a function of the acceptance rate; rejected mutations shrink the acceptance rate.)
• 93. We want to be here (maximal progress rate)! We observe (approximately) this variable: the acceptance rate.
• 94. Big step-size ==> low acceptance rate.
• 95. Small step-size ==> high acceptance rate.
• 96. Small acceptance rate ==> decrease sigma.
• 97. Big acceptance rate ==> increase sigma.
• 98. The 1/5th rule. Based on maths showing that: good step-size <==> success rate ≈ 1/5.
• 99. I. Optimization and DFO II. Evolutionary algorithms III. From math. programming IV. Using machine learning V. Conclusions
• 100. III. From math. programming ==> pattern search methods. Comparison with ES: - code more complicated - same rate - deterministic - less robust
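A minimal compass-search sketch, one classical instance of a pattern search method (our toy version; real pattern search implementations are more careful about polling order and mesh updates):

```python
import numpy as np

def compass_search(f, x0, step=1.0, tol=1e-8, max_iter=100000):
    """Poll the 2d axis directions; move to the first improving point,
    otherwise halve the step (refine the mesh)."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(x.size):
            for s in (+1.0, -1.0):
                cand = x.copy()
                cand[i] += s * step
                fc = f(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5     # no improving direction: refine the mesh
    return x

print(compass_search(lambda x: float(np.sum(x ** 2)), np.ones(3)))
```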
• 101. III. From math. programming. Also: - Nelder-Mead algorithm (similar to pattern search, better constant in the rate)
• 102. III. From math. programming. Also: - Nelder-Mead algorithm (similar to pattern search, better constant in the rate) - NEWUOA (using value functions and not only comparisons)
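Nelder-Mead is available off the shelf, e.g. in SciPy. NEWUOA itself is not shipped with SciPy (implementations exist elsewhere), so the sketch below substitutes SciPy's Powell method as a related derivative-free baseline; that substitution is our choice, not the slides':

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: np.sum((x - 1.0) ** 2)

# Nelder-Mead: derivative-free simplex method
res = minimize(f, x0=np.zeros(4), method='Nelder-Mead')
print(res.x)

# Powell: another derivative-free method bundled with SciPy
res = minimize(f, x0=np.zeros(4), method='Powell')
print(res.x)
```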
• 103. I. Optimization and DFO II. Evolutionary algorithms III. From math. programming IV. Using machine learning V. Conclusions
• 104. IV. Using machine learning. What if computing f takes days ? ==> parallelism ==> and "learn" an approximation of f
  • 105. IV. Using machine learning Statistical tools: f ' (x) = approximation ( x, x1,f(x1), x2,f(x2), … , xn,f(xn)) y(n+1) = f ' (x(n+1) ) e.g. f' = quadratic function closest to f on the x(i)'s.
  • 106. IV. Using machine learning ==> keyword “surrogate models” ==> use f' instead of f ==> periodically, re-use the real f
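A minimal 1-d sketch of the surrogate idea of slides 104-106 (our own toy: the "expensive" f here is fake, and a real loop would re-fit the surrogate and re-check the true f periodically): fit the quadratic closest to f on the x(i)'s, then optimize the cheap surrogate instead of f.

```python
import numpy as np

def fit_quadratic_1d(xs, ys):
    """Least-squares surrogate f'(x) ~ a x^2 + b x + c: the quadratic
    function closest to f on the x(i)'s."""
    A = np.vstack([xs ** 2, xs, np.ones_like(xs)]).T
    return np.linalg.lstsq(A, ys, rcond=None)[0]

def expensive_f(x):                       # stands for a days-long simulation
    return (x - 3.0) ** 2

xs = np.array([0.0, 1.0, 2.0, 4.0])
ys = expensive_f(xs)                      # the only true evaluations so far
a, b, c = fit_quadratic_1d(xs, ys)
x_next = -b / (2.0 * a)                   # minimize the cheap surrogate f'
y_true = expensive_f(x_next)              # periodically re-use the real f
xs, ys = np.append(xs, x_next), np.append(ys, y_true)
print(x_next, y_true)
```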
• 107. I. Optimization and DFO II. Evolutionary algorithms III. From math. programming IV. Using machine learning V. Conclusions
• 108. Derivative-free optimization is fun. ==> nice maths ==> nice applications + easily parallelized algorithms ==> can handle really complicated domains (mixed continuous / integer, optimization on sets of programs) Yet, it is often suboptimal on highly structured problems (when BFGS is easy to use, thanks to fast gradients)
  • 109. Keywords, readings ==> cross-entropy (so close to evolution strategies) ==> genetic programming (evolutionary algorithms for automatically building programs) ==> H.-G. Beyer's book on ES = good starting point ==> many resources on the web ==> keep in mind that representation / operators are often the key ==> we only considered isotropic algorithms; sometimes not a good idea at all