Introduction       Mixed-Norm-Elasticnet-MKL   Mini-max             Lp -MKL               Conclusion           References

Fast Convergence Rate of Multiple Kernel Learning with Elastic-Net Regularization

April 25, 2011

Outline

1. Introduction
   MKL
2. Mixed-Norm-Elasticnet-MKL
   Mixed-Elasticnet-MKL
3. Mini-max
4. Lp-MKL
5. Conclusion

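MKL

Kernel $k(x, x')$ $\Leftrightarrow$ reproducing kernel Hilbert space (RKHS) $H_k$

\[
\hat{f} \leftarrow \min_{f \in H_k} \frac{1}{n} \sum_{i=1}^{n} \ell(y_i, f(x_i)) + C \|f\|_{H_k}
\]

\[
\exists\, \alpha_i \in \mathbb{R} \quad \text{s.t.} \quad \hat{f}(x) = \sum_{i=1}^{n} \alpha_i k(x_i, x)
\]
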
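The kernel expansion $\hat{f}(x) = \sum_i \alpha_i k(x_i, x)$ can be illustrated with a minimal Python sketch. The Gaussian kernel, its bandwidth, the training inputs `xs`, and the coefficients `alphas` are all assumptions chosen for illustration; the slides do not fix a kernel or say how the coefficients are obtained.

```python
import math


def gaussian_kernel(x, z, bandwidth=1.0):
    """Gaussian kernel on scalars (an assumed choice, not taken from the slides)."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def kernel_expansion(alphas, xs, kernel):
    """Return the function f(x) = sum_i alphas[i] * kernel(xs[i], x)."""
    if len(alphas) != len(xs):
        raise ValueError("alphas and xs must have the same length")

    def f(x):
        return sum(a * kernel(xi, x) for a, xi in zip(alphas, xs))

    return f


# Hypothetical training inputs and coefficients (illustrative values only).
xs = [0.0, 1.0, 2.0]
alphas = [0.5, -1.0, 0.25]
f_hat = kernel_expansion(alphas, xs, gaussian_kernel)
print(f_hat(1.0))
```

In practice the coefficients would come from solving the regularized problem above; here they are fixed by hand only to show the form of the solution.
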
Challenge

Multiple Kernel Learning

Multiple Kernel Learning

Single Kernel Learning

\[
\hat{f} \leftarrow \min_{f \in H_k} \frac{1}{n} \sum_{i=1}^{n} \ell(y_i, f(x_i)) + C \|f\|_{H_k}
\]

Multiple Kernel Learning (Lanckriet et al., 2004; Bach et al., 2004)

\[
\hat{f} = \sum_{m=1}^{M} \hat{f}_m \leftarrow \min_{f_m \in H_m} \frac{1}{n} \sum_{i=1}^{n} \ell\Big(y_i, \sum_{m=1}^{M} f_m(x_i)\Big) + C \sum_{m=1}^{M} \|f_m\|_{H_m}
\]

($H_m$: the RKHS associated with the kernel $k_m$)

Group Lasso

(Sonnenburg et al., 2006; Rakotomamonjy et al., 2008; Suzuki & Tomioka, 2009)

L1-MKL (Lanckriet et al., 2004; Bach et al., 2004)

\[
\min_{f_m \in H_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C \sum_{m=1}^{M} \|f_m\|_{H_m}
\]

L2-MKL

\[
\min_{f_m \in H_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C \sum_{m=1}^{M} \|f_m\|_{H_m}^2
\]

Elasticnet-MKL (Tomioka & Suzuki, 2009)

\[
\min_{f_m \in H_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C_1 \sum_{m=1}^{M} \|f_m\|_{H_m} + C_2 \sum_{m=1}^{M} \|f_m\|_{H_m}^2
\]

Mixed-Norm-Elasticnet-MKL (Meier et al., 2009)

\[
\min_{f_m \in H_m} L\Big(\sum_{m=1}^{M} f_m\Big) + C_1 \sum_{m=1}^{M} \sqrt{\|f_m\|_n^2 + C_2 \|f_m\|_{H_m}^2} + C_3 \sum_{m=1}^{M} \|f_m\|_{H_m}^2
\]

where $\|f\|_n^2 := \frac{1}{n} \sum_{i=1}^{n} f(x_i)^2$.

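To compare the four regularizers side by side, the following Python sketch evaluates each penalty term from given per-kernel norms. The function names and the numeric inputs are hypothetical; the sketch only evaluates the penalties and assumes the norms $\|f_m\|_{H_m}$ and $\|f_m\|_n$ have already been computed elsewhere.

```python
import math


def l1_mkl_penalty(rkhs_norms, C):
    """C * sum_m ||f_m||_{H_m}."""
    return C * sum(rkhs_norms)


def l2_mkl_penalty(rkhs_norms, C):
    """C * sum_m ||f_m||_{H_m}^2."""
    return C * sum(h ** 2 for h in rkhs_norms)


def elasticnet_mkl_penalty(rkhs_norms, C1, C2):
    """C1 * sum_m ||f_m||_{H_m} + C2 * sum_m ||f_m||_{H_m}^2."""
    return l1_mkl_penalty(rkhs_norms, C1) + l2_mkl_penalty(rkhs_norms, C2)


def mixed_norm_elasticnet_penalty(empirical_norms, rkhs_norms, C1, C2, C3):
    """C1 * sum_m sqrt(||f_m||_n^2 + C2 ||f_m||_{H_m}^2) + C3 * sum_m ||f_m||_{H_m}^2."""
    if len(empirical_norms) != len(rkhs_norms):
        raise ValueError("need one empirical norm and one RKHS norm per kernel")
    mixed = sum(
        math.sqrt(e ** 2 + C2 * h ** 2)
        for e, h in zip(empirical_norms, rkhs_norms)
    )
    return C1 * mixed + C3 * sum(h ** 2 for h in rkhs_norms)


# Hypothetical norms for M = 3 kernels; the second component is inactive.
rkhs_norms = [3.0, 0.0, 4.0]
empirical_norms = [4.0, 0.0, 3.0]

print(l1_mkl_penalty(rkhs_norms, C=1.0))
print(l2_mkl_penalty(rkhs_norms, C=1.0))
print(elasticnet_mkl_penalty(rkhs_norms, C1=1.0, C2=0.5))
print(mixed_norm_elasticnet_penalty(empirical_norms, rkhs_norms, 1.0, 1.0, 0.5))
```
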
Mixed-Norm-Elasticnet-MKL

Regression with the squared loss:

\[
L(f) = \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2
\]

\[
f^*(x) = \sum_{m=1}^{M} f_m^*(x) \quad (= E[Y \mid x])
\]

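Convergence rates of $\|\hat{f} - f^*\|_{L_2}^2$, where $d = |\{m \mid \|f_m^*\|_{H_m} \neq 0\}|$.

L1-MKL (Koltchinskii & Yuan, 2008):

\[
O_p\Big( d^{\frac{1-s}{1+s}} n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n} \Big)
\]

Mixed-Norm-Elasticnet-MKL (Meier et al., 2009): mini-max

\[
O_p\Big( d \Big(\frac{\log(M)}{n}\Big)^{\frac{1}{1+s}} \Big)
\]

Mixed-Norm-L1-MKL (Koltchinskii & Yuan, 2010): mini-max, with penalty $\sum_m (C_1 \|f_m\|_n + C_2 \|f_m\|_{H_m})$

\[
O_p\Big( d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n} \Big)
\]

Mini-max (Raskutti et al., 2009):

\[
O_p\Big( d n^{-\frac{1}{1+s}} + \frac{d \log(M/d)}{n} \Big)
\]
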
Mixed-Norm-Elasticnet-MKL

\[
\|\hat{f} - f^*\|_{L_2}^2 = O_p\Big( d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n} \Big).
\]



                                   f∗                q
                                   f∗      “       ”R2
                   ℓ2                      mini-max               ℓ∞




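Method                 (q)            Ball                  Rate
K&Y (2008)             q = 1          ?                     $d^{\frac{1-s}{1+s}} n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$
Meier et al. (2009)    q = 0                                $d \big(\frac{\log(M)}{n}\big)^{\frac{1}{1+s}}$
K&Y (2010)             q = 0          $\ell_\infty$-ball    $d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$
IBIS2010               0 ≤ q ≤ 1      $\ell_\infty$-ball    $d n^{-\frac{1+q}{1+q+s}} + \frac{d \log(M)}{n}$
IBIS2010               0 ≤ q ≤ 1      $\ell_2$-ball         $\big(\frac{d}{n}\big)^{\frac{1+q}{1+q+s}} R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n}$
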
\[
I_0 := \{ m \mid \|f_m^*\|_{H_m} \neq 0 \}
\]

\[
\|f_m^*\|_{H_m} > 0 \ (m \in I_0), \qquad \|f_m^*\|_{H_m} = 0 \ (m \in I_0^c).
\]

$d = |I_0|$

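Spectrum Condition (s)

$0 < s < 1$:

Mercer expansion:

\[
k_m(x, x') = \sum_{\ell=1}^{\infty} \mu_{\ell,m} \phi_{\ell,m}(x) \phi_{\ell,m}(x')
\]

where $\{\phi_{\ell,m}\}_{\ell=1}^{\infty}$ is an ONS in $L_2(P)$.

Spectrum Condition (s)
There exists $0 < s < 1$ such that

\[
\mu_{\ell,m} \leq C \ell^{-\frac{1}{s}} \quad (\forall \ell, m).
\]

$s$: complexity of the RKHS (a smaller $s$ means faster eigenvalue decay).

Proposition (Steinwart et al. (2009))

\[
\mu_{\ell,m} \sim \ell^{-\frac{1}{s}} \iff N(B(H_m), \epsilon, L_2(P)) \sim \epsilon^{-2s}
\]
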
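As a rough numerical illustration of the decay rate in Spectrum Condition (s), the sketch below recovers $s$ from a sequence of eigenvalues by a least-squares fit on the log-log scale. This is only a heuristic under the assumption that the eigenvalues follow an exact power law $\mu_\ell = \ell^{-1/s}$; the condition itself is an upper bound, and the function name `estimate_s` and the synthetic spectrum are hypothetical.

```python
import math


def estimate_s(eigenvalues):
    """Fit log(mu_l) = a - (1/s) * log(l) by least squares and return s.

    eigenvalues[0] is taken to be mu_1, eigenvalues[1] is mu_2, and so on.
    """
    n = len(eigenvalues)
    if n < 2:
        raise ValueError("need at least two eigenvalues")
    xs = [math.log(l) for l in range(1, n + 1)]
    ys = [math.log(mu) for mu in eigenvalues]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("eigenvalues do not decay")
    return -1.0 / slope


# Synthetic spectrum mu_l = l^(-2), which corresponds to s = 1/2.
mu = [l ** (-2.0) for l in range(1, 51)]
s_hat = estimate_s(mu)
print(s_hat)
```
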
Convolution Condition (q)

$0 \leq q \leq 1$: $f^*$

$\Sigma_m : H_m \to H_m$ is defined by $\langle f, \Sigma_m g \rangle_{H_m} := E[f(X) g(X)]$.

Convolution Condition (q) (Caponnetto & de Vito, 2007)
There exist $0 \leq q \leq 1$ and $g_m^* \in H_m$ such that

\[
f_m^* = \Sigma_m^{q/2} g_m^*.
\]

With $k_m^{(q/2)}(x, x') := \sum_{\ell=1}^{\infty} \mu_{\ell,m}^{q/2} \phi_{\ell,m}(x) \phi_{\ell,m}(x')$,

\[
f_m^*(x) = \int k_m^{(q/2)}(x, x') g_m^*(x') \, dP(x').
\]

s and q

[Figure: three panels showing $f^*$: (a) s, q = 0; (b) s, q > 0; (c) s, q > 0]

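Incoherence Condition

Incoherence Condition (Koltchinskii & Yuan, 2008; Meier et al., 2009)
There exists a constant $C$ such that

\[
0 < C < \kappa(I_0) (1 - \rho^2(I_0)).
\]

\[
\kappa(I) := \sup\Big\{ \kappa \geq 0 \;\Big|\; \kappa \leq \frac{\|\sum_{m \in I} f_m\|_{L_2}^2}{\sum_{m \in I} \|f_m\|_{L_2}^2}, \ \forall f_m \in H_m \ (m \in I) \Big\},
\]

\[
\rho(I) := \sup\Big\{ \frac{\langle f_I, g_{I^c} \rangle_{L_2}}{\|f_I\|_{L_2} \|g_{I^c}\|_{L_2}} \;\Big|\; f_I \in H_I, \ g_{I^c} \in H_{I^c}, \ f_I \neq 0, \ g_{I^c} \neq 0 \Big\}.
\]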

          I0                                                      .


Basic Condition
- $E[Y \mid X] = f^*(X) = \sum_{m=1}^{M} f_m^*(X)$.
- The noise $\epsilon := Y - f^*(X)$ is bounded: $|\epsilon| \leq L$.
- $\sup_{X \in \mathcal{X}} |k_m(X, X)| \leq 1$ $(\forall m)$.

∞-norm Bound Condition
For the $s$ of Spectrum Condition (s),

\[
\|f_m\|_\infty \leq C \|f_m\|_{L_2(P)}^{1-s} \|f_m\|_{H_m}^{s}.
\]

Gaussian, Sobolev: Mendelson and Neeman (2010); Steinwart et al. (2009)

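Mixed-Elasticnet-MKL

Mixed-Norm-Elasticnet-MKL

\[
\min_{f_m \in H_m} L\Big(\sum_{m=1}^{M} f_m\Big) + \lambda_1^{(n)} \sum_{m=1}^{M} \sqrt{\|f_m\|_n^2 + \lambda_2^{(n)} \|f_m\|_{H_m}^2} + \lambda_3^{(n)} \sum_{m=1}^{M} \|f_m\|_{H_m}^2.
\]

Theorem (Suzuki et al. (2011))
Suppose Spectrum Condition (s), Convolution Condition (q), Incoherence Condition, Basic Condition, and ∞-norm Bound Condition hold. Then, for sufficiently large $n$ and suitably chosen $\lambda_1^{(n)}, \lambda_2^{(n)}, \lambda_3^{(n)}$,

\[
\|\hat{f} - f^*\|_{L_2}^2 \leq C' \Big( d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_{2,g^*}^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n} \Big) \eta(t)^2,
\]

with probability 1 − e^{−√nt} − e^{−√n} (∀t ≥ 1).

Here $\eta(t) := \max(\sqrt{t}, t/\sqrt{n})$ and $R_{2,g^*}$ is defined by

\[
R_{2,g^*} := \Big( \sum_{m=1}^{M} \|g_m^*\|_{H_m}^2 \Big)^{\frac{1}{2}}.
\]
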
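The dependence of the theorem's bound on $d$, $n$, $M$, $s$, $q$ and $R_{2,g^*}$ can be explored numerically with a short Python sketch. The constant $C'$ is not specified in the slides, so it is exposed as a hypothetical parameter `C` defaulting to 1, and the function names are illustrative; the output is therefore meaningful only up to that constant.

```python
import math


def eta(t, n):
    """eta(t) = max(sqrt(t), t / sqrt(n))."""
    return max(math.sqrt(t), t / math.sqrt(n))


def r2_norm(g_norms):
    """R_{2,g*} = (sum_m ||g*_m||_{H_m}^2)^(1/2), given the per-kernel norms."""
    return math.sqrt(sum(g ** 2 for g in g_norms))


def rate_bound(d, n, M, s, q, R, t=1.0, C=1.0):
    """Right-hand side of the bound, with the unknown constant C' set to C (assumed)."""
    a = (1.0 + q) / (1.0 + q + s)
    main = d ** a * n ** (-a) * R ** (2.0 * s / (1.0 + q + s))
    sparsity = d * math.log(M) / n
    return C * (main + sparsity) * eta(t, n) ** 2


# Hypothetical problem sizes: 4 active kernels out of 100, each with ||g*_m|| = 1.
R = r2_norm([1.0] * 4)
for n in (1_000, 10_000, 100_000):
    print(n, rate_bound(d=4, n=n, M=100, s=0.5, q=0.0, R=R))
```

Increasing `q` (a smoother truth) or decreasing `s` (a simpler RKHS) makes the first term decay faster in `n`, which is the trade-off the exponent $(1+q)/(1+q+s)$ encodes.
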
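Bound

For $q = 0$:

Koltchinskii and Yuan (2010):

\[
d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}.
\]

Mixed-Norm-Elasticnet-MKL:

\[
d^{\frac{1+q}{1+q+s}} n^{-\frac{1+q}{1+q+s}} R_{2,g^*}^{\frac{2s}{1+q+s}} + \frac{d \log(M)}{n}.
\]

1. $\|f_m^*\|_{H_m} = 1$ $(m = 1, \dots, d)$: the bound becomes $d n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$, the same as Koltchinskii and Yuan (2010).
2. $\|f_m^*\|_{H_m} = m^{-1}$ $(m = 1, \dots, d)$: the bound becomes $d^{\frac{1}{1+s}} n^{-\frac{1}{1+s}} + \frac{d \log(M)}{n}$, whose first term is smaller than that of Koltchinskii and Yuan (2010) by a factor of $d^{\frac{s}{1+s}}$ (the factor equals 1 at $s = 0$).
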
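The two cases can be checked numerically. The sketch below evaluates only the first (non-logarithmic) term of each bound for $q = 0$, ignoring constants; the function names and the values of `d`, `n`, and `s` are hypothetical choices for illustration.

```python
import math


def elasticnet_main_term(d, n, s, f_norms):
    """d^(1/(1+s)) * n^(-1/(1+s)) * R^(2s/(1+s)) with R^2 = sum_m ||f*_m||^2 (q = 0)."""
    r_squared = sum(h ** 2 for h in f_norms)
    return d ** (1.0 / (1.0 + s)) * n ** (-1.0 / (1.0 + s)) * r_squared ** (s / (1.0 + s))


def ky2010_main_term(d, n, s):
    """d * n^(-1/(1+s)), the first term of the Koltchinskii and Yuan (2010) rate."""
    return d * n ** (-1.0 / (1.0 + s))


d, n, s = 100, 10_000, 0.5

# Case 1: all active components have unit RKHS norm.
case1 = elasticnet_main_term(d, n, s, [1.0] * d)

# Case 2: norms decay as 1/m, so R^2 stays below pi^2 / 6 for every d.
case2 = elasticnet_main_term(d, n, s, [1.0 / m for m in range(1, d + 1)])

baseline = ky2010_main_term(d, n, s)
print(case1, baseline)      # the two values agree in case 1
print(baseline / case2)     # the gain in case 2, of order d^(s/(1+s))
```

Because $\sum_m m^{-2}$ is bounded by $\pi^2/6$, the gain in case 2 grows like $d^{s/(1+s)}$ up to a constant factor.
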
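Mini-max

          Mini-max rate, with $f^*_m = \Sigma_m^{q/2} g^*_m$:

          1. $\left(\sum_{m=1}^{M} \|g^*_m\|_{H_m}^2\right)^{1/2} \le R_2$ ($g^*$ in the $\ell_2$-ball of radius $R_2$):
                 $d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M/d)}{n}$

          2. $\max_m \|g^*_m\|_{H_m} \le R_\infty$ ($g^*$ in the $\ell_\infty$-ball of radius $R_\infty$):
                 $d\, n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}} + \frac{d \log(M/d)}{n}$
             For q = 0 and $R_\infty = 1$ this is $d\, n^{-\frac{1}{1+s}} + \frac{d \log(M/d)}{n}$, which differs from the bound of
             Koltchinskii and Yuan (2010) only in having log(M/d) in place of log(M).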
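
The two mini-max rates above are linked by a short calculation. The following is a sketch under one added assumption that the slide does not state explicitly: $g^*_m = 0$ for every $m$ outside the $d$ non-zero components. Under that assumption an $\ell_\infty$-ball of radius $R_\infty$ sits inside an $\ell_2$-ball of radius $\sqrt{d}\, R_\infty$, and substituting $R_2 = \sqrt{d}\, R_\infty$ into the $\ell_2$ rate gives back the $\ell_\infty$ rate.

```latex
\begin{align*}
\Bigl(\sum_{m=1}^{M} \|g^*_m\|_{H_m}^2\Bigr)^{1/2}
  &\le \sqrt{d}\,\max_m \|g^*_m\|_{H_m} \le \sqrt{d}\,R_\infty , \\
d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\,
  \bigl(\sqrt{d}\,R_\infty\bigr)^{\frac{2s}{1+q+s}}
  &= d^{\frac{1+q}{1+q+s}}\, d^{\frac{s}{1+q+s}}\,
     n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}} \\
  &= d\, n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}} .
\end{align*}
```

The $\ell_2$ rate is therefore never worse than the $\ell_\infty$ rate in this setting, and it is strictly smaller whenever $R_2$ is much smaller than $\sqrt{d}\, R_\infty$, for instance when the norms $\|g^*_m\|_{H_m}$ decay in $m$.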


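Lp-MKL

          Lp-MKL (Kloft et al., 2009):
              $\min_{f_m \in H_m} L\left(\sum_{m=1}^{M} f_m\right) + \lambda_1^{(n)} \sum_{m=1}^{M} \|f_m\|_{H_m}^p$

          $\eta(t) := \max(\sqrt{t},\, t/\sqrt{n})$,   $R_p := \left(\sum_{m=1}^{M} \|f^*_m\|_{H_m}^p\right)^{1/p}$

          Theorem (Lp-MKL)
          Assume the Spectrum Condition (s), Incoherence Condition, Basic Condition, and ∞-norm Bound Condition, and set
              $\lambda_1^{(n)} = n^{-\frac{1}{1+s}}\, M^{1-\frac{2s}{p(1+s)}}\, R_p^{-\frac{2}{1+s}}$.
          Then
              $\|\hat{f} - f^*\|_{L_2}^2 \le C \left( n^{-\frac{1}{1+s}}\, M^{1-\frac{2s}{p(1+s)}}\, R_p^{\frac{2s}{1+s}} + \frac{M \log(M)}{n} \right) \eta(t)^2$
          with probability at least $1 - \exp(-t) - \exp(-\sqrt{n})$.

          Mini-max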
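
To make the Lp-MKL theorem concrete, the quantities in it can be evaluated directly. The following is a minimal sketch, not from the original slides: the function names, the sample values of n, M, s and the component norms are illustrative assumptions, and the unspecified constant C is set to 1 as a placeholder.

```python
import math

def eta(t, n):
    """eta(t) = max(sqrt(t), t / sqrt(n))."""
    return max(math.sqrt(t), t / math.sqrt(n))

def r_p_norm(norms, p):
    """R_p = (sum_m ||f*_m||^p)^(1/p) for a list of RKHS norms."""
    return sum(x**p for x in norms) ** (1 / p)

def lp_mkl_lambda(n, M, s, p, r_p):
    """Regularization parameter lambda_1^(n) as stated in the theorem."""
    return n ** (-1 / (1 + s)) * M ** (1 - 2 * s / (p * (1 + s))) * r_p ** (-2 / (1 + s))

def lp_mkl_leading_term(n, M, s, p, r_p):
    """First term inside the parentheses of the bound."""
    return n ** (-1 / (1 + s)) * M ** (1 - 2 * s / (p * (1 + s))) * r_p ** (2 * s / (1 + s))

def lp_mkl_bound(n, M, s, p, r_p, t, C=1.0):
    """Right-hand side of the bound; C = 1.0 is a placeholder, not a known value."""
    log_term = M * math.log(M) / n
    return C * (lp_mkl_leading_term(n, M, s, p, r_p) + log_term) * eta(t, n) ** 2

# Illustrative setting (assumptions): M kernels whose true norms decay as 1/m.
n, M, s, t = 5_000, 50, 0.5, 1.0
norms = [1.0 / m for m in range(1, M + 1)]

for p in (1.0, 4 / 3, 2.0):
    r_p = r_p_norm(norms, p)
    lam = lp_mkl_lambda(n, M, s, p, r_p)
    print(f"p={p:.3f}  R_p={r_p:.4f}  lambda={lam:.6f}  "
          f"bound={lp_mkl_bound(n, M, s, p, r_p, t):.6f}")
```

One consistency check on the reconstruction: $\lambda_1^{(n)} R_p^2$ equals the leading term of the bound, since $R_p^{-2/(1+s)} \cdot R_p^{2} = R_p^{2s/(1+s)}$.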
Conclusion



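                   Mixed-Norm-Elasticnet-MKL
                     convergence rate that depends on the parameter q of f* (Convolution Condition)
                     matches the mini-max rate over the ℓ2-ball, up to log(M) versus log(M/d)
                   Lp-MKL
                     convergence rate

                   arXiv: http://arxiv.org/abs/1103.0431
                   slide: http://www.simplex.t.u-tokyo.ac.jp/~s-taiji/data/IBISML2011.pdf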




References

          Bach, F., Lanckriet, G., & Jordan, M. (2004). Multiple kernel learning,
            conic duality, and the SMO algorithm. Proceedings of the 21st
            International Conference on Machine Learning (pp. 41–48).
          Caponnetto, A., & de Vito, E. (2007). Optimal rates for regularized
            least-squares algorithm. Foundations of Computational Mathematics,
            7, 331–368.
          Kloft, M., Brefeld, U., Sonnenburg, S., Laskov, P., Müller, K.-R., & Zien,
            A. (2009). Efficient and accurate ℓp-norm multiple kernel learning.
            Advances in Neural Information Processing Systems 22 (pp.
            997–1005). Cambridge, MA: MIT Press.
          Koltchinskii, V., & Yuan, M. (2008). Sparse recovery in large ensembles
            of kernel machines. Proceedings of the Annual Conference on Learning
            Theory (pp. 229–238).
          Koltchinskii, V., & Yuan, M. (2010). Sparsity in multiple kernel learning.
            The Annals of Statistics, 38, 3660–3695.
          Lanckriet, G., Cristianini, N., Ghaoui, L. E., Bartlett, P., & Jordan, M.
            (2004). Learning the kernel matrix with semi-definite programming.
            Journal of Machine Learning Research, 5, 27–72.
          Meier, L., van de Geer, S., & Bühlmann, P. (2009). High-dimensional
            additive modeling. The Annals of Statistics, 37, 3779–3821.
          Mendelson, S., & Neeman, J. (2010). Regularization in kernel learning.
            The Annals of Statistics, 38, 526–565.
          Rakotomamonjy, A., Bach, F., Canu, S., & Grandvalet, Y. (2008). SimpleMKL.
            Journal of Machine Learning Research, 9, 2491–2521.
          Raskutti, G., Wainwright, M., & Yu, B. (2009). Lower bounds on
            minimax rates for nonparametric regression with additive sparsity and
            smoothness. Advances in Neural Information Processing Systems 22
            (pp. 1563–1570). Cambridge, MA: MIT Press.
          Sonnenburg, S., Rätsch, G., Schäfer, C., & Schölkopf, B. (2006). Large
            scale multiple kernel learning. Journal of Machine Learning Research,
            7, 1531–1565.
          Steinwart, I., Hush, D., & Scovel, C. (2009). Optimal rates for
            regularized least squares regression. Proceedings of the Annual
            Conference on Learning Theory (pp. 79–93).
          Suzuki, T., & Tomioka, R. (2009). SpicyMKL. arXiv:0909.5026.
          Suzuki, T., Tomioka, R., & Sugiyama, M. (2011). Fast convergence rate
            of multiple kernel learning with elastic-net regularization.
            arXiv:1103.0431.
          Tomioka, R., & Suzuki, T. (2009). Sparsity-accuracy trade-off in MKL.
            NIPS 2009 Workshop: Understanding Multiple Kernel Learning
            Methods. Whistler. arXiv:1001.2615.

Jokyokai2

  • 1. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . . . Fast Convergence Rate of Multiple Kernel Learning with Elastic-Net Regularization . .. . . † † ‡ † ‡ 2011 4 25 . . . . . .
  • 2. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Outline . . . Introduction 1 MKL . . . Mixed-Norm-Elasticnet-MKL 2 Mixed-Elasticnet-MKL . . . Mini-max 3 . . . Lp -MKL 4 . . . Conclusion 5 . . . . . .
  • 3. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Outline . . . Introduction 1 MKL . . . Mixed-Norm-Elasticnet-MKL 2 Mixed-Elasticnet-MKL . . . Mini-max 3 . . . Lp -MKL 4 . . . Conclusion 5 . . . . . .
  • 4. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . MKL (RKHS) k(x, x ′ ) ⇔ Hk 1∑ n f ← min ˆ ℓ(yi , f (xi )) + C ∥f ∥Hk f ∈Hk n i=1 ∑ n ∃αi ∈ R s.t. ˆ f (x) = αi k(xi , x) i=1 . . . . . .
  • 5. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . MKL Challenge , , , Multiple Kernel Leaning . . . . . .
  • 6. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . MKL Multiple Kernel Learning Single Kernel Learning 1∑ n f ← min ˆ ℓ(yi , f (xi )) + C ∥f ∥Hk f ∈Hk n i=1 Multiple Kernel Learning (Lanckriet et al., 2004; Bach et al., 2004) ( ) ∑M ∑n ∑M ∑M ˆ= f ˆ ← min 1 fm ℓ yi , fm (xi ) + C ∥fm ∥Hm fm ∈Hm n m=1 m=1i=1 m=1 (Hm : km RKHS) Group Lasso (Sonnenburg et al., 2006; Rakotomamonjy et al., 2008; Suzuki & Tomioka, 2009) . . . . . .
  • 7. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . MKL L1 -MKL (Lanckriet et al., 2004; Bach et al., 2004) ( M ) ∑ ∑M min L fm + C ∥fm ∥Hm fm ∈Hm m=1 m=1 L2 -MKL ( ) ∑ M ∑ M min L fm +C ∥fm ∥2 m H fm ∈Hm m=1 m=1 . . . . . .
  • 8. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . MKL L1 -MKL (Lanckriet et al., 2004; Bach et al., 2004) ( M ) ∑ ∑M min L fm + C ∥fm ∥Hm fm ∈Hm m=1 m=1 L2 -MKL ( ) ∑ M ∑ M min L fm +C ∥fm ∥2 m H fm ∈Hm m=1 m=1 Elasticnet-MKL (Tomioka & Suzuki, 2009) ( M ) ∑ ∑M ∑ M min L fm + C1 ∥fm ∥Hm + C2 ∥fm ∥2 m H fm ∈Hm m=1 m=1 m=1 Mixed-Norm-Elasticnet-MKL (Meier et al., 2009) ( M ) ∑ ∑√ M ∑ M min L fm + C1 ∥fm ∥2 + C2 ∥fm ∥2 m + C3 n H ∥fm ∥2 m H fm ∈Hm m=1 m=1 m=1 ∑n ∥f ∥2 n := 1 n i=1 2 f (xi ) . . . . . . .
  • 9. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . MKL L1 -MKL (Lanckriet et al., 2004; Bach et al., 2004) ( M ) ∑ ∑M min L fm + C ∥fm ∥Hm fm ∈Hm m=1 m=1 L2 -MKL ( ) ∑ M ∑ M min L fm +C ∥fm ∥2 m H fm ∈Hm m=1 m=1 Elasticnet-MKL (Tomioka & Suzuki, 2009) ( M ) ∑ ∑M ∑ M min L fm + C1 ∥fm ∥Hm + C2 ∥fm ∥2 m H fm ∈Hm m=1 m=1 m=1 Mixed-Norm-Elasticnet-MKL (Meier et al., 2009) ( M ) ∑ ∑√ M ∑ M min L fm + C1 ∥fm ∥2 + C2 ∥fm ∥2 m + C3 n H ∥fm ∥2 m H fm ∈Hm m=1 m=1 m=1 ∑n ∥f ∥2 n := 1 n i=1 2 f (xi ) . . . . . . .
  • 10. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Mixed-Norm-Elasticnet-MKL regression 1∑ n L(f ) = (f (xi ) − yi )2 n i=1 ∑ M f ∗ (x) = ∗ fm (x)(= E[Y |x]) m=1 . . . . . .
  • 11. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . ∥f − f ∗ ∥2 2 ˆ L d ∗ d=|{m | ∥fm ∥Hm̸=0}|. L1 -MKL (Koltchinskii & Yuan, 2008): ( ) d log(M) 1+s n − 1+s + 1−s 1 Op d n Mixed-Norm-Elasticnet-MKL (Meier et al., 2009): mini-max ( ( ) 1 ) log(M) 1+s Op d n Mixed-Norm-L1 -MKL (Koltchinskii & Yuan, 2010): mini-max ∑ m (C1 ∥fm ∥n + C2 ∥fm ∥Hm ) ( ) d log(M) Op dn− 1+s + 1 n Mini-max (Raskutti et al., 2009) ( ) − 1+s 1 d log(M/d) Op dn + n . . . . . .
  • 12. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Mixed-Norm-Elasticnet-MKL ( ) 1+q 1+q 2s d log(M) ∥f − f ∗ ∥2 2 = Op d 1+q+s n− 1+q+s R21+q+s + ˆ L . n f∗ q f∗ “ ”R2 ℓ2 mini-max ℓ∞ . . . . . .
  • 13. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . (q) 1−s 1 K&Y (2008) q=1 ? d 1+s n− 1+s + d log(M) ( ) 1 n log(M) 1+s Meier et al. (2009) q=0 d n 1 K&Y (2010) q=0 ℓ∞ -ball dn− 1+s + d log(M) n 1+q − 1+q+s IBIS2010 0≤q≤1 ℓ∞ -ball dn + d log(M) n ( d ) 1+q+s 1+q+s 1+q 2s 0≤q≤1 ℓ2 -ball n R2 + d log(M) n . . . . . .
  • 14. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Outline . . . Introduction 1 MKL . . . Mixed-Norm-Elasticnet-MKL 2 Mixed-Elasticnet-MKL . . . Mini-max 3 . . . Lp -MKL 4 . . . Conclusion 5 . . . . . .
  • 15. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . ∗ I0 := {m | ∥fm ∥Hm ̸= 0} ∗ ∥fm ∥Hm > 0 (m ∈ I0 ), ∗ ∥fm ∥Hm = 0 (m ∈ I0 ). c d = |I0 | ( ) . . . . . .
  • 16. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Spectrum Condition (s) 0 < s < 1: Mercer ∑∞ km (x, x ′ ) = ℓ=1 µℓ,m ϕℓ,m (x)ϕℓ,m (x ′ ) {ϕℓ,m }∞ L2 (P) ℓ=1 ONS. . Spectrum Condition (s) . .. 0<s<1 µℓ,m ≤ C ℓ− s 1 . (∀ℓ, m). .. . . s RKHS s s . Proposition (Steinwart et al. (2009)) . .. µℓ,m ∼ ℓ− s ⇔ N(B(Hm ), ϵ, L2 (P)) ∼ ϵ−2s 1 . .. . . . . . . . .
  • 17. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Convolution Condition (q) 0 ≤ q ≤ 1: f∗ Σm : Hm → Hm ⟨f , Σm g ⟩Hm := E[f (X )g (X )] . Convolution Condition (q) (Caponnetto & de Vito, 2007) . .. ∗ 0 ≤ q ≤ 1 gm ∈ Hm ∗ ∗ fm = Σq/2 gm m . .. . . ∑∞ q/2 km (x, x ′ ) := ℓ=1 µℓ,m ϕℓ,m (x)ϕℓ,m (x ′ ) (q/2) ∫ ∗ fm (x) = km (x, x ′ )gm (x ′ )dP(x ′ ), (q/2) ∗ . . . . . .
  • 18. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . s q f* f* f* (a) s q=0 (b) s q>0 (c) s q>0 . . . . . .
  • 19. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Incoherece Condition . Incoherece Condition (Koltchinskii & Yuan, 2008; Meier et al., 2009) . .. 0<C . 0 < C < κ(I0 )(1 − ρ2 (I0 )). .. . . { ∑ } ∥ m∈I fm ∥2 2 κ(I ) := sup κ ≥ 0 | κ ≤ ∑ L 2 , ∀fm ∈ Hm (m ∈ I ) , m∈I ∥fm ∥L2 { } ⟨fI , gI c ⟩L2 ρ(I ) := sup | fI ∈ HI , gI c ∈ HI c , fI ̸= 0, gI c ̸= 0 . ∥fI ∥L2 ∥gI c ∥L2 I0 . . . . . . .
  • 20. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . . Basic Condition . .. ∑M E[Y |X ] = f ∗ (X ) = m=1 fm (X ) ∗ ϵ := Y − f (X ) ∗ |ϵ| ≤ L. . supX ∈X |km (X , X )| ≤ 1 (∀m). .. . . . ∞-norm Bound Condition . .. Spectrum Condition (s) ∥fm ∥∞ ≤ C ∥fm ∥1−s ∥fm ∥s m . L2 (P) H . .. . . Gaussian Sobolev Mendelson and Neeman (2010); Steinwart et al. (2009) . . . . . .
  • 21. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Mixed-Elasticnet-MKL Mixed-Norm-Elasticnet-MKL ( ) ∑ M (n) ∑√ M (n) (n) ∑ M min L fm + λ1 ∥fm ∥2 + λ2 ∥fm ∥2 m + λ3 n H ∥fm ∥2 m . H fm ∈Hm m=1 m=1 m=1 . Theorem (Suzuki et al. (2011)) . .. Spectrum Condition (s), Convolution Condition (q), Incoherence Condition, Basic Condition, ∞-norm Bound Condition (n) (n) (n) n λ1 , λ2 , λ3 ( ) 1+q 1+q 2s d log(M) ∥f − f ∗ ∥2 2 ≤ C ′ d 1+q+s n− 1+q+s R2,g ∗ + ˆ L 1+q+s η(t)2 , n √ √ . 1 − e− nt − e− n (∀t ≥ 1) .. . . √ √ η(t) := max( t, t/ n) R2,g ∗ : ( )1 ∑ M ∗ 2 R 2,g ∗ := ∥gm ∥2 m H . m=1 . . . . . .
  • 22. Introduction Mixed-Norm-Elasticnet-MKL Mini-max Lp -MKL Conclusion References . . . . . . . . . . . . . . . . Mixed-Elasticnet-MKL Bound q=0 dn− 1+s + 1 d log(M) Koltchinskii and Yuan (2010) n . 1+q 1+q 2s d 1+q+s n− 1+q+s R2,g ∗ + 1+q+s d log(M) n . ... 1 ∗ ∥fm ∥Hm = 1 (m = 1, . . . , d): dn− 1+s + d log(M) 1 n Koltchinskii and Yuan (2010) ... 2 ∥fm ∥Hm = m−1 (m = 1, . . . , d): ∗ d 1+s n− 1+s + d log(M) 1 1 n s Koltchinskii and Yuan (2010) d 1+s (s = 0) . . . . . .
  • 24. Mini-max rate, with $f^*_m = \Sigma_m^{q/2} g^*_m$.
    Case 1, $\Big( \sum_{m=1}^{M} \|g^*_m\|_{H_m}^2 \Big)^{1/2} \le R_2$ ($g^*$ in the $\ell_2$ ball of radius $R_2$): $d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_2^{\frac{2s}{1+q+s}} + \frac{d \log(M/d)}{n}$.
    Case 2, $\max_m \|g^*_m\|_{H_m} \le R_\infty$ ($g^*$ in the $\ell_\infty$ ball of radius $R_\infty$): $d\, n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}} + \frac{d \log(M/d)}{n}$.
    With $q = 0$ and $R_\infty = 1$, the leading term of Case 2 is that of Koltchinskii and Yuan (2010).
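    As a consistency check between the two mini-max rates, assume (this is not stated on the slide) that $g^*$ has at most $d$ nonzero components. Then the $\ell_\infty$ constraint implies an $\ell_2$ constraint with radius $\sqrt{d}\, R_\infty$, and substituting that radius into the $\ell_2$ rate gives the $\ell_\infty$ rate:

```latex
\Big( \sum_{m=1}^{M} \|g^*_m\|_{H_m}^2 \Big)^{1/2} \le \sqrt{d}\, R_\infty
\quad\Longrightarrow\quad
d^{\frac{1+q}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, \big(\sqrt{d}\, R_\infty\big)^{\frac{2s}{1+q+s}}
= d^{\frac{1+q}{1+q+s} + \frac{s}{1+q+s}}\, n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}}
= d\, n^{-\frac{1+q}{1+q+s}}\, R_\infty^{\frac{2s}{1+q+s}} .
```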
  • 26. Lp-MKL (Kloft et al., 2009):
    $\min_{f_m \in H_m} L\Big(\sum_{m=1}^{M} f_m\Big) + \lambda_1^{(n)} \sum_{m=1}^{M} \|f_m\|_{H_m}^{p}$,
    with $\eta(t) := \max(\sqrt{t},\, t/\sqrt{n})$ and $R_p := \Big( \sum_{m=1}^{M} \|f^*_m\|_{H_m}^{p} \Big)^{1/p}$.
    Theorem (Lp-MKL): Assume the Spectrum Condition (s), Incoherence Condition, Basic Condition, and ∞-norm Bound Condition, and set $\lambda_1^{(n)} = n^{-\frac{1}{1+s}}\, M^{1 - \frac{2s}{p(1+s)}}\, R_p^{-\frac{2}{1+s}}$. Then
    $\|\hat{f} - f^*\|_{L_2}^2 \le C \Big( n^{-\frac{1}{1+s}}\, M^{1 - \frac{2s}{p(1+s)}}\, R_p^{\frac{2s}{1+s}} + \frac{M \log(M)}{n} \Big)\, \eta(t)^2$
    with probability $1 - \exp(-t) - \exp(-\sqrt{n})$.
    This rate is mini-max.
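    The Lp-MKL quantities are easy to tabulate. The sketch below assumes the reading $\lambda_1^{(n)} = n^{-1/(1+s)} M^{1 - 2s/(p(1+s))} R_p^{-2/(1+s)}$ of the garbled slide text and drops the constant $C$; the function names and the example parameters are hypothetical.

```python
import math

def eta(t, n):
    """eta(t) = max(sqrt(t), t / sqrt(n))."""
    return max(math.sqrt(t), t / math.sqrt(n))

def lp_mkl_lambda1(n, M, s, p, R_p):
    """Regularization parameter under the assumed reading of the slide."""
    return n ** (-1 / (1 + s)) * M ** (1 - 2 * s / (p * (1 + s))) * R_p ** (-2 / (1 + s))

def lp_mkl_rate(n, M, s, p, R_p, t=1.0):
    """Right-hand side of the Lp-MKL bound with the constant C dropped."""
    lead = n ** (-1 / (1 + s)) * M ** (1 - 2 * s / (p * (1 + s))) * R_p ** (2 * s / (1 + s))
    return (lead + M * math.log(M) / n) * eta(t, n) ** 2

n, M, s, p, R_p = 1000, 20, 0.5, 2.0, 3.0
lam1 = lp_mkl_lambda1(n, M, s, p, R_p)
print(lam1, lp_mkl_rate(n, M, s, p, R_p))
```

    Under this reading $\lambda_1^{(n)} R_p^2$ equals the leading term of the bound, since $-\frac{2}{1+s} + 2 = \frac{2s}{1+s}$.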
  • 28. Conclusion. Mixed-Norm-Elasticnet-MKL: convergence rate depending on the smoothness $q$ of $f^*$; mini-max rate on the $\ell_2$ ball. Lp-MKL: convergence rate. arXiv: http://arxiv.org/abs/1103.0431 slide: http://www.simplex.t.u-tokyo.ac.jp/~s-taiji/data/IBISML2011.pdf
  • 29. References
    Bach, F., Lanckriet, G., & Jordan, M. (2004). Multiple kernel learning, conic duality, and the SMO algorithm. The 21st International Conference on Machine Learning (pp. 41–48).
    Caponnetto, A., & de Vito, E. (2007). Optimal rates for regularized least-squares algorithm. Foundations of Computational Mathematics, 7, 331–368.
    Kloft, M., Brefeld, U., Sonnenburg, S., Laskov, P., Müller, K.-R., & Zien, A. (2009). Efficient and accurate ℓp-norm multiple kernel learning. Advances in Neural Information Processing Systems 22 (pp. 997–1005). Cambridge, MA: MIT Press.
    Koltchinskii, V., & Yuan, M. (2008). Sparse recovery in large ensembles of kernel machines. Proceedings of the Annual Conference on Learning Theory (pp. 229–238).
    Koltchinskii, V., & Yuan, M. (2010). Sparsity in multiple kernel learning. The Annals of Statistics, 38, 3660–3695.
    Lanckriet, G., Cristianini, N., Ghaoui, L. E., Bartlett, P., & Jordan, M. (2004). Learning the kernel matrix with semi-definite programming. Journal of Machine Learning Research, 5, 27–72.
    Meier, L., van de Geer, S., & Bühlmann, P. (2009). High-dimensional additive modeling. The Annals of Statistics, 37, 3779–3821.
  • 30. References (continued)
    Mendelson, S., & Neeman, J. (2010). Regularization in kernel learning. The Annals of Statistics, 38, 526–565.
    Rakotomamonjy, A., Bach, F., Canu, S., & Grandvalet, Y. (2008). SimpleMKL. Journal of Machine Learning Research, 9, 2491–2521.
    Raskutti, G., Wainwright, M., & Yu, B. (2009). Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness. Advances in Neural Information Processing Systems 22 (pp. 1563–1570). Cambridge, MA: MIT Press.
    Sonnenburg, S., Rätsch, G., Schäfer, C., & Schölkopf, B. (2006). Large scale multiple kernel learning. Journal of Machine Learning Research, 7, 1531–1565.
    Steinwart, I., Hush, D., & Scovel, C. (2009). Optimal rates for regularized least squares regression. Proceedings of the Annual Conference on Learning Theory (pp. 79–93).
    Suzuki, T., & Tomioka, R. (2009). SpicyMKL. arXiv:0909.5026.
    Suzuki, T., Tomioka, R., & Sugiyama, M. (2011). Fast convergence rate of multiple kernel learning with elastic-net regularization. arXiv:1103.0431.
  • 31. References (continued)
    Tomioka, R., & Suzuki, T. (2009). Sparsity-accuracy trade-off in MKL. NIPS 2009 Workshop: Understanding Multiple Kernel Learning Methods. Whistler. arXiv:1001.2615.