Signal Processing Course : Convex Optimization

Slides for a course on signal and image processing.

Convex Optimization for Imaging
Gabriel Peyré, www.numerical-tours.com

Convex Optimization
Setting: G : H → R ∪ {+∞}, with H a Hilbert space; here H = R^N.
Problem: min_{x ∈ H} G(x)
Class of functions:
• Convex: G(t x + (1−t) y) ≤ t G(x) + (1−t) G(y) for all t ∈ [0, 1].
• Lower semi-continuous: liminf_{x → x0} G(x) ≥ G(x0).
• Proper: {x ∈ H : G(x) ≠ +∞} ≠ ∅.
Indicator of a closed convex set C: ι_C(x) = 0 if x ∈ C, +∞ otherwise.

Example: ℓ1 Regularization
Inverse problem: measurements y = K f0 + w, with K : R^N → R^P, P ≪ N.
Model: f0 = Ψ x0 is sparse in a dictionary Ψ ∈ R^{N×Q}, Q ≥ N:
coefficients x ∈ R^Q → image f = Ψ x ∈ R^N → observations y = K f ∈ R^P, with Φ = K Ψ ∈ R^{P×Q}.
Sparse recovery: f* = Ψ x* where x* solves min_{x ∈ R^Q} (1/2)||y − Φ x||² + λ ||x||_1 (data fidelity + regularization).

Example: ℓ1 Regularization (Inpainting)
Inpainting: masking operator K, (K f)_i = f_i if i ∈ Ω and 0 if i ∈ Ω^c, so K : R^N → R^P with P = |Ω|.
Ψ ∈ R^{N×Q}: translation-invariant wavelet frame.
Original f0 = Ψ x0; observations y = Φ x0 + w; recovery Ψ x*.

Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward-Backward
• Douglas-Rachford
• Generalized Forward-Backward
• Duality

Sub-differential
Sub-differential: ∂G(x) = {u ∈ H : ∀ z, G(z) ≥ G(x) + ⟨u, z − x⟩}.
Example: G(x) = |x|, ∂G(0) = [−1, 1].
Smooth functions: if F is C^1, ∂F(x) = {∇F(x)}.
First-order conditions: x* ∈ argmin_{x ∈ H} G(x) ⟺ 0 ∈ ∂G(x*).
Monotone operator: U(x) = ∂G(x) is monotone: for all u ∈ U(x), v ∈ U(y), ⟨y − x, v − u⟩ ≥ 0.

Example: ℓ1 Regularization
x* ∈ argmin_{x ∈ R^Q} G(x) = (1/2)||y − Φ x||² + λ ||x||_1
∂G(x) = Φ*(Φ x − y) + λ ∂||·||_1(x), with ∂||·||_1(x)_i = {sign(x_i)} if x_i ≠ 0, [−1, 1] if x_i = 0.
Support of the solution: I = {i ∈ {0, …, N−1} : x*_i ≠ 0}.
First-order conditions: there exists s ∈ R^N with Φ*(Φ x* − y) + λ s = 0, s_I = sign(x*_I) and ||s_{I^c}||_∞ ≤ 1.

Example: Total Variation Denoising
Important: the optimization variable is f.
f* ∈ argmin_{f ∈ R^N} (1/2)||y − f||² + λ J(f)
Finite-difference gradient: ∇ : R^N → R^{N×2}, (∇f)_i ∈ R².
Discrete TV norm: J(f) = Σ_i ||(∇f)_i||, i.e. J(f) = G(∇f) with G(u) = Σ_i ||u_i||.
Composition with a linear map A: ∂(G ∘ A) = A* ∘ (∂G) ∘ A, hence ∂J(f) = −div(∂G(∇f)), where
∂G(u)_i = {u_i / ||u_i||} if u_i ≠ 0, and {v ∈ R² : ||v|| ≤ 1} if u_i = 0.
First-order conditions: there exists v ∈ R^{N×2} with f* = y + λ div(v), v_i = (∇f*)_i / ||(∇f*)_i|| for i ∈ I and ||v_i|| ≤ 1 for i ∈ I^c, where I = {i : (∇f*)_i ≠ 0}.

Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward-Backward
• Douglas-Rachford
• Generalized Forward-Backward
• Duality

Proximal Operators
Proximal operator of G: Prox_{λG}(x) = argmin_z (1/2)||x − z||² + λ G(z).
G(x) = ||x||_1 = Σ_i |x_i|:  Prox_{λG}(x)_i = max(0, 1 − λ/|x_i|) x_i (soft thresholding).
G(x) = ||x||_0 = |{i : x_i ≠ 0}|:  Prox_{λG}(x)_i = x_i if |x_i|² ≥ 2λ, 0 otherwise (hard thresholding).
G(x) = Σ_i log(1 + |x_i|²):  Prox_{λG}(x)_i is the root of a 3rd-order polynomial.

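These first two proxes are one-liners; a minimal NumPy sketch (the function names and the test vector are illustrative, not from the slides):

```python
import numpy as np

def prox_l1(x, lam):
    """Soft thresholding: prox of lam*||.||_1, applied entrywise (1e-12 guards 0/0)."""
    return np.maximum(0.0, 1.0 - lam / np.maximum(np.abs(x), 1e-12)) * x

def prox_l0(x, lam):
    """Hard thresholding: prox of lam*||.||_0 (keep entries with |x_i|^2 >= 2*lam)."""
    return np.where(np.abs(x) >= np.sqrt(2.0 * lam), x, 0.0)

x = np.array([-3.0, -0.5, 0.0, 0.2, 2.0])
print(prox_l1(x, 1.0))   # [-2., -0., 0., 0., 1.]
print(prox_l0(x, 1.0))   # [-3.,  0., 0., 0., 2.]
```
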
Proximal Calculus
Separability: G(x) = G1(x_1) + … + G_n(x_n)  ⟹  Prox_G(x) = (Prox_{G1}(x_1), …, Prox_{Gn}(x_n)).
Quadratic functionals: G(x) = (1/2)||Φ x − y||²  ⟹  Prox_{λG}(x) = (Id + λ Φ*Φ)^{−1}(x + λ Φ*y).
Composition with a tight frame (A A* = Id): Prox_{G ∘ A} = A* ∘ Prox_G ∘ A + Id − A*A.
Indicators: G(x) = ι_C(x)  ⟹  Prox_{λG}(x) = Proj_C(x) = argmin_{z ∈ C} ||x − z||.

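For the quadratic data term the prox is a single linear solve; a small sketch that checks the optimality condition on made-up data (Phi, y, lam are illustrative):

```python
import numpy as np

def prox_quadratic(x, Phi, y, lam):
    """Prox of lam/2 * ||Phi z - y||^2: solve (Id + lam Phi^T Phi) z = x + lam Phi^T y."""
    Q = Phi.shape[1]
    A = np.eye(Q) + lam * Phi.T @ Phi
    return np.linalg.solve(A, x + lam * Phi.T @ y)

rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 50))
y, x = rng.standard_normal(20), rng.standard_normal(50)
z = prox_quadratic(x, Phi, y, lam=0.7)
# optimality condition: z - x + lam * Phi^T (Phi z - y) = 0
print(np.linalg.norm(z - x + 0.7 * Phi.T @ (Phi @ z - y)))   # ~ machine precision
```
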
Prox and Subdifferential
Resolvent of ∂G: z = Prox_{λG}(x) ⟺ 0 ∈ z − x + λ ∂G(z) ⟺ x ∈ (Id + λ ∂G)(z) ⟺ z = (Id + λ ∂G)^{−1}(x).
Inverse of a set-valued mapping U: x ∈ U(y) ⟺ y ∈ U^{−1}(x); Prox_{λG} = (Id + λ ∂G)^{−1} is a single-valued mapping.
Fix point: x* ∈ argmin_x G(x) ⟺ 0 ∈ ∂G(x*) ⟺ x* ∈ (Id + λ ∂G)(x*) ⟺ x* = (Id + λ ∂G)^{−1}(x*) = Prox_{λG}(x*).

Gradient and Proximal Descents
Gradient descent: x^(ℓ+1) = x^(ℓ) − τ ∇G(x^(ℓ))  [explicit]; requires G ∈ C^1 with ∇G L-Lipschitz.
  Theorem: if 0 < τ < 2/L, then x^(ℓ) → x*, a solution.
Sub-gradient descent: x^(ℓ+1) = x^(ℓ) − τ_ℓ v^(ℓ), with v^(ℓ) ∈ ∂G(x^(ℓ)).
  Theorem: if τ_ℓ ∼ 1/ℓ, then x^(ℓ) → x*, a solution.  Problem: slow.
Proximal-point algorithm: x^(ℓ+1) = Prox_{τ_ℓ G}(x^(ℓ))  [implicit].
  Theorem: if τ_ℓ ≥ c > 0, then x^(ℓ) → x*, a solution.  Problem: Prox_{τG} may be hard to compute.

Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward-Backward
• Douglas-Rachford
• Generalized Forward-Backward
• Duality

Proximal Splitting Methods
Solve min_{x ∈ H} E(x).  Problem: Prox_{λE} is not available.
Splitting: E(x) = F(x) + Σ_i G_i(x), with F smooth and each G_i simple.
Iterative algorithms using only ∇F(x) and Prox_{λ G_i}(x):
• Forward-Backward: solves F + G
• Douglas-Rachford: solves Σ_i G_i
• Primal-Dual: solves Σ_i G_i ∘ A
• Generalized FB: solves F + Σ_i G_i

Smooth + Simple Splitting
Inverse problem: measurements y = K f0 + w, with K : R^N → R^P, P ≪ N.
Model: f0 = Ψ x0 sparse in the dictionary Ψ.
Sparse recovery: f* = Ψ x* where x* solves min_x F(x) + G(x), F smooth, G simple.
Data fidelity: F(x) = (1/2)||y − Φ x||², with Φ = K Ψ.
Regularization: G(x) = λ ||x||_1 = λ Σ_i |x_i|.

Forward-Backward
Fix point equation: x* ∈ argmin_x F(x) + G(x)   (★)
  ⟺ 0 ∈ ∇F(x*) + ∂G(x*)
  ⟺ (x* − τ ∇F(x*)) ∈ x* + τ ∂G(x*)
  ⟺ x* = Prox_{τG}(x* − τ ∇F(x*)).
Forward-backward iteration: x^(ℓ+1) = Prox_{τG}( x^(ℓ) − τ ∇F(x^(ℓ)) ).
Projected gradient descent: the special case G = ι_C.
Theorem: let ∇F be L-Lipschitz. If τ < 2/L, then x^(ℓ) → x*, a solution of (★).

Example: ℓ1 Regularization
min_x (1/2)||Φ x − y||² + λ ||x||_1  =  min_x F(x) + G(x)
F(x) = (1/2)||Φ x − y||²,  ∇F(x) = Φ*(Φ x − y),  L = ||Φ*Φ||.
G(x) = λ ||x||_1,  Prox_{τG}(x)_i = max(0, 1 − τλ/|x_i|) x_i.
Forward-backward = iterative soft thresholding (ISTA).

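Assembling the two steps gives iterative soft thresholding; a minimal sketch on synthetic data (matrix sizes, lam and the iteration count are illustrative choices):

```python
import numpy as np

def ista(Phi, y, lam, n_iter=200):
    """Forward-backward for min_x 0.5*||Phi x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    tau = 1.0 / L                            # any tau < 2/L works
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = x - tau * (Phi.T @ (Phi @ x - y))                     # forward (explicit) step
        x = np.sign(x) * np.maximum(np.abs(x) - tau * lam, 0.0)   # backward (prox) step
    return x

rng = np.random.default_rng(1)
Phi = rng.standard_normal((60, 200))
x0 = np.zeros(200); x0[rng.choice(200, 8, replace=False)] = rng.standard_normal(8)
y = Phi @ x0 + 0.01 * rng.standard_normal(60)
x = ista(Phi, y, lam=0.05)
print(np.count_nonzero(np.abs(x) > 1e-6))    # recovered support size
```
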
Convergence Speed
min_x E(x) = F(x) + G(x), with ∇F L-Lipschitz and G simple.
Theorem: the FB iterates x^(ℓ) satisfy E(x^(ℓ)) − E(x*) ≤ C/ℓ; the constant C degrades as L grows.

Multi-step Accelerations
Beck-Teboulle accelerated FB (FISTA): t^(0) = 1,
  x^(ℓ+1) = Prox_{G/L}( y^(ℓ) − (1/L) ∇F(y^(ℓ)) )
  t^(ℓ+1) = (1 + √(1 + 4 (t^(ℓ))²)) / 2
  y^(ℓ+1) = x^(ℓ+1) + ((t^(ℓ) − 1)/t^(ℓ+1)) (x^(ℓ+1) − x^(ℓ))
(see also Nesterov's method).
Theorem: E(x^(ℓ)) − E(x*) ≤ C/ℓ².
Complexity theory: this rate is optimal in a worst-case sense.

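The acceleration only adds the extrapolation sequence; a sketch in the same illustrative setting as the ISTA code above:

```python
import numpy as np

def fista(Phi, y, lam, n_iter=200):
    """Beck-Teboulle accelerated FB for min_x 0.5*||Phi x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    z = x.copy()                 # extrapolated point y^(l)
    t = 1.0
    for _ in range(n_iter):
        x_prev = x
        g = z - (1.0 / L) * (Phi.T @ (Phi @ z - y))              # gradient step at z
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)    # Prox_{G/L}
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x + ((t - 1.0) / t_next) * (x - x_prev)              # momentum
        t = t_next
    return x
```
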
Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward-Backward
• Douglas-Rachford
• Generalized Forward-Backward
• Duality

Douglas-Rachford Scheme
min_x G1(x) + G2(x)   (★), with G1 and G2 simple.
Douglas-Rachford iterations:
  z^(ℓ+1) = (1 − μ/2) z^(ℓ) + (μ/2) RProx_{γG2}( RProx_{γG1}(z^(ℓ)) )
  x^(ℓ+1) = Prox_{γG1}(z^(ℓ+1))
Reflexive prox: RProx_{γG}(x) = 2 Prox_{γG}(x) − x.
Theorem: if 0 < μ < 2 and γ > 0, then x^(ℓ) → x*, a solution of (★).

DR Fix Point Equation
min_x G1(x) + G2(x) ⟺ 0 ∈ ∂(G1 + G2)(x)
  ⟺ ∃ z: z − x ∈ γ ∂G1(x) and x − z ∈ γ ∂G2(x)
  ⟺ x = Prox_{γG1}(z) and x = Prox_{γG2}(2x − z) = Prox_{γG2}( RProx_{γG1}(z) )
  ⟹ z = 2 Prox_{γG2}( RProx_{γG1}(z) ) − (2x − z) = 2 Prox_{γG2}( RProx_{γG1}(z) ) − RProx_{γG1}(z)
  ⟺ z = RProx_{γG2}( RProx_{γG1}(z) )
  ⟺ z = (1 − μ/2) z + (μ/2) RProx_{γG2}( RProx_{γG1}(z) ).

Example: Constrained ℓ1
min_{Φx = y} ||x||_1  =  min_x G1(x) + G2(x)
G1(x) = ι_C(x) with C = {x : Φ x = y}:  Prox_{γG1}(x) = Proj_C(x) = x + Φ*(Φ Φ*)^{−1}(y − Φ x).
G2(x) = ||x||_1:  Prox_{γG2}(x)_i = max(0, 1 − γ/|x_i|) x_i.
Efficient whenever Φ Φ* is easy to invert.
Example: compressed sensing with a 100 × 400 Gaussian matrix Φ, y = Φ x0, ||x0||_0 = 17; the decay of log10(||x^(ℓ)||_1 − ||x*||_1) across iterations is shown for γ = 0.01, 1 and 10.

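A sketch of Douglas-Rachford on this constrained ℓ1 problem (basis pursuit); the 100 × 400 Gaussian setup mirrors the compressed-sensing example, the remaining choices (γ, μ, iteration count) are illustrative:

```python
import numpy as np

def dr_basis_pursuit(Phi, y, gamma=1.0, mu=1.0, n_iter=300):
    """Douglas-Rachford for min ||x||_1 s.t. Phi x = y, with G1 = i_C and G2 = ||.||_1."""
    pinv = Phi.T @ np.linalg.inv(Phi @ Phi.T)          # used by Proj_C
    proj_C = lambda x: x + pinv @ (y - Phi @ x)
    soft = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
    rprox = lambda p, x: 2.0 * p(x) - x
    z = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = (1 - mu / 2) * z + (mu / 2) * rprox(lambda v: soft(v, gamma),
                                                rprox(proj_C, z))
    return proj_C(z)                                    # x = Prox_{gamma G1}(z)

rng = np.random.default_rng(2)
Phi = rng.standard_normal((100, 400))
x0 = np.zeros(400); x0[rng.choice(400, 17, replace=False)] = rng.standard_normal(17)
y = Phi @ x0
x = dr_basis_pursuit(Phi, y)
print(np.linalg.norm(x - x0))      # recovery error
```
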
More than 2 Functionals
min_x G1(x) + … + G_k(x), where each G_i is simple
  ⟺ min_{(x_1, …, x_k) ∈ H^k} G(x_1, …, x_k) + ι_C(x_1, …, x_k)
with G(x_1, …, x_k) = G1(x_1) + … + G_k(x_k) and C = {(x_1, …, x_k) ∈ H^k : x_1 = … = x_k}.
G and ι_C are simple:
  Prox_{γG}(x_1, …, x_k) = (Prox_{γG_i}(x_i))_i
  Prox_{γι_C}(x_1, …, x_k) = (x̃, …, x̃) where x̃ = (1/k) Σ_i x_i.

Auxiliary Variables
min_x G1(x) + G2(A x), with A : H → E a linear map and G1, G2 simple
  ⟺ min_{z ∈ H × E} G(z) + ι_C(z), with G(x, y) = G1(x) + G2(y) and C = {(x, y) ∈ H × E : A x = y}.
Prox_{γG}(x, y) = (Prox_{γG1}(x), Prox_{γG2}(y))
Prox_{ι_C}(x, y) = (x − A*ỹ, y + ỹ) = (x̃, A x̃), where ỹ = (Id + A A*)^{−1}(A x − y) and x̃ = (Id + A*A)^{−1}(A*y + x).
Efficient whenever Id + A A* or Id + A*A is easy to invert.

Example: TV Regularization
min_f (1/2)||K f − y||² + λ ||∇f||_1, with ||u||_1 = Σ_i ||u_i||
  ⟺ min_{z = (f, u)} G(z) + ι_C(z), with G(f, u) = G2(f) + G1(u):
G1(u) = λ ||u||_1:  Prox_{γG1}(u)_i = max(0, 1 − γλ/||u_i||) u_i.
G2(f) = (1/2)||K f − y||²:  Prox_{γG2}(f) = (Id + γ K*K)^{−1}(f + γ K*y).
C = {(f, u) ∈ R^N × R^{N×2} : u = ∇f}:  Prox_{ι_C}(f, u) = (f̃, ∇f̃), where f̃ solves (Id − Δ) f̃ = −div(u) + f,
computed in O(N log N) operations using the FFT.

[Figure: original f0, degraded observations y = K f0 + w, the TV-regularized recovery f*, and the energy decay across iterations.]

Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward-Backward
• Douglas-Rachford
• Generalized Forward-Backward
• Duality

GFB Splitting
min_{x ∈ R^N} F(x) + Σ_{i=1}^n G_i(x)   (★), with F smooth and each G_i simple.
For i = 1, …, n:  z_i^(ℓ+1) = z_i^(ℓ) + Prox_{nγG_i}( 2x^(ℓ) − z_i^(ℓ) − γ ∇F(x^(ℓ)) ) − x^(ℓ)
x^(ℓ+1) = (1/n) Σ_{i=1}^n z_i^(ℓ+1)
Theorem: let ∇F be L-Lipschitz. If γ < 2/L, then x^(ℓ) → x*, a solution of (★).
Special cases: n = 1 gives forward-backward; F = 0 gives Douglas-Rachford.

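A direct transcription of the GFB iteration; the toy problem (quadratic F plus an ℓ1 penalty and a positivity constraint) and all parameter choices are illustrative:

```python
import numpy as np

def gfb(grad_F, L, proxs, x0, gamma=None, n_iter=300):
    """Generalized forward-backward for min F(x) + sum_i G_i(x).
    proxs[i](v, t) must return Prox_{t*G_i}(v)."""
    n = len(proxs)
    gamma = gamma if gamma is not None else 1.0 / L     # any gamma < 2/L
    x = x0.copy()
    z = [x0.copy() for _ in range(n)]
    for _ in range(n_iter):
        g = grad_F(x)
        for i in range(n):
            z[i] = z[i] + proxs[i](2 * x - z[i] - gamma * g, n * gamma) - x
        x = sum(z) / n
    return x

# toy problem: min 0.5*||Phi x - y||^2 + lam*||x||_1 + i_{x >= 0}
rng = np.random.default_rng(3)
Phi = rng.standard_normal((40, 80)); y = rng.standard_normal(40); lam = 0.1
grad_F = lambda x: Phi.T @ (Phi @ x - y)
L = np.linalg.norm(Phi, 2) ** 2
prox_l1  = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
prox_pos = lambda v, t: np.maximum(v, 0.0)     # projection, independent of t
x = gfb(grad_F, L, [prox_l1, prox_pos], np.zeros(80))
print(x.min(), np.count_nonzero(x > 1e-8))
```
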
GFB Fix Point
x* ∈ argmin_{x ∈ R^N} F(x) + Σ_i G_i(x)  ⟺  0 ∈ ∇F(x*) + Σ_i ∂G_i(x*)
  ⟺ ∃ y_i ∈ ∂G_i(x*) with ∇F(x*) + Σ_i y_i = 0
  ⟺ ∃ (z_i)_{i=1}^n:  ∀ i, x* − z_i − γ ∇F(x*) ∈ nγ ∂G_i(x*), with x* = (1/n) Σ_i z_i
     (take z_i = x* − γ ∇F(x*) − nγ y_i)
  ⟺ ∀ i, (2x* − z_i − γ ∇F(x*)) − x* ∈ nγ ∂G_i(x*)  ⟺  x* = Prox_{nγG_i}( 2x* − z_i − γ ∇F(x*) )
  ⟺ ∀ i, z_i = z_i + Prox_{nγG_i}( 2x* − z_i − γ ∇F(x*) ) − x*
  ⟹ a fix point equation on (x*, z_1, …, z_n).

Block Regularization
ℓ1−ℓ2 block sparsity: G(x) = Σ_{b ∈ B} ||x_[b]||, where ||x_[b]||² = Σ_{m ∈ b} x_m².
Non-overlapping decomposition: B = B_1 ∪ … ∪ B_n, so G(x) = Σ_{i=1}^n G_i(x) with G_i(x) = Σ_{b ∈ B_i} ||x_[b]||.
Each G_i is simple:  ∀ b ∈ B_i, ∀ m ∈ b,  Prox_{γG_i}(x)_m = max(0, 1 − γ/||x_[b]||) x_m.
[Figure: image f = Ψ x, its coefficients x, and the overlapping block structures B_1, B_2 used in the ℓ1−ℓ2 penalization.]

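The per-block prox is again a thresholding; a sketch for non-overlapping blocks stored as index arrays (the block layout and names are mine):

```python
import numpy as np

def prox_block_l1l2(x, blocks, gamma):
    """Prox of gamma * sum_b ||x_[b]||_2 for a list of non-overlapping index blocks."""
    out = x.copy()
    for b in blocks:
        nrm = np.linalg.norm(x[b])
        out[b] = max(0.0, 1.0 - gamma / max(nrm, 1e-12)) * x[b]   # block soft thresholding
    return out

x = np.arange(8, dtype=float)
blocks = [np.array([0, 1, 2, 3]), np.array([4, 5, 6, 7])]
print(prox_block_l1l2(x, blocks, gamma=2.0))
```
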
[Figure: deconvolution and deconvolution + inpainting experiments (N = 256) with ℓ1−ℓ2 block regularization; convergence curves log10(E(x^(ℓ)) − E(x*)) vs. iteration and timings for the EFB, PR and CP schemes; after 50 iterations, SNR 22.49 dB (deconvolution) and 21.80 dB (deconvolution + inpainting).]

Overview
• Subdifferential Calculus
• Proximal Calculus
• Forward-Backward
• Douglas-Rachford
• Generalized Forward-Backward
• Duality

Legendre-Fenchel Duality
Legendre-Fenchel transform: G*(u) = sup_{x ∈ dom(G)} ⟨u, x⟩ − G(x)
(graphically, −G*(u) is the intercept of the supporting line of slope u to the graph of G).
Example: quadratic functional G(x) = (1/2)⟨A x, x⟩ + ⟨x, b⟩  ⟹  G*(u) = (1/2)⟨u − b, A^{−1}(u − b)⟩.
Moreau's identity: Prox_{γG*}(x) = x − γ Prox_{G/γ}(x/γ);  hence G simple ⟺ G* simple.

Indicator and Homogeneous
Positively 1-homogeneous functional: G(λ x) = |λ| G(x).  Example: a norm, G(x) = ||x||.
Duality: G* = ι_{G°(·) ≤ 1}, where G°(u) = max_{G(x) ≤ 1} ⟨x, u⟩ is the dual functional (dual norm).
ℓp norms: G(x) = ||x||_p with 1 ≤ p ≤ +∞  ⟹  G°(x) = ||x||_q, where 1/p + 1/q = 1.
Example: proximal operator of the ℓ∞ norm, Prox_{λ||·||_∞} = Id − Proj_{||·||_1 ≤ λ}, with
Proj_{||·||_1 ≤ λ}(x)_i = max(0, 1 − μ/|x_i|) x_i for a well-chosen μ = μ(x, λ).

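Moreau's identity is easy to check numerically: with G = ||·||_1, (λG)* is the indicator of the ℓ∞ ball of radius λ, whose prox is a clip; a small sketch on an arbitrary test vector:

```python
import numpy as np

lam = 0.8
x = np.array([-2.0, -0.3, 0.1, 1.5, 3.0])

prox_l1   = np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)    # Prox_{lam*||.||_1}(x)
proj_linf = np.clip(x, -lam, lam)                            # Prox of (lam*||.||_1)* = i_{||.||_inf <= lam}
print(np.allclose(prox_l1 + proj_linf, x))                   # Moreau: the two proxes sum to x
```
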
Primal-Dual Formulation
Fenchel-Rockafellar duality: for a linear map A : H → L,
  min_{x ∈ H} G1(x) + G2(A x) = min_x G1(x) + sup_{u ∈ L} ⟨A x, u⟩ − G2*(u).
Strong duality: if 0 ∈ ri(dom(G2)) − A ri(dom(G1)), min and max can be exchanged:
  = max_u −G2*(u) + min_x G1(x) + ⟨x, A*u⟩ = max_u −G2*(u) − G1*(−A*u).
Recovering x* from a dual solution u*: x* = argmin_x G1(x) + ⟨x, A*u*⟩
  ⟺ −A*u* ∈ ∂G1(x*)  ⟺  x* ∈ (∂G1)^{−1}(−A*u*) = ∂G1*(−A*u*).

Forward-Backward on the Dual
If G1 is strongly convex (∇²G1 ≥ c Id):
  G1(t x + (1−t) y) ≤ t G1(x) + (1−t) G1(y) − (c/2) t(1−t) ||x − y||².
Then x* is uniquely defined, x* = ∇G1*(−A*u*), and G1* is of class C^1.
FB on the dual: min_{x ∈ H} G1(x) + G2(A x) is solved through the dual problem
  min_{u ∈ L} G1*(−A*u) + G2*(u)   (smooth + simple)
  u^(ℓ+1) = Prox_{τ G2*}( u^(ℓ) + τ A ∇G1*(−A*u^(ℓ)) ).

Example: TV Denoising
min_{f ∈ R^N} (1/2)||f − y||² + λ ||∇f||_1  has the dual problem  min_{||u||_∞ ≤ λ} ||y + div(u)||²,
where ||u||_1 = Σ_i ||u_i|| and ||u||_∞ = max_i ||u_i||.
Dual solution u*  ⟹  primal solution f* = y + div(u*).   [Chambolle 2004]
FB on the dual (a.k.a. projected gradient descent):
  u^(ℓ+1) = Proj_{||·||_∞ ≤ λ}( u^(ℓ) + τ ∇(y + div(u^(ℓ))) ), with Proj_{||·||_∞ ≤ λ}(u)_i = u_i / max(||u_i||/λ, 1).
Convergence if τ < 2/||div ∇|| = 1/4.

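A minimal NumPy sketch of this dual projected gradient for 2-D TV denoising; the discrete grad/div pair follows the usual forward/backward-difference construction with div = −grad*, and the step, λ and test image are illustrative:

```python
import numpy as np

def grad(f):
    """Forward differences with Neumann boundary; output shape (H, W, 2)."""
    gx = np.zeros_like(f); gy = np.zeros_like(f)
    gx[:-1, :] = f[1:, :] - f[:-1, :]
    gy[:, :-1] = f[:, 1:] - f[:, :-1]
    return np.stack([gx, gy], axis=-1)

def div(u):
    """Discrete divergence, the negative adjoint of grad."""
    ux, uy = u[..., 0], u[..., 1]
    dx = np.zeros_like(ux); dy = np.zeros_like(uy)
    dx[0, :] = ux[0, :]; dx[1:-1, :] = ux[1:-1, :] - ux[:-2, :]; dx[-1, :] = -ux[-2, :]
    dy[:, 0] = uy[:, 0]; dy[:, 1:-1] = uy[:, 1:-1] - uy[:, :-2]; dy[:, -1] = -uy[:, -2]
    return dx + dy

def tv_denoise_dual(y, lam, tau=0.2, n_iter=300):
    """Projected gradient on the dual: min_{||u||_inf <= lam} ||y + div(u)||^2."""
    u = np.zeros(y.shape + (2,))
    for _ in range(n_iter):
        u = u + tau * grad(y + div(u))                       # gradient step
        nrm = np.maximum(np.linalg.norm(u, axis=-1, keepdims=True) / lam, 1.0)
        u = u / nrm                                          # per-pixel projection ||u_i|| <= lam
    return y + div(u)                                        # primal solution f = y + div(u)

rng = np.random.default_rng(0)
f0 = np.zeros((64, 64)); f0[16:48, 16:48] = 1.0
y = f0 + 0.1 * rng.standard_normal(f0.shape)
f = tv_denoise_dual(y, lam=0.15)
```
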
Primal-Dual Algorithm
min_{x ∈ H} G1(x) + G2(A x)  ⟺  min_x max_z G1(x) − G2*(z) + ⟨A x, z⟩
Iterations:
  z^(ℓ+1) = Prox_{σ G2*}( z^(ℓ) + σ A(x̃^(ℓ)) )
  x^(ℓ+1) = Prox_{τ G1}( x^(ℓ) − τ A*(z^(ℓ+1)) )
  x̃^(ℓ+1) = x^(ℓ+1) + θ (x^(ℓ+1) − x^(ℓ))
θ = 0: Arrow-Hurwicz algorithm.  θ = 1: convergence speed guarantees on the duality gap.
Theorem [Chambolle-Pock 2011]: if 0 ≤ θ ≤ 1 and σ τ ||A||² < 1, then x^(ℓ) → x*, a minimizer of G1 + G2 ∘ A.

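A generic sketch of the primal-dual iteration with the proxes, A and A* passed as callables (all names are mine), applied for concreteness to a small 1-D TV denoising problem:

```python
import numpy as np

def chambolle_pock(prox_tau_G1, prox_sigma_G2s, A, At, x0, normA, n_iter=300, theta=1.0):
    """Primal-dual iterations for min_x G1(x) + G2(A x)."""
    tau = sigma = 0.9 / normA            # ensures sigma*tau*||A||^2 < 1
    x = x0.copy(); x_bar = x0.copy()
    z = A(x0) * 0.0                      # dual variable, lives in the range of A
    for _ in range(n_iter):
        z = prox_sigma_G2s(z + sigma * A(x_bar), sigma)
        x_old = x
        x = prox_tau_G1(x - tau * At(z), tau)
        x_bar = x + theta * (x - x_old)
    return x

# 1-D TV denoising: G1(x) = 0.5*||x - y||^2, G2(u) = lam*||u||_1, A = finite differences
n, lam = 200, 2.0
rng = np.random.default_rng(4)
y = np.repeat([0.0, 1.0, -0.5], [70, 60, 70]) + 0.1 * rng.standard_normal(n)
D = np.diff(np.eye(n), axis=0)                     # (n-1, n) difference matrix
A  = lambda x: D @ x
At = lambda z: D.T @ z
prox_G1  = lambda v, t: (v + t * y) / (1.0 + t)    # prox of t/2 * ||. - y||^2
prox_G2s = lambda v, s: np.clip(v, -lam, lam)      # prox of (lam*||.||_1)^*: projection
x = chambolle_pock(prox_G1, prox_G2s, A, At, np.zeros(n), np.linalg.norm(D, 2))
```
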
Conclusion
Inverse problems in imaging:
• Large scale, N ≈ 10^6.
• Non-smooth (sparsity, TV, …).
• (Sometimes) convex.
• Highly structured (separability, ℓp norms, …).
Proximal splitting:
• Unravels the structure of problems.
• Parallelizable.
• Decomposition G = Σ_k G_k.
Open problems:
• Less structured problems without smoothness.
• Non-convex optimization.
