IDCOM, University of Edinburgh



      Rank Aware Algorithms for Joint Sparse
                    Recovery




                               Mike Davies*
              Joint work with Yonina Eldar‡ and Jeff Blanchard†


         * Institute of Digital Communications, University of Edinburgh
              ‡ Technion, Israel           † Grinnell College, USA
IDCOM, University of Edinburgh

                                    Outline of Talk


•   Multiple Measurements vs Single Measurements
•   Nec.+suff. conditions for Joint Sparse Recovery
•   Reduced complexity combinatorial search
•   Classical approaches to sparse MMV problem
    – How good are SOMP and convex optimization?
• Rank Aware Pursuits
    – Evolution of the rank of residual matrices
    – A recovery guarantee
• Empirical simulations
IDCOM, University of Edinburgh
                  Sparse Single Measurement Vector
                               Problem

[Diagram: measurements y (m×1) = measurement matrix Φ (m×n) × sparse signal x (n×1), with k nonzero elements]


Given y ∈ R^m and Φ ∈ R^{m×n} with m < n, find:

                    x̂ = argmin_x | supp(x)|  s.t.  Φx = y.
IDCOM, University of Edinburgh
                 Sparse Multiple Measurement Vector
                               Problem

[Diagram: measurements Y (m×l) = measurement matrix Φ (m×n) × sparse signal X (n×l), with k nonzero rows; the row support is shared across all columns]


Given Y ∈ R^{m×l} and Φ ∈ R^{m×n} with m < n, find:

                   X̂ = argmin_X | supp(X)|  s.t.  ΦX = Y.
IDCOM, University of Edinburgh

                                    MMV uniqueness
Worst Case
•   Uniqueness of solution for sparse MMV problem is equivalent to that for
    SMV problem. Simply replicate SMV problem:
                              X = [x, x, . . . , x]
    Hence nec. + suff. condition to uniquely determine each k-sparse vector x is
    given by SMV condition:
                       | supp(X)| = k < spark(Φ)/2
Rank 'r' Case
•   If rank(Y) = r then the necessary + sufficient conditions are less restrictive
    [Chen & Huo 2006, D. & Eldar 2010]:
                | supp(X)| = k < (spark(Φ) − 1 + rank(Y))/2
    Equivalently we can replace rank(Y) with rank(X).

         More measurements (higher rank) makes recovery easier!
IDCOM, University of Edinburgh

                                 MMV uniqueness
Generic scenario:
Typical matrices achieve maximal spark:

                Φ ∈ R^{m×n}  →  spark(Φ) = m + 1

Typical matrices achieve maximal rank

            X ∈ R^{k×l}  →  rank(X) = r = min{k, l}

Hence generically we have uniqueness if

                m ≥ 2k − min{k, l} + 1 ≥ k + 1

When l ≥ k we typically only need k+1 measurements
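
For example, using the parameters of the later simulations (n = 256, k = 16): with a single measurement vector (l = 1) uniqueness generically requires m ≥ 2·16 − 1 + 1 = 32 measurements, whereas with l ≥ 16 full-rank measurement vectors it requires only m ≥ 16 + 1 = 17.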
IDCOM, University of Edinburgh

                        Exhaustive search solution

How does the rank change the exhaustive search?
SMV exhaustive search:


                 find Λ, |Λ| = k s.t. Φ_Λ X_{Λ,:} = Y

 However, since span(Y) ⊆ span(Φ_Λ) and rank(Y) = r,

     ∃ γ ⊂ Λ, |γ| = k − r s.t. span([Φ_γ , Y]) = span(Φ_Λ)

In fact we have a reduced (n choose k−r+1) combinatorial search.
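
For example, with n = 256 and k = 16: when r = 1 this is still a search over (256 choose 16) candidate supports, but when r = k = 16 only (256 choose 1) = 256 candidates remain, which is the linear search exploited by MUSIC on the next slide.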
IDCOM, University of Edinburgh

                           Geometric Picture for MMV
[Diagram: atoms φ1, φ2, φ3 and the plane span(Y), where Y = Φ_Λ X_{Λ,:}; a 2-sparse vector lies in span(Y)]
If X is k-sparse and rank(Y) = r there exists a (k-r+1)-sparse vector in span(Y)
IDCOM, University of Edinburgh

             Maximal Rank Exhaustive Search: MUSIC
When we have maximal rank(X) = k the exhaustive search is linear and
can be solved with a modified MUSIC algorithm.

Let U = orth(Y). This is an orthonormal basis for span(Φ_Λ).

Then under identifiability conditions we have:

               ‖(I − UU^T)φ_i‖_2 = 0 if and only if i ∈ Λ.

(in practice, select the support by thresholding)

Theorem 1 (Feng 1996) Let Y = ΦX with | supp(X)| = k, rank(X) = k
and k < spark(Φ) − 1. Then MUSIC is guaranteed to recover X (i.e. X̂ = X).
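
A minimal numpy sketch of this modified MUSIC selection rule (the function name, the SVD-based orth(Y) and the rank tolerance are illustrative assumptions, not part of the slides):

  import numpy as np

  def music_support(Phi, Y, k, tol=1e-10):
      # Orthonormal basis for span(Y); with maximal rank this equals span(Phi_Lambda).
      U, s, _ = np.linalg.svd(Y, full_matrices=False)
      U = U[:, s > tol * s[0]]
      # Distance of every atom from the signal subspace: ||(I - U U^T) phi_i||_2.
      resid = np.linalg.norm(Phi - U @ (U.T @ Phi), axis=0)
      # In exact arithmetic resid[i] = 0 iff i is in the support;
      # in practice keep the k smallest values (thresholding).
      return np.sort(np.argsort(resid)[:k])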
IDCOM, University of Edinburgh




                 The maximal rank problem is not NP-hard.
                 Furthermore there is no constraint on the
                             dictionary size n!
IDCOM, University of Edinburgh




                           Popular MMV solutions
IDCOM, University of Edinburgh

               Popular MMV sparse recovery solutions
Two classes of MMV sparse recovery algorithm:
greedy, e.g.
   Algorithm 1 Simultaneous Orthogonal Matching Pursuit (SOMP)
    1: Initialize: R^(0) = Y, X^(0) = 0, Λ^(0) = ∅
    2: for n = 1; n := n + 1 until stopping criterion do
    3:    i_n = argmax_i ‖φ_i^T R^(n−1)‖_q
    4:    Λ^(n) = Λ^(n−1) ∪ i_n
    5:    X^(n)_{Λ^(n),:} = Φ_{Λ^(n)}^† Y
    6:    R^(n) = P⊥_{Λ^(n)} Y where P⊥_{Λ^(n)} := (I − Φ_{Λ^(n)} Φ_{Λ^(n)}^†)
    7: end for


and relaxed, e.g.

   Algorithm 2 ℓ1/ℓq Minimization
                         X̂ = argmin_X ‖X‖_{1,q}  s.t.  ΦX = Y
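
For concreteness, a minimal numpy sketch of SOMP as listed in Algorithm 1 (the function name and interface are assumed for illustration, with no attempt at efficiency):

  import numpy as np

  def somp(Phi, Y, k, q=2):
      # Greedy SOMP: select the atom with the largest l_q norm of correlations
      # with the current residual, then orthogonally project Y away from the
      # span of the selected atoms.
      m, n = Phi.shape
      support, R = [], Y.copy()
      for _ in range(k):
          corr = np.linalg.norm(Phi.T @ R, ord=q, axis=1)   # ||phi_i^T R||_q
          corr[support] = 0.0                               # do not reselect atoms
          support.append(int(np.argmax(corr)))
          X_sub = np.linalg.pinv(Phi[:, support]) @ Y       # least-squares coefficients
          R = Y - Phi[:, support] @ X_sub                   # residual update
      X = np.zeros((n, Y.shape[1]))
      X[support, :] = X_sub
      return X, sorted(support)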
IDCOM, University of Edinburgh
                  Do such MMV solutions exploit the
                               rank?
Answer: NO. [D. & Eldar 2010]
Theorem 2 (SOMP is not rank aware) Let τ be given such that 1 ≤ τ ≤ k
and suppose that
                       max_{j∉Λ} ‖Φ_Λ^† φ_j‖_1 > 1
for some support Λ, |Λ| = k. Then there exists an X with supp(X) = Λ and
rank(X) = τ that SOMP cannot recover.

(The assumption is the violation of the SMV OMP exact recovery condition.)

   Proof: a rank τ perturbation of a rank 1 problem approaches the rank 1 recovery
property by continuity of the norm.
IDCOM, University of Edinburgh
                  Do such MMV solutions exploit the
                               rank?
Answer: NO. [D. & Eldar 2010]
Theorem 3 (ℓ1/ℓq minimization is not rank aware) Let τ be given such
that 1 ≤ τ ≤ k and suppose that there exists a z ∈ N(Φ) such that

                                 ‖z_Λ‖_1 > ‖z_{Λ^c}‖_1

for some support Λ, |Λ| = k. Then there exists an X with supp(X) = Λ and
rank(X) = τ that the mixed norm solution cannot recover.

(The assumption is the violation of the SMV ℓ1 Null Space Property.)

   Proof: a rank τ perturbation of a rank 1 problem approaches the rank 1 recovery
property by continuity of the norm.
IDCOM, University of Edinburgh




                             Rank Aware Pursuits
IDCOM, University of Edinburgh

                            Rank Aware Selection
Aim: to select individual atoms in a similar manner to modified MUSIC

Rank Aware Selection [D. & Eldar 2010]
At the nth iteration make the following selection:

               Λ^(n) = Λ^(n−1) ∪ argmax_i ‖φ_i^T U^(n−1)‖_2

where U^(n−1) = orth(R^(n−1))

Properties:
     1. Worst case behaviour does not approach SMV case.
     2. When rank(R) = k it always selects a correct atom as with
        MUSIC
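
A numpy sketch of a single rank aware selection step (illustrative only; the truncated SVD stands in for orth(R) and the tolerance is an assumption). Compared with the SOMP selection above, atoms are correlated with an orthonormal basis of the residual rather than with the residual itself:

  import numpy as np

  def rank_aware_select(Phi, R, support, tol=1e-10):
      # Orthonormal basis U for span(R), i.e. orth(R).
      U, s, _ = np.linalg.svd(R, full_matrices=False)
      U = U[:, s > tol * max(s[0], tol)]
      # Score each atom by ||phi_i^T U||_2 and exclude already selected atoms.
      score = np.linalg.norm(Phi.T @ U, axis=1)
      score[list(support)] = 0.0
      return int(np.argmax(score))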
IDCOM, University of Edinburgh

                                 Rank Aware OMP
Rank Aware OMP
Let's simply replace the selection step in SOMP with the rank aware
selection.

Does this provide guaranteed recovery in the full rank scenario?

Answer: NO.

Why?
We get rank degeneration of the residual matrix:

                rank(R^(i)) ≤ min{rank(Y), k − i}
As we take more steps the rank reduces to one while R^(i) is typically still
k-sparse.
               We lose the rank benefits as we iterate
IDCOM, University of Edinburgh
               Rank Aware Order Recursive Matching
                             Pursuit
The fix...
We can fix this problem by forcing the sparsity to also reduce as a
function of iteration. This is achieved by:
 Algorithm 1 Rank Aware Order Recursive Matching Pursuit (RA-ORMP)
  1: Initialize: R^(0) = Y, X^(0) = 0, Λ^(0) = ∅, P⊥_(0) = I
  2: for n = 1; n := n + 1 until stopping criterion do
  3:    Calculate an orthonormal basis for the residual: U^(n−1) = orth(R^(n−1))
  4:    i_n = argmax_{i∉Λ^(n−1)} ‖φ_i^T U^(n−1)‖_2 / ‖P⊥_(n−1) φ_i‖_2
  5:    Λ^(n) = Λ^(n−1) ∪ i_n
  6:    X^(n)_{Λ^(n),:} = Φ_{Λ^(n)}^† Y
  7:    R^(n) = P⊥_(n) Y where P⊥_(n) := (I − Φ_{Λ^(n)} Φ_{Λ^(n)}^†)
  8: end for


R^(n) is (k−n)-sparse in the modified dictionary φ̃_i = P⊥_(n) φ_i / ‖P⊥_(n) φ_i‖_2
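
A numpy sketch of RA-ORMP following the listing above (function name, tolerances and the pseudo-inverse based projector update are illustrative choices, not the authors' code):

  import numpy as np

  def ra_ormp(Phi, Y, k, tol=1e-10):
      m, n = Phi.shape
      support, R = [], Y.copy()
      P_perp_Phi = Phi.copy()                 # P_perp^(n-1) phi_i for every atom
      for _ in range(k):
          # orth(R^(n-1)) via a truncated SVD.
          U, s, _ = np.linalg.svd(R, full_matrices=False)
          U = U[:, s > tol * max(s[0], tol)]
          # Selection: ||phi_i^T U||_2 / ||P_perp phi_i||_2 over unselected atoms.
          denom = np.linalg.norm(P_perp_Phi, axis=0)
          denom[denom < tol] = np.inf         # atoms already (numerically) in the span
          score = np.linalg.norm(Phi.T @ U, axis=1) / denom
          score[support] = 0.0
          support.append(int(np.argmax(score)))
          # Projector onto span(Phi_Lambda) and the corresponding updates.
          Phi_S = Phi[:, support]
          P = Phi_S @ np.linalg.pinv(Phi_S)
          R = Y - P @ Y
          P_perp_Phi = Phi - P @ Phi
      X = np.zeros((n, Y.shape[1]))
      X[support, :] = np.linalg.pinv(Phi[:, support]) @ Y
      return X, sorted(support)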
IDCOM, University of Edinburgh

                                RA-OMP vs RA-ORMP
      Comparison of how the (typical) residual rank and sparsity
                   evolve as a function of iteration

[Schematic: two panels, RA-OMP and RA-ORMP, plotting residual rank (starting at r) and residual sparsity (starting at k) against the iteration number, up to k iterations; a shaded region marks iterations where correct selection is not guaranteed]
IDCOM, University of Edinburgh
                                                               SOMP/RA-OMP/RA-ORMP
                                                                    Comparison

[Plots: probability of exact recovery vs sparsity k (0 to 30) for SOMP, RA-OMP and RA-ORMP, one curve per number of measurement vectors l]



n = 256, m = 32, l = 1,2,4,8,16,32. Dictionary ~ i.i.d. Gaussian and X
coefficients ~ Gaussian i.i.d. (note that this is beneficial to SOMP!)
IDCOM, University of Edinburgh

                                      Rank Aware OMP
Alternative Solutions
Recently two independent solutions have been proposed that are variations on
   a theme:
1. Compressive MUSIC [Kim et al 2010]
     i.   perform SOMP for k-r-1 steps        (but SOMP is rank blind)
     ii.  apply modified MUSIC

2. Iterative MUSIC [Lee & Bresler 2010]
     i.   orthogonalize: U = orth(Y)
     ii.  apply SOMP to {Φ, U} for k-r-1 steps
     iii. apply modified MUSIC
     (the orthogonalization is not guaranteed beyond step 1)

This motivates us to consider a minor modification of (2):

3. RA-OMP+MUSIC
     i.   perform RA-OMP for k-r-1 steps
     ii.  apply modified MUSIC
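
A hedged numpy sketch of option 3, combining the rank aware selection above with a modified MUSIC completion. This is an illustrative combination, not the authors' implementation; it runs k − r greedy steps (the slides quote k−r−1) so that the selected atoms together with Y can span the full k-dimensional subspace, as in the exhaustive search slide:

  import numpy as np

  def ra_omp_music(Phi, Y, k, tol=1e-10):
      m, n = Phi.shape
      r = np.linalg.matrix_rank(Y, tol)
      support, R = [], Y.copy()
      # 1) Rank aware greedy selections with an OMP-style residual update.
      for _ in range(max(k - r, 0)):
          U, s, _ = np.linalg.svd(R, full_matrices=False)
          U = U[:, s > tol * max(s[0], tol)]
          score = np.linalg.norm(Phi.T @ U, axis=1)
          score[support] = 0.0
          support.append(int(np.argmax(score)))
          Phi_S = Phi[:, support]
          R = Y - Phi_S @ (np.linalg.pinv(Phi_S) @ Y)
      # 2) Modified MUSIC on the augmented subspace span([Phi_support, Y]).
      A = np.hstack([Phi[:, support], Y])
      U, s, _ = np.linalg.svd(A, full_matrices=False)
      U = U[:, s > tol * max(s[0], tol)]
      resid = np.linalg.norm(Phi - U @ (U.T @ Phi), axis=0)
      resid[support] = np.inf                 # keep the atoms already selected
      extra = np.argsort(resid)[: k - len(support)]
      return sorted(support + [int(i) for i in extra])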
IDCOM, University of Edinburgh

                                 Recovery guarantee
Two nice rank aware solutions

a) Apply RA-OMP for k-r-1 steps then complete with modified MUSIC
b) Apply RA-ORMP for k steps (if first k-r steps make correct selection we
   have guaranteed recovery)

we now have the following recovery guarantee [Blanchard & D.]:
Theorem 4 (MMV CS recovery) Assume X_Λ ∈ R^{n×r} is in general position
for some support set Λ, |Λ| = k > r, and let Φ be a random matrix independent
of X with Φ_{i,j} ∼ N(0, m^{−1}). Then (a) and (b) can recover X from Y with high
probability if:
                          m ≥ const · k (log N / r + 1)
That is: as r increases the effect of the log N term diminishes
IDCOM, University of Edinburgh
                                                       RA-OMP+MUSIC / RA-ORMP
                                                             Comparison
[Plots: probability of exact recovery vs sparsity k (0 to 30) for RA-OMP+MUSIC and RA-ORMP, one curve per number of measurement vectors l]




n = 256, m = 32, l = 1,2,4,8,16,32. i.i.d. Gaussian Dictionary and X
coefficients ~ Gaussian i.i.d.
IDCOM, University of Edinburgh

                             Empirical Phase Transitions
[Phase-transition plot: m (5 to 50) against k (5 to 50) for RA-OMP+MUSIC, RA-ORMP, RA-OMP and SOMP, each with l = 16]




        Gaussian dictionary "phase transitions" with Gaussian
                       significant coefficients
IDCOM, University of Edinburgh
                                              Correlated vs uncorrelated
                                                      coefficients
[Phase-transition plots: m (5 to 50) against k (5 to 50) for SOMP (l = 16) and RA-ORMP (l = 16)]




              Gaussian dictionary "phase transitions" with uncorrelated
                                sparse coefficients
IDCOM, University of Edinburgh
                                                 Correlated vs uncorrelated
                                                         coefficients
[Phase-transition plots: m (5 to 50) against k (5 to 50) for SOMP (l = 16, highly correlated) and RA-ORMP (l = 16, highly correlated)]




          Gaussian dictionary "phase transitions" with highly correlated
                               sparse coefficients
IDCOM, University of Edinburgh

                                  Summary



• MMV problem is easier than SMV problem in general
• Don't dismiss using exhaustive search (not always NP-hard!)
• Good rank aware greedy algorithms exist

Questions
• Can we extend these ideas to IHT or CoSaMP?
• How can we incorporate rank awareness into convex optimization?
IDCOM, University of Edinburgh

                 Workshop : Signal Processing with Adaptive Sparse
                       Structured Representations (SPARS '11)
                     June 27-30, 2011 - Edinburgh, (Scotland, UK)




                            Plenary speakers :

                 David L. Donoho, Stanford University, USA
                     Martin Vetterli, EPFL, Switzerland
              Stephen J. Wright, University of Wisconsin, USA
               David J. Brady, Duke University, Durham, USA
           Yi Ma, University of Illinois at Urbana-Champaign, USA
             Joel Tropp, California Institute of Technology, USA
        Remi Gribonval, Centre de Recherche INRIA Rennes, France
        Francis Bach, Laboratoire d'Informatique de l'E.N.S., France
