Part 3.
Nonnegative Matrix Factorization ⇔
K-means and Spectral Clustering




PCA & Matrix Factorization for Learning, ICML 2005 Tutorial, Chris Ding   100
Nonnegative Matrix Factorization (NMF)

Data matrix: n points in p dimensions,

    X = (x1, x2, …, xn)

where each xi is an image, document, webpage, etc.

Decomposition (low-rank approximation):

    X ≈ F G^T

Nonnegative matrices:

    X_ij ≥ 0,   F_ij ≥ 0,   G_ij ≥ 0

    F = (f1, f2, …, fk)        G = (g1, g2, …, gk)
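In NumPy terms the decomposition can be sketched as follows; the sizes p, n, k and the random factors are illustrative choices, not from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 6, 10, 2            # p-dim data, n points, rank k (illustrative sizes)

F = rng.random((p, k))        # nonnegative basis, columns f1..fk
G = rng.random((n, k))        # nonnegative coefficients, columns g1..gk
X = F @ G.T                   # p x n data matrix, columns x1..xn

assert X.shape == (p, n)
assert (X >= 0).all()                    # product of nonnegatives stays nonnegative
assert np.linalg.matrix_rank(X) <= k     # low-rank by construction
```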
Some historical notes
• Earlier work in statistics
• P. Paatero (1994), Environmetrics
• Lee and Seung (1999, 2000)
  – Parts of a whole (no cancellation)
  – A multiplicative update algorithm




[Figure: an image unrolled into a pixel vector,
 e.g. (0.0, 0.5, 0.7, 1.0, …, 0.8, 0.2, 0.0)^T]
Parts-based perspective

    X = (x1, x2, …, xn)

    F = (f1, f2, …, fk)        G = (g1, g2, …, gk)

    X ≈ F G^T
Sparsify F to get a parts-based picture

    X ≈ F G^T,    F = (f1, f2, …, fk)

(Li et al., 2001; Hoyer, 2003)
Theorem.
NMF = kernel K-means clustering

NMF produces a holistic model of the data.
Theoretical results and experimental verification.

(Ding, He, Simon, 2005)
Our Results: NMF = Data Clustering




Theorem: K-means = NMF

• Reformulate K-means and kernel K-means as

    max_{H^T H = I, H ≥ 0} Tr(H^T W H)

• Show equivalence to

    min_{H^T H = I, H ≥ 0} ||W − H H^T||²
NMF = K-means

    H = arg max_{H^T H = I, H ≥ 0} Tr(H^T W H)

      = arg min_{H^T H = I, H ≥ 0} [ −2 Tr(H^T W H) ]

      = arg min_{H^T H = I, H ≥ 0} [ ||W||² − 2 Tr(H^T W H) + ||H^T H||² ]

      = arg min_{H^T H = I, H ≥ 0} ||W − H H^T||²

      ⇒ arg min_{H ≥ 0} ||W − H H^T||²
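The key step, that ||W − HH^T||² and −2 Tr(H^T W H) differ only by terms fixed under H^T H = I, can be checked numerically. A sketch (the orthonormal H below ignores the nonnegativity constraint, which is irrelevant to this algebraic identity):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3

W = rng.random((n, n))
W = (W + W.T) / 2                         # symmetric similarity matrix
H, _ = np.linalg.qr(rng.random((n, k)))   # any H with H^T H = I

lhs = np.linalg.norm(W - H @ H.T, "fro") ** 2
rhs = (np.linalg.norm(W, "fro") ** 2
       - 2 * np.trace(H.T @ W @ H)
       + np.linalg.norm(H.T @ H, "fro") ** 2)   # last term equals k when H^T H = I
assert np.isclose(lhs, rhs)
```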
Spectral Clustering = NMF

Normalized Cut:

    J_Ncut = Σ_{<k,l>} [ s(C_k, C_l)/d_k + s(C_k, C_l)/d_l ] = Σ_k s(C_k, G − C_k)/d_k

           = h1^T (D − W) h1 / (h1^T D h1) + … + hk^T (D − W) hk / (hk^T D hk)

Unsigned cluster indicators (n_k ones in the k-th block):

    y_k = D^{1/2} (0…0, 1…1, 0…0)^T / ||D^{1/2} h_k||

Rewrite:

    J_Ncut(y1, …, yk) = y1^T (I − W̃) y1 + … + yk^T (I − W̃) yk
                      = Tr(Y^T (I − W̃) Y),    W̃ = D^{−1/2} W D^{−1/2}

Optimize:

    max_Y Tr(Y^T W̃ Y),  subject to Y^T Y = I

Normalized Cut ⇒

    min_{H^T H = I, H ≥ 0} ||W̃ − H H^T||²

(Gu et al., 2001)
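The normalized affinity W̃ and the relaxed optimum (top eigenvectors of W̃) can be computed directly. A toy sketch, with an invented 6-node graph: two triangles joined by one bridge edge, so the second eigenvector's sign pattern recovers the two clusters:

```python
import numpy as np

# Toy graph: triangles {0,1,2} and {3,4,5} joined by bridge edge (2,3)
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

d = W.sum(axis=1)
W_tilde = W / np.sqrt(np.outer(d, d))      # W~ = D^{-1/2} W D^{-1/2}, elementwise

# max Tr(Y^T W~ Y) s.t. Y^T Y = I is solved by the top-k eigenvectors of W~
vals, vecs = np.linalg.eigh(W_tilde)       # eigenvalues in ascending order
fiedler = vecs[:, -2]                      # eigenvector of the second-largest eigenvalue
labels = (fiedler > 0).astype(int)         # sign pattern separates the two triangles
```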
Advantages of NMF over standard K-means

                                           Soft clustering




Experiments on Internet Newsgroups

    NG2:  comp.graphics
    NG9:  rec.motorcycles
    NG10: rec.sport.baseball
    NG15: sci.space
    NG18: talk.politics.mideast

100 articles from each group; 1000 words; tf.idf weighting; cosine similarity.

Accuracy of clustering results (cosine similarity):

    K-means    NMF (W ≈ HH^T)
    0.531      0.612
    0.491      0.590
    0.576      0.608
    0.632      0.652
    0.697      0.711
Summary for Symmetric NMF

• K-means, kernel K-means
• Spectral clustering

    max_{H^T H = I, H ≥ 0} Tr(H^T W H)

• Equivalent to

    min_{H^T H = I, H ≥ 0} ||W − H H^T||²
Non-symmetric NMF
Rectangular data matrix ⇔ bipartite graph

• Information retrieval: word-to-document matrices
• DNA gene expression
• Image pixels
• Supermarket transaction data

K-means Clustering of Bipartite Graphs

Simultaneous clustering of rows and columns.

Row indicators:

        ⎡1 0⎤
    F = ⎢1 0⎥ = (f1, f2)
        ⎢0 1⎥
        ⎣0 1⎦

Column indicators:

        ⎡1 0⎤
    G = ⎢0 1⎥ = (g1, g2)
        ⎢1 0⎥
        ⎣0 1⎦

    J_Kmeans = Σ_{j=1}^k s(B_{R_j, C_j}) / (|R_j| |C_j|) = Tr(F^T B G)
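The objective Tr(F^T B G) rewards assignments that capture the within-block association. A toy numerical sketch (the 4×4 matrix B and the cluster assignments are made up; indicators are normalized so that F^T F = I and G^T G = I):

```python
import numpy as np

# Hypothetical 4x4 association matrix B (e.g. word-to-document counts)
B = np.array([[2., 2., 0., 0.],
              [2., 2., 0., 0.],
              [0., 0., 3., 3.],
              [0., 0., 3., 3.]])

# Normalized indicators: rows {0,1} vs {2,3}, columns {0,1} vs {2,3}
F = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]]) / np.sqrt(2)
G = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]]) / np.sqrt(2)

score = np.trace(F.T @ B @ G)       # within-block association

# Scrambling the column assignment lowers the objective
G_bad = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]]) / np.sqrt(2)
score_bad = np.trace(F.T @ B @ G_bad)
assert score > score_bad
```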
NMF = K-means clustering

    (F, G) = arg max_{F^T F = I, F ≥ 0; G^T G = I, G ≥ 0} Tr(F^T B G)

           = arg min_{F^T F = I, F ≥ 0; G^T G = I, G ≥ 0} [ −2 Tr(F^T B G) ]

           = arg min_{F^T F = I, F ≥ 0; G^T G = I, G ≥ 0} [ ||B||² − 2 Tr(F^T B G) + Tr(F^T F G^T G) ]

           = arg min_{F^T F = I, F ≥ 0; G^T G = I, G ≥ 0} ||B − F G^T||²
Solving NMF with nonnegative least squares

    J = ||X − F G^T||²,   F ≥ 0, G ≥ 0

Fix F and solve for G; fix G and solve for F:

    J = Σ_{i=1}^n ||xi − F g̃i||²,   G = (g̃1, …, g̃n)^T

where g̃i is the i-th row of G. Iterate; this converges to a local minimum.
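The alternating scheme can be sketched as below. The function name is mine, and the clipping projection is a crude stand-in for a true nonnegative least-squares solve (e.g. Lawson–Hanson), used here only to keep the sketch dependency-free:

```python
import numpy as np

def projected_als(X, k, iters=200, seed=0):
    """Alternating least squares for X ~ F G^T, with nonnegativity
    enforced by clipping (a heuristic stand-in for exact NNLS subproblems)."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    F = rng.random((p, k))
    G = rng.random((n, k))
    for _ in range(iters):
        # Fix F, solve min ||X - F G^T|| for G, then project onto G >= 0
        G = np.maximum(np.linalg.lstsq(F, X, rcond=None)[0].T, 0)
        # Fix G, solve for F, then project onto F >= 0
        F = np.maximum(np.linalg.lstsq(G, X.T, rcond=None)[0].T, 0)
    return F, G

rng = np.random.default_rng(1)
X = rng.random((8, 2)) @ rng.random((2, 12))    # nonnegative rank-2 test data
F, G = projected_als(X, k=2)
err = np.linalg.norm(X - F @ G.T) / np.linalg.norm(X)
assert (F >= 0).all() and (G >= 0).all()
```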

Solving NMF with multiplicative updates

    J = ||X − F G^T||²,   F ≥ 0, G ≥ 0

Fix F and solve for G; fix G and solve for F.

Lee & Seung (2000) propose:

    F_ik ← F_ik (X G)_ik / (F G^T G)_ik        G_jk ← G_jk (X^T F)_jk / (G F^T F)_jk
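These updates are a few lines in NumPy. A minimal sketch (the test data, iteration count, and small-constant safeguard `eps` are my own choices):

```python
import numpy as np

def nmf_multiplicative(X, k, iters=500, eps=1e-12, seed=0):
    """Lee & Seung multiplicative updates for min ||X - F G^T||^2, F, G >= 0."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    F = rng.random((p, k)) + eps
    G = rng.random((n, k)) + eps
    for _ in range(iters):
        F *= (X @ G) / (F @ G.T @ G + eps)     # F_ik <- F_ik (XG)_ik / (FG^TG)_ik
        G *= (X.T @ F) / (G @ F.T @ F + eps)   # G_jk <- G_jk (X^TF)_jk / (GF^TF)_jk
    return F, G

rng = np.random.default_rng(2)
X = rng.random((10, 3)) @ rng.random((3, 15))   # nonnegative, exactly rank 3
F, G = nmf_multiplicative(X, k=3)
err = np.linalg.norm(X - F @ G.T) / np.linalg.norm(X)
assert err < 0.1    # easy exactly-factorable data: fit should be close
```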


Symmetric NMF

    J = ||W − H H^T||²,   H ≥ 0

Constrained optimization; first-order KKT conditions.

Complementary slackness:

    0 = (∂J/∂H)_ik H_ik = (−4 W H + 4 H H^T H)_ik H_ik

Gradient descent,

    H_ik ← H_ik − ε_ik (∂J/∂H)_ik,   with step size ε_ik = H_ik / (4 (H H^T H)_ik),

taken with a partial step β ε_ik, gives the multiplicative update

    H_ik ← H_ik (1 − β + β (W H)_ik / (H H^T H)_ik)
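A minimal sketch of this update on an idealized block-diagonal similarity matrix (matrix, sizes, seed, and β = 0.5 are arbitrary illustrative choices):

```python
import numpy as np

# Similarity matrix with two perfect clusters: W = blockdiag(ones(3), ones(3))
W = np.kron(np.eye(2), np.ones((3, 3)))

rng = np.random.default_rng(0)
H = rng.random((6, 2))                     # nonnegative random init
J_init = np.linalg.norm(W - H @ H.T) ** 2

beta = 0.5                                 # partial-step parameter from the slide
for _ in range(300):
    H *= 1 - beta + beta * (W @ H) / (H @ H.T @ H + 1e-12)

J_final = np.linalg.norm(W - H @ H.T) ** 2
labels = H.argmax(axis=1)                  # rows of H are soft memberships
assert J_final < J_init                    # the objective decreases
```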
Summary


           •   NMF is a new low-rank approximation
           •   The holistic picture (vs. parts-based)
           •   NMF is equivalent to spectral clustering
           •   Main advantage: soft clustering






My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Dernier

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Dernier (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Principal Component Analysis and Matrix Factorizations for Learning (Part 3). Chris Ding, ICML 2005 Tutorial.

  • 1. Part 3. Nonnegative Matrix Factorization ⇔ K-means and Spectral Clustering
    (PCA & Matrix Factorization for Learning, ICML 2005 Tutorial, Chris Ding)
  • 2. Nonnegative Matrix Factorization (NMF)
    Data matrix, n points in p dimensions: $X = (x_1, x_2, \ldots, x_n)$, where each $x_i$ is an image, document, webpage, etc.
    Decomposition (low-rank approximation): $X \approx F G^\top$
    Nonnegative matrices: $X_{ij} \ge 0$, $F_{ij} \ge 0$, $G_{ij} \ge 0$
    $F = (f_1, f_2, \ldots, f_k)$, $G = (g_1, g_2, \ldots, g_k)$
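The shapes in this decomposition can be sketched in NumPy (a minimal illustration, not from the tutorial; the dimensions p, n, k are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 6, 10, 2                 # p-dim features, n points, rank k

F = rng.random((p, k))             # nonnegative basis vectors f_1..f_k
G = rng.random((n, k))             # nonnegative coefficients g_1..g_k
X = F @ G.T                        # an exactly rank-k nonnegative matrix

assert X.shape == (p, n)
assert (X >= 0).all() and (F >= 0).all() and (G >= 0).all()
```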
  • 3. Some historical notes
    – Earlier work by statisticians: P. Paatero (1994), Environmetrics
    – Lee and Seung (1999, 2000)
      – Parts of a whole (no cancellation)
      – A multiplicative update algorithm
  • 4. [Figure: a pixel vector, e.g. $(0.0, 0.5, 0.7, 1.0, \ldots, 0.8, 0.2, 0.0)^\top$]
  • 5. Parts-based perspective
    $X = (x_1, x_2, \ldots, x_n)$, $F = (f_1, f_2, \ldots, f_k)$, $G = (g_1, g_2, \ldots, g_k)$, $X \approx F G^\top$
  • 6. Sparsify F to get the parts-based picture
    $X \approx F G^\top$, $F = (f_1, f_2, \ldots, f_k)$
    (Li et al., 2001; Hoyer, 2003)
  • 7. Theorem: NMF = kernel K-means clustering
    NMF produces a holistic modeling of the data.
    Theoretical results and experimental verification (Ding, He, Simon, 2005)
  • 8. Our Results: NMF = Data Clustering [figure slide]
  • 9. Our Results: NMF = Data Clustering [figure slide]
  • 10. Theorem: K-means = NMF
    – Reformulate K-means and kernel K-means as
      $\max_{H^\top H = I,\; H \ge 0} \mathrm{Tr}(H^\top W H)$
    – Show equivalence to
      $\min_{H^\top H = I,\; H \ge 0} \|W - H H^\top\|^2$
  • 11. NMF = K-means
    $H = \arg\max_{H^\top H = I,\, H \ge 0} \mathrm{Tr}(H^\top W H)$
    $\;= \arg\min_{H^\top H = I,\, H \ge 0} \,[-2\,\mathrm{Tr}(H^\top W H)]$
    $\;= \arg\min_{H^\top H = I,\, H \ge 0} \,\|W\|^2 - 2\,\mathrm{Tr}(H^\top W H) + \|H^\top H\|^2$
    $\;= \arg\min_{H^\top H = I,\, H \ge 0} \,\|W - H H^\top\|^2$
    $\;\Rightarrow\; \arg\min_{H \ge 0} \,\|W - H H^\top\|^2$
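The key algebraic step, $\|W - HH^\top\|^2 = \|W\|^2 - 2\,\mathrm{Tr}(H^\top W H) + \|H^\top H\|^2$ for symmetric $W$, can be checked numerically (a sketch of ours; any symmetric $W$ and any $H$ will do). Under the constraint $H^\top H = I$, the first and last terms are constants, which is why maximizing the trace equals minimizing the Frobenius error:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((5, 5))
W = A + A.T                         # symmetric similarity matrix
H = rng.random((5, 2))              # (relaxed) cluster indicator matrix

lhs = np.linalg.norm(W - H @ H.T, "fro") ** 2
rhs = (np.linalg.norm(W, "fro") ** 2
       - 2 * np.trace(H.T @ W @ H)
       + np.linalg.norm(H.T @ H, "fro") ** 2)

assert np.isclose(lhs, rhs)
```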
  • 12. Spectral Clustering = NMF
    Normalized Cut:
    $J_{\mathrm{Ncut}} = \sum_{\langle k,l \rangle} \left( \frac{s(C_k, C_l)}{d_k} + \frac{s(C_k, C_l)}{d_l} \right) = \sum_k \frac{s(C_k, G - C_k)}{d_k} = \frac{h_1^\top (D - W) h_1}{h_1^\top D h_1} + \cdots + \frac{h_k^\top (D - W) h_k}{h_k^\top D h_k}$
    Unsigned cluster indicators (with $n_k$ ones): $y_k = D^{1/2} (0 \cdots 0, 1 \cdots 1, 0 \cdots 0)^\top / \|D^{1/2} h_k\|$
    Re-write: $J_{\mathrm{Ncut}}(y_1, \ldots, y_k) = y_1^\top (I - \widetilde{W}) y_1 + \cdots + y_k^\top (I - \widetilde{W}) y_k = \mathrm{Tr}(Y^\top (I - \widetilde{W}) Y)$, where $\widetilde{W} = D^{-1/2} W D^{-1/2}$
    Optimize: $\max_Y \mathrm{Tr}(Y^\top \widetilde{W} Y)$, subject to $Y^\top Y = I$
    Normalized Cut $\;\Rightarrow\; \min_{H^\top H = I,\, H \ge 0} \|\widetilde{W} - H H^\top\|^2$  (Gu et al., 2001)
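The relaxed trace problem can be sketched on a toy graph (our example, not the tutorial's): form $\widetilde{W} = D^{-1/2} W D^{-1/2}$ and take its top-$k$ eigenvectors, which maximize $\mathrm{Tr}(Y^\top \widetilde{W} Y)$ under $Y^\top Y = I$:

```python
import numpy as np

# toy similarity graph: groups {0,1,2} and {3,4}, loosely connected
W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
d = W.sum(axis=1)
Wt = W / np.sqrt(np.outer(d, d))        # D^{-1/2} W D^{-1/2}

vals, vecs = np.linalg.eigh(Wt)         # eigenvalues in ascending order
Y = vecs[:, -2:]                        # top-2 eigenvectors

# for a connected graph, the largest eigenvalue of Wt is exactly 1
assert np.isclose(vals[-1], 1.0)
```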
  • 13. Advantages of NMF over standard K-means
    – Soft clustering
  • 14. Experiments on Internet Newsgroups
    Five groups: NG2 comp.graphics, NG9 rec.motorcycles, NG10 rec.sport.baseball, NG15 sci.space, NG18 talk.politics.mideast.
    100 articles from each group; 1000 words; tf.idf weighting; cosine similarity.
    Accuracy of clustering results:
      K-means | W = HH'
      0.531   | 0.612
      0.491   | 0.590
      0.576   | 0.608
      0.632   | 0.652
      0.697   | 0.711
  • 15. Summary for Symmetric NMF
    – K-means, kernel K-means
    – Spectral clustering
      $\max_{H^\top H = I,\; H \ge 0} \mathrm{Tr}(H^\top W H)$
    – Equivalence to $\min_{H^\top H = I,\; H \ge 0} \|W - H H^\top\|^2$
  • 16. Nonsymmetric NMF
    – K-means, kernel K-means
    – Spectral clustering
      $\max_{H^\top H = I,\; H \ge 0} \mathrm{Tr}(H^\top W H)$
    – Equivalence to $\min_{H^\top H = I,\; H \ge 0} \|W - H H^\top\|^2$
  • 17. Non-symmetric NMF: rectangular data matrix ⇔ bipartite graph
    – Information retrieval: word-to-document matrices
    – DNA gene expression data
    – Image pixels
    – Supermarket transaction data
  • 18. K-means clustering of bipartite graphs
    Simultaneous clustering of rows and columns of $B$: row indicators $F = (f_1, \ldots, f_k)$ and column indicators $G = (g_1, \ldots, g_k)$, each column a normalized 0/1 membership vector (e.g. for row clusters $\{1,2\}$ and $\{3,4\}$, $F$ has columns $(1,1,0,0)^\top$ and $(0,0,1,1)^\top$, up to normalization).
    $J_{\mathrm{Kmeans}} = \sum_{j=1}^{k} \frac{s(R_j, C_j)}{\sqrt{|R_j|\,|C_j|}} = \mathrm{Tr}(F^\top B G)$
  • 19. NMF = K-means clustering
    $(F, G) = \arg\max_{F^\top F = I,\, F \ge 0;\; G^\top G = I,\, G \ge 0} \mathrm{Tr}(F^\top B G)$
    $\;= \arg\min \,[-2\,\mathrm{Tr}(F^\top B G)]$
    $\;= \arg\min \,\|B\|^2 - 2\,\mathrm{Tr}(F^\top B G) + \mathrm{Tr}(F^\top F\, G^\top G)$
    $\;= \arg\min \,\|B - F G^\top\|^2$
    (the same constraints are understood on every line)
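As in the symmetric case, the expansion $\|B - FG^\top\|^2 = \|B\|^2 - 2\,\mathrm{Tr}(F^\top B G) + \mathrm{Tr}(F^\top F\,G^\top G)$ can be verified numerically (a sketch with arbitrary $B$, $F$, $G$ of our choosing):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.random((4, 6))              # rectangular (e.g. word-document) matrix
F = rng.random((4, 2))              # relaxed row-cluster indicators
G = rng.random((6, 2))              # relaxed column-cluster indicators

lhs = np.linalg.norm(B - F @ G.T, "fro") ** 2
rhs = (np.linalg.norm(B, "fro") ** 2
       - 2 * np.trace(F.T @ B @ G)
       + np.trace(F.T @ F @ G.T @ G))

assert np.isclose(lhs, rhs)
```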
  • 20. Solving NMF with nonnegative least squares
    $J = \|X - F G^\top\|^2$, $F \ge 0$, $G \ge 0$
    Fix F, solve for G; fix G, solve for F:
    $J = \sum_{i=1}^{n} \|x_i - F \tilde{g}_i\|^2$, where $\tilde{g}_i$ is the $i$-th row of $G$
    Iterate; converges to a local minimum.
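The alternating scheme can be sketched with SciPy's nonnegative least-squares solver (a minimal illustration; the function name `nmf_anls` and all parameter choices are ours, not the tutorial's). Since each subproblem is solved exactly and $g = 0$ is always feasible, the objective never exceeds $\|X\|$ after the first half-step:

```python
import numpy as np
from scipy.optimize import nnls

def nmf_anls(X, k, n_iter=30, seed=0):
    """Alternating nonnegative least squares for X ≈ F G^T."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    F = rng.random((p, k))
    G = rng.random((n, k))
    for _ in range(n_iter):
        # fix F: each column x_i gives min_{g >= 0} ||x_i - F g||
        G = np.array([nnls(F, X[:, i])[0] for i in range(n)])
        # fix G: each row of F comes from X^T ≈ G F^T
        F = np.array([nnls(G, X[j, :])[0] for j in range(p)])
    return F, G

rng = np.random.default_rng(3)
X = rng.random((5, 2)) @ rng.random((2, 8))     # exactly rank-2, nonnegative
F, G = nmf_anls(X, k=2)
err = np.linalg.norm(X - F @ G.T, "fro")
assert (F >= 0).all() and (G >= 0).all()
assert err <= np.linalg.norm(X, "fro")
```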
  • 21. Solving NMF with multiplicative updates
    $J = \|X - F G^\top\|^2$, $F \ge 0$, $G \ge 0$
    Fix F, solve for G; fix G, solve for F. Lee & Seung (2000) propose
    $F_{ik} \leftarrow F_{ik} \dfrac{(X G)_{ik}}{(F G^\top G)_{ik}}$, $\quad G_{jk} \leftarrow G_{jk} \dfrac{(X^\top F)_{jk}}{(G F^\top F)_{jk}}$
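These updates translate directly to NumPy (a sketch; the small `eps` guard against division by zero is our addition, not part of the original updates):

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-9, seed=0):
    """Lee & Seung multiplicative updates for X ≈ F G^T (Frobenius norm)."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    F = rng.random((p, k))
    G = rng.random((n, k))
    for _ in range(n_iter):
        F *= (X @ G) / (F @ (G.T @ G) + eps)
        G *= (X.T @ F) / (G @ (F.T @ F) + eps)
    return F, G

rng = np.random.default_rng(4)
X = rng.random((6, 3)) @ rng.random((3, 9))     # exactly rank-3, nonnegative
F, G = nmf_multiplicative(X, k=3)
# the updates preserve nonnegativity and never increase the objective
assert (F >= 0).all() and (G >= 0).all()
assert np.linalg.norm(X - F @ G.T, "fro") < np.linalg.norm(X, "fro")
```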
  • 22. Symmetric NMF
    $J = \|W - H H^\top\|^2$, $H \ge 0$
    Constrained optimization; the KKT complementary-slackness condition gives
    $0 = \left( \dfrac{\partial J}{\partial H} \right)_{ik} H_{ik} = (-4 W H + 4 H H^\top H)_{ik}\, H_{ik}$
    Gradient descent $H_{ik} \leftarrow H_{ik} - \varepsilon_{ik} \dfrac{\partial J}{\partial H_{ik}}$ with step size $\varepsilon_{ik} = \dfrac{H_{ik}}{4 (H H^\top H)_{ik}}$ (damped by a parameter $\beta$) yields
    $H_{ik} \leftarrow H_{ik} \left( 1 - \beta + \beta \dfrac{(W H)_{ik}}{(H H^\top H)_{ik}} \right)$
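The symmetric update is a few lines of NumPy (a sketch under our own choices of `beta`, iteration count, and toy similarity matrix; `eps` is our safeguard against division by zero):

```python
import numpy as np

def symnmf(W, k, beta=0.5, n_iter=300, eps=1e-9, seed=0):
    """Multiplicative update for min ||W - H H^T||^2 with H >= 0."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[0], k))
    for _ in range(n_iter):
        H *= 1 - beta + beta * (W @ H) / (H @ (H.T @ H) + eps)
    return H

# toy symmetric similarity matrix with two clear blocks
W = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.1, 0.9, 1.0]])
H = symnmf(W, k=2)
assert (H >= 0).all()
assert np.linalg.norm(W - H @ H.T, "fro") < np.linalg.norm(W, "fro")
```

Each row of H acts as a soft membership vector, which is the "soft clustering" advantage the summary slide refers to.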
  • 23. Summary
    – NMF is a new low-rank approximation
    – The holistic picture (vs. parts-based)
    – NMF is equivalent to spectral clustering
    – Main advantage: soft clustering