Paper digest
“Large-Scale Spectral Clustering on Graphs”
Akisato Kimura
akisato@ieee.org, @_akisato
One-page abstract
• Approx. acceleration of spectral clustering
– by introducing additional nodes that enable us to
compress the original graph,
– resulting in a bipartite graph which is
computationally efficient for spectral clustering.
• Note
– Targets large-scale spectral clustering;
works especially well for dense graphs.
– Not suitable for large-scale graph clustering,
because such graphs are sparse by nature.
Spectral clustering [Shi & Malik 1997]
• Notations
– Undirected weighted graph $G = (V, E)$
– Num. nodes $n = |V|$; num. edges $m = |E|$
– Adjacency matrix $W = (W_{i,j})_{i,j=1,2,\dots,n}$
• Objective function
– Solved by eigen-decomposition (EVD)
$\min_{X \in \mathbb{R}^{n \times k}} \mathrm{Tr}(X^T D^{-1/2} L D^{-1/2} X)$ s.t. $X^T X = I$
($D$: diagonal degree matrix of $W$, $L = D - W$: graph Laplacian of $W$, $k$: num. clusters)
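A minimal sketch of the two-cluster case, using only the Python standard library. It relies on the identity $D^{-1/2} L D^{-1/2} = I - S$ with $S = D^{-1/2} W D^{-1/2}$, so the smallest non-trivial eigenvector of the objective is the second-largest eigenvector of $S$. Power iteration with deflation stands in for a full EVD here; the function name, the start vector and the iteration count are illustrative assumptions, not taken from the paper.

```python
import math

def spectral_bipartition(W, iters=500):
    """Split a small weighted graph into two clusters.

    W: dense symmetric adjacency matrix (list of lists), all degrees > 0.
    Returns a list of 0/1 labels, one per node.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    # Normalized adjacency S = D^{-1/2} W D^{-1/2}.
    S = [[inv_sqrt[i] * W[i][j] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    # Trivial top eigenvector of S (eigenvalue 1): proportional to D^{1/2} 1.
    norm = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm for d in deg]
    # Deterministic start vector (an arbitrary choice for this sketch).
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        # Deflation: remove the component along the trivial eigenvector.
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        # Multiply by (S + I) / 2 so that all eigenvalues lie in [0, 1].
        Sv = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [0.5 * (a + b) for a, b in zip(Sv, v)]
        nrm = math.sqrt(sum(a * a for a in v))
        v = [a / nrm for a in v]
    # The sign pattern of the second eigenvector gives the two clusters.
    return [0 if x < 0 else 1 for x in v]

# Two triangles joined by one weak edge (weight 0.1).
W = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 0.1, 0, 0],
     [0, 0, 0.1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
labels = spectral_bipartition(W)
```

For $k > 2$ clusters one would take several eigenvectors and run k-means on the rows; that step is omitted here.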
Main contribution of this work
• SC needs $O(n^3)$ computations due to EVD.
• Several improvements so far.
– Compressing the adjacency matrix by Nyström
method [Fowlkes+ 2004]
– Reducing samples (= nodes) [Shinnou & Sasaki 2008] [Yan+
2009] [Sakai & Imiya 2009] [Chen & Cai 2011]
– Early stopping of EVD [Chen+ 2006] [Liu+ 2007]
• In contrast, this work
– Reducing the size of the graph.
Introducing supernodes
• Why supernodes? --- Intuition from co-clustering
– A partition of supernodes can induce a partition of the
observed nodes, and vice versa.
• Generating a set of 𝑑 ≪ 𝑛 supernodes
(Figure: original graph, with regular nodes and supernodes)
How to generate supernodes
1. Randomly choosing 𝑑 regular nodes as seeds.
2. Calculating the shortest paths from the seeds
to the other regular nodes.
i. Converting adjacencies to distances.
ii. Applying Dijkstra’s algorithm.
3. Partitioning all the regular nodes into 𝑑
disjoint subsets based on the shortest paths.
4. (Each subset corresponds to a supernode.)
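The four steps above can be sketched as a multi-source Dijkstra search that assigns every regular node to its nearest seed. This is a sketch under stated assumptions: the slide does not say how adjacencies are converted to distances, so `1 / weight` is used here as one plausible choice, and the function name, the `seeds` override and the tie-breaking are illustrative only.

```python
import heapq
import random

def generate_supernodes(adj, d, seeds=None, rng=None):
    """Partition nodes into d subsets by nearest seed.

    adj: dict mapping node -> dict of neighbor -> positive edge weight
         (undirected graph, so each edge appears in both directions).
    Returns a dict mapping node -> index of its supernode (None if unreachable).
    """
    nodes = sorted(adj)
    if seeds is None:
        # Step 1: randomly choose d regular nodes as seeds.
        rng = rng or random.Random(0)
        seeds = rng.sample(nodes, d)
    dist = {u: float("inf") for u in nodes}
    owner = {u: None for u in nodes}
    heap = []
    for s_idx, s in enumerate(seeds):
        dist[s] = 0.0
        owner[s] = s_idx
        heapq.heappush(heap, (0.0, s_idx, s))
    # Steps 2-3: Dijkstra from all seeds at once; each node keeps the seed
    # that reaches it along the shortest path.
    while heap:
        du, s_idx, u = heapq.heappop(heap)
        if du > dist[u] or owner[u] != s_idx:
            continue  # stale heap entry
        for v, w in adj[u].items():
            dv = du + 1.0 / w  # assumed adjacency-to-distance conversion
            if dv < dist[v]:
                dist[v] = dv
                owner[v] = s_idx
                heapq.heappush(heap, (dv, s_idx, v))
    # Step 4: owner[u] is the supernode that node u belongs to.
    return owner

# Path graph 0-1-2-3-4-5 with unit weights, seeds fixed for reproducibility.
adj = {i: {} for i in range(6)}
for i in range(5):
    adj[i][i + 1] = 1.0
    adj[i + 1][i] = 1.0
owner = generate_supernodes(adj, 2, seeds=[0, 5])
```

Running one search from all seeds at once gives the same nearest-seed partition as running Dijkstra separately from each seed and comparing distances, at lower cost.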
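After generating supernodes
(Figure: 𝑛 regular nodes and 𝑑 supernodes, with edge sets labeled $W$, $R$ and $\hat{W}$)
$\hat{W} = RW$
$R \in \mathbb{Z}^{d \times n}$: binary bipartite graph
$\hat{W} \in \mathbb{R}^{d \times n}$: bipartite, called a “reduced graph”,
obtained by propagating edge weights between
regular nodes and supernodes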
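A small sketch of this product, assuming 𝑅 is the binary membership matrix of the partition (row 𝑠 has a 1 for each regular node assigned to supernode 𝑠). The reduced graph is then the 𝑑 × 𝑛 matrix whose row 𝑠 sums the adjacency rows of the members of supernode 𝑠. The names `reduce_graph` and `W_red` are illustrative.

```python
def reduce_graph(W, owner, d):
    """Build R (d x n, binary) and the reduced graph W_red = R W (d x n).

    W: dense n x n adjacency matrix (list of lists).
    owner: list where owner[i] is the supernode index of regular node i.
    """
    n = len(W)
    # R[s][i] = 1 if regular node i belongs to supernode s, else 0.
    R = [[1 if owner[i] == s else 0 for i in range(n)] for s in range(d)]
    # Matrix product R W: edge weights are propagated from regular nodes
    # to the supernode that contains them.
    W_red = [[sum(R[s][i] * W[i][j] for i in range(n)) for j in range(n)]
             for s in range(d)]
    return R, W_red

# Path graph 0-1-2-3 with weights 1, 2, 1; two supernodes {0, 1} and {2, 3}.
W = [[0, 1, 0, 0],
     [1, 0, 2, 0],
     [0, 2, 0, 1],
     [0, 0, 1, 0]]
R, W_red = reduce_graph(W, owner=[0, 0, 1, 1], d=2)
```

In practice 𝑊 would be stored sparsely; the dense lists here only keep the example self-contained.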
Spectral clustering on reduced graphs
• Consider another representation of the
reduced graph
• Spectral clustering on 𝑊′
(Figures: the reduced graph over 𝑛 regular nodes and 𝑑 supernodes; result of spectral clustering on 𝑊′)
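The exact form of 𝑊′ is not shown here. A minimal sketch, assuming 𝑊′ is the standard block adjacency matrix of a bipartite graph over all 𝑛 + 𝑑 nodes and writing $\hat{W}$ for the $d \times n$ reduced graph:

```latex
W' =
\begin{pmatrix}
  0       & \hat{W}^{T} \\
  \hat{W} & 0
\end{pmatrix}
\in \mathbb{R}^{(n+d) \times (n+d)}
```

Under this assumption the zero diagonal blocks say that regular nodes link only to supernodes and supernodes only to regular nodes.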
Spectral clustering on reduced graphs
• Spectral clustering on 𝑊′ becomes
• It can be simplified further
– $y$ is also an eigenvector of $ZZ^T \in \mathbb{R}^{d \times d}$
∵ $ZZ^T y = Z(1 - \lambda)x = (1 - \lambda)^2 y$
($ZZ^T$ looks like a compressed representation of $W$.)
• Co-clustering structure
• $x$ and $y$ are right & left
singular vectors of $Z \in \mathbb{R}^{d \times n}$.
(Figure: 𝑛 regular nodes, 𝑑 supernodes)
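The practical consequence is that only a small 𝑑 × 𝑑 eigenproblem has to be solved, and the 𝑛-dimensional vector 𝑥 can be recovered from 𝑦. A minimal sketch for the top singular pair, using power iteration on 𝑍𝑍ᵀ; here `sigma` plays the role of 1 − 𝜆, and the function name and iteration count are illustrative assumptions. A full implementation would extract several vectors, not just the top one.

```python
import math

def top_singular_pair(Z, iters=200):
    """Return (sigma, x, y) with Z x = sigma y and Z^T y = sigma x.

    Z: dense d x n matrix (list of lists) with d << n in the intended use.
    """
    d, n = len(Z), len(Z[0])
    # Small d x d Gram matrix G = Z Z^T.
    G = [[sum(Z[a][j] * Z[b][j] for j in range(n)) for b in range(d)]
         for a in range(d)]
    # Power iteration for the top eigenvector y of G.
    y = [1.0] * d
    for _ in range(iters):
        y = [sum(G[a][b] * y[b] for b in range(d)) for a in range(d)]
        nrm = math.sqrt(sum(v * v for v in y))
        y = [v / nrm for v in y]
    # Rayleigh quotient gives sigma^2, the top eigenvalue of G.
    Gy = [sum(G[a][b] * y[b] for b in range(d)) for a in range(d)]
    sigma = math.sqrt(sum(a * b for a, b in zip(y, Gy)))
    # Recover the n-dimensional right singular vector: x = Z^T y / sigma.
    x = [sum(Z[a][j] * y[a] for a in range(d)) / sigma for j in range(n)]
    return sigma, x, y

Z = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0]]
sigma, x, y = top_singular_pair(Z)
# Residual of Z x = sigma y, which should be close to zero.
residual = [sum(Z[a][j] * x[j] for j in range(3)) - sigma * y[a]
            for a in range(2)]
```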
In summary
(Figure: algorithm listing, annotated “Described so far” and “Additional steps”)
Regenerating supernodes
• Intuitions
1. The matrix $U \in \mathbb{R}^{n \times k}$ implies
the current clustering.
2. Most of the nodes in the
same cluster are expected to be
densely connected.
• Method
– Selecting 𝑘 − 1 right vectors
(= with large eigenvalues)
as supernodes.
(Figure: 𝑛 regular nodes, 𝑑 supernodes and 𝑘 cluster nodes, with links labeled 𝑈 and 𝑊)
In detail
New regular-super links
Average affiliation score over all the samples.
• Resulting in (𝑘 − 1) edges from every regular node.
• Every edge stands for a binarized affiliation score.
• So, this idea can easily be extended to quantized affiliation scores of arbitrary size.
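One possible reading of the binarization step: threshold each column of 𝑈 at its average over all samples, and link a regular node to a new supernode when its score is above that average. This is a sketch under that assumption only; the function name and the strict `>` comparison are illustrative and not taken from the paper.

```python
def binarize_affiliations(U):
    """Turn affiliation scores into binary regular-to-supernode links.

    U: n x k matrix (list of lists), one column of scores per new supernode.
    Returns a k x n binary matrix: entry [c][i] is 1 when node i scores
    above the column-c average.
    """
    n, k = len(U), len(U[0])
    # Average affiliation score of each column over all samples.
    means = [sum(U[i][c] for i in range(n)) / n for c in range(k)]
    return [[1 if U[i][c] > means[c] else 0 for i in range(n)]
            for c in range(k)]

U = [[0.9, 0.1],
     [0.8, 0.2],
     [0.1, 0.7]]
links = binarize_affiliations(U)
```

Replacing the single threshold by several quantization levels would give the quantized variant mentioned above.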
Finally, the algorithm is as follows
(Figure: algorithm listing, annotated “Generating or updating supernodes” and “Small-size spectral clustering”)
(Annotation: can be replaced by a function of 𝑡, written 𝑙(𝑡))
Computational costs
1-2. $O(nd \log n)$
3-4. $O(md)$
5. $O(nd)$
6. $O(nd^2) + O(d^3)$
7-9. $O(ndk)$
Alg. 1: $O(nd \log n + md + nd^2)$
3. $O(nd \log n + m(d + 1))$
5. $O(mk)$
Alg. 2: $O(mk)$
Alg. 3: $O(nd \log n + md + mkt + n(d^2 + k^2 t))$
If $d^2 \approx k^2 t \approx \log^2 n$ → $O(n \log^2 n)$
( = modularity-based clustering)
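To get a feel for these bounds, the leading terms can be evaluated directly. This is a back-of-the-envelope sketch: constants are ignored, base-2 logarithms are an assumption, and the example sizes are made up for illustration.

```python
import math

def evd_cost(n):
    """Leading term of plain spectral clustering: O(n^3)."""
    return n ** 3

def alg1_cost(n, m, d):
    """Leading terms of Alg. 1: n d log n + m d + n d^2."""
    return n * d * math.log2(n) + m * d + n * d * d

# Illustrative sizes: n nodes, m edges, d supernodes with d = log2(n).
n, m, d = 1024, 10240, 10
full = evd_cost(n)
reduced = alg1_cost(n, m, d)
speedup = full / reduced
```

With these sizes each of the three terms of Alg. 1 equals 102,400 operations, against roughly 10^9 for the full EVD.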
Data sets for experiments
• 2 synthetic, 2 real-world.
– Syn-1k: kNN graph; 100k: 100-ins & 40-outs
– DBLP: Author network, co-conference links.
– IMDB: Movie network, co-director links.
• Looks like moderate-scale (not large-scale) graphs…
Experimental results
(Figure legend: Shortest Path (see Slide 6); Proposed (Alg. 1); Proposed (Alg. 3); Spectral Clustering; [Khoa & Chawla 2012]; [Fowlkes+ 2004])
The proposed method is
suitable for dense graphs.
(If sparse, modularity-based
clustering would be better:
$O(n \log n) \sim O(n \log^2 n)$.)
Detailed results
Performance of the proposed methods
w.r.t. parameter 𝑑 (num. supernodes).
Why not monotonically increasing?
Performance of the proposed methods
w.r.t. parameter 𝑡 (num. iterations).
Qualitative evaluations
• Toy example on Syn-1K
(Figure panels: Ground truth; k-NN graph; SP; Proposed 1; Proposed 2 (5 iterations); SC; RESC; Nyström)
Comments
• The idea and technique are interesting and
may be versatile.
• (Serialized and parallel) implementation
would be quite simple.
– MATLAB code is available at
http://jialu.cs.illinois.edu/publication
• Might be suitable only for dense graph
clustering (with features).

 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 

IJCAI13 Paper review: Large-scale spectral clustering on graphs

  • 1. Paper digest “Large-Scale Spectral Clustering on Graphs” Akisato Kimura akisato@ieee.org, @_akisato
  • 2. One-page abstract
      • Approximate acceleration of spectral clustering
          – by introducing additional nodes that let us compress the original graph,
          – resulting in a bipartite graph on which spectral clustering is computationally efficient.
      • Note
          – This is large-scale spectral clustering; it works especially well for dense graphs.
          – It is not suitable for large-scale graph clustering in general, because such graphs are sparse by nature.
  • 3. Spectral clustering [Shi & Malik 1997]
      • Notations
          – Undirected weighted graph $G = (V, E)$
          – Number of nodes $n = |V|$; number of edges $m = |E|$
          – Adjacency matrix $W = (W_{i,j})_{i,j=1,2,\dots,n}$
      • Objective function
          – Solved by eigen-decomposition (EVD)
            $\min_{X \in \mathbb{R}^{n \times k}} \mathrm{Tr}(X^T D^{-1/2} L D^{-1/2} X)$ s.t. $X^T X = I$
            ($D$: degree matrix of $W$, $L = D - W$: graph Laplacian of $W$, $k$: number of clusters)
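As a minimal sketch of the objective on slide 3 (not the paper's code): under the constraint $X^T X = I$, the trace objective is minimized by the $k$ eigenvectors of $D^{-1/2} L D^{-1/2}$ with the smallest eigenvalues. The helper name `spectral_embedding` and the toy graph are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def spectral_embedding(W, k):
    """Return the k smallest eigenpairs of D^{-1/2} L D^{-1/2} (hypothetical helper)."""
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)                      # node degrees (assumed > 0)
    L = np.diag(deg) - W                     # graph Laplacian L = D - W
    s = 1.0 / np.sqrt(deg)
    L_sym = s[:, None] * L * s[None, :]      # D^{-1/2} L D^{-1/2}
    evals, evecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    return evals[:k], evecs[:, :k]           # columns are orthonormal: X^T X = I

# Toy graph (illustrative only): two triangles joined by one weak edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

evals, X = spectral_embedding(W, k=2)
labels = (X[:, 1] > 0).astype(int)           # sign of the 2nd eigenvector splits the two triangles
```

The sign split only works for two clusters; for larger $k$ the rows of $X$ are typically clustered with k-means. The dense `eigh` call is the $O(n^3)$ step that the rest of the deck tries to avoid.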
  • 4. Main contribution of this work
      • SC needs $O(n^3)$ computations due to EVD.
      • Several improvements so far:
          – Compressing the adjacency matrix by the Nyström method [Fowlkes+ 2004]
          – Reducing samples (= nodes) [Shinnou & Sasaki 2008] [Yan+ 2009] [Sakai & Imiya 2009] [Chen & Cai 2011]
          – Early stopping of EVD [Chen+ 2006] [Liu+ 2007]
      • In contrast, this work
          – reduces the size of the graph.
  • 5. Introducing supernodes
      • Why supernodes? Intuition from co-clustering:
          – A partition of the supernodes can induce a partition of the observed nodes, and vice versa.
      • Generating a set of $d \ll n$ supernodes
      • [Figure: original graph of regular nodes, with supernodes added]
  • 6. How to generate supernodes
      1. Randomly choose $d$ regular nodes as seeds.
      2. Calculate the shortest paths from the seeds to the other regular nodes.
          i. Convert adjacencies to distances.
          ii. Apply Dijkstra’s algorithm.
      3. Partition all the regular nodes into $d$ disjoint subsets based on the shortest paths.
      4. (Each subset corresponds to a supernode.)
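The steps on slide 6 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the slide does not say how adjacencies become distances, so the sketch assumes distance = 1 / weight; it assumes "based on the shortest paths" means assigning each node to its nearest seed; and it uses a single multi-source Dijkstra pass instead of one run per seed. The name `generate_supernodes` and the toy graph are hypothetical.

```python
import heapq
import random

def generate_supernodes(adj, d, seed=0):
    """adj: dict node -> dict neighbor -> positive weight (undirected graph).

    Returns (seeds, owner), where owner[v] is the index of the nearest seed,
    or None if v cannot be reached from any seed.
    """
    rng = random.Random(seed)
    seeds = rng.sample(sorted(adj), d)        # step 1: random seeds
    dist = {v: float("inf") for v in adj}
    owner = {v: None for v in adj}
    heap = []
    for s_idx, s in enumerate(seeds):
        dist[s] = 0.0
        owner[s] = s_idx
        heap.append((0.0, s, s_idx))
    heapq.heapify(heap)
    while heap:                               # step 2: multi-source Dijkstra
        dist_u, u, s_idx = heapq.heappop(heap)
        if dist_u > dist[u]:
            continue                          # stale queue entry
        for v, w in adj[u].items():
            cand = dist_u + 1.0 / w           # assumed adjacency-to-distance conversion
            if cand < dist[v]:
                dist[v] = cand
                owner[v] = s_idx              # step 3: v joins the subset of its nearest seed
                heapq.heappush(heap, (cand, v, s_idx))
    return seeds, owner

# Toy graph (illustrative only): two triangles joined by one weak edge.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]
adj = {v: {} for v in range(6)}
for i, j, w in edges:
    adj[i][j] = w
    adj[j][i] = w

seeds, owner = generate_supernodes(adj, d=2, seed=0)
```

Each value of `owner` names one of the $d$ disjoint subsets, which is the supernode of step 4.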
  • 7. After generating supernodes
      • $\hat{W} = RW$
          – $R \in \mathbb{Z}^{d \times n}$: binary bipartite graph between the $d$ supernodes and the $n$ regular nodes
          – $\hat{W} \in \mathbb{R}^{d \times n}$: bipartite, called a “reduced graph”
      • Propagating edge weights between regular nodes and supernodes
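A small sketch of the product on slide 7, assuming that $R$ is the membership indicator of the partition from slide 6 (entry $(s, v)$ is 1 when regular node $v$ belongs to supernode $s$). The helper `reduce_graph`, the membership vector `owner`, and the toy matrix are hypothetical; NumPy is assumed to be available.

```python
import numpy as np

def reduce_graph(W, owner, d):
    """Build the binary d x n matrix R from memberships and return (R, R @ W)."""
    n = W.shape[0]
    R = np.zeros((d, n), dtype=int)
    R[owner, np.arange(n)] = 1                # R[s, v] = 1 iff node v belongs to supernode s
    return R, R @ W                           # reduced graph: d x n

# Toy graph (illustrative only): two triangles joined by one weak edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

owner = np.array([0, 0, 0, 1, 1, 1])          # assumed partition: one supernode per triangle
R, W_hat = reduce_graph(W, owner, d=2)
```

Under this assumption, entry $(s, j)$ of the product is the total edge weight between regular node $j$ and the members of supernode $s$, which is one way to read "propagating edge weights".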
  • 8. Spectral clustering on reduced graphs
      • Consider another representation of the reduced graph, $W'$, over the $n$ regular nodes and the $d$ supernodes.
      • Spectral clustering on $W'$
      • [Figure: result of spectral clustering on $W'$ ($n$ regular nodes, $d$ supernodes)]
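  • 9. Spectral clustering on reduced graphs
      • Spectral clustering on $W'$ has a co-clustering structure: with $x$ defined over the $n$ regular nodes and $y$ over the $d$ supernodes, $y$ and $x$ are left and right singular vectors of $Z \in \mathbb{R}^{d \times n}$.
      • It can be simplified further
          – $y$ is also an eigenvector of $ZZ^T \in \mathbb{R}^{d \times d}$,
            $\because\ ZZ^T y = Z(1-\lambda)x = (1-\lambda)^2 y$
            (the two steps use $Z^T y = (1-\lambda)x$ and $Zx = (1-\lambda)y$)
          – ($ZZ^T$ looks like a compressed representation of $W$.)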
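The simplification on slide 9 can be sketched as follows: eigendecompose the small $d \times d$ matrix $ZZ^T$ to get the left singular vectors, then recover the right singular vectors over the regular nodes. The transcript does not define $Z$, so this sketch assumes the usual bipartite normalization $Z = D_2^{-1/2} \hat{W} D_1^{-1/2}$, where $D_1$ and $D_2$ hold the column and row sums of the reduced graph. The helper name and the toy matrix are hypothetical; NumPy is assumed to be available.

```python
import numpy as np

def reduced_spectral_embedding(W_hat, k):
    """Top-k singular triplets of Z obtained from a d x d eigen-decomposition."""
    W_hat = np.asarray(W_hat, dtype=float)
    d1 = W_hat.sum(axis=0)                    # degrees of the n regular nodes (assumed > 0)
    d2 = W_hat.sum(axis=1)                    # degrees of the d supernodes (assumed > 0)
    # Assumed normalization: Z = D2^{-1/2} W_hat D1^{-1/2}
    Z = W_hat / np.sqrt(d2)[:, None] / np.sqrt(d1)[None, :]
    evals, evecs = np.linalg.eigh(Z @ Z.T)    # small d x d problem, ascending eigenvalues
    order = np.argsort(evals)[::-1][:k]       # keep the k largest
    sigma = np.sqrt(np.clip(evals[order], 0.0, None))
    Y = evecs[:, order]                       # d x k: vectors over the supernodes
    X = (Z.T @ Y) / sigma                     # n x k: vectors over the regular nodes (needs sigma > 0)
    return X, Y, sigma

# Toy reduced graph (illustrative only): d = 3 supernodes, n = 5 regular nodes.
W_hat = np.array([[3.0, 2.0, 0.0, 0.0, 1.0],
                  [0.0, 1.0, 4.0, 3.0, 0.0],
                  [1.0, 0.0, 0.0, 1.0, 2.0]])
X, Y, sigma = reduced_spectral_embedding(W_hat, k=2)
```

In this sketch `sigma` plays the role of $1 - \lambda$ on the slide. The eigen-decomposition acts on a $d \times d$ matrix instead of an $n \times n$ one, which is where the saving comes from when $d \ll n$.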
  • 10. In summary
      • [Algorithm listing, annotated with the steps described so far and the additional steps]
  • 11. Regenerating supernodes
      • Intuitions
          1. The matrix $U \in \mathbb{R}^{n \times k}$ implies the current clustering.
          2. Most of the nodes in the same cluster are expected to be densely connected.
      • Method
          – Select the $k - 1$ right vectors (= those with large eigenvalues) as supernodes.
      • [Figure: $n$ regular nodes, $d$ supernodes, $k$ cluster nodes; matrices $U$ and $W$]
  • 12. In detail
      • [Equation annotations: new regular-super links; average affiliation score over all the samples]
      • This results in $(k - 1)$ edges from every regular node.
      • Every edge stands for a binarized affiliation score.
      • So this idea can easily be extended to quantized affiliation scores with arbitrary sizes.
  • 13. Finally, the algorithm is as follows
      • [Algorithm listing, annotated: “Generating or updating supernodes”; “Small-size spectral clustering”; “can be replaced by a function of $t$ as $l_t$”]
  • 14. Computational costs
      • Alg. 1: $O(nd \log n + md + nd^2)$
          – Steps 1-2: $O(nd \log n)$
          – Steps 3-4: $O(md)$
          – Step 5: $O(nd)$
          – Step 6: $O(nd^2) + O(d^3)$
          – Steps 7-9: $O(ndk)$
      • Alg. 2: $O(mk)$
          – Step 5: $O(mk)$
      • Alg. 3: $O(nd \log n + md + mkt + n(d^2 + k^2 t))$
          – Step 3: $O(nd \log n + m(d + 1))$
      • If $d^2 \approx k^2 t \approx \log^2 n$ → $O(n \log^2 n)$ (the same order as modularity-based clustering)
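The last line of slide 14 can be checked term by term. The following is a worked substitution, under the extra assumption (not stated on the slide) that the number of edges satisfies $m = O(n)$; it uses $d \approx \log n$ and $kt \le k^2 t \approx \log^2 n$.

```latex
\begin{align*}
O\bigl(nd\log n + md + mkt + n(d^2 + k^2 t)\bigr)
  &= O\bigl(n\log^2 n + m\log n + m\log^2 n + n\log^2 n\bigr) \\
  &= O\bigl((n + m)\log^2 n\bigr) \\
  &= O\bigl(n\log^2 n\bigr) \qquad \text{if } m = O(n).
\end{align*}
```

Without the assumption on $m$, the terms in $m$ remain and the bound stays at $O((n + m)\log^2 n)$.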
  • 15. Data sets for experiments
      • 2 synthetic, 2 real-world.
          – Syn-1k: kNN graph; 100k: 100-ins & 40-outs
          – DBLP: author network, co-conference links.
          – IMDB: movie network, co-director links.
      • These look like moderate-scale (not large-scale) graphs…
  • 16. Experimental results
      • [Results table; compared methods: Shortest Path (see Slide 6), Proposed (Alg. 1), Proposed (Alg. 3), Spectral Clustering, [Khoa & Chawla 2012], [Fowlkes+ 2004]]
      • The proposed method is suitable for dense graphs.
        (If the graph is sparse, modularity-based clustering would be better: $O(n \log n) \sim O(n \log^2 n)$.)
  • 17. Detailed results
      • [Plot: performance of the proposed methods w.r.t. parameter $d$ (number of supernodes)] Why not monotonically increasing?
      • [Plot: performance of the proposed methods w.r.t. parameter $t$ (number of iterations)]
  • 18. Qualitative evaluations
      • Toy example on Syn-1K
      • [Figure panels: ground truth, k-NN graph, SP, Proposed 1, Proposed 2 (5 iterations), SC, RESC, Nyström]
  • 19. Comments
      • The idea and technique are interesting and may be versatile.
      • A (serial or parallel) implementation would be quite simple.
          – Matlab code is available at http://jialu.cs.illinois.edu/publication
      • Might be suitable only for dense graph clustering (with features).