Course: Machine Vision
Recognition (2)
Session 11
D5627 – I Gede Putra Kusuma Negara, B.Eng., PhD
Outline
• Unsupervised Learning
• Clustering
• K-Means Clustering
• Agglomerative Hierarchical Clustering
Unsupervised Learning
Unsupervised Learning
• Unsupervised learning tries to find hidden structure in unlabeled data. Since the
examples given to the learner are unlabeled, there is no error or reward signal
with which to evaluate a potential solution
• This distinguishes unsupervised learning from supervised learning
Unsupervised Learning
• Unsupervised learning is closely related to the problem of density
estimation in statistics
• However, unsupervised learning also encompasses many other
techniques that seek to summarize and explain key features of the
data
• Approaches to unsupervised learning include:
– Clustering (e.g., k-means, mixture models, hierarchical clustering)
– Hidden Markov models
– Blind signal separation using feature extraction techniques for dimensionality
reduction:
• principal component analysis (PCA)
• independent component analysis (ICA)
• non-negative matrix factorization
• singular value decomposition (SVD)
Unsupervised Learning
Unsupervised procedures are used for several reasons:
• Collecting and labeling a large set of sample patterns can be costly or
may not be feasible.
• One can train with a large amount of unlabeled data, and then use
supervision to label the groupings found.
• Unsupervised methods can be used for feature extraction.
• Exploratory data analysis can provide insight into the nature or
structure of the data.
Clustering
Clusters
• A cluster comprises a
number of similar objects
collected or grouped together.
• Patterns within a cluster are
more similar to each other than
are patterns in different clusters.
• Clusters may be described as
connected regions of a multi-
dimensional space containing a
relatively high density of points,
separated from other such
regions by a region containing a
relatively low density of points.
Clustering
• Clustering is an unsupervised procedure that uses unlabeled samples
• Clustering is a very difficult problem because data can reveal clusters
with different shapes and sizes
• What makes two points/images/patches similar?
The number of clusters in the data often depends on the resolution (fine vs. coarse)
with which we view the data. How many clusters do you see in this figure? 5, 8, 10,
more?
Why Clustering?
• Summarizing data
– Look at large amounts of data
– Patch-based compression or denoising
– Represent a large continuous vector with the cluster number
• Counting
– Histograms of texture, color, SIFT vectors
• Segmentation
– Separate the image into different regions
• Prediction
– Images in the same cluster may have the same labels
Image Segmentation
(example)
Clustering Algorithms
• Most clustering algorithms are based on one of the following
two popular techniques:
– Iterative squared-error partitioning,
– Agglomerative hierarchical clustering.
• One of the main challenges is selecting an appropriate measure of
similarity to define clusters; the right choice is often both data
dependent (cluster shape) and context dependent.
Squared-error Partitioning
• Suppose that the given set of n patterns has somehow been
partitioned into k clusters D1, . . . , Dk.
• Let ni be the number of samples in Di and let mi be the mean of those
samples
• Then, the sum-of-squared errors is defined by
• For a given cluster Di, the mean vector mi (centroid) is the best
representative of the samples in Di.
$$m_i = \frac{1}{n_i} \sum_{x \in D_i} x, \qquad J_e = \sum_{i=1}^{k} \sum_{x \in D_i} \| x - m_i \|^2$$
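As a minimal illustration, the centroid and the sum-of-squared error for a given partition can be computed in plain Python; the toy 2-D clusters below are made up for the example:

```python
def centroid(cluster):
    # Mean vector m_i of the samples in one cluster D_i.
    dim = len(cluster[0])
    n = len(cluster)
    return [sum(x[d] for x in cluster) / n for d in range(dim)]

def sum_of_squared_error(clusters):
    # J_e: sum over all clusters of squared distances to the cluster centroid.
    je = 0.0
    for cluster in clusters:
        m = centroid(cluster)
        for x in cluster:
            je += sum((xd - md) ** 2 for xd, md in zip(x, m))
    return je

# Toy 2-D partition with two clusters of two points each.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0), (12.0, 0.0)]]
print(sum_of_squared_error(clusters))  # each cluster contributes 2, total 4.0
```

A partition with compact, well-separated clusters yields a small J_e, which is what the iterative partitioning procedure tries to minimize.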
K-Means Clustering
A general algorithm for iterative squared-error partitioning:
1. Select an initial partition with k clusters.
Repeat steps 2 through 5 until the cluster membership stabilizes.
2. Generate a new partition by assigning each pattern to its closest cluster
center.
3. Compute new cluster centers as the centroids of the clusters.
4. Repeat steps 2 and 3 until an optimum value of the criterion function is found
(e.g., when a local minimum is found or a predefined number of iterations are
completed).
5. Adjust the number of clusters by merging and splitting existing clusters or by
removing small or outlier clusters.
This algorithm, without step 5, is also known as k-means clustering.
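The loop above (steps 1-4, omitting step 5) can be sketched in plain Python; the data points here are made up, and the initial partition is chosen by randomly selecting k points from the data:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    # Step 1: initial centers chosen by randomly selecting k data points.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Step 2: assign each pattern to its closest cluster center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # Step 3: recompute centers as cluster centroids (keep old center if empty).
        new_centers = [
            [sum(xs) / len(c) for xs in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # Step 4: stop when the membership (hence the centers) stabilizes.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Two well-separated 2-D blobs of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
```

On this data the loop converges in a few iterations to one cluster per blob, regardless of which two points are drawn as initial centers.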
K-Means Clustering (demo)
Number of samples = 30
Number of clusters = 3
K-Means Clustering
• k-means is computationally efficient and gives good results if the
clusters are compact, hyper-spherical in shape and well-separated in
the feature space.
• However, choosing k and choosing the initial partition are the main
drawbacks of this algorithm.
• The value of k is often chosen empirically or by prior knowledge
about the data.
• The initial partition is often chosen by generating k random points
uniformly distributed within the range of the data, or by randomly
selecting k points from the data.
K-Means Clustering
• The k-means algorithm produces a flat data description
where the clusters are disjoint and are at the same level.
• In some applications, groups of patterns share some
characteristics when viewed at a particular level.
• Hierarchical clustering tries to capture these multi-level
groupings using hierarchical representations rather than flat
partitions.
Hierarchical Clustering
• In hierarchical clustering, for a set of n samples,
– the first level consists of n clusters (each cluster containing exactly
one sample),
– the second level contains n − 1 clusters,
– the third level contains n − 2 clusters,
– and so on until the last (n-th) level at which all samples form a
single cluster.
• Given any two samples, at some level they will be grouped
together in the same cluster and remain together at all
higher levels.
Hierarchical Clustering
• A natural representation of hierarchical clustering is a tree, also called
a dendrogram, which shows how the samples are grouped.
• If there is an unusually large gap between the similarity values for two
particular levels, one can argue that the level with the smaller number of
clusters represents a more natural grouping.
• A dendrogram can represent the results of hierarchical clustering
algorithms.
Agglomerative
Hierarchical Clustering
1. Specify the number of clusters. Place every pattern in a unique
cluster and repeat steps 2 and 3 until a partition with the required
number of clusters is obtained.
2. Find the closest clusters according to a distance measure.
3. Merge these two clusters.
4. Return the resulting clusters.
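The procedure above can be sketched in plain Python using single linkage (the d_min measure) as the distance between clusters; the point set is a made-up toy example:

```python
def single_link(ci, cj):
    # d_min: smallest pairwise squared distance between the two clusters.
    return min(sum((a - b) ** 2 for a, b in zip(x, y)) for x in ci for y in cj)

def agglomerative(points, n_clusters):
    # Step 1: place every pattern in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Step 2: find the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        # Step 3: merge the two closest clusters.
        clusters[i].extend(clusters.pop(j))
    # Step 4: return the resulting clusters.
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
result = agglomerative(pts, n_clusters=3)
```

Each pass through the loop reduces the number of clusters by one, which is exactly the level-by-level structure a dendrogram records.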
Popular Distance Measures
• Popular distance measures (for two clusters Di and Dj):
$$
\begin{aligned}
d_{\min}(D_i, D_j) &= \min_{x \in D_i,\, x' \in D_j} \| x - x' \|, \\
d_{\max}(D_i, D_j) &= \max_{x \in D_i,\, x' \in D_j} \| x - x' \|, \\
d_{\mathrm{avg}}(D_i, D_j) &= \frac{1}{n_i n_j} \sum_{x \in D_i} \sum_{x' \in D_j} \| x - x' \|, \\
d_{\mathrm{mean}}(D_i, D_j) &= \| m_i - m_j \|
\end{aligned}
$$
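The four measures translate directly into code; this plain-Python sketch uses Euclidean distance and two small hypothetical clusters:

```python
from math import dist

def d_min(Di, Dj):
    # Smallest distance between any pair of points across the two clusters.
    return min(dist(x, xp) for x in Di for xp in Dj)

def d_max(Di, Dj):
    # Largest cross-cluster pairwise distance.
    return max(dist(x, xp) for x in Di for xp in Dj)

def d_avg(Di, Dj):
    # Average over all n_i * n_j cross-cluster pairs.
    return sum(dist(x, xp) for x in Di for xp in Dj) / (len(Di) * len(Dj))

def d_mean(Di, Dj):
    # Distance between the two cluster means m_i and m_j.
    mi = [sum(c) / len(Di) for c in zip(*Di)]
    mj = [sum(c) / len(Dj) for c in zip(*Dj)]
    return dist(mi, mj)

Di = [(0.0, 0.0), (0.0, 2.0)]
Dj = [(4.0, 0.0), (4.0, 2.0)]
```

Using d_min, d_max, or d_avg in the merge step of the previous algorithm gives single, complete, and average linkage, respectively.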
Hierarchical Clustering
(demo)
Number of cluster = 10
Number of cluster = 9
Number of cluster = 8
Number of cluster = 7
Number of cluster = 6
Number of cluster = 5
Number of cluster = 4
Number of cluster = 3
Number of cluster = 2
Number of cluster = 1
Texture Clustering (example)
• We are going to cluster texture images into 5 clusters
• Image source: 40 texture images from MIT VisTex database
– Number of classes: 5 (brick, fabric, grass, sand, stone)
• Features:
– Gray-level co-occurrence matrix (GLCM)
– Fourier power spectrum (#rings: 4, #sectors: 8)
• Clustering method:
– K-means clustering
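As a toy illustration of the GLCM feature (not the exact feature pipeline used in the example), the co-occurrence matrix counts how often gray levels i and j appear at a fixed pixel offset; the tiny image below is made up:

```python
def glcm(image, levels, dx=1, dy=0):
    # Count co-occurrences of gray levels (i, j) at offset (dx, dy),
    # then normalize so the entries sum to 1.
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < h and 0 <= c2 < w:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

img = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
P = glcm(img, levels=4)
# Contrast, a common texture feature derived from the GLCM:
contrast = sum((i - j) ** 2 * P[i][j] for i in range(4) for j in range(4))
```

Scalar features such as contrast, computed from each image's GLCM, form the feature vectors that k-means then clusters.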
Texture Images
Acknowledgment
Some of the slides in this PowerPoint presentation are adapted from
various sources; many thanks to:
1. Selim Aksoy, Department of Computer Engineering, Bilkent
University (http://www.cs.bilkent.edu.tr/~saksoy/)
2. James Hays, Computer Science Department, Brown University,
(http://cs.brown.edu/~hays/)
Thank You