PARTITIONAL & HIERARCHICAL CLUSTERING
KS141321 INTELLIGENT SYSTEMS
Material: Week 11
Department of Information Systems
ITS
By: Irmasari Hafidz
OUTLINE
1. Partitional Clustering: K-Means
• Pseudocode of K-Means
• Example
• K-Means Performance Evaluation
2. Hierarchical Clustering
• Cluster Distance Measures
• Agglomerative Algorithm
• Example
PARTITIONAL CLUSTERING: K-MEANS
K-MEANS
K-Means is one of the best-known partitional clustering algorithms.
Advantage: its computation is simple.
Disadvantage: cluster quality depends on the choice of the initial centroids and on the value of k.
The parameter k is the number of clusters to be formed; its value is fixed in advance.
• k initial centroids are defined
• The initial centroids are initialized randomly
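As an illustration of random initialization, the sketch below picks k distinct data points as initial centroids. This is only one common choice and not necessarily the one used in the lecture; the function name `init_centroids` and the `seed` parameter are hypothetical.

```python
import random

def init_centroids(points, k, seed=None):
    """Pick k distinct data points as initial centroids (one common choice)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)   # a fixed seed makes the choice reproducible
    return rng.sample(points, k)

points = [(1, 1), (2, 1), (4, 3), (5, 4)]
centroids = init_centroids(points, k=2, seed=0)
```

Because the result depends on this random choice, running k-means from several different seeds and comparing the results is a common precaution.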
K-MEANS (PSEUDOCODE)
Grouping the data into k clusters takes several iterations.
Iteration stops when the centroids no longer change, or when every data point stays in the same cluster in subsequent iterations.
K-MEANS
If attribute i (1 ≤ i ≤ n) is numeric, component i of the centroid is the mean of that attribute's values over the cluster.
If attribute i (1 ≤ i ≤ n) is categorical, component i of the centroid is the mode of that attribute's values over the cluster.
Example: k-means clustering with k = 3 and three centroids m1, m2, m3.
Each cluster is associated with one centroid.
Each data point is assigned to the cluster with the nearest centroid.
A centroid is an n-dimensional vector, where n is the number of attributes of each data point.
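The mean/mode rule for centroid components can be sketched as follows. The data and the helper name `mixed_centroid` are made up for illustration; the sketch treats any column containing only numbers as numeric and everything else as categorical.

```python
from statistics import mean, mode

def mixed_centroid(rows):
    """Centroid of a cluster: mean for numeric columns, mode for categorical ones."""
    centroid = []
    for values in zip(*rows):   # iterate over attributes (columns)
        if all(isinstance(v, (int, float)) for v in values):
            centroid.append(mean(values))
        else:
            centroid.append(mode(values))
    return tuple(centroid)

# Hypothetical cluster with two numeric attributes and one categorical attribute.
cluster = [(1.0, 2.0, "red"), (3.0, 4.0, "blue"), (5.0, 6.0, "red")]
centroid = mixed_centroid(cluster)
```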
K-MEANS ALGORITHM
Given the number of clusters K, the K-means algorithm is carried out in 5 steps:
1. Choose k
2. Choose k initial centroids (set seed points)
3. Assign each data point to the cluster with the nearest centroid (minimum distance)
4. Update the centroid of each cluster (the centroid is the center, i.e., the mean point, of the cluster)
5. Go back to step 3; the iteration stops when there are no more new assignments
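The steps above can be sketched in a few lines of Python. This is a minimal illustration for numeric attributes with Euclidean distance, not a reference implementation; the function name `kmeans`, the `max_iter` safeguard and the sample data are assumptions, and `math.dist` requires Python 3.8 or later.

```python
from math import dist

def kmeans(points, centroids, max_iter=100):
    """Plain k-means on numeric tuples, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:   # Step 5: no new assignment, stop.
            break
        assignment = new_assignment
        # Step 4: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:   # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids

# Hypothetical data: two well-separated groups, seeded with one point from each.
data = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centers = kmeans(data, centroids=[(0, 0), (10, 10)])
```

Keeping the old centroid for an empty cluster is a design choice of this sketch; other implementations re-seed the empty cluster instead.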
PROBLEM EXAMPLE
Suppose we have 4 types of medicine, each with two attributes (weight and pH index). Our goal is to group these objects into K = 2 groups of medicine.
Medicine   Weight   pH-Index
A          1        1
B          2        1
C          4        3
D          5        4
[Figure: plot of the four medicines A, B, C and D]
EXAMPLE
Step 1: Use initial seed points for partitioning
$$c_1 = A = (1, 1), \qquad c_2 = B = (2, 1)$$
Assign each object to the cluster with the nearest seed point, using Euclidean distance. For object D = (5, 4):
$$d(D, c_1) = \sqrt{(5-1)^2 + (4-1)^2} = 5$$
$$d(D, c_2) = \sqrt{(5-2)^2 + (4-1)^2} = 4.24$$
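The same distance computation can be repeated for all four medicines. The short sketch below uses the table values and the seeds c1 = A, c2 = B; the variable names are illustrative only.

```python
from math import dist

medicines = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
seeds = {"c1": medicines["A"], "c2": medicines["B"]}

# Euclidean distance from every medicine to every seed point.
distances = {
    name: {seed: dist(point, center) for seed, center in seeds.items()}
    for name, point in medicines.items()
}
# Each medicine joins the cluster of its nearest seed.
assignment = {name: min(d, key=d.get) for name, d in distances.items()}
```

With these seeds only A is assigned to c1, while B, C and D are assigned to c2.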
EXAMPLE
Step 2: Compute new centroids of the current partition
Knowing the members of each cluster, we now compute the new centroid of each group based on these new memberships.
$$c_1 = (1, 1)$$
$$c_2 = \left(\frac{2+4+5}{3}, \frac{1+3+4}{3}\right) = \left(\frac{11}{3}, \frac{8}{3}\right) = (3.67, 2.67)$$
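As a quick check of this centroid update, the sketch below averages the members of each cluster attribute by attribute. The helper name `centroid` is hypothetical; cluster 1 contains only A and cluster 2 contains B, C and D.

```python
def centroid(members):
    """Mean point of a cluster: average each attribute over the members."""
    n = len(members)
    return tuple(sum(column) / n for column in zip(*members))

cluster1 = [(1, 1)]                    # A
cluster2 = [(2, 1), (4, 3), (5, 4)]    # B, C, D
c1 = centroid(cluster1)
c2 = centroid(cluster2)
```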
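EXAMPLE
Step 2: Renew membership based on new centroids
Compute the distance of all objects to the new centroids.
Assign the membership to objects. B is now closer to c1 (distance 1) than to c2 (distance 2.36), so cluster 1 becomes {A, B} and cluster 2 becomes {C, D}.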
EXAMPLE
Step 3: Repeat the first two steps until convergence
Knowing the members of each cluster, we now compute the new centroid of each group based on these new memberships.
$$c_1 = \left(\frac{1+2}{2}, \frac{1+1}{2}\right) = \left(1\tfrac{1}{2}, 1\right)$$
$$c_2 = \left(\frac{4+5}{2}, \frac{3+4}{2}\right) = \left(4\tfrac{1}{2}, 3\tfrac{1}{2}\right)$$
EXAMPLE
Step 3: Repeat the first two steps until convergence
Compute the distance of all objects to the new centroids.
Stop, because no object changes cluster.
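The stopping test can be sketched as a comparison between the previous and the recomputed assignment. The centroids (1.5, 1) and (4.5, 3.5) and the previous memberships {A, B} and {C, D} come from the example; the variable names are illustrative.

```python
from math import dist

points = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
centroids = [(1.5, 1.0), (4.5, 3.5)]
previous = {"A": 0, "B": 0, "C": 1, "D": 1}   # memberships before this pass

# Reassign every object to the index of its nearest centroid.
current = {
    name: min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
    for name, p in points.items()
}
converged = current == previous   # unchanged memberships mean convergence
```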
K-MEANS PERFORMANCE EVALUATION
• K-Means clustering performance can be evaluated with the Sum of Squared Errors (SSE). The main idea of SSE is to measure how homogeneous the data within each cluster are.
• Homogeneity is measured by the error, i.e. the distance between each data point and its centroid. The more homogeneous the data in a cluster, the smaller the distance between each data point and its centroid.
• The errors of each cluster are then summed over all clusters (Sum of Squared Errors, SSE). When comparing results with the same k, the smaller the SSE, the better the clustering result.
K-MEANS PERFORMANCE EVALUATION
$$\mathrm{SSE} = \sum_{i=1}^{K} \sum_{x \in C_i} \operatorname{dist}(m_i, x)^2$$
K = number of clusters
Ci = the i-th cluster
mi = centroid of the i-th cluster
x = a data point belonging to cluster Ci
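A small sketch of the SSE computation, applied to the final clusters of the medicine example ({A, B} with centroid (1.5, 1) and {C, D} with centroid (4.5, 3.5)). The function names are hypothetical and Euclidean distance is assumed.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def sse(clusters, centroids):
    """Sum over all clusters of the squared distances to the cluster centroid."""
    return sum(
        squared_distance(m, x)
        for members, m in zip(clusters, centroids)
        for x in members
    )

clusters = [[(1, 1), (2, 1)], [(4, 3), (5, 4)]]
centroids = [(1.5, 1.0), (4.5, 3.5)]
total = sse(clusters, centroids)   # 0.25 + 0.25 + 0.5 + 0.5 = 1.5
```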
CLUSTERING WITH WEKA
HIERARCHICAL CLUSTERING
INTRODUCTION
Hierarchical Clustering Approach
• A typical clustering analysis approach that partitions the data set sequentially
• Constructs nested partitions layer by layer by grouping objects into a tree of clusters (without needing to know the number of clusters in advance)
• Uses a distance matrix as the clustering criterion; a termination condition is needed
Agglomerative vs. Divisive
• Two sequential clustering strategies for constructing a tree of clusters
• Agglomerative: a bottom-up strategy
  - Initially each data object is in its own (atomic) cluster
  - Then these atomic clusters are merged into larger and larger clusters
• Divisive: a top-down strategy
  - Initially all objects are in one single cluster
  - Then the cluster is subdivided into smaller and smaller clusters
INTRODUCTION
Illustrative Example
Agglomerative and divisive clustering on the data set {a, b, c, d, e}
• Cluster distance
• Termination condition
[Figure: agglomerative clustering runs from Step 0 to Step 4, merging the single objects a, b, c, d, e into {a, b}, {d, e}, {c, d, e} and finally {a, b, c, d, e}; divisive clustering goes through the same partitions in the opposite direction.]
CLUSTER DISTANCE MEASURES
Single link (min): smallest distance between an element in one cluster and an element in the other, i.e., d(Ci, Cj) = min{d(xip, xjq)}
Complete link (max): largest distance between an element in one cluster and an element in the other, i.e., d(Ci, Cj) = max{d(xip, xjq)}
Average: average distance between elements in one cluster and elements in the other, i.e., d(Ci, Cj) = avg{d(xip, xjq)}
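The three measures differ only in how the pairwise distances between the two clusters are aggregated. The sketch below makes that explicit; the function names and the two sample clusters are made up, and Euclidean distance is assumed for d.

```python
from itertools import product
from math import dist

def pairwise(ci, cj):
    """All distances d(x_ip, x_jq) between an element of ci and an element of cj."""
    return [dist(p, q) for p, q in product(ci, cj)]

def single_link(ci, cj):
    return min(pairwise(ci, cj))

def complete_link(ci, cj):
    return max(pairwise(ci, cj))

def average_link(ci, cj):
    d = pairwise(ci, cj)
    return sum(d) / len(d)

# Hypothetical clusters on a line, so the pairwise distances are 3, 5, 2 and 4.
ci = [(0, 0), (1, 0)]
cj = [(3, 0), (5, 0)]
```

For these clusters the single link distance is 2, the complete link distance is 5 and the average distance is 3.5.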
AGGLOMERATIVE ALGORITHM
The agglomerative algorithm is carried out in three steps:
1) Convert the object attributes to a distance matrix
2) Set each object as a cluster (thus if we have N objects, we will have N clusters at the beginning)
3) Repeat until the number of clusters is one (or a known number of clusters)
• Merge the two closest clusters
• Update the distance matrix
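Steps 1 and 2, and the search for the first pair to merge, might look as follows in Python. The three objects p, q, r are hypothetical, the distance matrix is stored as a nested dictionary for readability, and Euclidean distance is assumed.

```python
from itertools import combinations
from math import dist

objects = {"p": (0, 0), "q": (0, 1), "r": (4, 0)}   # hypothetical data
names = list(objects)

# Step 1: convert the object attributes to a distance matrix.
matrix = {a: {b: dist(objects[a], objects[b]) for b in names} for a in names}

# Step 2: every object starts as its own cluster.
clusters = [frozenset([name]) for name in names]

# First pass of step 3: the closest pair is the first merge candidate.
closest = min(combinations(names, 2), key=lambda pair: matrix[pair[0]][pair[1]])
```

Here p and q are at distance 1, so they would be merged first.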
EXAMPLE AND DEMO
Problem: clustering analysis with the agglomerative algorithm
[Figure: data matrix and the corresponding distance matrix (Euclidean distance)]
Merge the two closest clusters (iteration 1)
Update the distance matrix (iteration 1)
Merge the two closest clusters (iteration 2)
Update the distance matrix (iteration 2)
Merge the two closest clusters and update the distance matrix (iteration 3)
Merge the two closest clusters and update the distance matrix (iteration 4)
Final result (meeting the termination condition)
Dendrogram tree representation
1. In the beginning we have 6 clusters: A, B, C, D, E and F
2. We merge clusters D and F into cluster (D, F) at distance 0.50
3. We merge clusters A and B into (A, B) at distance 0.71
4. We merge clusters E and (D, F) into ((D, F), E) at distance 1.00
5. We merge clusters ((D, F), E) and C into (((D, F), E), C) at distance 1.41
6. We merge clusters (((D, F), E), C) and (A, B) into ((((D, F), E), C), (A, B)) at distance 2.50
7. The last cluster contains all the objects, which concludes the computation
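The merge sequence above can be reproduced with a short single-link sketch. Two things here are assumptions and not part of the slide text: the coordinates of A to F (the data matrix was only shown as an image) and the use of single link, both chosen because together they yield exactly the merge distances listed above.

```python
from itertools import combinations
from math import dist

# ASSUMED coordinates: not given in the slide text, chosen to match the listed distances.
points = {"A": (1, 1), "B": (1.5, 1.5), "C": (5, 5),
          "D": (3, 4), "E": (4, 4), "F": (3, 3.5)}

def single_link(ci, cj):
    """Smallest Euclidean distance between a member of ci and a member of cj."""
    return min(dist(points[p], points[q]) for p in ci for q in cj)

def agglomerate(names):
    """Merge the two closest clusters until one remains; return the merge log."""
    clusters = [(name,) for name in names]   # each object starts as its own cluster
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: single_link(*pair))
        merges.append((a, b, round(single_link(a, b), 2)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

merges = agglomerate(sorted(points))
merge_distances = [d for _, _, d in merges]   # 0.5, 0.71, 1.0, 1.41, 2.5
```

Recomputing the linkage from the raw points on every pass keeps the sketch short; updating a stored distance matrix after each merge, as in the algorithm description, avoids that repeated work on larger data sets.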
CLUSTERING IN R
library mva (in current R versions these functions are in the stats package):
- Hierarchical clustering: hclust, heatmap
- k-means: kmeans
library class:
- Self-organizing maps: SOM
library cluster:
- pam and other functions
ASSIGNMENT T2: K-MEANS & HIERARCHICAL CLUSTERING (SECTION 4)
Individual work
To be done on folio/A4 paper (handwritten)
To be submitted next week (Week 12), 21 April 2015
NEXT WEEK (WEEK 12)
UNSUPERVISED LEARNING: ASSOCIATION RULE
BAYES THEOREM
Final Project: any topic (from Weeks 1-14) using R; report and demo in Week 15
FP: groups of 3-4 people
Neural Network
Clustering
Bayesian
Association Rule
REFERENCES
Flach, Peter. 2012. Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press.
Tan et al. 2006. Introduction to Data Mining. Addison Wesley.
Ke Chen, University of Manchester. COMP24111 Machine Learning. http://www.cs.man.ac.uk/~kechen/teaching.php
Wikibooks. K-Means Example. http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/K-Means
