DMTM Lecture 14 Density based clustering
Prof. Pier Luca Lanzi
Density Based Clustering
Data Mining and Text Mining (UIC 583 @ Politecnico di Milano)
What is density-based clustering?
• Clustering based on density (a local cluster criterion), such as density-connected points
• Major features:
  - Discovers clusters of arbitrary shape
  - Handles noise
  - One scan of the data
  - Needs density parameters as a termination condition
• Several interesting studies:
  - DBSCAN: Ester et al. (KDD’96)
  - OPTICS: Ankerst et al. (SIGMOD’99)
  - DENCLUE: Hinneburg & Keim (KDD’98)
  - CLIQUE: Agrawal et al. (SIGMOD’98) (more grid-based)
DBSCAN: Basic Concepts
• The neighborhood within a radius ε of a given object is called the
ε-neighborhood of the object
• If the ε-neighborhood of an object contains at least MinPts
objects, then the object is a core object
• An object p is directly density-reachable from object q if p is
within the ε-neighborhood of q and q is a core object
• An object p is density-reachable from object q if there is a chain
of objects p1, …, pn, where p1 = q and pn = p, such that pi+1 is
directly density-reachable from pi
• An object p is density-connected to q with respect to ε and
MinPts if there is an object o such that both p and q are
density-reachable from o
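The definitions above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: Euclidean distance, an ε-neighborhood that includes the object itself, and hypothetical helper names (eps_neighborhood, is_core, directly_reachable, reachable_from, density_connected) that do not come from the slides or from any library.

```python
from collections import deque
from math import dist


def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def is_core(points, i, eps, min_pts):
    """A core object has at least min_pts objects in its eps-neighborhood."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts


def directly_reachable(points, p, q, eps, min_pts):
    """True if p is directly density-reachable from q."""
    return is_core(points, q, eps, min_pts) and p in eps_neighborhood(points, q, eps)


def reachable_from(points, q, eps, min_pts):
    """Indices of all objects density-reachable from q."""
    if not is_core(points, q, eps, min_pts):
        return set()  # a chain must start at a core object
    seen, queue = {q}, deque([q])
    while queue:
        cur = queue.popleft()
        if not is_core(points, cur, eps, min_pts):
            continue  # a non-core object ends the chain
        for j in eps_neighborhood(points, cur, eps):
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return seen


def density_connected(points, p, q, eps, min_pts):
    """True if some object o density-reaches both p and q."""
    return any({p, q} <= reachable_from(points, o, eps, min_pts)
               for o in range(len(points)))


# Toy data (an assumption for illustration): four points on a line plus an outlier.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
eps, min_pts = 1.0, 3

print(directly_reachable(points, 0, 1, eps, min_pts))  # point 1 is a core object
print(directly_reachable(points, 1, 0, eps, min_pts))  # point 0 is not a core object
print(sorted(reachable_from(points, 1, eps, min_pts)))
print(density_connected(points, 0, 3, eps, min_pts))
```

The two directly_reachable calls show that direct density-reachability is not symmetric: it holds from a core object to an object in its neighborhood, but not the other way round when that object is not itself core. Density-connectivity restores symmetry by going through a common object o.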
DBSCAN: Basic Concepts
• Density = number of points within a specified radius (Eps)
• A border point has fewer than MinPts points within Eps,
but is in the neighborhood of a core point
• A noise point is any point that is not a core point
or a border point
• A density-based cluster is a set of density-connected objects that
is maximal with respect to density-reachability
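As a rough illustration of the three point types, the following Python sketch labels each point as core, border, or noise. It assumes Euclidean distance and counts a point as part of its own neighborhood; the function name classify_points and the toy data are hypothetical.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    # Neighborhood of each point: indices within distance eps (itself included).
    neighbors = [[j for j, q in enumerate(points) if dist(p, q) <= eps]
                 for p in points]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")  # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


# Toy data (an assumption for illustration): four points on a line plus an outlier.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
labels = classify_points(points, eps=1.0, min_pts=3)
print(labels)
```

On this data the two end points of the line are border points, the two middle points are core points, and the isolated point is noise.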
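Density-Reachable & Density-Connected
[Figure: three panels illustrating directly density-reachable, density-reachable (through an intermediate point p1), and density-connected (through a common point o) for points p and q, with MinPts = 5 and Eps = 1 cm]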
DBSCAN: Core, Border, and Noise Points
DBSCAN: Density-Based Spatial Clustering of Applications with Noise
• Relies on a density-based notion of cluster: A cluster is defined
as a maximal set of density-connected points
• Discovers clusters of arbitrary shape in spatial databases with
noise
• The algorithm:
  - Arbitrarily select a point p
  - Retrieve all points density-reachable from p given Eps and MinPts
  - If p is a core point, a cluster is formed
  - If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database
  - Continue the process until all of the points have been processed
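The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the lecture: it assumes Euclidean distance, a brute-force neighborhood query (so it is quadratic in the number of points), and a neighborhood that includes the point itself. The names dbscan, region_query, and NOISE are chosen for this example.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks noise points."""
    def region_query(i):
        # All points within eps of points[i], including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # i is a core point: start a new cluster and grow it.
        labels[i] = cluster_id
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is core, so keep expanding
        cluster_id += 1
    return labels


# Toy data (an assumption for illustration): two unit squares and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

A point first marked as noise can still end up in a cluster if a core point later reaches it, which is why the noise label is only provisional during the scan.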
Core, Border and Noise Points
[Figure: two panels, "Original Points" and "Point types: core, border and noise", with Eps = 10, MinPts = 4]
When DBSCAN Works Well
• Resistant to Noise
• Can handle clusters of different shapes and sizes
[Figure: two panels, "Original Points" and "Clusters"]
When DBSCAN May Fail
• Varying densities
• High-dimensional data
[Figure: three panels, "Original Points" and the DBSCAN results for (MinPts=4, Eps=9.75) and (MinPts=4, Eps=9.92)]
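The varying-density problem can be reproduced on a toy data set. The Python sketch below is an illustration under assumptions that are not in the slides: one-dimensional data embedded in the plane, two dense groups close to each other, one sparse group far away, MinPts = 3, and a compact brute-force DBSCAN written for this example.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Compact brute-force DBSCAN sketch; returns one label per point."""
    def region(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        for j in seeds:  # seeds grows while it is being iterated
            if labels[j] == NOISE:
                labels[j] = cluster
            elif labels[j] is None:
                labels[j] = cluster
                nb = region(j)
                if len(nb) >= min_pts:
                    seeds.extend(nb)
        cluster += 1
    return labels


def summarize(labels):
    """(number of clusters, number of noise points)."""
    return len({l for l in labels if l != NOISE}), labels.count(NOISE)


dense_a = [(float(x), 0.0) for x in range(0, 5)]          # spacing 1
dense_b = [(float(x), 0.0) for x in range(8, 13)]         # spacing 1, gap of 4 from dense_a
sparse_c = [(float(x), 0.0) for x in range(100, 121, 5)]  # spacing 5
points = dense_a + dense_b + sparse_c

results = {eps: summarize(dbscan(points, eps, min_pts=3))
           for eps in (1.5, 3.0, 4.5, 5.5)}
for eps, (n_clusters, n_noise) in results.items():
    print(eps, n_clusters, n_noise)
```

Any Eps small enough to keep the two dense groups apart (below 4 here) treats the sparse group as noise, while any Eps large enough to recover the sparse group (at least 5 here) merges the two dense groups, so no single Eps recovers all three groups.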
Run the Python notebook on density-based clustering
Examples using R
Density-Based Clustering in R
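library(fpc)  # provides dbscan()
set.seed(665544)  # fix the seed so the example is reproducible
n <- 600
# 10 random centers per coordinate in [0, 10], recycled across the n points,
# each perturbed by Gaussian noise with sd = 0.2
x <- cbind(runif(10, 0, 10) + rnorm(n, sd=0.2),
           runif(10, 0, 10) + rnorm(n, sd=0.2))
par(bg="grey40")
# eps = 0.2, MinPts left at its default; showplot=1 plots the clustering as it runs
ds <- dbscan(x, 0.2, showplot=1)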
Density-Based Clustering in R
library(fpc)
set.seed(665544)
x <- seq(0, 6.28, 0.1)  # 63 points along one period of the sine wave
y <- sin(x)
# x and y are recycled 10 times, giving 630 noisy points
xd <- x + rnorm(630, sd=0.2)
yd <- y + rnorm(630, sd=0.2)
plot(xd, yd)
par(bg="grey40")
d <- cbind(xd, yd)
# this works nicely since the epsilon is
# the same size as the standard deviation (0.2)
# used to generate the data
ds <- dbscan(d, 0.2, showplot=1)
# this does not work so nicely
ds <- dbscan(d, 0.1, showplot=1)
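One way to see why the smaller epsilon behaves worse is to count how many points still qualify as core points. The Python sketch below regenerates comparable noisy sine data and compares the two epsilon values. It is an approximation under stated assumptions: Python's random generator differs from R's, so the points are not the ones from the R run, MinPts = 5 is assumed (the default of fpc's dbscan), and count_core_points is a hypothetical helper.

```python
import math
import random


def count_core_points(points, eps, min_pts):
    """Number of points with at least min_pts points (itself included) within eps."""
    return sum(
        1 for p in points
        if sum(1 for q in points if math.dist(p, q) <= eps) >= min_pts
    )


# Same seed value as the slide, but a different generator than R's.
rng = random.Random(665544)
# 0, 0.1, ..., 6.2 repeated 10 times: 630 values, mirroring R's recycling.
xs = [0.1 * i for i in range(63)] * 10
points = [(x + rng.gauss(0, 0.2), math.sin(x) + rng.gauss(0, 0.2)) for x in xs]

core_02 = count_core_points(points, eps=0.2, min_pts=5)
core_01 = count_core_points(points, eps=0.1, min_pts=5)
print(len(points), core_02, core_01)
```

Shrinking epsilon can only shrink each neighborhood, so the number of core points never increases. With fewer core points, the chain of density-reachable points along the sine curve is more likely to break into fragments or dissolve into noise.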
Clustering Comparisons on Sin Data
[Figure: two panels, "hierarchical clustering" and "kmeans clustering"]
Clustering Comparisons on Sin Data
(k-means with 10 clusters)
Software Packages
http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/Density-Based_Clustering
