Spatial Data Mining
Presented by: Rajkumar Jain
M.Tech (C.S.E.), 1st year (2nd sem)
Overview
• What is spatial data?
• What makes spatial data mining different?
• Spatial data mining tasks
• Spatial data properties
• Cluster analysis
• Trend analysis
• Future scope
What is Spatial Data?
• Object types:
– points
– lines
– polygons
– etc.
• Used in/for:
– GIS (Geographic Information Systems)
– GPS (Global Positioning System)
– Environmental studies
– etc.
Introduction
• Spatial data mining is the process of discovering interesting, useful, non-trivial patterns in large spatial datasets.
– E.g. determining hotspots: unusual locations.
• Spatial Data Mining Tasks
– Characteristic rules.
– Discriminant rules. E.g. comparing the price ranges of different geographical areas.
– Association rules: associate non-spatial attributes with spatial attributes, or spatial attributes with other spatial attributes.
– Clustering: groups similar objects and helps detect outliers, which is useful for finding suspicious patterns. E.g. grouping crime locations.
– Classification: determines whether a spatial entity belongs to a particular class, or how many classes the entities fall into. E.g. classifying remotely sensed images based on spectral and GIS data.
– Trend detection: a trend is a temporal pattern in time series data. A spatial trend is a regular change in a non-spatial attribute as one moves away from a spatial object through its neighbours.
• Properties of Spatial Data
– Spatial autocorrelation
– Spatial heterogeneity
– Implicit spatial relations
Heterogeneity of Spatial Data
• Autocorrelation.
• Patterns usually have to be defined in the spatial attribute subspace rather than in the complete attribute space.
• Longitude and latitude (or other coordinate systems) are the glue that links different data collections together.
• People are used to maps in GIS; therefore, data mining results have to be summarized on top of maps.
• Patterns refer not only to points but also to lines, polygons, or other higher-order geometrical objects.
Autocorrelation
• Items in traditional data are assumed to be independent of each other,
– whereas properties of locations on a map are often “auto-correlated”.
• First law of geography [Tobler]:
– Everything is related to everything else, but nearby things are more related than distant things.
– People with similar backgrounds tend to live in the same area.
– Economies of nearby regions tend to be similar.
– Changes in temperature occur gradually over space.
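One common way to put a number on autocorrelation is Moran's I, which the slides do not name; it is used here only as an illustration. The sketch below assumes a hypothetical binary neighbour matrix `weights` (1 if two locations are adjacent, 0 otherwise) and made-up attribute values.

```python
def morans_i(values, weights):
    """Moran's I for attribute `values` under a square neighbour matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring pairs.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Five locations along a line; each one neighbours the adjacent locations.
weights = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
smooth = [10, 11, 12, 13, 14]       # values change gradually over space
alternating = [10, 14, 10, 14, 10]  # neighbours are dissimilar

print(morans_i(smooth, weights))       # positive: nearby values are alike
print(morans_i(alternating, weights))  # negative: nearby values differ
```

A positive value indicates that neighbouring locations tend to carry similar values, which is the situation Tobler's law describes.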
Spatial Relations
• Spatial databases do not store spatial relations explicitly
– Additional functionality is required to compute them
• Three types of spatial relations are specified by the OGC reference model
– Distance relations: Euclidean distance between two spatial features
– Direction relations: ordering of spatial features in space
– Topological relations: characterise the type of intersection between spatial features
Distance relations
• If dist is a distance function and c is some real number, A and B can be related in three ways:
1. dist(A,B) > c,
2. dist(A,B) < c, and
3. dist(A,B) = c
[Figure: features A and B shown at three different separations]
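The three cases can be sketched for point features as follows. The function name `distance_relation` and the use of plain coordinate tuples are assumptions made for illustration; real spatial features would need a distance function defined on lines and polygons as well.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def distance_relation(a, b, c):
    """Classify dist(a, b) against the threshold c as '>', '<' or '='."""
    d = dist(a, b)
    if math.isclose(d, c):
        return "="
    return ">" if d > c else "<"

print(distance_relation((0, 0), (3, 4), 5))   # dist is exactly 5
print(distance_relation((0, 0), (3, 4), 2))   # farther apart than 2
print(distance_relation((0, 0), (3, 4), 10))  # closer than 10
```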
Direction relations
• Suppose the directions of B and C are required with respect to A
• Define a representative point, rep(A)
• rep(A) defines the origin of a virtual coordinate system
• The quadrants and half planes define the direction relations
• B can have two values {northeast, east}
• The exact direction relation is northeast
[Figure: A with rep(A) as origin, C above it and B to its upper right]
– C north A
– B northeast A
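A minimal sketch of this idea, under stated assumptions: B is given as a list of vertex coordinates, a relation holds only if every vertex satisfies it, and the "exact" relation is taken to be the most specific one (a quadrant in preference to a half plane). The function names are hypothetical, and because the geometry of the slide's figure is not recoverable, the sets reported here need not match it.

```python
QUADRANTS = ("northeast", "northwest", "southeast", "southwest")

def direction_relations(rep_a, b_points):
    """Return every direction relation that holds for all points of B."""
    ox, oy = rep_a
    tests = {
        "north": lambda x, y: y > oy,
        "south": lambda x, y: y < oy,
        "east": lambda x, y: x > ox,
        "west": lambda x, y: x < ox,
        "northeast": lambda x, y: x > ox and y > oy,
        "northwest": lambda x, y: x < ox and y > oy,
        "southeast": lambda x, y: x > ox and y < oy,
        "southwest": lambda x, y: x < ox and y < oy,
    }
    return {name for name, holds in tests.items()
            if all(holds(x, y) for x, y in b_points)}

def exact_direction(relations):
    """Pick the most specific relation: a quadrant if one holds, else a half plane."""
    for name in QUADRANTS:
        if name in relations:
            return name
    return next(iter(relations), None)

rep_a = (0, 0)
b = [(2, 1), (3, 1), (3, 2)]     # entirely in the upper-right quadrant
c = [(-1, 3), (1, 3), (1, 4)]    # above rep(A) but straddling the vertical axis

print(exact_direction(direction_relations(rep_a, b)))  # northeast
print(exact_direction(direction_relations(rep_a, c)))  # north
```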
Topological Relations
• Topological relations describe how geometries intersect spatially.
• Simple geometry types
– Point, 0-dimension
– Line, 1-dimension
– Polygon, 2-dimension
• Each geometry is represented in terms of its
– boundary (B): geometry of the lower dimension
– interior (I): points of the geometry when the boundary is removed
– exterior (E): points not in the interior or boundary
DE-9IM
• Topological relations are defined using one of the following models
– 4IM, four-intersection model (only B and I considered)
– 9IM, nine-intersection model (B, I, and E)
– DE-9IM, dimensionally extended nine-intersection model
• Dim is the dimension function, applied to each of the nine intersections, e.g. dim(I(A) ∩ I(B))
Example
• Consider two polygons
– A: POLYGON ((10 10, 15 0, 25 0, 30 10, 25 20, 15 20, 10 10))
– B: POLYGON ((20 10, 30 0, 40 10, 30 20, 20 10))
[Figure: 9-intersection matrix of the example geometries, with rows I(A), B(A), E(A) and columns I(B), B(B), E(B)]
DE-9IM for the example geometries
       I(B)  B(B)  E(B)
I(A)    2     1     2
B(A)    1     0     1
E(A)    2     1     2
Relationships using DE-9IM
• Different geometries may give rise to different numbers in the DE-9IM
• For a specific type of relationship we are only interested in certain values in certain positions
– That is, we are interested in patterns in the matrix rather than actual values
• Actual values are replaced by wild cards
– T: value is "true" (non-empty, any dimension >= 0)
– F: value is "false" (empty, dimension < 0)
– *: don't care what the value is
– 0: value is exactly zero
– 1: value is exactly one
– 2: value is exactly two
• Pattern for "A overlaps B":
       I(B)  B(B)  E(B)
I(A)    T     *     T
B(A)    *     *     *
E(A)    T     *     *
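Matching a matrix against such a pattern can be sketched in a few lines. The sketch assumes both are written as 9-character strings in row-major order (row I(A) first, then B(A), then E(A)), with empty intersections written as F; the function name `matches` is hypothetical.

```python
def matches(matrix, pattern):
    """Check a 9-character DE-9IM matrix string against a wild-card pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("matrix and pattern must each have 9 characters")
    for value, want in zip(matrix, pattern):
        if want == "*":
            continue                  # don't care
        if want == "T":
            if value not in "012":    # any non-empty intersection
                return False
        elif value != want:           # 'F', '0', '1', '2' must match exactly
            return False
    return True

OVERLAPS = "T*T***T**"   # the "A overlaps B" pattern from the slide
DISJOINT = "FF*FF****"   # assumption: the usual pattern for disjoint geometries

example = "212101212"    # DE-9IM of the example polygons A and B
print(matches(example, OVERLAPS))   # True
print(matches(example, DISJOINT))   # False
```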
Cluster analysis
• Cluster analysis divides data into meaningful or useful groups (clusters). It is very useful in spatial databases: for example, grouping feature vectors into clusters can be used to create thematic maps, which are useful in geographic information systems.
• CLUSTERING METHODS FOR SPATIAL DATA MINING
1. Partitioning Around Medoids (PAM)
• PAM is similar to the k-means algorithm. Both divide a data set into groups, but PAM is based on medoids whereas k-means is based on centroids. Using medoids, we can reduce the dissimilarity of objects within a cluster. PAM first selects the medoids and then assigns each object to its nearest medoid, which forms the clusters.
• Let $i$ be an object and $v_i$ the cluster it belongs to. Then $i$ is at least as near to its own medoid $m_{v_i}$ as to any other medoid $m_w$: $d(i, m_{v_i}) \le d(i, m_w)$ for $w = 1, 2, \dots, k$.
• The k representative objects should minimize the objective function, which is the sum of the dissimilarities of all objects to their nearest medoid: Objective function $= \sum_{i} d(i, m_{v_i})$
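The objective function and the medoid-swapping idea can be sketched as below. This is a simplified illustration, not a full PAM implementation: it assumes Euclidean distance as the dissimilarity, starts from the first k objects rather than a proper initial selection, and accepts any swap that lowers the cost. The names `total_cost` and `pam` are hypothetical.

```python
import math

def total_cost(points, medoids):
    """Objective: sum of distances from every object to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k):
    """Greedy medoid swapping; medoids are indices into `points`."""
    medoids = list(range(k))          # assumption: naive initialisation
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing one medoid by a non-medoid object.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Assign each object to its nearest medoid.
    labels = [min(medoids, key=lambda m: math.dist(p, points[m])) for p in points]
    return medoids, labels, best

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = pam(points, k=2)
print(sorted(medoids), cost)   # [0, 3] 4.0
```

Because every candidate swap re-evaluates the full objective, this sketch is only practical for small data sets.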
2. Clustering Large Applications (CLARA)
• Compared to PAM, CLARA can deal with much larger data sets. Like PAM, CLARA finds objects that are centrally located in the clusters. The main problem with PAM is that it holds the entire dissimilarity matrix at once, so for n objects its space complexity is $O(n^2)$. CLARA avoids this problem: it accepts only the actual measurements (i.e., the $n \times p$ data matrix) and clusters a sample of the objects.
• CLARA assigns objects to clusters in the following way:
– BUILD step: select k "centrally located" objects to be used as initial medoids, chosen so that the average distance between the objects and their medoids is as small as possible.
– SWAP step: try to decrease the average distance between the objects and the medoids by replacing representative objects.
– Finally, each object that does not belong to the sample is assigned to its nearest medoid.
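The sampling idea can be sketched as follows. To keep the example short, the medoids of each sample are found by exhaustive search, which stands in for the BUILD and SWAP steps and is feasible only for tiny samples. The function names, the number of samples, and the Euclidean dissimilarity are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations

def cost(points, medoids):
    """Sum of distances from each object to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def best_medoids(sample, k):
    """Stand-in for PAM on the sample: try every set of k objects."""
    return min(combinations(sample, k), key=lambda ms: cost(sample, ms))

def clara(points, k, sample_size, n_samples=5, seed=0):
    """Cluster several samples and keep the medoids that fit the full data best."""
    rng = random.Random(seed)
    best, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(points, sample_size)
        medoids = best_medoids(sample, k)
        c = cost(points, medoids)     # evaluated on all objects, not just the sample
        if c < best_cost:
            best, best_cost = medoids, c
    # Objects outside the sample are assigned to their nearest medoid too.
    labels = [min(range(k), key=lambda j: math.dist(p, best[j])) for p in points]
    return best, labels, best_cost

group_a = [(i, 0) for i in range(10)]
group_b = [(100 + i, 100) for i in range(10)]
medoids, labels, total = clara(group_a + group_b, k=2, sample_size=11)
print(medoids, total)
```

Only the sample is ever compared pairwise, so memory use depends on the sample size rather than on n.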
Trend analysis
• Spatial trend: a regular change of one or more non-spatial attributes when moving away from a given object.
E.g. when we move eastward away from the cyber tower, the rent of residential houses decreases at a rate of approximately 5% per km.
• A trend is identified along a neighborhood path starting from a location O: regression analysis is performed on the respective attribute values of the objects on the path to describe the regularity of change.
• There are two algorithms, one to determine global trends and one to determine local trends.
• Global trend: considering all the objects on all paths starting from O, the values of the specified attribute generally tend to increase or decrease with increasing distance from O.
• Local trend: considers a single path starting from an object O and having a certain trend. E.g. some paths may show a positive trend while others show a negative one.
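The difference between local and global trends can be sketched with an ordinary least-squares fit of attribute value against distance from O. The two paths and their rent values are invented for illustration, and `fit_trend` is a hypothetical helper; a global trend is approximated here by pooling the observations of all paths into one regression.

```python
import math

def fit_trend(pairs):
    """Least-squares line through (distance, value) pairs; returns (slope, r)."""
    n = len(pairs)
    mx = sum(d for d, _ in pairs) / n
    my = sum(v for _, v in pairs) / n
    sxx = sum((d - mx) ** 2 for d, _ in pairs)
    syy = sum((v - my) ** 2 for _, v in pairs)
    sxy = sum((d - mx) * (v - my) for d, v in pairs)
    if sxx == 0 or syy == 0:
        return 0.0, 0.0               # no variation, so no trend can be fitted
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# (distance from O in km, rent) along two neighborhood paths starting at O.
east = [(0, 1000), (1, 950), (2, 900), (3, 850)]
north = [(0, 1000), (1, 1020), (2, 1040), (3, 1060)]

local_east = fit_trend(east)            # negative slope along this path
local_north = fit_trend(north)          # positive slope along this path
global_trend = fit_trend(east + north)  # all paths pooled together

print(local_east, local_north, global_trend)
```

In this made-up data each path has a clear local trend, while the pooled fit is weak because the two paths pull in opposite directions.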
Spatial trend detection
• Let g be a neighborhood graph, O an object in g, and a a non-spatial attribute whose change we want to detect as we move away from O in the neighborhood graph.
• Let a filter indicate the subset of neighbors to be taken into consideration.
• Let min_conf be a real number.
• Let min_length and max_length be natural numbers; the length of the paths considered must lie between them.
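How these parameters might fit together is sketched below. Several points are assumptions not stated on the slide: min_conf is treated as a threshold on the absolute correlation of the regression, path length is counted in edges, and the filter keeps only neighbours that lie farther from O than the current end of the path. All names and data are hypothetical.

```python
import math

def fit(xs, ys):
    """Least-squares slope and correlation of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

def local_trends(graph, coords, attr, start, min_conf, min_length, max_length):
    """Return (path, slope, r) for paths from `start` with |r| >= min_conf."""
    results = []

    def moving_away(path, nxt):
        # Assumed filter: only follow neighbours farther from the start object.
        return math.dist(coords[start], coords[nxt]) > math.dist(coords[start], coords[path[-1]])

    def extend(path):
        length = len(path) - 1        # number of edges on the path
        if min_length <= length <= max_length:
            xs = [math.dist(coords[start], coords[o]) for o in path]
            ys = [attr[o] for o in path]
            slope, r = fit(xs, ys)
            if abs(r) >= min_conf:
                results.append((path, slope, r))
        if length < max_length:
            for nxt in graph[path[-1]]:
                if nxt not in path and moving_away(path, nxt):
                    extend(path + [nxt])

    extend([start])
    return results

coords = {"O": (0, 0), "a": (1, 0), "b": (2, 0), "c": (3, 0), "w": (-1, 0)}
graph = {"O": ["a", "w"], "a": ["O", "b"], "b": ["a", "c"], "c": ["b"], "w": ["O"]}
rent = {"O": 1000, "a": 950, "b": 900, "c": 850, "w": 1010}

trends = local_trends(graph, coords, rent, "O", min_conf=0.9, min_length=2, max_length=3)
for path, slope, r in trends:
    print(path, slope, r)
```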
Architecture of Spatial Data Mining
[Figure: architecture diagram with the components: human-computer interaction system; spatial data mining system; discoverable knowledge; data related to the problem; knowledge base management system; spatial database; spatial database management system; domain knowledge database]
Examples of Spatial Patterns
• 1854 Asiatic cholera outbreak in London.
– A water pump was identified as the source.
• Crime hotspots for planning police patrol routes.
• Effects on weather in the US caused by unusual warming of the Pacific Ocean (El Niño).
Future scope
• Data mining in spatial object-oriented databases: how can the object-oriented approach be used to design a spatial database? An object-oriented database may be a better choice for handling spatial data than traditional relational or extended relational models. For example, rectangles, polygons, and more complex spatial objects can be modeled naturally in an object-oriented database.
• Parallel data mining can be used because processing spatial data takes a great deal of computational time.
Thank you







Spatial data mining

  • 1. Spatial Data Mining Presented by-: Rajkumar jain M.tech (c.s.e) 1st year (2nd sem)
  • 2. Overview • What is spatial data. • What makes spatial data mining different. • Spatial data mining task • Spatial data properties • Clustering analysis • Trend analysis • Future parameter 2
  • 3. What is Spatial Data? • Objects of types: – points – lines – polygons – etc. Used in/for: GIS - Geographic Information Systems GPS - Global Positioning System Environmental studies etc …
  • 4. 4 Introduction • Spatial data mining is the process of discovering interesting, useful, non-trivial patterns from large spatial datasets – E.g. Determining hotspots: unusual locations. • Spatial Data Mining Tasks – Characteristics rule. – Discriminate rule. E.g. Comparison of price ranges of different geographical area. – Association rule-: we can associate the non spatial attribute to spatial attribute or spatial attribute to spatial attribute. – Clustering rule-: helpful to find outlier detection which is useful to find suspicious knowledge E.g. Group crime location.
  • 5. - Classification rule-: it defines whether a spatial entity belong to a particular class or how many classes will be classified. e.g. Remote sensed image based on spectrum and GIS data. - Trend detection-A trend is a temporal pattern in some time series data. Spatial trend is defined as consider a non spatial attribute which is the neighbour of a spatial data object. • Properties of Spatial Data – Spatial autocorrelation – Spatial heterogeneity – Implicit Spatial Relations 5
  • 6. Hetrogeneity of Spatial Data • Auto correlation. • Patterns usually have to be defined in the spatial attribute subspace and not in the complete attribute space. • Longitude and latitude (or other coordinate systems) are the glue that link different data collections together. • People are used to maps in GIS therefore, data mining results have to summarized on the top of maps.
  • 7. • Patterns not only refer to points, but can also refer to lines, or polygons or other higher order geometrical objects 7
  • 8. Autocorrelation • Items in a traditional data are independent of each other, – whereas properties of locations in a map are often “auto-correlated”. • First law of geography [Tobler]: – Everything is related to everything, but nearby things are more related than distant things. – People with similar backgrounds tend to live in the same area.
  • 9. – Economies of nearby regions tend to be similar. – Changes in temperature occur gradually over space. 9
  • 10. 10 Spatial Relations • Spatial databases do not store spatial relations explicitly – Additional functionality required to compute them • Three types of spatial relations specified by the OGC reference model – Distance relations • Euclidean distance between two spatial features – Direction relations • Ordering of spatial features in space – Topological relations • Characterise the type of intersection between spatial features
  • 11. 11 Distance relations • If dist is a distance function and c is some real number 1. dist(A,B)>c, 2. dist(A,B)<c and 3. dist(A,B)=c A B A B BA
  • 12. 12 Direction relations • If directions of B and C are required with respect to A • Define a representative point, rep(A) • rep(A) defines the origin of a virtual coordinate system • The quadrants and half planes define the direction relations • B can have two values {northeast, east} • Exact direction relation is northeast A C B rep(A) C north A B northeast A
  • 13. Topological Relations • Topological relations describe how geometries intersect spatially. • Simple geometry types – Point, 0-dimension – Line, 1-dimension – Polygon, 2-dimension • Each geometry represented in terms of – boundary (B) – geometry of the lower dimension – interior (I) – points of the geometry when boundary is removed – exterior (E) – points not in the interior or boundary
  • 14. DE-9IM • Topological relations are defined using any one of the following models – 4IM, four intersection model (only B and I considered) – 9IM, nine intersection model (B, I, and E) – DE-9IM, dimensionally extended 9 intersection model. • Dim is the dimension function
  • 15. Example • Consider two polygons – A - POLYGON ((10 10, 15 0, 25 0, 30 10, 25 20, 15 20, 10 10)) – B - POLYGON ((20 10, 30 0, 40 10, 30 20, 20 10))
  • 16. 9-Intersection Matrix of example geometries [Figure: 3 x 3 grid of intersections; columns I(B), B(B), E(B); rows I(A), B(A), E(A)]
  • 17. DE-9IM for the example geometries
            I(B)  B(B)  E(B)
      I(A)   2     1     2
      B(A)   1     0     1
      E(A)   2     1     2
  • 18. Relationships using DE-9IM • Different geometries may give rise to different numbers in the DE-9IM • For a specific type of relationship we are only interested in certain values in certain positions – That is, we are interested in patterns in the matrix rather than actual values • Actual values are replaced by wild cards – T: value is "true" - non empty - any dimension >= 0 – F: value is "false" - empty - dimension < 0 – *: Don't care what the value is – 0: value is exactly zero – 1: value is exactly one – 2: value is exactly two • Pattern for "A overlaps B":
            I(B)  B(B)  E(B)
      I(A)   T     *     T
      B(A)   *     *     *
      E(A)   T     *     *
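Matching a DE-9IM matrix against such a pattern can be sketched in a few lines. The sketch assumes the matrix is written as a nine-character string in row order (I(A), B(A), E(A) against I(B), B(B), E(B)) using the characters 0, 1, 2 and F; the function name `matches` is hypothetical.

```python
def matches(matrix, pattern):
    """Check a 9-character DE-9IM matrix against a 9-character pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("matrix and pattern must both have 9 characters")
    for value, wanted in zip(matrix, pattern):
        if wanted == "*":
            continue                      # don't care
        if wanted == "T":
            if value not in "012":        # any non-empty intersection
                return False
        elif value != wanted:             # F, 0, 1 or 2 must match exactly
            return False
    return True

example = "212101212"        # DE-9IM of the two example polygons A and B
overlaps = "T*T***T**"       # pattern for "A overlaps B" (two polygons)
is_overlap = matches(example, overlaps)
```

The same function works for any other named relation once its pattern is known, for instance a pattern that requires the interior and boundary cells to be F for two disjoint geometries.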
  • 19. Cluster analysis • Cluster analysis divides data into meaningful or useful groups (clusters) and is very useful in spatial databases. For example, grouping feature vectors into clusters can be used to create thematic maps, which are useful in geographic information systems. • CLUSTERING METHODS FOR SPATIAL DATA MINING 1. Partitioning Around Medoids (PAM): PAM is similar to the k-means algorithm. Like k-means, PAM divides a data set into groups, but it is based on medoids, whereas k-means is based on centroids. A medoid is an actual object of the data set; by using medoids, we can reduce the dissimilarity of objects within a cluster. PAM first selects the medoids, then assigns each object to the nearest medoid, which forms the clusters. • Let i be an object and v_i the cluster it is assigned to; then i is at least as near to its own medoid m_{v_i} as to any other medoid m_w: $d(i, m_{v_i}) \le d(i, m_w)$ for all $w = 1, 2, \dots, k$. The k representative objects should minimize the objective function, which is the sum of the dissimilarities of all objects to their nearest medoid: Objective function = $\sum_{i} d(i, m_{v_i})$
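The objective function and the swap idea behind PAM can be sketched as follows. This is a simplified illustration, not the full published algorithm: it assumes Euclidean distance on 2-D points and starts from the first k objects instead of a proper initial selection phase. The function names and the sample points are made up for the example.

```python
import math
from itertools import product

def total_cost(points, medoids):
    """Objective: sum of distances from every object to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k):
    """Simplified PAM: medoids are indices into points."""
    medoids = list(range(k))              # naive start (assumption of this sketch)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid object.
        for pos, cand in product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:       # keep the swap only if it lowers the objective
                medoids, best, improved = trial, cost, True
    # Assign every object to its nearest medoid.
    labels = [min(medoids, key=lambda m: math.dist(p, points[m])) for p in points]
    return medoids, labels, best

# Hypothetical locations forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = pam(points, k=2)
```

Because every medoid is an index into `points`, the result always names real objects of the data set, which is the main difference from the centroids used by k-means.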
  • 20. • Clustering Large Applications (CLARA) • Compared to PAM, CLARA can deal with much larger data sets. Like PAM, CLARA also finds objects that are centrally located in the clusters. The main problem with PAM is that it computes the entire dissimilarity matrix at once, so for n objects its space complexity becomes O(n^2). CLARA avoids this problem: it accepts only the actual measurements (i.e., an n × p data matrix) and works on a sample drawn from the data rather than on the whole data set. • CLARA assigns objects to clusters in the following way: • BUILD step: select k "centrally located" objects to be used as initial medoids. The medoids are chosen to give the smallest possible average distance between the objects and their medoids, which forms the clusters. • SWAP step: try to decrease the average distance between the objects and the medoids by replacing representative objects. • Finally, each object that does not belong to the sample is assigned to its nearest medoid.
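The sampling idea behind CLARA can be sketched as follows. This is a rough illustration under several assumptions: Euclidean distance on 2-D points, a simplified swap-only medoid search on each sample, and made-up defaults for the number of samples and the sample size. All function and parameter names are hypothetical.

```python
import math
import random

def cost(points, medoids):
    """Sum of distances from every point to its nearest medoid (medoids are points)."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def pam(sample, k):
    """Simplified swap-only medoid search on a small sample."""
    medoids = sample[:k]
    best = cost(sample, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in sample:
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                c = cost(sample, trial)
                if c < best - 1e-12:
                    medoids, best, improved = trial, c, True
    return medoids

def clara(points, k, n_samples=5, sample_size=40, seed=0):
    """Run the medoid search on several samples; keep the medoids best for ALL points."""
    rng = random.Random(seed)
    best_medoids, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = pam(sample, k)
        c = cost(points, medoids)         # evaluated on the full data set
        if c < best_cost:
            best_medoids, best_cost = medoids, c
    # Objects outside the sample are simply assigned to their nearest medoid.
    labels = [min(range(k), key=lambda j: math.dist(p, best_medoids[j]))
              for p in points]
    return best_medoids, labels, best_cost / len(points)

# Hypothetical data: two well-separated 10 x 10 grids of locations.
cluster_a = [(x, y) for x in range(10) for y in range(10)]
cluster_b = [(x + 100, y + 100) for x, y in cluster_a]
medoids, labels, avg_distance = clara(cluster_a + cluster_b, k=2)
```

Only sample-sized distance computations are needed during the search, so no n × n dissimilarity matrix is ever built; the price is that the result depends on how representative the samples are.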
  • 21. Trend analysis • Spatial trend: a regular change of one or more non-spatial attributes when moving away from a given object. E.g. moving eastward from the cyber tower, the rent of residential houses decreases at a rate of approximately 5% per km. • A trend is identified using neighborhood paths starting from a location O: regression analysis is performed on the attribute values of the objects along a neighborhood path to describe the regularity of change. There are two algorithms, one to determine global trends and one to determine local trends. • Global trend: considering all the objects on all paths starting from O, the values of the specified attribute generally tend to increase or decrease with increasing distance from O.
  • 22. • Local trend: detected on a single path starting from an object O that shows a certain trend. Different paths may show different trends, e.g. some may be positive while others are negative.
  • 23. Spatial trend detection • E.g. let g be a neighborhood graph, O an object in g, and a a non-spatial attribute whose pattern of change we want to detect as we move away from O in the neighborhood graph. • Let a filter indicate the subset of neighbors to be taken into consideration. • Let min_conf be a real number, the minimum confidence a trend must reach. • Let min_length and max_length be natural numbers; the length of a neighborhood path must lie between them.
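The difference between global and local trends can be sketched with ordinary linear regression. The sketch makes several assumptions that are not in the slides: each neighborhood path is given as a list of (distance from O, attribute value) pairs, the absolute correlation coefficient is used as the confidence compared against min_conf, and path length is counted as the number of objects on the path. The rent figures are invented for the example.

```python
import math

def linear_regression(xs, ys):
    """Least-squares slope and correlation coefficient of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0                   # no variation, hence no trend
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

def global_trend(paths, min_conf):
    """One regression over all objects on all paths starting from O."""
    xs = [d for path in paths for d, _ in path]
    ys = [v for path in paths for _, v in path]
    slope, r = linear_regression(xs, ys)
    return (slope, r) if abs(r) >= min_conf else None

def local_trends(paths, min_conf, min_length, max_length):
    """One regression per path; keep paths of allowed length with a confident trend."""
    found = []
    for path in paths:
        if not (min_length <= len(path) <= max_length):
            continue
        slope, r = linear_regression([d for d, _ in path], [v for _, v in path])
        if abs(r) >= min_conf:
            found.append((slope, r))
    return found

# Hypothetical rents (per month) at 0..3 km from O along two paths.
east = [(0, 1000), (1, 950), (2, 900), (3, 850)]      # falling rents
north = [(0, 1000), (1, 1010), (2, 1020), (3, 1030)]  # rising rents
overall = global_trend([east, north], min_conf=0.9)
per_path = local_trends([east, north], min_conf=0.9, min_length=3, max_length=5)
```

In this example the two paths pull in opposite directions, so no confident global trend is reported, while each path on its own shows a clear local trend, one negative and one positive.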
  • 24. Architecture of Spatial Data Mining [Diagram; component labels: human-computer interaction system; spatial data mining system; discoverable knowledge; data related to problem; knowledge base management system; spatial database; spatial database management system; domain knowledge database]
  • 25. Examples of Spatial Patterns • 1854 Asiatic cholera outbreak in London. – A water pump was identified as the source. • Crime hotspots for planning police patrol routes. • Effects on weather in the US caused by unusual warming of the Pacific Ocean (El Niño).
  • 26. Future scope • Data mining in spatial object-oriented databases: how can the object-oriented approach be used to design a spatial database? An object-oriented database may be a better choice for handling spatial data than traditional relational or extended relational models. For example, rectangles, polygons, and more complex spatial objects can be modelled naturally in an object-oriented database. • Parallel data mining can be used because processing spatial data takes considerable computational time.