Data Mining Engine for Enterprise GIS
Akash Dwivedi (09IT6001)
Under the guidance of Prof. S.K. Ghosh
School of Information Technology
Indian Institute of Technology, Kharagpur
OUTLINE  4/2/2011 2
OBJECTIVES
What is Spatial Data?
traffic, bird habitats, global climate, logistics, ...
Object types:
Points, Lines, Polygons, etc.
Used in/for:
Meteorology
Astronomy
Environmental studies, etc.
What is Special about Spatial Data
Why Data Mining in Spatial Data
Spatial Data + Web Services = OGC (Open Geospatial Consortium)
Proposed Architecture of Enterprise GIS
Fig. 1: Architecture of Enterprise GIS. Components shown: Client, Semantic Resolution of Query, Query Broker (service composition), Map Overlay, Spatial Data Mining Engine, WFS, WMS, WPS, and databases DB 1, DB 2, ..., DB n.
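The WFS, WMS and WPS labels in Fig. 1 are OGC web service interfaces. As a rough illustration of how a component might request features from a WFS, the sketch below builds a key-value-pair GetFeature URL. The endpoint and layer name are hypothetical placeholders, not taken from the slides, and the parameter names follow the WFS 1.1.0 convention (`typeName`).

```python
from urllib.parse import urlencode, urlparse, parse_qs

def wfs_getfeature_url(endpoint, type_name, bbox=None, version="1.1.0"):
    """Build a key-value-pair WFS GetFeature request URL."""
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        "typeName": type_name,
    }
    if bbox is not None:
        # bbox is (minx, miny, maxx, maxy) in the layer's coordinate system.
        params["bbox"] = ",".join(str(v) for v in bbox)
    return endpoint + "?" + urlencode(params)

# Hypothetical endpoint and layer name, for illustration only.
url = wfs_getfeature_url(
    "http://example.org/geoserver/wfs", "demo:house_prices", bbox=(0, 0, 10, 10)
)
query = parse_qs(urlparse(url).query)
print(url)
```

A mining engine in this kind of architecture would fetch the returned features and pass their attributes and geometries to the algorithms described on the later slides.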
Data Mining Engine Framework
Fig. 2: Data mining engine framework
Spatial Outlier Detection
Spatial Outlier
Fig. 3: Palm Beach county as a spatial outlier (source: http://madison.hss.cmu.edu/buchanan-bush.gif)
Spatial Outlier Detection Problem
Back to Our Motivating Example
Results (Classical Data Mining Algorithms)
Results for the above methods
Fig. 4: Outliers in red
Fig. 5: Outliers in brown
Results (Spatial Data Mining Algorithms)
LAG based approach
LAG based approach contd.
Fig. 6: LAG based box map
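The slides do not spell out how the lag is computed. A common definition of the spatial lag of a location is the weighted average of its neighbors' values, using row-standardized weights. The sketch below assumes that definition. The five-region chain and its values are made-up toy data, not the data behind Fig. 6.

```python
def row_standardize(neighbors):
    """Turn an adjacency list into row-standardized weights (each row sums to 1)."""
    weights = {}
    for i, nbrs in neighbors.items():
        weights[i] = {j: 1.0 / len(nbrs) for j in nbrs} if nbrs else {}
    return weights

def spatial_lag(values, weights):
    """Spatial lag of each location: weighted average of its neighbors' values."""
    return {
        i: sum(w * values[j] for j, w in row.items())
        for i, row in weights.items()
    }

# Hypothetical toy data: five regions on a line, region 2 has an unusual value.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
values = {0: 10.0, 1: 12.0, 2: 40.0, 3: 11.0, 4: 9.0}

lag = spatial_lag(values, row_standardize(neighbors))
# Difference between a location's own value and its neighborhood average.
diff = {i: values[i] - lag[i] for i in values}
most_unusual = max(diff, key=lambda i: abs(diff[i]))
print(lag, most_unusual)
```

A box map of the lagged variable, or of the difference between value and lag, can then be used to flag locations that deviate strongly from their surroundings.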
Using Moran Scatter Plot
Fig. 7: Moran scatter plot; yellow points are spatial outliers
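A Moran scatter plot places each location by its standardized value (x axis) and the spatial lag of that value (y axis). One common reading is that points in the high-low and low-high quadrants are candidate spatial outliers, and that the slope of the point cloud equals Moran's I when the weights are row-standardized. The sketch below assumes that reading. The neighbor structure and values are hypothetical toy data, and every location must have at least one neighbor.

```python
from statistics import mean, pstdev

def moran_scatter(values, neighbors):
    """Return z-scores, lagged z-scores, quadrant labels and global Moran's I."""
    mu, sigma = mean(values.values()), pstdev(values.values())
    z = {i: (v - mu) / sigma for i, v in values.items()}
    # Row-standardized lag: plain average of the neighbors' z-scores.
    lag = {i: mean(z[j] for j in nbrs) for i, nbrs in neighbors.items()}
    quadrant = {}
    for i in values:
        own = "H" if z[i] >= 0 else "L"
        around = "H" if lag[i] >= 0 else "L"
        quadrant[i] = own + around
    # With row-standardized weights, Moran's I is the slope of lag on z.
    moran_i = sum(z[i] * lag[i] for i in z) / sum(zi * zi for zi in z.values())
    return z, lag, quadrant, moran_i

# Hypothetical toy data: five regions on a line, region 2 has an unusual value.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
values = {0: 10.0, 1: 12.0, 2: 40.0, 3: 11.0, 4: 9.0}

z, lag, quadrant, moran_i = moran_scatter(values, neighbors)
# Locations whose own value and neighborhood average fall on opposite sides.
outliers = sorted(i for i, q in quadrant.items() if q in ("HL", "LH"))
print(quadrant, round(moran_i, 3), outliers)
```

In this toy data region 2 lands in the high-low quadrant, while regions 1 and 3 land in low-high only because they border it. This is why the LISA cluster map on the verification slides is a useful cross-check.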
Verification
Fig. 8: LISA cluster map; outliers in red
Verification contd.
Fig. 9: Relation between HR7984 and PE82
Any Reasons?
Fig. 12: Scatter plot between RDAC80 and HR7984; outliers in yellow
Spatial Cluster Analysis
While choosing a clustering algorithm, many factors have to be considered, such as:
Spatial Clustering Problem Definition
Given,
Problem Definition contd.
Back to Our Motivating Example
Experimental Setup
Table 1: Experimental setup details
Analysis Histogram
Figure 13: Histogram of house price data
We can roughly model the data with a mixture of components.
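To make the "mixture of components" idea concrete, here is a minimal sketch of fitting a one-dimensional Gaussian mixture with plain EM. The choice of Gaussian components, the initialization and the price values are assumptions for illustration. The slides do not state the component family, and the numbers are not the Baltimore data.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_mixture(xs, k=2, iters=100):
    """Fit a 1-D Gaussian mixture with plain EM; returns (weights, means, sigmas)."""
    xs = sorted(xs)
    n = len(xs)
    # Crude initialization: spread the initial means over the sorted data.
    means = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    spread = (xs[-1] - xs[0]) / (2 * k) or 1.0
    sigmas = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate parameters from the weighted observations.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)
            weights[j] = nj / n
    return weights, means, sigmas

# Hypothetical prices (in thousands) with a low and a high group.
prices = [35, 38, 40, 42, 45, 47, 110, 115, 120, 125, 130]
weights, means, sigmas = fit_mixture(prices, k=2)
print(weights, means, sigmas)
```

Each fitted component corresponds to one bump in the histogram. The number of components `k` is chosen by inspecting the histogram, as on this slide.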
Results for K=2, Using NEM
Figure 14: Clustering results for K=2; high-priced houses in brown
Results for K=3, Using NEM
Figure 15: Clustering results for K=3; high-priced buildings in red
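NEM here presumably stands for Neighborhood EM, a spatial variant of EM for mixture models. The slides do not give its formulation, so the sketch below follows the commonly described form: the M-step is unchanged, while in the E-step each membership is additionally weighted by exp(beta * sum of the neighbors' memberships in the same class) and iterated as a fixed point. The Gaussian components, `beta`, the chain neighborhood and the values are all assumptions for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def nem(xs, neighbors, init_means, beta=1.0, iters=30, inner=5):
    """Neighborhood EM sketch for 1-D Gaussian components.

    xs: list of values; neighbors: list of neighbor index lists;
    init_means: starting component means. Returns (labels, means).
    """
    n, k = len(xs), len(init_means)
    means = list(init_means)
    sigmas = [(max(xs) - min(xs)) / (2 * k) or 1.0] * k
    pis = [1.0 / k] * k
    c = [[1.0 / k] * k for _ in range(n)]  # soft class memberships per site
    for _ in range(iters):
        # E-step: fixed-point iteration. The exp(beta * agree) factor pulls
        # each site toward the classes its neighbors currently favor.
        for _step in range(inner):
            new_c = []
            for i, x in enumerate(xs):
                scores = []
                for j in range(k):
                    agree = sum(c[m][j] for m in neighbors[i])
                    density = pis[j] * normal_pdf(x, means[j], sigmas[j])
                    scores.append(density * math.exp(beta * agree))
                total = sum(scores) or 1e-300
                new_c.append([s / total for s in scores])
            c = new_c
        # M-step: same updates as ordinary EM, using the memberships c.
        for j in range(k):
            nj = sum(row[j] for row in c) + 1e-12
            means[j] = sum(row[j] * x for row, x in zip(c, xs)) / nj
            var = sum(row[j] * (x - means[j]) ** 2 for row, x in zip(c, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)
            pis[j] = nj / n
    labels = [max(range(k), key=lambda j: row[j]) for row in c]
    return labels, means

# Hypothetical data: eight houses along a street, cheap on one end, expensive on the other.
prices = [40, 42, 41, 43, 118, 120, 122, 119]
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7], [6]]
labels, means = nem(prices, chain, init_means=[40, 120], beta=1.0)
print(labels, means)
```

Setting `beta=0` reduces the E-step to ordinary EM. Larger values favor spatially contiguous clusters, which is the reason to prefer this family of methods for map data.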
Semantic Enrichment using spatial clustering
Problem Definition
Proposed Solution
Framework
Figure 16: Semantic enrichment of clusters
Framework contd.
Reasoning of ontology for implicit knowledge
Results: Ontology
Figure 17: Data ontology for Baltimore house price data
Results contd.
Reasoning: ABox reasoning was performed on this ontology using SPARQL.
Sample query:
Figure 18: SPARQL query page
Results contd.
Figure 19: Result for the given query
Future Work

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

Presentation1.1

  • 1. Data Mining Engine for Enterprise GIS. Akash Dwivedi (09IT6001), under the guidance of Prof. S.K. Ghosh, School of Information Technology, Indian Institute of Technology, Kharagpur
  • 5. traffic, bird habitats, global climate, logistics, ...
  • 11. What is Special about Spatial Data
  • 12. Why Data Mining in Spatial Data
  • 13. Spatial Data + Web Services = OGC (Open Geospatial Consortium)
  • 14. Proposed Architecture of Enterprise GIS. Fig. 1: Architecture of Enterprise GIS (diagram labels: Client, Query Broker (service composition), Semantic Resolution of query, Map Overlay, Spatial Data Mining Engine, WFS, WMS, WPS, DB 1, DB 2, ..., DB n)
  • 15. Data Mining Engine Framework. Fig. 2: Data mining engine framework
  • 17. Spatial Outlier. Fig. 3: Palm Beach county as spatial outlier (source: http://madison.hss.cmu.edu/buchanan-bush.gif)
  • 18. Spatial Outlier Detection Problem
  • 19. Back to Our Motivating Example
  • 20. Results (Classical Data Mining Algorithms)
  • 21. Results for the above methods. Fig. 4: Outliers in red. Fig. 5: Outliers in brown
  • 22. Results (Spatial Data Mining Algorithms)
  • 23. LAG based approach
  • 24. LAG based approach contd. Fig. 6: LAG based box map
  • 25. Using Moran Scatter Plot. Fig. 7: Moran scatter plot, yellow points are spatial outliers
  • 26. Verification. Fig. 8: LISA cluster map, outliers in red
  • 27. Verification contd. Fig. 9: Relation between HR7984 and PE82
  • 28. Any Reasons. Fig. 12: Scatter plot between RDAC80 and HR7984, outliers in yellow
  • 31. While choosing a clustering algorithm, many factors have to be considered, like:
  • 32. Spatial Clustering Problem Definition. Given,
  • 34. Back to Our Motivating Example
  • 35. Experimental Setup. Table 1: Experimental setup details
  • 36. Analysis: Histogram. Figure 13: Histogram of house price data. We can roughly model it with a mixture of components.
  • 37. Results for K=2, Using NEM. Figure 14: Clustering results for K=2, high-priced houses in brown
  • 38. Results for K=3, Using NEM. Figure 15: Clustering results for K=3, high-priced buildings in red
  • 39. Semantic Enrichment using spatial clustering
  • 42. Framework. Figure 16: Semantic enrichment of clusters
  • 44. Reasoning of ontology for implicit knowledge
  • 45. Results: Ontology. Figure 17: Data ontology for Baltimore house price data
  • 46. Results contd. ABox reasoning is done on this ontology using SPARQL. Sample query: Figure 18: SPARQL query page
  • 47. Results contd. Figure 19: Result for the given query
  • 54. Box Map. Because box maps are based on the same methodology as box plots, they can be used to detect outliers in a stricter sense than is possible with percentile maps. Box maps group values such as counts or rates into six fixed categories: four quartiles (1-25%, 25-50%, 50-75%, and 75-100%) plus two outlier categories at the low and high ends of the distribution. A value is classified as an outlier if it lies more than 1.5 times the interquartile range (IQR) below the 25th percentile (Q1) or above the 75th percentile (Q3). The IQR is the difference Q3 - Q1. It describes the spread of the middle half of the distribution, since 25% of values lie above Q3 and 25% lie below Q1.
  • 55. Box Plot. Box plots are particularly useful for identifying outliers and gaining an overview of the spread of a distribution. The box plot (sometimes referred to as a box-and-whisker plot) is a non-parametric method. For normally distributed data, the median coincides with the mean and the interquartile range is proportional to the standard deviation (about 1.35 standard deviations). The box plot shows the median and the first and third quartiles of a distribution (the 50%, 25% and 75% points of the cumulative distribution) as well as outliers. An observation is classified as an outlier when it lies more than a given multiple of the interquartile range (the difference between the 75th and 25th percentile values) above the 75th percentile or below the 25th percentile. The standard multiples are 1.5 and 3 times the interquartile range. In the plot, the red bar in the middle corresponds to the median and the dark part shows the interquartile range. The individual observations in the first and fourth quartiles are shown as blue dots. The thin line is the hinge, corresponding to the default criterion of 1.5.
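
The box map and box plot rules above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the slides: the function name box_map_classes, the label strings, the choice of the "inclusive" quartile method, and the convention that a value equal to a quartile boundary falls into the lower category are all assumptions made for the example. The hinge parameter plays the role of the multiple of the IQR (1.5 by default, 3 for the stricter criterion).

```python
from statistics import quantiles


def box_map_classes(values, hinge=1.5):
    """Assign each value to one of the six box-map categories.

    Outliers lie more than `hinge` times the IQR below Q1 or above Q3;
    all other values are placed in one of the four quartile classes.
    """
    # Q1, median and Q3 (the 25%, 50% and 75% points of the distribution).
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - hinge * iqr
    upper_fence = q3 + hinge * iqr

    labels = []
    for v in values:
        if v < lower_fence:
            labels.append("low outlier")
        elif v <= q1:
            labels.append("first quartile")
        elif v <= q2:
            labels.append("second quartile")
        elif v <= q3:
            labels.append("third quartile")
        elif v <= upper_fence:
            labels.append("fourth quartile")
        else:
            labels.append("high outlier")
    return labels


if __name__ == "__main__":
    # Hypothetical values; 100 lies far above the upper fence.
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
    for value, label in zip(sample, box_map_classes(sample)):
        print(value, label)
```

Calling box_map_classes(sample, hinge=3) applies the stricter of the two standard multiples, so only the more extreme observations are flagged.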