Overview of Data Mining. Meeting of WP Data Mining, April 28, 2008. Bowo Prasetyo. http://www.scribd.com/prazjp http://www.slideshare.net/bowoprasetyo
Contents
What Is Data Mining? 1) Berry and Linoff, Data Mining Techniques for Marketing, Sales and Customer Support (Book), 1997
Does It Differ from Statistics? 16) D. Pregibon, Data Mining: Statistical Computing and Graphics, p. 7-8, 1997 (Diagram labels: Statistics, Artificial Intelligence, Database, Data Mining)
Statistics, AI, Database
Why Use Data Mining? 2) ESG Research, New ESG Research Finds Large Organizations Experiencing Explosive Growth in Log Data Collection, Analysis, and Storage, 2007 (http://www.enterprisestrategygroup.com/_documents/NewsEvent/NewsEvent439.pdf) 3) EMC-IDC Research, The Expanding Digital Universe: A Forecast of Worldwide Information Growth Through 2010, 2006 (http://www.emc.com/about/destination/digital_universe/)
What Can Data Mining Do? Examples
On Business and Network Security 4) G. Adomavicius and A. Tuzhilin, Using data mining methods to build customer profiles, in Computer magazine p. 74-82, 2001 5) Z. Huang, H. Chen, C. Hsu, W. Chen, S. Wu, Credit rating analysis with support vector machines and neural networks: a market comparative study, in Journal of Decision Support Systems p. 543-558, 2004 6) T. Fawcett and F. Provost, Adaptive Fraud Detection, in Journal of Data Mining and Knowledge Discovery p. 291-316, 2004 7) W. Lee and S. J. Stolfo, Data Mining Approaches for Intrusion Detection, in Proceedings of the 7th USENIX Security Symposium, 1998
On The Web 8) R. Cooley, B. Mobasher, J. Srivastava, Web Mining: Information and Pattern Discovery on the World Wide Web, in Proceedings of 9th International Conference on Tools with Artificial Intelligence (ICTAI) p. 0558, 1997 9) Larry Page, Sergey Brin, R. Motwani, T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, 1998 (http://citeseer.ist.psu.edu/page98pagerank.html) 10) M. Eirinaki and M. Vazirgiannis, Web mining for web personalization, in ACM Transactions on Internet Technology (TOIT) p. 1-27, 2003 11) S. W. Changchien and T. Lu, Mining association rules procedure to support on-line recommendation by customers and products fragmentation, in Journal of Expert Systems with Applications v. 20-4 p. 325-335, 2001
On Environment 12) J. Han, K. Koperski, N. Stefanovic, GeoMiner: a system prototype for spatial data mining, in Proceedings of ACM SIGMOD international conference on Management of data p. 553-556, 1997 13) Z. Nazeri and J. Zhang, Mining aviation data to understand impacts of severe weather on airspace system performance, in Proceedings of International Conference on Coding and Computing p. 518-523, 2002 14) V. Kumar, M. Steinbach, P. Tan, S. Klooster, C. Potter, A. Torregrosa, Mining Scientific Data: Discovery of Patterns in the Global Climate System, in Proceedings of the Joint Statistical Meetings p. 5-9, 2001 15) M. Steinbach, P. Tan, V. Kumar, S. Klooster, C. Potter, Data Mining for the Discovery of Ocean Climate Indices, in Proceedings of the 5th Workshop on Scientific Data Mining p. 7-16, 2002
Methods in Data Mining: Basic Methods
Classification, Clustering, Association Rules
Classification 17) http://en.wikipedia.org/wiki/Naive_Bayes_classifier
Clustering 18) J. A. Hartigan and M. A. Wong, A k-means clustering algorithm, in Applied Statistics, 28 (1) p. 100-108, 1979
Association Rules 19) R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules, in Proc. 20th Int. Conf. Very Large Data Bases, VLDB, 1994
Association Rules (C_k: candidate itemset of size k) (L_k: frequent itemset of size k whose support >= minsup)
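The slide's bullet text did not survive extraction, but the C_k / L_k notation matches the level-wise Apriori search of reference 19. The following is a minimal sketch of that idea, not the presenter's own code. It assumes support is an absolute transaction count, and the basket data, the function name `apriori`, and the variable names are hypothetical.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Return {itemset: support count} for every frequent itemset.

    Assumption: support is an absolute count of transactions,
    and an itemset is frequent when support >= minsup.
    """
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Support of each candidate = number of transactions containing it.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # L_1: frequent itemsets of size 1.
    singles = {frozenset([item]) for t in transactions for item in t}
    L_k = {c: n for c, n in count(singles).items() if n >= minsup}
    frequent = dict(L_k)

    k = 2
    while L_k:
        prev = list(L_k)
        # C_k: join L_(k-1) with itself, keeping unions of exactly size k.
        C_k = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        C_k = {
            c for c in C_k
            if all(frozenset(s) in L_k for s in combinations(c, k - 1))
        }
        # L_k: candidates whose support >= minsup.
        L_k = {c: n for c, n in count(C_k).items() if n >= minsup}
        frequent.update(L_k)
        k += 1
    return frequent


# Hypothetical baskets, for illustration only.
baskets = [
    {"rice", "oil", "sugar"},
    {"rice", "oil"},
    {"sugar", "eggs"},
    {"oil", "eggs"},
]
frequent = apriori(baskets, minsup=2)
```

With these baskets and minsup = 2, every single item is frequent, and {rice, oil} is the only frequent pair, so the search stops at k = 3 with no candidates left.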
Association Rules
Visualization Of Mining Results
Case Study: Association Rules in a Department Store
Items and Transactions (figure labels: transaction, item)
Frequent Items (figure labels: support, minimum support)
n-Length Items (n-Items) (figure labels: 2-length item, 3-length item)
Association Rules: rice => cooking oil, confidence = support(cooking oil & rice) / support(rice) = 2/2 = 1 (figure labels: antecedent, consequent)
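The confidence calculation for the rule rice => cooking oil (beras => minyak goreng) can be reproduced with a short sketch. The slide's transaction table is not available, so the three transactions below are hypothetical. They are chosen only so that support(rice) = 2 and support(rice & cooking oil) = 2, as on the slide. The helper names `support` and `confidence` are also assumptions.

```python
def support(transactions, items):
    """Count the transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))


def confidence(transactions, antecedent, consequent):
    """confidence(A => B) = support(A and B) / support(A)."""
    ante = support(transactions, antecedent)
    if ante == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante


# Hypothetical transactions matching the slide's counts.
transactions = [
    {"rice", "cooking oil"},
    {"rice", "cooking oil", "sugar"},
    {"sugar"},
]
conf = confidence(transactions, {"rice"}, {"cooking oil"})
print(conf)  # 2/2 = 1.0
```

A confidence of 1 means that every transaction containing the antecedent (rice) also contains the consequent (cooking oil).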
Complete Association Rules
Mining Environmental Data: Examples
Explosion in Environmental Data
Geo-spatial Database (figure label: GeoMiner)
Earth Science: Regions that are covered by the highly correlated pattern FPAR-Hi => NPP-Hi (shrubland regions). FPAR: Fractional Intercepted Photosynthetically Active Radiation. NPP: Net Primary Production.
Earth Science: Two clusters for NPP (land) and two clusters for SST (ocean). The clusters approximate the northern and southern hemispheres, for land and ocean. SST: sea surface temperature.
Earth Science: Clusters of ocean near the Philippines (SST) and land in Eastern Brazil, Southern Africa, and a bit of Australia (NPP) are highly correlated (0.47). In particular, this sea region is highly correlated (0.66) with SOI, a climate index related to El Niño, and parts of Southern Africa and Australia are known to experience droughts related to El Niño.
Conclusion
