Lecture Notes 1: Introduction to Data Mining Zhangxi Lin ISQS 6347 Texas Tech University ISQS 6347, Data & Text Mining
What is Data Mining?
Data Mining Process
What is Text Mining? (figure labels: Patterns, Trends, Associations)
Motivation for Text Mining (chart labels, in extraction order: 90%, Structured Numerical or Coded Information, 10%, Unstructured or Semi-structured Information)
Text Mining Process
Why Mine Data? Commercial Viewpoint
Why Mine Data? Scientific Viewpoint
Origins of Data Mining (figure labels: Machine Learning/Pattern Recognition, Statistics/AI, Data Mining, Database systems)
Data Mining Tasks
Classification: Definition
Classification Example (figure labels: categorical, categorical, continuous, class; Training Set, Learn Classifier, Test Set, Model)
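The flow named in the figure labels (learn a classifier from a training set, then apply the resulting model to a test set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method shown on the slide: the records, the attribute names (`refund`, `status`, `income`, `cls`), and the choice of a simple one-rule learner are all hypothetical. The one-rule learner uses only the categorical attributes and ignores the continuous one.

```python
from collections import Counter, defaultdict

# Hypothetical records (not from the slides): two categorical attributes,
# one continuous attribute, and a class label.
TRAIN = [
    {"refund": "yes", "status": "married", "income": 120, "cls": "no"},
    {"refund": "yes", "status": "married", "income": 95,  "cls": "no"},
    {"refund": "no",  "status": "single",  "income": 80,  "cls": "yes"},
    {"refund": "no",  "status": "single",  "income": 85,  "cls": "yes"},
    {"refund": "no",  "status": "married", "income": 70,  "cls": "no"},
    {"refund": "no",  "status": "single",  "income": 90,  "cls": "yes"},
]
TEST = [
    {"refund": "no",  "status": "single",  "income": 75,  "cls": "yes"},
    {"refund": "yes", "status": "married", "income": 110, "cls": "no"},
    {"refund": "no",  "status": "married", "income": 88,  "cls": "yes"},
]
CATEGORICAL = ["refund", "status"]  # the continuous attribute is ignored here


def learn_one_rule(train, attributes, label="cls"):
    """Pick the single attribute whose value -> majority-class rule
    makes the fewest errors on the training set."""
    default = Counter(r[label] for r in train).most_common(1)[0][0]
    best = None
    for attr in attributes:
        by_value = defaultdict(Counter)
        for r in train:
            by_value[r[attr]][r[label]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(r[label] != rule[r[attr]] for r in train)
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule, default


def predict(model, record):
    """Apply the learned rule; fall back to the default class for unseen values."""
    attr, rule, default = model
    return rule.get(record[attr], default)


def accuracy(model, records, label="cls"):
    """Fraction of records whose predicted class matches the true class."""
    return sum(predict(model, r) == r[label] for r in records) / len(records)


model = learn_one_rule(TRAIN, CATEGORICAL)
print("chosen attribute:", model[0])
print("training accuracy:", accuracy(model, TRAIN))
print("test accuracy:", accuracy(model, TEST))
```

Accuracy is reported separately on the training and test sets because only the test-set figure says anything about how the model handles records it has not seen.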
Classification: Application 1 (From [Berry & Linoff] Data Mining Techniques, 1997)
Classification: Application 2
Clustering Definition
Illustrating Clustering: intracluster distances are minimized; intercluster distances are maximized.
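The criterion on this slide (small intracluster distances, large intercluster distances) can be checked numerically. The sketch below is a minimal illustration, assuming Euclidean distance and hypothetical 2-D points; the two label assignments (`GOOD`, `BAD`) are made up to contrast a sensible grouping with a poor one.

```python
from itertools import combinations
from math import dist

# Hypothetical 2-D points forming two visually separate groups.
POINTS = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.5)]
GOOD = [0, 0, 0, 1, 1, 1]  # labels that follow the two groups
BAD = [0, 1, 0, 1, 0, 1]   # labels that mix the two groups


def mean_pair_distances(points, labels):
    """Return (mean intracluster distance, mean intercluster distance)
    over all pairs of points, using Euclidean distance."""
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(dist(p, q))
    return sum(intra) / len(intra), sum(inter) / len(inter)


good_intra, good_inter = mean_pair_distances(POINTS, GOOD)
bad_intra, bad_inter = mean_pair_distances(POINTS, BAD)
print(f"good labels: intra={good_intra:.2f} inter={good_inter:.2f}")
print(f"bad labels:  intra={bad_intra:.2f} inter={bad_inter:.2f}")
```

With the labels that follow the natural groups, the mean intracluster distance is much smaller than the mean intercluster distance; with the mixed labels the two averages move toward each other.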
Clustering Example
Association Rule Discovery: Definition. Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
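Rules such as {Milk} --> {Coke} are usually judged by their support and confidence. The slide's own transaction table did not survive extraction, so the sketch below uses a hypothetical set of five market-basket transactions; the function names (`count`, `support`, `confidence`) are likewise illustrative.

```python
# Hypothetical market-basket transactions (not recovered from the slide).
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    s = support(lhs | rhs, TRANSACTIONS)
    c = confidence(lhs, rhs, TRANSACTIONS)
    print(f"{sorted(lhs)} --> {sorted(rhs)}: support={s:.2f}, confidence={c:.2f}")
```

A rule-discovery algorithm would search over many candidate itemsets and keep only the rules above chosen support and confidence thresholds; this sketch only scores the two rules named on the slide.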
Association Rule Discovery Example
Regression
Deviation/Anomaly Detection (Typical network traffic at university level may reach over 100 million connections per day.)
Text Mining Tasks
Example: Decision Support using Bank Call Center Data
Example: Decision Support using Bank Call Center Data (sample record): AC2G31, 01, 0101, PCC, 021, 0053352, NEW YORK, NY, H-SUPRVR8, STMT, "Mr. Stark has been with the company for about 20 yrs. He hates his stmt format and wishes that we would show a daily balance to help him know when he falls below the required balance on the account."
Challenges of Data Mining
Challenges of Text Mining
SAS Training/Self-taught Courses
