DATA MINING
     BY
     SARANYA




               Page 1
INTRODUCTION

• New buzzword, old idea.
• Inferring new information from already
  collected data.
• Traditionally the job of data analysts.
• Computers have changed this:
  it is far more efficient to comb through
  data with a machine than to eyeball
  statistical data.

                                 Page 2
DEFINITION
      “Data mining is the entire
process of applying computer-based
methodology, including
new techniques for knowledge
discovery, from data.”


                             Page 3
Two Main Components

Knowledge Discovery
     Concrete information gleaned from known
 data: information you may not have known, but
 which is supported by recorded facts.

Knowledge Prediction
    Uses known data to forecast future trends,
 events, etc. (e.g., stock market predictions)
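
As a rough illustration of knowledge prediction, the sketch below fits a straight line to known observations by ordinary least squares and extrapolates one step ahead. The function name `fit_line` and the monthly sales figures are made up for the example; real forecasting would normally need a richer model.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical known data: sales for months 1..5
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predicted value for month 6
print(slope, intercept, forecast)
```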


                                       Page 4
Uses of Data Mining
• AI/Machine Learning
  Combinatorial/Game Data Mining
  Good for analyzing winning strategies in games, and
  thus developing intelligent AI opponents (e.g., chess).
• Business Strategies
  Market Basket Analysis
  Identify customer demographics, preferences, and
  purchasing patterns.
• Risk Analysis
  Product Defect Analysis
  Analyze product defect rates for given plants and
  predict possible complications (read: lawsuits) down
  the line.

                                             Page 5
(Continued)
• User Behavior Validation
  Fraud Detection
  In the realm of cell phones, comparing
  phone activity to calling records
  can help detect calls made on cloned
  phones.

  Similarly, with credit cards, comparing new
  purchases with historical purchases can
  detect activity on stolen cards.
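
The credit card comparison can be sketched as a simple outlier check: flag a purchase whose amount is far from the cardholder's historical average. This is only an illustration, not a method given in the slides; the function name `is_suspicious`, the three-standard-deviation threshold, and the sample amounts are all assumptions.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Return True if `amount` lies more than `threshold` standard
    deviations from the mean of past purchase amounts (hypothetical rule)."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Made-up purchase history for one cardholder
past = [12.5, 30.0, 22.0, 18.75, 25.0, 15.0]
print(is_suspicious(past, 21.0))   # an amount in the usual range
print(is_suspicious(past, 950.0))  # an amount far outside the usual range
```

Production fraud systems typically combine many signals (location, merchant, timing); a single-feature rule like this one mainly shows the idea of comparing new activity with recorded history.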

                                        Page 6
Sources of Data for Mining


  • Databases (most obvious)

  • Text Documents

  • Computer Simulations

  • Social Networks
                              Page 7
Data Mining Development




                      Page 8
Database Processing vs. Data Mining

             Database Processing      Data Mining
   Query     Well defined             Poorly defined
             SQL                      No precise query language
   Data      Operational data         Not operational data
   Output    Precise                  Fuzzy
             Subset of database       Not a subset of database
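
The contrast can be made concrete with a small sketch using Python's standard `sqlite3` module. The `purchases` table and its rows are invented for the example. The SQL query has a precise answer that is a subset of the stored rows, while the mining step asks an open-ended question ("what sells together?") and returns a derived pattern that is not stored anywhere in the table.

```python
import sqlite3
from collections import Counter
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (order_id INTEGER, item TEXT)")
rows = [(1, "bread"), (1, "milk"),
        (2, "bread"), (2, "milk"), (2, "eggs"),
        (3, "eggs"), (3, "milk"),
        (4, "bread"), (4, "milk")]
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)

# Database processing: well-defined query, precise output, subset of the data.
bread_orders = [r[0] for r in conn.execute(
    "SELECT order_id FROM purchases WHERE item = 'bread' ORDER BY order_id")]

# Data mining: group items into baskets, then count co-occurring pairs.
baskets = {}
for order_id, item in conn.execute("SELECT order_id, item FROM purchases"):
    baskets.setdefault(order_id, set()).add(item)

pair_counts = Counter()
for items in baskets.values():
    pair_counts.update(combinations(sorted(items), 2))

top_pair, top_count = pair_counts.most_common(1)[0]
print(bread_orders)
print(top_pair, top_count)
```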
                                             Page 9
Data Mining Models and Tasks




                         Page 10
Basic Data Mining Tasks
• Classification maps data into predefined
  groups or classes
  – Supervised learning
  – Pattern recognition
  – Prediction
• Regression maps a data item
  to a real-valued prediction variable.
• Clustering groups similar data together into
  clusters.
  – Unsupervised learning
  – Segmentation
  – Partitioning
                                      Page 11
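
To illustrate the difference between supervised and unsupervised tasks, here is a minimal sketch in plain Python. Classification is shown with a 1-nearest-neighbor rule that needs predefined labels; clustering is shown with a one-dimensional k-means that receives no labels at all. Both techniques are chosen for brevity and are not named in the slides; the function names, starting centers, and sample data are assumptions.

```python
def nearest_neighbor(train, point):
    """Classification: return the label of the closest labeled example."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(train, key=lambda example: dist2(example[0], point))
    return label

def kmeans_1d(values, centers, iterations=10):
    """Clustering: group numbers around centers that are refined iteratively."""
    def assign(current):
        groups = [[] for _ in current]
        for v in values:
            idx = min(range(len(current)), key=lambda i: abs(v - current[i]))
            groups[idx].append(v)
        return groups

    centers = list(centers)
    for _ in range(iterations):
        groups = assign(centers)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return assign(centers)

# Supervised: the classes "low" and "high" are predefined.
labeled = [((1.0, 1.0), "low"), ((1.5, 2.0), "low"),
           ((8.0, 9.0), "high"), ((9.0, 8.5), "high")]
predicted = nearest_neighbor(labeled, (2.0, 1.5))

# Unsupervised: only raw values and a guess at two starting centers.
clusters = kmeans_1d([1, 2, 3, 20, 21, 22], centers=[0, 10])
print(predicted, clusters)
```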
(cont’d)
• Summarization maps data into subsets with
  associated simple descriptions.
  – Characterization
  – Generalization
• Link Analysis uncovers relationships
  among data.
  – Affinity Analysis
  – Association Rules
  – Sequential Analysis determines sequential
    patterns.
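
Association rules are usually scored with two standard measures: support (the fraction of transactions containing an itemset) and confidence (how often the consequent appears among transactions that contain the antecedent). The sketch below computes both; the basket contents are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0

# Hypothetical market baskets
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

rule_support = support(baskets, {"diapers", "beer"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})  # 2 of 3 diaper baskets
print(rule_support, rule_confidence)
```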

                                         Page 12
Data Mining Techniques
• Statistical
   – Point Estimation
   – Models Based on Summarization
   – Bayes Theorem
   – Hypothesis Testing
   – Regression and Correlation
• Similarity Measures
• Decision Trees
• Neural Networks
   – Activation Functions
• Genetic Algorithms
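
As one example from the list, similarity measures quantify how alike two records are. The sketch below shows two common ones, Jaccard similarity for sets and cosine similarity for numeric vectors. The slides do not say which measures they intend, so these two are an assumption, as are the sample inputs.

```python
import math

def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)

def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

set_similarity = jaccard({"bread", "milk"}, {"milk", "eggs"})  # 1 shared of 3 distinct
vector_similarity = cosine([1, 0, 1], [1, 1, 0])
print(set_similarity, vector_similarity)
```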
                                     Page 13
Challenges of Data Mining
 • Scalability
 • Dimensionality
 • Complex and Heterogeneous Data
 • Data Quality
 • Data Ownership and Distribution
 • Privacy Preservation
 • Streaming Data


                              Page 14
THANK YOU




           Page 15
