DATA WAREHOUSING
AND DATA MINING
  DECISION TREE
Contents
•   Introduction
•   Decision Tree
•   Decision Tree Algorithm
•   Decision Tree Based Algorithm
•   Algorithm
•   Decision Tree Advantages and Disadvantages
Introduction
• Classification is one of the most familiar and
  most popular data mining techniques.
• Classification applications include image and
  pattern recognition, loan approval, and detecting
  faults in industrial applications.
• All approaches to performing classification
  assume some knowledge of the data.
• A training set is used to develop the specific
  parameters required by the technique.
Decision Tree

• Decision Tree (DT):
 ▫ Tree where the root and each internal node are
   labeled with a question.
 ▫ The arcs represent each possible answer to the
   associated question.
 ▫ Each leaf node represents a prediction of a solution
   to the problem.
• Popular technique for classification; each leaf node
  indicates the class to which the corresponding tuple
  belongs.
Decision Tree Example
Decision Tree
 • A Decision Tree Model is a computational
   model consisting of three parts:
   ▫ Decision Tree
   ▫ Algorithm to create the tree
   ▫ Algorithm that applies the tree to data
 • Creation of the tree is the most difficult part.
 • Processing is basically a search similar to that
   in a binary search tree (although DT may not
   be binary).
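
The search described above can be sketched in a few lines of Python. This is only an illustration: the loan-approval tree, the attribute names (`income`, `has_collateral`) and the dictionary layout are assumptions made for the example, not part of the slides. Note that the root has three arcs, so the tree is not binary.

```python
# Hypothetical tree: an internal node is a dict holding a "question"
# (an attribute name) and "answers" (one arc per possible answer);
# a leaf is simply a class label.
tree = {
    "question": "income",
    "answers": {
        "low": "reject",
        "medium": {
            "question": "has_collateral",
            "answers": {True: "approve", False: "reject"},
        },
        "high": "approve",
    },
}


def classify(node, record):
    """Walk from the root to a leaf, following the arc that matches each answer."""
    while isinstance(node, dict):
        answer = record[node["question"]]
        node = node["answers"][answer]
    return node


print(classify(tree, {"income": "medium", "has_collateral": True}))  # approve
```

Each step discards every subtree except the one matching the answer, which is what makes the lookup resemble a binary search tree traversal.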
Decision Tree Algorithm
Algorithm Definition

• The decision tree approach is most useful in
  classification problems. With this technique, a
  tree is constructed to model the classification
  process.
• Once the tree is built, it is applied to each tuple
  in the database and results in a classification for
  that tuple.
• There are two basic steps in this technique:
  building the tree and applying the tree to the
  database.
• The decision tree approach to classification is to
  divide the search space into rectangular regions.
  A tuple is classified based on the region into
  which it falls.
• Definition: Given a database D = {t1, ..., tn},
  where ti = <ti1, ..., tih>, a database schema
  consisting of the attributes {A1, A2, ..., Ah},
  and a set of classes C = {C1, ..., Cm}, a decision
  tree (DT), or classification tree, is a tree associated
  with D that has the following properties:
  ▫ Each internal node is labeled with an attribute Ai.
  ▫ Each arc is labeled with a predicate that can be
    applied to the attribute associated with the parent.
  ▫ Each leaf node is labeled with a class Cj.
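
One possible way to mirror this definition in code is sketched below. The class names `Node` and `Leaf`, the thresholds 5 and 3, and the labels C1 to C3 are illustrative assumptions. Because every predicate tests a single attribute against a constant, the three leaves correspond to three rectangular regions of the (A1, A2) plane.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple, Union


@dataclass
class Leaf:
    label: str  # a class Cj


@dataclass
class Node:
    attribute: int  # index i of the attribute Ai tested at this node
    # Each arc pairs a predicate on the attribute value with a subtree.
    arcs: List[Tuple[Callable[[float], bool], Union["Node", Leaf]]] = field(
        default_factory=list
    )


def classify(tree, t):
    """Follow, at each node, the first arc whose predicate holds for tuple t."""
    while isinstance(tree, Node):
        value = t[tree.attribute]
        # Assumes the predicates on a node cover every possible value.
        tree = next(child for predicate, child in tree.arcs if predicate(value))
    return tree.label


# Regions: A1 < 5 -> C1;  A1 >= 5 and A2 < 3 -> C2;  A1 >= 5 and A2 >= 3 -> C3
dt = Node(0, [
    (lambda v: v < 5, Leaf("C1")),
    (lambda v: v >= 5, Node(1, [
        (lambda v: v < 3, Leaf("C2")),
        (lambda v: v >= 3, Leaf("C3")),
    ])),
])

for t in [(2.0, 9.0), (7.0, 1.0), (7.0, 4.0)]:
    print(t, classify(dt, t))
```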
Algorithm
• Input:
  D        // Training data
• Output:
  T        // Decision tree
• DTBuild algorithm
  // Simplistic algorithm to illustrate a naive
  // approach to building a DT
• T = ∅;
  Determine best splitting criterion;
  T = Create root node and label it with the splitting
      attribute;
  T = Add an arc to the root node for each split predicate
      and label it;
  for each arc do
    D' = database created by applying the splitting
         predicate to D;
    if stopping point reached for this path then
      T' = Create leaf node and label it with the
           appropriate class;
    else
      T' = DTBuild(D');
    T = Add T' to arc;
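
The pseudocode leaves the splitting criterion and the stopping point open. The Python sketch below fills them in with assumptions that are not taken from the slides: categorical attributes, an entropy-based criterion (lowest weighted entropy after the split), and stopping when a subset is pure or no attributes remain. The function names and the toy weather data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Entropy of the class labels in rows, where each row is (record, label)."""
    counts = Counter(label for _, label in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def majority(rows):
    """Most frequent class label in rows."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def best_attribute(rows, attributes):
    """Assumed criterion: the attribute with the lowest weighted entropy after splitting."""
    def remainder(attr):
        groups = {}
        for record, label in rows:
            groups.setdefault(record[attr], []).append((record, label))
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return min(attributes, key=remainder)


def dt_build(rows, attributes):
    # Assumed stopping point: pure subset, or nothing left to split on.
    if len({label for _, label in rows}) == 1 or not attributes:
        return majority(rows)                      # leaf labeled with a class
    attr = best_attribute(rows, attributes)        # best splitting criterion
    node = {"attribute": attr, "arcs": {}}         # root labeled with the attribute
    remaining = [a for a in attributes if a != attr]
    for value in sorted({record[attr] for record, _ in rows}):   # one arc per predicate
        subset = [(r, lab) for r, lab in rows if r[attr] == value]   # D'
        node["arcs"][value] = dt_build(subset, remaining)        # T' added to the arc
    return node


rows = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
]
print(dt_build(rows, ["outlook", "windy"]))
```

On this data the sketch splits on `outlook` first and only tests `windy` under the `rainy` arc, since that is the only subset that is still mixed.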
DT Advantages/Disadvantages
 • Advantages:
  ▫ Easy to understand.
  ▫ Easy to generate rules.
 • Disadvantages:
  ▫ May suffer from overfitting.
  ▫ Classifies by rectangular partitioning.
  ▫ Does not easily handle nonnumeric data.
  ▫ Can be quite large – pruning is necessary.
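
To illustrate why rules are easy to generate, the sketch below turns every root-to-leaf path into one IF-THEN rule. The nested-dictionary tree layout, the function name `tree_to_rules` and the example tree are assumptions made for this illustration.

```python
def tree_to_rules(node, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path."""
    if not isinstance(node, dict):                 # leaf: emit the finished rule
        lhs = " AND ".join(conditions) or "TRUE"
        return [f"IF {lhs} THEN class = {node}"]
    rules = []
    for value, child in node["arcs"].items():      # one branch per arc
        test = f"{node['attribute']} = {value}"
        rules += tree_to_rules(child, conditions + (test,))
    return rules


# Hypothetical tree used only for this example.
tree = {
    "attribute": "outlook",
    "arcs": {
        "sunny": "play",
        "rainy": {"attribute": "windy", "arcs": {"no": "play", "yes": "stay"}},
    },
}

for rule in tree_to_rules(tree):
    print(rule)
# IF outlook = sunny THEN class = play
# IF outlook = rainy AND windy = no THEN class = play
# IF outlook = rainy AND windy = yes THEN class = stay
```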