SlideShare une entreprise Scribd logo
1  sur  29
Data Mining Classification: Alternative Techniques
Rule-Based Classifier Classify records by using a collection of “if…then…” rules Rule:    (Condition)  y where   Condition is a conjunctions of attributes   y is the class label LHS: rule antecedent or condition RHS: rule consequent
Characteristics of Rule-Based Classifier Mutually exclusive rules Classifier contains mutually exclusive rules if the rules are independent of each other Every record is covered by at most one rule Exhaustive rules Classifier has exhaustive coverage if it accounts for every possible combination of attribute values Each record is covered by at least one rule
Building Classification Rules Direct Method:   Extract rules directly from data  e.g.: RIPPER, CN2, Holte’s 1R Indirect Method:  Extract rules from other classification models (e.g.    decision trees, neural networks, etc). e.g: C4.5rules
Direct Method: Sequential Covering Start from an empty rule Grow a rule using the Learn-One-Rule function Remove training records covered by the rule Repeat Step (2) and (3) until stopping criterion is met
Aspects of Sequential Covering Rule Growing Instance Elimination Rule Evaluation Stopping Criterion Rule Pruning
Contd… Grow a single rule Remove Instances from rule Prune the rule (if necessary) Add rule to Current Rule Set Repeat
Indirect Method: C4.5rules Extract rules from an unpruned decision tree For each rule, r: A  y,  consider an alternative rule r’: A’  y where A’ is obtained by removing one of the conjuncts in A Compare the pessimistic error rate for r against all r’s Prune if one of the r’s has lower pessimistic error rate Repeat until we can no longer improve generalization error
Indirect Method: C4.5rules Instead of ordering the rules, order subsets of rules (class ordering) Each subset is a collection of rules with the same rule consequent (class) Compute description length of each subset  Description length = L(error) + g L(model)  g is a parameter that takes into account the presence of redundant attributes in a rule set (default value = 0.5)
Advantages of Rule-Based Classifiers As highly expressive as decision trees Easy to interpret Easy to generate Can classify new instances rapidly Performance comparable to decision trees
Nearest Neighbor Classifiers ,[object Object]
The set of stored records
Distance Metric to compute distance between records
The value of k, the number of nearest neighbors to retrieve
To classify an unknown record:
Compute distance to other training records
Identify k nearest neighbors
Use class labels of nearest neighbors to determine the class label of unknown record (e.g., by taking majority vote,[object Object]
Nearest Neighbor Classification… Choosing the value of k: If k is too small, sensitive to noise points If k is too large, neighborhood may include points from other classes Scaling issues Attributes may have to be scaled to prevent distance measures from being dominated by one of the attributes Example:  height of a person may vary from 1.5m to 1.8m  weight of a person may vary from 90lb to 300lb
Nearest neighbor Classification… k-NN classifiers are lazy learners  It does not build models explicitly Unlike eager learners such as decision tree induction and rule-based systems Classifying unknown records are relatively expensive
Bayes Classifier A probabilistic framework for solving classification problems Conditional Probability: Bayes theorem:
Example of Bayes Theorem Given:  A doctor knows that meningitis causes stiff neck 50% of the time Prior probability of any patient having meningitis is 1/50,000 Prior probability of any patient having stiff neck is 1/20  If a patient has stiff neck, what’s the probability he/she has meningitis?
Naïve Bayes Classifier Assume independence among attributes Ai when class is given:     P(A1, A2, …, An |C) = P(A1| Cj) P(A2| Cj)… P(An| Cj) Can estimate P(Ai| Cj) for all Ai and Cj. New point is classified to Cj if  P(Cj)  P(Ai| Cj)  is maximal.
Naïve Bayes Classifier If one of the conditional probability is zero, then the entire expression becomes zero Probability estimation: c: number of classes p: prior probability m: parameter
Naïve Bayes (Summary) Robust to isolated noise points Handle missing values by ignoring the instance during probability estimate calculations Robust to irrelevant attributes Independence assumption may not hold for some attributes Use other techniques such as Bayesian Belief Networks (BBN)
Artificial Neural Networks (ANN) Model is an assembly of inter-connected nodes and weighted links Output node sums up each of its input value according to the weights of its links Compare output node against some threshold t
General Structure of ANN Training ANN means learning the weights of the neurons
Algorithm for learning ANN Initialize the weights (w0, w1, …, wk) Adjust the weights in such a way that the output of ANN is consistent with class labels of training examples Objective function: Find the weights wi’s that minimize the above objective function  e.g., backpropagation algorithm
Ensemble Methods Construct a set of classifiers from the training data Predict class label of previously unseen records by aggregating predictions made by multiple classifiers

Contenu connexe

Tendances (18)

Cis166 final review c#
Cis166 final review c#Cis166 final review c#
Cis166 final review c#
 
CIS-166 Midterm
CIS-166 MidtermCIS-166 Midterm
CIS-166 Midterm
 
Ap Power Point Chpt9
Ap Power Point Chpt9Ap Power Point Chpt9
Ap Power Point Chpt9
 
Linear models and multiclass classification
Linear models and multiclass classificationLinear models and multiclass classification
Linear models and multiclass classification
 
List classes
List classesList classes
List classes
 
Lecture1
Lecture1Lecture1
Lecture1
 
Vector space classification
Vector space classificationVector space classification
Vector space classification
 
C++ arrays part1
C++ arrays part1C++ arrays part1
C++ arrays part1
 
Arrays
ArraysArrays
Arrays
 
Ap Power Point Chpt6
Ap Power Point Chpt6Ap Power Point Chpt6
Ap Power Point Chpt6
 
Finding everything about findings about (fa)
Finding everything about findings about (fa)Finding everything about findings about (fa)
Finding everything about findings about (fa)
 
Array 2 hina
Array 2 hina Array 2 hina
Array 2 hina
 
Datastructures and algorithms prepared by M.V.Brehmanada Reddy
Datastructures and algorithms prepared by M.V.Brehmanada ReddyDatastructures and algorithms prepared by M.V.Brehmanada Reddy
Datastructures and algorithms prepared by M.V.Brehmanada Reddy
 
M v bramhananda reddy dsa complete notes
M v bramhananda reddy dsa complete notesM v bramhananda reddy dsa complete notes
M v bramhananda reddy dsa complete notes
 
Collections (1)
Collections (1)Collections (1)
Collections (1)
 
Chap4java5th
Chap4java5thChap4java5th
Chap4java5th
 
Types of methods in python
Types of methods in pythonTypes of methods in python
Types of methods in python
 
264finalppt (1)
264finalppt (1)264finalppt (1)
264finalppt (1)
 

En vedette (20)

Txomin Hartz Txikia
Txomin Hartz TxikiaTxomin Hartz Txikia
Txomin Hartz Txikia
 
Norihicodanch
NorihicodanchNorihicodanch
Norihicodanch
 
Ccc
CccCcc
Ccc
 
MED dra Coding -MSSO
MED dra Coding -MSSOMED dra Coding -MSSO
MED dra Coding -MSSO
 
SPSS: Data Editor
SPSS: Data EditorSPSS: Data Editor
SPSS: Data Editor
 
XL-Miner: Timeseries
XL-Miner: TimeseriesXL-Miner: Timeseries
XL-Miner: Timeseries
 
MS SQL SERVER: Microsoft sequence clustering and association rules
MS SQL SERVER: Microsoft sequence clustering and association rulesMS SQL SERVER: Microsoft sequence clustering and association rules
MS SQL SERVER: Microsoft sequence clustering and association rules
 
LISP:Loops In Lisp
LISP:Loops In LispLISP:Loops In Lisp
LISP:Loops In Lisp
 
Mysql:Operators
Mysql:OperatorsMysql:Operators
Mysql:Operators
 
Data Applied: Association
Data Applied: AssociationData Applied: Association
Data Applied: Association
 
Asha & Beckis Nc Presentation
Asha & Beckis Nc PresentationAsha & Beckis Nc Presentation
Asha & Beckis Nc Presentation
 
Quick Look At Classification
Quick Look At ClassificationQuick Look At Classification
Quick Look At Classification
 
Data
DataData
Data
 
R: Apply Functions
R: Apply FunctionsR: Apply Functions
R: Apply Functions
 
RapidMiner: Setting Up A Process
RapidMiner: Setting Up A ProcessRapidMiner: Setting Up A Process
RapidMiner: Setting Up A Process
 
LíRica Latina 2ºBac Lara Lozano
LíRica Latina 2ºBac Lara LozanoLíRica Latina 2ºBac Lara Lozano
LíRica Latina 2ºBac Lara Lozano
 
Control Statements in Matlab
Control Statements in  MatlabControl Statements in  Matlab
Control Statements in Matlab
 
Data Applied:Decision Trees
Data Applied:Decision TreesData Applied:Decision Trees
Data Applied:Decision Trees
 
SPSS: Quick Look
SPSS: Quick LookSPSS: Quick Look
SPSS: Quick Look
 
Épica Latina Latín II
Épica Latina Latín IIÉpica Latina Latín II
Épica Latina Latín II
 

Similaire à Alternative Data Mining Classification Techniques

Data.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and predictionData.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and predictionMargaret Wang
 
slides
slidesslides
slidesbutest
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsSalah Amean
 
Classification Of Web Documents
Classification Of Web Documents Classification Of Web Documents
Classification Of Web Documents hussainahmad77100
 
Chapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.pptChapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.pptSubrata Kumer Paul
 
2.7 other classifiers
2.7 other classifiers2.7 other classifiers
2.7 other classifiersKrish_ver2
 
Classifiers
ClassifiersClassifiers
ClassifiersAyurdata
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorizationmidi
 
Capter10 cluster basic
Capter10 cluster basicCapter10 cluster basic
Capter10 cluster basicHouw Liong The
 
Capter10 cluster basic : Han & Kamber
Capter10 cluster basic : Han & KamberCapter10 cluster basic : Han & Kamber
Capter10 cluster basic : Han & KamberHouw Liong The
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptAnshika865276
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Salah Amean
 
Islamic University Pattern Recognition & Neural Network 2019
Islamic University Pattern Recognition & Neural Network 2019 Islamic University Pattern Recognition & Neural Network 2019
Islamic University Pattern Recognition & Neural Network 2019 Rakibul Hasan Pranto
 
10 clusbasic
10 clusbasic10 clusbasic
10 clusbasicengrasi
 

Similaire à Alternative Data Mining Classification Techniques (20)

Data.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and predictionData.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and prediction
 
slides
slidesslides
slides
 
Text categorization
Text categorizationText categorization
Text categorization
 
Supervised algorithms
Supervised algorithmsSupervised algorithms
Supervised algorithms
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
 
09 classadvanced
09 classadvanced09 classadvanced
09 classadvanced
 
Classification Of Web Documents
Classification Of Web Documents Classification Of Web Documents
Classification Of Web Documents
 
Chapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.pptChapter 9. Classification Advanced Methods.ppt
Chapter 9. Classification Advanced Methods.ppt
 
2.7 other classifiers
2.7 other classifiers2.7 other classifiers
2.7 other classifiers
 
Classifiers
ClassifiersClassifiers
Classifiers
 
[ppt]
[ppt][ppt]
[ppt]
 
[ppt]
[ppt][ppt]
[ppt]
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorization
 
Capter10 cluster basic
Capter10 cluster basicCapter10 cluster basic
Capter10 cluster basic
 
Capter10 cluster basic : Han & Kamber
Capter10 cluster basic : Han & KamberCapter10 cluster basic : Han & Kamber
Capter10 cluster basic : Han & Kamber
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.ppt
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
 
Islamic University Pattern Recognition & Neural Network 2019
Islamic University Pattern Recognition & Neural Network 2019 Islamic University Pattern Recognition & Neural Network 2019
Islamic University Pattern Recognition & Neural Network 2019
 
10 clusbasic
10 clusbasic10 clusbasic
10 clusbasic
 
Machine learning
Machine learningMachine learning
Machine learning
 

Plus de DataminingTools Inc

AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceDataminingTools Inc
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web miningDataminingTools Inc
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataDataminingTools Inc
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDataminingTools Inc
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDataminingTools Inc
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technologyDataminingTools Inc
 

Plus de DataminingTools Inc (20)

Terminology Machine Learning
Terminology Machine LearningTerminology Machine Learning
Terminology Machine Learning
 
Techniques Machine Learning
Techniques Machine LearningTechniques Machine Learning
Techniques Machine Learning
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
 
Areas of machine leanring
Areas of machine leanringAreas of machine leanring
Areas of machine leanring
 
AI: Planning and AI
AI: Planning and AIAI: Planning and AI
AI: Planning and AI
 
AI: Logic in AI 2
AI: Logic in AI 2AI: Logic in AI 2
AI: Logic in AI 2
 
AI: Logic in AI
AI: Logic in AIAI: Logic in AI
AI: Logic in AI
 
AI: Learning in AI 2
AI: Learning in AI 2AI: Learning in AI 2
AI: Learning in AI 2
 
AI: Learning in AI
AI: Learning in AI AI: Learning in AI
AI: Learning in AI
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
 
AI: Belief Networks
AI: Belief NetworksAI: Belief Networks
AI: Belief Networks
 
AI: AI & Searching
AI: AI & SearchingAI: AI & Searching
AI: AI & Searching
 
AI: AI & Problem Solving
AI: AI & Problem SolvingAI: AI & Problem Solving
AI: AI & Problem Solving
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web mining
 
Data Mining: Outlier analysis
Data Mining: Outlier analysisData Mining: Outlier analysis
Data Mining: Outlier analysis
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technology
 
Data Mining: Data processing
Data Mining: Data processingData Mining: Data processing
Data Mining: Data processing
 

Dernier

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Dernier (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

Alternative Data Mining Classification Techniques

  • 1. Data Mining Classification: Alternative Techniques
  • 2. Rule-Based Classifier Classify records by using a collection of “if…then…” rules Rule: (Condition)  y where Condition is a conjunctions of attributes y is the class label LHS: rule antecedent or condition RHS: rule consequent
  • 3. Characteristics of Rule-Based Classifier Mutually exclusive rules Classifier contains mutually exclusive rules if the rules are independent of each other Every record is covered by at most one rule Exhaustive rules Classifier has exhaustive coverage if it accounts for every possible combination of attribute values Each record is covered by at least one rule
  • 4. Building Classification Rules Direct Method: Extract rules directly from data e.g.: RIPPER, CN2, Holte’s 1R Indirect Method: Extract rules from other classification models (e.g. decision trees, neural networks, etc). e.g: C4.5rules
  • 5. Direct Method: Sequential Covering Start from an empty rule Grow a rule using the Learn-One-Rule function Remove training records covered by the rule Repeat Step (2) and (3) until stopping criterion is met
  • 6. Aspects of Sequential Covering Rule Growing Instance Elimination Rule Evaluation Stopping Criterion Rule Pruning
  • 7. Contd… Grow a single rule Remove Instances from rule Prune the rule (if necessary) Add rule to Current Rule Set Repeat
  • 8. Indirect Method: C4.5rules Extract rules from an unpruned decision tree For each rule, r: A  y, consider an alternative rule r’: A’  y where A’ is obtained by removing one of the conjuncts in A Compare the pessimistic error rate for r against all r’s Prune if one of the r’s has lower pessimistic error rate Repeat until we can no longer improve generalization error
  • 9. Indirect Method: C4.5rules Instead of ordering the rules, order subsets of rules (class ordering) Each subset is a collection of rules with the same rule consequent (class) Compute description length of each subset Description length = L(error) + g L(model) g is a parameter that takes into account the presence of redundant attributes in a rule set (default value = 0.5)
  • 10. Advantages of Rule-Based Classifiers As highly expressive as decision trees Easy to interpret Easy to generate Can classify new instances rapidly Performance comparable to decision trees
  • 11.
  • 12. The set of stored records
  • 13. Distance Metric to compute distance between records
  • 14. The value of k, the number of nearest neighbors to retrieve
  • 15. To classify an unknown record:
  • 16. Compute distance to other training records
  • 17. Identify k nearest neighbors
  • 18.
  • 19. Nearest Neighbor Classification… Choosing the value of k: If k is too small, sensitive to noise points If k is too large, neighborhood may include points from other classes Scaling issues Attributes may have to be scaled to prevent distance measures from being dominated by one of the attributes Example: height of a person may vary from 1.5m to 1.8m weight of a person may vary from 90lb to 300lb
  • 20. Nearest neighbor Classification… k-NN classifiers are lazy learners It does not build models explicitly Unlike eager learners such as decision tree induction and rule-based systems Classifying unknown records are relatively expensive
  • 21. Bayes Classifier A probabilistic framework for solving classification problems Conditional Probability: Bayes theorem:
  • 22. Example of Bayes Theorem Given: A doctor knows that meningitis causes stiff neck 50% of the time Prior probability of any patient having meningitis is 1/50,000 Prior probability of any patient having stiff neck is 1/20 If a patient has stiff neck, what’s the probability he/she has meningitis?
  • 23. Naïve Bayes Classifier Assume independence among attributes Ai when class is given: P(A1, A2, …, An |C) = P(A1| Cj) P(A2| Cj)… P(An| Cj) Can estimate P(Ai| Cj) for all Ai and Cj. New point is classified to Cj if P(Cj)  P(Ai| Cj) is maximal.
  • 24. Naïve Bayes Classifier If one of the conditional probability is zero, then the entire expression becomes zero Probability estimation: c: number of classes p: prior probability m: parameter
  • 25. Naïve Bayes (Summary) Robust to isolated noise points Handle missing values by ignoring the instance during probability estimate calculations Robust to irrelevant attributes Independence assumption may not hold for some attributes Use other techniques such as Bayesian Belief Networks (BBN)
  • 26. Artificial Neural Networks (ANN) Model is an assembly of inter-connected nodes and weighted links Output node sums up each of its input value according to the weights of its links Compare output node against some threshold t
  • 27. General Structure of ANN Training ANN means learning the weights of the neurons
  • 28. Algorithm for learning ANN Initialize the weights (w0, w1, …, wk) Adjust the weights in such a way that the output of ANN is consistent with class labels of training examples Objective function: Find the weights wi’s that minimize the above objective function e.g., backpropagation algorithm
  • 29. Ensemble Methods Construct a set of classifiers from the training data Predict class label of previously unseen records by aggregating predictions made by multiple classifiers
  • 31. Why does it work? Suppose there are 25 base classifiers Each classifier has error rate,  = 0.35 Assume classifiers are independent Probability that the ensemble classifier makes a wrong prediction:
  • 32. Examples of Ensemble Methods How to generate an ensemble of classifiers? Bagging Boosting
  • 33. Bagging Sampling with replacement Build classifier on each bootstrap sample Each sample has probability (1 – 1/n)n of being selected
  • 34. Boosting An iterative procedure to adaptively change distribution of training data by focusing more on previously misclassified records Initially, all N records are assigned equal weights Unlike bagging, weights may change at the end of boosting round
  • 35. Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net