XL Miner: Classification

Introduction to XLMiner™: the data mining add-in for Microsoft Excel. Classification. XLMiner and Microsoft Office are registered trademarks of their respective owners.

CLASSIFICATION: XLMiner provides several tools that can be used to classify data: Discriminant Analysis, Logistic Regression, Classification Tree, Naive Bayes, Neural Network (multilayer feed-forward), and k-Nearest Neighbors. Let us look at these methods one by one. http://dataminingtools.net

CLASSIFICATION-Discriminant Analysis: Discriminant analysis is a technique for classifying a set of observations into predefined classes. The purpose is to determine the class of an observation from a set of variables known as predictors or input variables. The model is built from a set of observations whose classes are known, sometimes referred to as the training set. From the training set, the technique constructs a set of linear functions of the predictors, known as discriminant functions. We will use Wine.xls as the data source.
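To make the idea of discriminant functions concrete, here is a minimal sketch in Python for two predictors. It uses the textbook form of linear discriminant analysis (pooled covariance plus a log-prior term); whether XLMiner computes exactly this is an assumption, and the data and the names `fit_discriminant_functions` and `classify` are made up for illustration. They are not taken from Wine.xls or from XLMiner.

```python
import math

def fit_discriminant_functions(rows, labels):
    """Fit one linear function per class for 2-predictor data (textbook LDA)."""
    classes = sorted(set(labels))
    n = len(rows)
    means, priors = {}, {}
    for c in classes:
        members = [r for r, y in zip(rows, labels) if y == c]
        means[c] = [sum(col) / len(members) for col in zip(*members)]
        priors[c] = len(members) / n  # prior = relative frequency in training set
    # Pooled within-class covariance matrix (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r, y in zip(rows, labels):
        d = [r[0] - means[y][0], r[1] - means[y][1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j] / (n - len(classes))
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    functions = {}
    for c in classes:
        m = means[c]
        # Coefficients w = inverse covariance times class mean.
        w = [inv[0][0] * m[0] + inv[0][1] * m[1],
             inv[1][0] * m[0] + inv[1][1] * m[1]]
        const = -0.5 * (w[0] * m[0] + w[1] * m[1]) + math.log(priors[c])
        functions[c] = (const, w)
    return functions

def classify(functions, x):
    """Assign x to the class whose discriminant function scores highest."""
    scores = {c: const + w[0] * x[0] + w[1] * x[1]
              for c, (const, w) in functions.items()}
    return max(scores, key=scores.get)

# Made-up training set: two predictors, two known classes.
rows = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (6.0, 7.0), (6.5, 6.2), (7.0, 7.5)]
labels = [1, 1, 1, 2, 2, 2]
functions = fit_discriminant_functions(rows, labels)
print(classify(functions, (1.2, 2.1)), classify(functions, (6.8, 7.0)))
```

Each class gets a constant and one coefficient per predictor, and a new observation goes to the class with the largest score.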

CLASSIFICATION-Discriminant Analysis (Step 1): Select the independent variables to use as the input variables, and the dependent variable to use as the output variable.

CLASSIFICATION-Discriminant Analysis (Step 2): Choosing “According to relative occurrences” sets the prior probability of each class equal to its relative frequency in the training set. Choosing “Use equal” sets the prior probabilities of all classes to be equal.
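The two prior-probability options can be sketched in a few lines of Python. The function name `class_priors`, its `use_equal` flag, and the labels are hypothetical, chosen only to mirror the two dialog choices.

```python
from collections import Counter

def class_priors(labels, use_equal=False):
    """Return a dict mapping each class to its prior probability."""
    counts = Counter(labels)
    if use_equal:
        # "Use equal": every class gets the same prior.
        return {c: 1.0 / len(counts) for c in counts}
    # "According to relative occurrences": prior = frequency in training set.
    total = len(labels)
    return {c: n / total for c, n in counts.items()}

# Made-up training labels: three records of A, two of B, one of C.
labels = ["A", "A", "A", "B", "B", "C"]
relative = class_priors(labels)
equal = class_priors(labels, use_equal=True)
print(relative)
print(equal)
```

With relative occurrences, class A receives the largest prior because it is the most frequent class in this training set; with equal priors every class receives one third.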

CLASSIFICATION-Discriminant Analysis (Step 3): Check the options that you want displayed in the output, and then click Finish.

CLASSIFICATION-Discriminant Analysis (Output)

CLASSIFICATION-Discriminant Analysis: This section of the output shows how each training data case was classified. The highest probability value in each record is highlighted.

CLASSIFICATION- Classification Trees: These trees are very useful for classifying or predicting outcomes. They generate simple rules that can easily be translated into natural language or a query language. Decision trees work by binary recursive partitioning, i.e. they classify a record by repeatedly checking whether or not it meets the criterion at a node. Since the partitioning is binary, it is essential that the two branches of each node represent mutually exclusive conditions.
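As an illustration of how a record moves through such a tree, here is a small Python sketch. The tree, its feature names (`alcohol`, `color`), thresholds, and class labels are all hypothetical, not taken from XLMiner output.

```python
# Hypothetical tree: every internal node tests one condition, and the
# "yes" and "no" branches are mutually exclusive.
tree = {
    "feature": "alcohol", "threshold": 12.5,
    "yes": {"label": "A"},               # alcohol <= 12.5
    "no": {                              # alcohol > 12.5
        "feature": "color", "threshold": 4.0,
        "yes": {"label": "B"},           # color <= 4.0
        "no": {"label": "C"},            # color > 4.0
    },
}

def classify(node, record):
    """Follow the branch the record satisfies until a terminal node is reached."""
    while "label" not in node:
        meets = record[node["feature"]] <= node["threshold"]
        node = node["yes"] if meets else node["no"]
    return node["label"]

print(classify(tree, {"alcohol": 13.0, "color": 3.0}))
```

Each path from the root to a terminal node reads as a rule, for example "alcohol > 12.5 and color <= 4.0 gives class B".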

CLASSIFICATION- Classification Trees (Step 1)

CLASSIFICATION- Classification Trees (Step 2): The “Minimum #records in terminal node” setting determines when partitioning stops: once a node reaches this minimum number of records it is not split further, so that the model is not overfitted.
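The effect of this setting can be sketched with a small tree-growing routine in Python. This is only an illustration under stated assumptions: Gini impurity is assumed as the splitting criterion, a node holding `min_records` or fewer records is assumed to become terminal, and the data and function names are made up. XLMiner's exact rules may differ.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) with the lowest weighted impurity, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow(rows, labels, min_records):
    """Recursively partition the records into a binary tree."""
    leaf = {"label": Counter(labels).most_common(1)[0][0], "n": len(labels)}
    # Stop when the node is small enough or already contains a single class.
    if len(labels) <= min_records or len(set(labels)) == 1:
        return leaf
    split = best_split(rows, labels)
    if split is None:
        return leaf
    f, t = split
    left = [(row, y) for row, y in zip(rows, labels) if row[f] <= t]
    right = [(row, y) for row, y in zip(rows, labels) if row[f] > t]
    return {
        "feature": f, "threshold": t,
        "yes": grow([r for r, _ in left], [y for _, y in left], min_records),
        "no": grow([r for r, _ in right], [y for _, y in right], min_records),
    }

def count_leaves(node):
    """Number of terminal nodes in the tree."""
    if "label" in node:
        return 1
    return count_leaves(node["yes"]) + count_leaves(node["no"])

# Made-up data: one predictor, two classes.
rows = [(1,), (2,), (3,), (4,), (5,), (6,)]
labels = ["A", "A", "B", "A", "B", "B"]
for min_records in (1, 2, 6):
    print(min_records, count_leaves(grow(rows, labels, min_records)))
```

A larger minimum produces a smaller tree, which is how the setting limits overfitting to the training data.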

CLASSIFICATION- Classification Trees (Step 3): Select the options for the output. Selecting “Best pruned tree” causes the tree to be pruned and the pruned tree that best fits the validation set to be selected.

CLASSIFICATION- Classification Trees (Output): The rules that are used to create the nodes.

CLASSIFICATION- Classification Trees (Output)

CLASSIFICATION- Naïve Bayes Theorem: The Naïve Bayes classifier applies Bayes' theorem under the assumption that the predictors are independent of one another within each class, i.e. the value of one variable does not affect that of the others. If there are, say, 10 variables that a classification technique has to consider, Naïve Bayes classifies by taking each variable into account separately.
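Taking each variable into account separately means multiplying the class prior by one conditional probability per variable. A minimal Python sketch for categorical predictors follows. The data and function names are made up for illustration, and no smoothing is applied, so a value never seen with a class gives that class a probability of zero.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Count class frequencies and, per class and variable, value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(class, variable index)][value] = number of training records
    value_counts = defaultdict(Counter)
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(y, i)][v] += 1
    return class_counts, value_counts

def posterior(model, row):
    """Return the normalized class probabilities for one record."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior probability of the class
        for i, v in enumerate(row):
            # Each variable contributes its own factor, independently of the others.
            score *= value_counts[(c, i)][v] / n_c
        scores[c] = score
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}

# Made-up training records: (outlook, temperature) and a yes/no class.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes", "yes", "no"]
model = fit_naive_bayes(rows, labels)
probs = posterior(model, ("sunny", "mild"))
print(probs)
```

For the record ("sunny", "mild"), class "no" scores 1/2 x 2/3 x 1/3 and class "yes" scores 1/2 x 1/3 x 1/3, so after normalization "no" receives two thirds of the probability.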

CLASSIFICATION- Naïve Bayes Theorem (Step 1)

CLASSIFICATION- Naïve Bayes Theorem (Step 2-3)

CLASSIFICATION- Naïve Bayes Theorem (Output)

CLASSIFICATION- Naïve Bayes Theorem (Output)

CLASSIFICATION- k-nearest neighbors: In k-nearest neighbors classification (k-NN), the k nearest neighbors of each record are identified (nearness is defined by the Euclidean distance to the record in question) and the class to which the majority of them belong is determined. The record is then assigned to that class.
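The majority vote described above can be sketched in Python as follows. The function name `knn_classify` and the training data are made up for illustration, and the share of neighbors per class is reported as a simple stand-in for a class probability, which is an assumption rather than XLMiner's documented calculation.

```python
import math
from collections import Counter

def knn_classify(train_rows, train_labels, query, k):
    """Return the majority class among the k nearest neighbors and the vote shares."""
    # Sort training records by Euclidean distance to the query record.
    by_distance = sorted(zip(train_rows, train_labels),
                         key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    shares = {c: n / k for c, n in votes.items()}
    return votes.most_common(1)[0][0], shares

# Made-up training records with two predictors.
train_rows = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
train_labels = ["A", "A", "A", "B", "B"]
label, shares = knn_classify(train_rows, train_labels, (4.0, 4.0), k=3)
print(label, shares)
```

For the query (4.0, 4.0) with k = 3, two of the three nearest neighbors belong to class B, so the record is assigned to B.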

CLASSIFICATION- k-nearest neighbors (Step 1)

CLASSIFICATION- k-nearest neighbors (Step 2)

CLASSIFICATION- k-nearest neighbors (Output)

CLASSIFICATION- k-nearest neighbors (Output): Each record is placed in the class with the highest probability.

Thank you. For more presentations and tutorial videos on data mining, please visit http://dataminingtools.net
