Data Mining Using WEKA



         Submitted to
    Prof. Prithwis Mukerjee


        Submitted By
       Shikha Jayaswal




        19th April, 2012
Table of Contents

Objective
WEKA
   Running WEKA
Loading Datasets
Linear Regression
   Model
   Interpreting the Output
Clustering
   Model
   Interpreting the Output

List of Figures:

Figure 1: Weka GUI Chooser
Figure 2: Weka Explorer
Figure 3: Load Dataset
Figure 4: Linear Regression

Objective

Demonstrate the use of WEKA for the following data mining tasks:

    •   Linear regression
    •   Clustering



WEKA

Weka is a data mining tool developed at the University of Waikato. It is released under the GNU
General Public License and is freely available. It is implemented in the Java programming language
and provides a GUI for loading data, running analyses and producing visualizations.

The software can be downloaded from: http://www.cs.waikato.ac.nz/~ml/weka/
The version used in this analysis is 3.6.6.


Running WEKA


Starting Weka opens the Weka GUI Chooser:




Figure 1: Weka GUI Chooser




The Explorer button opens the Weka Explorer window, through which data can be loaded and then
analysed.
Figure 2: Weka Explorer




Loading Datasets:

The supported file types are:

    •   ARFF data files
    •   C4.5 data files
    •   CSV data files
    •   LibSVM data files
    •   SVM Light data files
    •   Binary serialized data files
    •   XRFF data files
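
For orientation, an ARFF file is plain text: a @relation line, one @attribute line per column, then @data followed by comma-separated rows. The sketch below writes such text from in-memory rows. The relation name "aircon" and the two data rows are hypothetical placeholders, not the study's data. The attribute names are taken from the model output later in this report, and treating all of them as numeric is an assumption.

```python
def to_arff(relation, attributes, rows):
    """Return ARFF text in which every attribute is declared numeric.

    Attribute names are single-quoted because some contain spaces.
    """
    lines = [f"@relation {relation}", ""]
    for name in attributes:
        lines.append(f"@attribute '{name}' numeric")
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Attribute names as they appear in the regression and clustering output.
ATTRIBUTES = ["Price/Unit", "BTU/Hr", "Weight lbs", "EER",
              "Unit volume", "Region", "Type"]

# Hypothetical rows for illustration only.
ROWS = [
    (34.1, 900, 10.5, 3.4, 180000, 3, 1),
    (31.0, 350, 3.9, 4.5, 92000, 5, 2),
]

arff_text = to_arff("aircon", ATTRIBUTES, ROWS)
print(arff_text)
```

Saving the printed text with an .arff extension should give a file that the Explorer can open.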


The data file being used for the study is:
Click “Open file...”, then select the file to be loaded and open it.




Figure 3: Load Dataset
Linear Regression
Model
Steps for creating the regression model:

   1. Click the Classify tab.
   2. Click the Choose button; in the window that opens, expand classifiers, then functions,
      and select LinearRegression.
   3. Click the LinearRegression text area to open the GenericObjectEditor pop-up. In the
      attributeSelectionMethod dropdown, select No Attribute Selection, then click OK.
   4. Select Use Training Set to evaluate the model on the loaded dataset.
   5. In the dropdown, select Price/Unit as the dependent variable and click the Start button.
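
The same model can probably also be fitted without the GUI, through Weka's command-line interface. The sketch below only assembles the command as an argument list and does not run it. The paths weka.jar and aircon.arff are hypothetical. The option -c first assumes Price/Unit is the first attribute in the file, and -S 1 is assumed to be the command-line counterpart of No Attribute Selection in step 3. With only a training file given, the command line may report cross-validation results in addition to the training-set evaluation.

```python
def linear_regression_command(weka_jar, arff_path, class_index="first"):
    """Build (but do not run) a Weka LinearRegression command line.

    Assumptions: -t names the training file, -c the class attribute,
    and -S 1 disables attribute selection.
    """
    return [
        "java", "-cp", weka_jar,
        "weka.classifiers.functions.LinearRegression",
        "-t", arff_path,
        "-c", str(class_index),
        "-S", "1",
    ]


# Hypothetical file names for illustration.
command = linear_regression_command("weka.jar", "aircon.arff")
print(" ".join(command))
```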




   Figure 4: Linear Regression




Interpreting the Output


Price/Unit = -0.0012 * BTU/Hr + 0.5806 * Weight lbs + 3.7411 * EER + 0 * Unit volume
             -1.2524 * Region -2.1025 * Type + 24.8058
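
To make the fitted equation concrete, the sketch below evaluates it for one unit. The coefficients and intercept are copied from the output above. The example unit is hypothetical, and Region and Type are assumed to be numeric codes, as the equation implies. The coefficient shown as 0 for Unit volume is presumably a small value rounded to four decimals, so it contributes nothing here.

```python
# Coefficients copied from the regression output above.
COEFFICIENTS = {
    "BTU/Hr": -0.0012,
    "Weight lbs": 0.5806,
    "EER": 3.7411,
    "Unit volume": 0.0,
    "Region": -1.2524,
    "Type": -2.1025,
}
INTERCEPT = 24.8058


def predict_price(unit):
    """Return the predicted Price/Unit for a dict of attribute values."""
    return INTERCEPT + sum(
        coefficient * unit[name] for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical unit for illustration only.
example_unit = {
    "BTU/Hr": 500,
    "Weight lbs": 5.6,
    "EER": 4.0,
    "Unit volume": 100000,
    "Region": 3,
    "Type": 2,
}
print(round(predict_price(example_unit), 4))
```

Read this way, each coefficient is the change in predicted Price/Unit for a one-unit increase in that attribute with the others held fixed; for example, one extra point of EER adds about 3.74 to the predicted price.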
Clustering
Model
Steps for creating the clustering model:

    1. Click the Cluster tab.
    2. Click the Choose button; in the window that opens, expand clusterers and select EM.
    3. Click the EM text area to open the GenericObjectEditor pop-up, fill in the clustering
       options listed below, then click OK.
            a. -V Verbose output.
            b. -N The number of clusters to generate. If omitted, EM uses cross-validation to
                select the number of clusters automatically.
            c. -I Terminate after this many iterations if EM has not converged.
            d. -S The random number seed.
            e. -M The minimum allowable standard deviation for the normal density calculation.
    4. Select Use Training Set to cluster the loaded dataset and click the Start button.
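
The options listed in step 3 map directly onto command-line flags, so the run can be sketched outside the GUI as well. The function below only assembles the argument list and does not run it. The file names are hypothetical, the default values shown for -I, -S and -M are assumptions rather than settings reported in this study, and -t is assumed to name the training file.

```python
def em_command(weka_jar, arff_path, clusters=None, max_iterations=100,
               seed=100, min_std_dev=1e-6, verbose=False):
    """Build (but do not run) a command line for Weka's EM clusterer.

    When `clusters` is None, -N is omitted so that EM chooses the
    number of clusters by cross-validation.
    """
    command = [
        "java", "-cp", weka_jar, "weka.clusterers.EM",
        "-t", arff_path,
        "-I", str(max_iterations),
        "-S", str(seed),
        "-M", str(min_std_dev),
    ]
    if clusters is not None:
        command += ["-N", str(clusters)]
    if verbose:
        command.append("-V")
    return command


# Hypothetical file names; five clusters requested explicitly.
print(" ".join(em_command("weka.jar", "aircon.arff", clusters=5)))
```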
Interpreting the Output


The clustered instances:

   Cluster      Instances
      0           7 (16%)
      1          14 (31%)
      2          10 (22%)
      3           3 (7%)
      4          11 (24%)
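
Each percentage in the table is the cluster's share of all clustered instances, rounded to a whole number. The short sketch below recomputes the shares from the counts in the table; the same arithmetic gives the share for cluster 3.

```python
# Instance counts per cluster, copied from the table above.
counts = {0: 7, 1: 14, 2: 10, 3: 3, 4: 11}

total = sum(counts.values())

# Share of each cluster as a whole-number percentage.
shares = {cluster: round(100 * n / total) for cluster, n in counts.items()}

for cluster, n in counts.items():
    print(f"cluster {cluster}: {n} of {total} instances ({shares[cluster]}%)")
```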


The attributes of the clusters are as follows (the Prior row gives each cluster's prior probability):

 Attribute                  Cluster 0   Cluster 1   Cluster 2   Cluster 3   Cluster 4
 Prior                           0.16         0.3         0.2        0.07        0.27
 Price/Unit    mean           34.1022     32.5883     39.1963     38.0867     30.9768
               std. dev.       4.1176      1.2413      2.2264      1.0193      2.8369
 BTU/Hr        mean          912.8122    499.9553    496.4343    856.6667    347.0964
               std. dev.     105.4301    159.6201    178.5667     57.9272    140.3392
 Weight lbs.   mean           10.4966      5.6066      5.6444      9.5967      3.9301
               std. dev.       1.3785       1.848      2.0181      0.7312       1.559
 EER           mean            3.3643      3.9673      4.9873      4.8533      4.4754
               std. dev.       0.2773      0.3885      0.3347      0.1586      0.3313
 Unit Volume   mean          180985.9    129223.9    71417.94       74000    92473.04
               std. dev.     239037.4    135545.2    45108.85     44639.3    85150.53
 Region        mean                 3      3.1226           4           5      4.8882
               std. dev.       0.8848      0.4794           0      0.8848       0.365
 Type          mean            1.1427           2           2      1.3333           2
               std. dev.       0.3497      0.3866      0.3866      0.4714      0.3866
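
One way to read these parameters is to ask which cluster a given unit most likely belongs to. The sketch below scores a unit against each cluster as log prior plus a sum of log normal densities, treating attributes as independent. It is an illustration under stated assumptions, not Weka's implementation: only three attributes are used (Price/Unit, BTU/Hr, EER), which sidesteps the zero standard deviation reported for Region, and the example unit is hypothetical.

```python
import math

# Prior probabilities and (mean, std. dev.) pairs copied from the table,
# listed in cluster order 0..4.
PRIORS = [0.16, 0.30, 0.20, 0.07, 0.27]
PARAMETERS = {
    "Price/Unit": [(34.1022, 4.1176), (32.5883, 1.2413), (39.1963, 2.2264),
                   (38.0867, 1.0193), (30.9768, 2.8369)],
    "BTU/Hr": [(912.8122, 105.4301), (499.9553, 159.6201), (496.4343, 178.5667),
               (856.6667, 57.9272), (347.0964, 140.3392)],
    "EER": [(3.3643, 0.2773), (3.9673, 0.3885), (4.9873, 0.3347),
            (4.8533, 0.1586), (4.4754, 0.3313)],
}


def log_normal_density(x, mean, std_dev):
    """Log of the normal probability density at x."""
    z = (x - mean) / std_dev
    return -math.log(std_dev) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def cluster_log_scores(unit):
    """Return log(prior) + sum of attribute log densities for each cluster."""
    scores = []
    for cluster, prior in enumerate(PRIORS):
        score = math.log(prior)
        for name, per_cluster in PARAMETERS.items():
            mean, std_dev = per_cluster[cluster]
            score += log_normal_density(unit[name], mean, std_dev)
        scores.append(score)
    return scores


def most_likely_cluster(unit):
    """Index of the cluster with the highest score."""
    scores = cluster_log_scores(unit)
    return scores.index(max(scores))


# Hypothetical unit: expensive, high capacity, low efficiency.
unit = {"Price/Unit": 34.1022, "BTU/Hr": 912.8122, "EER": 3.3643}
print(most_likely_cluster(unit))
```

Working in log space avoids underflow when several small densities are multiplied together.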
