XL-MINER: Data Utilities

Published in: Technology

  1. Introduction to XLMiner™ Data Utilities. XLMiner and Microsoft Office are registered trademarks of their respective owners.
  2. Brief description of the features of XLMiner: Data Utilities. XLMiner puts a host of data utilities at the user's disposal. The first is Sample from Worksheet/Database, which offers simple random sampling (the list continues on the next slide).
  3. Stratified sampling (the second Sample from Worksheet/Database method); Missing Data Handling; Bin Continuous Data; Transform Categorical Data. http://dataminingtools.net
  4. Sample data from Worksheet. When huge amounts of data are involved, statisticians prefer to work with a sample that represents the entire database. However, such a representative sample is difficult to obtain. The entire dataset we want information about is called the population; a sample is the part of the population that we actually examine to draw conclusions. A good sample should be a true representation of the data: as far as possible, the cases chosen for the sample should resemble the cases that are not chosen. A poor sample design can produce misleading conclusions, so various methods and techniques have been developed to ensure a representative sample. XLMiner provides sampling facilities.
  5. Sample data from Worksheet. In XLMiner, sampling can be done in two ways. Simple random sampling: a random sample of x records is chosen from the data such that every record in the data has an equal chance of being chosen. Stratified sampling: the data is divided into strata of similar items; each stratum is then sampled using the simple random approach, and the results are combined to give the final sample.
  6. Sample data from Worksheet - Simple Random Sampling. Select the variables to be present in the sample. Here “Simple random sampling” is selected. We can specify the seed value (the value used to initialize the random selection), or the wizard will supply one by default. Set the size of the sampled set. If sampling with replacement is selected, duplicate copies of records may appear in the sample.
  7. Sample data from Worksheet - Simple Random Sampling output.
  8. Sample data from Worksheet - Simple Random Sampling output with replacement. Duplicate copies of records exist in the sample.
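
The simple random sampling options on slides 6 to 8 (seed, sample size, with or without replacement) can be illustrated outside XLMiner. The following is a minimal Python sketch, not XLMiner's implementation; the function name `simple_random_sample` and its parameters are hypothetical, and the "worksheet" is just a list of row numbers.

```python
import random

def simple_random_sample(records, size, seed=None, with_replacement=False):
    """Draw `size` records so that every record is equally likely to be picked.

    Hypothetical helper for illustration; it does not mirror XLMiner's API.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    if with_replacement:
        # Each draw is independent, so the same record may appear more than once.
        return [rng.choice(records) for _ in range(size)]
    # Without replacement: each record appears at most once.
    return rng.sample(records, size)

records = list(range(1, 101))  # stand-in for 100 worksheet rows
sample = simple_random_sample(records, 10, seed=12345)
sample_wr = simple_random_sample(records, 200, seed=12345, with_replacement=True)
```

Reusing the same seed reproduces the same sample. With replacement, the sample may even be larger than the data, in which case duplicates are unavoidable.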
  9. Sample data from Worksheet - Stratified Sample (proportionate).
  10. Sample data from Worksheet - Stratified Sample (proportionate, output). As selected, the percentage of records in each stratum of the sample set is the same as in the input set.
  11. Sample data from Worksheet - Stratified Sample (specify number).
  12. Sample data from Worksheet - Stratified Sample (specify number). All strata have equal sizes, as specified by the user (here, 10 records each).
  13. Sample data from Worksheet - Stratified Sample (size of smallest stratum).
  14. Sample data from Worksheet - Stratified Sample (size of smallest stratum, output). All strata have a size equal to that of the smallest stratum.
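
The three stratified options shown on slides 9 to 14 (proportionate, a specified number per stratum, and the size of the smallest stratum) can be sketched in Python as follows. This is an illustration under assumptions, not XLMiner's code: the function `stratified_sample`, its `mode` names, and the `fraction` parameter are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, mode="proportionate", fraction=0.5, n=None, seed=None):
    """Split records into strata by `key`, then draw a simple random sample from each.

    Hypothetical helper; the mode names are illustrative, not XLMiner option names.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    smallest = min(len(rows) for rows in strata.values())
    sample = []
    for label in sorted(strata):
        rows = strata[label]
        if mode == "proportionate":
            k = round(len(rows) * fraction)  # same share of every stratum
        elif mode == "fixed":
            k = min(n, len(rows))            # user-specified count per stratum
        elif mode == "smallest":
            k = smallest                     # every stratum cut to the smallest one
        else:
            raise ValueError(f"unknown mode: {mode}")
        sample.extend(rng.sample(rows, k))   # simple random sample within the stratum
    return sample

# Made-up data: 60 records in stratum "A", 30 in "B", 10 in "C".
data = [(i, "A") for i in range(60)] + [(i, "B") for i in range(60, 90)] + [(i, "C") for i in range(90, 100)]
by_label = lambda rec: rec[1]
proportionate = stratified_sample(data, by_label, mode="proportionate", fraction=0.5, seed=1)
fixed = stratified_sample(data, by_label, mode="fixed", n=10, seed=1)
smallest = stratified_sample(data, by_label, mode="smallest", seed=1)
```

With this made-up data, the proportionate sample keeps the 60/30/10 ratio (30, 15 and 5 records), while the other two modes return 10 records from each stratum.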
  15. Missing Data Handling. This utility lets the user preprocess the data before any mining method is applied to it: it detects missing values in the data and handles them the way the user wants. XLMiner considers a cell to be missing data if it is empty or contains an invalid formula. It can also be told to treat a cell as missing data if it contains a certain value specified by the user. The user can specify how XLMiner should correct these missing values, and a treatment can be assigned to every variable. Records with missing data can either be deleted entirely, or the missing values can be replaced. XLMiner provides options for the replacement, e.g. the mean, the median, the mode, or a value specified by the user. The available options depend on the type of variable.
  16. Missing Data Handling.
  17. Missing Data Handling. Data set. Select the action to handle the missing data in individual columns and click “Apply this option to selected variable”.
  18. Missing Data Handling - Output. Changed records are highlighted.
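
The treatments described on slide 15 (delete the record, or replace the missing value with the mean, median, mode, or a user-specified value) can be sketched in plain Python. This is a rough illustration, not XLMiner's behavior: the helper names are hypothetical, and `None` is assumed to stand in for an empty cell.

```python
import statistics

def is_missing(value, missing_code=None):
    """A cell is missing if it is empty (None here) or equals a user-chosen code."""
    return value is None or (missing_code is not None and value == missing_code)

def fill_missing(column, method, value=None, missing_code=None):
    """Replace missing cells in one column; `method` is an illustrative name."""
    present = [v for v in column if not is_missing(v, missing_code)]
    if method == "mean":
        fill = statistics.mean(present)      # numeric columns only
    elif method == "median":
        fill = statistics.median(present)    # numeric columns only
    elif method == "mode":
        fill = statistics.mode(present)      # also works for text columns
    elif method == "value":
        fill = value                         # user-specified replacement
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if is_missing(v, missing_code) else v for v in column]

def delete_records(rows, missing_code=None):
    """Drop every record that has at least one missing cell."""
    return [row for row in rows if not any(is_missing(v, missing_code) for v in row)]

filled_mean = fill_missing([10, None, 20, 60], "mean")
filled_mode = fill_missing(["red", None, "red", "blue"], "mode")
kept = delete_records([(1, "a"), (2, None), (None, "c")])
```

Mean and median apply only to numeric columns, whereas the mode also works for text, which is one way to read the slide's remark that the available options depend on the type of variable.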
  19. Transform Categorical Data. Sometimes a data set contains variables that take non-numeric values, which makes it difficult to apply standard procedures. XLMiner therefore provides a tool to recode (transform) non-numeric data as numeric data. There are two ways to transform categorical data. Creating dummies: suppose the variable has 4 distinct values, A, B, C and D. Then 3 new columns, VAL1, VAL2 and VAL3, are created with values of either 1 or 0. If a row contains the value A, VAL1 is 1 and the rest are 0 (and similarly for B and C); if all three are 0, the row has the value D. Creating category scores: if the non-numeric variable holds 4 distinct values as above, the values (ordered alphabetically) are numbered from 1 to 4, and a new column is created that contains the number corresponding to each row's value.
  20. Transform Categorical Data - Dummies. Select the variable that contains non-numeric data and needs to be transformed.
  21. Transform Categorical Data - Category Scores.
  22. Transform Categorical Data - Category Scores (output).
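
The two transformations from slide 19 can be reproduced in a few lines of Python. This is a minimal sketch of the idea, not XLMiner's implementation; the function names are hypothetical, and the last category in alphabetical order is assumed to be the all-zeros case, as in the slide's A, B, C, D example.

```python
def make_dummies(values):
    """One 0/1 column per category except the last, which is encoded as all zeros."""
    categories = sorted(set(values))
    kept = categories[:-1]  # e.g. A, B, C become VAL1, VAL2, VAL3; D is the all-zeros case
    return [[1 if v == c else 0 for c in kept] for v in values]

def category_scores(values):
    """Number the categories 1..k in alphabetical order and map each value to its number."""
    categories = sorted(set(values))
    score = {c: i for i, c in enumerate(categories, start=1)}
    return [score[v] for v in values]

values = ["A", "C", "D", "B", "A"]
dummies = make_dummies(values)
# dummies == [[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0], [1, 0, 0]]
scores = category_scores(values)
# scores == [1, 3, 4, 2, 1]
```

Dummies avoid implying an order among the categories, while category scores are more compact but impose the alphabetical order as a numeric scale.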
  23. Thank you. For more, visit: http://dataminingtools.net
  24. Visit more self-help tutorials. Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free and self-guiding, and does not include any additional support. Visit us at www.dataminingtools.net