Guide for reproducing results of Bioassay paper using Weka
Important points to remember before starting a run:
   All datasets must be in ARFF format; otherwise Weka will report an incompatible
    format during training and testing.
   Standard classifiers are used for the confirmatory screen data, which are smaller and
    less imbalanced, whereas cost-sensitive classifiers are used with the primary and mixed
    datasets, which are more imbalanced.
   We have two goals:
       1. To find the most robust and versatile classifier for imbalanced bioassay data.
       2. To find the optimal misclassification cost setting for a classifier.
   The misclassification cost for False Negatives has to be set so as to achieve the
    maximum number of True Positives with a False Positive rate below 20%.
   Each dataset is randomly split into an 80% training and validation set and a 20%
    independent test set, so there should be two files per dataset: one for training the
    classifier and one for testing the model built by that classifier.
   Use 5-fold cross-validation for the larger datasets (primary and mixed screens) and
    10-fold cross-validation for the smaller datasets (confirmatory screens).
   CostSensitiveClassifier is used with the base classifiers Naïve Bayes, SMO (Sequential
    Minimal Optimization) and Random Forest, as it outperforms other meta-learners.
   MetaCost with J48 produces better results than other meta-learners.
   For Naïve Bayes and Random Forest, the default options are used.
   For SMO, the option BuildLogisticModels was set to true.
   For J48, the option Unpruned was set to true.
   For more details, please refer to the paper.
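The selection rule in the points above (as many True Positives as possible while the False Positive rate stays below 20%) can be sketched in a few lines of Python. This is only an illustration, not part of Weka or of the paper: the function names, the run names and all counts are made up for the example.

```python
FP_RATE_LIMIT = 0.20  # ceiling on the False Positive rate stated in the guide


def rates(tp, fn, fp, tn):
    """Return (True Positive rate, False Positive rate) from confusion-matrix counts."""
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return tp_rate, fp_rate


def pick_best(runs, limit=FP_RATE_LIMIT):
    """Among runs with an FP rate below the limit, return the one with the most TPs."""
    admissible = [
        r for r in runs
        if rates(r["tp"], r["fn"], r["fp"], r["tn"])[1] < limit
    ]
    return max(admissible, key=lambda r: r["tp"]) if admissible else None


# Hypothetical confusion-matrix counts for three runs (not real AID1608 results).
runs = [
    {"name": "run A", "tp": 10, "fn": 40, "fp": 50, "tn": 950},
    {"name": "run B", "tp": 30, "fn": 20, "fp": 150, "tn": 850},
    {"name": "run C", "tp": 45, "fn": 5, "fp": 300, "tn": 700},
]

best = pick_best(runs)
print(best["name"], rates(best["tp"], best["fn"], best["fp"], best["tn"]))
```

Run C finds the most actives but exceeds the 20% limit, so the sketch prefers run B, the admissible run with the most True Positives.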
Step-by-step guide to setting up a Weka run:
1. Start the Weka Explorer.
2. In the Preprocess tab, click Open file…
3. Select a training file in ARFF format and click Open.
4. For example, AID1608red_train.arff.
5. After opening, the file should look like this:
   [Figure: Preprocess tab after loading the file]
6. Now click on the Classify tab in the menu bar.
7. We will first train a model using the Naïve Bayes classifier. Because AID1608 is a
   confirmatory screen, we apply standard classifiers first; if the False Positive rate is
   below 20%, we then use cost-sensitive classifiers.
8. Click the Choose button to select a classifier. From the Bayes folder, choose Naïve Bayes.
9. Your window should appear as shown, with Cross-validation selected and 10 folds:
   [Figure: Classify tab with Cross-validation selected and 10 folds]
10. Now click the Start button; the model will start building.
11. Because we are using 10-fold cross-validation, Weka will build models for the 10 folds.
    [Figure callouts: "Check status here"; "Run completed"]
12. Look at the output section and scroll to the bottom, as shown:
    [Figure: bottom of the classifier output]
13. This is the model generated by the Naïve Bayes classifier using the training set
    AID1608red_train.
14. The next step is to test this model on the independent test set AID1608red_test.
15. In the Test options section, select Supplied test set and click Set.
16. Open the test file AID1608red_test.
17. After the file has been read, close the Test instances dialog by clicking Close.
18. Now right-click on your model in the Result list and choose Re-evaluate model on
    current test set.
    [Figure callout: "Click here"]
19. Within a fraction of a second, the results are produced in the same output window.
    [Figure: confusion matrix in the output, with the True positive, False negative,
    False positive and True negative cells labelled]
20. We have obtained a False Positive rate of 14.5%, which is below 20%, and a True
    Positive rate of 15.4%, which is very low. We will now set up a cost-sensitive
    classifier to improve the results.
21. As noted in the important points at the start of this tutorial, for Naïve Bayes we
    will use Weka's CostSensitiveClassifier.
22. The author used incremental costing, in which the cost was increased in stages from 2
    to 1000000 until a 20% False Positive rate was reached.
23. So, we will set up a cost matrix, starting with a misclassification cost of 2.
24. Click the Choose button and select CostSensitiveClassifier from the meta folder.
25. Click on the text box to open the GenericObjectEditor dialog box, as shown:
    [Figure callout: "Click here and this dialog box will open up"]
26. In this dialog box, use Choose next to classifier to select Naïve Bayes.
27. Next, click on costMatrix to set up the misclassification cost.
28. We have 2 classes in our dataset, i.e. actives and inactives, so we will set up a 2X2
    matrix (for TP, FP, TN, FN).
       In classes, enter 2.
       Click Resize to create a 2X2 matrix.
       Change the misclassification cost for false negatives to 2 (write 2 in place of 1).
       Then close the dialog box.
29. Leave all other options at their defaults and close the GenericObjectEditor dialog by
    clicking OK.
30. Click Start to begin building the cost-sensitive model.
31. Repeat steps 13-19 as described above for testing.
32. See the improved results: True Positives have increased while False Positives remain
    within the 20% limit.
33. We stop here, as we have achieved our goal.
34. Similarly, you can build models using SMO, Random Forest and J48. Check their
    settings, as listed in the important points at the start of this tutorial, before
    starting the run.
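The incremental costing used in the steps above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the Weka implementation and not the author's code. It assumes a model that outputs a probability of "active" for each compound and applies a minimum-expected-cost decision, which is one of the two modes Weka's CostSensitiveClassifier offers (the other reweights the training instances). The probabilities, labels, cost schedule and function names are all hypothetical.

```python
def predict_active(p_active, fn_cost, fp_cost=1.0):
    """Minimum-expected-cost decision for two classes.

    Predicting "inactive" risks a false negative: expected cost fn_cost * p_active.
    Predicting "active" risks a false positive: expected cost fp_cost * (1 - p_active).
    """
    return fn_cost * p_active > fp_cost * (1.0 - p_active)


def confusion(scored, fn_cost):
    """Count (tp, fn, fp, tn) over (p_active, is_active) pairs at a given FN cost."""
    tp = fn = fp = tn = 0
    for p_active, is_active in scored:
        predicted = predict_active(p_active, fn_cost)
        if is_active and predicted:
            tp += 1
        elif is_active:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def incremental_cost_search(scored, costs, fp_limit=0.20):
    """Raise the FN cost in stages; keep the last setting whose FP rate is below the limit."""
    best = None
    for cost in costs:
        tp, fn, fp, tn = confusion(scored, cost)
        fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
        if fp_rate >= fp_limit:
            break  # the limit has been reached, so stop increasing the cost
        best = (cost, tp, fp_rate)
    return best


# Hypothetical model outputs: (probability of active, true label is active).
scored = (
    [(0.9, True), (0.4, True), (0.2, True), (0.05, True)]
    + [(p, False) for p in (0.6, 0.3, 0.15, 0.1, 0.05, 0.05, 0.02, 0.02, 0.01, 0.01)]
)

# Hypothetical stages; the guide only says the cost goes from 2 up to 1000000.
best = incremental_cost_search(scored, costs=[2, 5, 10, 100])
print(best)
```

A higher False Negative cost lowers the probability at which a compound is called active, so both True Positives and False Positives rise; the search keeps the last stage that still respects the 20% limit.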
