Chronic Kidney Disease Prediction

Rajandeep Kaur
Ph.D. Scholar
18803003
Chronic Kidney Disease Prediction with
Attribute Reduction using Data Mining
Classifiers

Content
• Introduction
• What is Chronic Kidney Disease (CKD)
• Data Mining & Classification
• Role of Attribute Selection
• Literature Review
• Dataset Used
• Performance Parameters
• Results & Discussion
• Conclusion
• References

Introduction
• Past records show that deaths in India due to chronic kidney disease (CKD) numbered 5.21 million in 2008, and this figure may rise to 7.63 million by 2020 [4].
• CKD needs to be detected at an early stage, before it worsens.
• To reduce the mortality rate, an efficient technique is required to predict and classify it.

Need of Study
General problems:
• The complete dataset requires a large amount of storage space
• Computation time is long
• Accuracy is poor
Aim of study:
To predict chronic kidney disease more accurately and faster with a reduced set of attributes.

What is Chronic Kidney Disease (CKD)
Structural or functional abnormalities of the kidneys for >3 months, as manifested by either:
1. Kidney damage, with or without decreased GFR, as defined by
   • pathologic abnormalities
   • markers of kidney damage, including abnormalities in the composition of the blood or urine or abnormalities in imaging tests
2. GFR <60 ml/min/1.73 m², with or without kidney damage
where GFR is the glomerular filtration rate.
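The definition above can be read as a simple decision rule. The sketch below is illustrative only: the function name `meets_ckd_definition` and its parameters are hypothetical, and it is not a clinical tool.

```python
def meets_ckd_definition(gfr, has_kidney_damage, duration_months):
    """Return True if the findings match the CKD definition quoted above.

    gfr: glomerular filtration rate in ml/min/1.73 m^2
    has_kidney_damage: True if pathologic abnormalities or markers of
        kidney damage are present (criterion 1)
    duration_months: how long the abnormality has persisted
    """
    persistent = duration_months > 3   # "for >3 months"
    low_gfr = gfr < 60                 # criterion 2
    return persistent and (has_kidney_damage or low_gfr)


# Hypothetical example values, not patient data.
print(meets_ckd_definition(gfr=85, has_kidney_damage=True, duration_months=6))
print(meets_ckd_definition(gfr=55, has_kidney_damage=False, duration_months=4))
print(meets_ckd_definition(gfr=55, has_kidney_damage=False, duration_months=2))
```

Either criterion is sufficient on its own, but both require that the abnormality persist for more than three months.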

Stages in Progression of Chronic Kidney Disease and Therapeutic Strategies
[Figure: progression from Normal to Increased risk, Damage, decreased GFR, Kidney failure and CKD death, with Complications along the way. Strategies by stage: screening for CKD risk factors; CKD risk reduction and screening for CKD; diagnosis and treatment, treating comorbid conditions and slowing progression; estimating progression, treating complications and preparing for replacement; replacement by dialysis and transplant.]

Data Mining & Classification
• Data mining refers to extracting meaningful information from hidden patterns in a dataset [2].
• Data mining techniques are very useful in health informatics [16, 17].
• Data mining classification techniques play a vital role in classifying various diseases from symptoms and medical test results.

Attribute Selection
• Before inducing a model, we almost always do input engineering
• The most useful part of this is attribute selection (also called feature selection)
  • Select relevant attributes
  • Remove redundant and/or irrelevant attributes
• Select the most “relevant” subset of attributes according to some selection criteria.
Why?

Reasons for Attribute Selection
• Simpler model
  • More transparent
  • Easier to interpret
• Faster model induction
• Structural knowledge
  • Knowing which attributes are important may be inherently important to the application
• Reduced storage requirement
What about the accuracy?

Attribute Selection Contd…
• Attribute selection can be done by either of the following two methods:
  • Filter
  • Wrapper

Filter Method
• Results in either
  • A ranked list of attributes
    • Typical when each attribute is evaluated individually
    • Must select how many to keep
  • A selected subset of attributes, found by a search such as
    • Forward selection
    • Best first
    • Random search such as a genetic algorithm
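The ranked-list variant of the filter method can be sketched as follows. This is a minimal illustration, not the evaluator used in the study: it assumes absolute Pearson correlation with the class label as the scoring criterion, and the attribute names and values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # A constant attribute carries no information; score it as 0.
    return 0.0 if spread_x == 0 or spread_y == 0 else cov / (spread_x * spread_y)


def rank_attributes(columns, labels):
    """Score each attribute individually and return names, best first."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): "a" tracks the label, "b" is noise, "c" is constant.
columns = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [5, 1, 4, 2, 6, 3],
    "c": [7, 7, 7, 7, 7, 7],
}
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_attributes(columns, labels)
top_k = ranking[:2]   # the "how many to keep" choice is made by the user
print(ranking, top_k)
```

Because each attribute is scored on its own, the learning algorithm is never consulted, which is what keeps the filter approach cheap.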

Wrapper Method
• “Wraps around” the learning algorithm
• Always evaluates subsets
• Returns the best subset of attributes
• Uses the same search methods as before
• The wrapper approach is generally more accurate but also more computationally expensive
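A wrapper can be sketched as a search loop that scores each candidate subset by running the learner itself. The example below is an assumption-laden illustration, not the WEKA implementation: it uses greedy forward selection around a 1-nearest-neighbour classifier scored by leave-one-out accuracy, on made-up data.

```python
def loo_accuracy_1nn(rows, labels, subset):
    """Leave-one-out accuracy of 1-nearest-neighbour using only `subset` attributes."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never let an instance vote for itself
            dist = sum((row[a] - other[a]) ** 2 for a in subset)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_selection(rows, labels):
    """Add one attribute at a time while the wrapped learner's accuracy improves."""
    n_attrs = len(rows[0])
    selected, best_score = [], 0.0
    while len(selected) < n_attrs:
        scored = [
            (loo_accuracy_1nn(rows, labels, selected + [a]), a)
            for a in range(n_attrs)
            if a not in selected
        ]
        score, attr = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no candidate improves the subset, so stop
        selected.append(attr)
        best_score = score
    return selected, best_score


# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 is noise.
rows = [[1.0, 50], [1.2, 10], [0.9, 90], [3.0, 20], [3.1, 80], [2.9, 55]]
labels = [0, 0, 0, 1, 1, 1]

selected, score = forward_selection(rows, labels)
print(selected, score)
```

Every candidate subset costs a full evaluation of the classifier, which is why the wrapper approach is the more expensive of the two.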

Literature Review
Researcher            Year  Classifier              Accuracy           Remarks
K.R. Lakshmi [6]      2014  ANN                     93.8521%           Performed better than decision tree and logistic regression classifiers
Naganna Chetty [7]    2015  Naïve Bayes, SMO, IBK   99%, 98.25%, 100%  Attribute reduction using wrapper method
S. Vijayarani [8]     2015  SVM                     76.32%             584 instances and six attributes
L. Jerlin Rubini [9]  2015  Multilayer Perceptron   99.75%             Performed better than radial basis function network and logistic regression
Uma N. Dulhare [10]   2016  Naïve Bayes             97.5%              Attribute reduction using OneR
Huseyin Polat [11]    2017  SVM                     98.5%              Attribute reduction
Wala A. [12]          2017  Decision tree           99%                Missing values are replaced with the mean

Dataset Used
chronic_kidney_disease, from the UCI Machine Learning Repository
The dataset contains:
• 400 instances
• 25 attributes
  • 14 are nominal
  • 11 are numeric

PERFORMANCE ANALYSIS PARAMETERS
• Accuracy
• Precision
• Recall
• RMSE (Root Mean Square Error)
• MAE (Mean Absolute Error)
• Execution Time
• Kappa Statistic
• ROC (Receiver Operating Characteristic)
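Several of these parameters follow directly from a confusion matrix. The sketch below assumes a binary problem with one positive class; the function names and all counts are hypothetical and are not results from this study. WEKA may report class-weighted averages, which this sketch does not attempt to reproduce.

```python
from math import sqrt


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and Cohen's kappa from binary counts.

    Assumes every denominator is non-zero (both classes occur and are predicted).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "kappa": kappa}


def error_metrics(actual, predicted):
    """MAE and RMSE between actual class values (0/1) and predicted probabilities."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Made-up counts and predictions, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
mae, rmse = error_metrics([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
print(metrics, mae, rmse)
```

Kappa corrects accuracy for chance agreement, so a classifier that simply predicts the majority class scores a kappa near zero even when its accuracy looks high.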

RESULT AND DISCUSSION
• Tool
  • WEKA 3.8 (Waikato Environment for Knowledge Analysis)
• Classifiers
  • J48, Decision Table and IBK
• Attribute Selection
  • CfsSubsetEval, ClassifierSubsetEval and WrapperSubsetEval
• Searching Technique
  • Greedy Stepwise and Best First search approaches

RESULT OF J48, DECISION TABLE AND IBK CLASSIFIERS ON CKD
Algorithm       Accuracy  Precision  Recall  Kappa Statistic  Execution Time  RMSE
J48             99%       0.990      0.990   0.9786           0.13            0.0807
Decision Table  99%       0.990      0.990   0.9786           0.46            0.2507
IBK             95.75%    0.962      0.958   0.9113           0.01            0.2056
General Observations:
• J48 and Decision Table provide 99% accuracy
• J48 provides the lowest RMSE value
• IBK takes the least time to execute

Attribute Reduction
Classifier      Attribute Selection Method               Attributes in Original Dataset  Attributes Kept after Reduction  Attribute Reduction (in %)
J48             CfsSubsetEval + Greedy Stepwise          25                              17                               32
J48             ClassifierSubsetEval + Greedy Stepwise   25                              4                                84
J48             WrapperSubsetEval + Best First           25                              13                               48
Decision Table  … Stepwise                               25                              17                               32
Decision Table  … Stepwise                               25                              4                                84
IBK             … Stepwise                               25                              17                               32
IBK             … Stepwise                               25                              5                                80
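The reduction percentages in this table follow from the original and remaining attribute counts. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def attribute_reduction_percent(original, kept):
    """Percentage of attributes removed when `kept` of `original` remain."""
    return 100 * (original - kept) / original


# Attribute counts taken from the table above (25 attributes originally).
reductions = {kept: attribute_reduction_percent(25, kept) for kept in (17, 13, 5, 4)}
print(reductions)
```

For example, keeping 17 of 25 attributes removes 8, and 8/25 is 32%.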

Accuracy of Reduced Dataset
Classifier      Attribute Selection Method      Attribute Reduction (in %)  Accuracy without Reduction (%)  Accuracy with Reduction (%)
J48             … Stepwise                      32                          99                              99
J48             … Stepwise                      84                          99                              98.25
Decision Table  … Stepwise                      32                          99                              98.75
Decision Table  … Stepwise                      84                          99                              99.25
IBK             … Stepwise                      32                          95.75                           98
IBK             … Stepwise                      80                          95.75                           99.75
IBK             WrapperSubsetEval + Best First  72                          95.75                           100

Comparison of Accuracy for J48, Decision
Table and IBK Classifier with original and
reduced dataset

CONCLUSION
• The accuracy of IBK on the original dataset is 95.75%.
  • With the dataset reduced by 72%, it provides 100% accuracy using the WrapperSubsetEval attribute evaluator with Best First search.
• J48 and Decision Table provide better results than IBK on the original dataset.
  • IBK, however, performs better on the reduced dataset than on the original dataset.
• IBK can be used to predict CKD efficiently and quickly with reduced attributes.

References
[1] L. Jena and N. Ku. Kamila, “Distributed data mining classification algorithms for prediction of chronic-kidney-disease,” International Journal of Emerging Research in Management & Technology, vol. 4, issue 11, pp. 110-118, November 2015.
[2] K. Chandel, V. Kunwar, S. Sabitha, T. Choudhury, and S. Mukherjee, “A comparative study on thyroid disease detection using K-nearest neighbor and Naive Bayes classification techniques,” CSI Transactions on ICT, vol. 4, no. 2-4, pp. 313-319, 2016.
[3] Sudhir B. Jagtap, “Census data mining and data analysis using WEKA,” arXiv preprint arXiv:1310.4647, 2013.
[4] S. Dilli Arasu and R. Thirumalaiselvi, “Review of Chronic Kidney Disease based on Data Mining Techniques,” International Journal of Applied Engineering Research, vol. 12, pp. 13498-13505, 2017.
[5] S. Zeynu and Shruti Patil, “Survey on Prediction of Chronic Kidney Disease Using Data Mining Classification Techniques and Feature Selection,” International Journal of Pure and Applied Mathematics, vol. 118, no. 8, pp. 149-156, 2018.
[6] K. R. Lakshmi, Y. Nagesh, and M. Veera Krishna, “Performance comparison of three data mining techniques for predicting kidney dialysis survivability,” International Journal of Advances in Engineering & Technology, vol. 7, pp. 242-254, 2014.
[7] N. Chetty, Kunwar Singh Vaisla, and Sithu D. Sudarsan, “Role of attributes selection in classification of Chronic Kidney Disease patients,” International Conference on Computing, Communication and Security (ICCCS), IEEE, 2015.

[8] S. Vijayarani and S. Dhayanand, “Data mining classification algorithms for kidney disease prediction,” International Journal on Cybernetics and Informatics (IJCI), 2015.
[9] L. Jerlin Rubini and P. Eswaran, “Generating comparative analysis of early stage prediction of Chronic Kidney Disease,” International Journal of Modern Engineering Research (IJMER), vol. 5, issue 7, pp. 49-55, July 2015.
[10] Uma N. Dulhare and Mohammad Ayesha, “Extraction of action rules for chronic kidney disease using Naïve Bayes classifier,” IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), IEEE, 2016.
[11] H. Polat, Homay Danaei Mehr, and Aydin Cetin, “Diagnosis of chronic kidney disease based on support vector machine by feature selection methods,” Journal of Medical Systems, February 2017.
[12] W. Abedalkhader and Noora Abdulrahman, “Missing Data Classification of Chronic Kidney Disease,” International Journal of Data Mining & Knowledge Management Process (IJDKP), vol. 7, no. 5/6, November 2017.
[13] Abeer Y. Al-Hyari, “Chronic Kidney Disease Prediction System Using Classifying Data Mining Techniques,” Library of University of Jordan, 2012.
[14] Jiliang Tang, Salem Alelyani, and Huan Liu, “Feature selection for classification: A review,” Data Classification: Algorithms and Applications, 2014.

[15] Geoffrey Holmes, Andrew Donkin, and Ian H. Witten, “Weka: A machine learning workbench,” Proceedings of the 1994 Second Australian and New Zealand Conference on Intelligent Information Systems, IEEE, 1994.
[16] Mary K. Obenshain, “Application of data mining techniques to healthcare data,” Infection Control & Hospital Epidemiology, vol. 25, no. 8, pp. 690-695, 2004.
[17] Li-Chen Cheng, Ya-Han Hu, and Shr-Han Chiou, “Applying the Temporal Abstraction Technique to the Prediction of Chronic Kidney Disease Progression,” Journal of Medical Systems, vol. 41, April 2017.
[18] Neeraj Bhargava, Girja Sharma, Ritu Bhargava, and Manish Mathuria, “Decision tree analysis on J48 algorithm for data mining,” Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering, vol. 3, pp. 1114-1119, June 2013.
[19] Hongjun Lu and Hongyan Liu, “Decision tables: Scalable classification exploring RDBMS capabilities,” Proceedings of the 26th International Conference on Very Large Data Bases (VLDB '00), 2000.
