ABSTRACT
Diabetes mellitus is one of the most common diseases worldwide, and its
prevalence keeps increasing because of changing lifestyles, unhealthy eating
habits, and excess weight.
Earlier studies have predicted diabetes mellitus from the physical and
chemical tests that are available for diagnosing the disease.
Data science methods have the potential to benefit other scientific
fields by shedding new light on common questions.
The proposed system detects diabetes through machine learning and deep
learning. For machine learning we use the Support Vector Machine (SVM)
classification algorithm, and for deep learning we use a neural network
(NN).
The experimental results show that diabetes can be predicted with
high accuracy.
OBJECTIVES OF STUDY
The objective of the study is to classify the Pima Indians
Diabetes dataset.
This is to be achieved through machine learning
and deep learning classification algorithms.
Classification is treated as the data mining
problem, and the SVM algorithm is proposed for
the machine learning part.
A neural network is used for the deep learning part.
A further objective is to design an interactive application in
which a user can enter a single input record and obtain a
prediction.
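The single-input step of the interactive application could be sketched as follows. This is a minimal sketch, not the application itself. The field names assume the eight input attributes of the commonly distributed Pima Indians Diabetes data, and `parse_record`, `predict_one`, and `placeholder_model` are hypothetical names introduced for illustration. The placeholder rule is not a clinical criterion; a trained SVM or neural network would be passed in its place.

```python
from typing import Callable, Dict, List

# Assumed field names for the eight input attributes of the Pima data.
FIELDS = [
    "pregnancies", "glucose", "blood_pressure", "skin_thickness",
    "insulin", "bmi", "diabetes_pedigree", "age",
]


def parse_record(raw: Dict[str, str]) -> List[float]:
    """Convert one user-entered record into an ordered feature vector."""
    missing = [name for name in FIELDS if name not in raw]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    values = []
    for name in FIELDS:
        value = float(raw[name])  # raises ValueError on non-numeric input
        if value < 0:
            raise ValueError(name + " must not be negative")
        values.append(value)
    return values


def predict_one(raw: Dict[str, str],
                model: Callable[[List[float]], int]) -> str:
    """Validate a single record and return a readable prediction."""
    label = model(parse_record(raw))
    return "diabetic" if label == 1 else "non-diabetic"


def placeholder_model(features: List[float]) -> int:
    """Hypothetical stand-in for a trained classifier (NOT a clinical rule)."""
    glucose = features[FIELDS.index("glucose")]
    return 1 if glucose >= 140 else 0


# Hypothetical record, as it might arrive from a form or the command line.
record = {
    "pregnancies": "2", "glucose": "150", "blood_pressure": "70",
    "skin_thickness": "30", "insulin": "100", "bmi": "33.5",
    "diabetes_pedigree": "0.4", "age": "45",
}
print(predict_one(record, placeholder_model))
```

Keeping validation separate from the model means the same entry point can serve either classifier without changes to the user-facing code.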
LITERATURE REVIEW
A new data preparation method based on clustering algorithms for diagnosis
systems of heart and diabetes diseases
Yılmaz, Nihat & Inan, Onur & Uzer, Mustafa. (2014). A New Data Preparation
Method Based on Clustering Algorithms for Diagnosis Systems of Heart and
Diabetes Diseases. Journal of medical systems. 38. 48. 10.1007/s10916-014-0048-7.
The most important factors that prevent pattern recognition from functioning rapidly
and effectively are the noisy and inconsistent data in databases. This article presents a
new data preparation method based on clustering algorithms for diagnosis of heart and
diabetes diseases. In this method, a new modified K-means Algorithm is used for
clustering based data preparation system for the elimination of noisy and inconsistent
data and Support Vector Machines is used for classification. This newly developed
approach was tested in the diagnosis of heart diseases and diabetes, which are prevalent
within society and figure among the leading causes of death. The data sets used in the
diagnosis of these diseases are the Statlog (Heart), the SPECT images and the Pima
Indians Diabetes data sets obtained from the UCI database. The proposed system
achieved 97.87 %, 98.18 %, 96.71 % classification success rates from these data sets.
Classification accuracies for these data sets were obtained through using 10-fold cross-
validation method. According to the results, the proposed method of performance is
highly successful compared to other results attained, and seems very promising for
pattern recognition applications.
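The 10-fold cross-validation mentioned in this abstract can be illustrated with a small index-splitting sketch. It assumes nothing about the cited paper's implementation: `k_fold_indices` is a hypothetical helper, the shuffle seed is arbitrary, and 768 is the Pima data set size quoted elsewhere in this review.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0
                   ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        yield train, test


folds = list(k_fold_indices(768, k=10))
for train, test in folds:
    # Every sample is held out exactly once across the ten folds; the
    # reported accuracy is the mean of the ten test-fold accuracies.
    print(len(train), len(test))
```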
Classification Of Diabetes Disease Using Support Vector Machine
Jegan, Chitra. (2013). Classification Of Diabetes Disease Using Support Vector
Machine. International Journal of Engineering Research and Applications. 3. 1797
- 1801.
Diabetes mellitus is one of the most serious health challenges in both developing and
developed countries. According to the International Diabetes Federation, there are 285
million diabetic people worldwide. This total is expected to rise to 380 million within 20
years. Due to its importance, a design of classifier for the detection of Diabetes disease
with optimal cost and better performance is the need of the age. The Pima Indian
diabetic database at the UCI machine learning laboratory has become a standard for
testing data mining algorithms to see their prediction accuracy in diabetes data
classification. The proposed method uses Support Vector Machine (SVM), a machine
learning method as the classifier for diagnosis of diabetes. The machine learning method
focuses on classifying diabetes disease from a high-dimensional medical dataset. The
experimental results obtained show that support vector machine can be successfully used
for diagnosing diabetes disease.
Diagnosis Of Diabetes Using Classification Mining Techniques
Iyer, Aiswarya & Jeyalatha, S & Sumbaly, Ronak. (2015). Diagnosis of
Diabetes Using Classification Mining Techniques. International Journal
of Data Mining & Knowledge Management Process. 5. 1-14.
10.5121/ijdkp.2015.5101.
Diabetes has affected over 246 million people worldwide with a majority of
them being women. According to the WHO report, by 2025 this number is
expected to rise to over 380 million. The disease has been named the fifth
deadliest disease in the United States with no imminent cure in sight. With the
rise of information technology and its continued advent into the medical and
healthcare sector, the cases of diabetes as well as their symptoms are well
documented. This paper aims at finding solutions to diagnose the disease by
analyzing the patterns found in the data through classification analysis by
employing Decision Tree and Naïve Bayes algorithms. The research hopes to
propose a quicker and more efficient technique of diagnosing the disease,
leading to timely treatment of the patients.
A Prediction Technique in Data Mining for Diabetes Mellitus
Mareeswari, V. & Saranya, R & Mahalakshmi, R & Preethi, E. (2017).
Prediction of Diabetes Using Data Mining Techniques. Research Journal
of Pharmacy and Technology. 10. 1098. 10.5958/0974-360X.2017.00199.8.
Diabetes mellitus is one of the world’s major diseases. Millions of people are
affected by the disease. The risk of diabetes is increasing day by day and is
found more often in women than in men. The diagnosis of diabetes is a tedious
process. With improvements in science and technology, it has become easier to
predict the disease. The purpose is to diagnose whether the person is affected
by diabetes or not using the K Nearest Neighbor classification technique. The
diabetes dataset is taken as the training data and the details of the patient are
taken as testing data. The training data are classified by using the KNN
classifier and secondly the target data is predicted. KNN algorithm used here
would be more efficient for both classification and prediction. The results are
analyzed with different values for the parameter k.
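The effect of the parameter k can be illustrated with a small k-nearest-neighbour sketch. It uses plain Euclidean distance and majority voting on made-up two-feature points (not Pima records). The names `euclidean` and `knn_predict` and the toy data are hypothetical and do not reproduce the cited paper's setup.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train: List[Sample], query: Sequence[float], k: int) -> int:
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda s: euclidean(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up, already-scaled points: label 0 = non-diabetic, 1 = diabetic.
train_points = [
    ((0.20, 0.30), 0), ((0.30, 0.20), 0), ((0.25, 0.40), 0), ((0.40, 0.35), 0),
    ((0.80, 0.70), 1), ((0.70, 0.90), 1), ((0.90, 0.80), 1), ((0.60, 0.50), 1),
]
query_point = (0.50, 0.45)

# Odd values of k avoid tied votes in a two-class problem.
for k in (1, 3, 5):
    print(k, knn_predict(train_points, query_point, k))
```

On this toy data the single nearest neighbour has label 1, while the majority among the three or five nearest has label 0, so the predicted class changes with k.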
Prognosis of Diabetes Using Data mining Approach-Fuzzy C Means Clustering
and Support Vector Machine
Sanakal, Ravi & Jayakumari, Smt. (2014). Prognosis of Diabetes Using Data
mining Approach-Fuzzy C Means Clustering and Support Vector Machine.
International Journal of Computer Trends and Technology. 11. 94-98.
10.14445/22312803/IJCTT-V11P120.
Clinical decision-making needs available information to be the guidance for physicians.
Nowadays, data mining method is applied in medical research in order to analyze large
volume of medical data. This study attempts to use data mining method to analyze the
databank of Diabetes disease and diagnose the Diabetes disease. This study involves the
implementation of FCM and SVM and testing it on a set of medical data related to
diabetes diagnosis problem. The medical data, taken from the UCI repository, consists of 9
input attributes related to clinical diagnosis of diabetes, and one output attribute which
indicates whether the patient is diagnosed with diabetes or not. The whole data set
consists of 768 cases.
EXISTING SYSTEMS
Many existing studies address diabetes detection.
Data mining approaches such as clustering and classification
have been studied in existing systems.
Diabetes prediction using algorithms such as
k-Nearest Neighbour (k-NN), k-means, and the branch and
bound algorithm has been proposed. A basic diabetic
dataset was chosen for the comparative
analysis. The importance of feature analysis for
predicting diabetes with machine learning
techniques is also discussed.
ISSUES IN EXISTING SYSTEM
Detection accuracy with the existing machine learning approaches is
low.
The false positive rate is high.
There is no interactive tool for users to predict
diabetes.
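Both accuracy and the false positive rate named above follow from the confusion matrix of a binary classifier. A minimal sketch, assuming labels 0 (non-diabetic) and 1 (diabetic). The function name `binary_metrics` and the label lists are hypothetical and are not results from any system discussed here.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy and false positive rate for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # FPR = FP / (FP + TN): share of non-diabetic cases flagged as diabetic.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"accuracy": accuracy, "false_positive_rate": fpr}


# Hypothetical labels, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting the false positive rate next to accuracy matters here because a classifier can reach a reasonable accuracy while still flagging many healthy patients.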
PROPOSED SYSTEM
The proposed system treats classification of the Pima Indians
Diabetes dataset as a binary classification
problem.
This is to be achieved through machine learning
and deep learning classification algorithms.
For machine learning, the SVM algorithm is proposed.
For deep learning, a neural network is used.
The proposed system aims to improve prediction accuracy
through deep learning techniques.
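The SVM part of the proposal can be sketched without any library as a linear SVM trained by sub-gradient descent on the regularised hinge loss. This is an illustrative assumption: the slides do not state the kernel, the hyperparameters, or the library, and a real system would more likely rely on an existing SVM implementation. The function names, the learning rate, the regularisation strength, and the two-feature toy points are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def train_linear_svm(X: List[Vector], y: List[int], lr: float = 0.1,
                     lam: float = 0.01, epochs: int = 200
                     ) -> Tuple[List[float], float]:
    """Fit weights w and bias b by sub-gradient descent on the hinge loss.

    Labels must be -1 (non-diabetic) or +1 (diabetic).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Inside the margin or misclassified: hinge term is active.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Correct with enough margin: only the regulariser shrinks w.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def svm_predict(w: Vector, b: float, x: Vector) -> int:
    """Classify by the sign of the decision function w.x + b."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Made-up, standardised two-feature points (not Pima records).
X_toy = [(-1.0, -0.8), (-0.7, -1.1), (-1.2, -0.5),
         (1.0, 0.9), (0.8, 1.2), (1.3, 0.6)]
y_toy = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X_toy, y_toy)
print([svm_predict(w, b, x) for x in X_toy])
```

The toy points are linearly separable, so the sketch only shows the mechanics of margin-based training; it says nothing about the accuracy achievable on the real data.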
Requirements
Hardware Requirements
Processor : Any processor above 500 MHz
RAM : 4 GB
Hard Disk : 4 GB
Input device : Standard keyboard and mouse
Output device : VGA and high-resolution monitor
Software Requirements
Operating System : Windows 7 or higher
Programming : Python 3.6 and related libraries