A deep learning approach for predicting a hypertensive
population
Fernando López-Martínez a,1,∗, Edward Rolando Núñez-Valdez b,3,∗∗, Aron Schwarcz, MD a,2,∗∗, Vicente García-Díaz b,3,∗∗
a 350 Engle Street, Englewood Hospital, Englewood, NJ 07631, USA
b C/ Federico García Lorca, University of Oviedo, 33007, Oviedo, Spain
Abstract
This study presents a deep learning prediction model to evaluate the association between gender, race, BMI, age, smoking, kidney disease and diabetes in hypertensive individuals. Data were collected from the NHANES survey from 2007 to 2016. An imbalanced dataset of 24,434 individuals, 69.71% non-hypertensive and 30.29% hypertensive, was used for this study. The study results show a sensitivity of 40%, a specificity of 87%, a precision (positive predictive value) of 57.8% in the test sample and a calculated AUC of 77% (95% CI [75.01 - 79.01]). This deep learning model can be implemented in artificial intelligence applications to guide population health management in detecting patients with a high probability of developing hypertension. Deep learning techniques applied to large clinical datasets may provide a meaningful data-driven approach to categorize patients for population health management.
Keywords: Machine Learning, Deep Learning, Artificial Neural
Networks, Hypertension, NHANES, medical decision support systems.
∗Corresponding author
∗∗Co-Authors
Email addresses: uo259897@uniovi.es (Fernando López-Martínez), nunezedward@uniovi.es (Edward Rolando Núñez-Valdez), aron.schwarcz@ehmchealth.org (Aron Schwarcz, MD), garciavicente@uniovi.es (Vicente García-Díaz)
1 Student. Department of Computer Science, Oviedo University. Phone: +1 5515870112
2 Board Certified in Cardiovascular Disease and Nuclear Cardiology. Phone: +1 9175969186
3 Lecturer. Department of Computer Science, Oviedo University. Phone: +34 985103326
Preprint submitted to Engineering Applications of Artificial Intelligence July 12, 2018
1. Introduction
Currently, the use of deep learning models for disease prediction is increasing rapidly, not only because of the large amount of data being generated by health care devices and systems but also because of the magnitude of computational resources available for data calculation and processing (Vijayarani, 2015). This immense amount of data can be used to train models effectively, and it facilitates the use of expert systems and classification techniques for finding trends and patterns in the evaluation and prediction of several diseases. Hypertension is one of the most critical risk factors for cardiovascular disease, which caused 17.7 million deaths in the world in 2015 (World Health Organization, 2017). In the United States, this disease is a major cause of death among adults despite the existence of effective and inexpensive treatments (National Center for Health Statistics, 2017), with significant public health risks and economic implications for the U.S. population. The National Health and Nutrition Examination Survey (NHANES), conducted by the National Center for Health Statistics, has been the main source of tracking hypertension in the U.S. population (Committee on Public Health Priorities to Reduce and Control Hypertension in the U.S. Population, 2010), with huge amounts of variables and data related to cardiovascular diseases. In this study we develop a deep learning classification model to predict hypertension with non-invasive risk factors using national health data from the National Health and Nutrition Examination Survey (NHANES). An artificial neural network architecture was used to identify individuals at risk of developing hypertension. Our main goal in this study was to train a classifier that identifies hypertensive patients in a highly imbalanced NHANES dataset. Additionally, we aspire to achieve a lower error rate with a deep learning architecture compared to the logistic regression model developed in a previous study (López-Martínez et al., 2018) using the same set of input variables. The motivation to develop a new model was the non-separability of the input variables; neural networks are inherently nonlinear and are therefore well suited to modelling such nonlinearity (Dreiseitl & Ohno-Machado, 2002), making the model more flexible compared to logistic regression.
The structure of this paper is as follows: Section 2 presents related work and a literature review of several risk models that used neural networks for cardiovascular disease prediction. In Section 3, we describe the development of the deep learning model, the population and data source, and the validation of the model. Section 4 presents the statistical and clinical analysis of the outcomes. Section 5 discusses the results of the model and its limitations. Finally, Section 6 covers conclusions and future work.
2. Related work
Several papers that used artificial neural networks to predict hypertension risk were reviewed, and some other studies compared classification performance and accuracy with logistic regression. In every paper we analyzed the model building process, variable selection, ground truth, training and test datasets, overfitting avoidance, error estimation and accuracy information. We also reviewed whether the models were validated, either on an unseen dataset or by a panel of experts in the domain (Debray et al., 2015). Some of the models are shown in Table 1.
LaFreniere et al. (LaFreniere et al., 2016) present an Artificial Neural Network (ANN) to predict hypertensive individuals utilizing the Canadian Primary Care Sentinel Surveillance Network (CPCSSN) dataset. The independent variables used were age, gender, BMI, systolic and diastolic blood pressure, high and low density lipoproteins, triglycerides, cholesterol, microalbumin, and urine albumin creatinine ratio. A confusion matrix and Receiver Operating Characteristic (ROC) curve were used to measure the accuracy of the model. This study used a very large dataset to train the model compared with other studies.
Polak and Mendyk (Polak & Mendyk, 2008) improve and validate an artificial neural network high blood pressure risk prediction model, using data from the Centers for Disease Control and National Center for Health Statistics (CDC-NCHS).
Table 1: Related work: Artificial Neural Network model comparison

| Author | Input variables | n Total | Type of model | AUC |
| --- | --- | --- | --- | --- |
| LaFreniere et al. | age, gender, BMI, sys/diast BP, high and low density lipoproteins, triglycerides, cholesterol, microalbumin, and urine albumin creatinine ratio | 379,027 | backpropagation neural network | 82% |
| Polak and Mendyk | age, sex, diet, smoking and drinking habits, physical activity level and BMI | 159,989 | backpropagation (BP) and fuzzy network | 75% |
| Zi-Hui Tang et al. | sys/diast BP, fasting plasma glucose, age, BMI, heart rate, gender, WC, diabetes, renal profile | 2,092 | feed-forward, backpropagation neural network | 76% |
| Mevlut Ture et al. | age, sex, hypertension, smoking, lipoprotein, triglyceride, uric acid, total cholesterol, BMI | 694 | feed-forward neural network | 81% |
| Ke-Shiuan Lynn et al. | sixteen genes, age, BMI, fasting blood sugar, hypertension medication, no history of cancer, kidney, liver or lung disease | 22,184 genes, 159 cases | one-hidden-layer neural network | 96.72% |
| Sakr et al. | age, gender, race, reason for test, stress, medical history | 23,095 | backpropagation neural network | 64% |
| Our study | age, gender, ethnicity, BMI, smoking history, kidney disease, diabetes | 24,434 | feed-forward, three-hidden-layer neural network | 77% |
The independent variables used in this model were age, sex, diet, smoking and drinking habits, physical activity level and BMI. The ROC curve was used to measure the accuracy of the model, and a comparison with a logistic regression classification model was performed.
Zi-Hui Tang et al. (Tang et al., 2013) developed an artificial neural network for the prediction of cardiovascular disease, including hypertension, based on a Chinese population. Univariate analysis indicated that 14 risk factors were statistically significant for cardiovascular disease. The ROC curve was used to measure the performance of the model.
Mevlut Ture et al. (Ture et al., 2005) implemented a multilayer perceptron for the classification of hypertensive patients. The predictor variables used were age, sex, family history of hypertension, smoking habits, lipoprotein, triglyceride, uric acid, total cholesterol, and body mass index (BMI). The Receiver Operating Characteristic (ROC) curve was used to measure the accuracy of the model.
Ke-Shiuan Lynn et al. (Lynn et al., 2009) constructed a neural network model to simulate the gene-endophenotype-disease relationship for Taiwanese hypertensive males. The inputs were sixteen genes, age, BMI, fasting blood sugar, hypertension medication, and no history of cancer or kidney, liver or lung disease. Classification accuracy was used to measure the performance of the model.
Sakr et al. (Sakr et al., 2018) built an artificial neural network and compared its performance with different machine learning techniques for predicting the risk of developing hypertension. Age, gender, race, reason for test, stress tests and medical history were used for prediction. The Receiver Operating Characteristic (ROC) curve was used to measure the accuracy of the model.
3. Model Development and Validation
We developed a deep learning classification model in this study, and we discuss and analyze how the model was implemented, trained and evaluated. The scikit-learn package for the Python programming language (Pedregosa et al., 2012), the Microsoft Cognitive Toolkit (CNTK) from Microsoft (Seide & Agarwal, 2016) and AzureML (Team et al., 2016) were used to build the model.
3.1. Data Source

The National Health and Nutrition Examination Survey (NHANES) datasets from NHANES 2007-2008 to NHANES 2015-2016 were used. Demographic data, laboratory data, blood pressure, body measures data and questionnaires related to diabetes, cigarette smoking and kidney conditions were part of the dataset. NHANES data were released as SAS transport files and imported for the study using the xport 2.0 Python reader.
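The SAS transport files can also be read directly with pandas. The sketch below is illustrative and assumes NHANES file names (DEMO_I.XPT, BPX_I.XPT) and the standard SEQN respondent key; it is not the authors' exact import code.

```python
import pandas as pd

# Each NHANES cycle ships one SAS transport (.XPT) file per component.
# The file names below follow the 2015-2016 naming convention and are
# illustrative; SEQN is the respondent sequence number shared by all files.
demo = pd.read_sas("DEMO_I.XPT", format="xport")   # demographics
bpx = pd.read_sas("BPX_I.XPT", format="xport")     # blood pressure examination

# Merge the components into a single analysis table on SEQN.
cycle_2015_2016 = demo.merge(bpx, on="SEQN", how="inner")
print(cycle_2015_2016.shape)
```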
3.2. Study Population and Analysis
Data of the National Health and Nutrition Examination Survey (NHANES),
collected during 2007 - 2016 were used to train and test the prediction model.
This deep learning model was created to evaluate the importance of several risk factor variables and their relationship with the prevalence of hypertension among a representative sample of adults ≥ 20 years in the United States (n=24,434). Table 2 shows the distribution of the hypertensive patients by gender and race.
3.3. Input variables

For this study, the input variables selected were age, race, Body Mass Index (BMI), cigarette smoking, kidney disease presence and diabetes. Chronic kidney disease (CKD) was defined as a "yes" response to the question "Have you ever been told by a health care provider you have weak or failing kidneys?" during the interview; for NHANES 2015-2016, CKD was defined as a glomerular filtration rate (GFR) ≤ 60 ml/min/1.73 m² (Miller, 2009) and an albumin-creatinine ratio ≥ 30 mg/g (Center for Disease Control and Prevention, 2015). Participants were considered to have diabetes if they answered "Yes" or "Borderline" to the question "Doctor told you have diabetes?" (U.S. Department of Health and Human Services, 2005).
Table 2: n samples by hypertensive class, gender and race. Hypertension, adults 20 and over, 2007-2016.

| Class | Gender | Race | n |
| --- | --- | --- | --- |
| Hypertensive | Female | Mexican American | 464 |
| Hypertensive | Female | Non-Hispanic Black | 925 |
| Hypertensive | Female | Non-Hispanic White | 1,433 |
| Hypertensive | Female | Other Hispanic | 368 |
| Hypertensive | Female | Other Race - Including Multi-Racial | 277 |
| Hypertensive | Male | Mexican American | 575 |
| Hypertensive | Male | Non-Hispanic Black | 1,039 |
| Hypertensive | Male | Non-Hispanic White | 1,582 |
| Hypertensive | Male | Other Hispanic | 371 |
| Hypertensive | Male | Other Race - Including Multi-Racial | 365 |
| Non-Hypertensive | Female | Mexican American | 1,461 |
| Non-Hypertensive | Female | Non-Hispanic Black | 1,676 |
| Non-Hypertensive | Female | Non-Hispanic White | 3,663 |
| Non-Hypertensive | Female | Other Hispanic | 1,084 |
| Non-Hypertensive | Female | Other Race - Including Multi-Racial | 1,038 |
| Non-Hypertensive | Male | Mexican American | 1,275 |
| Non-Hypertensive | Male | Non-Hispanic Black | 1,465 |
| Non-Hypertensive | Male | Non-Hispanic White | 3,585 |
| Non-Hypertensive | Male | Other Hispanic | 820 |
| Non-Hypertensive | Male | Other Race - Including Multi-Racial | 968 |
| Total | | | 24,434 |
Cigarette smokers were defined as adults who reported having smoked ≥ 100 cigarettes during their lifetime and who currently smoke every day or some days (Centers for Disease Control, 2016). Age and BMI were categorized, i.e. transformed from continuous variables to categorical variables, to help capture the real relationship between the variables and to reduce the scale difference between them. Blood pressure was used to create the dependent variable, where hypertension was defined as a mean systolic blood pressure of 130 mmHg or higher, previously defined as 140 mmHg by the American Heart Association (Whelton et al., 2017). Table 3 shows the independent variables and the dependent variable.
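As an illustration of the recoding described above, a minimal pandas sketch is shown below. The column names RIDAGEYR and BMXBMI are standard NHANES names, while BPXSY_MEAN and the toy data are assumptions made for the example; this is not the authors' preprocessing code.

```python
import pandas as pd

# Toy stand-in for the merged NHANES table. RIDAGEYR and BMXBMI are real
# NHANES variable names; BPXSY_MEAN is an assumed name for the mean
# systolic reading computed from the blood pressure examination.
df = pd.DataFrame({"RIDAGEYR": [25, 47, 63],
                   "BMXBMI": [17.9, 26.4, 31.0],
                   "BPXSY_MEAN": [118.0, 135.0, 142.0]})

# Age and BMI recoded from continuous to categorical codes, as in Table 3.
df["AGERANGE"] = pd.cut(df["RIDAGEYR"], bins=[20, 30, 40, 50, 60, 70, 80],
                        labels=[1, 2, 3, 4, 5, 6], include_lowest=True)
df["BMIRANGE"] = pd.cut(df["BMXBMI"],
                        bins=[0, 18.5, 24.9, 29.9, float("inf")],
                        labels=[1, 2, 3, 4])

# Dependent variable: hypertensive if mean systolic pressure >= 130 mmHg.
df["HYPCLASS"] = (df["BPXSY_MEAN"] >= 130).astype(int)
print(df)
```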
3.4. Variable selection

Clinical importance was considered in addition to statistical significance to select the final inputs for the model. For this study, the chi-squared test was selected because of the categorical form of the data used in the model, and previous work indicates that this statistical test should be applied to evaluate sets of categorical data (Feizi-Derakhshi & Ghaemi, 2014; Razmjoo et al., 2017; Uysal & Gunal, 2014; Seret et al., 2015). Some heuristic techniques, filters and wrapper methods were investigated to confirm the usefulness of the variables. A Genetic Algorithm in combination with other machine learning methods has been reported to show more desirable results (Uysal & Gunal, 2014). Some modifications to the original Genetic Algorithm in combination with artificial neural networks have shown acceptable time complexity for finding optimal solutions (Jiang et al., 2017; Wu et al., 2011), and they will be considered and discussed in future studies. For this study, the chi-squared statistic and its measure of association between each non-negative feature and the class were used because of the characteristics of the inputs. Table 4 shows all variables with their p-values and scores. Based on the clinical significance of all input variables, our clinical expert decided not to exclude any variable from the study, given the relationships between variables found in previous studies.
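The chi-squared scores and p-values of Table 4 can be obtained with scikit-learn, which the study already uses. The following is a minimal sketch on synthetic coded data; the real analysis would use the encoded NHANES variables.

```python
import numpy as np
from sklearn.feature_selection import chi2

features = ["GENDER", "AGERANGE", "RACE", "BMIRANGE",
            "KIDNEY", "SMOKE", "DIABETES"]

# Synthetic non-negative categorical codes standing in for the NHANES table;
# chi2 requires non-negative features, which the coded variables satisfy.
rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(200, len(features)))
y = rng.integers(0, 2, size=200)

# Chi-squared statistic and p-value between each feature and the class,
# the quantities reported in Table 4.
scores, p_values = chi2(X, y)
for name, score, p in zip(features, scores, p_values):
    print(f"{name}: score={score:.3f}, p-value={p:.4f}")
```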
3.5. Deep Learning Model
Deep learning describes a class of machine learning algorithms that receive input from one or more sources into layers of intermediate features or nodes which are chained together (hidden layers) to build a network that generates output values (Ching et al., 2018).
Table 3: Variables included in the model

| Variable Name | Description | Code | Meaning |
| --- | --- | --- | --- |
| GENDER | Gender | 1 | Male |
| | | 2 | Female |
| AGERANGE | Age at screening, adjudicated (date of birth was used to calculate age) | 1 | 20-30 |
| | | 2 | 31-40 |
| | | 3 | 41-50 |
| | | 4 | 51-60 |
| | | 5 | 61-70 |
| | | 6 | 71-80 |
| RACE | Race/Hispanic origin | 1 | Mexican American |
| | | 2 | Other Hispanic |
| | | 3 | Non-Hispanic White |
| | | 4 | Non-Hispanic Black |
| | | 5 | Other Race - Including Multi-Racial |
| BMXBMI | Body Mass Index (kg/m²) | 1 | Underweight (< 18.5) |
| | | 2 | Normal weight (18.5-24.9) |
| | | 3 | Overweight (25-29.9) |
| | | 4 | Obesity (BMI of 30 or greater) |
| KIDNEY | Ever told you had weak/failing kidneys | 1 | Yes |
| | | 2 | No |
| SMOKE | Smoked at least 100 cigarettes in life | 1 | Yes |
| | | 2 | No |
| DIABETES | Doctor told you have diabetes | 1 | Yes |
| | | 2 | No |
| | | 3 | Borderline |
| HYPCLASS | Systolic blood pressure (mean), mm Hg | 0 | Non-Hypertensive |
| | | 1 | Hypertensive |
Table 4: Chi-squared statistic between each feature and the class

| Feature | p-value | Score |
| --- | --- | --- |
| GENDER | 0.3988107 | 0.711909 |
| AGERANGE | 0.000000 | 1965.607023 |
| RACE | 0.008822 | 6.858521 |
| BMIRANGE | 0.0172385 | 5.67193 |
| KIDNEY | 0.3546428 | 0.856775 |
| SMOKE | 0.0975246 | 2.745566 |
| DIABETES | 0.0012164 | 10.465222 |
Figure 1 shows the structure of the artificial neural network architecture used in this study. The input nodes receive categorical values that are encoded and normalized with Gaussian normalization (Jain et al., 2018) to improve the computation of the network and to prevent the features from lying on widely different scales. Three fully connected hidden layers, in addition to the input layer and output layer, were added to the network to generate our feedforward multilayer perceptron architecture. The number of hidden layers and hidden nodes depends on the complexity of the problem and is an active area of research. For our study, the forward approach described by Panchal & Panchal (2014) was used, and three hidden layers were enough for our feedforward network to act as a universal approximator of the function that separates our input variables, solving classification problems of arbitrary complexity while preventing overfitting.
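The Gaussian normalization of the encoded inputs corresponds to z-score standardization. As an illustrative equivalent (the authors built the pipeline in CNTK/AzureML, so this scikit-learn sketch is an assumption about the preprocessing, not their code):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy matrix of the seven coded input variables (one row per participant).
X = np.array([[1.0, 2, 3, 2, 2, 1, 2],
              [2.0, 5, 1, 4, 2, 2, 1],
              [1.0, 3, 4, 3, 1, 2, 3]])

# Z-score standardization: each column is rescaled to zero mean and unit
# variance, so no input dominates the others because of its scale.
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```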
We can define a simple feedforward network in which the first layer takes an input vector $x$ of dimension $m$ and produces the output of the first hidden layer, $S^1$, of dimension $n$.
Figure 1: Multilayer Perceptron Architecture (seven input nodes X1-X7, three hidden layers, and two output nodes for classes 0 and 1).
Each feature in the input layer is connected to each node of the layer's output by a weight, and the weights are represented by a matrix $W$ of dimensions $m \times n$:
$$S^1 = W \cdot x + b$$
where $b$ is a bias vector of dimension $n$.
In the architecture of our network, each layer has its own weight matrix $W$, its own bias vector $b$, an input vector and an output vector $o$. The layers are identified by superscripts appended to these variables. For $R$ inputs, $S^1$ represents the number of neurons in the first layer and $S^2$ the number of neurons in the second layer. The output layer of our model is layer 4, preceded by the three hidden layers 1, 2 and 3. The output of layer 1 is the input of layer 2, the output of layer 2 is the input of layer 3, and the output of layer 3 is the input of the output layer.

If:
$$o^1 = f^1(W^1 x + b^1)$$
$$o^2 = f^2(W^2 o^1 + b^2)$$
$$o^3 = f^3(W^3 o^2 + b^3)$$
$$o^4 = f^4(W^4 o^3 + b^4)$$
then
$$o^4 = f^4\big(W^4 f^3\big(W^3 f^2\big(W^2 f^1(W^1 x + b^1) + b^2\big) + b^3\big) + b^4\big)$$

The structure of our network can be expressed compactly as:
$$R \rightarrow S^1 \rightarrow S^2 \rightarrow S^3 \rightarrow S^4$$
Each layer is a dense layer with specific input dimensions, output dimensions and activation function. Specifically, in our model the first layer has an input dimension of 7 variables, an output dimension of 64 nodes and ReLU activation; the second layer has an input dimension of 64 nodes, an output dimension of 32 nodes and ReLU activation; and the third layer has an input of 32 nodes, an output of 16 nodes and ReLU activation. The output layer has no activation. The final output layer emits a vector of 2 values, and we used softmax to normalize the output of the model and to map the accumulated evidence (activations) to a probability distribution over the classes, defined by the expression:
Softmax computes the gradient of $f(z) = \log \sum_i \exp(z_i)$ at $z = x$:
$$\mathrm{softmax}(x) = \left[\frac{\exp(x_1)}{\sum_i \exp(x_i)}, \frac{\exp(x_2)}{\sum_i \exp(x_i)}, \ldots, \frac{\exp(x_n)}{\sum_i \exp(x_i)}\right]$$
The output is a vector of non-negative numbers that sum to 1 and can be interpreted as probabilities for mutually exclusive outcomes (Triwijoyo et al., 2017).
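A small numerical illustration of this normalization on a two-element output vector (plain NumPy, with arbitrary example activations):

```python
import numpy as np

def softmax(z):
    # Subtract the maximum for numerical stability; the ratios are unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Raw activations for the two classes (non-hypertensive, hypertensive).
logits = np.array([1.2, -0.4])
probs = softmax(logits)
print(probs, probs.sum())   # roughly [0.832, 0.168], summing to 1
```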
3.6. Model Training

For training the model we would like the generated probabilities to be as close as possible to the observed labels. We calculated the loss function, which measures the difference between the distribution learned by the model and the distribution given by the training labels (Singh Gill et al., 2018). Cross entropy with softmax was used in our study. We used the derivative of softmax to derive the derivative of the cross entropy loss function (Potash & Rumshisky, 2016):
$$\frac{\partial L}{\partial o_i} = p_i\Big(y_i + \sum_{k \neq i} y_k\Big) - y_i \qquad (1)$$
$y$ is the one-hot encoded output vector of the labels, so $\sum_k y_k = 1$ and $y_i + \sum_{k \neq i} y_k = 1$. Then,
$$\frac{\partial L}{\partial o_i} = p_i - y_i \qquad (2)$$
Once the loss function is defined, the model minimizes it using an optimization technique. In this study we used stochastic gradient descent with momentum (Allen-Zhu & Orecchia, 2014). The model starts with a random initialization of the parameters and generates a new set of parameters after each evaluation. We used minibatches to train the model: a small number of observations is loaded into the model to calculate the average of the loss and update the model parameters. Another key parameter used in our training was the learning rate (Takase et al., 2018), a factor that moderates how much the parameters change in each iteration. Each iteration works on 10 samples, we trained the model with 70% (17,104) of the dataset, and the number of minibatches to train is given by the number of training samples divided by the minibatch size. Table 5 shows the parameters of the trainer architecture, and Figure 2 and Figure 3 show the training loss and prediction error of the minibatch run for the model.
Table 5: Model architecture parameters

| Parameter | Value |
| --- | --- |
| Input dimension | 7 |
| Num output classes | 2 |
| Num hidden layers | 3 |
| Hidden layer 1 dimension | 64 |
| Activation func layer 1 | ReLU |
| Hidden layer 2 dimension | 32 |
| Activation func layer 2 | ReLU |
| Hidden layer 3 dimension | 16 |
| Activation func layer 3 | ReLU |
| Minibatch size | 10 |
| Num samples to train | 17,104 |
| Num minibatches to train | 1,710 |
| Loss function | cross entropy with softmax |
| Eval error | classification error |
| Learner for parameters | momentum SGD |
| Eval metrics | confusion matrix, AUC |

Figure 2: Training Error. Figure 3: Loss Error.
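The network itself was built with CNTK. As an illustration of the configuration in Table 5, an equivalent setup can be sketched with scikit-learn's MLPClassifier, which the authors also cite; the synthetic data and the learning rate below are assumptions made for the example only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the encoded, normalized 7-variable input matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 7))
y = rng.integers(0, 2, size=1000)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Three hidden layers of 64, 32 and 16 ReLU units, trained with minibatch
# stochastic gradient descent with momentum and a log-loss objective
# (equivalent to cross entropy with softmax for classification).
clf = MLPClassifier(hidden_layer_sizes=(64, 32, 16), activation="relu",
                    solver="sgd", momentum=0.9, batch_size=10,
                    learning_rate_init=0.01, max_iter=50, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```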
3.7. Model Evaluation

In order to evaluate the classification performance of the model, we compute the average test error: the index of the highest value in the output vector is compared to the actual ground-truth label. We evaluated the trained network on data that had not been used for training, corresponding to 30% (7,331 observations) of the dataset. The resulting error is comparable to the training error, which indicates that our model has a good generalization error. Dealing well with unseen observations is one of the keys to avoiding overfitting (Subramanian & Simon, 2013).
For each observation, our model uses softmax as the evaluation function, returning a probability distribution across all the classes; in our study this is a vector of 2 elements per observation. The output nodes of the model convert the activations into probabilities, mapping the aggregated activations across the network to probabilities over the 2 classes. Figure 4 shows the test prediction error for our model.
We also generated other evaluation metrics for the classifier. Table 6 shows the confusion matrix, summarizing the actual class labels vs. the predicted ones: true positives (887), true negatives (4,477), false negatives (1,318) and false positives (648).
Table 6: Confusion Matrix

| True class | Predicted Non-Hypertensive | Predicted Hypertensive |
| --- | --- | --- |
| Non-Hypertensive | 4,477 | 648 |
| Hypertensive | 1,318 | 887 |
Figure 4: Test prediction error
The classification report is shown in Table 7; it lists the precision, the sensitivity and the harmonic mean of precision and sensitivity (f1-score). Precision on the hypertensive class is markedly low because of the large number of false positives, and the sensitivity of the model is only moderately acceptable because of the imbalanced testing dataset, which is reflected in the high number of false negatives.
Table 7: Classification Report (positive label: 1, negative label: 0)

| Metric | Value |
| --- | --- |
| True positives | 887 |
| False negatives | 1,318 |
| False positives | 648 |
| True negatives | 4,477 |
| Precision | 0.578 |
| Recall | 0.402 |
| f1-score | 0.474 |
| Accuracy | 0.732 |
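The confusion matrix, classification report and AUC reported in Tables 6-7 and Figure 5 can be reproduced with scikit-learn's metrics module; the sketch below assumes the clf, X_test and y_test objects from the training sketch above.

```python
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_auc_score)

y_pred = clf.predict(X_test)
y_score = clf.predict_proba(X_test)[:, 1]     # probability of the positive class

print(confusion_matrix(y_test, y_pred))       # rows: true class, columns: predicted
print(classification_report(y_test, y_pred))  # precision, recall, f1 per class
print("AUC:", roc_auc_score(y_test, y_score))
```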
4. Results and Analysis
Statistical and clinical analyses are important to explain the results and usefulness of the model. In the next sections we review some of the results.
4.1. Statistical Analysis

A multilayer perceptron is a class of non-linear model and was used in this study to capture the unknown nonlinearity of the input variables. Our test sample of 7,330 observations includes 5,125 (69.9%) non-hypertensive samples and 2,205 (30.1%) hypertensive samples. The model produces a sensitivity of 887/2,205 = 40% (positives that were correctly identified) and a specificity of 4,477/5,125 = 87% (negatives that were correctly identified).
The positive predictive value was 887/1,535 = 57% and the negative predictive value 4,477/5,795 = 77%. The false negative rate of the model was 1,318/2,205 = 59% and the false positive rate was 648/5,125 = 12%. The probability of false alarm of the model was therefore 12%, and the negative likelihood ratio was 0.68. The miss rate and fall-out rate were not significant for our model given the imbalance of the sampling dataset. In this study, the multilayer perceptron model was better at identifying individuals who will not develop hypertension than those who will. Figure 5 shows the area under the curve plotted with the true positive rate on the y axis and the false positive rate on the x axis. Figure 6 shows the plot of the proportion of true results over all positive results versus the fraction of all correct results returned by the model.
Figure 5: ROC Curve. Figure 6: Precision/Recall curve.
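The rates reported in this section follow directly from the confusion matrix counts in Table 6; a small worked computation:

```python
# Counts taken from Table 6 (test set of 7,330 observations).
tp, fn = 887, 1318      # hypertensive row
fp, tn = 648, 4477      # non-hypertensive row

sensitivity = tp / (tp + fn)                    # 0.402
specificity = tn / (tn + fp)                    # 0.874
ppv = tp / (tp + fp)                            # 0.578, positive predictive value
npv = tn / (tn + fn)                            # 0.773, negative predictive value
false_alarm = fp / (fp + tn)                    # 0.126, false positive rate
negative_lr = (1 - sensitivity) / specificity   # about 0.68
print(sensitivity, specificity, ppv, npv, false_alarm, negative_lr)
```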
4.2. Clinical Analysis

A sensitivity of 40% and a specificity of 87% show that our model might be ineffective for healthcare diagnosis in finding positive cases, but the true negative rate shows that it can be efficient at finding non-hypertensive patients. The high negative predictive value of 77% supports the adequacy of the model as a screening tool. The positive predictive value of 57% shows that our model works only slightly better than random guessing for positive cases, while the probability that a negative result is wrong is comparatively low. The accuracy of 73% shows that our model correctly identifies non-hypertensive and hypertensive patients reasonably often, given the characteristics of the dataset and the number of cases that were examined.
5. Discussion and Limitations

Even though a multilayer perceptron with a single hidden layer can model a huge variety of problems in the clinical domain, in our study a model with three hidden layers was useful to approximate the highly nonlinear behavior of the input variables. The results of the model were affected by the imbalanced dataset, but we did not balance it, in order to maintain the real distribution of the samples.
The study has shown that the prediction capability of the model was improved (AUC of 77%) over the logistic regression model used in a previous study (AUC of 73%) when applied to the input variables gender, race, BMI, age, smoking, kidney disease and diabetes. However, challenges in applying deep learning to the clinical domain remain. The use of deep learning to analyze hypertension risk variables can be considered a complement to the traditional approach and might be used to validate other statistical models.
One of the major limitations of our model is that it was developed on a highly imbalanced dataset from the CDC in which a high prevalence of non-hypertensive patients is observed. Therefore, our model must be validated in other clinical settings, and further studies should include other neural network architectures to improve the precision and recall of the model before considering its integration into the clinical diagnostic scheme. Also, a larger volume of training data will be needed to train our model and improve the classification results.
6. Conclusions and Future work
We proposed a deep learning approach to overcome the nonlinearity problem of the risk factor variables used as input to the multilayer perceptron model. This study shows that the proposed model improves the accuracy and performance of a previous study that used the same input variables, and the results were better than logistic regression by a small margin. The influence of the imbalanced dataset, which biases predictions toward the majority class, was confirmed by our multilayer perceptron model. This study showed that the proposed model can guide the design of other models. This model cannot be used to provide a final diagnosis of developing hypertension to the patient. However, it can be used to make patients aware that there is a probability that they could be developing hypertension.
For future work, we will apply this model to a real balanced dataset and will explore further network architectures in search of better results. In addition, new risk factors can be used to better handle the distribution and behavior of the input variables of the model. This study will be the groundwork for the construction of a decision support tool that may be useful to healthcare practitioners, contributing to decision making about the risk of developing hypertension in typical or atypical patient screening circumstances.
Acknowledgments
This document presents an independent study funded by the company SYSDEVELOPMENT LLC. The points of view expressed are those of the authors and not necessarily those of the NIHR, the NHS, the NHANES or the Department of Health. We thank Wilton Alexander Marroquin Bocanegra, Microsoft Corporation Most Valuable Professional in Artificial Intelligence, and Babafemi Thompson, Clinical Systems Analyst at Englewood Hospital, who provided insight and expertise that greatly assisted the study.
References
Allen-Zhu, Z., & Orecchia, L. (2014). Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent. Distill, 2, e6. doi:10.23915/distill.00006. arXiv:1407.1537.

Center for Disease Control and Prevention (2015). Percentage with CKD stage 3 or 4 who were aware of their disease by stage and age 1999-2012. Technical Report. URL: http://nccd.cdc.gov/ckd/detail.aspx?QNum=Q98.

Centers for Disease Control (2016). Current Cigarette Smoking Prevalence Among Working Adults, United States, 2004-2010. Technical Report, Morbidity and Mortality Weekly Report (MMWR). URL: https://www.cdc.gov/mmwr/preview/mmwrhtml/mm6038a2.htm.

Ching, T., Himmelstein, D. S., Beaulieu-Jones, B. K., Kalinin, A. A., Do, B. T., Way, G. P., Ferrero, E., Agapow, P. M., Zietz, M., Hoffman, M. M., Xie, W., Rosen, G. L., Lengerich, B. J., Israeli, J., Lanchantin, J., Woloszynek, S., Carpenter, A. E., Shrikumar, A., Xu, J., Cofer, E. M., Lavender, C. A., Turaga, S. C., Alexandari, A. M., Lu, Z., Harris, D. J., Decaprio, D., Qi, Y., Kundaje, A., Peng, Y., Wiley, L. K., Segler, M. H., Boca, S. M., Swamidass, S. J., Huang, A., Gitter, A., & Greene, C. S. (2018). Opportunities and obstacles for deep learning in biology and medicine. Journal of the Royal Society Interface, 15. doi:10.1098/rsif.2017.0387.

Committee on Public Health Priorities to Reduce and Control Hypertension in the U.S. Population (2010). A Population-Based Policy and Systems Change Approach to Prevent and Control Hypertension. doi:10.17226/12819.

Debray, T. P. A., Vergouwe, Y., Koffijberg, H., Nieboer, D., Steyerberg, E. W., & Moons, K. G. M. (2015). A new framework to enhance the interpretation of external validation studies of clinical prediction models. Journal of Clinical Epidemiology, 68, 279-289. doi:10.1016/j.jclinepi.2014.06.018.

Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: A methodology review. doi:10.1016/S1532-0464(03)00034-0.

Feizi-Derakhshi, M.-R., & Ghaemi, M. (2014). Classifying Different Feature Selection Algorithms Based on the Search Strategies. International Conference on Machine Learning, Electrical and Mechanical Engineering (ICMLEME'2014) (pp. 17-21). doi:10.15242/IIE.E0114032.

Jain, S., Shukla, S., & Wadhvani, R. (2018). Dynamic selection of normalization techniques using data complexity measures. Expert Systems with Applications, 106, 252-262. doi:10.1016/j.eswa.2018.04.008.

Jiang, S., Chin, K. S., Wang, L., Qu, G., & Tsui, K. L. (2017). Modified genetic algorithm-based feature selection combined with pre-trained deep neural network for demand forecasting in outpatient department. Expert Systems with Applications, 82, 216-230. doi:10.1016/j.eswa.2017.04.017.

LaFreniere, D., Zulkernine, F., Barber, D., & Martin, K. (2016). Using machine learning to predict hypertension from a clinical dataset. In 2016 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1-7). doi:10.1109/SSCI.2016.7849886.

López-Martínez, F., Schwarcz, A., Núñez-Valdez, E. R., & García-Díaz, V. (2018). Machine Learning Classification Analysis for a Hypertensive Population as a Function of Several Risk Factors. Expert Systems with Applications, 110, 206-215. doi:10.1016/j.eswa.2018.06.006.

Lynn, K. S., Li, L. L., Lin, Y. J., Wang, C. H., Sheng, S. H., Lin, J. H., Liao, W., Hsu, W. L., & Pan, W. H. (2009). A neural network model for constructing endophenotypes of common complex diseases: An application to male young-onset hypertension microarray data. Bioinformatics, 25, 981-988. doi:10.1093/bioinformatics/btp106.

Miller, W. G. (2009). Estimating glomerular filtration rate. Clinical Chemistry and Laboratory Medicine, 47, 1017-1019. doi:10.1515/CCLM.2009.264.

National Center for Health Statistics (2017). Health, United States, 2016: With Chartbook on Long-term Trends in Health. Technical Report. URL: https://www.cdc.gov/nchs/data/hus/hus16.pdf.

Panchal, F. S., & Panchal, M. (2014). Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network. International Journal of Computer Science and Mobile Computing, 3(11), 455-464. URL: www.ijcsmc.com.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2012). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825-2830. arXiv:1201.0490.

Polak, S., & Mendyk, A. (2008). Artificial neural networks based Internet hypertension prediction tool development and validation. Applied Soft Computing, 8, 734-739. doi:10.1016/j.asoc.2007.06.001.

Potash, P., & Rumshisky, A. (2016). Recommender system incorporating user personality profile through analysis of written reviews. In CEUR Workshop Proceedings (pp. 60-66), volume 1680.

Razmjoo, A., Xanthopoulos, P., & Zheng, Q. P. (2017). Online feature importance ranking based on sensitivity analysis. Expert Systems with Applications, 85, 397-406. doi:10.1016/j.eswa.2017.05.016.

Sakr, S., Elshawi, R., Ahmed, A., Qureshi, W. T., Brawner, C., Keteyian, S., Blaha, M. J., & Al-Mallah, M. H. (2018). Using machine learning on cardiorespiratory fitness data for predicting hypertension: The Henry Ford exercise testing (FIT) Project. PLoS ONE, 13, e0195344. doi:10.1371/journal.pone.0195344.

Seide, F., & Agarwal, A. (2016). CNTK. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16) (pp. 2135-2135). New York, NY, USA: ACM. doi:10.1145/2939672.2945397.

Seret, A., Maldonado, S., & Baesens, B. (2015). Identifying next relevant variables for segmentation by using feature selection approaches. Expert Systems with Applications, 42, 6255-6266. doi:10.1016/j.eswa.2015.01.070.

Singh Gill, H., Singh Khehra, B., Singh, A., & Kaur, L. (2018). Teaching-learning-based optimization algorithm to minimize cross entropy for selecting multilevel threshold values. doi:10.1016/j.eij.2018.03.006.

Subramanian, J., & Simon, R. (2013). Overfitting in prediction models - Is it a problem only in high dimensions? Contemporary Clinical Trials, 36, 636-641. doi:10.1016/j.cct.2013.06.011.

Takase, T., Oyama, S., & Kurihara, M. (2018). Effective neural network training with adaptive learning rate based on training loss. Neural Networks, 101, 68-78. doi:10.1016/j.neunet.2018.01.016.

Tang, Z.-H., Liu, J., Zeng, F., Li, Z., Yu, X., & Zhou, L. (2013). Comparison of Prediction Model for Cardiovascular Autonomic Dysfunction Using Artificial Neural Network and Logistic Regression Analysis. PLoS ONE, 8, e70571. doi:10.1371/journal.pone.0070571.

Team, A., Dorard, L., Reid, M. D., & Martin, F. J. (2016). AzureML: Anatomy of a machine learning service. JMLR: Workshop and Conference Proceedings, 50, 1-13. URL: http://proceedings.mlr.press/v50/azureml15.pdf.

Triwijoyo, B. K., Budiharto, W., & Abdurachman, E. (2017). The Classification of Hypertensive Retinopathy using Convolutional Neural Network. Procedia Computer Science, 116, 166-173. doi:10.1016/j.procs.2017.10.066.

Ture, M., Kurt, I., Turhan Kurum, A., & Ozdamar, K. (2005). Comparing classification techniques for predicting essential hypertension. Expert Systems with Applications, 29, 583-588. doi:10.1016/j.eswa.2005.04.014.

U.S. Department of Health and Human Services (2005). Awareness of Prediabetes, United States, 2005-2010. Morbidity and Mortality Weekly Report, 62, 209-212. URL: https://www.cdc.gov/mmwr/preview/mmwrhtml/mm6211a4.htm.

Uysal, A. K., & Gunal, S. (2014). Text classification using genetic algorithm oriented latent semantic features. Expert Systems with Applications, 41, 5938-5947. doi:10.1016/j.eswa.2014.03.041.

Vijayarani, S. (2015). Liver Disease Prediction using SVM and Naïve Bayes Algorithms. International Journal of Science, Engineering and Technology Research, 4, 816-820.

Whelton, P. K., Carey, R. M., Aronow, W. S., Casey, D. E., Collins, K. J., Dennison Himmelfarb, C., DePalma, S. M., Gidding, S., Jamerson, K. A., Jones, D. W., MacLaughlin, E. J., Muntner, P., Ovbiagele, B., Smith, S. C., Spencer, C. C., Stafford, R. S., Taler, S. J., Thomas, R. J., Williams, K. A., Williamson, J. D., & Wright, J. T. (2017). 2017 ACC/AHA/AAPA/ABC/ACPM/AGS/APhA/ASH/ASPC/NMA/PCNA Guideline for the Prevention, Detection, Evaluation, and Management of High Blood Pressure in Adults. Journal of the American College of Cardiology, 735-1097. doi:10.1016/j.jacc.2017.11.006.

World Health Organization (2017). World Health Statistics 2017: Monitoring Health for the SDGs. URL: http://apps.who.int/iris/bitstream/10665/255336/1/9789241565486-eng.pdf.

Wu, Y.-L., Tang, C.-Y., Hor, M.-K., & Wu, P.-F. (2011). Feature selection using genetic algorithm and cluster validation. Expert Systems with Applications, 38, 2727-2732. doi:10.1016/j.eswa.2010.08.062.
Top profile Call Girls In Morena [ 7014168258 ] Call Me For Genuine Models We...
gajnagarg
 

Recently uploaded (20)

NAP Expo - Delivering effective and adequate adaptation.pptx
NAP Expo - Delivering effective and adequate adaptation.pptxNAP Expo - Delivering effective and adequate adaptation.pptx
NAP Expo - Delivering effective and adequate adaptation.pptx
 
Time, Stress & Work Life Balance for Clerks with Beckie Whitehouse
Time, Stress & Work Life Balance for Clerks with Beckie WhitehouseTime, Stress & Work Life Balance for Clerks with Beckie Whitehouse
Time, Stress & Work Life Balance for Clerks with Beckie Whitehouse
 
World Press Freedom Day 2024; May 3rd - Poster
World Press Freedom Day 2024; May 3rd - PosterWorld Press Freedom Day 2024; May 3rd - Poster
World Press Freedom Day 2024; May 3rd - Poster
 
31st World Press Freedom Day - A Press for the Planet: Journalism in the face...
31st World Press Freedom Day - A Press for the Planet: Journalism in the face...31st World Press Freedom Day - A Press for the Planet: Journalism in the face...
31st World Press Freedom Day - A Press for the Planet: Journalism in the face...
 
Contributi dei parlamentari del PD - Contributi L. 3/2019
Contributi dei parlamentari del PD - Contributi L. 3/2019Contributi dei parlamentari del PD - Contributi L. 3/2019
Contributi dei parlamentari del PD - Contributi L. 3/2019
 
Top profile Call Girls In Morena [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Morena [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Morena [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Morena [ 7014168258 ] Call Me For Genuine Models We...
 
Cheap Call Girls In Hyderabad Phone No 📞 9352988975 📞 Elite Escort Service Av...
Cheap Call Girls In Hyderabad Phone No 📞 9352988975 📞 Elite Escort Service Av...Cheap Call Girls In Hyderabad Phone No 📞 9352988975 📞 Elite Escort Service Av...
Cheap Call Girls In Hyderabad Phone No 📞 9352988975 📞 Elite Escort Service Av...
 
The NAP process & South-South peer learning
The NAP process & South-South peer learningThe NAP process & South-South peer learning
The NAP process & South-South peer learning
 
31st World Press Freedom Day Conference in Santiago.
31st World Press Freedom Day Conference in Santiago.31st World Press Freedom Day Conference in Santiago.
31st World Press Freedom Day Conference in Santiago.
 
3 May, Journalism in the face of the Environmental Crisis.
3 May, Journalism in the face of the Environmental Crisis.3 May, Journalism in the face of the Environmental Crisis.
3 May, Journalism in the face of the Environmental Crisis.
 
Call Girls Mehsana / 8250092165 Genuine Call girls with real Photos and Number
Call Girls Mehsana / 8250092165 Genuine Call girls with real Photos and NumberCall Girls Mehsana / 8250092165 Genuine Call girls with real Photos and Number
Call Girls Mehsana / 8250092165 Genuine Call girls with real Photos and Number
 
NGO working for orphan children’s education
NGO working for orphan children’s educationNGO working for orphan children’s education
NGO working for orphan children’s education
 
Tuvalu Coastal Adaptation Project (TCAP)
Tuvalu Coastal Adaptation Project (TCAP)Tuvalu Coastal Adaptation Project (TCAP)
Tuvalu Coastal Adaptation Project (TCAP)
 
74th Amendment of India PPT by Piyush(IC).pptx
74th Amendment of India PPT by Piyush(IC).pptx74th Amendment of India PPT by Piyush(IC).pptx
74th Amendment of India PPT by Piyush(IC).pptx
 
1935 CONSTITUTION REPORT IN RIPH FINALLS
1935 CONSTITUTION REPORT IN RIPH FINALLS1935 CONSTITUTION REPORT IN RIPH FINALLS
1935 CONSTITUTION REPORT IN RIPH FINALLS
 
Panchayath circular KLC -Panchayath raj act s 169, 218
Panchayath circular KLC -Panchayath raj act s 169, 218Panchayath circular KLC -Panchayath raj act s 169, 218
Panchayath circular KLC -Panchayath raj act s 169, 218
 
Scaling up coastal adaptation in Maldives through the NAP process
Scaling up coastal adaptation in Maldives through the NAP processScaling up coastal adaptation in Maldives through the NAP process
Scaling up coastal adaptation in Maldives through the NAP process
 
Pakistani Call girls in Sharjah 0505086370 Sharjah Call girls
Pakistani Call girls in Sharjah 0505086370 Sharjah Call girlsPakistani Call girls in Sharjah 0505086370 Sharjah Call girls
Pakistani Call girls in Sharjah 0505086370 Sharjah Call girls
 
AHMR volume 10 number 1 January-April 2024
AHMR volume 10 number 1 January-April 2024AHMR volume 10 number 1 January-April 2024
AHMR volume 10 number 1 January-April 2024
 
2024: The FAR, Federal Acquisition Regulations, Part 31
2024: The FAR, Federal Acquisition Regulations, Part 312024: The FAR, Federal Acquisition Regulations, Part 31
2024: The FAR, Federal Acquisition Regulations, Part 31
 

Deep learning-approach

The model is built using national health data from the National Health and Nutrition Examination Survey (NHANES). An artificial neural network architecture was used to identify individuals at risk of developing hypertension. Our main goal in this study was to train a classifier that identifies hypertensive patients in a highly imbalanced NHANES dataset. Additionally, we aim to achieve a lower error rate with a deep learning architecture than the logistic regression model developed in a previous study (López-Martínez et al., 2018) using the same set of input variables. The motivation to develop a new model was the non-separability of the input variables; neural networks are routinely trained to handle nonlinearity because of their own nonlinear nature (Dreiseitl & Ohno-Machado, 2002), making the model more flexible than logistic regression.
The structure of this paper is as follows: Section 2 presents related work and a literature review of several risk models that used neural networks for cardiovascular disease prediction. Section 3 describes the development of the deep learning model, the population and data source, and the validation of the model. Section 4 presents the statistical and clinical analysis of the outcomes. Section 5 discusses the results of the model and its limitations. Finally, Section 6 covers conclusions and future work.

2. Related work

Several papers that used artificial neural networks to predict hypertension risk were reviewed, and some other studies compared classification performance and accuracy with logistic regression. For every paper we analyzed the model building process, variable selection, ground truth, training and test datasets, overfitting avoidance, error estimate and accuracy information. We also reviewed whether the models were validated, either on an unseen dataset or by a panel of experts in the domain (Debray et al., 2015). Some of the models are shown in Table 1.

LaFreniere et al. (LaFreniere et al., 2016) present an artificial neural network (ANN) to predict hypertensive individuals using the Canadian Primary Care Sentinel Surveillance Network (CPCSSN) dataset. The independent variables used were age, gender, BMI, systolic and diastolic blood pressure, high- and low-density lipoproteins, triglycerides, cholesterol, microalbumin, and urine albumin-creatinine ratio. A confusion matrix and the Receiver Operating Characteristic (ROC) curve were used to measure the accuracy of the model. This study used a very large dataset to train the model compared with other studies.

Polak and Mendyk (Polak & Mendyk, 2008) improved and validated an artificial neural network high blood pressure risk prediction model, using data from the Centers for Disease Control and the National Center for Health Statistics (CDC-NCHS).
Table 1: Comparison of related artificial neural network models

LaFreniere et al.: inputs age, gender, BMI, systolic/diastolic BP, high- and low-density lipoproteins, triglycerides, cholesterol, microalbumin, urine albumin-creatinine ratio; n = 379,027; backpropagation neural network; AUC 82%.
Polak and Mendyk: inputs age, sex, diet, smoking and drinking habits, physical activity level, BMI; n = 159,989; backpropagation (BP) and fuzzy network; AUC 75%.
Zi-Hui Tang et al.: inputs systolic/diastolic BP, fasting plasma glucose, age, BMI, heart rate, gender, waist circumference, diabetes, renal profile; n = 2,092; feed-forward, backpropagation neural network; AUC 76%.
Mevlut Ture et al.: inputs age, sex, hypertension, smoking, lipoprotein, triglyceride, uric acid, total cholesterol, BMI; n = 694; feed-forward neural network; AUC 81%.
Ke-Shiuan Lynn et al.: inputs sixteen genes, age, BMI, fasting blood sugar, hypertension medication, no history of cancer, kidney, liver or lung disease; n = 22,184 genes, 159 cases; one-hidden-layer neural network; AUC 96.72%.
Sakr et al.: inputs age, gender, race, reason for test, stress, medical history; n = 23,095; backpropagation neural network; AUC 64%.
Our study: inputs age, gender, ethnicity, BMI, smoking history, kidney disease, diabetes; n = 24,434; feed-forward, three-hidden-layer neural network; AUC 77%.
The independent variables used in this model were age, sex, diet, smoking and drinking habits, physical activity level and BMI. The ROC curve was used to measure the accuracy of the model, and a comparison with a logistic regression classification model was performed.

Zi-Hui Tang et al. (Tang et al., 2013) developed an artificial neural network for prediction of cardiovascular disease, including hypertension, based on a Chinese population. Univariate analysis indicated that 14 risk factors showed a statistically significant association with cardiovascular disease. The ROC curve was used to measure the performance of the model.

Mevlut Ture et al. (Ture et al., 2005) implemented a multilayer perceptron for classification of hypertensive patients. The predictor variables used were age, sex, family history of hypertension, smoking habits, lipoprotein, triglyceride, uric acid, total cholesterol, and body mass index (BMI). The Receiver Operating Characteristic (ROC) curve was used to measure the accuracy of the model.

Ke-Shiuan Lynn et al. (Lynn et al., 2009) constructed a neural network model to simulate the gene-endophenotype-disease relationship for Taiwanese hypertensive males. The inputs were sixteen genes, age, BMI, fasting blood sugar, hypertension medication, and no history of cancer or of kidney, liver or lung disease. Classification accuracy was used to measure the performance of the model.

Sakr et al. (Sakr et al., 2018) built an artificial neural network to compare its performance with different machine learning techniques for predicting the risk of developing hypertension. Age, gender, race, reason for test, stress tests and medical history were used for prediction. The Receiver Operating Characteristic (ROC) curve was used to measure the accuracy of the model.

3. Model Development and Validation

We developed a deep learning classification model in this study, and we will discuss and analyze how the model has been implemented, trained and evaluated. The sklearn package of the Python programming language (Pedregosa et al., 2012), the Microsoft Cognitive Toolkit (CNTK) from Microsoft (Seide & Agarwal, 2016) and AzureML (Team et al., 2016) were used to build the model.
3.1. Data Source

The National Health and Nutrition Examination Survey (NHANES) datasets from the NHANES 2007-2008 to NHANES 2015-2016 cycles were used. Demographic data, laboratory data, blood pressure, body measures and questionnaires related to diabetes, cigarette smoking and kidney conditions were part of the dataset. NHANES data is released as SAS transport files and was imported for the study using the xport 2.0 Python reader.

3.2. Study Population and Analysis

Data from the National Health and Nutrition Examination Survey (NHANES) collected during 2007-2016 were used to train and test the prediction model. This deep learning model was created to evaluate the importance of several risk factor variables and their relationship with the prevalence of hypertension among a representative sample of adults ≥ 20 years in the United States (n = 24,434). Table 2 shows the distribution of the hypertensive patients by gender and race.

3.3. Input variables

For this study, the input variables selected were gender, age, race, Body Mass Index (BMI), cigarette smoking, kidney disease presence and diabetes. Chronic kidney disease (CKD) was defined as a "yes" response to the interview question "Have you ever been told by a health care provider that you have weak or failing kidneys?", and for NHANES 2015-2016, CKD was defined as a glomerular filtration rate (GFR) ≤ 60 ml/min/1.73 m2 (Miller, 2009) and an albumin-creatinine ratio ≥ 30 mg/g (Center for Disease Control and Prevention, 2015). Participants were considered to have diabetes if they answered "Yes" or "Borderline" to the question "Doctor told you have diabetes?" (U.S. Department of Health and Human Services, 2005).
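As an illustration of the import step described in Section 3.1, the sketch below loads a few NHANES SAS transport files with pandas. The study reports using the xport 2.0 reader; pandas.read_sas is shown here as an equivalent route, the file names are examples from the 2015-2016 cycle, and the merge on SEQN reflects standard NHANES practice rather than the authors' exact code.

```python
# Minimal sketch (not the study's code) of importing NHANES XPT files.
import pandas as pd

demo = pd.read_sas("DEMO_I.XPT", format="xport")   # demographics, 2015-2016 cycle
bpx = pd.read_sas("BPX_I.XPT", format="xport")     # blood pressure examination
bmx = pd.read_sas("BMX_I.XPT", format="xport")     # body measures (BMI)

# NHANES components share the respondent sequence number SEQN as the join key.
df = demo.merge(bpx, on="SEQN", how="inner").merge(bmx, on="SEQN", how="inner")
print(df[["SEQN", "RIDAGEYR", "RIAGENDR", "BMXBMI"]].head())
```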
Table 2: Samples (n) by hypertensive class, gender and race. Hypertension, adults 20 and over, 2007-2016.

Hypertensive, Female: Mexican American 464; Non-Hispanic Black 925; Non-Hispanic White 1,433; Other Hispanic 368; Other Race - Including Multi-Racial 277
Hypertensive, Male: Mexican American 575; Non-Hispanic Black 1,039; Non-Hispanic White 1,582; Other Hispanic 371; Other Race - Including Multi-Racial 365
Non-Hypertensive, Female: Mexican American 1,461; Non-Hispanic Black 1,676; Non-Hispanic White 3,663; Other Hispanic 1,084; Other Race - Including Multi-Racial 1,038
Non-Hypertensive, Male: Mexican American 1,275; Non-Hispanic Black 1,465; Non-Hispanic White 3,585; Other Hispanic 820; Other Race - Including Multi-Racial 968
Total: 24,434
Cigarette smokers were defined as adults who reported having smoked ≥ 100 cigarettes during their lifetime and who currently smoke every day or some days (Centers for Disease Control, 2016). Age and BMI were categorized and transformed from continuous variables into categorical variables to help capture the real relationship between the variables and to reduce the scale difference between them. Blood pressure was used to create the dependent variable, where hypertension was defined as a mean systolic blood pressure of 130 mmHg or higher, previously defined as 140 mmHg or higher by the American Heart Association (Whelton et al., 2017). Table 3 shows the independent variables and the dependent variable.

3.4. Variable selection

Clinical importance was relevant, in addition to statistical significance, for selecting the final inputs of the model. For this study, the chi-squared test was selected because of the categorical form of the data used in the model, and previous work indicates that this statistical test should be applied to evaluate sets of categorical data (Feizi-Derakhshi & Ghaemi, 2014; Razmjoo et al., 2017; Uysal & Gunal, 2014; Seret et al., 2015). Some heuristic techniques as well as filter and wrapper methods were investigated to confirm the usefulness of the variables. A genetic algorithm in combination with other machine learning methods may show more desirable results (Uysal & Gunal, 2014), and modifications of the original genetic algorithm in combination with artificial neural networks have shown acceptable time complexity for finding optimal solutions (Jiang et al., 2017; Wu et al., 2011); this will be considered and discussed in future studies. For this study, the chi-squared measure of association between each non-negative feature and the class was used because of the characteristics of the inputs. Table 4 shows all variables with their p-values and scores. Based on the clinical significance of all input variables, our clinical expert decided not to exclude any variable from the study, given the relationships between variables found in previous studies.
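To connect the definitions above with the scores reported in Table 4, the following is a hedged sketch of the categorization and chi-squared scoring. Column names follow Table 3, the bin edges mirror the ranges described above, and using a single systolic reading for the label is a simplification of the mean systolic pressure the study relies on; the DataFrame `df` is assumed to already hold the coded GENDER, RACE, KIDNEY, SMOKE and DIABETES columns.

```python
# Illustrative preprocessing and chi-squared scoring; not the authors' code.
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

df["AGERANGE"] = pd.cut(df["RIDAGEYR"], bins=[20, 30, 40, 50, 60, 70, 80],
                        labels=[1, 2, 3, 4, 5, 6], include_lowest=True)
df["BMIRANGE"] = pd.cut(df["BMXBMI"], bins=[0, 18.5, 25, 30, 100],
                        labels=[1, 2, 3, 4])
# Simplified label: a systolic pressure of 130 mmHg or higher marks hypertension.
df["HYPCLASS"] = (df["BPXSY1"] >= 130).astype(int)

features = ["GENDER", "AGERANGE", "RACE", "BMIRANGE", "KIDNEY", "SMOKE", "DIABETES"]
data = df.dropna(subset=features + ["HYPCLASS"]).astype({c: "int64" for c in features})

selector = SelectKBest(score_func=chi2, k="all").fit(data[features], data["HYPCLASS"])
for name, score, p in zip(features, selector.scores_, selector.pvalues_):
    print(f"{name}: score={score:.3f}, p={p:.4f}")
```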
Table 3: Variables included in the model

GENDER (Gender): 1 = Male; 2 = Female
AGERANGE (Age at screening, adjudicated; date of birth was used to calculate age): 1 = 20-30; 2 = 31-40; 3 = 41-50; 4 = 51-60; 5 = 61-70; 6 = 71-80
RACE (Race/Hispanic origin): 1 = Mexican American; 2 = Other Hispanic; 3 = Non-Hispanic White; 4 = Non-Hispanic Black; 5 = Other Race - Including Multi-Racial
BMXBMI (Body Mass Index, kg/m2): 1 = Underweight (< 18.5); 2 = Normal weight (18.5-24.9); 3 = Overweight (25-29.9); 4 = Obesity (BMI of 30 or greater)
KIDNEY (Ever told you had weak/failing kidneys): 1 = Yes; 2 = No
SMOKE (Smoked at least 100 cigarettes in life): 1 = Yes; 2 = No
DIABETES (Doctor told you have diabetes): 1 = Yes; 2 = No; 3 = Borderline
HYPCLASS (Systolic blood pressure (mean), mm Hg): 0 = Non-Hypertensive; 1 = Hypertensive
Table 4: Chi-squared test between each feature and the class

GENDER: p-value 0.3988107; score 0.711909
AGERANGE: p-value 0.000000; score 1965.607023
RACE: p-value 0.008822; score 6.858521
BMIRANGE: p-value 0.0172385; score 5.67193
KIDNEY: p-value 0.3546428; score 0.856775
SMOKE: p-value 0.0975246; score 2.745566
DIABETES: p-value 0.0012164; score 10.465222

3.5. Deep Learning Model

Deep learning describes a class of machine learning algorithms that receive input from one or more sources into layers of intermediate features or nodes, which are chained together (hidden layers) to build a network that generates output values (Ching et al., 2018). Figure 1 shows the structure of the artificial neural network architecture used in this study. The input nodes receive categorical values that are encoded and normalized with Gaussian normalization (Jain et al., 2018) to improve the computation of the network and to prevent the data from being too distant from each other. Three fully connected hidden layers, in addition to the input layer and output layer, were added to the network to form our feedforward multilayer perceptron architecture. The number of hidden layers and hidden nodes depends on the complexity of the problem and is an active area of research. For our study, the forward approach described by Panchal & Panchal (2014) was used, and three hidden layers were enough for our feedforward network to act as a universal approximator of the function that separates our input variables, solving classification problems of arbitrary complexity while preventing overfitting.
[Figure 1: Multilayer Perceptron Architecture: seven input nodes (X1-X7), three hidden layers, and a two-node output layer (classes 0 and 1).]

We can define a simple feedforward network whose first layer takes an input vector x of dimension m and produces the output of the first hidden layer, S1, of dimension n. Each feature in the input layer is connected to a node in the next layer by a weight, and the weights are represented by a matrix W with dimensions m × n:

S1 = W · x + b

where b is a bias vector of dimension n. In the architecture of our network, each layer has its own weight matrix W, its own bias vector b, an input vector and an output vector o. The layers are identified by superscripts appended to these variables. For R inputs, S1 represents the number of neurons in the first layer and S2 the number of neurons in the second layer. The output layer of our model is layer 4, preceded by the three hidden layers 1, 2 and 3. The output of layer 1 is the input of layer 2, the output of layer 2 is the input of layer 3, and the output of layer 3 is the input of the output layer.
If

o^1 = f^1(W^1 x + b^1)
o^2 = f^2(W^2 o^1 + b^2)
o^3 = f^3(W^3 o^2 + b^3)
o^4 = f^4(W^4 o^3 + b^4)

then

o^3 = f^3(W^3 f^2(W^2 f^1(W^1 x + b^1) + b^2) + b^3)

The structure of our network can be expressed simply as

R → S1 → S2 → S3 → S4

Each layer is a dense layer with specific input dimensions, output dimensions and an activation function. Specifically, in our model the first layer has an input dimension of 7 variables, an output dimension of 64 nodes and a ReLU activation function; the second layer has an input dimension of 64 nodes, an output dimension of 32 nodes and also a ReLU activation function; the third layer has an input of 32 nodes, an output of 16 nodes and also a ReLU activation function; and the output layer has no activation. The final output layer emits a vector of 2 values, and we used softmax to normalize the output of the model and to map the accumulated evidence (activations) to a probability distribution over the classes. Softmax computes the gradient of f(z) = log Σ_i exp(z_i) at z = x and is defined componentwise by

softmax(x)_j = exp(x_j) / Σ_i exp(x_i)

The output is a vector of non-negative numbers that sum to 1 and can be interpreted as probabilities for mutually exclusive outcomes (Triwijoyo et al., 2017).
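To make the 7 → 64 → 32 → 16 → 2 architecture above concrete, the following is a minimal sketch of how such a network could be declared with the CNTK layers API that the study reports using; the variable names and the initializer are our own assumptions, not the authors' code.

```python
# Sketch only: a 7 -> 64 -> 32 -> 16 -> 2 feedforward network with ReLU
# hidden layers and a linear output (softmax is applied inside the loss).
import cntk as C

input_dim, num_classes = 7, 2
features = C.input_variable(input_dim)          # the 7 encoded risk factors
label = C.input_variable(num_classes)           # one-hot hypertension class

with C.layers.default_options(init=C.glorot_uniform()):
    model = C.layers.Sequential([
        C.layers.Dense(64, activation=C.relu),
        C.layers.Dense(32, activation=C.relu),
        C.layers.Dense(16, activation=C.relu),
        C.layers.Dense(num_classes, activation=None),
    ])
z = model(features)                              # raw class scores (logits)
```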
3.6. Model Training

For training our model we would like the generated probabilities to be as close as possible to the observed labels. We calculated the loss function, which measures the difference between the distribution produced by the learned model and the one given by the training set (Singh Gill et al., 2018). Cross entropy with softmax was used in our study. We used the derivative of softmax to derive the derivative of the cross entropy loss function (Potash & Rumshisky, 2016):

∂L/∂o_i = p_i (y_i + Σ_{k≠i} y_k) - y_i     (1)

y is the one-hot encoded vector of labels, so Σ_k y_k = 1 and y_i + Σ_{k≠i} y_k = 1. Then

∂L/∂o_i = p_i - y_i     (2)

Once the loss function is defined, the model minimizes it using an optimization technique. In this study we used stochastic gradient descent with momentum (Allen-Zhu & Orecchia, 2014). The model starts with a random initialization of the parameters and generates a new set of parameters after each evaluation. We used minibatches to train our model: a small number of observations is loaded into the model to calculate the average loss used to update the model parameters. Another key parameter used in our training was the learning rate (Takase et al., 2018), a factor that moderates how much the parameters change in each iteration. Each iteration works on 10 samples, we trained the model with 70% of the dataset (17,104 samples), and the number of minibatches to train is defined by the number of training samples divided by the minibatch size. Table 5 shows the parameters of the trainer architecture, and Figure 2 and Figure 3 show the training loss and prediction error of the minibatch run for the model.
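As an illustration of the training set-up summarized in Table 5 below, here is a hedged sketch of the corresponding CNTK trainer loop, continuing from the model sketch above; the learning-rate and momentum values, and the X_train/y_train arrays (float32 features and one-hot labels), are assumptions rather than the authors' exact settings.

```python
# Sketch of the trainer described above (cross entropy with softmax,
# classification error, momentum SGD, minibatches of 10 samples).
import numpy as np

loss = C.cross_entropy_with_softmax(z, label)
metric = C.classification_error(z, label)

lr = C.learning_rate_schedule(0.01, C.UnitType.minibatch)   # assumed value
momentum = C.momentum_schedule(0.9)                         # assumed value
learner = C.momentum_sgd(z.parameters, lr, momentum)
trainer = C.Trainer(z, (loss, metric), [learner])

minibatch_size = 10
num_minibatches = len(X_train) // minibatch_size            # 17,104 // 10 = 1,710
for i in range(num_minibatches):
    xs = X_train[i * minibatch_size:(i + 1) * minibatch_size].astype(np.float32)
    ys = y_train[i * minibatch_size:(i + 1) * minibatch_size].astype(np.float32)
    trainer.train_minibatch({features: xs, label: ys})
    # trainer.previous_minibatch_loss_average tracks curves like Figs. 2-3
```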
Table 5: Model architecture parameters

Input dimension: 7
Number of output classes: 2
Number of hidden layers: 3
Hidden layer 1 dimension: 64
Activation function, layer 1: ReLU
Hidden layer 2 dimension: 32
Activation function, layer 2: ReLU
Hidden layer 3 dimension: 16
Activation function, layer 3: ReLU
Minibatch size: 10
Number of samples to train: 17,104
Number of minibatches to train: 1,710
Loss function: cross entropy with softmax
Evaluation error: classification error
Learner for parameters: momentum SGD
Evaluation metrics: confusion matrix, AUC

3.7. Model Evaluation

To evaluate the classification performance of our model, we compute the average test error: the index of the highest value in the output vector is compared with the actual ground-truth label. We evaluated the trained network on data that had not been used for training, corresponding to the remaining 30% of the dataset (7,330 samples).
[Figure 2: Training Error. Figure 3: Loss Error.]

The resulting error is comparable to the training error, which indicates that our model has a good generalization error. Our model can deal effectively with unseen observations, and this is one of the keys to avoiding overfitting (Subramanian & Simon, 2013).

For each observation, our model uses softmax as the evaluation function, which returns the probability distribution across all the classes; in our study this is a vector of 2 elements per observation. The output nodes of our model convert the activations into probabilities, mapping the aggregated activations across the network to probabilities over the 2 classes. Figure 4 shows the test prediction error for our model.

We also generated other evaluation metrics for the classifier. Table 6 shows the confusion matrix summarizing the actual class labels versus the predicted ones: 887 true positives, 4,477 true negatives, 1,318 false negatives and 648 false positives.

Table 6: Confusion matrix (rows: actual class; columns: predicted class)

Actual Non-Hypertensive: predicted Non-Hypertensive 4,477; predicted Hypertensive 648
Actual Hypertensive: predicted Non-Hypertensive 1,318; predicted Hypertensive 887
[Figure 4: Test prediction error]

The classification report is shown in Table 7; it contains the precision, the sensitivity and the harmonic mean of precision and sensitivity. Precision on the hypertensive class is markedly low because of the large number of false positives, and the sensitivity of the model is only moderately acceptable because of the imbalanced testing dataset, which is reflected in the high number of false negatives.

Table 7: Classification report (positive label: 1, negative label: 0)

True positives: 887; False negatives: 1,318
False positives: 648; True negatives: 4,477
Precision: 0.578; Recall: 0.402; f1-score: 0.474; Accuracy: 0.732

4. Results and Analysis

Statistical and clinical analyses are important to explain the results and the usefulness of the model. In the next sections we review some of the results.
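For readers who want to trace where the figures in the next subsection come from, the following plain-Python sketch reproduces them directly from the confusion matrix in Table 6.

```python
# Headline metrics derived from Table 6 (TP=887, FN=1318, FP=648, TN=4477).
tp, fn, fp, tn = 887, 1318, 648, 4477

sensitivity = tp / (tp + fn)                    # 887 / 2205  ~ 0.40 (recall)
specificity = tn / (tn + fp)                    # 4477 / 5125 ~ 0.87
ppv = tp / (tp + fp)                            # 887 / 1535  ~ 0.578 (precision)
npv = tn / (tn + fn)                            # 4477 / 5795 ~ 0.77
accuracy = (tp + tn) / (tp + tn + fp + fn)      # 5364 / 7330 ~ 0.73
f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # ~ 0.474

print(sensitivity, specificity, ppv, npv, accuracy, f1)
```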
4.1. Statistical Analysis

A multilayer perceptron is a class of non-linear model and was used in this study to capture the unknown nonlinearity of the input variables. Our test sample of 7,330 includes 5,125 (69.9%) non-hypertensive samples and 2,205 (30.1%) hypertensive samples. The model produces a sensitivity of 887/2,205 = 40% (positives that were correctly identified) and a specificity of 4,477/5,125 = 87% (negatives that were correctly identified). The positive predicted value was 887/1,535 = 57% and the negative predicted value was 4,477/5,795 = 77%.

The false negative rate of the model was 1,318/2,205 = 59% and the false positive rate was 648/5,125 = 12%. The probability of a false alarm was therefore 12%, and the negative likelihood ratio was 0.68. The miss rate and the fall-out rate were not significant for our model because of the imbalance of the sampling dataset. In this study, the multilayer perceptron model was better at identifying individuals who will not develop hypertension than those who will develop it. Figure 5 shows the ROC curve, with the true positive rate on the y axis and the false positive rate on the x axis. Figure 6 shows the precision/recall plot: the proportion of true positives over all positive predictions versus the fraction of all correct results returned by the model.

[Figure 5: ROC Curve. Figure 6: Precision/Recall.]
4.2. Clinical Analysis

A sensitivity of 40% and a specificity of 87% show that our model might be ineffective for healthcare diagnosis in finding positive cases, but the true negative rate shows that our model can be efficient at finding non-hypertensive patients. The high negative predicted value of 77% confirms the adequacy of the model as a screening tool. The positive predicted value of 57% shows that our model works only slightly better than random guessing, and the conditional probability of a negative test result is considerably low. The accuracy of 73% shows that our model correctly identifies non-hypertensive and hypertensive patients given the characteristics of the dataset and the number of cases that were examined.

5. Discussion and Limitations

Even though a multilayer perceptron with one hidden layer can model a huge variety of problems in the clinical domain, in our study a model with three hidden layers was useful to approximate the highly nonlinear behavior of the input variables. The results of the model were affected by the imbalanced dataset, but we did not balance it in order to maintain the real distribution of the samples.

The study has shown that the prediction capability of the model was improved (AUC of 77%) over the logistic regression model used in a previous study (AUC of 73%) when applied to the input variables gender, race, BMI, age, smoking, kidney disease and diabetes. However, challenges in applying deep learning to the clinical domain still remain. The use of deep learning to analyze hypertension risk variables can be considered a complement to the traditional approach and might be used to validate other statistical models.

One of the major limitations of our model is that it was developed using a highly imbalanced dataset from the CDC in which a high prevalence of non-hypertensive patients is observed. Therefore, our model must be validated in other clinical settings, and further studies should include other neural network architectures to improve the precision and recall of the model before considering its integration into the clinical diagnostic scheme.
Also, a larger and more adequate volume of training data will be needed to train our model and improve the classification results.

6. Conclusions and Future Work

We proposed a deep learning approach to overcome the nonlinearity of the risk factor variables used as inputs to the multilayer perceptron model. This study shows that the proposed model improves on the accuracy and performance reported in a previous study that used the same input variables, and the results were better than logistic regression by a small margin. The influence of the imbalanced dataset in favor of the majority class was confirmed by our multilayer perceptron model. This study showed that the proposed model can serve as a guide for the design of other models. The model cannot be used to provide a final diagnosis of hypertension; however, it can be used to make patients aware that there is a probability that they may be developing hypertension.

For future work, we will apply this model to a balanced real-world dataset and explore additional network architectures. In addition, new risk factors can be used to better handle the distribution and behavior of the input variables of the model. This study will be the groundwork for the construction of a decision support tool that may be useful to healthcare practitioners, contributing to decision making about the risk of developing hypertension in typical or atypical patient screening circumstances.
  • 20. corporation Most Valuable Professional at Artificial Intelligence and Babafemi Thompson, Clinical Systems Analyst at Englewood Hospital who provided in- sight and expertise that greatly assisted the study. 20
References

Allen-Zhu, Z., & Orecchia, L. (2014). Linear coupling: An ultimate unification of gradient and mirror descent. arXiv:1407.1537.
Center for Disease Control and Prevention (2015). Percentage with CKD stage 3 or 4 who were aware of their disease by stage and age, 1999-2012. Technical report. URL: http://nccd.cdc.gov/ckd/detail.aspx?QNum=Q98.
Centers for Disease Control (2016). Current cigarette smoking prevalence among working adults, United States, 2004-2010. Morbidity and Mortality Weekly Report (MMWR). URL: https://www.cdc.gov/mmwr/preview/mmwrhtml/mm6038a2.htm.
Ching, T., Himmelstein, D. S., Beaulieu-Jones, B. K., Kalinin, A. A., Do, B. T., Way, G. P., et al. (2018). Opportunities and obstacles for deep learning in biology and medicine. Journal of the Royal Society Interface, 15. doi:10.1098/rsif.2017.0387.
Committee on Public Health Priorities to Reduce and Control Hypertension in the U.S. Population (2010). A Population-Based Policy and Systems Change Approach to Prevent and Control Hypertension. doi:10.17226/12819.
Debray, T. P. A., Vergouwe, Y., Koffijberg, H., Nieboer, D., Steyerberg, E. W., & Moons, K. G. M. (2015). A new framework to enhance the interpretation of external validation studies of clinical prediction models. Journal of Clinical Epidemiology, 68, 279-289. doi:10.1016/j.jclinepi.2014.06.018.
Dr. S. Vijayarani, M. (2015). Liver disease prediction using SVM and Naïve Bayes algorithms. International Journal of Science, Engineering and Technology Research, 4, 816-820.
Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: A methodology review. doi:10.1016/S1532-0464(03)00034-0.
Feizi-Derakhshi, M.-R., & Ghaemi, M. (2014). Classifying different feature selection algorithms based on the search strategies. In International Conference on Machine Learning, Electrical and Mechanical Engineering (ICMLEME 2014), pp. 17-21. doi:10.15242/IIE.E0114032.
Jain, S., Shukla, S., & Wadhvani, R. (2018). Dynamic selection of normalization techniques using data complexity measures. Expert Systems with Applications, 106, 252-262. doi:10.1016/j.eswa.2018.04.008.
Jiang, S., Chin, K. S., Wang, L., Qu, G., & Tsui, K. L. (2017). Modified genetic algorithm-based feature selection combined with pre-trained deep neural network for demand forecasting in outpatient department. Expert Systems with Applications, 82, 216-230. doi:10.1016/j.eswa.2017.04.017.
LaFreniere, D., Zulkernine, F., Barber, D., & Martin, K. (2016). Using machine learning to predict hypertension from a clinical dataset. In 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1-7. doi:10.1109/SSCI.2016.7849886.
López-Martínez, F., Schwarcz, A., Núñez-Valdez, E. R., & García-Díaz, V. (2018). Machine learning classification analysis for a hypertensive population as a function of several risk factors. Expert Systems with Applications, 110, 206-215. doi:10.1016/j.eswa.2018.06.006.
Lynn, K. S., Li, L. L., Lin, Y. J., Wang, C. H., Sheng, S. H., Lin, J. H., Liao, W., Hsu, W. L., & Pan, W. H. (2009). A neural network model for constructing endophenotypes of common complex diseases: An application to male young-onset hypertension microarray data. Bioinformatics, 25, 981-988. doi:10.1093/bioinformatics/btp106.
Miller, W. G. (2009). Estimating glomerular filtration rate. Clinical Chemistry and Laboratory Medicine, 47, 1017-1019. doi:10.1515/CCLM.2009.264.
National Center for Health Statistics (2017). Health, United States, 2016: With Chartbook on Long-term Trends in Health. Technical report. URL: https://www.cdc.gov/nchs/data/hus/hus16.pdf.
Panchal, F. S., & Panchal, M. (2014). Review on methods of selecting number of hidden nodes in artificial neural network. International Journal of Computer Science and Mobile Computing, 3(11), 455-464.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2012). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.
Polak, S., & Mendyk, A. (2008). Artificial neural networks based Internet hypertension prediction tool development and validation. Applied Soft Computing, 8, 734-739. doi:10.1016/j.asoc.2007.06.001.
Potash, P., & Rumshisky, A. (2016). Recommender system incorporating user personality profile through analysis of written reviews. In CEUR Workshop Proceedings, vol. 1680, pp. 60-66.
Razmjoo, A., Xanthopoulos, P., & Zheng, Q. P. (2017). Online feature importance ranking based on sensitivity analysis. Expert Systems with Applications, 85, 397-406. doi:10.1016/j.eswa.2017.05.016.
Sakr, S., Elshawi, R., Ahmed, A., Qureshi, W. T., Brawner, C., Keteyian, S., Blaha, M. J., & Al-Mallah, M. H. (2018). Using machine learning on cardiorespiratory fitness data for predicting hypertension: The Henry Ford exercise testing (FIT) Project. PLoS ONE, 13, e0195344. doi:10.1371/journal.pone.0195344.
Seide, F., & Agarwal, A. (2016). CNTK. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), p. 2135. ACM. doi:10.1145/2939672.2945397.
Seret, A., Maldonado, S., & Baesens, B. (2015). Identifying next relevant variables for segmentation by using feature selection approaches. Expert Systems with Applications, 42, 6255-6266. doi:10.1016/j.eswa.2015.01.070.
Singh Gill, H., Singh Khehra, B., Singh, A., & Kaur, L. (2018). Teaching-learning-based optimization algorithm to minimize cross entropy for selecting multilevel threshold values. Egyptian Informatics Journal. doi:10.1016/j.eij.2018.03.006.
Subramanian, J., & Simon, R. (2013). Overfitting in prediction models: Is it a problem only in high dimensions? Contemporary Clinical Trials, 36, 636-641. doi:10.1016/j.cct.2013.06.011.
Takase, T., Oyama, S., & Kurihara, M. (2018). Effective neural network training with adaptive learning rate based on training loss. Neural Networks, 101, 68-78. doi:10.1016/j.neunet.2018.01.016.
Tang, Z.-H., Liu, J., Zeng, F., Li, Z., Yu, X., & Zhou, L. (2013). Comparison of prediction model for cardiovascular autonomic dysfunction using artificial neural network and logistic regression analysis. PLoS ONE, 8, e70571. doi:10.1371/journal.pone.0070571.
Team, A., Dorard, L., Reid, M. D., & Martin, F. J. (2016). AzureML: Anatomy of a machine learning service. JMLR: Workshop and Conference Proceedings, 50, 1-13.
Triwijoyo, B. K., Budiharto, W., & Abdurachman, E. (2017). The classification of hypertensive retinopathy using convolutional neural network. Procedia Computer Science, 116, 166-173. doi:10.1016/j.procs.2017.10.066.
Ture, M., Kurt, I., Turhan Kurum, A., & Ozdamar, K. (2005). Comparing classification techniques for predicting essential hypertension. Expert Systems with Applications, 29, 583-588. doi:10.1016/j.eswa.2005.04.014.
U.S. Department of Health and Human Services (2005). Awareness of prediabetes, United States, 2005-2010. Morbidity and Mortality Weekly Report, 62, 209-212. URL: https://www.cdc.gov/mmwr/preview/mmwrhtml/mm6211a4.htm.
Uysal, A. K., & Gunal, S. (2014). Text classification using genetic algorithm oriented latent semantic features. Expert Systems with Applications, 41, 5938-5947. doi:10.1016/j.eswa.2014.03.041.
Whelton, P. K., Carey, R. M., Aronow, W. S., Casey, D. E., Collins, K. J., Dennison Himmelfarb, C., et al. (2017). 2017 ACC/AHA/AAPA/ABC/ACPM/AGS/APhA/ASH/ASPC/NMA/PCNA Guideline for the Prevention, Detection, Evaluation, and Management of High Blood Pressure in Adults. Journal of the American College of Cardiology, pp. 735-1097. doi:10.1016/j.jacc.2017.11.006.
World Health Organization (2017). World Health Statistics 2017: Monitoring Health for the SDGs. URL: http://apps.who.int/iris/bitstream/10665/255336/1/9789241565486-eng.pdf.
Wu, Y.-L., Tang, C.-Y., Hor, M.-K., & Wu, P.-F. (2011). Feature selection using genetic algorithm and cluster validation. Expert Systems with Applications, 38, 2727-2732. doi:10.1016/j.eswa.2010.08.062.