Logistic Regression
Dr Mike Blyth
February 2006
Logistic Regression
A way to look at the effect of
– a "numeric" (interval or ratio) independent variable
on
– a binary (yes-no) dependent variable
Review: Linear Regression
The dependent variable is continuous: interval or ratio (numeric).
The independent variables are also interval or ratio.
Examples
– Effect of weight on blood pressure
– Effect of drug dose on reticulocyte count
[Figure: Linear Regression. Axes: independent variable, dependent variable]
[Figure: Logistic Regression. Axes: independent variable, dependent variable]
Logistic Regression
The dependent variable is a binary (yes/no) outcome.
The independent variables are continuous (interval or ratio).
Examples:
– Relation of weight and BP to 10-year risk of death
– Relation of CD4 count to 1-year risk of AIDS diagnosis
Why do we need it?
We could use a categorical analysis such as a frequency table:
                    AIDS    No AIDS
CD4 > 350            80       20
150 < CD4 < 350      50       50
CD4 < 150            20       80
Problems:
a) Some information is lost when we collapse the numeric data into categories. This leads to loss of power.
b) There is no estimate of the magnitude of the relation.
Odds
Probability:
– p = probability of the event
– 1 - p = probability of not having the event (also called q)
– p varies from 0 to 1
Odds:
– Ratio of the probability of the event to the probability of not having the event: odds = p/(1 - p)
– When p = 0.5, odds = 1 (or "1:1 odds")
– When p = 0.1, odds = 0.1/0.9 = 0.11
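The conversion from probability to odds can be checked with a few lines of Python. This is a minimal sketch; the function name `odds` and the extra value p = 0.8 are illustrative choices, not part of the original slides.

```python
def odds(p: float) -> float:
    """Convert a probability p (0 <= p < 1) to odds p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


# Probabilities from the slide, plus one extra illustrative value (0.8).
for p in (0.5, 0.1, 0.8):
    print(f"p = {p:.1f} -> odds = {odds(p):.2f}")
```

Note that odds are not bounded by 1: a probability of 0.8 corresponds to odds of 4 (4:1).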
Log Odds (Logit)
The log odds (also called the "logit") is simply the natural logarithm of the odds:
logit = ln(odds)
      = ln(p/(1 - p))
      = ln(p) - ln(1 - p)
ln(1) = 0, so the logit is 0 when the odds are 1:1, that is, when probability = 50%.
The logit for an event of probability p is the negative of the logit for the probability of not having the event: logit(p) = -logit(1 - p).
[Figure: Relation between probability p (vertical axis, 0 to 1) and logit = ln[p/(1-p)] (horizontal axis, -8 to 8)]
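The relation between p and the logit can also be tabulated directly. The sketch below assumes only the definition logit = ln(p/(1 - p)); the helper names `logit` and `inverse_logit` are illustrative.

```python
import math


def logit(p: float) -> float:
    """Log odds ln(p / (1 - p)) for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map a logit back to a probability (the S-shaped curve)."""
    return 1.0 / (1.0 + math.exp(-z))


# Tabulate the curve at logit values from -8 to 8.
for z in range(-8, 9, 2):
    print(f"logit = {z:+d} -> p = {inverse_logit(z):.4f}")

# Symmetry: the logit of an event is the negative of the logit of its complement.
print(logit(0.2), logit(0.8))
```

The probability stays between 0 and 1 however large or small the logit becomes, which is what makes the logit a convenient scale for a regression equation.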
Logistic regression model
The linear regression model with one variable is
y = a + bx + e
The logistic regression model with one variable is
logit = a + bx, where logit = ln(p/(1 - p))
In other words, the model says the log odds of the event happening are
– a constant (a), plus
– some other constant (b)
– times a numeric risk factor (x) (for example, SBP)
Logistic regression model
Given values of the independent variables, the regression equation predicts the log odds.
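As a rough illustration of what "predicts the log odds" means in practice, the sketch below evaluates a one-variable model and converts the result to a probability. The coefficients a = -4.0 and b = 0.02 are hypothetical values chosen only for illustration; they do not come from the slides.

```python
import math


def predicted_log_odds(a: float, b: float, x: float) -> float:
    """Regression equation: logit = a + b * x."""
    return a + b * x


def predicted_probability(a: float, b: float, x: float) -> float:
    """Convert the predicted logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-predicted_log_odds(a, b, x)))


# Hypothetical coefficients, chosen only for illustration.
a, b = -4.0, 0.02
for x in (100.0, 200.0, 300.0):
    z = predicted_log_odds(a, b, x)
    print(f"x = {x:.0f}: logit = {z:+.2f}, p = {predicted_probability(a, b, x):.3f}")
```

With these made-up values the logit is 0 at x = 200, so the predicted probability there is 50%.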
Logistic regression model
The statistics program calculates the coefficient b.
The coefficient b shows how much the log odds change with a one-unit change in the independent variable.
Positive b → higher risk with higher values
Negative b → lower risk with higher values
Logistic regression model
Hypothetical example given above, examining the relation of BP to risk of stroke/death. The model predicts:
ln(odds) = c + b ∙ SBP
e^(ln(odds)) = e^(c + b ∙ SBP)
odds = e^(c + b ∙ SBP)
     = e^c ∙ e^(b ∙ SBP)
Logistic regression model
The coefficient b shows how much the odds change with a change in the independent variable.
odds = e^c ∙ e^(bx)
In other words,
odds = something ∙ (e^b)^x
Logistic regression model
odds = constant ∙ (e^b)^x
So e^b is the factor indicating the effect of x on the event.
Each one-unit change in x multiplies the odds by a factor of e^b. This factor is the odds ratio for a one-unit change in x.
Logistic regression model
odds = constant ∙ (e^b)^x
– Suppose b = 0.693, so e^b = 2
– A one-unit increase in x will double the odds
– Suppose b = -0.693, so e^b = 0.5
– A one-unit increase in x will halve the odds
– If b = 0, e^b = 1, and x has no effect on the odds (odds ratio = 1)
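These three cases can be verified numerically. The sketch below is illustrative only; the function name `odds_multiplier` is an assumption, and the three-unit example is an added illustration of how the factor compounds.

```python
import math


def odds_multiplier(b: float, delta_x: float = 1.0) -> float:
    """Factor by which the odds are multiplied when x increases by delta_x."""
    return math.exp(b * delta_x)


for b in (0.693, -0.693, 0.0):
    print(f"b = {b:+.3f} -> e^b = {odds_multiplier(b):.3f}")

# Three one-unit steps with b = 0.693 multiply the odds by about 2 * 2 * 2 = 8.
print(f"{odds_multiplier(0.693, 3):.2f}")
```

Because the effect is multiplicative, a change of k units multiplies the odds by (e^b)^k rather than adding k times a fixed amount.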
Logistic regression model
For the hypothetical example above, the report is given by Epi Info as
Term       Odds Ratio   95% CI              Coefficient   S.E.     Z        P-value
BP         1.0597       1.0220   1.0987     0.0579        0.0185   3.1319   0.0017
Constant   *            *        *          -7.2014       2.2994   3.1319   0.0017
Logistic regression model
The coefficient (also called beta, or b) is the slope, or magnitude, of the effect. For BP the coefficient is 0.0579.
Logistic regression model
The odds ratio column gives the odds ratio for a one-unit change in the independent variable (e.g. BP). This is the calculated e^b.
A one-unit increase in BP multiplies the odds by 1.0597.
Logistic regression model
The 95% CI columns give the 95% confidence interval for that odds ratio (1.0220 to 1.0987 for BP).
The confidence interval does not include 1, so the effect is statistically significant.
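The odds ratio, its confidence interval, and the p-value in the report can be approximately recovered from the BP coefficient (0.0579) and its standard error (0.0185). The sketch below assumes a Wald-type interval, exp(b ± 1.96 ∙ S.E.), and a two-sided normal p-value; this is a common convention, but the slides do not state which method Epi Info uses, so treat it as an assumption. Results agree with the report only to within rounding of the displayed coefficient.

```python
import math


def odds_ratio_with_ci(coef: float, se: float, z_crit: float = 1.96):
    """Return (odds ratio, lower limit, upper limit) using a Wald interval."""
    return (
        math.exp(coef),
        math.exp(coef - z_crit * se),
        math.exp(coef + z_crit * se),
    )


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


coef, se = 0.0579, 0.0185  # BP row of the report
or_bp, lower, upper = odds_ratio_with_ci(coef, se)
z = coef / se
print(f"OR = {or_bp:.4f}, 95% CI = {lower:.4f} to {upper:.4f}")
print(f"Z = {z:.3f}, p = {two_sided_p(z):.4f}")
```

Since the whole interval lies above 1, the same conclusion follows as on the slide: higher BP is associated with significantly higher odds of the event.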
Using more than one independent variable
Single variable:
logit = c + bx
odds = c' ∙ (e^b)^x, where c' = e^c
Multiple variables:
logit = c + b1∙x1 + b2∙x2 + … + bn∙xn
odds = c' ∙ (e^b1)^x1 ∙ (e^b2)^x2 ∙ … ∙ (e^bn)^xn
Note that the terms multiply their effects on the odds.
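The equivalence between adding terms on the logit scale and multiplying factors on the odds scale can be checked numerically. In this sketch the constant, coefficients, and x values are hypothetical, chosen only for illustration.

```python
import math


def odds_from_logit(c: float, coefs, xs) -> float:
    """Odds computed by exponentiating the summed logit."""
    return math.exp(c + sum(b * x for b, x in zip(coefs, xs)))


def odds_from_factors(c: float, coefs, xs) -> float:
    """Odds computed as e^c times the product of (e^b_i)^x_i."""
    return math.exp(c) * math.prod(math.exp(b) ** x for b, x in zip(coefs, xs))


# Hypothetical model: one numeric variable and one 0/1 variable.
c = -5.0
coefs = [0.02, 1.1]
xs = [150.0, 1.0]

print(odds_from_logit(c, coefs, xs))
print(odds_from_factors(c, coefs, xs))
```

Both functions give the same value up to floating-point rounding, which is the sense in which the terms "multiply their effects".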
Using more than one independent variable
The analysis reports a b coefficient for each independent variable.
That coefficient is the effect of the given independent variable, separated from the effects of all the other independent variables.
Real Life Example
Prospective cohort study of causes of cardiac disease: Evans County Study, 1965
Independent variables = age, gender, race, social index, SBP, diabetes, smoking, cholesterol, and an obesity index
Dependent variable = risk of dying during a 10-year period
Variable         Range        b coeff   SE       p
Constant                      -6.376    1.634    <0.001
Age              40-69 y      0.086     0.115    <0.001
Gender           0=m, 1=f     1.500     0.967    0.121
Age x gender                  -0.043    0.017    0.011
Social index     20-84        -0.056    0.040    0.160
(Soc ind)^2      400-7056     0.0006    0.0003   0.082
SBP              88-310       0.019     0.002    <0.001
Diabetes         0=n, 1=y     1.123     0.261    <0.001
Smoking          0=n, 1=y     0.317     0.157    0.043
Cholesterol      94-546       0.0031    0.0015   0.041
Quetelet         2.11-8.76    -1.064    0.432    0.014
(Quetelet)^2     4.44-76.8    0.112     0.049    0.022
Cited in Kelsey et al., Methods in Observational Epidemiology, 1986
Statistical Significance
The p value indicates statistical significance.
Age is positively correlated with risk of death.
Gender has a positive b coefficient, but the p value is 0.12, indicating that we cannot say that there is a significant relationship.
Variable   Range       b coeff   SE      p
Age        40-69 y     0.086     0.115   <0.001
Gender     0=m, 1=f    1.500     0.967   0.121
Dichotomous (yes-no) variables
Gender is coded as 0 for male, 1 for female.
e^b [e^1.5 = 4.48] is the change in OR for a 1-unit change in gender, i.e. the OR for females relative to males.
e^b for any dummy variable (coded 0-1) is the adjusted OR for that risk factor, since "1 unit of change" = presence vs. absence of the risk factor.
Variable   Range       b coeff   SE      p
Constant               -6.376    1.634   <0.001
Age        40-69 y     0.086     0.115   <0.001
Gender     0=m, 1=f    1.500     0.967   0.121
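The same calculation applies to every 0/1 variable in the Evans County table. The sketch below exponentiates the b coefficients for gender, diabetes, and smoking as listed in that table; the dictionary and variable names are illustrative.

```python
import math

# b coefficients for 0/1 variables, copied from the Evans County table.
dummy_coefficients = {
    "gender (f vs m)": 1.500,
    "diabetes": 1.123,
    "smoking": 0.317,
}

# e^b is the adjusted odds ratio for presence vs. absence of each factor.
adjusted_or = {name: math.exp(b) for name, b in dummy_coefficients.items()}

for name, value in adjusted_or.items():
    print(f"{name}: OR = {value:.2f}")
```

One caution on reading the gender figure: the model also contains an age x gender term, so e^1.5 is the female-to-male odds ratio only where that interaction term contributes nothing.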
Squared terms
Social index squared is included as well as social index itself.
Squared terms allow for curvilinear relationships, just as in ordinary regression.
Variable        Range       b coeff   SE       p
Age x gender                -0.043    0.017    0.011
Social index    20-84       -0.056    0.040    0.160
(Soc ind)^2     400-7056    0.0006    0.0003   0.082
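To see the curvature that the squared term introduces, the sketch below evaluates the combined contribution of social index to the log odds, -0.056 ∙ s + 0.0006 ∙ s^2, using the two coefficients from the table. The function name and the choice of evaluation points are illustrative, and neither coefficient is individually significant in the table, so this only illustrates the shape.

```python
B_LINEAR = -0.056   # social index coefficient from the table
B_SQUARED = 0.0006  # (social index)^2 coefficient from the table


def social_index_contribution(s: float) -> float:
    """Contribution of social index s to the logit (linear plus squared term)."""
    return B_LINEAR * s + B_SQUARED * s ** 2


# The quadratic reaches its minimum where its derivative is zero.
turning_point = -B_LINEAR / (2 * B_SQUARED)

for s in (20.0, turning_point, 84.0):
    print(f"s = {s:.1f}: contribution to logit = {social_index_contribution(s):+.3f}")
```

Under these coefficients the contribution falls until a social index of roughly 47 and rises again afterwards, a pattern a single linear term could not represent.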
Interaction terms
Age and gender are entered into the model as separate terms.
Age x gender is included to see whether age has a different effect in males than in females.
Variable        Range               b coeff   SE      p
Age             40-69 y             0.086     0.115   <0.001
Gender          0=m, 1=f            1.500     0.967   0.121
Age x gender    M: 0; F: 40-69      -0.043    0.017   0.011
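One way to read the interaction term is to work out the age slope separately for each gender, and the female-to-male odds ratio at a given age. The sketch below uses the three coefficients from the table; the function names and the example ages are illustrative assumptions.

```python
import math

B_AGE = 0.086            # age coefficient
B_GENDER = 1.500         # gender coefficient (0 = male, 1 = female)
B_AGE_X_GENDER = -0.043  # age x gender coefficient


def age_slope(gender: int) -> float:
    """Change in logit per year of age for the given gender code."""
    return B_AGE + B_AGE_X_GENDER * gender


def female_vs_male_odds_ratio(age: float) -> float:
    """Odds ratio for females relative to males at a given age."""
    return math.exp(B_GENDER + B_AGE_X_GENDER * age)


print(f"age slope, males:   {age_slope(0):.3f}")
print(f"age slope, females: {age_slope(1):.3f}")
for age in (40, 69):
    print(f"age {age}: OR female vs male = {female_vs_male_odds_ratio(age):.2f}")
```

With these coefficients the logit rises with age about half as fast in females as in males, and the female-to-male odds ratio is below 1 across the studied age range of 40 to 69.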
Interpretation
With binary (dummy) variables, e^b is the odds ratio.
You can compare the strength (slope) of the effect by comparing b.
With numeric variables, b is not a direct measure of strength of effect.
– Example: b is quite small for the effect of BP on mortality, because it is the effect of only a one mmHg change in BP. BP is still an important factor in mortality because there is a wide range in BP.
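The point about range can be made concrete with the SBP coefficient from the Evans County table (b = 0.019 per mmHg). The sketch below computes the odds ratio for several sizes of change; the chosen differences of 1, 10, 20, and 50 mmHg are illustrative.

```python
import math

B_SBP = 0.019  # SBP coefficient per mmHg, from the Evans County table


def odds_ratio_for_change(b: float, delta: float) -> float:
    """Odds ratio associated with a change of delta units in the variable."""
    return math.exp(b * delta)


for delta in (1, 10, 20, 50):
    print(f"{delta:>2} mmHg higher SBP: OR = {odds_ratio_for_change(B_SBP, delta):.2f}")
```

A 1 mmHg difference changes the odds by about 2%, but a 20 mmHg difference corresponds to an odds ratio of about 1.46, which is why a small b can still describe an important risk factor.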
Interpretation
In a prospective cohort study we can use the logistic regression model to predict the probability of the event given the independent variables. We can also derive the relative risk.
In a cross-sectional study we only have the odds ratio.
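As a sketch of how predicted probabilities lead to a relative risk, the code below uses the constant (-7.2014) and BP coefficient (0.0579) from the hypothetical Epi Info report earlier in the deck. The two blood pressure values, 120 and 160, are arbitrary illustrative choices, and the calculation assumes the model was fitted to cohort data.

```python
import math

C, B = -7.2014, 0.0579  # constant and BP coefficient from the Epi Info report


def probability(sbp: float) -> float:
    """Predicted probability of the event at a given systolic BP."""
    return 1.0 / (1.0 + math.exp(-(C + B * sbp)))


p_low, p_high = probability(120.0), probability(160.0)

# Relative risk compares probabilities; the odds ratio compares odds.
relative_risk = p_high / p_low
odds_ratio = (p_high / (1.0 - p_high)) / (p_low / (1.0 - p_low))

print(f"p(120) = {p_low:.3f}, p(160) = {p_high:.3f}")
print(f"relative risk = {relative_risk:.2f}, odds ratio = {odds_ratio:.2f}")
```

When the event is common, as in this hypothetical model, the odds ratio is much further from 1 than the relative risk, so the two should not be used interchangeably.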
Selection of variables
Same principle as with ordinary regression
Forward selection: add one variable at a time until there are no more that make a significant difference
Backward selection: start with all variables, then remove one at a time to see whether each made a significant contribution
Epi Info has suggestions on how to do this

Contenu connexe

Tendances

Logistic regression with SPSS
Logistic regression with SPSSLogistic regression with SPSS
Logistic regression with SPSSLNIPE
 
Introduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorIntroduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorAmir Al-Ansary
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regressionJames Neill
 
Introduction to Generalized Linear Models
Introduction to Generalized Linear ModelsIntroduction to Generalized Linear Models
Introduction to Generalized Linear Modelsrichardchandler
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionsaba khan
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regressiondessybudiyanti
 
Ml3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metricsMl3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metricsankit_ppt
 
Regularization and variable selection via elastic net
Regularization and variable selection via elastic netRegularization and variable selection via elastic net
Regularization and variable selection via elastic netKyusonLim
 
ML - Multiple Linear Regression
ML - Multiple Linear RegressionML - Multiple Linear Regression
ML - Multiple Linear RegressionAndrew Ferlitsch
 
Multiple Linear Regression II and ANOVA I
Multiple Linear Regression II and ANOVA IMultiple Linear Regression II and ANOVA I
Multiple Linear Regression II and ANOVA IJames Neill
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionDrZahid Khan
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spssDr Nisha Arora
 
Regression analysis algorithm
Regression analysis algorithm Regression analysis algorithm
Regression analysis algorithm Sammer Qader
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regressionnszakir
 
Introduction to Bayesian Methods
Introduction to Bayesian MethodsIntroduction to Bayesian Methods
Introduction to Bayesian MethodsCorey Chivers
 

Tendances (20)

Logistic regression with SPSS
Logistic regression with SPSSLogistic regression with SPSS
Logistic regression with SPSS
 
Introduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorIntroduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood Estimator
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 
Introduction to Generalized Linear Models
Introduction to Generalized Linear ModelsIntroduction to Generalized Linear Models
Introduction to Generalized Linear Models
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
Ml3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metricsMl3 logistic regression-and_classification_error_metrics
Ml3 logistic regression-and_classification_error_metrics
 
Logistic Regression Analysis
Logistic Regression AnalysisLogistic Regression Analysis
Logistic Regression Analysis
 
Regularization and variable selection via elastic net
Regularization and variable selection via elastic netRegularization and variable selection via elastic net
Regularization and variable selection via elastic net
 
ML - Multiple Linear Regression
ML - Multiple Linear RegressionML - Multiple Linear Regression
ML - Multiple Linear Regression
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
Regression
RegressionRegression
Regression
 
Multiple Linear Regression II and ANOVA I
Multiple Linear Regression II and ANOVA IMultiple Linear Regression II and ANOVA I
Multiple Linear Regression II and ANOVA I
 
Linear regression theory
Linear regression theoryLinear regression theory
Linear regression theory
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spss
 
Regression analysis algorithm
Regression analysis algorithm Regression analysis algorithm
Regression analysis algorithm
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regression
 
Introduction to Bayesian Methods
Introduction to Bayesian MethodsIntroduction to Bayesian Methods
Introduction to Bayesian Methods
 

En vedette

Logistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart DiseaseLogistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart DiseaseMichael Lieberman
 
Intro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMIntro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMNYC Predictive Analytics
 
Data Science - Part XV - MARS, Logistic Regression, & Survival Analysis
Data Science -  Part XV - MARS, Logistic Regression, & Survival AnalysisData Science -  Part XV - MARS, Logistic Regression, & Survival Analysis
Data Science - Part XV - MARS, Logistic Regression, & Survival AnalysisDerek Kane
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examplesGaurav Kamboj
 
Logistic Regression: Behind the Scenes
Logistic Regression: Behind the ScenesLogistic Regression: Behind the Scenes
Logistic Regression: Behind the ScenesChris White
 
Intro to Logistic Regression
Intro to Logistic RegressionIntro to Logistic Regression
Intro to Logistic RegressionJay Victoria
 
Fault prediction using logistic regression (Python)
Fault prediction using logistic regression (Python)Fault prediction using logistic regression (Python)
Fault prediction using logistic regression (Python)Binayak Dutta
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkDB Tsai
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis pptElkana Rorio
 
Logistic Regression In Data Science
Logistic Regression In Data ScienceLogistic Regression In Data Science
Logistic Regression In Data ScienceEdureka!
 
Logistic Regression Demystified (Hopefully)
Logistic Regression Demystified (Hopefully)Logistic Regression Demystified (Hopefully)
Logistic Regression Demystified (Hopefully)Gabriele Tolomei
 
Anthony_Kimeu_Geospat_Presentation
Anthony_Kimeu_Geospat_PresentationAnthony_Kimeu_Geospat_Presentation
Anthony_Kimeu_Geospat_PresentationANTHONY KIMEU
 
Lecture slides stats1.13.l20.air
Lecture slides stats1.13.l20.airLecture slides stats1.13.l20.air
Lecture slides stats1.13.l20.airatutor_te
 
DUMMY VARIABLE REGRESSION MODEL
DUMMY VARIABLE REGRESSION MODELDUMMY VARIABLE REGRESSION MODEL
DUMMY VARIABLE REGRESSION MODELArshad Ahmed Saeed
 

En vedette (20)

Logistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart DiseaseLogistic Regression: Predicting The Chances Of Coronary Heart Disease
Logistic Regression: Predicting The Chances Of Coronary Heart Disease
 
Intro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMIntro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVM
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Binary Logistic Regression Example
Binary Logistic Regression ExampleBinary Logistic Regression Example
Binary Logistic Regression Example
 
Data Science - Part XV - MARS, Logistic Regression, & Survival Analysis
Data Science -  Part XV - MARS, Logistic Regression, & Survival AnalysisData Science -  Part XV - MARS, Logistic Regression, & Survival Analysis
Data Science - Part XV - MARS, Logistic Regression, & Survival Analysis
 
Multilevel Binary Logistic Regression
Multilevel Binary Logistic RegressionMultilevel Binary Logistic Regression
Multilevel Binary Logistic Regression
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examples
 
Ordinal Logistic Regression
Ordinal Logistic RegressionOrdinal Logistic Regression
Ordinal Logistic Regression
 
Logistic Regression: Behind the Scenes
Logistic Regression: Behind the ScenesLogistic Regression: Behind the Scenes
Logistic Regression: Behind the Scenes
 
Intro to Logistic Regression
Intro to Logistic RegressionIntro to Logistic Regression
Intro to Logistic Regression
 
Fault prediction using logistic regression (Python)
Fault prediction using logistic regression (Python)Fault prediction using logistic regression (Python)
Fault prediction using logistic regression (Python)
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Logistic Regression In Data Science
Logistic Regression In Data ScienceLogistic Regression In Data Science
Logistic Regression In Data Science
 
Logistic Regression Demystified (Hopefully)
Logistic Regression Demystified (Hopefully)Logistic Regression Demystified (Hopefully)
Logistic Regression Demystified (Hopefully)
 
Logistic regression teaching
Logistic regression teachingLogistic regression teaching
Logistic regression teaching
 
Anthony_Kimeu_Geospat_Presentation
Anthony_Kimeu_Geospat_PresentationAnthony_Kimeu_Geospat_Presentation
Anthony_Kimeu_Geospat_Presentation
 
Lecture slides stats1.13.l20.air
Lecture slides stats1.13.l20.airLecture slides stats1.13.l20.air
Lecture slides stats1.13.l20.air
 
MRA vs AVM
MRA vs AVM MRA vs AVM
MRA vs AVM
 
DUMMY VARIABLE REGRESSION MODEL
DUMMY VARIABLE REGRESSION MODELDUMMY VARIABLE REGRESSION MODEL
DUMMY VARIABLE REGRESSION MODEL
 

Similaire à Logistic regression (blyth 2006) (simplified)

Linear Regression and Logistic Regression in ML
Linear Regression and Logistic Regression in MLLinear Regression and Logistic Regression in ML
Linear Regression and Logistic Regression in MLKumud Arora
 
Presentation on Regression Analysis
Presentation on Regression AnalysisPresentation on Regression Analysis
Presentation on Regression AnalysisJ P Verma
 
P G STAT 531 Lecture 10 Regression
P G STAT 531 Lecture 10 RegressionP G STAT 531 Lecture 10 Regression
P G STAT 531 Lecture 10 RegressionAashish Patel
 
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5Daniel Katz
 
Logistic Regression.ppt
Logistic Regression.pptLogistic Regression.ppt
Logistic Regression.ppthabtamu biazin
 
Probably, Definitely, Maybe
Probably, Definitely, MaybeProbably, Definitely, Maybe
Probably, Definitely, MaybeJames McGivern
 
Lecture.3.regression.all
Lecture.3.regression.allLecture.3.regression.all
Lecture.3.regression.allKUBUKE JACKSON
 
Big Data Analysis
Big Data AnalysisBig Data Analysis
Big Data AnalysisNBER
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionRupak Roy
 
Linear Regression
Linear Regression Linear Regression
Linear Regression Rupak Roy
 
whitehead-logistic-regression.ppt
whitehead-logistic-regression.pptwhitehead-logistic-regression.ppt
whitehead-logistic-regression.ppt19DSMA012HarshSingh
 
Bipolar Disorder Investigation Using Modified Logistic Ridge Estimator
Bipolar Disorder Investigation Using Modified Logistic Ridge EstimatorBipolar Disorder Investigation Using Modified Logistic Ridge Estimator
Bipolar Disorder Investigation Using Modified Logistic Ridge EstimatorIOSR Journals
 

Similaire à Logistic regression (blyth 2006) (simplified) (20)

Linear Regression and Logistic Regression in ML
Linear Regression and Logistic Regression in MLLinear Regression and Logistic Regression in ML
Linear Regression and Logistic Regression in ML
 
Regression-Logistic-4.pdf
Regression-Logistic-4.pdfRegression-Logistic-4.pdf
Regression-Logistic-4.pdf
 
Presentation on Regression Analysis
Presentation on Regression AnalysisPresentation on Regression Analysis
Presentation on Regression Analysis
 
Logistics regression
Logistics regressionLogistics regression
Logistics regression
 
Logmodels2
Logmodels2Logmodels2
Logmodels2
 
Logmodels2
Logmodels2Logmodels2
Logmodels2
 
P G STAT 531 Lecture 10 Regression
P G STAT 531 Lecture 10 RegressionP G STAT 531 Lecture 10 Regression
P G STAT 531 Lecture 10 Regression
 
M8.logreg.ppt
M8.logreg.pptM8.logreg.ppt
M8.logreg.ppt
 
M8.logreg.ppt
M8.logreg.pptM8.logreg.ppt
M8.logreg.ppt
 
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
Quantitative Methods for Lawyers - Class #22 - Regression Analysis - Part 5
 
Logistic Regression.ppt
Logistic Regression.pptLogistic Regression.ppt
Logistic Regression.ppt
 
Probably, Definitely, Maybe
Probably, Definitely, MaybeProbably, Definitely, Maybe
Probably, Definitely, Maybe
 
Lecture.3.regression.all
Lecture.3.regression.allLecture.3.regression.all
Lecture.3.regression.all
 
Big Data Analysis
Big Data AnalysisBig Data Analysis
Big Data Analysis
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Probability Cheatsheet.pdf
Probability Cheatsheet.pdfProbability Cheatsheet.pdf
Probability Cheatsheet.pdf
 
Linear Regression
Linear Regression Linear Regression
Linear Regression
 
whitehead-logistic-regression.ppt
whitehead-logistic-regression.pptwhitehead-logistic-regression.ppt
whitehead-logistic-regression.ppt
 
Chapter 15
Chapter 15Chapter 15
Chapter 15
 
Bipolar Disorder Investigation Using Modified Logistic Ridge Estimator
Bipolar Disorder Investigation Using Modified Logistic Ridge EstimatorBipolar Disorder Investigation Using Modified Logistic Ridge Estimator
Bipolar Disorder Investigation Using Modified Logistic Ridge Estimator
 



Logistic regression (Blyth 2006) (simplified)

  • 1. Logistic Regression. Dr Mike Blyth, February 2006
  • 2. Logistic Regression. A way to look at the effect of a "numeric" (interval or ratio) independent variable on a binary (yes-no) dependent variable.
  • 3. Review Linear Regression. The dependent variable is continuous interval or ratio (numeric). Independent variables are also interval or ratio. Examples: effect of weight on blood pressure; effect of drug dose on reticulocyte count.
  • 6. Logistic Regression. The dependent variable is a binary (yes/no) outcome. Independent variables are continuous interval variables. Examples: relation of weight and BP to 10-year risk of death; relation of CD4 count to 1-year risk of AIDS diagnosis.
  • 7. Why do we need it? We could use categorical analysis such as a frequency table:
      CD4 count       | AIDS | No AIDS
      CD4 > 350       | 80   | 20
      150 < CD4 < 350 | 50   | 50
      CD4 < 150       | 20   | 80
    Problems: (a) some information is lost when we collapse the numeric data into categories, which leads to loss of power; (b) there is no estimate of the magnitude of the relation.
  • 8. Odds. Probability: p = probability of the event; 1 - p = probability of not the event (also called q); p varies from 0 to 1. Odds: the ratio of the probability of the event to the probability of not having the event, $\text{odds} = p/(1-p)$. When p = 0.5, odds = 1 (or "1:1 odds"). When p = 0.1, odds = 0.1/0.9 = 0.11.
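The conversion from probability to odds on this slide can be checked with a minimal Python sketch. The function name `odds` is an illustrative choice and does not come from the slides.

```python
def odds(p: float) -> float:
    """Return the odds p / (1 - p) for a probability with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


# Print the odds for a few probabilities, including the slide's two examples.
for p in (0.1, 0.5, 0.9):
    print(f"p = {p:.1f}  odds = {odds(p):.2f}")
```

Note that the odds grow without bound as p approaches 1, which is why the function rejects p = 1.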
  • 9. Log Odds. The log odds (also called "logit") is simply the natural logarithm of the odds: $\text{logit} = \ln(\text{odds}) = \ln(p/(1-p)) = \ln(p) - \ln(1-p)$. Since $\ln(1) = 0$, the logit is 0 when the odds are 1:1, i.e. when the probability is 50%. The logit for an event of probability p is the negative of the logit for the probability of not having the event.
  • 10. Relation between probability p and logit. [Figure: probability p (axis 0.000 to 1.000) plotted against logit = ln[p/(1-p)] (axis -8 to 8).]
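The relation between p and the logit can be tabulated with a short Python sketch. The helper names `logit` and `inv_logit` are illustrative, and the grid of logit values simply mirrors the -8 to 8 axis range of the figure.

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds, ln(p / (1 - p)), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    """Inverse of the logit: the probability whose log odds equal z."""
    return 1.0 / (1.0 + math.exp(-z))


# Tabulate probability against logit over the range used in the figure.
for z in range(-8, 9, 2):
    print(f"logit = {z:+d}  p = {inv_logit(z):.4f}")
```

The printed values trace an S-shaped curve: p stays strictly between 0 and 1 however large or small the logit becomes.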
  • 11. Logistic regression model. The linear regression model with one variable is $y = a + bx + e$. The logistic regression model with one variable is $\text{logit} = a + bx$, where $\text{logit} = \ln(p/(1-p))$.
  • 12. Logistic regression model. The logistic regression model with one variable is $\text{logit} = a + bx$, where $\text{logit} = \ln(p/(1-p))$. In other words, the model says the log odds of the event happening are a constant (a) plus some other constant (b) times a numeric risk factor (x) (for example, SBP).
  • 13. Logistic regression model. Given values of the independent variables, the regression equation predicts the log odds.
  • 14. Logistic regression model. The statistics program calculates the coefficient b. The coefficient b shows how much the log odds change with a one-unit change in the independent variable. Positive b → higher risk with higher values. Negative b → lower risk with higher values.
  • 15. Logistic regression model. Hypothetical example given above examining the relation of BP to risk of stroke/death. The model predicts $\ln(\text{odds}) = c + b \cdot \text{SBP}$, where c is the constant. Exponentiating both sides gives $e^{\ln(\text{odds})} = e^{c + b \cdot \text{SBP}}$, so $\text{odds} = e^{c + b \cdot \text{SBP}} = e^{c} \cdot e^{b \cdot \text{SBP}}$.
  • 16. Logistic regression model. The coefficient b shows how much the odds change with a change in the independent variable: $\text{odds} = e^{c} \cdot e^{bx}$. In other words, $\text{odds} = \text{something} \cdot (e^{b})^{x}$.
  • 17. Logistic regression model. $\text{odds} = \text{constant} \cdot (e^{b})^{x}$. So $e^{b}$ is the factor indicating the effect of x on the event. Each one-unit change in x multiplies the odds by a factor of $e^{b}$; this factor is the odds ratio for a one-unit change in x.
  • 18. Logistic regression model. $\text{odds} = \text{constant} \cdot (e^{b})^{x}$. Suppose b = 0.693, so $e^{b} = 2$: a one-unit change in x will double the odds. Suppose b = -0.693, so $e^{b} = 0.5$: a one-unit change in x will halve the odds. If b = 0, then $e^{b} = 1$ and x has no effect on the odds (OR = 1).
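The doubling and halving examples can be verified numerically. This is a minimal sketch; `odds_multiplier` is a hypothetical helper name, and the two-unit case is added only to show that the factors compound.

```python
import math


def odds_multiplier(b: float, delta_x: float = 1.0) -> float:
    """Factor applied to the odds when x changes by delta_x: (e^b)^delta_x."""
    return math.exp(b * delta_x)


for b in (0.693, -0.693, 0.0):
    print(f"b = {b:+.3f}  factor per unit = {odds_multiplier(b):.3f}")

# A two-unit change applies the per-unit factor twice.
print(f"b = +0.693, two units: {odds_multiplier(0.693, 2):.3f}")
```

Because 0.693 is only a rounded value of ln 2, the printed factors are close to 2 and 0.5 rather than exactly equal to them.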
  • 19. Logistic regression model. For the hypothetical example above, the report is given by Epi Info as:
      Term  | Odds Ratio | 95% CI       | Coeff  | S. E.  | Z     | P
      BP    | 1.0597     | 1.022, 1.098 | 0.0579 | 0.0185 | 3.131 | 0.0017
      Const | *          | *, *         | -7.201 | 2.2994 | 3.131 | 0.0017
  • 20. Logistic regression model.
      Term     | Odds Ratio | 95% CI       | Coefficient | S. E. | Z     | P-value
      BP       | 1.0597     | 1.022, 1.098 | 0.0579      | 0.018 | 3.131 | 0.0017
      Constant | *          | *, *         | -7.2014     | 2.299 | 3.131 | 0.0017
    Coefficient, or beta, or b, is the slope or magnitude of the effect.
  • 21. Logistic regression model.
      Term     | Odds Ratio | 95% CI         | Coefficient | S. E.  | Z      | P-value
      BP       | 1.0597     | 1.0220, 1.0987 | 0.0579      | 0.0185 | 3.1319 | 0.0017
      Constant | *          | *, *           | -7.2014     | 2.2994 | 3.1319 | 0.0017
    Odds ratio for a one-unit change in the independent variable (e.g. BP). This is the calculated $e^{b}$. A one-unit change in BP multiplies the odds by 1.0597.
  • 22. Logistic regression model.
      Term     | Odds Ratio | 95% CI       | Coeff   | S. E.  | Z      | P-value
      BP       | 1.0597     | 1.022, 1.098 | 0.0579  | 0.0185 | 3.1319 | 0.0017
      Constant | *          | *, *         | -7.2014 | 2.2994 | 3.1319 | 0.0017
    95% confidence interval for that odds ratio. The confidence interval does not include 1, so the effect is statistically significant.
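The BP row of the report can be approximately reproduced from the coefficient and its standard error. The sketch below assumes a Wald-type calculation (normal critical value 1.96, two-sided normal p-value); the slides do not say how Epi Info computes these columns, so treat this as one plausible reconstruction. Small differences from the report are expected because the inputs are rounded to four decimals.

```python
import math

b, se = 0.0579, 0.0185  # BP coefficient and standard error from the report
z_crit = 1.96           # assumption: normal critical value for a 95% interval

odds_ratio = math.exp(b)                    # e^b
ci_low = math.exp(b - z_crit * se)          # lower limit of the odds ratio
ci_high = math.exp(b + z_crit * se)         # upper limit of the odds ratio
z = b / se                                  # Wald Z statistic
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

print(f"OR = {odds_ratio:.4f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
print(f"Z = {z:.4f}, p = {p_value:.4f}")
```

The interval is built on the log-odds scale and then exponentiated, which is why it is not symmetric around the odds ratio.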
  • 23. Using more than one independent variable. Single variable: $\text{logit} = c + bx$, so $\text{odds} = c' \cdot (e^{b})^{x}$, where $c' = e^{c}$. Multiple variables: $\text{logit} = c + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$, so $\text{odds} = c' \cdot (e^{b_1})^{x_1} \cdot (e^{b_2})^{x_2} \cdots (e^{b_n})^{x_n}$. Note that the terms multiply their effects on the odds.
  • 24. Using more than one independent variable. The analysis reports a b coefficient for each independent variable. That coefficient is the effect of the given independent variable, separated from the effects of all the other independent variables.
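The statement that terms add on the logit scale and multiply on the odds scale can be illustrated with a small Python sketch. The constant, coefficients, and covariate values below are made up for illustration and are not taken from any study in the slides.

```python
import math


def odds_from_logit(c, coefs, xs):
    """Odds computed as e^(c + b1*x1 + ... + bn*xn)."""
    return math.exp(c + sum(b * x for b, x in zip(coefs, xs)))


def odds_as_product(c, coefs, xs):
    """Same odds written as e^c * (e^b1)^x1 * ... * (e^bn)^xn."""
    result = math.exp(c)
    for b, x in zip(coefs, xs):
        result *= math.exp(b) ** x
    return result


# Hypothetical model: constant, a per-mmHg SBP effect, and a 0/1 risk factor.
c, coefs, xs = -3.0, [0.02, 0.7], [140, 1]
print(odds_from_logit(c, coefs, xs), odds_as_product(c, coefs, xs))
```

Both functions return the same number up to floating-point rounding, which is the algebraic identity the slide relies on.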
  • 25. Real Life Example. Prospective cohort study of causes of cardiac disease: Evans County Study, 1965. Independent variables = age, gender, race, social index, SBP, diabetes, smoking, cholesterol, and an obesity index. Dependent variable = risk of dying during a 10-year period.
  • 26. Table cited in Kelsey et al., Methods in Observational Epidemiology, 1986:
      Variable     | Range     | b coeff | SE     | p
      Constant     |           | -6.376  | 1.634  | <0.001
      Age          | 40-69 y   | 0.086   | 0.115  | <0.001
      Gender       | 0=m, 1=f  | 1.500   | 0.967  | 0.121
      Age x gender |           | -0.043  | 0.017  | 0.011
      Social index | 20-84     | -0.056  | 0.040  | 0.160
      (Soc ind)^2  | 400-7056  | 0.0006  | 0.0003 | 0.082
      SBP          | 88-310    | 0.019   | 0.002  | <0.001
      Diabetes     | 0=n, 1=y  | 1.123   | 0.261  | <0.001
      Smoking      | 0=n, 1=y  | 0.317   | 0.157  | 0.043
      Cholesterol  | 94-546    | 0.0031  | 0.0015 | 0.041
      Quetelet     | 2.11-8.76 | -1.064  | 0.432  | 0.014
      (Quetelet)^2 | 4.44-76.8 | 0.112   | 0.049  | 0.022
  • 27. (The coefficient table from slide 26 is shown again, without the citation.)
  • 28. Statistical Significance. The p value indicates statistical significance. Age is positively correlated with risk of death. Gender has a positive b coefficient, but the p value is 0.12, indicating that we cannot say that there is a significant relationship.
      Variable | Range    | b coeff | SE    | p
      Age      | 40-69 y  | 0.086   | 0.115 | <0.001
      Gender   | 0=m, 1=f | 1.500   | 0.967 | 0.121
  • 29. Dichotomous (yes-no) variables. Gender is coded as 0 for male, 1 for female. $e^{b}$ [$e^{1.5} = 4.48$] is the change in OR for a 1-unit change in gender, i.e. the OR for females relative to males. $e^{b}$ for any dummy variable (coded 0-1) is the adjusted OR for that risk factor, since "1 unit of change" = presence vs. absence of the risk factor.
      Variable | Range    | b coeff | SE    | p
      Constant |          | -6.376  | 1.634 | <0.001
      Age      | 40-69 y  | 0.086   | 0.115 | <0.001
      Gender   | 0=m, 1=f | 1.500   | 0.967 | 0.121
  • 30. Squared terms. Social index squared is included as well as social index itself. Squared terms allow for curvilinear relationships, just as in ordinary regression.
      Variable     | Range    | b coeff | SE     | p
      Age x gender |          | -0.043  | 0.017  | 0.011
      Social index | 20-84    | -0.056  | 0.040  | 0.160
      (Soc ind)^2  | 400-7056 | 0.0006  | 0.0003 | 0.082
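To see what "curvilinear" means here, the sketch below evaluates the combined social index contribution to the log odds, using the two tabulated coefficients. The function name and the turning-point calculation are illustrative additions; since neither term is individually significant in the table (p = 0.160 and 0.082), this shows the mechanics only and is not a finding of the study.

```python
B_SOC, B_SOC_SQ = -0.056, 0.0006  # social index and (social index)^2 coefficients


def social_index_contribution(s: float) -> float:
    """Contribution of social index s to the log odds: b1*s + b2*s^2."""
    return B_SOC * s + B_SOC_SQ * s ** 2


# A quadratic b1*s + b2*s^2 with b2 > 0 has its minimum at s = -b1 / (2*b2).
turning_point = -B_SOC / (2 * B_SOC_SQ)

for s in (20, turning_point, 84):
    print(f"social index = {s:6.2f}  contribution = {social_index_contribution(s):+.4f}")
```

With a linear term alone the contribution could only rise or fall steadily across the 20 to 84 range; the squared term lets it fall and then rise again.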
  • 31. Interaction terms. Age and gender are entered into the model as separate terms. Age x gender is included to see whether age has a different effect in males than in females.
      Variable     | Range               | b coeff | SE    | p
      Age          | 40-69 y             | 0.086   | 0.115 | <0.001
      Gender       | 0=m, 1=f            | 1.500   | 0.967 | 0.121
      Age x gender | M: 0-0; F: 40-69    | -0.043  | 0.017 | 0.011
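A short sketch of how the interaction term changes the reading of the coefficients, using the three tabulated values. The function names are hypothetical, and the calculation follows directly from writing the logit contribution as 0.086·age + 1.500·gender - 0.043·age·gender; it is an illustration of the model as tabulated, not a result reported in the slides.

```python
import math

B_AGE, B_GENDER, B_AGE_X_GENDER = 0.086, 1.500, -0.043  # from the table


def age_or_per_year(gender: int) -> float:
    """Odds ratio for one extra year of age; gender is 0 (male) or 1 (female)."""
    return math.exp(B_AGE + B_AGE_X_GENDER * gender)


def female_vs_male_or(age: float) -> float:
    """Odds ratio for females relative to males at a given age."""
    return math.exp(B_GENDER + B_AGE_X_GENDER * age)


print(f"OR per year of age, males:   {age_or_per_year(0):.4f}")
print(f"OR per year of age, females: {age_or_per_year(1):.4f}")
for age in (40, 55, 69):
    print(f"female vs male OR at age {age}: {female_vs_male_or(age):.3f}")
```

Because of the negative interaction coefficient, each year of age raises the odds less in females than in males, and the female-versus-male odds ratio depends on the age at which it is evaluated.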
  • 32. Interpretation. With binary dummy variables, $e^{b}$ is the odds ratio, and you can compare the strength (slope) of the effects by comparing b. With numeric variables, b is not a direct measure of strength of effect. Example: b is quite small for the effect of BP on mortality, because it is the effect of only a one mmHg change in BP. BP is still an important factor in mortality because there is a wide range in BP.
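The point about small per-unit coefficients can be made concrete with the SBP coefficient from the Evans County table (b = 0.019 per mmHg). The helper name and the chosen differences of 1, 10 and 40 mmHg are illustrative assumptions.

```python
import math

B_SBP = 0.019  # SBP coefficient per mmHg, from the Evans County table


def or_for_change(b: float, delta: float) -> float:
    """Odds ratio associated with a change of `delta` units in the variable."""
    return math.exp(b * delta)


for delta in (1, 10, 40):
    print(f"SBP difference of {delta:2d} mmHg: OR = {or_for_change(B_SBP, delta):.3f}")
```

A 1 mmHg difference changes the odds by about 2%, while a 40 mmHg difference, well within the observed 88 to 310 range, roughly doubles them.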
  • 33. Interpretation. In a prospective cohort study we can use the logistic regression model to predict the probability of the event given the independent variables. We can also derive the relative risk. In a cross-sectional study we only have the odds ratio.
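As a sketch of how predicted probabilities and a relative risk follow from a fitted model, the code below reuses the constant and BP coefficient from the hypothetical Epi Info report earlier in the deck. The comparison of 120 versus 160 mmHg and the function name are assumptions chosen for illustration.

```python
import math

CONST, B_BP = -7.2014, 0.0579  # constant and BP coefficient from the hypothetical report


def predicted_probability(sbp: float) -> float:
    """Probability of the event implied by logit = CONST + B_BP * sbp."""
    z = CONST + B_BP * sbp
    return 1.0 / (1.0 + math.exp(-z))


p_low = predicted_probability(120)   # assumed reference blood pressure
p_high = predicted_probability(160)  # assumed comparison blood pressure

relative_risk = p_high / p_low
odds_ratio = (p_high / (1 - p_high)) / (p_low / (1 - p_low))

print(f"p(120) = {p_low:.3f}, p(160) = {p_high:.3f}")
print(f"relative risk = {relative_risk:.2f}, odds ratio = {odds_ratio:.2f}")
```

The odds ratio equals e^(40·b) regardless of the starting blood pressure, whereas the relative risk depends on the baseline probability and is much smaller here because the event is common.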
  • 34. Selection of variables. Same principle as with ordinary regression. Forward selection: add one variable at a time until there are no more that make a significant difference. Backward selection: start with all variables and remove one at a time to see whether each made a significant contribution. Epi Info has suggestions on how to do this.