2. Smoking & lung cancer
Good case-control study associating lung cancer with smoking (Wynder EL, Graham EA.
Tobacco smoking as a possible etiologic factor in bronchiogenic carcinoma: a study of 684
proven cases. JAMA 1950;143:329-36.)
Tobacco dust (not smoke) might be causing the elevated incidence of lung tumours among German tobacco
workers. (Hermann Rottmann in Würzburg 1898)
Difference Causal
Smoking might be related to lung cancer, but lung cancer is still rare (Adler I. Primary Malignant
Growths of the Lungs and Bronchi. London: Longmans, 1912:22)
86 lung cancer patients were likely to have smoked (Müller FH. Tabakmissbrauch und Lungencarcinom.
Zeitschrift für Krebsforschung 1939;49:57–85.)
Smoking 35 cigarettes per day increases the risk 40-fold (Doll R,
Hill AB. The mortality of doctors in relation to their smoking
habits. BMJ 1954;1:1451–5.)
Animal study associating cigarette smoke tar with cancer (Wynder E,
Graham EA, Croninger AB. Experimental production of carcinoma with
cigarette tar. Cancer Res 1953;13:855–66)
8 November 2016 (C) Jamalludin Ab Rahman 2015
3. It is about relationship
Analysis of the relationships between two or more variables.
4. Multivariate?
Multivariate is a general term for analysis with multiple independent variables (IV)
It may also involve multiple dependent variables (DV)
[Figure: diagram linking IVs to DVs]
8. Exercise & fitness
[Figure: chart of fitness level by exercise intensity (Low, Moderate, High)]
Is there any difference (%) between Low and Moderate intensity?
How big is the difference (%) between Low and Moderate?
Is there any pattern now?
What is your conclusion?
11. The 3rd factor can be a...
1. Confounder
2. Mediator or intervening factor
3. Moderator or effect modifier (interaction)
16. Stress vs. MS vs. Coping mechanism
[Figure: diagram of Stress leading to new multiple sclerosis lesions, with coping mechanism as the moderator]
Mohr, D. C., Goodkin, D. E., Nelson, S., Cox, D., & Weiner, M. (2002).
Moderating Effects of Coping on the Relationship Between Stress and the
Development of New Brain Lesions in Multiple Sclerosis. Psychosom Med,
64(5), 803-809.
OR = 1.62, p = 0.009
Distraction (OR=0.69, p=0.009),
instrumental (OR=0.77, p=0.081),
emotional preoccupation (OR=1.46, p=0.088)
& palliative (NS)
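A moderator is commonly modelled by adding an interaction term to the regression. The sketch below is a generic form only; the slide does not state the exact model Mohr et al. fitted, and the symbols (P as the probability of a new lesion, c as a coping score) are assumptions made for illustration.

```latex
\ln\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1\,\text{Stress} + \beta_2\,\text{Coping} + \beta_3\,(\text{Stress} \times \text{Coping})

\text{OR for a one-unit increase in stress at coping level } c = e^{\beta_1 + \beta_3 c}
```

Read this way, an interaction odds ratio below 1 (that is, a negative interaction coefficient) would mean the effect of stress weakens as the coping score rises, and one above 1 would mean it strengthens.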
20. Why multivariate?
Multi-factorial – which are the significant factors?
Multiple outcomes
Multiple units of measurement
Exploration of associations
21. Regression
Most common multivariate technique
Best line to fit the data: ordinary least squares (OLS)
Many types:
Linear regression
Logistic regression
& many others!!
[Figure: scatter plot of y against x with the fitted line $y = \beta_0 + \beta_1 x$, where $\beta_0$ is the intercept]
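As a rough illustration of what "best line" means under OLS, the sketch below fits a line with one explanatory variable using the closed-form least-squares estimates. The function name `fit_line` and the data are made up for this example; real analyses would normally use a statistics package.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x, and sum of cross-products of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1


# Made-up data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = fit_line(xs, ys)
print(b0, b1)
```

With noisy data the same formulas return the line that minimises the sum of squared vertical distances between the points and the line.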
22. Regression equation
e.g. Linear regression
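$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$
$Y$ = dependent variable
$\beta_0$ = intercept
$\beta_1$ = coefficient for variable $x_1$
$x_1$ = explanatory variable
$\varepsilon$ = error/residual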
23. Example #1
Arterial BP = Constant + Age + Body weight + Pulse rate + Stress + Residual
$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$
24. Logistic regression
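$\ln\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n$
P = probability that Y = 1, i.e. the event occurs
1 - P = probability that the event does not occur
$\ln\left(\frac{P}{1-P}\right)$ = ln of the odds = logit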
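The logit is the natural log of the odds P/(1-P), and inverting it turns a linear predictor back into a probability. The sketch below shows both directions; the function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import math


def logit(p):
    """Log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Convert a linear predictor z back into a probability."""
    return 1 / (1 + math.exp(-z))


def predict_probability(intercept, coefs, xs):
    """P(Y = 1) for a logistic model with the given coefficients."""
    z = intercept + sum(b * x for b, x in zip(coefs, xs))
    return inverse_logit(z)


# Hypothetical model: logit(P) = -2.0 + 0.5 * x1
p = predict_probability(-2.0, [0.5], [4])   # z = 0, so P = 0.5
odds_ratio_x1 = math.exp(0.5)               # odds ratio per one-unit increase in x1
print(p, odds_ratio_x1)
```

Exponentiating a coefficient gives the odds ratio for a one-unit increase in that variable, holding the others constant.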
25. R² = 0.986, meaning about 99% of the variation in arterial BP (ABP) is
explained by Age, Body Weight, Pulse Rate & Stress
(F(4,15) = 291.948, P < 0.001)
26. Main Result
Arterial BP = 17.3 + 0.6(Age) + 0.9(Body weight) +
0.09(Pulse rate) + 0.01(Stress)
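The fitted equation can be used to predict arterial BP for a given person. The sketch below plugs values into the coefficients shown on the slide; the function name and the input values are hypothetical, and the units of each variable are not stated on the slide, so the result is illustrative only.

```python
# Coefficients copied from the fitted equation on the slide
INTERCEPT = 17.3
COEF = {"age": 0.6, "body_weight": 0.9, "pulse_rate": 0.09, "stress": 0.01}


def predict_abp(age, body_weight, pulse_rate, stress):
    """Predicted arterial BP from the fitted linear regression."""
    return (INTERCEPT
            + COEF["age"] * age
            + COEF["body_weight"] * body_weight
            + COEF["pulse_rate"] * pulse_rate
            + COEF["stress"] * stress)


# Hypothetical inputs (units assumed, not given on the slide)
print(round(predict_abp(age=50, body_weight=90, pulse_rate=70, stress=50), 1))  # 135.1
```

Each coefficient is the expected change in arterial BP for a one-unit increase in that variable with the others held constant; for example, one extra unit of age adds 0.6.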
27. Example #2
Snoring and risk of cardiovascular disease in women. Hu 2000.
From the Nurses' Health Study. Cohort. Baseline: N = 71,779
women aged 40 to 65 years, without diagnosed CVD or cancer
in 1986. Followed until 31 May 1994.
CVD = Snoring + Age + Smoking + BMI + Alcohol + Physical
Activity + Menopausal status + Family history of MI + DM +
Cholesterol + Hours sleeping + Sleeping position
35. Influential data
[Figure: scatter plot of y against x with a fitted line and three marked points A, B and C]
A = Outlier, still within the range of x, large residual value
B & C = Leverage points
B = Good leverage. It won't impact the regression line
C = Bad leverage. It will change the regression line
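The difference between good and bad leverage can be checked numerically. The sketch below uses the standard leverage formula for a one-predictor regression, h_i = 1/n + (x_i - mean)² / Sxx, and compares fitted slopes with an extreme-x point that follows the trend (like B) or departs from it (like C). The data and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def leverages(xs):
    """Leverage of each point in a one-predictor regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]


# Made-up data: five points on y = x plus one point far out at x = 20
xs = [1, 2, 3, 4, 5, 20]
ys_good = [1, 2, 3, 4, 5, 20]   # extreme point follows the trend (like B)
ys_bad = [1, 2, 3, 4, 5, 5]     # extreme point departs from the trend (like C)

h = leverages(xs)                 # leverage depends on x only, not on y
_, slope_good = fit_line(xs, ys_good)
_, slope_bad = fit_line(xs, ys_bad)
print(h[-1], slope_good, slope_bad)
```

The extreme point has the same high leverage in both cases; it only distorts the line when its y value does not fit the pattern of the other points.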
36. Type of multivariate tests
Dependent variables   Independent variables              Test
1, continuous         ≥ 2, all continuous                Linear regression
1, continuous         ≥ 2, all categorical               ANOVA
1, continuous         ≥ 2, continuous + categorical      ANCOVA
> 1, continuous       All categorical                    MANOVA
> 1, continuous       Categorical + continuous           MANCOVA
1, dichotomous        ≥ 2, continuous + categorical      Binary logistic regression
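The table can be read as a simple decision rule. The sketch below encodes only the rows listed above and returns None for any combination the table does not cover; the function name `choose_test` and the string labels are hypothetical.

```python
def choose_test(n_dv, dv_type, iv_types):
    """Pick a test from the table, or None if the combination is not listed.

    n_dv     -- number of dependent variables
    dv_type  -- "continuous" or "dichotomous"
    iv_types -- list with "continuous" or "categorical" for each IV
    """
    kinds = set(iv_types)
    all_cont = kinds == {"continuous"}
    all_cat = kinds == {"categorical"}
    mixed = kinds == {"continuous", "categorical"}
    many_ivs = len(iv_types) >= 2

    if n_dv == 1 and dv_type == "continuous" and many_ivs:
        if all_cont:
            return "Linear regression"
        if all_cat:
            return "ANOVA"
        if mixed:
            return "ANCOVA"
    if n_dv > 1 and dv_type == "continuous":
        if all_cat:
            return "MANOVA"
        if mixed:
            return "MANCOVA"
    if n_dv == 1 and dv_type == "dichotomous" and many_ivs and mixed:
        return "Binary logistic regression"
    return None


print(choose_test(1, "continuous", ["continuous", "categorical"]))
```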