Case Control Study (ANALYTICAL EPIDEMIOLOGY)

CASE-CONTROL STUDY
Presentation by: Dr. Nidhi Singh
Moderator: Dr. Geeta Pardeshi
Department of Community Medicine
VMMC and Safdarjung Hospital

CONTENTS
• Introduction
• Study design
• Basic steps
• Bias in case-control studies
• Strengths and weaknesses of the study
• Variants of the case-control study
• Classical examples
• Summary

CLASSIFICATION OF EPIDEMIOLOGIC METHODS
1. Observational studies
   a. Descriptive studies
   b. Analytical studies: ecological, cross-sectional, case-control, cohort
2. Experimental studies: randomised controlled trials, field trials, community trials

INTRODUCTION
• In contrast to descriptive studies, in analytical studies the investigator proceeds with a preformed hypothesis regarding a causal exposure, i.e. that a particular exposure leads to a particular outcome.
• Analytical studies are comparative studies: they use a comparison group.

INTRODUCTION (CONTD)
From these study designs one can determine:
1. Whether or not a statistical association exists between a disease and a suspected factor.
2. If one exists, the strength of the association.

CASE-CONTROL STUDY
• Used to determine retrospectively whether there is an association between an exposure and a specific health outcome.
• Backward-looking study (effect-to-cause study)
• Also called a case-reference study
• Also called a TROHOC study ("cohort" spelled in reverse)
• A common first approach to testing a causal hypothesis.

3 DISTINCT FEATURES OF A CASE-CONTROL STUDY:
1. Both exposure and outcome have occurred before the start of the study.
2. The study proceeds backwards from effect to cause.
3. It uses a control or comparison group to support or refute an inference.

WHEN IS A CASE-CONTROL STUDY WARRANTED?
• Before a cohort or an experimental study: to identify the possible etiology of the disease.
• When multiple exposures need to be investigated: the real exposure is not yet known.
• When the disease is rare.

STUDY DESIGN
[Figure: design of a case-control study. The study begins in the present with cases (disease) and controls (no disease) drawn from the study population, and looks back into the past to find whether the factor was present or absent in each group.]

• The distinction between the two designs, case-control and cohort, is not based on time.
• It is the starting point of the study that decides the type.
[Figure: a cohort study starts from exposed and non-exposed groups; a case-control study starts from diseased and non-diseased groups.]

BASIC STEPS
1. Selection of cases
2. Selection of controls
3. Matching
4. Measurement of exposure
5. Analysis and interpretation

Specify the total and the actual population
• Study population: derived from the total population
• Total population: may be as broad as mankind or limited to specific groups.

SELECTION OF CASES:
• Definition of case
• Inclusion/exclusion criteria
• Eligibility criteria
• Sources of cases

SELECTION OF CASES:
Definition of case:
Diagnostic criteria:
• Must be specified before the study is undertaken.
• Should be well formulated and documented.
• Should not be altered until the study is over.
E.g. diagnostic criteria formulated by WHO for myocardial infarction (MI):
1) ECG abnormalities
2) Enzyme changes
3) Characteristic chest pain
Eligibility criteria:
• Whether to include incident or prevalent cases?

• Incident cases: newly diagnosed cases
• Prevalent cases: cases that have already been diagnosed; a larger number of cases is often available for study.
Despite this practical advantage of prevalent cases, it is generally preferable to use incident cases in case-control studies of disease etiology.

Risk factors identified using prevalent cases may be more related to survival with the disease than to development of the disease (incidence).
• E.g. with prevalent cases:
- if most people who develop the disease die soon after diagnosis, they are under-represented in the study
- the study is more likely to include long-term survivors
- the cases are not representative
- any risk factor identified in this group may not be a general characteristic of all patients with the disease, but only of the survivors

• This can lead to a type of bias.
• What type of bias?

SOURCES OF CASES:
1) Hospital:
• Cases can be recruited from a hospital, clinic or GP register.
• Relatively easy and inexpensive.
• Selecting cases from several hospitals in the community, admitted during a specified period of time, helps avoid selection bias.
E.g. if the hospital from which cases are selected is a tertiary care facility that selectively admits only severely ill patients, any risk factors identified in the study may be risk factors only in persons with severe forms of the disease.

2) General population:
• Cases of the study disease occurring within a defined geographic area during a specified period of time may be included.
• Ascertained through:
1. Survey
2. Disease registry
3. Hospital network
• Advantage: describes the entire picture of the disease in that population
• Disadvantage: logistic and cost considerations are often high, hence not done routinely.

SELECTION OF CONTROLS:
• Definition of controls
• Inclusion/exclusion criteria
• Sources of controls
• Number of controls
• Number of control groups

SELECTION OF CONTROLS:
• Free from the disease under study.
• Should have undergone the same diagnostic work-up as the cases, but been found to be negative.
• Equally at risk of developing the disease.
• Selected from the same population as the cases.

CRITERIA FOR SELECTION OF CONTROLS:
1. Similar to the cases in all respects other than having the disease in question (MATCHING comes into play)
2. Representative of all persons without the disease in the population from which they are selected.
Exclusion criteria: patients with diseases known to be associated, either positively or negatively, with the exposure of interest.

SOURCES OF CONTROLS
1. Hospital patients
2. Special controls:
   1. Relatives
   2. Friends
   3. Neighborhood controls
   4. General population controls
Hospital controls
• Represent a sample of an ill-defined reference population
• Unlikely to be representative of the general reference population
• Differ from people in the community
E.g. the prevalence of cigarette smoking is known to be higher in hospitalized patients than in community residents
• Advantages:
- Readily available
- Better aware of antecedent exposures
- Willing to cooperate

Relatives
• May be unsuitable where genetic conditions are under study.
Neighborhood controls
• Drawn from the same geographical area (locality)
E.g. persons working in the same factory, children attending the same school
General population controls
- From a defined geographic area
- Take the home of a case as the starting point, walk past a specified number of houses in a specified direction, and seek the first household that contains an eligible control
- Door-to-door approach.

HOW MANY CONTROLS ARE NEEDED?
• CONTROLS OF THE SAME TYPE:
- Minimum ratio of controls to cases: 1:1
- Usual maximum number of controls per case: 4
- A noticeable increase in power is gained only up to a ratio of 1 case to 4 controls.
- Multiple controls per case are used to increase the power of the study.
- For rare diseases the number of potential cases is limited; because cases cannot be increased without extending the study in time, the option of increasing the number of controls is often chosen.
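The diminishing return beyond four controls per case can be illustrated with a commonly quoted approximation that is not on the slides: when the association is weak, the statistical efficiency of r controls per case, relative to an unlimited number of controls, is roughly r / (r + 1). The sketch below only tabulates that approximation; the function name is made up for illustration.

```python
def relative_efficiency(controls_per_case: int) -> float:
    """Approximate efficiency of r controls per case relative to unlimited
    controls, using the r / (r + 1) rule of thumb (an assumption, valid
    roughly when the odds ratio is near 1)."""
    r = controls_per_case
    return r / (r + 1)


previous = 0.0
for r in (1, 2, 3, 4, 5, 6):
    eff = relative_efficiency(r)
    # The gain column shows how little each extra control adds after r = 4.
    print(f"{r} control(s) per case: efficiency {eff:.3f}, gain {eff - previous:.3f}")
    previous = eff
```

Under this approximation, going from 1 to 4 controls per case raises efficiency from 0.5 to 0.8, while a fifth control adds only about 0.03.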

Multiple controls increase the power of the study:
1 case : 4 controls

• MULTIPLE CONTROLS OF DIFFERENT TYPES (hospital and neighborhood controls, controls with different diseases):
- Ideally, the results obtained when cases are compared with hospital controls will be similar to the results obtained when cases are compared with neighborhood controls; if they differ, the reason for the discrepancy (e.g. a bias) should be sought.

RATIONALE FOR USING TWO CONTROL GROUPS:
E.g. consider the question, "Did mothers of children with brain tumours have more prenatal radiation exposure than control mothers?"
Two groups selected for comparison:
1. Normal controls
2. Other cancer controls

SOME POSSIBLE RESULTS:
A] Radiation exposure is the same in brain tumor cases and other cancer controls, and is higher in both than in normal controls.
[Figure: proportions with and without a history of radiation exposure among brain tumor cases, other cancer controls and normal controls.]

• Recall bias may be present (the well-known epidemiologist Ernst Wynder called it "rumination bias"):
- Mothers of children with any type of cancer may recall prenatal radiation exposure better than mothers of normal children.

B] Radiation exposure in other cancer controls is the same as in normal controls, but is lower than in brain tumor cases.
[Figure: proportions with and without a history of radiation exposure among brain tumor cases, other cancer controls and normal controls.]
Multiple controls of different types are used to take possible biases, e.g. recall bias, into account.

MATCHING
- The process of selecting controls so that they are similar to the cases in certain characteristics, such as age, sex, socioeconomic status and occupation.
- Needed to ensure comparability between cases and controls.

TYPES OF MATCHING:
1. Group matching
2. Individual matching
Group matching (frequency matching):
The proportion of controls with a certain characteristic is identical to the proportion of cases with the same characteristic.
E.g. if 25% of cases are married, controls are selected so that 25% of controls are married.
Prerequisite: all the cases should be selected first.
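Group matching as described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are invented, one control is drawn per case, and `frequency_match` is a hypothetical helper, not a standard library function.

```python
import random
from collections import Counter


def frequency_match(cases, pool, key, seed=0):
    """Draw one control per case so that each level of `key` occurs
    as often among the controls as among the cases."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    needed = Counter(key(person) for person in cases)
    controls = []
    for level, count in needed.items():
        candidates = [person for person in pool if key(person) == level]
        if len(candidates) < count:
            raise ValueError(f"not enough eligible controls with {level!r}")
        controls.extend(rng.sample(candidates, count))
    return controls


# Invented data: 20 cases of whom 25% are married; 200 potential controls, 50% married.
cases = [{"id": i, "married": i < 5} for i in range(20)]
pool = [{"id": 1000 + i, "married": i % 2 == 0} for i in range(200)]

controls = frequency_match(cases, pool, key=lambda person: person["married"])
married_share = sum(c["married"] for c in controls) / len(controls)
print(f"{len(controls)} controls, {married_share:.0%} married")
```

The function needs the complete list of cases before it can draw any control, which is why all cases must be selected first.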

Individual matching (matched pairs)
• For each case, a control is selected.
• Controls should be similar to cases in terms of the specific variable or variables of concern (often used when hospital controls are taken).
Matching for universal confounders: age, sex

CONFOUNDING
• Confounders are concomitant variables
• Associated with both the exposure and the disease
• Distributed unequally in the study and control groups
• Do not lie in the causal chain between the exposure and the outcome
• A confounder can itself be an independent risk factor for the disease under study
• Should be identified before the data are collected

EXAMPLE:
Hypothesis: consumption of alcohol is a risk factor for oral cancer

History of alcohol   Oral cancer present   Oral cancer absent   Total
Present              80                    20                   100
Absent               20                    80                   100
Total                100                   100                  200

Odds ratio = $\frac{80 \times 80}{20 \times 20} = 16$
Conclusion: the risk of getting oral cancer is 16 times higher if a person drinks alcohol.

• This conclusion is incorrect: it ignores the hidden effect of tobacco use.
Risk factor: alcohol use
Confounder: tobacco use
People who drink alcohol are often also the ones who use tobacco, and tobacco use is itself a direct cause of oral cancer, whether one drinks or not.

EXAMPLE TO REMOVE CONFOUNDING
• Stratification: if the risk of cancer remains high in both strata, the risk is due not to tobacco but to alcohol itself.

Tobacco users
History of alcohol   Oral cancer present   Oral cancer absent   Total
Present              60                    15                   75
Absent               20                    5                    25
Total                80                    20                   100
Stratum OR = $\frac{60 \times 5}{15 \times 20} = 1$

Non-tobacco users
History of alcohol   Oral cancer present   Oral cancer absent   Total
Present              5                     20                   25
Absent               15                    60                   75
Total                20                    80                   100
Stratum OR = $\frac{5 \times 60}{20 \times 15} = 1$

After adjusting for tobacco use, the odds ratio in both strata is 1: alcohol by itself carries no risk.
Differential distribution:
Among cases who use tobacco, 60/80 = 75% drink alcohol.
Among cases who do not use tobacco, only 5/20 = 25% drink alcohol.
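The crude and stratum-specific odds ratios in this example can be checked with a short script. The counts are copied from the tables as printed; the Mantel-Haenszel pooled estimate is an addition not mentioned on the slides, included only as one common way of summarising the strata, and the function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio for a 2x2 table laid out as
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across strata (an addition to the slides)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


crude_table = (80, 20, 20, 80)            # alcohol vs oral cancer, all subjects
strata = {
    "tobacco users": (60, 15, 20, 5),
    "non-tobacco users": (5, 20, 15, 60),
}

print("crude OR:", odds_ratio(*crude_table))
for name, table in strata.items():
    print(f"OR among {name}:", odds_ratio(*table))
print("Mantel-Haenszel OR:", mantel_haenszel_or(strata.values()))
```

A crude odds ratio far from 1 alongside stratum-specific odds ratios of 1 is the pattern expected when the apparent effect is produced by the confounder.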

CONTROL OF CONFOUNDING
• Identify all potential confounding variables (PCV) at the time the research question is being developed.
• Control at two stages:
1. During planning: restriction, matching
2. During analysis: stratified analysis, regression analysis

MEASUREMENT OF EXPOSURE:
• Obtained by:
- Questionnaires
- Past records of cases
- Interviews
• Information about exposure should be obtained in precisely the same manner for cases and controls.
• This step is susceptible to various forms of bias.

2 X 2 CONTINGENCY TABLE

                                   Cases               Controls
                                   (disease present)   (disease absent)
Exposed (risk factor present)      a                   b
Not exposed (risk factor absent)   c                   d

ANALYSIS:
• To find out:
1. Exposure rates among cases and controls to the suspected factor.
2. Estimation of the disease risk associated with exposure (odds ratio).

EXPOSURE RATES:
• Exposure rate in cases = $\frac{a}{a+c}$
• Exposure rate in controls = $\frac{b}{b+d}$

                   D+ (cases)   D- (controls)
E+ (exposed)       a            b
E- (not exposed)   c            d
Total              (a+c)        (b+d)

EXAMPLE:
Case-control study of smoking and lung cancer:

              Cases (lung cancer)   Controls (without lung cancer)
Smokers       33 (a)                55 (b)
Non-smokers   2 (c)                 27 (d)
Total         35 (a+c)              82 (b+d)

Exposure rates:
Cases: $\frac{a}{a+c}$ = 33/35 = 94.3%
Controls: $\frac{b}{b+d}$ = 55/82 = 67%
• The frequency of the risk factor is higher in cases than in controls (exposure is associated with the disease).

• The next step is to ascertain whether the association between exposure and disease is statistically significant.
• To resolve this, the P-value is calculated.
• Variables under investigation can be:
1. Discrete (rates, proportions): chi-square test, or standard error of the difference between two proportions
2. Continuous: standard error of the difference between two means, or t-test
• If the P-value is ≤ 0.05, the association is considered statistically significant.
THE SMALLER THE P-VALUE, THE GREATER THE STATISTICAL SIGNIFICANCE
(the P-value is the probability of observing an association at least this strong by chance alone if there were truly no association)
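As a rough check of the smoking and lung cancer example, the exposure rates and an uncorrected Pearson chi-square test can be computed from the four cell counts. This is a minimal sketch using only the standard library: the function name is illustrative, no continuity correction is applied, and the P-value uses the identity that holds for one degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table,
    with its P-value for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Cell counts from the example: smokers / non-smokers among cases and controls.
a, b, c, d = 33, 55, 2, 27

rate_cases = a / (a + c)
rate_controls = b / (b + d)
chi2, p_value = chi_square_2x2(a, b, c, d)

print(f"exposure rate in cases:    {rate_cases:.1%}")
print(f"exposure rate in controls: {rate_controls:.1%}")
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

With these counts the P-value falls well below 0.05, so the difference in exposure rates would be called statistically significant by the criterion above.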

P-value does not imply causation

ESTIMATION OF RISK: ODDS RATIO
• Measure of the strength of association between the risk factor and the outcome.
• Odds: the ratio of the number of ways an event can occur to the number of ways it cannot occur.
E.g. probability of winning, P = 60%
Odds of winning = $\frac{P}{1-P} = \frac{60}{100-60} = \frac{60}{40} = 1.5$

In a case-control study:
Odds of cases being exposed = $\frac{a}{c}$
Odds of controls being exposed = $\frac{b}{d}$
Odds ratio = $\frac{\text{odds that a case was exposed}}{\text{odds that a control was exposed}} = \frac{a/c}{b/d} = \frac{ad}{bc}$

• When is the odds ratio a good estimate of the relative risk? (3 assumptions)
1. The disease under investigation is rare
2. Cases must be representative of those with the disease
3. Controls must be representative of those without the disease
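The rare-disease assumption can be illustrated numerically. The two cohorts below are entirely hypothetical and chosen so that the true relative risk is 2 in both; the only thing that changes is how common the disease is. The table layout follows the a, b, c, d convention used above.

```python
def relative_risk(a, b, c, d):
    """Risk in exposed divided by risk in unexposed.
    a, b = exposed with / without disease; c, d = unexposed with / without disease."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc for the same table."""
    return (a * d) / (b * c)


# Hypothetical cohorts (not from the slides), both with a true relative risk of 2.
rare = (20, 9980, 10, 9990)      # disease in 0.2% of exposed, 0.1% of unexposed
common = (500, 500, 250, 750)    # disease in 50% of exposed, 25% of unexposed

for label, table in (("rare disease", rare), ("common disease", common)):
    print(f"{label}: RR = {relative_risk(*table):.3f}, OR = {odds_ratio(*table):.3f}")
```

When the disease is rare the odds ratio is almost identical to the relative risk; when it is common the odds ratio overstates it.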

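CALCULATING ODDS RATIO IN AN UNMATCHED CASE-CONTROL STUDY
(E = exposed, N = not exposed)
Cases:    E E N E N N E E E N
Controls: N E N N E N N E N N

                   D+ (cases)   D- (controls)
E+ (exposed)       6 (a)        3 (b)
E- (not exposed)   4 (c)        7 (d)

Odds ratio = $\frac{ad}{bc} = \frac{6 \times 7}{3 \times 4} = 3.5$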
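The unmatched calculation can be reproduced directly from the two exposure lists. The confidence interval is an addition not shown on the slide: it uses Woolf's logit approximation, included here only to show how imprecise an odds ratio from ten cases and ten controls is.

```python
import math

# Exposure status copied from the slide (E = exposed, N = not exposed).
cases = "E E N E N N E E E N".split()
controls = "N E N N E N N E N N".split()

a, c = cases.count("E"), cases.count("N")        # exposed / unexposed cases
b, d = controls.count("E"), controls.count("N")  # exposed / unexposed controls

odds_ratio = (a * d) / (b * c)

# Woolf's approximate 95% confidence interval on the log scale (an addition).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"a={a}, b={b}, c={c}, d={d}")
print(f"odds ratio = {odds_ratio:.1f} (approx. 95% CI {lower:.2f} to {upper:.2f})")
```

The interval spans 1, a reminder that this toy example shows the arithmetic and not a real finding.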

CALCULATING ODDS RATIO IN A MATCHED CASE-CONTROL STUDY
• Concordant pairs: pairs in which case and control had the same exposure
- either both were exposed, or both were not exposed
• Discordant pairs: pairs with different exposure (2 combinations)
- case exposed, control not exposed
- case not exposed, control exposed
• The calculation is based on the discordant pairs only

CALCULATING ODDS RATIO IN A MATCHED CASE-CONTROL STUDY
(E = exposed, N = not exposed; each column is one matched pair)
Cases:    E E N E N N E E E N
Controls: N E N N E N N E N N

                      Case exposed   Case not exposed
Control exposed       a = 2          b = 1
Control not exposed   c = 4          d = 3
Discordant pairs: b and c

• ODDS RATIO:
As before, odds ratio = $\frac{\text{odds that a case was exposed}}{\text{odds that a control was exposed}}$
Here it is the ratio of the number of pairs in which the case was exposed and the control was not (c) to the number of pairs in which the control was exposed and the case was not (b).
Odds ratio = $\frac{c}{b} = \frac{4}{1} = 4$
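The matched-pair count can be verified by pairing the two exposure lists position by position, assuming (as the slide layout suggests) that the i-th case is matched to the i-th control. Variable names are illustrative.

```python
from collections import Counter

# Exposure status copied from the slide; position i in each list is one matched pair.
cases = "E E N E N N E E E N".split()
controls = "N E N N E N N E N N".split()

pairs = Counter(zip(cases, controls))

both_exposed = pairs[("E", "E")]         # concordant
neither_exposed = pairs[("N", "N")]      # concordant
case_only_exposed = pairs[("E", "N")]    # discordant
control_only_exposed = pairs[("N", "E")] # discordant

# Only the discordant pairs enter the matched odds ratio.
matched_or = case_only_exposed / control_only_exposed

print(f"concordant pairs: {both_exposed} both exposed, {neither_exposed} neither exposed")
print(f"discordant pairs: {case_only_exposed} case only, {control_only_exposed} control only")
print(f"matched odds ratio = {matched_or:.1f}")
```

Counting directly from the ten listed pairs gives 2 and 3 concordant pairs and 4 and 1 discordant pairs, so the matched odds ratio is 4.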

INTERPRETATION: ODDS RATIO
• Odds ratio = 1: exposure is not related to the disease
• Odds ratio > 1: risk in the exposed is greater than in the non-exposed (positive association)
• Odds ratio < 1: risk in the exposed is less than in the non-exposed (negative association: possibly protective)

BIAS IN CASE-CONTROL STUDY
• Defined as "any systematic error in the design, conduct or analysis of a study that results in a mistaken estimate of an exposure's effect on the risk of disease".
• Can arise in three ways:
1. The basic measurement technique is wrong
2. Variation between observers or subjects
3. At the time of:
   1. Selection
   2. Making measurements

TYPES OF BIAS
Selection bias:
• Self-selection (volunteer bias)
• Berkson's bias
• Survivorship (Neyman's) bias
• Healthy worker effect
• Exposure-related bias
• Inappropriate control group
Measurement bias:
• Recall bias
• Observer bias

SELECTION BIAS
• Error resulting from the way the subjects are selected
Self-selection bias (volunteer bias)
Avoid volunteers: they may be systematically very different from the usual population.
Berkson's bias (selective hospital admission)
Because patients with two concurrent diseases are more likely to be admitted to hospital, a hospital-based study may find a stronger association between them than really exists in the general community.

• Incidence-prevalence bias (Neyman's bias, survivorship bias)
• Healthy worker effect
A comparison between the health status of military and civilian populations may show a better health status in soldiers.
• Exposure-related bias:
A special type of Berkson's bias.
Occurs due to a differing probability of hospital admission among those who have and those who do not have the suspected cause.

Selection of an inappropriate control group
• Controls should be equally at risk of developing the disease.
• Controls should be selected from the same population as the cases.

INFORMATION (MEASUREMENT) BIAS
1. Wrong technique, wrong definitions
2. Recall bias:
A diseased person is more likely to recall a possible exposure.
3. Observer (interviewer) bias:
The interviewer may subconsciously question or examine the diseased group more thoroughly, in order to confirm the research hypothesis.

STRENGTHS OF THE STUDY DESIGN
• Easy to carry out
• Rapid and inexpensive
• Requires comparatively few subjects
• Suitable for investigating rare diseases
• Good for diseases with a long latency period
• No risk to subjects
• Allows the study of several risk factors
• No attrition problem
• Ethical problems are minimal

WEAKNESSES OF THE STUDY DESIGN
• Various types of bias may arise
• Relies on memory or past records (accuracy may be uncertain)
• Validation of the information obtained is difficult
• Selection of an appropriate control group may be difficult
• Incidence cannot be measured
• Does not distinguish between causes and associated factors

VARIANTS OF CASE-CONTROL STUDY:
1. Nested case-control studies
2. Case-cohort studies
• These studies are based in a defined cohort, which is followed over time.
• Baseline data on the cohort are collected at the beginning.
• Later, when cases develop, controls are selected accordingly, and the analysis is done only for the cases and controls selected from that cohort.

ADVANTAGES OF EMBEDDING A CASE-CONTROL STUDY IN A DEFINED COHORT
• The problem of recall bias is eliminated
• Temporality can be established
• More economical to conduct

SOME IMPORTANT DISCOVERIES MADE IN CASE-CONTROL STUDIES:
• 1950s: cigarette smoking and lung cancer
• 1970s: diethylstilbestrol and vaginal adenocarcinoma; postmenopausal estrogens and endometrial cancer
• 1980s: aspirin and Reye's syndrome; tampon use and toxic shock syndrome
• 1990s: diet and cancer

CLASSICAL EXAMPLE: 1
Hypothesis: association of maternal stilbestrol therapy with the appearance of tumor (adenocarcinoma of the vagina) in young women
1) Cancer of the vagina: a rare disease
2) Usually occurs in women above 50 years of age
3) Apparent clustering of cases: diagnosed within 4 years (1966-1969) in young women born between 1946 and 1951

• Cases:
1) Seven girls aged 15-22 years with adenocarcinoma of the vagina (clear-cell type), seen at the Vincent Memorial Hospital between 1966 and 1969
2) An eighth, identical case in a 20-year-old patient treated at another hospital was also included, because she, her family and their matched controls were as available for study as the original seven cases.
Controls:
1) 4 matched controls were selected per case
2) Selected from the birth records of the hospital in which each patient was born (this reduces socioeconomic differences)
3) Females born within 5 days of the case and on the same type of service (ward or private)

Data collection: personal interview (standard questionnaire)
The groups were compared on 7 risk factors:
1. Maternal age at the birth
2. Maternal smoking
3. Bleeding during the study pregnancy
4. Any prior pregnancy loss
5. Maternal estrogen therapy (DES) during the study pregnancy
6. Breast feeding of the infant
7. Intrauterine x-ray exposure
Results:
1) Highly significant association between maternal estrogen therapy during the study pregnancy and development of adenocarcinoma of the vagina in the daughters (p-value < 0.00001)
2) Lower levels of significance for maternal bleeding in the study pregnancy (p-value < 0.05) and prior pregnancy loss (p-value < 0.01)
3) No significant difference for the other factors.

• Bias in the study:
- Of the candidates for the control group found on hospital birth lists, 25% could not be located (selection bias).

CLASSICAL EXAMPLE: 2
Hypothesis: women who took oral contraceptives were at greater risk of developing thromboembolic disease
1) 249 reports of adverse reactions
2) 16 reports of death in women taking oral contraceptives
Cases:
- Women admitted to hospital with venous thrombosis or pulmonary embolism without a medical cause
Controls:
- Women admitted to the same hospital with other diseases (2 per case)
• Matching was done for age, marital status and parity.

Results:
- Of the 84 cases with venous thrombosis or pulmonary embolism, 42 (50%) had been using oral contraceptives, compared with 14% of controls.
- The investigators found that users of oral contraceptives were about 6 times more likely to develop thromboembolic disease.
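The "about 6 times" figure can be approximated from the two exposure proportions alone. This is a back-of-the-envelope sketch: it treats the data as unmatched and uses the rounded 14% figure for controls, so it is not the investigators' own analysis.

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)


p_cases = 42 / 84      # proportion of cases using oral contraceptives
p_controls = 0.14      # proportion of controls using them (rounded figure from the slide)

# Crude odds ratio, ignoring the matching (an assumption of this sketch).
crude_or = odds(p_cases) / odds(p_controls)
print(f"crude odds ratio = {crude_or:.2f}")
```

The crude estimate comes out slightly above 6, in line with the reported result.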

REFERENCES
• Park K. Park's Textbook of Preventive and Social Medicine. 24th ed.
• Gordis L. Epidemiology. 5th ed. Saunders Elsevier.
• Bhalwar R. Textbook of Community Medicine. 2nd ed.
• Hennekens CH. Epidemiology in Medicine.
