3. Definition
How well a survey measures what it sets out to measure.
Validity can be determined only if there is a reference procedure or "gold standard".
Examples: a food-frequency questionnaire validated against food diaries; birth weight validated against the hospital record.
5. Screening test
Validity – get the correct result
Sensitivity – correctly classify cases
Specificity – correctly classify non-cases
[screening and diagnosis are not identical]
8. 2 cases / month
[Dot diagram: cases occurring in the population over time]
9. Pre-detectable, pre-clinical, clinical, old
[Dot diagram: cases distributed across the four disease stages]
10. Pre-detectable, pre-clinical, clinical, old
[Dot diagram: the population, with cases at each disease stage]
11. What is the prevalence of “the condition”?
[Dot diagram: the population, with cases at each disease stage]
12. Sensitivity of a screening test
Probability (proportion) of correct classification of detectable, pre-clinical cases
13. Pre-detectable (8), pre-clinical (10), clinical (6), old (14)
[Dot diagram: the population, with cases marked by stage]
14. Sensitivity = correctly classified / total detectable pre-clinical cases (10)
[Dot diagram: the population, with detectable pre-clinical cases marked]
15. Specificity of a screening test
Probability (proportion) of correct classification of non-cases
Non-cases identified / all non-cases
16. Pre-detectable (8), pre-clinical (10), clinical (6), old (14)
[Dot diagram: the population, with cases marked by stage]
17. Specificity = correctly classified / total non-cases (162, or 170 if the pre-detectable cases are counted as non-cases)
[Dot diagram: the population, with non-cases marked]
18. True Disease Status

                              Cases              Non-cases           Total
Screening test    Positive    a (true pos.)      b (false pos.)      a+b
results           Negative    c (false neg.)     d (true neg.)       c+d
                  Total       a+c                b+d

Sensitivity = true positives / all cases = a / (a+c)
Specificity = true negatives / all non-cases = d / (b+d)
19. True Disease Status

                              Cases      Non-cases      Total
Screening test    Positive    140        1,000          1,140
results           Negative    60         19,000         19,060
                  Total       200        20,000

Sensitivity = true positives / all cases = 140 / 200 = 70%
Specificity = true negatives / all non-cases = 19,000 / 20,000 = 95%
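As an illustrative sketch only (Python, not part of the original slides), the arithmetic of slide 19 can be written out directly; the variable names a, b, c, and d simply mirror the cell labels of slide 18.

# Cell counts from slide 19 (a = true positives, b = false positives,
# c = false negatives, d = true negatives).
a, b = 140, 1_000
c, d = 60, 19_000

sensitivity = a / (a + c)   # 140 / 200 = 0.70
specificity = d / (b + d)   # 19,000 / 20,000 = 0.95

print(f"Sensitivity: {sensitivity:.0%}")   # 70%
print(f"Specificity: {specificity:.0%}")   # 95%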
20. Interpreting test results: predictive value
Probability (proportion) of those tested who
are correctly classified
Cases identified / all positive tests
Non-cases identified / all negative tests
21. True Disease Status

                              Cases              Non-cases           Total
Screening test    Positive    a (true pos.)      b (false pos.)      a+b
results           Negative    c (false neg.)     d (true neg.)       c+d
                  Total       a+c                b+d

PPV = true positives / all positives = a / (a+b)
NPV = true negatives / all negatives = d / (c+d)
22. True Disease Status

                              Cases      Non-cases      Total
Screening test    Positive    140        1,000          1,140
results           Negative    60         19,000         19,060
                  Total       200        20,000

PPV = true positives / all positives = 140 / 1,140 = 12.3%
NPV = true negatives / all negatives = 19,000 / 19,060 = 99.7%
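The predictive values of slide 22 follow from the same cell counts; this is again only a sketch, not part of the original slides.

# Same cell counts as above: a = true pos., b = false pos.,
# c = false neg., d = true neg.
a, b = 140, 1_000
c, d = 60, 19_000

ppv = a / (a + b)   # 140 / 1,140 is about 0.123
npv = d / (c + d)   # 19,000 / 19,060 is about 0.997

print(f"PPV: {ppv:.1%}")   # 12.3%
print(f"NPV: {npv:.1%}")   # 99.7%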
24. Receiver operating characteristic (ROC) curve
Not all tests give a simple yes/no result. Some yield results that are numerical values along a continuous scale of measurement. In these situations, high sensitivity is obtained at the cost of low specificity, and vice versa.
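To make the trade-off concrete, here is a small sketch with invented test values (they are not from the slides): as the positivity cutoff rises, specificity improves while sensitivity falls.

# Hypothetical continuous test values, chosen only for illustration.
cases     = [8.1, 7.4, 6.9, 6.2, 5.8, 5.1]   # values in true cases
non_cases = [5.9, 5.3, 4.8, 4.4, 3.9, 3.1]   # values in non-cases

for cutoff in (4.0, 5.0, 6.0, 7.0):
    sens = sum(v >= cutoff for v in cases) / len(cases)        # proportion of cases called positive
    spec = sum(v < cutoff for v in non_cases) / len(non_cases) # proportion of non-cases called negative
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")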
26. Reliability
Repeatability – get same result
Each time
From each instrument
From each rater
If the correct result is not known, then only reliability can be examined.
27. Definition
The degree of stability exhibited when a
measurement is repeated under identical
conditions
Lack of reliability may arise from divergences
between observers or instruments of
measurement or instability of the attribute
being measured
(from Last, Dictionary of Epidemiology)
30. EXAMPLE OF PERCENT AGREEMENT
Two physicians are each given a
set of 100 X-rays to look at independently
and asked to judge whether pneumonia is
present or absent. When both sets of
diagnoses are tallied, it is found that 95%
of the diagnoses are the same.
31. IS PERCENT AGREEMENT GOOD ENOUGH?
Do these two physicians exhibit high
diagnostic reliability?
Can there be 95% agreement between
two observers without really having
good reliability?
32. Compare the two tables below:

Table 1                MD #1
                       Yes    No
MD #2      Yes         43     3
           No          2      52

Table 2                MD #1
                       Yes    No
MD #2      Yes         1      3
           No          2      94

In both instances, the physicians agree 95% of the time. Are the two physicians equally reliable in the two tables?
33. USE OF THE KAPPA STATISTIC TO ASSESS RELIABILITY
Kappa is a widely used measure of inter- or intra-observer agreement (or reliability) which corrects for chance agreement.
34. KAPPA VARIES FROM + 1 to - 1
+ 1 means that the two observers are perfectly
reliable. They classify everyone exactly the
same way.
0 means there is no relationship at all between the two observers' classifications, beyond the agreement that would be expected by chance.
- 1 means the two observers classify exactly the
opposite of each other. If one observer says
yes, the other always says no.
35. GUIDE TO USE OF KAPPAS IN
EPIDEMIOLOGY AND MEDICINE
Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor
36. HOW TO CALCULATE KAPPA
1. Calculate observed agreement (observations on which the observers agree / total observations). In both Table 1 and Table 2 it is 95%.
2. Calculate expected agreement (chance agreement) based on the marginal totals.
38. How do we calculate the N expected by chance in each cell? We assume that each cell should reflect the marginal distributions, i.e. the proportion of yes and no answers should be the same within the four-fold table as in the marginal totals.

OBSERVED               MD #1
                       Yes    No       Total
MD #2      Yes         1      3        4
           No          2      94       96
           Total       3      97       100

EXPECTED               MD #1
                       Yes    No       Total
MD #2      Yes         ?      ?        4
           No          ?      ?        96
           Total       3      97       100
39. To do this, we find the proportion of answers in either the column marginal totals (3% and 97%, yes and no respectively, for MD #1) or the row marginal totals (4% and 96%, yes and no respectively, for MD #2), and apply one of the two proportions to the other marginal total. For example, 96% of the row totals are in the "No" category, so by chance 96% of MD #1's 97 "No" answers should also fall in MD #2's "No" row: 96% of 97 is 93.12.

EXPECTED               MD #1
                       Yes    No       Total
MD #2      Yes         ?      ?        4
           No          ?      93.12    96
           Total       3      97       100
40. By subtraction, all the other cells fill in automatically, and each yes/no distribution reflects the marginal distribution. Any cell could have been used to make the calculation, because once one cell is specified in a 2x2 table with fixed marginal totals, all the other cells are also specified.

EXPECTED               MD #1
                       Yes    No       Total
MD #2      Yes         0.12   3.88     4
           No          2.88   93.12    96
           Total       3      97       100
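A minimal sketch of this step (the rule that each expected cell equals row total x column total / N is the same calculation as "96% of 97" above; the code itself is not from the slides):

# Marginal totals from the OBSERVED table on slide 38.
row_totals = [4, 96]    # MD #2: Yes, No
col_totals = [3, 97]    # MD #1: Yes, No
n = 100

# Expected count in each cell = row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)   # [[0.12, 3.88], [2.88, 93.12]]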
41. Now you can see that, just by the operation of chance, 93.24 of the 100 observations (93.12 + 0.12) should have been agreed on by the two observers.

EXPECTED               MD #1
                       Yes    No       Total
MD #2      Yes         0.12   3.88     4
           No          2.88   93.12    96
           Total       3      97       100
42. Below is the formula for calculating kappa from expected agreement:

Kappa = (Observed agreement - Expected agreement) / (1 - Expected agreement)
      = (95% - 93.24%) / (100% - 93.24%)
      = 1.76% / 6.76%
      = 0.26
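For readers who want to check the arithmetic, here is a minimal self-contained Python sketch of the same calculation; the function name cohen_kappa is ours, not the slides'.

def cohen_kappa(table):
    # table: 2x2 list of counts, rows = one observer, columns = the other.
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    observed = sum(table[i][i] for i in range(2)) / n                        # observed agreement
    expected = sum(row_totals[i] * col_totals[i] for i in range(2)) / n**2   # chance agreement
    return (observed - expected) / (1 - expected)

# First example (slide 38 counts): 95% raw agreement, but kappa is only 0.26.
print(f"{cohen_kappa([[1, 3], [2, 94]]):.2f}")   # 0.26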
43. How good is a Kappa of 0.26?
Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor
44. In the second example, the observed agreement was also 95%, but the marginal totals were very different.

ACTUAL                 MD #1
                       Yes    No       Total
MD #2      Yes                          46
           No                           54
           Total       45     55       100
45. Using the same procedure as before, we calculate the expected N in any one cell, based on the marginal totals. For example, the lower right cell is 54% of 55, which is 29.7.

EXPECTED               MD #1
                       Yes    No       Total
MD #2      Yes         ?      ?        46
           No          ?      29.7     54
           Total       45     55       100
46. And, by subtraction, the other cells are as below. The cells which indicate agreement (Yes/Yes and No/No) add up to 50.4%.

EXPECTED               MD #1
                       Yes    No       Total
MD #2      Yes         20.7   25.3     46
           No          24.3   29.7     54
           Total       45     55       100
47. Enter the two agreements into the formula:

Kappa = (Observed agreement - Expected agreement) / (1 - Expected agreement)
      = (95% - 50.4%) / (100% - 50.4%)
      = 44.6% / 49.6%
      = 0.90

In this example, the observers have the same percent agreement as before, but now it is far above the agreement expected by chance. A kappa of 0.90 is considered excellent.
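Reusing the cohen_kappa sketch from slide 42 reproduces this result, using the cell counts implied by the marginals and the 95% agreement (43, 3, 2, 52):

# Same 95% raw agreement, but these marginals give a much higher kappa.
print(f"{cohen_kappa([[43, 3], [2, 52]]):.2f}")   # 0.90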