6. Classical versus IRT
Classical Model                           IRT Model
Traditional                               Modern
Requires less strict adherence            Requires stricter adherence
  to assumptions                            to assumptions
Sample dependent                          Population invariant
Statistics (p = difficulty,               Probability-based statistics
  point-biserial = discrimination)          (b = difficulty, a = discrimination,
                                            c = guessing)
Simple scoring model (raw score)          Scoring is more complex
2010 ICE Educational Conference
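The b, a, and c parameters in the comparison above correspond to the three-parameter logistic (3PL) model. A minimal sketch in Python (the parameter values below are illustrative, not from the slides):

```python
import math

def p_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under the 3PL model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    where b = difficulty, a = discrimination, c = guessing."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b, the examinee responds halfway between the
# guessing floor c and certainty:
print(round(p_3pl(0.0, a=1.2, b=0.0, c=0.2), 2))  # 0.6
```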
13. IRT Pre-Equating
What does it mean?
Why would you want to do it?
What does it mean for building item banks and forms?
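One reading of pre-equating, sketched below: because item parameters are calibrated before the operational administration, the number-correct-to-theta conversion table can be built in advance by inverting the test characteristic curve, which is what enables immediate, on-demand scoring. A minimal illustration assuming 3PL items (the item parameters and grid bounds are invented for the example):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def expected_raw(theta, items):
    """Test characteristic curve: expected number-correct score
    at a given theta, summed over (a, b, c) item triples."""
    return sum(p_3pl(theta, *item) for item in items)

def raw_to_theta_table(items, lo=-4.0, hi=4.0, step=0.01):
    """Invert the TCC on a theta grid so each raw score maps to a
    theta before test day -- scores can then be reported on demand."""
    grid = []
    theta = lo
    while theta <= hi:
        grid.append((theta, expected_raw(theta, items)))
        theta += step
    table = {}
    for raw in range(len(items) + 1):
        # pick the grid theta whose expected raw score is closest
        table[raw] = min(grid, key=lambda t: abs(t[1] - raw))[0]
    return table
```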
14. Test Information Function (TIF)
Comparison of Test Information Functions
[Chart: Information (0 to 70) on the y-axis versus Theta (-3 to +3) on the x-axis, one curve each for Form A and Form B]
16. Implications
Item writing
– Leave those scored items alone!
– Focused item writing targeting the performance standard
Assembly
– Items selected for a form should cluster around the performance standard
Testing and Reporting
– Field test items for pre-equating/on-demand scoring
– Form assignment
– Scoring
– Recalibration
– Harder to explain to stakeholders
17. Does IRT make sense for you?
What is the size and maturity of your program and item bank?
Do you like to tinker with items?
Do your program requirements change frequently?
How experienced/capable are your item writers?
How do you score candidates?
– IRT scoring or number correct?
Do you hold scores or do immediate scoring?
Can you afford a psychometrician?
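"IRT scoring or number correct" is the difference between counting right answers and estimating theta from the full response pattern. A hedged sketch of one common IRT approach, EAP estimation with a standard-normal prior, assuming 3PL items (all parameter values are illustrative):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def eap_theta(responses, items, lo=-4.0, hi=4.0, n=81):
    """EAP ability estimate: posterior mean of theta on a grid,
    standard-normal prior weighted by the likelihood of the
    0/1 response pattern."""
    num = den = 0.0
    for i in range(n):
        theta = lo + i * (hi - lo) / (n - 1)
        weight = math.exp(-0.5 * theta * theta)  # N(0, 1) prior kernel
        for u, item in zip(responses, items):
            p = p_3pl(theta, *item)
            weight *= p if u == 1 else (1.0 - p)
        num += theta * weight
        den += weight
    return num / den

# Two candidates with the same raw score can receive different
# theta estimates, because which items they answered matters:
items = [(0.8, -1.0, 0.2), (1.0, 0.0, 0.2), (1.5, 1.0, 0.2)]
print(round(eap_theta([1, 1, 0], items), 2))
print(round(eap_theta([0, 1, 1], items), 2))
```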
18. Questions?
Diane M. Talley dtalley@castleworldwide.com
James A. Penny jpenny@castleworldwide.com
Stephen B. Johnson sjohnson@castleworldwide.com
919.572.6880