6. Classical versus IRT
Classical Model                           IRT Model
Traditional                               Modern
Requires less strict adherence            Requires stricter adherence
  to assumptions                            to assumptions
Sample dependent                          Population invariant
Statistics (p = difficulty,               Probability-based statistics
  point-biserial = discrimination)          (b = difficulty, a = discrimination,
                                            c = guessing)
Simple scoring model (raw score)          Scoring is more complex
2010 ICE Educational Conference
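The b, a, and c parameters in the comparison above correspond to the three-parameter logistic (3PL) model. A minimal sketch in Python (the parameter values below are illustrative, not from the slides):

```python
import math

def p_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under the 3PL model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    where b = difficulty, a = discrimination, c = guessing."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b, the examinee responds halfway between the
# guessing floor c and certainty:
print(round(p_3pl(0.0, a=1.2, b=0.0, c=0.2), 2))  # 0.6
```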
13. IRT Pre-Equating
What does it mean?
Why would you want to do it?
What does it mean for building item banks and forms?
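One reading of pre-equating, sketched below: because item parameters are calibrated before the operational administration, the number-correct-to-theta conversion table can be built in advance by inverting the test characteristic curve, which is what enables immediate, on-demand scoring. A minimal illustration assuming 3PL items (the item parameters and grid bounds are invented for the example):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def expected_raw(theta, items):
    """Test characteristic curve: expected number-correct score
    at a given theta, summed over (a, b, c) item triples."""
    return sum(p_3pl(theta, *item) for item in items)

def raw_to_theta_table(items, lo=-4.0, hi=4.0, step=0.01):
    """Invert the TCC on a theta grid so each raw score maps to a
    theta before test day -- scores can then be reported on demand."""
    grid = []
    theta = lo
    while theta <= hi:
        grid.append((theta, expected_raw(theta, items)))
        theta += step
    table = {}
    for raw in range(len(items) + 1):
        # pick the grid theta whose expected raw score is closest
        table[raw] = min(grid, key=lambda t: abs(t[1] - raw))[0]
    return table
```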
14. Test Information Function (TIF)
Comparison of Test Information Functions
[Chart: Information (0 to 70) on the y-axis versus Theta (-3 to +3) on the x-axis, one curve each for Form A and Form B]
16. Implications
Item writing
– Leave those scored items alone!
– Focused item writing targeting the performance standard
Assembly
– Items selected for a form should cluster around the performance standard
Testing and Reporting
– Field test items for pre-equating/on-demand scoring
– Form assignment
– Scoring
– Recalibration
– Harder to explain to stakeholders
17. Does IRT make sense for you?
What is the size and maturity of your program and item bank?
Do you like to tinker with items?
Do your program requirements change frequently?
How experienced/capable are your item writers?
How do you score candidates?
– IRT scoring or number correct?
Do you hold scores or do immediate scoring?
Can you afford a psychometrician?
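"IRT scoring or number correct" is the difference between counting right answers and estimating theta from the full response pattern. A hedged sketch of one common IRT approach, EAP estimation with a standard-normal prior, assuming 3PL items (all parameter values are illustrative):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def eap_theta(responses, items, lo=-4.0, hi=4.0, n=81):
    """EAP ability estimate: posterior mean of theta on a grid,
    standard-normal prior weighted by the likelihood of the
    0/1 response pattern."""
    num = den = 0.0
    for i in range(n):
        theta = lo + i * (hi - lo) / (n - 1)
        weight = math.exp(-0.5 * theta * theta)  # N(0, 1) prior kernel
        for u, item in zip(responses, items):
            p = p_3pl(theta, *item)
            weight *= p if u == 1 else (1.0 - p)
        num += theta * weight
        den += weight
    return num / den

# Two candidates with the same raw score can receive different
# theta estimates, because which items they answered matters:
items = [(0.8, -1.0, 0.2), (1.0, 0.0, 0.2), (1.5, 1.0, 0.2)]
print(round(eap_theta([1, 1, 0], items), 2))
print(round(eap_theta([0, 1, 1], items), 2))
```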
18. Questions?
Diane M. Talley dtalley@castleworldwide.com
James A. Penny jpenny@castleworldwide.com
Stephen B. Johnson sjohnson@castleworldwide.com
919.572.6880