The traditional approach to classification testing is extremely inefficient and often difficult to implement in applied settings. Typically, examinees are rank ordered either through Item Response Theory or Classical Test Theory, and then scores are compared to a difficult-to-define cut score.
This webinar will introduce the use of decision theory, which asks: “Does this response pattern look like the response pattern of a master or a non-master?” This simpler model has major advantages over IRT and CTT:
1. Only a small sample of clear masters and a small sample of clear non-masters are needed to calibrate questions.
2. There are no assumptions of unidimensionality or normal distribution, and no requirement for monotonically increasing probabilities of correct responses.
This model is a natural fit for end-of-unit examinations, adaptive testing, and the routing mechanism of intelligent tutoring systems.
This webinar will explain the model, identify current applications, and introduce free tools for generating, calibrating and scoring data.
Caveon Webinar Series: Using Decision Theory for Accurate Pass/Fail Decisions
1. Upcoming Caveon Events
• Caveon Webinar Series: Next session, June 19
Protecting your Tests Using Copyright Law
• Presenters include Intellectual Property Attorney Kenneth Horton and a
member of the Caveon Web Patrol team
• Register at: http://bit.ly/protectingip
• NCSA – June 19-21 National Harbor, MD
– Dr. John Fremer is co-presenting Preventing, Detecting, and Investigating Test
Security Irregularities: A Comprehensive Guidebook On Test Security For States
– Visit the Caveon booth!
2. Latest Publications
• Handbook of Test Security – Now available for
purchase! We’ll share a discount code before end
of session.
• TILSA Guidebook for State Assessment
Directors on Data Forensics – coming soon!
3. Caveon Online
• Caveon Security Insights Blog
– http://www.caveon.com/blog/
• twitter
– Follow @Caveon
• LinkedIn
– Caveon Company Page
– “Caveon Test Security” Group
• Please contribute!
• Facebook
– Will you be our “friend?”
– “Like” us!
www.caveon.com
4. “Using Decision Theory to Score
Accurate Pass/Fail Decisions”
Lawrence M. Rudner, Ph.D., MBA
Vice President and Chief Psychometrician
Research and Development
GMAC®
May 15, 2013
Caveon Webinar Series:
Jamie Mulkey, Ed.D.
Vice President and General Manager
Test Development Services
Caveon
5. Agenda for today
• Role of decision theory
• Examples
• Logic
• Tools
• Adaptive Testing
6. Goal of Measurement Decision Theory
Classify an examinee into one of K groups
– master / non-master
– below basic / basic / proficient / advanced
– A / B / C / D / F
7. Poll #1
Are you involved with any classification
tests as part of your work?
Attendee Responses:
Yes – Pass/Fail – 49%
Yes – Multiple categories, e.g. A, B, C, D, F – 39%
No – 11%
8. Poll #2
How familiar are you with Item Response
Theory?
Attendee Responses:
Very – I understand and routinely apply IRT formulas – 37%
Somewhat – I understand the logic and concepts – 38%
A little – I have heard of it – 20%
Not at all – I have never heard of it – 5%
9. Poll #3
What is your primary job function?
Attendee Responses:
Teacher or Content Expert – 6%
Item Writer – 8%
Psychometrician – 30%
Manager and I am a non-Psychometrician – 35%
Manager and I am a Psychometrician – 21%
12. New Thinking
[Figure: bar chart of the probability of being a Master or a Non-Master; y-axis from 0.0 to 1.0, categories Non-Master and Master]
13. A Different Question
Old: Your score was 76 which is above the
passing score of 72. You passed.
vs
New: Probability of this response pattern for a
master is 85% and the probability for a non-
master is 15%. You passed.
14. IRT Approach
[Figure: probability of a correct response to Question 123 (y-axis, 0.0 to 1.0) given ability level (x-axis, -3 to 3)]
16. Advantages
• Simple framework
• Small number of items
• Small calibration sample sizes
• Classifies as well as or better
than IRT
• Effective for adaptive testing
• Well developed science
24. Notation
• K - number of mastery states
• P(mk) - probability of a randomly drawn examinee being in mastery state k
• z - an individual’s response vector z1, z2, …, zN, with zi ∈ {0,1} for N questions
25. Want
P(mk | z )
The probability of each mastery state k, mk, given the
response vector z.
The probability of being a master given z
The probability of being a non-master given z
29. Mastery state
(using Bayes Theorem)
$$P(m_k \mid \mathbf{z}) = c \, P(\mathbf{z} \mid m_k) \, P(m_k)$$
But there are too many possible response vectors z
Simplifying assumption:
$$P(\mathbf{z} \mid m_k) = \prod_{i=1}^{N} P(z_i \mid m_k)$$
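Substituting the simplifying assumption into Bayes' theorem gives the quantity that is actually computed. The following is a sketch of that step, assuming c is the normalizing constant that makes the K posterior probabilities sum to one (the slides use c without defining it):

```latex
P(m_k \mid \mathbf{z}) = c \, P(m_k) \prod_{i=1}^{N} P(z_i \mid m_k),
\qquad
c = \left[ \sum_{j=1}^{K} P(m_j) \prod_{i=1}^{N} P(z_i \mid m_j) \right]^{-1}
```

With this form, only the N per-item conditional probabilities for each mastery state need to be calibrated, rather than a probability for every possible response vector.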
33. Probability of the response vector z for each mastery state is:
Conditional probabilities of a correct response, P(zi=1|mk)
                    Item 1   Item 2   Item 3
Masters (m1)          .8       .8       .6
Non-masters (m2)      .3       .6       .5
Examinee 1, Response Vector [1,1,0]
P(z| m1) = .8 * .8 * (1-.6) = .26
P(z| m2) = .3 * .6 * (1-.5) = .09
Normalized (equals the posterior P(mk|z) when the priors are equal)
P(m1|z) = .26 / (.26 + .09) = .74
P(m2|z) = .09 / (.26 + .09) = .26
35. Probability of the response vector z for each mastery state is:
Conditional probabilities of a correct response, P(zi=1|mk)
                    Item 1   Item 2   Item 3
Masters (m1)          .8       .8       .6
Non-masters (m2)      .3       .6       .5
Examinee 2, Response Vector [0,0,1]
P(z| m1) = (1-.8) * (1-.8) * .6 = .024
P(z| m2) = (1-.3) * (1-.6) * .5 = .14
Normalized (equals the posterior P(mk|z) when the priors are equal)
P(m1|z) = .024 / (.024 + .14) = .15
P(m2|z) = .14 / (.024 + .14) = .85
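The worked examples for Examinee 1 and Examinee 2 can be reproduced with a few lines of Python. This is a minimal sketch, not the presenter's tool: the function and variable names are illustrative, and the only data used are the conditional probabilities from the table above.

```python
from math import prod

# Conditional probabilities of a correct response, P(z_i = 1 | m_k),
# copied from the table on the slide.
P_CORRECT = {
    "master": [0.8, 0.8, 0.6],
    "non-master": [0.3, 0.6, 0.5],
}


def likelihood(z, p_correct):
    """P(z | m_k), assuming item responses are independent given the state."""
    return prod(p if zi == 1 else 1.0 - p for zi, p in zip(z, p_correct))


def normalized_likelihoods(z, table=P_CORRECT):
    """Likelihood of z under each state, rescaled so the values sum to 1."""
    raw = {state: likelihood(z, probs) for state, probs in table.items()}
    total = sum(raw.values())
    return {state: value / total for state, value in raw.items()}


for name, z in [("Examinee 1", [1, 1, 0]), ("Examinee 2", [0, 0, 1])]:
    result = normalized_likelihoods(z)
    print(name, {state: round(value, 2) for state, value in result.items()})
```

Rescaling the likelihoods so they sum to one gives the same numbers as a posterior computed with equal prior probabilities for the two states.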
39. Decision Rule – Maximum Likelihood
• Probability of the response vector, z, for each mastery state is:
P(z| m1) = .8 * .8 * (1-.6) = .26
P(z| m2) = .3 * .6 * (1-.5) = .09
[Figure: bar chart of P(z|mk) for Master and Non-Master]
40. Decision Rule - Maximum a posteriori probability
• Probability of each mastery state, with priors P(m1) = .7 and P(m2) = .3, is
P(m1|z) = c * .26 * .7 = c * .182 = .87
P(m2|z) = c * .09 * .3 = c * .027 = .13
[Figure: bar chart of P(mk|z) for Master and Non-Master]
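The maximum a posteriori rule can be sketched in Python as follows. The prior probabilities of .7 and .3 are read off the arithmetic on the slide; the function names are illustrative assumptions, not part of any published tool.

```python
from math import prod

# P(z_i = 1 | m_k) for the three example items, and the priors P(m_k)
# implied by the slide's arithmetic (.7 for masters, .3 for non-masters).
P_CORRECT = {"master": [0.8, 0.8, 0.6], "non-master": [0.3, 0.6, 0.5]}
PRIORS = {"master": 0.7, "non-master": 0.3}


def posterior(z, table=P_CORRECT, priors=PRIORS):
    """P(m_k | z) = c * P(z | m_k) * P(m_k) for every mastery state."""
    joint = {
        state: priors[state]
        * prod(p if zi == 1 else 1.0 - p for zi, p in zip(z, probs))
        for state, probs in table.items()
    }
    c = 1.0 / sum(joint.values())  # normalizing constant
    return {state: c * value for state, value in joint.items()}


def map_decision(z, table=P_CORRECT, priors=PRIORS):
    """Return the mastery state with the largest posterior probability."""
    post = posterior(z, table, priors)
    return max(post, key=post.get)


post = posterior([1, 1, 0])
print({state: round(value, 2) for state, value in post.items()})
print(map_decision([1, 1, 0]))
```

Setting both priors to .5 reduces this rule to the maximum likelihood rule, since the prior then cancels in the normalization.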
41. Decision Criteria
Bayes Risk
Given a set of item responses z and the costs associated with each decision, select dk to minimize the total expected cost.
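A minimal Python sketch of the Bayes risk criterion is shown below. The cost values are hypothetical and chosen only for illustration (the slides give no cost matrix); the posterior probabilities are the .87 and .13 from the maximum a posteriori example.

```python
def expected_costs(posterior, cost):
    """Expected cost of each decision d: sum over states of cost[d][state] * P(state | z)."""
    return {
        decision: sum(cost[decision][state] * posterior[state] for state in posterior)
        for decision in cost
    }


def min_risk_decision(posterior, cost):
    """Select the decision with the smallest expected cost."""
    risks = expected_costs(posterior, cost)
    return min(risks, key=risks.get)


posterior = {"master": 0.87, "non-master": 0.13}

# Hypothetical costs: passing a non-master is assumed to be ten times
# as costly as failing a master. Correct decisions cost nothing.
COST = {
    "pass": {"master": 0.0, "non-master": 10.0},
    "fail": {"master": 1.0, "non-master": 0.0},
}

print(expected_costs(posterior, COST))
print(min_risk_decision(posterior, COST))
```

Under these assumed costs the minimum-risk decision is "fail" even though "master" has the higher posterior probability. With equal costs for the two kinds of error, the criterion selects the same decision as the maximum a posteriori rule.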
54. Sequential Testing
1. Sequentially select items to maximize certainty,
2. Administer and score item,
3. Update the estimated mastery state classification probabilities,
4. Evaluate whether there is enough information to terminate testing,
5. Back to Step 1 if needed.
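The five steps can be sketched as a short Python loop. This is one possible reading, under stated assumptions: "maximize certainty" is implemented as choosing the item with the lowest expected entropy of the updated probabilities, testing stops when one mastery state reaches a hypothetical 0.9 threshold or the bank is exhausted, and the item bank reuses the three example items. All names are illustrative.

```python
from math import log2

# Hypothetical item bank: P(correct | mastery state) for each item,
# reusing the three example items from the earlier slides.
ITEM_BANK = {
    "item1": {"master": 0.8, "non-master": 0.3},
    "item2": {"master": 0.8, "non-master": 0.6},
    "item3": {"master": 0.6, "non-master": 0.5},
}


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def update(post, item, correct):
    """Step 3: Bayes update of the state probabilities after one scored response."""
    joint = {s: post[s] * (item[s] if correct else 1.0 - item[s]) for s in post}
    total = sum(joint.values())
    return {s: v / total for s, v in joint.items()}


def expected_entropy(post, item):
    """Entropy expected after giving `item`, averaged over right and wrong answers."""
    p_right = sum(post[s] * item[s] for s in post)
    h_right = entropy(update(post, item, True).values())
    h_wrong = entropy(update(post, item, False).values())
    return p_right * h_right + (1.0 - p_right) * h_wrong


def sequential_test(answer, priors, bank=ITEM_BANK, threshold=0.9):
    """Run the loop; `answer(name)` returns True if the examinee answers correctly."""
    post, remaining, administered = dict(priors), dict(bank), []
    # Steps 4 and 5: continue while no state is certain enough and items remain.
    while remaining and max(post.values()) < threshold:
        # Step 1: pick the item expected to leave the least uncertainty.
        name = min(remaining, key=lambda n: expected_entropy(post, remaining[n]))
        item = remaining.pop(name)
        # Steps 2 and 3: administer, score, and update.
        post = update(post, item, answer(name))
        administered.append(name)
    return post, administered


post, used = sequential_test(lambda name: True, {"master": 0.5, "non-master": 0.5})
print(used, {s: round(p, 2) for s, p in post.items()})
```

Other selection and stopping rules fit the same loop; only `expected_entropy` and the `while` condition would change.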
56. Entropy
A measure of the disorder of a system.
How many bits of information are needed to send
a) 1,000,000 random signals
b) 1,000,000 zeros
$$H(S) = -\sum_{k=1}^{K} p_k \log_2 p_k$$
57. Less peaked = more uncertainty = more entropy
[Figure: two bar charts of the probabilities of Non-Master and Master (y-axis 0.0 to 1.0), labeled H(s) = 1.00 and H(s) = 0.72]
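The two entropy values can be checked with a short Python function. The sketch below assumes the standard Shannon entropy in bits. The 0.5/0.5 and 0.2/0.8 distributions are assumptions chosen because they reproduce the stated values of 1.00 and 0.72; the exact bar heights in the figure are not recoverable from the transcript.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits: H(S) = -sum over k of p_k * log2(p_k)."""
    return -sum(p * log2(p) for p in probs if p > 0)


# Flat distribution: maximum uncertainty for two mastery states.
print(round(entropy([0.5, 0.5]), 2))

# More peaked distribution: less uncertainty, so less entropy.
print(round(entropy([0.2, 0.8]), 2))
```

Terms with a probability of zero are skipped, following the usual convention that they contribute nothing to the sum.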
58. Adaptive Testing
[Figure: percent classified vs accuracy as a function of the maximum number of items administered (NAEP items); x-axis: max no. of items (0 to 50), y-axis: proportion (0.2 to 1), series: Accuracy and Classified]
59. Recap
• Simple framework
• Small number of items
• Classifies as well as or better than
much more complicated IRT
• Effective for adaptive testing
• Small sample sizes
• Well developed science
60. Option For
• Small certification programs
• Large certification programs
• Embedded in instructional systems
• Test preparation
61. HANDBOOK OF TEST SECURITY
• Editors - James Wollack & John Fremer
• Published March 2013
• Preventing, Detecting, and Investigating Cheating
• Testing in Many Domains
– Certification/Licensure
– Clinical
– Educational
– Industrial/Organizational
• Don’t forget to order your copy at www.routledge.com
– http://bit.ly/HandbookTS (Case Sensitive)
– Save 20% - Enter discount code: HYJ82
63. THANK YOU!
- Follow Caveon on twitter @caveon
- Check out our blog…www.caveon.com/blog
- LinkedIn Group – “Caveon Test Security”
Editor's notes
Abraham Wald (October 31, 1902 – December 13, 1950) was a mathematician born in Cluj, then in Austria-Hungary (present-day Romania), who contributed to decision theory, geometry, and econometrics, and founded the field of statistical sequential analysis. He was home-schooled until college by his parents, who were quite knowledgeable and competent as teachers. He emigrated to the US to escape the Nazis.
Thomas Bayes (pronounced: ˈbeɪz) (c. 1702 – 17 April 1761) was an English mathematician and Presbyterian minister, known for having formulated a specific case of the theorem that bears his name: Bayes' theorem, which was published posthumously.
Claude Shannon is famous for having founded information theory with a landmark paper that he published in 1948. However, he is also credited with founding both digital computer and digital circuit design theory in 1937, when, as a 21-year-old master's degree student at the Massachusetts Institute of Technology (MIT), he wrote his thesis demonstrating that electrical applications of Boolean algebra could construct and resolve any logical, numerical relationship. It has been claimed that this was the most important master's thesis of all time. Shannon contributed to the field of cryptanalysis for national defense during World War II, including his basic work on codebreaking and secure telecommunications.