What questions are MOOCs asking? An evidence-based investigation
1. What Questions are MOOCs Asking? – An Evidence-Based Investigation
Eamon Costello, Mark Brown, Jane Holland
EMOOCs Conference
24th February, 2016
Graz, Austria
2. Background
• Me – @eam0
• These slides: slideshare url here
• The National Institute for Digital Learning at Dublin City University
3. MOOCs at DCU
• MOOC development and research
• Projects: SCORE2020, HOME
• Get Ready to Learn: studentsuccess.ie
4. MOOC Futures?
“Credit Barrier is the Sonic Boom of Education”
Agarwal, A. (2016). As MOOCs grow up: Innovation and accountability. Keynote address at the 3rd European MOOCs Stakeholder Summit, Graz, Austria.
7. Brightest MOOC Futures
Dillenbourg, P. (2015) Proposal for a Digital Education Strategy for Flanders Universities.
“Thinkers in Residence” Programme from KVABKoninklijke Vlaamse Academie van België
voor Wetenschappen en Kunsten.
“Sooner or later,
online tests will
be as reliable or
even more
reliable than on
campus exams”
8. Current State of Play
• The future is taken care of, but what about the present?
– How mature are (x)MOOCs?
– Research Question: What is the Quality of Multiple Choice Questions in MOOCs?
9. Multiple Choice Question (MCQ) Tests (Single Best Answer)
– Reliability: if we repeated the test, would we get the same result?
– Validity: are we measuring what we think we are measuring?
Don’t get “fooled by randomness”
Taleb, N. (2004). Fooled by randomness: The hidden role of chance in life and in the markets. Random House.
10. Best Practice
Case, S. M., & Swanson, D. B. (2003). Constructing written test questions for the basic and clinical sciences (3rd ed.). Philadelphia, PA: National Board of Medical Examiners.
11. Methodology
• Qualitative study
• Use Tarrant et al.’s (2006) diagnostic tool to analyse MCQs in MOOCs systematically
• Look for item writing flaws
Tarrant, M., Knierim, A., Hayes, S. K., & Ware, J. (2006). The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Education Today, 26(8), 662-671.
12. Tool
1. Ambiguous or unclear information
2. Negatively worded stem (not, incorrect, except)
3. Implausible distracters
4. Gratuitous information in stem
5. More than one or no correct answer
6. Longest option is correct
7. Logical cues in stem
8. Word repeats in stem and correct answer
9. Unfocused stem
10. True/false question
11. Use of “all of the above”
12. Vague terms (sometimes, frequently)
13. Absolute terms (never, always)
14. Use of “none of the above”
15. Fill-in-the-blank
16. Complex or K-type
17. Grammatical cues in sentence completion
18. Convergence cues
Tarrant, M., Knierim, A., Hayes, S. K., & Ware, J. (2006). The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments. Nurse Education Today, 26(8), 662-671.
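One way to make such a checklist operational is to encode it and route each flaw to the appropriate review pass, anticipating the part-computational, part-human analysis on slide 14. The sketch below is illustrative only; which checks the study automated is described only loosely in the editor's notes at the end of this deck, so the split here is an assumption.

# Illustrative encoding of the Tarrant et al. (2006) checklist.
# The machine/human split is an assumption, not the study's tooling.
MACHINE_DETECTABLE = {
    6: "longest option is correct",
    11: "use of all of the above",
    14: "use of none of the above",
}
HUMAN_JUDGED = {
    1: "ambiguous or unclear information",
    3: "implausible distracters",
    5: "more than one or no correct answer",
    9: "unfocused stem",
    # ... the remaining flaws from the list above
}

def review_route(flaw_id):
    # Route each flaw to the computational or the human review pass.
    return "computational" if flaw_id in MACHINE_DETECTABLE else "human"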
13. Data
Phase 1
• 12 Courses (116 MCQs)
Phase 1 + Phase 2
• 18 Courses
– from five MOOC platforms: edX, Coursera, FutureLearn, iversity and Eliademy
• Total MCQs: 204
– 934 data points
14. Analysis
Part computational (see Brunnquell et al., 2011)
Part human (qualitative)
Phase 1 – one reviewer
Phase 2 – two reviewers (inter-rater reliability; see the sketch below)
length = len("How long is a piece of string?")
Brunnquell, A., Degirmenci, U., Kreil, S., Kornhuber, J., & Weih, M. (2011). Web-based application to eliminate five contraindicated multiple-choice question practices. Evaluation & the Health Professions, 34(2), 226-238. doi:10.1177/0163278710370459
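The slides do not name the inter-rater statistic used in Phase 2; Cohen's kappa is a common choice for two reviewers. Below is a minimal, self-contained sketch under that assumption; the function name and the example ratings are invented for illustration.

from collections import Counter

def cohen_kappa(rater_a, rater_b):
    # Observed agreement minus chance agreement, normalised.
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical example: two reviewers flagging the same six MCQs
# for "implausible distracters" (1 = flaw present, 0 = absent).
reviewer_1 = [1, 0, 1, 1, 0, 0]
reviewer_2 = [1, 0, 1, 0, 0, 0]
print(cohen_kappa(reviewer_1, reviewer_2))  # ~0.67, substantial agreement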
15. Findings (n = 204)
• 202 item writing flaws (errors) in total
• At least 1 error in 112 questions (54.9%)
• More than 1 error in 57 questions (27.9%)
16. Findings (n = 204)
Item Writing Flaw | Number Detected | Percentage of Total
Absolute terms (never, always, none, all) | 30 | 14.7%
True/false question | 22 | 10.8%
Negatively worded stem | 21 | 10.3%
Ambiguous language | 19 | 9.3%
More than one correct option | 18 | 8.8%
Complex or K-type | 17 | 8.3%
Contains implausible distracters | 15 | 7.4%
17. The Longest option is the correct option (χ² [1, N = 204] = 10.75, p < 0.01)
[Chart: Frequency of Position of Correct Option in 113 Four-Option MCQs (χ² [1, N = 113] = 7.46, p = 0.06); correct option in 1st position 14%, 2nd 27%, 3rd 31%, 4th 27%]
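For the curious, the reported statistic can be approximately reproduced with SciPy. The counts below are reconstructed from the chart's percentages (14%, 27%, 31%, 27% of 113) rather than taken from the study's data; note that four positions give three degrees of freedom, which yields the reported p ≈ 0.06.

from scipy.stats import chisquare

# Counts of the correct option's position across 113 four-option
# MCQs, reconstructed from the reported percentages (not exact data).
observed = [16, 31, 35, 31]  # positions 1-4

# Null hypothesis: the correct option is equally likely in any
# position (expected count 113/4 = 28.25 per position).
stat, p = chisquare(observed)
print(round(stat, 2), round(p, 2))  # 7.46 0.06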
18. Potential Impact?
• Downing’s (2005) study
– found 33–46% of MCQs flawed
– 10–15% of students who failed should have passed
Downing, S. M. (2005). The effects of violating standard item writing principles on tests and students: The consequences of using flawed test items on achievement examinations in medical education. Advances in Health Sciences Education, 10(2), 133-143.
19. Recommendations
• Peer review of questions by trained question writers before students take tests
• Post-facto item analysis of your MCQs (Classical Test Theory; see the sketch after this list)
• Better tools for developing and analysing MCQs in MOOCs
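As an illustration of the item-analysis recommendation, here is a minimal sketch of post-facto item analysis under Classical Test Theory: each item's difficulty (proportion correct) and discrimination (point-biserial correlation between the item and the rest of the test). The response matrix and all names are hypothetical, not data from the study.

from statistics import mean, pstdev

responses = [          # hypothetical 0/1 scores: 5 students x 4 items
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]
totals = [sum(row) for row in responses]

for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    difficulty = mean(item)                       # proportion correct
    rest = [t - i for t, i in zip(totals, item)]  # score on other items
    mi, mr = mean(item), mean(rest)
    cov = mean((i - mi) * (r - mr) for i, r in zip(item, rest))
    sd = pstdev(item) * pstdev(rest)
    discrimination = cov / sd if sd else 0.0      # 0 if item has no variance
    print(f"item {j + 1}: difficulty={difficulty:.2f}, "
          f"discrimination={discrimination:.2f}")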
20. Now that this engaging presentation has finished, do you wish to?
Editor's notes
Several flaws can be computationally determined, while others require the opinion of a human evaluator. Those that may be identified by means of simple calculation include: a longer-than-average correct option, the position of the correct option, and the inclusion of options such as "all of the above" or "none of the above".
For instance, it has been shown that the longest option provided within an MCQ is frequently the correct one. This is due to a cognitive bias of (untrained) question writers, who first compose the correct answer, taking due care in doing so, but then create the distracters (incorrect options), on which they may spend less time. The length of each option was calculated by counting the number of characters in it. The number of options and the number of correct options were calculated similarly, as was the position of the correct option. Once again, the available evidence base suggests that the third (of four options) is most frequently the correct one. The strings "all of the above" and "none of the above" were programmatically detected. These options are considered flawed in the Tarrant, Knierim, Hayes and Ware (2006) framework, and this finding is supported by recent research (Pachai, DiBattista, & Kim, 2015).
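A rough sketch of the checks just described, assuming each MCQ is stored as a dict with an options list and the index of the correct option; the field names and the example question are mine, not the study's actual tooling.

def longest_option_is_correct(mcq):
    # Flaw 6: a correct answer longer than every distracter is a
    # length cue for test-wise students.
    lengths = [len(option) for option in mcq["options"]]
    return lengths[mcq["correct"]] == max(lengths)

def uses_flagged_option(mcq):
    # Flaws 11 and 14: "all of the above" / "none of the above".
    flagged = {"all of the above", "none of the above"}
    return any(option.strip().lower() in flagged for option in mcq["options"])

example = {
    "options": [
        "Paris",
        "The capital and most populous city of France",
        "Lyon",
        "None of the above",
    ],
    "correct": 1,
}
print(longest_option_is_correct(example))  # True: length cue present
print(uses_flagged_option(example))        # True: "none of the above" used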
Using the review tool, the evaluators answered questions such as: "are the distracters plausible?", "is the question error free?", "is the language of the question ambiguous?", and so forth.
Laziness, impatience and hubris.