Liz Norman Massey University, ANZCVS Science Week Plenary 2013 page 1
Measuring what we want to measure
Writing excellent questions for
College examinations
Liz Norman
Massey University
Why is question writing
important?
So that we can have the best chance of measuring what we want to measure
The process of assessment is a judgement process, and those of you who have ever examined anyone will
know that sometimes that judgement seems easy, and at other times it is very challenging.
We make inferences based on evidence during that judgement process. The questions are there to collect
evidence.
We need to be sure they are collecting evidence of the right thing.
So today I am going to first discuss how we go about deciding what it is we are trying to measure.
Then we will look at the advantages and disadvantages of different question types for that purpose.
We will look at some aspects of designing long answer questions and some of the traps.
And finally we will look at some aspects of designing MCQs and some of the traps.
What we want to measure
The candidate will have a detailed knowledge of:
The aetiology, pathogenesis and pathophysiology of
cardiac, renal, respiratory, alimentary, musculoskeletal,
endocrine, ophthalmological and neurological organ
dysfunction in the cat and the dog.
The candidate will be able to, with a detailed level
of expertise:
Analyse complex clinical problems and make sound
clinical judgements.
The subject guidelines for each subject specify the scope for both knowledge and skills. Some skills are
technical and assessed through credentialing. Some are cognitive skills such as this one and are assessed in
our examinations.
Scope - breadth
Pathophysiology | Investigation and diagnosis | Treatment and management
Gastrointestinal P1Q1 P1Q1, P2Q4
Cardiovascular P1Q4 P2Q2 P2Q2
Nervous P1Q3, P2Q1
Endocrine P1Q3 P2Q3
Musculoskeletal P2Q5
So these form the topics that will be covered by the questions. Blueprinting is a good way to ensure that
the whole subject is sampled representatively across the 3-4 components of the exam. It is why the
whole exam (all 3-4 components) needs to be designed at once. Note that questions often span more than
one category.
Knowledge levels:
Detailed knowledge — candidates must be able to
demonstrate an in-depth knowledge of the topic
including differing points of view and published
literature. The highest level of knowledge.
Sound knowledge — candidate must know all of the
principles of the topic including some of the finer detail,
and be able to identify areas where opinions may
diverge. A middle level of knowledge.
Basic knowledge — candidate must know the main
points of the topic and the core literature.
Currently the College templates for subject guidelines specify the level of knowledge in this way.
In a way this is only showing the level of detail required.
Knowledge isn’t all about recalling a level of detail though. Experts are able to use their knowledge in
appropriate ways, recall the information in appropriate situations and apply it to those situations to solve
problems.
It isn’t just about knowledge, but what can be done with the knowledge.
Skill levels:
Detailed expertise — the candidate must be able to
perform the technique with a high degree of skill, and
have extensive experience in its application. The
highest level of proficiency.
Sound expertise — the candidate must be able to
perform the technique with a moderate degree of skill,
and have moderate experience in its application. A
middle level of proficiency.
Basic expertise — the candidate must be able to
perform the technique competently in uncomplicated
circumstances
These skill levels don’t just apply to technical skills (psychomotor skills) but also to the way knowledge is
used.
Candidates need base knowledge in order to “analyse complex clinical problems and make sound clinical
judgements” and so if you aim to assess the cognitive skills you will also be assessing the knowledge base of
the candidate.
Fact recall vs applied
Fact recall:
Questions capable of being answered by reference to one
paragraph in a text or notes (or several paragraphs for
questions requiring recall of several facts)
Applied (higher order)
Questions that require the use of facts or concepts, the
solution of a diagnostic or physiologic problem, the
perception of a relationship, or other process beyond
recalling discrete fact
From: Peitzman et al. (1990). Academic Medicine, 65(9), S59-60.
Questions that assess cognitive skills are applied or higher order questions, as opposed to fact recall
questions. This operational definition comes from a research paper, and I find it helpful for working out
whether a question is higher order or fact recall. If the answer can be looked up and appears in one
paragraph or on one page of a textbook, then really it’s fact recall.
Note that even complex judgements can become fact recall for candidates once someone writes a review
paper or textbook chapter that sums up the complexity.
Level - depth
Pathophysiology | Investigation and diagnosis | Treatment and management
(each column split into: recall | higher order)
Gastrointestinal P1Q1 P1Q1, P2Q4
Cardiovascular P1Q4 P2Q2 P2Q2
Nervous P1Q2, P2Q1
Endocrine P1Q3 P2Q3
Musculoskeletal P2Q5
You can also categorise your questions into recall and higher order on your blueprint to check that you
mostly have higher order (as is appropriate for membership and fellowship) across the examinations.
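One way to run this check is to record each question part against its topic, component and level, and then tally the levels. The sketch below is a minimal illustration only: the entries and their level tags are hypothetical (the blueprint above does not state which level each question sits at), and `blueprint_summary` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical blueprint entries: (question, body system, component, level).
# The placements and level tags are invented for illustration only.
BLUEPRINT = [
    ("P1Q4", "Cardiovascular", "Pathophysiology", "recall"),
    ("P2Q2", "Cardiovascular", "Investigation and diagnosis", "higher order"),
    ("P2Q2", "Cardiovascular", "Treatment and management", "higher order"),
    ("P2Q5", "Musculoskeletal", "Treatment and management", "higher order"),
]

def blueprint_summary(entries):
    """Return (count of blueprint cells per level, share that are higher order)."""
    levels = Counter(level for _, _, _, level in entries)
    total = sum(levels.values())
    share = levels["higher order"] / total if total else 0.0
    return levels, share

levels, share = blueprint_summary(BLUEPRINT)
print(dict(levels))
print(f"higher order share: {share:.0%}")
```

A low higher order share would be a prompt to revisit the draft questions before the examination is finalised.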
What we don’t want to measure
• Ability to take tests
• Ability to write legibly and fast
• Ability to rote learn whole pages of textbooks or
review articles - prewriting
• Ability to write down a huge series of unconnected
facts in no particular order
• Ability to research examiner's fields of interest and
rote learn impressive aspects of that
• Ability to interpret what examiners are thinking
There are a whole lot of things we don’t want to measure, and these are just some.
Ability to take tests
Ability to write clearly and fast
This is why it is important to pace examinations so that candidates do have time to write legibly
Ability to rote learn whole pages of textbooks or review articles – prewriting (where candidates predict and
then prelearn answers)
This is why we need to avoid fact recall questions but get candidates to use their knowledge in
examinations
Ability to write down a huge series of unconnected facts in no particular order
We need to ensure we don’t reward this in our marking schemes. Importantly as we move away
from fact recall Qs to higher order Qs it becomes the quality of the answer that is most important
more than the quantity of facts the candidate writes down
Ability to research examiner's fields of interest and rote learn impressive aspects of that
This is why examinations need to be blueprinted against the subject guidelines, not examiners’
interests
Ability to interpret what examiners are thinking
This is why we need to give clear instructions in our questions, which is what we will talk about in the
next section
Types of questions
Stimulus formats: Recall knowledge | Apply knowledge
Question types can be categorised broadly by two aspects
The stimulus format: fact recall or applied knowledge
Stimulus formats: Recall knowledge | Apply knowledge
Response formats: Selected response | Constructed response
The response format: what the candidate does to indicate their response:
Selected response: eg MCQs where candidates select their response from prespecified options.
Constructed response – eg long and short answer Qs where candidates generate their own response
Stimulus formats (columns): Recall knowledge | Apply knowledge
Response formats (rows):
Selected response: Selected recall | Selected applied
Constructed response: Constructed recall | Constructed applied
Both selected response and constructed response questions can be fact recall and both can be applied
knowledge types
The literature is clear that it is the stimulus format (fact recall vs applied) that is the most important
determiner of what is measured, and the response format is of less importance
Advantages of constructed
response Qs (long answer)
• Non-cued writing – can measure what
candidates spontaneously think of
• Easy to create
• Logic, reasoning, steps in problem solving
• Ease of partial credit scoring
• In-depth assessment
Non-cued writing – can measure what candidates spontaneously think of
Cueing means that a candidate can answer a multiple-choice question correctly by recognising the
correct option, rather than by generating the answer spontaneously. Cueing clearly exists, as you
will recognise if you think about how you are thinking when you look at an MCQ. Your strategy is often
to look at the options first and try to recognise the correct answer rather than working it out.
Easy to create
Logic, reasoning, steps in problem solving
Ease of partial credit scoring
In-depth assessment
A good long answer question asks the candidate to process information or knowledge rather than
to reproduce it, by, for example, requiring candidates to set up a reasoning process or summarise
information, or asking them to apply a known principle in different contexts, etc
Limitations of constructed response
Qs (long answer)
• Subjective scoring
• Reproducibility issues
• Limited breadth of content
• Inefficient?
– Marking time
– Testing time
• Quality control tends to be qualitative
Subjective scoring
This is a frequently levelled criticism, but I do not see it as a big problem. Assessment is a process
of judgement, similar to diagnosis. Rather than trying to take the judgement out of the equation,
we need to ensure that the judgement is made on good, relevant and sufficient evidence, by
appropriately qualified judges.
Reproducibility issues
This is referring to the agreement between different judges or the same judge looking at it at a
different time. This can be a problem, but as above, is one we should acknowledge is inherent in
complex judgement, and we need to take other approaches to quality control.
Limited breadth of content
This is a definite limitation we need to be aware of. Because the questions take longer than say
MCQ questions, we can ask far fewer and therefore we are taking a much smaller sample of the
topics in the subject. We need to remember that the smaller the sample, the less able we are to
generalise the performance of the candidate to the whole subject area, which is actually the aim of
the examinations.
Inefficient?
This is often said, but only true when you are talking about large numbers of candidates (like
1000s). Long answer questions definitely take longer to mark than short answer questions, and also
you need longer testing time in order to ask a reasonable number of questions to sample the
topics. However because good MCQs take at least as long to write as good long answer Qs, and you
need more of them, the time trade off only breaks even when you have a large number of
candidates, so this is not relevant for the College.
Quality control tends to be qualitative
Rather than using statistical quantitative methods of quality assurance, we need to use more
qualitative methods, which are aimed at ensuring and documenting the trustworthiness, credibility
and dependability of the judgements. This includes having more than one examiner, checking for
agreement in decisions, ensuring the expertise of examiners and triangulating evidence across all
examination components and credentialing.
Advantages of MCQ
• Can test a wide breadth of subjects in a time-efficient
manner
• Sampling across the subject may therefore be more
representative
• Less predictability - fosters a deep approach to learning
• Reliability
• Can construct examinations of known difficulty
(assuming psychometric analysis carried out)
• Efficient and cost effective for large numbers of
candidates
• Possibility of automated development
Many of these we have already discussed in comparison to long answer questions
• Can test a wide breadth of subjects in a time-efficient manner
• Sampling across the subject may therefore be more representative
• Less predictability - fosters a deep approach to learning
Good MCQs (that test higher order thinking) are not as easily predicted by students and studies
have demonstrated that they foster deep learning.
• Reliability
Very reliable since there is no judgement involved in scoring
• Can construct examinations of known difficulty (assuming psychometric analysis carried out)
• Efficient and cost effective for large numbers of candidates
• Possibility of automated development
See next slide
In this study experts developed a schema for decision making in a particular scenario (diagnosis of wound
infection), which was then used to automatically generate 1248 different MCQs.
Developing the schema is obviously still resource intensive, but the approach offers an exciting possibility
for future automated generation.
Disadvantages of MCQ
Realising the advantages requires procedures
which make them resource intensive and
expensive
– Creation of a large question bank
– Pretesting and statistical analysis of Qs
– Post examination statistical analysis
Realising the advantages requires procedures which make them resource intensive and expensive
• Creation of a large question bank
If you want to draw questions from a bank rather than create new each year, you need the bank to
contain 10 times the number of questions you will be drawing. Therefore for a 2 hour, 120 MCQ
exam you need a bank of at least 1200 questions written
• Pretesting and statistical analysis of Qs
Ideally MCQs should be pretested on a sample similar to candidates – eg existing members or
fellows. You need to do this to detect problems with questions because problems with questions
are common – some studies have found flaws in 36-65% of questions of which 10-15% are serious
enough to influence pass-fail decisions
• Post examination statistical analysis
Sophisticated statistical analysis should be performed to set the passing cut point and check the
performance of questions
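As a quick check of the question bank arithmetic in the notes above, the sketch below applies the tenfold rule of thumb to the 2 hour, 120 MCQ example. The function name is illustrative, and the multiplier is simply the figure quoted in the notes.

```python
def required_bank_size(questions_per_exam, multiplier=10):
    """Bank size needed if the bank must hold `multiplier` times the questions drawn."""
    return questions_per_exam * multiplier

exam_questions = 120      # from the example in the notes
exam_minutes = 2 * 60     # a 2 hour paper

print(required_bank_size(exam_questions))   # 1200
print(exam_minutes / exam_questions)        # 1.0 (minutes available per question)
```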
Problems that can occur with MCQs
• Candidates can’t indicate their interpretation of the Q
• Fact recall Qs are easier to write therefore tend to
dominate
• Some topics are particularly difficult to write MCQs for
• Identification of a correct response requires a
different type of thinking from candidates than
generation of a response
• Guessing can be rewarded
• What is correct is still a subjective decision
• Circulating recall papers may reduce even higher
order Qs to recall Qs
Candidates can’t indicate their interpretation of the Q
If a long answer question turns out to be poorly worded or unclear, examiners can see this in
the answer, or candidates can say so in the answer; neither is possible with MCQs (which
is why pretesting is more critical for MCQs)
Fact recall Qs are easier to write therefore tend to dominate
Some topics are particularly difficult to write MCQs for
Identification of a correct response requires a different type of thinking from candidates than generation of
a response
Guessing can be rewarded
This is often of concern but shouldn’t be. Even with 3 option MCQs (where there is a 33% chance of
getting the answer right from guessing) the probability of scoring 70% correct on a 30 question test
is 0.0000356. In any case, candidates with any degree of preparation will use partial knowledge to
select answers rather than guessing strategies.
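The figure quoted for guessing can be checked with the binomial distribution. The sketch below assumes that every one of the 30 questions is answered by pure guessing, independently, with a 1 in 3 chance of being right. Under that assumption, 0.0000356 is the probability of getting exactly 21 of 30 (70%) correct, and the probability of getting 21 or more correct is only slightly larger.

```python
from math import comb

def prob_exactly(k, n, p):
    """Binomial probability of exactly k correct out of n, each with chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n, p):
    """Probability of k or more correct out of n."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))

n, p = 30, 1 / 3     # 30 questions, 3 options each, pure guessing (assumed)
k = 21               # 70% of 30

print(f"exactly {k}/{n}:  {prob_exactly(k, n, p):.7f}")   # 0.0000356
print(f"at least {k}/{n}: {prob_at_least(k, n, p):.7f}")  # 0.0000443
```

Either way, a passing score from guessing alone is a roughly 1 in 20,000 to 1 in 30,000 event, which supports the point that guessing should not be a major concern.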
What is correct is still a subjective decision
See the next slide
[Figure: options arranged on a scale from least correct to most correct, in the order D, E, A, C, B]
In biological systems we all know that nothing is ALWAYS true and you can never say never. Therefore
unless questions are about trivial content (which they should not be) both the correct option and the
distractors are likely to have some degree of truth to them. This is especially the case for the type of
questions we really want to design, which aim to test candidates’ knowledge and decision making in complex
situations that require judgement. While it is important that the answer keyed as “correct” is much more
correct than the other possible answers, there will always be an element of judgement in this decision.
Circulating recall papers may reduce even higher order Qs to recall Qs
This is just as much a problem with long answer Qs as with MCQs. But MCQs are so expensive to produce
that they tend to be held for reuse – hence the problem.
This shows a screenshot from a website called PasstheFRACP.com, where FRACP refers to the Fellowship of
the Royal Australasian College of Physicians. Several past recall papers are available on this site, and
undoubtedly there are many others for all sorts of examinations that are not so freely available.
General issues with question writing
Communication
The examination questions are the question setter’s
expression of the question setter’s task.
The candidate’s answer represents the candidate’s
expression of the candidate’s interpretation of the questions.
The marker evaluates the marker’s interpretation of the
candidate’s expression of the candidate’s answer.
The marker uses the marker’s interpretation of the setter’s
expression of the setter’s task to evaluate the candidate’s
answer.
Modified from Pollitt & Ahmed (1999)
Exams are a communication between 3 people: the candidate, the marker and the question setter. In only
some cases is the question setter the same person as the marker.
Each of these people has their own interpretation of the communication.
• The question setter has a task in mind and expresses this in writing. Anyone reading the question is
interpreting the words in order to arrive at their own understanding of the task the question setter had
in mind. As you can imagine, if the task is not specified very precisely and clearly there is plenty of room
for things to go wrong at this step.
• The candidate is one of the people interpreting the question. They formulate an answer, and then have
to express that in words on the page. Their expression may or may not represent their answer very well.
You can imagine that many factors can interfere with this process, not all of which are things we are
trying to differentiate candidates on.
• The marker has to interpret the candidate’s expression of their answer on the page and from it, they are
making inferences about what the candidate knows and can do. The inferences may be well founded, or
perhaps more tenuous. The candidate’s expression can contribute a lot to the interpretations made by
the marker.
• In order to make an evaluation of the candidate’s performance, the marker must also interpret the
question, using the setter’s expression of the task the setter had in mind. The evaluation the marker
makes of the task may be different to the evaluation the candidate makes of the task.
Expectations and stereotypes
Examples:
• male animal case
• differential diagnoses candidate would consider
• expectation of hard questions
• expectation that Qs will ask about what
something is rather than what it is not
All of us develop schemas or stereotypes which help us categorise and process complex information
quickly. Particular features of questions trigger certain schemas and hence expectations. Anxiety can make
us “close” on a certain schema too quickly and not look for others, therefore exams can be measures of a
propensity to anxiety rather than a measure of what we want to measure. In addition, since having well
developed schemas is a mark of expertise, very good candidates may be expected to make much use of
schemas.
Question writers need to be aware of the existence of such schemas and ensure that they very clearly
signal if they want candidates to step out of these schemas. For example a scenario about a male animal
will likely trigger schemas to do with differential diagnoses of male animals. If you want candidates to
discuss differential diagnoses of both male and female animals, you will need to understand that the male
animal schema may already have been closed on in the candidate’s mind, and you will need to make it very
clear that you want them to change out of this if this is the case. Or, perhaps better still, redesign the
question to account for the likely use of this schema.
Other examples of schemas that might be elicited by questions include:
• All differential diagnoses for a clinical sign vs those only applicable in a particular case
• The expectation that questions should be hard, which may prevent candidates from seeing the easy
solution to a question
• The expectation that Qs will ask about what something is rather than what it is not
Question writers can reduce the negative effects of expectations by:
• using clear language,
• including only relevant and authentic scenarios,
• being clear about the kind of answer and level of answer required,
• being aware of the kind of implicit expectations that come into play in reading comprehension
processes, and
• using very clear signalling if questions contradict expectations (eg using bold font).
Contextualising Qs
• Context is good because it brings relevance and
authenticity
• Allows assessment of concrete or specific examples
not abstract concepts or generalisations
• Allows assessment of applied learning (doing not
just knowing)
• All these carry with them a potential for bias.
Context is good because it brings relevance and authenticity
Allows assessment of concrete or specific examples not abstract concepts or generalisations
Allows assessment of applied learning (doing not just knowing)
All these carry with them a potential for bias.
Relevance is personal
Concrete examples will be more familiar to some candidates than others
Application of knowledge may actually just be recall if candidates have considered that example or
a similar one in their learning
Context activates concepts in the mind and therefore may activate the wrong contexts or schemata (as we
saw in the last slide)
Writing long answer questions
Q parts – the task
• For MCQs the task is given at the beginning of
the examination or MCQ section:
Choose one BEST answer
• For long answer Qs you need to specify the task
Don’t write questions; write tasks
Instead of:
What is your diagnosis?
write:
State the most likely diagnosis
or
State the most likely diagnosis and explain your reasoning
or
Discuss the differential diagnoses you would consider in this case
or
…..
Tell candidates what you want them to do rather than asking them a question
They need to know whether you intend for them to write a one word answer, or to explain it or justify it
and so on.
If you do not make this clear they will give you a “just in case” answer, and their answer may seem to you
to be unfocused or off topic. In addition they will waste time on this instead of concentrating on other
questions in the paper.
Instructional verb examples
Compare: to find similarities between things, or to look for characteristics
and features that resemble each other.
Contrast: to find differences or to distinguish between things.
Discuss: to present a detailed argument or account of the subject matter,
including all the main points, essential details, and pros and cons of the
problem, to show your complete understanding of the subject.
Define: to provide a concise explanation of the meaning of a word or
phrase; or to describe the essential qualities of something.
Explain: to clarify, interpret, give reasons for differences of opinions or
results, or analyse causes.
Illustrate: to use a picture, diagram or example to clarify a point.
You need to provide an instructional verb for all questions
Specify boundaries of the answer
Species
e.g. “in both dogs and cats…”
Quantities and amounts
e.g. “Provide 5 reasons why…”
With reference to
e.g. “With reference to the published research from ..”
You also need to specify the boundaries of the answer required.
Q parts – the scope
Instead of:
List the clinical signs of hypothyroidism in dogs.
write:
List the three most common owner-observed clinical signs of hypothyroidism in dogs and
explain how thyroid hormone deficiency leads to each of these signs.
An example of specifying the boundaries or scope
Examples of problems….
Name two (2) diagnostic tests you would run next to
investigate the cause of this dog’s current illness.
This type of question might come after a scenario. While it seems like a perfectly reasonable question, think
about the words “diagnostic tests”, what they really mean and what sort of schema they elicit.
Potential problems for candidates include:
• The words diagnostic tests might only elicit schema containing a narrow set of types of investigation
such as laboratory tests, and not include things like imaging or taking an animal’s temperature.
• The specification of 2 diagnostic tests is unclear because different candidates and examiners may
interpret what is one test differently. For example is a biochemical panel one test or 17? Is a PCV and
TPP, commonly performed together, one test or two? Even if a candidate recognises these issues, they
may have trouble deciding what to do and waste time worrying about it, and anxiety may affect their
performance so we end up not measuring what we want to measure.
Examples of problems
Outline your approach to confirming the initial
clinical diagnosis and a management and
prevention plan for this problem. This discussion
should include an outline on further observations
taken about ….
Here the examiner’s instructions request an outline (a description of the main features, a sketch in general
terms, or a summary) but then the wording suggests that a discussion should have been completed (an
examination of the argument, a sifting of considerations for and against, a debate). Because outline appears
first, the candidate may have already “closed” on outline and may not even notice the word discussion. (Did
you notice it when you read the question?).
Examples of problems
…list in dot point form: the gross pathological
features, the characteristic histopathological
changes, and the clinical pathology changes. In
your discussion, list one antemortem
test/procedure that can be used to aid in the
diagnosis …
Similarly, in this question the first instruction to list the answer is then contradicted by the suggestion that
a discussion should actually have been completed.
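A simple automated check can flag this kind of mismatch before a paper goes to review. The sketch below is only an illustration: the word list and the `instruction_families` helper are assumptions made for this example, the sample question is an abridged version of the one shown earlier, and plain keyword matching will miss many cases. It complements, rather than replaces, reading each question alongside its marking scheme.

```python
import re

# Hypothetical mapping from wording in a question to the instruction it implies.
INSTRUCTION_WORDS = {
    "list": "list",
    "outline": "outline",
    "discuss": "discuss",
    "discussion": "discuss",
    "explain": "explain",
    "describe": "describe",
}

def instruction_families(question):
    """Return the distinct instructions signalled by the wording, in order of first use."""
    found = []
    for word in re.findall(r"[a-z]+", question.lower()):
        family = INSTRUCTION_WORDS.get(word)
        if family and family not in found:
            found.append(family)
    return found

# Abridged from the example question above.
q = ("Outline your approach to confirming the initial clinical diagnosis. "
     "This discussion should include an outline on further observations.")
families = instruction_families(q)
if len(families) > 1:
    print("Possible mixed instructions:", families)
```

A flag here is a prompt to look again, not proof of a fault: a question can legitimately combine two instructions, such as "list ... and explain ...", as long as each applies to a clearly separate part of the task.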
Examples of problems
A veterinarian asks you for assistance in designing
a protocol for the delivery of a vaccine for cats in
their practice. What factors would you take into
consideration in designing this protocol?
Here is an example of where a technical term that is also used in everyday language “delivery” could elicit
everyday schemas rather than technical schemas. For example does the question refer to delivery of the
vaccine from the manufacturer to the practice, or the process of injecting the cat, or the recommended
intervals for administering vaccinations?
Examples of question problems
Are there any clinical features which can help you
determine a patient’s prognosis?
Here is an example appearing after a scenario, where the wording of the question suggests only a yes-no
answer. Presumably the answer “yes” is 100% correct in this situation.
Candidates will usually supply more detail in these situations regardless, not because the task is clear, but
because the schema they invoke will suggest to them that more explanation of their response is
required.
However if a simple answer is all that the examiner requires, this will be a waste of their time. Therefore it
is important to be very clear.
Examples of question problems
Describe and discuss the following:
a) preparedness
Here is an example where a word can have different meanings for different groups of people. For example,
in NZ, preparedness has an everyday meaning that is very different to what is likely required of candidates
in a veterinary behaviour examination.
Examples of question problems
State what you believe is your most likely
diagnosis.
Here the question asks candidates to say what they believe. There is no wrong answer since the
candidate’s belief is their belief whether it is true or not. What the examiner really wants to know is
whether the candidate’s belief is the same as the examiner’s own belief about the answer, or is justifiable
in some other manner.
Examples of question problems
Discuss commonly found tumours and tumour-like
disorders associated with the oral cavity and
dental tissues of the horse.
Here is an example of a question where the scope is potentially endless.
For example you could discuss clinical signs, or the histopathological diagnosis including special
immunological tests, or the treatment, or….
The scope needs to be clearly defined for candidates.
Examples of question problems
How would you localise the site of the lesion?
This question appeared after a neurological scenario was presented. The wording suggests the required
answer would involve the diagnostic methods you would use to arrive at the site of the lesion.
However this was the marking scheme provided by the question writer:
Spinal lesion between T3 and L3
Clearly the question does not actually ask for the answer the examiner wants. Note that it is impossible to
detect this fault without also seeing the marking scheme and hence the need to always evaluate both
together when checking the wording of questions.
Writing MCQs
Focus on a single important concept
• Test application of knowledge not recall
• Don’t test “trivial” knowledge
• Focus on real life problems
• Clinical vignettes are a good basis for a Q
Placing Qs in vignettes does not increase the difficulty for high performing candidates but does increase the
difficulty for low performing candidates
A 7-year-old mare has had intermittent signs of moderately
severe colic for the past 48 hours. Heart rate is 56
beats/min. Hydration, acid-base balance, and electrolyte
parameters are near normal. On rectal examination, the left
dorsal and ventral colon feels distended and is felt coursing
in a dorsocranial direction. The spleen is displaced
caudomedially. Which of the following is the most likely
diagnosis?
A. Cecocolic intussusception or cecal inversion
B. Displacement of the left colon over the nephrosplenic
ligament
C. Ileocecal intussusception
D. Infarction of the large colon
E. Volvulus of the large colon and the cecum
An example of a clinical vignette. It assesses ability to diagnose, so it is a deep question.
The stem is relatively long, and each option is relatively short. This is the overall structure to aim for.
Keep options short
Iris prolapse is a common sequel to penetrating corneal
wounds or ruptured corneal ulcers. Which of the following
steps is NOT appropriate for the treatment of iris prolapse?
A. primary closure of the corneal laceration with 8-0 vicryl
and treatment with topical antibiotics to control
infection.
B. placement of a nictitans flap and treatment with
systemic antibiotics to control infection.
C. placement of a corneal graft with an overlying
conjunctival pedicle graft and treatment with systemic
antibiotics to control infection.
Avoid long, complex options like those in this one. Try to put as much as possible into the stem and
leave the options short.
Pose a clear task
Chyle is :-
A. The semifluid mass which empties from the
stomach into the duodenum.
B. The lymph containing fat droplets found in the
lacteals of the small intestine.
C. The contents of the gall bladder.
D. The rounded piece of chewed food which passes
down the oesophagus when the animal swallows.
This is not a good question. You need to pose a clear task. This question has no task. If you covered up the
options you would not be able to predict what the answer was by reading the stem.
Make all distractors plausible and
homogenous
Which of the following statements regarding hepatic
encephalopathy is true?
A. Patients typically present with asymmetrical neurological deficits
B. The most effective and appropriate anticonvulsant to use for a
patient that is seizuring due to hepatic encephalopathy is
phenobarbital
C. Abdominal radiographs of dogs with portosystemic shunts will
often show an enlarged liver
D. Cats with portosystemic shunts often exhibit ptyalism as a clinical
sign
E. An appropriate treatment for hepatic encephalopathy is
intravenous neomycin
This question also has a problem. The incorrect and correct options are about different things – some are
about clinical signs, some about diagnostic investigation and some about treatment. Instead you should
aim to have all the options homogenous – all about diagnosis, all about treatment etc.
As mentioned before, there is a tendency for only trivial facts to be 100% true or 100% false in biological
systems. The types of concepts we really want to examine though are more difficult and subject to
exceptions that mean they are not always true. Therefore in this sort of question you are trying to work out
which is more true than the others on a scale of trueness. If the options are not all on the same scale of
trueness because they are about different things, then the question becomes irrelevantly difficult or even
impossible to answer. See the next slide for an example that will help explain this concept.
Include only ONE best answer
Which of the following is true?
A. Bananas are green
B. Motorbikes are faster than cars
C. Boys are taller than girls
Here is an extreme example to illustrate the issue raised in the last slide. This question is impossible to
answer.
The reason is that all the options are true sometimes and false at other times. To answer it, you
have to work out which is the most true, but you are comparing completely different things. Is
“motorbikes are faster than cars” more or less true than “boys are taller than girls”? Impossible to say.
Include only ONE best answer
[Figure: a single scale running from “Least correct” to “Most correct”, with the options placed along it in the order D, E, A, C, B.]
This figure shows the concept diagrammatically. All the items on a multiple choice question should lie on a
single scale of trueness. While we expect that in non-trivial questions options will not all be 100% true or
100% false, there must be one option that is well separated on the trueness scale from the distractors
(incorrect options).
Avoid none of the above
Which of the following is true regarding ligament injuries?
A. Ligament injuries are appropriately referred to as
“strains” or “sprains”.
B. Surgical intervention is indicated for treatment of
second-degree sprains with demonstrable instability.
C. The elastic nature of ligaments allows 30% elongation
before permanent deformation.
D. Following surgical repair of ligaments, immobilization
via ESF or external coaptation is contraindicated, as
range of motion is critical to successful repair.
E. None of the above.
This question has a number of problems, but one of them is the use of “none of the above” as an option.
This is problematic in questions where judgement is involved (which should be all questions in College
exams!) and where the options are not absolutely true or false.
Either remove this option altogether or fix it by replacing it with an option that is more specific. For
example, if the options are a list of possible drugs to prescribe, an option of “no drug should be given at this
time” would be better than “none of the above.”
Avoid negative framing
Which of the following statements is false regarding
arthrotomies?
A. When detachment of a ligament is necessary, this should be
performed by osteotomy of the bony origin rather than
transection of the ligament.
B. Complete closure of the synovium is necessary to prevent
synovial fluid leakage into subcutaneous tissue.
C. Surgical removal of osteophytes is often followed by their
relatively rapid regrowth, and has questionable value.
D. Monofilament absorbable suture material has a lower risk for
long term infection than does braided nonabsorbable suture.
E. None of the above.
You should avoid questions framed in the negative sense (“Which of the following is false…”) completely.
If you do include them, you must never include “none of the above” as an option. The question above
illustrates why.
If you were to choose option E, are you saying that “none of the above are false” is true? Or are you saying
that “none of the above are false” is false?
If “none of the above are false” is true, then A, B, C and D must all be true. Therefore there is no false
answer and the question cannot be answered.
If “none of the above are false” is false, then at least one of A, B, C and D must be false. But isn’t that a given?
So these questions pose irrelevant difficulty: irrelevant because the difficulty has nothing to do with the
learning outcomes we are trying to assess.
Avoid double options
A 2-year-old male neutered Border Collie is presented
with the following history and neurologic signs: ……..
Which one of the following neuroanatomic localizations
and diagnoses is CORRECT?
A. Left C1-C5 myelopathy, intervertebral disk rupture
B. Right C6-T2 myelopathy, fibrocartilagenous
embolism
C. Right C1-C5 myelopathy, intervertebral disk rupture
D. Left C1-C5 myelopathy, fibrocartilagenous
embolism
This question has “double options”, in that each option contains two different types of fact. It is better to
focus on one fact for each MCQ and split this type of Q into two questions.
3 options is enough
A horse suffering from an acute intestinal accident
is MOST likely to have
A. primary respiratory acidosis
B. primary respiratory alkalosis
C. primary metabolic alkalosis
D. primary metabolic acidosis
Three options for MCQs have been shown to be sufficient and there is no need to force a question to have
4 or 5 options if there is no natural list of 4 or 5 plausible options. However sometimes you do need to
include 4 options, for example in the question above, because it allows a complete set of paired options.
It is fine for MCQs to have different numbers of options within one exam.
Although you may worry about candidates guessing in three-option questions, remember that the statistical
probability of scoring exactly 70% (21 of 30 correct) through random guessing on 30 three-option MCQ items is
about 0.0000356.
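The guessing figure can be checked with a short binomial calculation. The sketch below assumes each of the 30 items is answered independently with a 1-in-3 chance of being right; the function names are illustrative and not part of the talk. Under that assumption, the chance of exactly 21 correct comes out at about 0.0000356, and the chance of 21 or more correct at about 0.0000443.

```python
from math import comb


def prob_exactly(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k correct answers out of n items."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial probability of k or more correct answers out of n items."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


n_items = 30      # number of MCQ items in the exam
p_guess = 1 / 3   # chance of guessing a three-option item correctly (assumed)
needed = 21       # 70% of 30 items

print(f"exactly {needed} correct:  {prob_exactly(needed, n_items, p_guess):.3g}")
print(f"at least {needed} correct: {prob_at_least(needed, n_items, p_guess):.3g}")
```

Either way, the chance of reaching 70% by guessing alone is a few in 100,000, which supports the point that three options are enough.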
Avoid technical item flaws
• Grammatical cues
• Logical cues: a subset of the options are collectively exhaustive
• Absolute terms: terms such as “always”, “never” or “all” used in
options.
• Vague terms such as “usually” and “frequently” used in options.
• Long correct answer: correct answer is longer, more specific, or
more complete than other options
• Word repeats: a word or phrase is included in the stem and in the
correct answer
• Convergence: the correct answer includes the most elements in
common with the other options
• Numeric data not stated consistently
• Language in the options is not parallel; options are in an illogical
order
• Stems (lead-ins) are tricky or unnecessarily complicated
As well as the things we have just gone through, there are a whole bunch of technical item flaws you will see
listed in guides for MCQ writing. These particular flaws can allow clued-in candidates to see the right
answer because of faults in the question structure. I am not going to go through all of these in detail
because advice about these is so widely available elsewhere, and these flaws are easy to avoid.
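Some of these flaws are mechanical enough to be screened for automatically before a question goes to review. The sketch below is a hypothetical checker, not a tool mentioned in the talk: the `McqItem` structure, the word lists and the flaw labels are all assumptions made for illustration. It covers only a few flaws (negative framing, “none of the above”, absolute and vague terms, long correct answer), and the toy item is loosely based on the bananas example rather than a real exam question.

```python
import re
from dataclasses import dataclass

# Illustrative word lists taken from the flaw descriptions above (assumed, not exhaustive).
ABSOLUTE_TERMS = {"always", "never", "all"}
VAGUE_TERMS = {"usually", "frequently"}
NEGATIVE_STEM = re.compile(r"\b(not|false|except)\b", re.IGNORECASE)


@dataclass
class McqItem:
    stem: str
    options: list[str]
    correct_index: int  # position of the keyed answer in options


def find_flaws(item: McqItem) -> list[str]:
    """Return a list of simple, mechanically detectable item flaws."""
    flaws = []
    if NEGATIVE_STEM.search(item.stem):
        flaws.append("stem: negative framing")
    for i, option in enumerate(item.options):
        label = chr(ord("A") + i)
        if "none of the above" in option.lower():
            flaws.append(f"option {label}: none of the above")
            continue
        tokens = set(re.findall(r"[a-z]+", option.lower()))
        if tokens & ABSOLUTE_TERMS:
            flaws.append(f"option {label}: absolute term")
        if tokens & VAGUE_TERMS:
            flaws.append(f"option {label}: vague term")
    # Long correct answer: the keyed option is longer than every distractor.
    correct = item.options[item.correct_index]
    others = [o for i, o in enumerate(item.options) if i != item.correct_index]
    if others and len(correct) > max(len(o) for o in others):
        flaws.append("correct answer is the longest option")
    return flaws


toy_item = McqItem(
    stem="Which of the following is NOT true?",
    options=[
        "Bananas are always green",
        "Motorbikes are usually faster than cars",
        "None of the above",
    ],
    correct_index=1,
)
for flaw in find_flaws(toy_item):
    print(flaw)
```

A check like this only catches surface cues. Whether the options are plausible, homogeneous and on a single scale of trueness still needs a human reviewer.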
Key points
• It’s important to think about what it is you are
looking for evidence of when designing Qs.
• Check that the Qs are going to be collecting
evidence of that, and not something else.
• Concentrate on designing Qs that test
application rather than fact recall.
“Effective item writers are trained,
not born … “
Downing and Haladyna 2006, Handbook of Test Development, p. 11
So I hope you all learned something today that will help you be more effective question writers.

Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 

Measuring what we want to measure, Liz Norman ANZCVS 2013

  • 1. Liz Norman Massey University, ANZCVS Science Week Plenary 2013 page 1
    Measuring what we want to measure: writing excellent questions for College examinations. Liz Norman, Massey University.
    Why is question writing important? So that we have the best chance of measuring what we want to measure.
    Assessment is a process of judgement, and those of you who have ever examined anyone will know that sometimes that judgement seems easy and at other times it is very challenging. We make inferences based on evidence during that judgement process. The questions are there to collect evidence, and we need to be sure they are collecting evidence of the right thing.
    So today I will first discuss how we decide what it is we are trying to measure. Then we will look at the advantages and disadvantages of different question types for that purpose. We will look at some aspects of designing long answer questions and some of the traps, and finally at some aspects of designing MCQs and some of the traps.
  • 2. What we want to measure
    The candidate will have a detailed knowledge of: the aetiology, pathogenesis and pathophysiology of cardiac, renal, respiratory, alimentary, musculoskeletal, endocrine, ophthalmological and neurological organ dysfunction in the cat and the dog.
    The candidate will be able to, with a detailed level of expertise: analyse complex clinical problems and make sound clinical judgements.
    The subject guidelines for each subject specify the scope for both knowledge and skills. Some skills are technical and are assessed through credentialing. Others are cognitive skills, such as this one, and are assessed in our examinations.
  • 3. Scope - breadth
    Blueprint columns: Pathophysiology; Investigation and diagnosis; Treatment and management
    Gastrointestinal P1Q1 P1Q1, P2Q4
    Cardiovascular P1Q4 P2Q2 P2Q2
    Nervous P1Q3, P2Q1
    Endocrine P1Q3 P2Q3
    Musculoskeletal P2Q5
    These form the topics that will be covered by the questions. Blueprinting is a good way to ensure that the whole subject is sampled representatively across the 3-4 components of the exam. It is why the whole exam (all 3-4 components) needs to be designed at once. Note that questions often span more than one category.
    Knowledge levels:
    • Detailed knowledge: candidates must be able to demonstrate an in-depth knowledge of the topic, including differing points of view and published literature. The highest level of knowledge.
    • Sound knowledge: candidates must know all of the principles of the topic, including some of the finer detail, and be able to identify areas where opinions may diverge. A middle level of knowledge.
    • Basic knowledge: candidates must know the main points of the topic and the core literature.
    Currently the College templates for subject guidelines specify the level of knowledge in this way. In effect this only shows the level of detail required. Knowledge isn’t all about recalling a level of detail, though. Experts are able to use their knowledge in appropriate ways, recall the information in appropriate situations and apply it to those situations to solve problems. It isn’t just about knowledge, but about what can be done with the knowledge.
  • 4. Skill levels:
    • Detailed expertise: the candidate must be able to perform the technique with a high degree of skill, and have extensive experience in its application. The highest level of proficiency.
    • Sound expertise: the candidate must be able to perform the technique with a moderate degree of skill, and have moderate experience in its application. A middle level of proficiency.
    • Basic expertise: the candidate must be able to perform the technique competently in uncomplicated circumstances.
    These skill levels don’t just apply to technical (psychomotor) skills but also to the way knowledge is used.
    The candidate will have a detailed knowledge of: the aetiology, pathogenesis and pathophysiology of cardiac, renal, respiratory, alimentary, musculoskeletal, endocrine, ophthalmological and neurological organ dysfunction in the cat and the dog.
    The candidate will be able to, with a detailed level of expertise: analyse complex clinical problems and make sound clinical judgements.
    Candidates need base knowledge in order to “analyse complex clinical problems and make sound clinical judgements”, so if you aim to assess the cognitive skills you will also be assessing the knowledge base of the candidate.
  • 5. Fact recall vs applied
    • Fact recall: questions capable of being answered by reference to one paragraph in a text or notes (or several paragraphs for questions requiring recall of several facts).
    • Applied (higher order): questions that require the use of facts or concepts, the solution of a diagnostic or physiologic problem, the perception of a relationship, or other process beyond recalling discrete facts.
    From: Peitzman et al. (1990). Academic Medicine, 65(9), S59-60.
    Questions that assess cognitive skills are applied or higher order questions, as opposed to fact recall questions. This operational definition, used in a research paper, is one I find helpful for working out whether a question is higher order or fact recall. If the answer can be looked up and appears in one paragraph or on one page of a textbook, then really it’s fact recall. Note that even complex judgements can become fact recall for candidates once someone writes a review paper or textbook chapter that sums up the complexity.
    Level - depth
    Blueprint columns: Pathophysiology; Investigation and diagnosis; Treatment and management, each split into recall and higher order
    Gastrointestinal P1Q1 P1Q1, P2Q4
    Cardiovascular P1Q4 P2Q2 P2Q2
    Nervous P1Q2, P2Q1
    Endocrine P1Q3 P2Q3
    Musculoskeletal P2Q5
    You can also categorise your questions into recall and higher order on your blueprint to check that you mostly have higher order questions (as is appropriate for membership and fellowship) across the examinations.
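    A blueprint of this kind can also be kept as a small data structure, so that coverage and the balance of recall to higher order questions can be checked mechanically. The sketch below is illustrative only: the question labels follow the P1Q1 style used on the slide, but the assignment of each question to a body system, component and level is hypothetical, as are the names `BLUEPRINT`, `higher_order_share` and `unsampled_systems`.

```python
from collections import Counter

# Body systems the exam is meant to sample (taken from the blueprint rows).
SYSTEMS = ["Gastrointestinal", "Cardiovascular", "Nervous",
           "Endocrine", "Musculoskeletal"]

# Hypothetical blueprint entries: (question, body system, component, level).
# A question that spans two categories simply appears twice.
BLUEPRINT = [
    ("P1Q1", "Gastrointestinal", "Pathophysiology", "recall"),
    ("P1Q1", "Gastrointestinal", "Investigation and diagnosis", "higher order"),
    ("P2Q4", "Gastrointestinal", "Investigation and diagnosis", "higher order"),
    ("P1Q4", "Cardiovascular", "Pathophysiology", "higher order"),
    ("P2Q2", "Cardiovascular", "Treatment and management", "higher order"),
    ("P2Q1", "Nervous", "Investigation and diagnosis", "higher order"),
    ("P1Q3", "Endocrine", "Pathophysiology", "recall"),
    ("P2Q3", "Endocrine", "Treatment and management", "higher order"),
]


def higher_order_share(blueprint):
    """Fraction of blueprint entries classified as higher order."""
    levels = Counter(level for _, _, _, level in blueprint)
    return levels["higher order"] / len(blueprint)


def unsampled_systems(blueprint, systems):
    """Body systems that no question in the blueprint touches."""
    covered = {system for _, system, _, _ in blueprint}
    return [s for s in systems if s not in covered]


if __name__ == "__main__":
    print(f"Higher order share: {higher_order_share(BLUEPRINT):.0%}")
    print("Not yet sampled:", unsampled_systems(BLUEPRINT, SYSTEMS))
```

    With these made-up entries the check would flag that no question yet samples the musculoskeletal system, which is the kind of gap blueprinting is meant to expose before the exam is finalised.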
  • 6. What we don’t want to measure
    There are a whole lot of things we don’t want to measure, and these are just some.
    • Ability to take tests.
    • Ability to write legibly and fast. This is why it is important to pace examinations so that candidates do have time to write legibly.
    • Ability to rote learn whole pages of textbooks or review articles: prewriting (where candidates predict and then prelearn answers). This is why we need to avoid fact recall questions and instead get candidates to use their knowledge in examinations.
    • Ability to write down a huge series of unconnected facts in no particular order. We need to ensure we don’t reward this in our marking schemes. Importantly, as we move away from fact recall Qs to higher order Qs, the quality of the answer becomes more important than the quantity of facts the candidate writes down.
    • Ability to research examiners’ fields of interest and rote learn impressive aspects of them. This is why examinations need to be blueprinted against the subject guidelines, not examiners’ interests.
    • Ability to interpret what examiners are thinking. This is why we need to give clear instructions in our questions, which is what we will talk about in the next section.
  • 7. Types of questions
    Stimulus formats: recall knowledge; apply knowledge.
    Question types can be categorised broadly by two aspects. The first is the stimulus format: fact recall or applied knowledge.
  • 8. Stimulus formats and response formats
    The second aspect is the response format: what the candidate does to indicate their response.
    • Selected response: e.g. MCQs, where candidates select their response from prespecified options.
    • Constructed response: e.g. long and short answer Qs, where candidates generate their own response.
    Response format | Stimulus: recall knowledge | Stimulus: apply knowledge
    Selected response | Selected recall | Selected applied
    Constructed response | Constructed recall | Constructed applied
    Both selected response and constructed response questions can be fact recall, and both can be applied knowledge types. The literature is clear that the stimulus format (fact recall vs applied) is the most important determinant of what is measured, and the response format is of less importance.
  • 9. Advantages of constructed response Qs (long answer)
    • Non-cued writing: can measure what candidates spontaneously think of. Cueing means that a candidate can answer a multiple-choice question correctly by recognising the correct option, rather than by generating the answer spontaneously. Cueing clearly exists, as you will recognise if you think about how you approach an MCQ. Your strategy is often to look at the options first and try to recognise the correct answer rather than working it out.
    • Easy to create
    • Logic, reasoning, steps in problem solving
    • Ease of partial credit scoring
    • In-depth assessment
    A good long answer question asks the candidate to process information or knowledge rather than to reproduce it, for example by requiring candidates to set up a reasoning process, summarise information, or apply a known principle in different contexts.
  • 10. Limitations of constructed response Qs (long answer)
    • Subjective scoring. This is a frequently levelled criticism, but I do not see it as a big problem. Assessment is a process of judgement, similar to diagnosis. Rather than trying to take the judgement out of the equation, we need to ensure that the judgement is made on good, relevant and sufficient evidence, by appropriately qualified judges.
    • Reproducibility issues. This refers to the agreement between different judges, or the same judge looking at an answer at a different time. This can be a problem but, as above, it is one we should acknowledge is inherent in complex judgement, and we need to take other approaches to quality control.
    • Limited breadth of content. This is a definite limitation we need to be aware of. Because long answer questions take longer than, say, MCQs, we can ask far fewer, and therefore we are taking a much smaller sample of the topics in the subject. The smaller the sample, the less able we are to generalise the performance of the candidate to the whole subject area, which is actually the aim of the examinations.
    • Inefficient? (marking time, testing time). This is often said, but it is only true when you are talking about large numbers of candidates (like 1000s). Long answer questions definitely take longer to mark than short answer questions, and you also need longer testing time in order to ask a reasonable number of questions to sample the topics. However, because good MCQs take at least as long to write as good long answer Qs, and you need more of them, the time trade-off only breaks even when you have a large number of candidates, so this is not relevant for the College.
    • Quality control tends to be qualitative. Rather than using statistical, quantitative methods of quality assurance, we need to use more qualitative methods, which are aimed at ensuring and documenting the trustworthiness, credibility and dependability of the judgements. This includes having more than one examiner, checking for agreement in decisions, ensuring the expertise of examiners, and triangulating evidence across all examination components and credentialing.
  • 11. Advantages of MCQ
    Many of these we have already discussed in comparison to long answer questions.
    • Can test a wide breadth of subjects in a time-efficient manner
    • Sampling across the subject may therefore be more representative
    • Less predictability, which fosters a deep approach to learning. Good MCQs (those that test higher order thinking) are not as easily predicted by students, and studies have demonstrated that they foster deep learning.
    • Reliability. Scoring is very reliable since there is no judgement involved in scoring.
    • Can construct examinations of known difficulty (assuming psychometric analysis is carried out)
    • Efficient and cost effective for large numbers of candidates
    • Possibility of automated development. See next slide. In this study experts developed a schema for decision making in a particular scenario (diagnosis of wound infection), and this was then used to automatically generate 1248 different MCQs. Obviously it is still resource intensive to develop the schema, but this offers an exciting possibility for future automated generation.
  • 12. Disadvantages of MCQ
    Realising the advantages requires procedures which make MCQs resource intensive and expensive:
    – Creation of a large question bank. If you want to draw questions from a bank rather than create new ones each year, you need the bank to contain 10 times the number of questions you will be drawing. Therefore, for a 2 hour, 120 MCQ exam you need a bank of at least 1200 written questions.
    – Pretesting and statistical analysis of Qs. Ideally MCQs should be pretested on a sample similar to the candidates, e.g. existing members or fellows. You need to do this to detect problems with questions, because such problems are common: some studies have found flaws in 36-65% of questions, of which 10-15% are serious enough to influence pass-fail decisions.
    – Post examination statistical analysis. Sophisticated statistical analysis should be performed to set the passing cut point and check the performance of questions.
  • 13. Problems that can occur with MCQs
    • Candidates can’t indicate their interpretation of the Q. If a long answer question turns out to be poorly worded or unclear, examiners can see this in the answer, or candidates can say so in the answer. Neither is possible with MCQs, which is why pretesting is more critical for MCQs.
    • Fact recall Qs are easier to write and therefore tend to dominate
    • Some topics are particularly difficult to write MCQs for
    • Identification of a correct response requires a different type of thinking from candidates than generation of a response
    • Guessing can be rewarded. This is often of concern but shouldn’t be. Even with 3 option MCQs (where there is a 33% chance of getting the answer right from guessing) the probability of scoring 70% correct on a 30 question test is 0.0000356. In any case, candidates with any degree of preparation will use partial knowledge to select answers rather than guessing strategies.
    • What is correct is still a subjective decision. See the next slide.
    • Circulating recall papers may reduce even higher order Qs to recall Qs
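    The guessing figure on slide 13 can be checked with the binomial distribution. The sketch below assumes pure random guessing with a success probability of 1/3 per question and independent questions; the function names are illustrative. Under those assumptions, the quoted 0.0000356 appears to match the probability of scoring exactly 21 of 30, while the probability of scoring 21 or more comes out slightly higher, at roughly 0.00004. Either way the conclusion is unchanged: passing by guessing alone is vanishingly unlikely.

```python
from math import comb


def prob_exactly(k, n, p):
    """Binomial probability of exactly k correct answers out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k, n, p):
    """Binomial probability of k or more correct answers out of n."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


N_QUESTIONS = 30      # test length used on the slide
P_GUESS = 1 / 3       # three-option MCQ, pure guessing (assumption)
PASS_MARK = 21        # 70% of 30 questions

if __name__ == "__main__":
    print(f"P(exactly 21 of 30)  = {prob_exactly(PASS_MARK, N_QUESTIONS, P_GUESS):.7f}")
    print(f"P(at least 21 of 30) = {prob_at_least(PASS_MARK, N_QUESTIONS, P_GUESS):.7f}")
```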
  • 14. What is correct is still a subjective decision
    [Figure: a scale from least correct to most correct, with the options placed along it in the order D, E, A, C, B]
    In biological systems we all know that nothing is ALWAYS true and you can never say never. Therefore, unless questions are about trivial content (which they should not be), both the correct option and the distractors are likely to have some degree of truth to them. This is especially the case for the type of questions we really want to design, which aim to test candidates’ knowledge and decision making in complex situations that require judgement. While it is important that the answer keyed as “correct” is much more correct than the other possible answers, there will always be an element of judgement in this decision.
    Circulating recall papers may reduce even higher order Qs to recall Qs. This is just as much a problem with long answer Qs as with MCQs. But MCQs are so expensive to produce that they tend to be held for reuse, hence the problem.
  • 15. This shows a screenshot from a website called PasstheFRACP.com, where FRACP refers to the Fellowship of the Royal Australasian College of Physicians. Several past recall papers are available on this site, and undoubtedly there are many others, for all sorts of examinations, that are not so freely available.
    General issues with question writing
  • 16. Communication
    The examination questions are the question setter’s expression of the question setter’s task.
    The candidate’s answer represents the candidate’s expression of the candidate’s interpretation of the questions.
    The marker evaluates the marker’s interpretation of the candidate’s expression of the candidate’s answer.
    The marker uses the marker’s interpretation of the setter’s expression of the setter’s task to evaluate the candidate’s answer.
    Modified from Pollitt & Ahmed 1999
    Exams are a communication between 3 people: the candidate, the marker and the question setter. Only in some cases is the question setter the same person as the marker. Each of these people has their own interpretation of the communication.
    • The question setter has a task in mind and expresses this in writing. Anyone reading the question is interpreting the words in order to arrive at their own understanding of the task the question setter had in mind. As you can imagine, if the task is not specified very precisely and clearly, there is plenty of room for things to go wrong at this step.
    • The candidate is one of the people interpreting the question. They formulate an answer, and then have to express that in words on the page. Their expression may or may not represent their answer very well. Many factors can interfere with this process, not all of which are things we are trying to differentiate candidates on.
    • The marker has to interpret the candidate’s expression of their answer on the page and, from it, make inferences about what the candidate knows and can do. The inferences may be well founded, or perhaps more tenuous. The candidate’s expression can contribute a lot to the interpretations made by the marker.
    • In order to evaluate the candidate’s performance, the marker must also interpret the question, using the setter’s expression of the task the setter had in mind. The marker’s understanding of the task may be different from the candidate’s understanding of the task.
  • 17. Expectations and stereotypes
    Examples:
    • male animal case
    • differential diagnoses the candidate would consider
    • expectation of hard questions
    • expectation that Qs will ask about what something is rather than what it is not
    All of us develop schemas or stereotypes which help us categorise and process complex information quickly. Particular features of questions trigger certain schemas and hence expectations. Anxiety can make us “close” on a certain schema too quickly and not look for others, so exams can become measures of a propensity to anxiety rather than measures of what we want to measure. In addition, since having well developed schemas is a mark of expertise, very good candidates may be expected to make much use of schemas. Question writers need to be aware of the existence of such schemas and ensure that they very clearly signal if they want candidates to step out of them.
    For example, a scenario about a male animal will likely trigger schemas to do with differential diagnoses of male animals. If you want candidates to discuss differential diagnoses of both male and female animals, you need to understand that the candidate may already have closed on the male animal schema, and you will need to make it very, very clear that you want them to change out of it. Or, perhaps better still, redesign the question to account for the likely use of this schema.
    Other examples of schemas that might be elicited by questions include:
    • All differential diagnoses for a clinical sign vs only those applicable in a particular case
    • The expectation that questions should be hard, which may prevent candidates from seeing the easy solution to a question
    • The expectation that Qs will ask about what something is rather than what it is not
    Question writers can reduce the negative effects of expectations by:
    • using clear language,
    • including only relevant and authentic scenarios,
    • being clear about the kind of answer and level of answer required,
    • being aware of the kind of implicit expectations that come into play in reading comprehension processes, and
    • using very, very clear signalling if questions contradict expectations (e.g. using bold font).
  • 18. Contextualising Qs
    • Context is good because it brings relevance and authenticity
    • Allows assessment of concrete or specific examples, not abstract concepts or generalisations
    • Allows assessment of applied learning (doing, not just knowing)
    • All these carry with them a potential for bias:
      – Relevance is personal.
      – Concrete examples will be more familiar to some candidates than others.
      – Application of knowledge may actually just be recall if candidates have considered that example, or a similar one, in their learning.
      – Context activates concepts in the mind and therefore may activate the wrong contexts or schemata (as we saw in the last slide).
    Writing long answer questions
  • 19. Q parts – the task
    • For MCQs the task is given at the beginning of the examination or MCQ section: Choose one BEST answer
    • For long answer Qs you need to specify the task
    Don’t write questions; write tasks.
    Instead of “What is your diagnosis?”, write:
    • “State the most likely diagnosis”, or
    • “State the most likely diagnosis and explain your reasoning”, or
    • “Discuss the differential diagnoses you would consider in this case”, or …
    Tell candidates what you want them to do rather than asking them a question. They need to know whether you intend them to write a one word answer, or to explain it, or to justify it, and so on. If you do not make this clear they will give you a “just in case” answer, and their answer may seem to you to be unfocused or off topic. In addition, they will waste time on this instead of concentrating on other questions in the paper.
  • 20. Instructional verb examples
    You need to provide an instructional verb for all questions.
    • Compare: to find similarities between things, or to look for characteristics and features that resemble each other.
    • Contrast: to find differences or to distinguish between things.
    • Discuss: to present a detailed argument or account of the subject matter, including all the main points, essential details, and pros and cons of the problem, to show your complete understanding of the subject.
    • Define: to provide a concise explanation of the meaning of a word or phrase; or to describe the essential qualities of something.
    • Explain: to clarify, interpret, give reasons for differences of opinions or results, or analyse causes.
    • Illustrate: to use a picture, diagram or example to clarify a point.
    Specify boundaries of the answer
    You also need to specify the boundaries of the answer required.
    • Species, e.g. “In both dogs and cats…”
    • Quantities and amounts, e.g. “Provide 5 reasons why…”
    • With reference to, e.g. “With reference to the published research from…”
  • 21. Q parts – the scope
    An example of specifying the boundaries or scope:
    • Unbounded: “List the clinical signs of hypothyroidism in dogs.”
    • Bounded: “List the three most common owner-observed clinical signs of hypothyroidism in dogs and explain how thyroid hormone deficiency leads to each of these signs.”
    Examples of problems
    “Name two (2) diagnostic tests you would run next to investigate the cause of this dog’s current illness.”
    This type of question might come after a scenario. While it seems like a perfectly reasonable question, think about the words “diagnostic tests”, what they really mean and what sort of schema they elicit. Potential problems for candidates include:
    • The words “diagnostic tests” might only elicit schemas containing a narrow set of types of investigation, such as laboratory tests, and not include things like imaging or taking an animal’s temperature.
    • The specification of 2 diagnostic tests is unclear because different candidates and examiners may interpret what counts as one test differently. For example, is a biochemical panel one test or 17? Is a PCV and TPP, commonly performed together, one test or two?
    Even if a candidate recognises these issues, they may have trouble deciding what to do and waste time worrying about it, and anxiety may affect their performance, so we end up not measuring what we want to measure.
  • 22. Examples of problems
Outline your approach to confirming the initial clinical diagnosis and a management and prevention plan for this problem. This discussion should include an outline on further observations taken about ….
Here the examiner’s instructions request an outline (a description of the main features, a sketch in general terms, or a summary), but then the wording suggests that a discussion should have been completed (an examination of the argument, a sifting of considerations for and against, a debate). Because “outline” appears first, the candidate may have already “closed” on outline and may not even notice the word “discussion”. (Did you notice it when you read the question?)
Examples of problems
…list in dot point form: the gross pathological features, the characteristic histopathological changes, and the clinical pathology changes. In your discussion, list one antemortem test/procedure that can be used to aid in the diagnosis …
Similarly, in this question the first instruction to list the answer is then contradicted by the suggestion that a discussion should actually have been completed.
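Both problem questions on this page mix more than one instruction word in a single task. As a rough illustration, a draft question could be screened for this automatically. The sketch below is in Python; the word list and the function names `instruction_families` and `has_mixed_instructions` are hypothetical and not part of any College tool. A flag only means the wording deserves a second look: a deliberate pairing such as “compare and contrast” would also be flagged.

```python
import re

# Hypothetical mapping from surface words to an instruction "family".
# Noun forms (e.g. "discussion") are included because they also set
# the candidate's expectation of the task.
INSTRUCTION_WORDS = {
    "outline": "outline",
    "list": "list",
    "discuss": "discuss",
    "discussion": "discuss",
    "explain": "explain",
    "explanation": "explain",
    "define": "define",
    "compare": "compare",
    "contrast": "contrast",
    "illustrate": "illustrate",
}


def instruction_families(question):
    """Return the instruction families found, in order of first appearance."""
    found = []
    for word in re.findall(r"[a-z]+", question.lower()):
        family = INSTRUCTION_WORDS.get(word)
        if family and family not in found:
            found.append(family)
    return found


def has_mixed_instructions(question):
    """True when the question uses more than one instruction family."""
    return len(instruction_families(question)) > 1


if __name__ == "__main__":
    draft = ("Outline your approach to confirming the initial clinical "
             "diagnosis. This discussion should include an outline on "
             "further observations.")
    print(instruction_families(draft), has_mixed_instructions(draft))
```

A simple word match like this cannot judge whether two instructions genuinely conflict; it only points the reviewer to questions where the candidate may “close” on the first instruction and miss the second.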
  • 23. Examples of problems
A veterinarian asks you for assistance in designing a protocol for the delivery of a vaccine for cats in their practice. What factors would you take into consideration in designing this protocol?
Here is an example where a technical term that is also used in everyday language, “delivery”, could elicit everyday schemas rather than technical schemas. For example, does the question refer to delivery of the vaccine from the manufacturer to the practice, or the process of injecting the cat, or the recommended intervals for administering vaccinations?
Examples of question problems
Are there any clinical features which can help you determine a patient’s prognosis?
Here is an example appearing after a scenario, where the wording of the question suggests only a yes-no answer. Presumably the answer “yes” is 100% correct in this situation. Candidates will usually supply more detail in these situations regardless, not because the task is clear, but because the schema they invoke will suggest to them that more explanation of their response is required. However, if a simple answer is all that the examiner requires, this will be a waste of their time. Therefore it is important to be very clear.
  • 24. Examples of question problems
Describe and discuss the following: a) preparedness
Here is an example where a word can have different meanings for different groups of people. For example, in NZ, “preparedness” has an everyday meaning that is very different to what is likely required of candidates in a veterinary behaviour examination.
Examples of question problems
State what you believe is your most likely diagnosis.
Here the question asks candidates to say what they believe. There is no wrong answer, since the candidate’s belief is their belief whether it is true or not. What the examiner really wants to know is whether the candidate’s belief is the same as the examiner’s own belief about the answer, or is justifiable in some other manner.
  • 25. Examples of question problems
Discuss commonly found tumours and tumour-like disorders associated with the oral cavity and dental tissues of the horse.
Here is an example of a question where the scope is potentially endless. For example, you could discuss clinical signs, or the histopathological diagnosis including special immunological tests, or the treatment, or…. The scope needs to be clearly defined for candidates.
Examples of question problems
How would you localise the site of the lesion?
This question appeared after a neurological scenario was presented. The wording suggests the required answer would involve the diagnostic methods you would use to arrive at the site of the lesion. However, this was the marking scheme provided by the question writer:
Spinal lesion between T3 and L3
Clearly the question does not actually ask for the answer the examiner wants. Note that it is impossible to detect this fault without also seeing the marking scheme, hence the need to always evaluate both together when checking the wording of questions.
  • 26. Writing MCQs
• Focus on a single important concept
• Test application of knowledge not recall
• Don’t test “trivial” knowledge
• Focus on real life problems
• Clinical vignettes are a good basis for a Q
Placing Qs in vignettes does not increase the difficulty for high performing candidates but does increase the difficulty for low performing candidates.
  • 27. A 7-year-old mare has had intermittent signs of moderately severe colic for the past 48 hours. Heart rate is 56 beats/min. Hydration, acid-base balance, and electrolyte parameters are near normal. On rectal examination, the left dorsal and ventral colon feels distended and is felt coursing in a dorsocranial direction. The spleen is displaced caudomedially. Which of the following is the most likely diagnosis?
A. Cecocolic intussusception or cecal inversion
B. Displacement of the left colon over the nephrosplenic ligament
C. Ileocecal intussusception
D. Infarction of the large colon
E. Volvulus of the large colon and the cecum
An example of a clinical vignette. It assesses the ability to diagnose, so it is a deep question. The stem is relatively long, and each option is relatively short. This is the overall structure to aim for.
Keep options short
Iris prolapse is a common sequel to penetrating corneal wounds or ruptured corneal ulcers. Which of the following steps is NOT appropriate for the treatment of iris prolapse?
A. primary closure of the corneal laceration with 8-0 vicryl and treatment with topical antibiotics to control infection.
B. placement of a nictitans flap and treatment with systemic antibiotics to control infection.
C. placement of a corneal graft with an overlying conjunctival pedicle graft and treatment with systemic antibiotics to control infection.
Avoid complex and long options like the ones in this question. Try to put as much as possible into the stem and leave the options short.
  • 28. Pose a clear task
Chyle is :-
A. The semifluid mass which empties from the stomach into the duodenum.
B. The lymph containing fat droplets found in the lacteals of the small intestine.
C. The contents of the gall bladder.
D. The rounded piece of chewed food which passes down the oesophagus when the animal swallows.
This is not a good question. You need to pose a clear task. This question has no task. If you covered up the options you would not be able to predict what the answer was by reading the stem.
Make all distractors plausible and homogenous
Which of the following statements regarding hepatic encephalopathy is true?
A. Patients typically present with asymmetrical neurological deficits
B. The most effective and appropriate anticonvulsant to use for a patient that is seizuring due to hepatic encephalopathy is phenobarbital
C. Abdominal radiographs of dogs with portosystemic shunts will often show an enlarged liver
D. Cats with portosystemic shunts often exhibit ptyalism as a clinical sign
E. An appropriate treatment for hepatic encephalopathy is intravenous neomycin
This question also has a problem. The incorrect and correct options are about different things: some are about clinical signs, some about diagnostic investigation and some about treatment. Instead you should aim to have all the options homogenous: all about diagnosis, all about treatment, etc.
As mentioned before, there is a tendency for only trivial facts to be 100% true or 100% false in biological systems. The types of concepts we really want to examine, though, are more difficult and subject to exceptions that mean they are not always true. Therefore in this sort of question you are trying to work out which is more true than the others on a scale of trueness.
If the options are not all on the same scale of trueness because they are about different things, then the question becomes irrelevantly difficult or even impossible to answer. See the next slide for an example that will help explain this concept.
  • 29. Include only ONE best answer
Which of the following is true?
A. Bananas are green
B. Motorbikes are faster than cars
C. Boys are taller than girls
Here is an extreme example to illustrate the issue raised in the last slide. This question is impossible to answer because all the options are true sometimes and false at other times. To answer it, you have to work out which is the most true, but you are comparing completely different things. Is “motorbikes are faster than cars” more or less true than “boys are taller than girls”? Impossible to say.
Include only ONE best answer
[Figure: options D, E, A, C and B placed, in that order, along a scale running from least correct to most correct.]
This figure shows the concept diagrammatically. All the options in a multiple choice question should lie on a single scale of trueness. While we expect that in non-trivial questions options will not all be 100% true or 100% false, there must be one option that is well separated on the trueness scale from the distractors (incorrect options).
  • 30. Avoid none of the above
Which of the following is true regarding ligament injuries?
A. Ligament injuries are appropriately referred to as “strains” or “sprains”.
B. Surgical intervention is indicated for treatment of second-degree sprains with demonstrable instability.
C. The elastic nature of ligaments allows 30% elongation before permanent deformation.
D. Following surgical repair of ligaments, immobilization via ESF or external coaptation is contraindicated, as range of motion is critical to successful repair.
E. None of the above.
This question has a number of problems, but one of them is the use of “none of the above” as an option. This is problematic in questions where judgement is involved (which should be all questions in College exams!) and where the options are not absolutely true or false. Either remove this option altogether or fix it by replacing it with an option that is more specific. For example, if the options are a list of possible drugs to prescribe, an option of “no drug should be given at this time” would be better than “none of the above.”
Avoid negative framing
Which of the following statements is false regarding arthrotomies?
A. When detachment of a ligament is necessary, this should be performed by osteotomy of the bony origin rather than transection of the ligament.
B. Complete closure of the synovium is necessary to prevent synovial fluid leakage into subcutaneous tissue.
C. Surgical removal of osteophytes is often followed by their relatively rapid regrowth, and has questionable value.
D. Monofilament absorbable suture material has a lower risk for long term infection than does braided nonabsorbable suture.
E. None of the above.
You should avoid questions framed in the negative sense (“Which of the following is false…”) completely. If you do include them, you must never include “none of the above” as an option. The question above illustrates why.
If you were to choose option E, are you saying that “none of the above are false” is true? Or are you saying that “none of the above are false” is false? If “none of the above are false” is true, then A, B, C and D must all be true. Therefore there is no false answer and the question cannot be answered. If “none of the above are false” is false, then at least one of A, B, C and D must be false, but isn’t that a given? So these questions pose irrelevant difficulty: irrelevant because the difficulty has nothing to do with the learning outcomes we are trying to assess.
  • 31. Avoid double options
A 2-year-old male neutered Border Collie is presented with the following history and neurologic signs: ……..
Which one of the following neuroanatomic localizations and diagnoses is CORRECT?
A. Left C1-C5 myelopathy, intervertebral disk rupture
B. Right C6-T2 myelopathy, fibrocartilaginous embolism
C. Right C1-C5 myelopathy, intervertebral disk rupture
D. Left C1-C5 myelopathy, fibrocartilaginous embolism
This question has “double options”, in that each option contains two different types of fact. It is better to focus on one fact for each MCQ and split this type of Q into two questions.
3 options is enough
A horse suffering from an acute intestinal accident is MOST likely to have
A. primary respiratory acidosis
B. primary respiratory alkalosis
C. primary metabolic alkalosis
D. primary metabolic acidosis
Three options for MCQs have been shown to be sufficient, and there is no need to force a question to have 4 or 5 options if there is no natural list of 4 or 5 plausible options. However, sometimes you do need to include 4 options, for example in the question above, because it allows a complete set of paired options. It is fine for MCQs to have different numbers of options within one exam. Although you may worry about candidates guessing in 3 option questions, remember that the statistical probability of achieving a score of exactly 70% (21 correct) through random guessing on 30 three option MCQ items is 0.0000356; the probability of scoring 70% or more is about 0.0000443.
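The guessing probability quoted above can be checked with a binomial model. The sketch below is in Python and assumes that each of the 30 items is guessed independently with a 1 in 3 chance of being correct, and that 70% means 21 correct answers; the function names are illustrative only. Under these assumptions the quoted 0.0000356 corresponds to scoring exactly 21, while scoring 21 or more comes out at about 0.0000443.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k_min, n, p):
    """Probability of k_min or more successes in n independent trials."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


N_ITEMS = 30      # number of MCQ items in the exam
P_GUESS = 1 / 3   # chance of guessing a three option item correctly
PASS_MARK = 21    # 70% of 30 items

p_exactly = binomial_pmf(PASS_MARK, N_ITEMS, P_GUESS)
p_at_least = prob_at_least(PASS_MARK, N_ITEMS, P_GUESS)

if __name__ == "__main__":
    print(f"exactly {PASS_MARK} correct:  {p_exactly:.7f}")
    print(f"at least {PASS_MARK} correct: {p_at_least:.7f}")
```

Changing `P_GUESS` to 1/4 or 1/5 shows how little is gained, in terms of protection against guessing, by adding a fourth or fifth option to every item.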
  • 32. Avoid technical item flaws
• Grammatical cues
• Logical cues: a subset of the options are collectively exhaustive
• Absolute terms: terms such as “always”, “never” or “all” used in options
• Vague terms: terms such as “usually” and “frequently” used in options
• Long correct answer: correct answer is longer, more specific, or more complete than other options
• Word repeats: a word or phrase is included in the stem and in the correct answer
• Convergence: the correct answer includes the most elements in common with the other options
• Numeric data not stated consistently
• Language in the options is not parallel; options are in an illogical order
• Stems (lead-ins) are tricky or unnecessarily complicated
As well as the things we have just gone through, there are a whole bunch of technical item flaws you will see listed in guides for MCQ writing. These particular flaws can allow clued-in candidates to see the right answer because of faults in the question structure. I am not going to go through all of these in detail because advice about these is so widely available elsewhere, and these flaws are easy to avoid.
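Some of the flaws covered in this section are mechanical enough that a draft item can be screened for them before human review. The Python sketch below is a hypothetical helper (the name `check_item`, the word lists and the messages are all assumptions) that looks for absolute terms, vague terms, a “none of the above” option, a negatively framed stem and a correct answer that is the longest option. It uses simple word matching, so it will raise false alarms (for example, “all” used innocently) and it cannot detect cues that depend on meaning.

```python
import re

# Word lists taken from the flaw descriptions above; extend as needed.
ABSOLUTE_TERMS = {"always", "never", "all"}
VAGUE_TERMS = {"usually", "frequently"}


def check_item(stem, options, correct_index):
    """Return a list of possible technical flaws in one MCQ item."""
    flaws = []
    words_per_option = [re.findall(r"[a-z']+", o.lower()) for o in options]

    # Absolute and vague terms used in the options.
    for letter, words in zip("ABCDEFGH", words_per_option):
        if ABSOLUTE_TERMS & set(words):
            flaws.append(f"absolute term in option {letter}")
        if VAGUE_TERMS & set(words):
            flaws.append(f"vague term in option {letter}")

    # "None of the above" used as an option.
    if any(o.strip().lower().rstrip(".") == "none of the above"
           for o in options):
        flaws.append("'none of the above' option")

    # Negatively framed stem (NOT, false, except).
    if re.search(r"\b(not|false|except)\b", stem.lower()):
        flaws.append("negatively framed stem")

    # Correct answer strictly longer (in words) than every distractor.
    lengths = [len(words) for words in words_per_option]
    others = lengths[:correct_index] + lengths[correct_index + 1:]
    if others and lengths[correct_index] > max(others):
        flaws.append("correct answer is the longest option")

    return flaws


if __name__ == "__main__":
    for flaw in check_item(
        "Which of the following statements is false regarding arthrotomies?",
        ["Complete closure of the synovium is always necessary.",
         "Osteophytes usually regrow rapidly.",
         "None of the above."],
        correct_index=0,
    ):
        print(flaw)
```

A screen like this is only a first pass; flaws such as grammatical cues, convergence or non-homogenous options still need a reviewer who reads the item together with its marking scheme.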
  • 33. Key points
• It’s important to think about what it is you are looking for evidence of when designing Qs.
• Check that the Qs are going to be collecting evidence of that, and not something else.
• Concentrate on designing Qs that test application rather than fact recall.
“Effective item writers are trained, not born …” Downing and Haladyna 2006, Handbook of test development, p. 11
So I hope you all learned something today that will help you be more effective question writers.