This is an assignment I completed for the Assessment unit of the University of Bath's MA in International Education programme.
It is shared here to allow me to embed it onto my professional reflective blog at http://ibiologystephen.wordpress.com
Assessment Assignment: Bath MA International Education
1. Stephen Taylor Assessment
Survival of the Fittest for Purpose?
Exploring reliability and validity in criterion-related assessment of the IB Middle Years Programme sciences as it moves into the Next Chapter.
Stephen Taylor, MA International Education, University of Bath (@IBiologyStephen)
This assignment was submitted as part of my MA coursework in February 2012. It is uploaded here (with permission) to be included as part of my professional development and reflective portfolio at is.gd/IBiologyReflections.
Introduction
The International Baccalaureate’s Middle Years Programme (MYP) is going through an exciting period of reinvention. Dubbed “MYP: The Next Chapter,” this programme overhaul will affect all MYP teachers, students and school leaders over the coming years (IB, 2011a). The Next Chapter breaks from the usual curriculum review cycle, which runs on a per-subject group basis, and will result in new subject guides, assessment criteria and practices being published for every subject simultaneously. The programme is due to be officially launched in 2014, and subject and assessment reviews and trials are currently ongoing in schools around the globe.
In this essay, I will explore key changes proposed under the Next Chapter and their implications, in terms of validity and reliability, for assessment of the sciences. I will attempt to evaluate these proposals and make recommendations for teachers and the IB on steps that may make for a smoother transition from principles into practice.
Structure and assessment of the IB Middle Years Programme
The IB MYP is a rapidly growing educational framework for middle school-aged students (11-16 years of age). With its roots as a ‘pre-IB’ programme in Africa in the 1980s, it has developed into a four or five-year programme, acting not only as a precursor to the Diploma Programme (its original intended purpose), but also as an interface with the Primary Years Programme (Nicolson & Hannah, 2010).
The holistic nature of the programme is intended to develop both concepts and skills in its learners, developing not only knowledge and understanding of the eight subject groups, but also allowing students to become versed in the learning skills required to be successful in the IB Diploma, university and beyond (Nicolson & Hannah, 2010).
The core of the MYP is similar in nature to that of the Diploma Programme, with the IB’s Learner Profile focusing on the desired attributes of learners. The five Areas of interaction form contexts for learning within the curriculum. Community and service is analogous to the Creativity, action and service component of the Diploma Programme. Approaches to learning, another Area of interaction, highlights the development of study and research skills (IB, 2009) and also allows for some introduction to the Theory of knowledge component of the Diploma Programme (Nicolson & Hannah, 2010). A culminating, student-directed task, the Personal Project, aims to facilitate student exploration in a similar way to the Diploma Programme’s Extended Essay.
Figure 1: The current MYP model. Taken from A History of the Middle Years Programme (Appendix) (Nicolson & Hannah, 2010).
Growth and development in the MYP: Why the Next Chapter?
From 407 schools running the MYP in 2007, there are now 729 MYP schools worldwide (IB, 2011a, p.4). This rapid growth in the programme could be due to a number of factors, such as a greater demand for international education in developed and developing nations and an increasing ‘brand recognition’ of the International Baccalaureate in the education sector.
The International Baccalaureate Organisation has three regions. The Americas (IBA) encompasses the USA, Canada and South America and in recent years has been the fastest-growing market, with over 71% of IB schools running the MYP (IB, 2011a, p.6). Growth is slower but steady in the IB’s other two regions, Africa, Europe and the Middle East (IBAEM), and Asia-Pacific (IBAP).
Despite this growth in the MYP, the proportion of schools choosing to moderate their assessment is decreasing: from 38.08% (155/407 schools) of June-session schools registering candidates for moderation in 2007 to just 22.91% (167/729 schools) in 2011 (IB, 2011a). Although in absolute terms this represents a small increase in the number of schools choosing to have their assessments moderated, it does raise questions about the reliability of the grades given to students in the majority of schools.
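As a quick check on these figures, the proportions can be recomputed from the quoted counts. This is only a minimal sketch: the counts come from the text (IB, 2011a), while the function and variable names are illustrative and not part of any IB tool.

```python
# Counts quoted in the text (IB, 2011a): year -> (schools moderated, MYP schools).
MODERATION_COUNTS = {2007: (155, 407), 2011: (167, 729)}


def moderation_share(moderated: int, total: int) -> float:
    """Return the percentage of MYP schools that registered for moderation."""
    return 100 * moderated / total


def change_in_moderated_schools(counts: dict, start: int, end: int) -> int:
    """Return the change in the absolute number of moderated schools."""
    return counts[end][0] - counts[start][0]


for year, (moderated, total) in MODERATION_COUNTS.items():
    share = moderation_share(moderated, total)
    print(f"{year}: {moderated}/{total} schools = {share:.2f}%")

print("Change in moderated schools:",
      change_in_moderated_schools(MODERATION_COUNTS, 2007, 2011))
```

The falling percentage alongside a rising absolute count shows that moderation uptake has not kept pace with the growth in MYP schools.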
As part of the five-year programme evaluation process, schools which do not have their grades formally moderated are required to submit some samples of assessed final-year work for monitoring, a version of moderation which provides feedback on assessment without affecting grades awarded (IB, 2010a, p.49).
This low uptake of moderation and potential loophole in quality control leaves the MYP in an interesting position in terms of reliability, recognition and competition. Globally it is growing and becoming the choice of international schools and local schools aiming to ‘internationalise’ their learning. The IB Diploma is a well-established programme internationally, with a current tally of 2,313 schools offering the programme (IB, 2012). However, of these schools, just 212 offer the MYP preceding the Diploma Programme (IB, 2012). Of course, many of the DP-only schools will be similar to sixth-form colleges with an exclusively 16-19 student body, but the MYP still falls some way short of its leading competitor, the IGCSE. Boasting over 9,000 schools enrolled internationally (CIE, 2011), the Cambridge International GCSE is often found as the ‘pre-IB’ qualification in international schools that offer the Diploma but not the MYP.
The IGCSE is closely based on England’s GCSE, developed in 1988 as a broader style of assessment for Key Stage 4 in the UK than the incumbent O-Level system (Bishop et al., 1999). Originally the GCSE, like the MYP, was intended to go beyond selection and summative assessment of content, to also “embrace the broader notion of assessment, which includes the following:
• a system which tests a balance of knowledge, understanding and skills; this system employs different types of assessment within the courses of study which reflects a variety of styles of teaching and learning;
• challenging the range of abilities of pupils at the end of key stage 4;
• being relevant to everyday life.” (Bishop et al., 1999)
In their paper Users’ perceptions of the GCSE, Bishop, Black, Martin and Thompson (1999) conclude that “it must be recognized that the [GCSE] examination cannot perform concurrently all functions that users are claiming for it.” These sentiments could well apply to the MYP in its current form: philosophically sound and in tune with the needs of international education but, with a wide range of goals and assessment methods and a low uptake of moderation, somewhat vulnerable in terms of validity and reliability. As Hayden and Thompson (2011, p.17) conclude: “[for some] …the absence of external examination leading to an externally-awarded certificate at age 16 is anathema.”
While discussions continue over UK schools moving away from the GCSE, and OFQUAL raises questions over standards following recent revisions (Morrison, 2009), the IB are working on their next incarnation of the MYP: The Next Chapter. Fundamentally, The Next Chapter involves more streamlined, structured and potentially more valid and reliable assessment of student learning.
One justification for the move to the Next Chapter and its associated modes of assessment is for the MYP to gain accreditation in many of the countries in which it is being implemented. In a recent email exchange Malcolm Nicolson, the Head of MYP Programme Development, stated, “We will be looking at accreditation standards globally – so looking at USA, Australia, Canada, Netherlands, Germany, Japan and many others. The UK is one of the countries [we] will aim to satisfy.”
In order to satisfy the UK, the MYP must adhere to the assessment principles laid out by OFQUAL, the same body which currently accredits the IGCSE, GCSE and A-Level qualifications, along with the IB’s own Diploma Programme. Condition E4.2 of the OFQUAL document General Conditions of Recognition states that:
“…In designing such an assessment, an awarding organization must […] ensure that the assessment is: fit for purpose, […] allows each Learner to generate evidence which can be Authenticated, [and which] allows Assessors to be able to differentiate accurately and consistently between a range of attainments of Learners.” (OFQUAL, 2011, p.44)
Inherently, the recognition sought by the MYP is an issue of validity and reliability in assessment. Here we can try to evaluate validity and reliability in the current model of the MYP and look at some of the key proposals for change under the Next Chapter.
What makes for valid and reliable assessment?
When teaching my own science classes I often ask my students two questions when they are designing, carrying out and evaluating lab work and processing their results. The first is “how do you know your method is allowing you to address your research question?” The second is “how do you know you can rely on your results?”
Moss et al. (2006) state that educational assessment “should be able to support [educators] in developing interpretations, decisions and actions that enhance students’ learning.” Validity “refers to the soundness of [those] decisions, interpretations or actions” (Moss et al., 2006).
Wynne Harlen (2007) defines validity as “how well what is assessed corresponds with the behaviour or learning outcome that it is intended should be assessed; this is often referred to as construct validity.” She clarifies that the “important requirement is that the assessment concerns all aspects – and only those aspects – of students' achievement relevant to a particular purpose” (Harlen, 2007).
Validity has traditionally been broken into three domains. Content validity “demonstrates how well the test samples the class situations or subject matter about which conclusions are to be drawn” (Moss et al., 2006). Criterion-related validity compares test scores with “one or more external variables considered to provide a direct measure of the characteristic or behavior in question” (Moss et al., 2006). Construct validity can be described as “a more indirect method of validation” (Moss et al., 2006).
Harlen elucidates construct validity as being “based on an integration of any evidence that bears on the interpretation or meaning of the test scores – including content- and criterion-related evidence – which are thus subsumed as part of construct validity.” On the other hand, Messick (1995) describes construct validity as being “not a property of the test or assessment as such, but rather of the meaning of the test scores.” He goes further, arguing that construct validity can be broken into six sub-domains: “content, substantive, structural, generalizability, external, and consequential aspects of construct validity.” In classroom practice and assessment in the MYP, we are most concerned about ‘what to assess’ and ‘how to assess’.
A third fundamental aspect of evaluating the usefulness of an assessment model or tool is reliability. This is described by Harlen (2007) in her Criteria for evaluating systems for student assessment as being “the extent to which results are of acceptable consistency for a particular use,” or, more commonly, “the extent to which the assessment, if repeated, would give the same result.”
Harlen also makes the distinction between tools used for formative assessment and summative assessment. Formative assessment (assessment for learning) has the intended purpose “of helping learning and teaching.” Summative assessment information (assessment of learning) “is required for the purpose of keeping records of the progress of individual students, reporting to parents and students at regular intervals, passing information to other teachers on transfer from class to class or in guiding decisions about subjects for further study” (Harlen, 2007). Formative assessment plays a vital role in the classroom, though in this case I will be looking at the assessment of the MYP models through the lens of final-year, summative assessment.
When evaluating assessment in the MYP, we should ask three key questions:
• Does this mode of assessment allow us to assess the content we intend to assess – does it have content validity?
• Does this mode of assessment allow us to assess the skills or attributes we intend to assess – does it have criterion-related validity?
• Does this mode of assessment provide us with reliable and verifiable assessment data – is it reliable?
If we can answer these three questions in the affirmative, we could conclude that the assessment is indeed ‘fit for purpose’.
Assessment in the Middle Years Programme: methods and challenges
Assessment of student achievement in both the current MYP and the Next Chapter derives from shared foundations in educational assessment theory. These are well documented in the IB’s publication MYP: From principles into practice (IB, 2008), and include criterion-related assessment, the best-fit approach, the value of formative assessment, fitness for purpose of assessment tools, feedback and grade determination. Assessment in the MYP has a set of aims, quoted below from MYP: From principles into practice (IB, 2008):
“Assessment in the MYP aims to:
• support and encourage student learning by providing feedback on the learning process
• inform, enhance and improve the teaching process
• promote positive student attitudes towards learning
• promote a deep understanding of subject content by supporting students in their inquiries set in real world contexts using the areas of interaction
• promote the development of higher-order cognitive skills by providing rigorous final objectives that value these skills
• reflect the international-mindedness of the programme by allowing for assessments to be set in a variety of cultural and linguistic contexts
• support the holistic nature of the programme by including in its model principles that take account of the development of the whole student” (IB, 2008, p.41)
These aims are supplemented with further subject-specific aims and objectives. Within each of the subjects there are multiple criteria, each with its own aims and objectives. Appendix 2 lists the aims and objectives of the sciences. With eight subject groups in total we can see that one obstacle to validity in MYP assessment lies in the sheer volume of content and objectives that are to be assessed.
It is good practice to assess through ‘multiple measures’, with an IB stipulation of at least two data points per criterion per year (IB, 2010a, p.54). In reality, that plays out in schools as being two data points per reporting period (commonly a semester). There are six assessed criteria in the sciences, four in other subjects. As a consequence, students face a minimum of eight summative assessments per semester in the four-criterion subjects and twelve in the sciences. This is an incredible load on teachers and students and at the high-school level can push teachers into poor assessment practices – cramming content ‘in preparation for the Diploma’, assigning assessed tasks as homework or simply missing out valuable steps such as exploration, drafting and peer or self-assessment. As a result, the size of the MYP, paired with the significant backwash effect of Diploma Programme preparation, could be having a negative impact not just on content validity but likely also reliability.
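The assessment load described above follows from a simple multiplication, sketched here in Python. The two-data-points figure and the criteria counts are taken from the text; the function name is illustrative, and the sketch assumes the worst case in which every data point is a separate criterion-level judgement.

```python
# Common school practice described in the text: two data points per
# criterion per reporting period (usually a semester).
DATA_POINTS_PER_CRITERION = 2


def minimum_judgements(num_criteria: int,
                       data_points: int = DATA_POINTS_PER_CRITERION) -> int:
    """Return the minimum number of criterion-level judgements per period."""
    if num_criteria < 0 or data_points < 0:
        raise ValueError("counts must be non-negative")
    return num_criteria * data_points


# Current model: six criteria in the sciences, four in other subjects.
print("Sciences (six criteria):", minimum_judgements(6))
print("Four-criterion subject:", minimum_judgements(4))
```

A single task can address more than one criterion, so the number of separate tasks a student sits may be lower than these totals.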
Grading and reporting
Overall grades on students’ progress in the eight academic subject groups are reported on a 1-7 scale. A full set of descriptors of these grades is included in Appendix 2. These 1-7 scores are determined against a set of published grade boundaries for each of the eight subject groups. A best-fit approach is used to determine the score for each of the subject’s assessment criteria. These scores are then added up and grade boundaries are applied.
The positioning of these boundaries is, to some extent, an example of norm-referencing in the MYP. It is the point at which an essentially descriptive, criterion-referenced system is used to produce a single numerical score – and in the sciences it does not quite add up. A student who scores 4 in all criteria falls one point short of an overall grade of 5 – the grade which best represents his achievement when the descriptors are compared to one another. This suggests an issue with criterion validity, but could be remedied with a normative decision to move the boundary.
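The arithmetic behind this example can be sketched as follows. The only boundary used is a 25-point minimum for an overall grade of 5, which is an inference from the statement that six scores of 4 (a total of 24) fall one point short; the criterion labels A to F are placeholders, and the published boundary table should be consulted for real use.

```python
# Placeholder labels for the six sciences criteria, each scored 0-6.
CRITERIA = ("A", "B", "C", "D", "E", "F")

# Assumption inferred from the text: 6 x 4 = 24 is one point short of a 5.
GRADE_5_MIN_TOTAL = 25


def total_score(levels: dict) -> int:
    """Add up the best-fit levels awarded for each criterion."""
    return sum(levels[criterion] for criterion in CRITERIA)


def points_short_of_grade_5(levels: dict) -> int:
    """Return how many points the total falls below the grade 5 boundary."""
    return max(0, GRADE_5_MIN_TOTAL - total_score(levels))


all_fours = {criterion: 4 for criterion in CRITERIA}
print("Total:", total_score(all_fours))
print("Points short of a 5:", points_short_of_grade_5(all_fours))
```

Lowering the assumed boundary by a single point would place the all-fours student in the grade 5 band, which is the kind of normative adjustment the paragraph suggests.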
In the current model of the MYP, “all the work of students is internally assessed by teachers. There is no formal examination structure, no system of external assessment and the IB does not provide MYP exams” (IB, 2010a, p.52). The MYP in its current guise is commonly described as a framework for teaching and assessment, and is not intended to be a curriculum or replacement for standardized testing. The MYP Coordinator’s Handbook (IB, 2010a) goes on to say that “external examinations provided by other organisations are unlikely to address the MYP subject-specific objectives” (IB, 2010a, p.52).
As mentioned before, the low uptake of formal moderation by schools raises concerns about the reliability of grades awarded in MYP assessment. It is a requirement that schools with multiple teachers per section moderate internally, though there is little in the way of quality control to ensure that this takes place until the school’s five-year evaluation visit.
Criterion-related assessment in the MYP
Assessment of student achievement in eight subject areas and the personal project of the MYP is entirely criterion-related, using a best-fit approach (IB, 2008, p.40). This is derived from previous practice in criterion-referenced assessment. Although the two approaches are similar, and often confused by teachers and administrators, there are subtle differences between them. To fully understand the impacts of the Next Chapter and the continuing role of criterion-related assessment, we must first understand these key modes of assessment.
Norm-referenced assessment of student achievement does not overtly exist in the MYP and is not generally accepted practice in the MYP classroom. Norms are traditionally used to rank learners in terms of their perceived achievement in a test or assessment battery. Norm-referencing “places groups of students into predetermined bands of achievements. Students compete for limited numbers of grades within these bands which range between fail and excellence” (Dunn et al., 2002). In its most traditional sense, norm-referencing measures students only against others and is not necessarily a good measure of content mastery (O'Connor, 2011, pp.79-80). Norm-referenced grading is, in essence, a competitive pursuit and not in the interests of all students – especially those who struggle to succeed. This may be appropriate in a competitive environment, but it does not suit the inclusive nature of the IB programmes.
Criterion-referenced achievement “is not dependent on how well others in the cohort have performed, but on how well the individual student has performed as measured against specific criteria and standards” (Dunn et al., 2002). It is an assessment idea which has been in use since the 1960s, although it wasn’t until the early 1970s that academics such as Hambleton & Novick (1973) joined up key ideas in theory and practice. They state that common to all previous definitions of criterion-referenced assessment is the idea that “the definition of a well-specified content domain and the development of procedures for generating appropriate samples of test items are important” (Hambleton & Novick, 1973).
Having said this, it could be argued, as David F. Lohman quotes, that “behind every criterion lurks a norm” (Lohman, 2009). In assessment of learners in the MYP we aim to measure them against pre-determined performance outcomes – criterion descriptors – but how are these outcomes decided? This is where, to a greater extent, we find the norm: hiding in plain sight as the command terms of an achievement-level descriptor!
Assessment in a criterion-referenced system raises more challenges in terms of construct validity than traditional norm-referenced tests, as described by Edward Haertel in 1985:
“When tests are used only to rank examinees, validity can be established by simple correlations of test scores with criteria. Criterion-referenced interpretations, using test performance […] require new approaches to test validation.”
Essentially, here we see the importance of command terms come to the fore – the language or action-verbs used in assessment tasks and descriptors:
“This methodology begins with the description of the achievement construct in psychological and behavioral terms. The psychological description of the achievement construct is an account of the knowledge and skills it entails.” (Haertel, 1985)
The command terms are a defined set of action verbs which have been categorized in accordance with the ideas of Bloom’s taxonomy to represent a hierarchy of desired achievement constructs. The example rubric below, for the sciences criterion C: Knowledge and understanding, demonstrates this:
Table 1: Criterion C: Knowledge & understanding (current), taken from the MYP Science Guide (IB, 2010b)
Level | Descriptor
0 | The student does not meet any of the descriptors below.
1-2 | The student recalls some scientific ideas, concepts and/or processes. The student applies scientific understanding to solve simple problems.
3-4 | The student describes scientific ideas, concepts and/or processes. The student applies scientific understanding to solve complex problems in familiar situations. The student analyses scientific information by identifying parts, relationships or causes.
5-6 | The student uses scientific ideas, concepts and/or processes correctly to construct scientific explanations. The student applies scientific understanding to solve complex problems including those in unfamiliar situations. The student analyses and evaluates scientific information and makes judgments supported by scientific understanding.
The descriptors ‘recall’ and ‘describe’ are in line with the lower end of Bloom’s taxonomy – the knowledge domain. However, ‘construct’ and ‘analyse’ appear at the higher end. Because assessment focuses on these skills and knowledge outcomes, a normative aspect of assessment is present in the grade-level descriptors. This generates another issue with content and criterion validity in the current MYP model.
At the moment, these command terms are fully defined and published in a document entitled ‘Command terms in the MYP’ (IB, 2010c). However, they are not present in all subject guides and the usage of those that are present may not be consistent between subjects. A lack of coherence between classrooms may lead to issues of criterion-related validity, especially for students, and for teachers who teach across disciplines, who see command terms used in different ways.
Criterion-related assessment in the MYP differs from criterion-referenced assessment in a subtle but important way. Criterion-referenced assessment is often used to assess mastery of skills and content. Criterion-related assessment uses a best-fit approach to assign grades to students: “When assessing a student’s work, teachers should read the descriptors (starting with level 0) until they reach a descriptor that describes an achievement level that the work being assessed has not attained” (IB, 2010a, p.25). In practice, this allows for a teacher to judge a student’s work based on the most appropriate combination of descriptors as outlined in the rubric.
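One possible reading of the quoted procedure is sketched below in Python. It assumes the level bands of Table 1 (0, 1-2, 3-4, 5-6) and assumes that work is placed in the band just below the first descriptor it has not attained; the teacher's judgements are passed in as a hypothetical dictionary. Choosing the exact level within a band remains a professional judgement and is not modelled here.

```python
# Level bands from Table 1, ordered from lowest to highest.
BANDS = ["0", "1-2", "3-4", "5-6"]


def best_fit_band(attained: dict) -> str:
    """Walk up the bands and stop at the first descriptor not attained.

    `attained` maps a band label to the teacher's judgement (True/False)
    of whether the work meets that band's descriptor. Band "0" is the
    fallback when no higher descriptor is met.
    """
    best = BANDS[0]
    for band in BANDS[1:]:
        if not attained.get(band, False):
            break  # first descriptor the work has not attained
        best = band
    return best


# Hypothetical judgements for one piece of work.
judgements = {"1-2": True, "3-4": True, "5-6": False}
print("Best-fit band:", best_fit_band(judgements))
```

Reading upwards from level 0 means the work is credited for everything it demonstrably achieves before the first unmet descriptor is reached.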
The best-fit approach also covers assigning final grades. Averages and percentages are not acceptable practice – instead one must look at the recent trend in a student’s work towards a given criterion. For this reason it is important that there are multiple measures for each criterion per reporting period. A clarification of the IB’s position on best-fit grading is included in Appendix 3.
This best-fit approach to assessment is a strength of the MYP in terms of criterion-related validity as it focuses on the student’s ability to achieve in relation to a set of pre-determined, published performance descriptors. With the best-fit approach, teachers are best placed to assess a student’s work for what they have achieved, rather than what they have not (which is a feature of pure criterion-referenced assessment). It is reliable as it is based on multiple measures and evidence of trends in student achievement. However, for the system to work effectively, there need to be multiple measures of each criterion – which regularly proves a challenge in a subject with six criteria. In some classes, a ‘race to assess’ can impact both reliability and validity.
In a recent study in Sweden, grade inflation was observed in a criterion-referenced assessment system (Wikström, 2005). Wikström found in her study over six years that grades had been increasing in the criterion-referenced system and was able to exclude factors relating to authentic improved achievements, strategic course selection and selective exclusion of low-achievers. What remained was a lowering of standards, with a more notable change in the Arts and the smallest change in English and Mathematics, subjects calibrated against national tests. In a typical MYP classroom, assessment is in the hands of the teacher and therefore prone to positive grading or an individual’s interpretation of the criteria. Under the current system, which includes attitudinal grades, the effect of grade inflation may be more pronounced, having a negative impact on the reliability of grades awarded.
In the sciences it could be argued that half of a student’s current grade comes not from the ‘hard science’ of knowledge and lab investigative skills but from criteria that lean more towards the social sciences and language: One World, Communication in science and Attitudes in science. This raises a concern over content validity – is a student scoring well because she is good at science or is it because what is being assessed is not science? It also raises a more serious question of reliability and appropriateness when part of a grade is devoted to attitudinal or behavioural evidence – which can be subjective, is hard to track and does not give a measure of a student’s genuine achievements in science (O'Connor, 2011, pp.16-20).
Validity and reliability in science assessment in MYP: The Next Chapter
So does the Next Chapter address the issues in validity and reliability that are present in the current model, and how does this impact the sciences? To get a better picture of some of these proposed changes (which are currently being implemented in selected pilot schools), please refer to Appendices 4-7, which include: summary changes to the aims of the sciences; summary changes to assessment in the sciences; comparison of old vs new assessment criteria; and comparison of grade level descriptors for the knowledge-related criterion.
Criterion-related validity in the MYP sciences
Paring back the aims, assessed criteria and descriptors of the sciences is likely to have a positive effect on criterion-related validity. Through a clearer, shorter and better-defined set of aims and objectives, the task of assessing whether a student has met these goals will be more manageable and potentially more reliable.
Cutting the sciences criteria from six to four will also likely have a number of positive impacts on validity and reliability. The removal of the behavioural Attitudes in science criterion will allow for more reliable assessment of a student’s actual achievements against the science aims and objectives, with a reduced risk of subjective contamination. With the best practice of multiple measures, four criteria are easier to handle than six. This should give more opportunities for meaningful assessment of each criterion.
A study addressing the impact of removing these attitudinal criteria on overall student achievement would be interesting. One might hypothesise that overall 1-7 scores will decrease as the ‘safety nets’ of Communication in science and Attitudes in science are removed from the conceptually weaker students.
Finally, an increased programme-wide focus on the command terms, with common definitions, should serve to make the language of assessment easier for all to understand and lead to more criterion-related reliability. Wordy descriptors with multiple command terms should be replaced with more concise descriptors, giving a focus for assessment of the criterion. With a more manageable task in hand, students should be able to identify performance elements which will allow them to access higher grades.
Content validity in the MYP sciences
The MYP is described as a framework for assessment and learning and not an exhaustive curriculum. This allows scope for schools to set their own levels of content validity, such as meeting the state science content standards. However, this can be a challenge for schools where there is no parallel set of standards and can make the feed-in role of the MYP to the DP difficult. In the Next Chapter, clearer guidelines for content in the form of significant concepts and perhaps even online support content should allow teachers to plan units of work which can be assessed with greater content validity.
Testing knowledge in the MYP sciences
Under the Next Chapter, the key proposal that the Using knowledge criterion “must only be assessed through tests or exams” (IB, 2011) is, to me, one of the most interesting changes to be put forth in the MYP sciences. It represents a move to an assessment of knowledge that at face value may seem more ‘old-fashioned’ and less suited to differentiation to students’ needs than the current system. The working sciences guide allows for assessment of Knowledge and understanding through a diversity of modes, including case studies and response to articles or datasets (IB, 2010a, p.31).
As long as testing is used well, the new system will allow for greater reliability in the data produced (free from the potential contamination by other students’ ideas that exists in the current system). It may also have a positive impact on consequential validity as students move into the DP and preparation for a final exam marked on grade boundaries, making up 76% of their summative assessment. Arguably, the move to stipulate testing or exams as a method of assessment of Using knowledge is one to ensure greater reliability of assessment.
In
practice,
this
will
hold
significant
challenges
for
teachers
that
will
need
to
be
given
professional
development
considerations
from
the
IB.
As
Sylvia
Green
notes,
15. Stephen Taylor Assessment
“…The links between the level descriptions and [the national] test mark schemes are not so transparent. Different elements within structured questions may address different levels and content, even different domains within the subject, therefore it may be difficult to classify some questions as ‘at a particular level’. In such circumstances standard setting is done by determining ‘thresholds’ in total test scores, initially by judgmental means and subsequently using statistical equating to support judgments.” (Green, 2002)
Test design is a complex business and designing tests that work in a criterion-related situation is a challenge. Because testing is a traditional mode of assessment that gives the perception of rigour and ‘academia’, it will take a concerted effort to change the approach of stakeholders in assessment and to reinforce the criterion-related approach.
Conclusions & Recommendations
A great deal of thought and scholarship lies behind the Next Chapter and its implications for assessment in the sciences. Removal of attitudinal criteria, clearly defined command terms, more concise achievement-level descriptors and a narrower set of acceptable assessment tools should serve to enhance reliability of assessment. Emphasis on the aims of the sciences and the proposed production of pre-populated online unit planner tools may make some headway in validity of what is being assessed. However, it will take considerable work on the part of the IB, school leaders and teachers to translate the Next Chapter into effective classroom action.
Professional development of all teachers must play a central role in ensuring that assessment in the Next Chapter makes a successful translation from paper to practice. With over 900 schools practicing the MYP, it must not be assumed that the teachers in each classroom and the administrators in each office are clued-in to current educational philosophy and practices. It is already a programme authorization and evaluation requirement that teachers attend MYP workshops for programme delivery and development. Online and in-school workshops, as well as the Online Curriculum Centre (OCC), exist as tools for professional development and are becoming stronger.
The IB should take the opportunity to capitalize on its own developments and opportunities by making explicit discussion of validity and reliability in assessment practices a part of these resources. This outreach could take place through the OCC, video or article resources. Clear exemplars, such as those generally found in teachers’ support material, must be made widely available and readily accessible if they are to be put to good use. This is of particular importance to testing, perhaps the one criterion which represents the biggest change for science teachers in their methods.
Finally, there is a need to allow teachers to support the effective development of their students’ assessment practices. With the removal of attitudinal grading, and its consequential boost to validity and reliability, comes an increased likelihood of a failing student. It must be emphasized through all professional development modes, handbooks and other available media that effective, criterion-related formative assessment plays a crucial role in development:
“There is a body of firm evidence that formative assessment is an essential component of classroom work and that its development can raise standards of achievement.” (Black & Wiliam, 2010)
With some excitement, but also trepidation, I look forward to the Next Chapter. Early signs look positive that the MYP will become more reliable and valid in its assessment: it will evolve into a form that shows greater fitness for purpose.
Acknowledgements
Thank you to Malcolm Nicolson, Head of the Middle Years Programme, and Sean Rankin, Head of Curriculum and Assessment for the Sciences, for their input and willingness to answer questions by email. Thanks also to Sue Martin for her guidance and mentoring during the summer school and by email since.
References
Bishop, K., Bullock, K., Martin, S. & Thompson, J., 1999. Users' perceptions of the GCSE. Educational Research, 41(1), pp.35-49.
Black, P. & Wiliam, D., 2010. Kappan Classic: Inside the Black Box: Raising Standards Through Classroom Assessment. The Phi Delta Kappan, 92(1), pp.81-90.
CIE, 2011. Cambridge IGCSE Brochure (pdf). [Online] Available at: http://www.cie.org.uk/docs/qualifications/igcse/IGCSE%20Brochure.pdf [Accessed 4 January 2012].
Dunn, L., Parry, S. & Morgan, C., 2002. Seeking quality in criterion referenced assessment. [Online] Available at: http://www.leeds.ac.uk/educol/documents/00002257.htm [Accessed 20 February 2012].
Green, S., 2002. Criterion referenced assessment as a guide to learning - the importance of progression and reliability. [Presentation] Johannesburg. Available at: http://www.cambridgeassessment.org.uk/ca/digitalAssets/113775_Criterion_Referenced_Assessment_as_a_Guide_to_Learning._The_.pdf [Accessed 13 February 2012].
Haertel, E., 1985. Construct Validity and Criterion-Referenced Testing. Review of Educational Research, 55(1), pp.23-46.
Hambleton, R.K. & Novick, M.R., 1973. Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 10(3), pp.159-70.
Harlen, W., 2007. Criteria for evaluating systems for student assessment. Studies in Educational Evaluation, 33(1), pp.15-28.
IB, 2008. MYP: From principles to practice. Cardiff, UK: International Baccalaureate Organisation. Available at: http://ibo.org [password protected] [Accessed 18 October 2011].
IB, 2009. The Middle Years Programme: A basis for practice (pdf). Cardiff, UK: International Baccalaureate Organisation. Available at: http://occ.ibo.org [password protected] [Accessed 4 January 2012].
IB, 2010a. MYP Coordinator's Handbook (pdf). Cardiff, UK: International Baccalaureate Organisation. Available at: http://occ.ibo.org/ [password protected] [Accessed 4 January 2012].
IB, 2010b. MYP: Sciences guide. For use from January 2011. Cardiff, UK: International Baccalaureate Organisation. Available at: http://occ.ibo.org [password protected] [Accessed 30 January 2011].
IB, 2010c. Command terms in the MYP. Cardiff, UK: International Baccalaureate Organisation. Available at: http://occ.ibo.org [password protected] [Accessed 30 January 2011].
IB, 2011a. Development Report: MYP Sciences guide (pdf). [Online] Available at: http://occ.ibo.org [password protected] [Accessed 5 November 2011].
IB, 2011b. MYP Statistical Bulletin, June 2011 moderation session (pdf). [Online] Available at: http://www.ibo.org/facts/statbulletin/mypstats/documents/myp_statistical_bulletin_june_2011.pdf [password protected] [Accessed 12 February 2012].
IB, 2011c. MYP: the next chapter. Project report October 2011. [Online] Available at: http://occ.ibo.org [password protected] [Accessed 25 November 2011].
IB, 2012. IB Fast Facts. [Online] Available at: http://www.ibo.org/facts/fastfacts/ [password protected] [Accessed 20 February 2012].
Lohman, D.F., 2009. The Contextual Assessment of Talent. In Vantassel-Baska, J. Leading Change in Gifted Education: The Festschrift of Dr. Joyce Vantassel-Baska. Waco, Texas, USA: Prufrock Press. pp.229-41. Available at: http://faculty.education.uiowa.edu/dlohman/pdf/The_Contextual_Assessment_of_Talent.pdf
Messick, S., 1995. Validity of Psychological Assessment: Validation of Inferences From Persons' Responses and Performances as Scientific Inquiry Into Score Meaning. American Psychologist, 50(9), pp.741-749.
Morrison, N., 2009. GCSE your time is up. [Online] Available at: http://www.tes.co.uk/article.aspx?storycode=6012334 [Accessed 12 February 2012].
Moss, P., Girard, B. & Haniford, L., 2006. Validity in educational assessment. Review of Research in Education, 30(1), pp.109-62. Available at: http://rre.sagepub.com/content/30/1/109.full.pdf+html
Nicolson, M. & Hannah, L., 2010. History of the Middle Years Programme (pdf). [Online] Available at: http://occ.ibo.org [Accessed 14 February 2012].
Nicolson, Malcolm. Personal email correspondences. January 16-February 26 2012.
O'Connor, K., 2011. A repair kit for grading: 15 fixes for broken grades. 2nd ed. Boston: Pearson Education.
OFQUAL, 2011. General Conditions of Recognition. [Online] Available at: http://www.ofqual.gov.uk/files/2011-05-16-general-conditions-of-recognition.pdf?Itemid=111 [Accessed 12 February 2012].
Rankin, Sean. Personal email correspondences regarding sciences assessment. January 16-February 28 2012.
Thompson, J. & Hayden, M., 2011. The Middle Years Programme. In Thompson, J. & Hayden, M. Taking the MYP forward. Melton, UK: John Catt Educational. pp.13-18.
Wikström, C., 2005. Grade stability in a criterion-referenced grading system: a Swedish example. Assessment in Education, 12(2), pp.125-44.
Appendices
Appendix 1: Aims and objectives of the MYP sciences. Taken from the science subject guide (IB, 2010a)
Aims
The aims of any MYP subject and of the personal project state in a general way what the teacher may expect to teach or do, and what the student may expect to experience or learn. In addition, they suggest how the student may be changed by the learning experience.
The aims of the teaching and study of MYP sciences are to encourage and enable students to:
1. develop curiosity, interest and enjoyment towards science and its methods of inquiry
2. acquire scientific knowledge and understanding
3. communicate scientific ideas, arguments and practical experiences effectively in a variety of ways
4. develop experimental and investigative skills to design and carry out scientific investigations and to evaluate evidence to draw a conclusion
5. develop critical, creative and inquiring minds that pose questions, solve problems, construct explanations, judge arguments and make informed decisions in scientific and other contexts
6. develop awareness of the possibilities and limitations of science and appreciate that scientific knowledge is evolving through collaborative activity locally and internationally
7. appreciate the relationship between science and technology and their role in society
8. develop awareness of the moral, ethical, social, economic, political, cultural and environmental implications of the practice and use of science and technology
9. observe safety rules and practices to ensure a safe working environment during scientific activities
10. engender an awareness of the need for and the value of effective collaboration during scientific activities.
Objectives
The objectives of any MYP subject and of the personal project state the specific targets that are set for learning in the subject. They define what the student will be able to accomplish as a result of studying the subject. These objectives relate directly to the assessment criteria found in the “Sciences assessment criteria” section.
A One world
This objective refers to enabling students to gain a better understanding of the role of science in society. Students should be aware that science is a global endeavour and that its development and applications can have consequences for our lives.
One world should provide students with the opportunity to critically assess the implications of scientific developments and their applications to local and/or global issues.
At the end of the course, students should be able to:
• explain the ways in which science is applied and used to address specific problems or issues
• discuss the effectiveness of science and its application in solving problems or issues
• discuss and evaluate the moral, ethical, social, economic, political, cultural and environmental implications of the use of science and its application in solving specific problems or issues.
B Communication in science
This objective refers to enabling students to become competent and confident when communicating information in science. Students should be able to use scientific language correctly and a variety of communication modes and formats as appropriate. Students should be aware of the importance of acknowledging and appropriately referencing the work of others when communicating in science.
At the end of the course, students should be able to:
• use scientific language correctly
• use appropriate communication modes such as verbal (oral, written), visual (graphic, symbolic) and communication formats (laboratory reports, essays, presentations) to effectively communicate theories, ideas and findings in science
• acknowledge the work of others and the sources of information used by appropriately documenting them using a recognized referencing system.
C Knowledge and understanding of science
This objective refers to enabling students to understand scientific knowledge (facts, ideas, concepts, processes, laws, principles, models and theories) and to apply it to construct scientific explanations, solve problems and formulate scientifically supported arguments.
At the end of the course, students should be able to:
• recall scientific knowledge and use scientific understanding to construct scientific explanations
• apply scientific knowledge and understanding to solve problems set in familiar and unfamiliar situations
• critically analyse and evaluate information to make judgments supported by scientific understanding.
D Scientific inquiry
While the scientific method may take on a wide variety of approaches, it is the emphasis on experimental work that characterizes MYP scientific inquiry.
This objective refers to enabling students to develop intellectual and practical skills to design and carry out scientific investigations independently and to evaluate the experimental design (method).
At the end of the course, students should be able to:
• state a focused problem or research question to be tested by a scientific investigation
• formulate a testable hypothesis and explain it using scientific reasoning
• design and carry out scientific investigations that include variables and controls, material and/or equipment needed, a method to be followed and the way in which the data is to be collected and processed
• evaluate the validity and reliability of the method
• judge the validity of a hypothesis based on the outcome of the investigation
• suggest improvements to the method or further inquiry, when relevant.
E Processing data
This objective refers to enabling students to collect, process and interpret sufficient qualitative and/or quantitative data to draw appropriate conclusions. Students are expected to develop analytical thinking skills to interpret data and judge the reliability of the data.
At the end of the course, students should be able to:
• collect and record data using units of measurement as and when appropriate
• organize, transform and present data using numerical and visual forms
• analyse and interpret data
• draw conclusions consistent with the data and supported by scientific reasoning.
F Attitudes in science
This objective refers to encouraging students to develop safe, responsible and collaborative working practices in practical science.
During the course, students should be able to:
• work safely and use material and equipment competently
• work responsibly with regards to the living and non-living environment
• work effectively as individuals and as part of a group by collaborating with others.
Appendix 2: Grade-level descriptors in the Middle Years Programme
Grade | Descriptor
1 | Minimal achievement in terms of the objectives.
2 | Very limited achievement against all the objectives. The student has difficulty in understanding the required knowledge and skills and is unable to apply them fully in normal situations, even with support.
3 | Limited achievement against most of the objectives, or clear difficulties in some areas. The student demonstrates a limited understanding of the required knowledge and skills and is only able to apply them fully in normal situations with support.
4 | A good general understanding of the required knowledge and skills, and the ability to apply them effectively in normal situations. There is occasional evidence of the skills of analysis, synthesis and evaluation.
5 | A consistent and thorough understanding of the required knowledge and skills, and the ability to apply them in a variety of situations. The student generally shows evidence of analysis, synthesis and evaluation where appropriate and occasionally demonstrates originality and insight.
6 | A consistent and thorough understanding of the required knowledge and skills, and the ability to apply them in a wide variety of situations. Consistent evidence of analysis, synthesis and evaluation is shown where appropriate. The student generally demonstrates originality and insight.
7 | A consistent and thorough understanding of the required knowledge and skills, and the ability to apply them almost faultlessly in a wide variety of situations. Consistent evidence of analysis, synthesis and evaluation is shown where appropriate. The student consistently demonstrates originality and insight and always produces work of high quality.
Taken from the MYP Coordinator’s Handbook (IB, 2010a, pp.59-60)
Appendix 3: The best-fit approach (clarification from the IB)
“The descriptors for each criterion are hierarchical. When assessing a student’s work, teachers should read the descriptors (starting with level 0) until they reach a descriptor that describes an achievement level that the work being assessed has not attained. The work is therefore best described by the preceding descriptor. Where it is not clearly evident which level descriptor should apply, teachers must use their judgment to select the descriptor that best matches the student’s work overall. The “best-fit” approach allows teachers to select the achievement level that best describes the piece of work being assessed. If the work is a strong example of achievement in a band, the teacher should give it the higher achievement level in the band. If the work is a weak example of achievement in that band, the teacher should give it the lower achievement level in the band.” (IB, 2010a, p.25)
Appendix 4: A comparison of the aims of the sciences under the current MYP model and after the proposed changes of the Next Chapter.
Current MYP Sciences Guide (IB, 2010a, p.4)
The aims of the teaching and study of MYP sciences are to encourage and enable students to:
1. develop curiosity, interest and enjoyment towards science and its methods of inquiry
2. acquire scientific knowledge and understanding
3. communicate scientific ideas, arguments and practical experiences effectively in a variety of ways
4. develop experimental and investigative skills to design and carry out scientific investigations and to evaluate evidence to draw a conclusion
5. develop critical, creative and inquiring minds that pose questions, solve problems, construct explanations, judge arguments and make informed decisions in scientific and other contexts
6. develop awareness of the possibilities and limitations of science and appreciate that scientific knowledge is evolving through collaborative activity locally and internationally
7. appreciate the relationship between science and technology and their role in society
8. develop awareness of the moral, ethical, social, economic, political, cultural and environmental implications of the practice and use of science and technology
9. observe safety rules and practices to ensure a safe working environment during scientific activities
10. engender an awareness of the need for and the value of effective collaboration during scientific activities.
Proposed changes (IB, 2011, p.6)
The aims of the teaching and study of MYP sciences are to encourage and enable students to:
• understand and appreciate science and its implications through the areas of interaction
• consider science as a human endeavour with benefits and limitations
• cultivate analytical, inquiring and flexible minds that pose questions, solve problems, construct explanations and judge arguments
• develop skills to design and perform investigations, evaluate evidence and reach conclusions
• engender an awareness of the need to effectively collaborate and communicate
• apply language skills and knowledge in a variety of real-life contexts
• demonstrate sensitivity towards the living and non-living environments
• reflect on learning experiences and make informed choices
Appendix 5: Summary of assessment-related changes to the MYP sciences programme under the Next Chapter.
Current Sciences Guide (IB, 2010a) | Proposed changes under the Next Chapter (IB, 2011)
Six assessment criteria | Four assessment criteria
Zero-band plus three dual bands of achievement-level descriptors (0, 1-2, 3-4, 5-6) | Zero-band plus four bands of achievement-level descriptors (0, 1, 2, 3, 4 in the current working guide for pilot schools)*.
Command terms defined in the subject guide and used in achievement-level descriptors. | Command terms defined across the whole MYP and used in a more focused manner in achievement-level descriptors.
Attitudes in science criterion in use. | Attitudes in science criterion removed.
One world and Communication in science criteria in use. | Science in the world criterion merges the aims of One world and Communication in science.
Assessment of subject content acquisition primarily through Knowledge and understanding criterion. Modes of assessment open to teachers. | Using knowledge criterion to assess subject content acquisition. Key proposal that this “must only be assessed through tests or exams” (IB, 2011, p.13)
Lab and investigative work assessed through Scientific inquiry and Processing data criteria. | Lab and investigative work assessed through Inquiring and designing and Processing and evaluating criteria.
*This change is noted in a copy of an assessment rubric from the guide for pilot schools, shared by Sean Rankin, Curriculum and Assessment Manager for the MYP sciences.