From the CALPER/LARC Testing and Assessment Webinar Series
Download the handouts and ppt: https://larc.sdsu.edu/archived-events/
View the recording: http://vimeo.com/55788959
What’s it About?
This webinar is about assessing second language speaking. At the start of the webinar I’ll talk a little bit about why we assess speaking – because that hasn’t always been the case – and cover some history of assessing speaking in the United States. Even though I’m from the other side of the pond, I’ve dug around quite a bit in developments in the US. We’ll quickly move on to talk about what we assess, particularly looking at “constructs” and “skills”. Clearly we can’t assess everything in one assessment or test, so we have to select what is most important for our students. Next we’ll look at how we can elicit the evidence we need to make judgments about the quality of learner talk; this will involve looking at task types we can use, and there is a small number of samples on a hand-out that we can discuss. When learners talk during a task, we have to be able to summarize the quality of the speech. To do that, we need to be able to “rate” or “score” the performance, using rubrics or rating scales – depending on which side of the Atlantic you’re on! So we’ll look at different ways of doing that, and you’ll have the opportunity to evaluate and vote on the kind of rubrics you prefer. We’ll finish up by looking at the processes involved in designing assessment systems and putting them into practice.
For more information view this short video introduction.
Who’s it For?
There is no assumption of prior knowledge of assessing speaking, although classroom experience would make the webinar much more useful! Nor is it aimed only at people who design large-scale summative tests: the ideas are as relevant to local formative assessment practices as they are to summative assessment. At some points we’ll also be questioning some current practice in standardized testing that just isn’t relevant to classroom learning. So it’s definitely for teachers, although people who work for assessment agencies would also find the overview interesting. It’s also for students studying for MA or PhD degrees, and I will be mentioning research that informs practice; but clearly, in just one hour it is impossible to get into research in detail.
What do I Need to Do in Advance?
There are three hand-outs to accompany this webinar, and there’s a lot of information in them (download below). I’ll be referring to these hand-outs during the webinar, but you don’t want to be reading them when you could be sharing ideas with others or following the talk. So please download them before the webinar, read them at your leisure, and have them to hand for reference during the webinar. And if you want to know a little bit about me, visit my website at http://languagetesting.info.
Webinar Date: April 1
Assessing Speaking: Putting the Pieces Together with Dr. Glenn Fulcher
1. Assessing Speaking: Putting the Pieces Together
Glenn Fulcher
“The ability to speak a foreign language is without doubt the most prized language skill, and rightly so…”
Robert Lado
(Language Testing, 1961, p. 239)
http://languagetesting.info
2. Outline of the Webinar
“Yet, testing the ability to speak a foreign language is perhaps the least developed and the least practiced in the language testing field.”
• Background: Why assess speaking? – A little bit of history
• Concepts: What are we assessing? – Defining the constructs
• Information: How can we elicit evidence? – Task types
• Decisions: How can we assess speaking? – Rubrics and scales
• Planning: Designing assessments – Creating specifications
• Quality: Using assessments – Raters, interlocutors, and feedback
3. Why assess speaking? A little bit of history
• The situation prior to WWII: Army Beta
• Kaulfers, W. (1944). War-time developments in modern language achievement tests. Modern Language Journal 28(2), 136 – 150.
The urgency of the world situation does not permit of erudite theorizing in English about the grammatical structure of the language for two years before attempting to converse or to understand telephone conversations… The nature of the individual test items should be such as to provide specific, recognizable evidence of the examinee’s readiness to perform in a life-situation, where lack of ability to understand and speak extemporaneously might be a serious handicap to safety and comfort, or to the effective execution of military responsibilities.
• ASTP (Agard & Dunkel, 1948:
http://eric.ed.gov/PDFS/ED041490.pdf)
• The FSI (1952 – 1958)
• ILR (1968)
• ACTFL & the OPI (1987)
• cf. Assessing speaking in the United Kingdom
4. Yesterday and today…
Continuity:
• Linking language production to real-world contexts
• Valuing communication over knowledge about the language
• Achieving communicative goals effectively
• Performing work-related tasks safely
• Placing individuals in appropriate training or jobs
• Acquiring competence in educational contexts
• Giving learners a sense of achievement
• Motivating further learning
• Providing useful feedback on learning
• Producing competent individuals for the nation
5. Poll Time
What is the main reason for you to assess speaking in your own context?
(a) Focus on communication in the classroom
(b) Prepare learners for the world of work
(c) Prepare learners to operate in an educational environment
(d) Motivate learners
(e) Other (Let us know on the discussion board)
6. What are we assessing?
• ACCURACY – Language Competence
• Pronunciation
• Stress
• Intonation
• Syntax
• Vocabulary
• Cohesion
Defining the constructs – or what does it mean to ‘speak’?
7. What are we assessing?
• FLUENCY
• Hesitation
• Repetition
• False starts
• Self-correction
• Re-selecting lexical items
• Restructuring sentences
8. What are we assessing?
• COMMUNICATION STRATEGIES
• Achievement strategies:
• Overgeneralization
• Paraphrase
• Word coinage
• Restructuring
• Cooperative strategies
• Code switching
• Non-linguistic strategies
vs.
• Avoidance strategies:
• Formal avoidance
• Functional avoidance
9. What are we assessing?
• DISCOURSE COMPETENCE
• Scripts
• Adjacency pairs
• Turn taking
• Openings and closings
• INTERACTIONAL COMPETENCE
• Managing co-constructed speech
• TASK COMPLETION
• Is the outcome successful?
10. What are we assessing?
• PRAGMATIC & SOCIOLINGUISTIC COMPETENCE
• Appropriateness
• Implicature (speech acts)
• Establishing identity (being through words)
• Situational sensitivity
• Topical knowledge
• Cultural knowledge
• Honorifics
11. Poll Time
What do you think is the most important
aspect of speaking to assess in your own
context?
(a) Accuracy
(b) Fluency
(c) Strategies
(d) Discourse and Interactional Competence
(e) Pragmatics
12. How can we elicit evidence?
• Task Orientation
• Open (outcomes dependent upon speakers)
• Guided (by instructions)
• Closed (outcomes dictated by input or instructions)
• Interactional Relationship
• Non-interactional (monologue)
• One-way
• Two-way
• Multi-way
Task Types (Fulcher, 2003, p. 57)
13. How can we elicit evidence?
• Goal Orientation
• None
• Convergent
• Divergent
• Interlocutor Status and Familiarity
• No interlocutor
• Higher status
• Lower status
• Same status
• Degree of familiarity
• Topic(s)
• Situations
Task Types (Fulcher, 2003, p. 57)
14. Poll Time
Which of the four task types would you
prefer to use?
(a) Reading aloud
(b) Talking with the teacher
(c) Pictures
(d) Simulation
After voting you may wish to say why in the chat box.
15. Where Do We Store Speech?
The ephemeral nature of speaking:
• Let it pass
• Score in real time
• No going back
Options for Pedagogy:
• Integrated Projects: News/Magazines; Content & Language Integrated Learning (CLIL)
• Audio: Web Portfolio; Podcasting
• Video: Web Portfolio; Vodcasting
16. Rating Performance: Evaluating the Quality of Language Using Rating Scales
MEASUREMENT DRIVEN APPROACHES
• Expert judgment
• Experiential
• Scaling ‘can do’ statements
DATA DRIVEN APPROACHES
• Data-based scales
• EBB scales
• Performance Decision Trees
17. Poll Time
Which rating scale do you think might be
most useful in your context?
(a) ACTFL
(b) CEFR
(c) EBB
(d) Data-Based
(e) PDT
18. Designing Speaking Assessments: Creating Test Specifications
Identifying Purpose for Assessment
• Defining the Learners
• Describing Reasons for Communicating
• Identifying Constructs/Intended Outcomes
Setting Up a Design Team
• Teachers and Other Stakeholders
• Assessment Advisors
• Administrators
Designing Specs and Tasks Iteratively
• Writing and Prototyping Tasks
• Developing Scoring Systems
Administrative Guidelines
• Timing, Resources, Equipment
19. Using Assessments
Raters
• Consistency between judges
• Consistency within judges
• Convergence or Divergence?
Interlocutors
• Accommodations
• Scaffolding (teacher talk)
• Behavior
• Mannerisms
Feedback
• Audience
• Purpose
• Response
• Time
• Opportunities to Learn
20. Poll Time
Is convergence or divergence more important in making qualitative judgments?
(a) Convergence
(b) Divergence