29. Practice vs Testing
Aspect | In-class practice | During exam
Goals | learning | feedback on learning
Content | process oriented | product oriented
Content | open ended | closed ended
Learner activity | students know the material | students might not know the material
Learner activity | success-oriented | success/failure oriented
Learner activity | peer teaching | no peer teaching
Teacher activity | helps performance | gives tasks
Classroom climate | cooperative | competitive
Classroom climate | relaxed | tense
Classroom climate | intrinsic motivation | extrinsic motivation
62. IV. The clerk knows Cleopatra. Caesar asks the clerk about Cleo. Complete the conversation. Use the words in parentheses. (4 points, .5 each)
Clerk: Yes, I know her.
Julius: (1) _______________________________________ (work)?
Clerk: (2) ________________________________ (palace downtown).
Julius: (3) _______________________________________ (do)?
Clerk: (4) ___________________________________ (help people).
Julius: (5) ___________________________________ (close friend)?
Clerk: Yes, (6) ____________________________________ (funny).
Julius: (7) _______________________________________ (sports)?
Clerk: Yes, (8) ___________________________________ (tennis).
Actual Student Responses on the worksheet
73.
Student A: Imagine you are at a meeting and not in an exam. All of your classmates are at the meeting too. Your partner doesn’t know anyone. Tell him who the people are.
Student B: Imagine you are at a meeting and not in an exam. All of your classmates are at the meeting too. You don’t know anyone. Ask your partner who the people are.

Student A: Tell your partner about an accident you or some member of your family had. When he/she tells you, ask some intelligent questions or make relevant comments.
Student B: Tell your partner about an accident you or some member of your family had. When he/she tells you, ask some intelligent questions or make relevant comments.

Student A: You are making a survey about what people think they will be able to do with telecommunications in twenty years. Ask your partner at least three questions about the topic.
Student B: Your partner is making a survey. Answer his/her questions.
Editor's notes
Go over the Overview and have students predict what each theme refers to…
Why should we test? [click] Exams can indicate whether a student is able to take a specific course, as in a placement exam. [click] Exams can check a student’s general progress, as in proficiency exams or diagnostics. [click] These are exams like the TOEFL…they can exempt students from studying, but they can’t place them in a specific course. [click] Exams can check how much a student has learned in a specific course. These are the exams we give every unit, every three units or every semester. These exams are called achievement exams. Each type of exam has a different purpose and they cannot be interchanged. For example, the TOEFL is a proficiency exam written to determine if the student’s level of English is high enough to undertake a course of study in the United States. It isn’t a placement exam for a specific course. A placement exam should be developed in direct relationship with the text used so that the student can be placed in the appropriate level. When you change texts, you change placement exams. Achievement exams are written to cover a limited amount of material to determine if the student has mastered it satisfactorily.
There are testing alternatives [click] that aren’t the traditional “paper and pencil” exams. [click] You can evaluate students’ progress constantly, not punctually (at specific times) [click] Definitions from REF 1: The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
Reference 1 (REF1) From The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
Portfolio assessment is very popular in places like the US where classes are small and teachers have office hours and preparation time. [click] But portfolios are more subjective: you can’t guarantee that two teachers will grade the same way. [click] Teachers in Mexico also have physical limitations: lots of students, lots of groups, little free time to correct. [click] Would portfolios be accepted by the SEP, schools, parents and students? Have participants discuss in pairs and then compare responses.
Why do we rely on written tests here in Mexico? [click] It isn’t the only way to test and it probably isn’t the fairest…some students don’t do well on written tests [click] But they are the easiest way to test with the large numbers of students we have here [click] It is more objective and can be designed so that results can be fairly consistent among teachers at the same institution [click] And they are accepted by institutions (official study plans, schools) parents and the students themselves
Only show the title of the slide. Ask teachers how many have exam banks at their schools (If some participants don’t know what exam banks are, ask for a definition). Put students into groups, try to get at least one teacher who has worked with an exam bank in each group. Have them discuss the advantages and disadvantages of the exam banks. When they finish, listen to some of their conclusions. Then continue on with the slide. Exam banks have many advantages for large schools. They are a collection of exams [click] that could be written by the teachers or a special committee [click]. They must be based on institutional guidelines. [click] It is possible to have various cycles of the exams.
Benefits: Teachers don’t have to write all their own exams. Instead of writing 5, 10 or more exams a semester, they write one or two. It saves time and energy and improves quality. [click] They are useful in larger schools because they help standardize performance, assuring that all the students who pass from one level to another are ready. The criteria, instructions and grading are uniform. [click] Face validity (an exam looks the way the teachers and students expect it to). The format is similar. All exams in the exam bank follow the same rules so there are no surprises. They even look similar. [click] Exams are also about the same length. In reality exam banks are the most time-efficient, reliable ways to test. Each exam can be written with the same criteria and the results between groups are much more reliable. Various cycles can be created so that cheating is minimized, and each individual teacher works less since, instead of writing an exam for every class they are teaching, they might just write a couple of exams for the exam bank. In pairs: Do they have exam banks? How do they work? If not, would they work?
So, if you are going to give a written test, you need to first decide what you are going to test [click] And that depends on many variables [click] The institution: often your school decides what you will test or gives you the exams you will use [click] Your students: Exams should reflect the age, interests, goals, and abilities of your students [click] Other teachers: If you are writing for an institutional exam bank, you have to write exams other teachers can use. They can’t just reflect what you do in class. [click] The text: Exams vary depending on the textbook you use. If you change texts, you can’t continue using the same old exams since the material covered is probably quite different [click] Time: It makes a big difference if your students have 30 or 50 minutes or an hour and a half to take an exam.
Writing for an exam bank requires the writer to think about the other teachers who will be using the exam. There are many criteria that need to be considered: [click] Level of English: Not all your colleagues speak English as well as you do. [click] Lack of mathematical skills: We all became teachers because we couldn’t do math. Therefore, don’t make exams practice in adding fractions. Don’t assign point values of 5/6, ¾, 8/9, etc. Teachers will not appreciate it. [click] Time factor: Pay attention to how much time is available to take the test. Remember, 50 minutes for an exam also includes a lot of organizational activity: seating arrangements, handing out the exam, going over the instructions, etc. [click] Ease of grading: Teachers don’t have that much free time to correct exams. Make them as easy to correct as possible without relying entirely on multiple choice and true/false. [click] Answer key: Provide the answers. Not all teachers will be able to read your mind. Understand why: Be sure other teachers understand why you are testing in a specific way BEFORE they give the test. It will save time and misunderstandings later.
You always have to adapt your test-writing to your students [click] If they are younger, use more images and make the exams shorter (students will work slower and have shorter attention spans), and always include humor. It relaxes them and makes an unenjoyable experience a bit more tolerable. [click] If they are older, consider making the exams more professional, reflecting what the students are studying or doing in their everyday lives and again, include humor. Even older students like to laugh occasionally.
You also have to decide how to divide the available points among the skills your students have practiced, taking into account the emphasis given to grammar and vocabulary training. For example, maybe you are using a reading comprehension text that includes some grammar and vocabulary work. If that were true, you’d want to devote more points to testing reading than to grammar and vocabulary. But if you were using a “four-skills” text that concentrates on vocabulary and grammar development and just has one reading practice per unit, you would devote many fewer points to reading. [click] If your institution tells you how many points to devote to each skill, you just follow through. [click] However, if you have to determine how many points to assign yourself, use your textbook as a guide. If you are writing exams other teachers will be using, it is only fair since it is the only common denominator. If you love teaching vocabulary and always give students extra vocabulary lists and a colleague prefers to teach grammar and doesn’t have access to your creative lists, imagine using his exam that ignores your beloved vocabulary items that you had the students review every night. Even if you write exams just for your students, it is only fair to base them on the text. Imagine the poor student who gets chicken pox and has to miss two weeks of class. She might not get your beautifully designed lists. All she can study from is the text and when she gets back to class and you give her your exam with 50% of the points dedicated to vocabulary you only presented in class, she’ll be very shocked to realize her hours of self-study were useless and she can’t understand what you are testing. [click] So, base your analysis on a general text analysis. [click] Look at about how much time is spent on each skill. [click] Count how many grammar, vocabulary, listening, reading, etc. exercises there are in each unit and calculate what percent of the time spent in the book is dedicated to each skill. [click] However, always keep the institutional goals in mind. Often the text chosen by the administration doesn’t really reflect the goals they have in mind for the students.
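The counting step described in this note can be sketched as a small calculation. This is only an illustration in Python: the exercise counts, the 100-point total and the `allocate_points` helper are all hypothetical and do not come from any particular textbook or institution.

```python
def allocate_points(exercise_counts, total_points):
    """Split total_points across skills in proportion to exercise counts.

    Whole points only: each skill first gets the rounded-down share, and
    any points left over go to the skills with the largest remainders.
    """
    total_exercises = sum(exercise_counts.values())
    if total_exercises == 0:
        raise ValueError("no exercises counted")
    points = {}
    remainders = {}
    for skill, count in exercise_counts.items():
        # Integer arithmetic avoids rounding surprises.
        points[skill], remainders[skill] = divmod(count * total_points, total_exercises)
    leftover = total_points - sum(points.values())
    # Skills with the largest remainders receive one extra point each.
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for skill in ranked[:leftover]:
        points[skill] += 1
    return points


# Hypothetical exercise counts for one unit of a "four-skills" text.
counts = {"grammar": 12, "vocabulary": 9, "listening": 5, "reading": 2, "writing": 2}
allocation = allocate_points(counts, 100)
print(allocation)
```

The result is only a starting point. As the note says, institutional goals may justify moving points from one skill to another.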
Base the content analysis on the textbook for the reasons we mentioned previously…Slide 13
You can’t test students on something they haven’t seen. [click] If you are writing for other teachers, the only thing you have in common is the textbook. If you are only writing for your students and they are absent, the only resource they have is their textbooks[click] You need to make an analysis of how much time is spent on each aspect you want to test.
Here is an example of a final content analysis. If you are using an exam bank, all the teachers should have it. The students should also have it. It can help them study. Look at the Grammar section. Notice that one point of the 15 points on that section of the exam is dedicated to the comparative. If the student knows that only one point is dedicated to that structure, he will know that it is more important to study BE and the simple present (7 points). This can also help the teachers since no one would spend hours teaching all aspects of the comparative if it is not represented with more than one point on the exam. This would be a positive washback effect.
Ask participants what they think “communicative testing” refers to. After hearing some possibilities, tell them: The “Communicative Approach” has been in existence since the late 80s. We now accept the ideas of teaching communicatively: group work, teaching language in context, etc, but we continue testing traditionally [click] Ask participants what “traditional testing” is (isolated sentences, transformation, grammar-based) [click] Communicative testing means testing in context…not isolated sentences
HO2-- Let’s look at some examples. What will you test? [click] Grammar? Refer participants to HO2. Have them look at the grammar examples and compare them in pairs. What are the differences ? When they finish giving you some differences go to the next slide.
[click] The example on the left is traditional. There is no context. It is OK for simple structures like this one, but what about more complex structures such as If clauses or present perfect/past? Here is an example you can give them: If I __________ (be invited) to your party, I __________ (go). What is the answer? If I am invited to her party, I’ll go (If 1) If I were invited to her party, I’d go (If 2) If I had been invited to her party, I would have gone (If 3) Since there is no context, any of them would be correct. [click] In the other example the grammar is presented in context, a conversation. If this had been about If clauses, it might have been: “ I’m sorry. I didn’t know she was going to have a party. If I ______________ (be invited), I ___________ (go).” Obviously If 3.
What will you test? [click] Vocabulary? Refer participants to the HO2. Have them look at the vocabulary examples and compare them in pairs. What are the differences? When they finish giving you some differences go to the next slide.
[click] The example on the left is a traditional vocabulary section. It tests if they learned a vocabulary list, but it doesn’t test if they really know how to use the vocabulary. [click] The example on the right tests many different problems students can have when they work with vocabulary: Difficult pairs: (1) (3) (6) Collocation: (2) (4) (5) The context lets you test more.
Ask participants how they could test functions…collect some ideas.
Go over the definition of functions just in case someone doesn’t know what they are.
Go through the examples. Emphasize how the following aspects become more complex as we go down the list: Grammar structures (easy to more difficult) Length of sentences (short to long) Register (from informal to formal)
Get some examples of functions [click] Then ask where they can find the functions in the text (in the contents at the beginning of the book) Have teachers look at the Overview for Attitude 2 (SB)… Which functions are similar? Look at Units 4-5…how are the functions presented?
This is the best type of section to test students’ knowledge of functions. (In HO2) [click] Students complete a conversation (or paragraph) with sentences (or even phrases) that communicate the correct function logically. Go over the first example. Show that there isn’t one correct answer. The first one could be: It has / I have / There is a big living room. Elicit possible answers for # 2 and 3. The section can be even more open, as the other two options show. Elicit possible answers. Have participants compare this with the grammar and vocabulary sections they have seen. What are the differences? Emphasize that here the purpose is communication and that communication can occur even if the students don’t use complete sentences or make some grammar errors. This will be seen later in the workshop when they learn to correct these sections.
It is important that participants realize they can’t just take exercise sections from textbooks and use them to test. The differences between textbook exercises and examinations are the same as those between “in class practice” and “during exam” activities. The exam sections have different purposes and, therefore, different structures. Go over this chart carefully…
Aims: In class the purpose is to learn; in exams it is to get feedback on the learning.
Content: Activities in class are process oriented (the “doing” is usually more important than the result); exams are product oriented. In language exams it doesn’t matter how you get to a result; it’s the result that is graded. Activities in class are also open-ended: there isn’t always a result and they can continue for days. On exams they have to close.
Learner activities: In class students basically know the material or can use books and dictionaries to find out what they don’t know; on exams they often don’t know the material. Classroom activities are success-oriented (students are helped to succeed); exams are success-oriented for some students, but not for all. You can have peer teaching in classroom activities, but working in groups is usually frowned on during exams.
Teacher activity: In class, the teacher helps students improve their performance. On exams the teacher might help the student understand what he is supposed to do, but the teacher doesn’t help the student directly with the answers.
Classroom climate: The climate in class is cooperative (students help each other), but on exams it could be competitive. Classroom activities are relaxed; exams are tense. There is intrinsic motivation in class (students often are motivated to do well by their own internal desires), but it is extrinsic during exams (parents, schools, grade pressure, etc. usually motivate students to do well).
Balance means it adds up to 100%. You don’t need to have a 50-50 balance; for example, 80-20, 70-30 and 40-60 are also balances. The balance depends on the school situation…. BUT all three should be present in an exam.
Go over the definitions…be sure participants understand them… Accuracy and Fluency balance depends on what the students will be doing with English. Ask the participants: What accuracy/fluency balance would you recommend for tourism students? What is more important for them? (Fluency-maybe 70-30 over accuracy). What balance for students who will be translators? (Accuracy—maybe 80-20 over fluency).
In production, the student writes more than one word (remind them of Complete the Conversation for testing functions). The student can be more creative and the teacher has to be more alert because more than one answer might be correct. In recognition (for example, multiple choice or true/false), there is only one correct answer and it isn’t creative. Production means more work. Teachers with large numbers of students can’t handle a 50-50 balance. A good exam could be 20-80 or 30-70 (production/recognition) and still be fair. TOEFL and other similar exams are all recognition…they never test whether the student can produce language. Years ago this led to a big influx of Asian students into US universities. They had studied grammar and reading, but no speaking or writing. They did great recognizing correct answers on the TOEFL and got very high scores, but when they arrived in the US, authorities realized they couldn’t say two words in English…. The TOEFL exam was revised and now includes writing sections, and often oral interviews are required to study in the US.
Sections can also be objective or subjective. It is good to have some subjective sections, but an exam with sections that are all graded subjectively makes it difficult to compare student ability across different teachers…. However, even though subjective sections are more difficult to grade, they do give more information about students’ ability. The limitations can be overcome if the graders are trained…. This is a very common problem with oral grading.
HO 4 (Put participants into pairs and have them look at each exam section. For each one, they identify it as accuracy/fluency, production/recognition and subjective/objective. About 15 minutes. Then go over all of them using the slides.) We are going to look at some exam sections and identify if they test accuracy/fluency, if they are designed to be production or recognition sections and if they are to be graded subjectively or objectively. Have participants go over each section in pairs using the handout. Then use this slide and the following ones to go over them. Answers: Fluency: Have participants show you there is more than one possible answer for most of the items (for example, (1) What’s your name? / Your name, please? / Name, please / etc.). Production: Students write complete sentences or phrases. Subjective: The grader has to understand possible answers.
Answers: Fluency: it practices functions through vocabulary chunks which are essential for fluent conversations. Recognition: There is only one right answer. Objective: Anyone could grade it given an answer key
Answers: Accuracy: This is testing correct use of the present tense. Recognition: In reality all the student needs to do is find the subject and put the verb in the correct form Objective: There is only one correct answer.
Answers: Fluency: This is similar to the first example, but it is limited by cues. However, the students still have some freedom to be creative: (1) Look at those earrings / Look at these earrings / Look earrings ) Production: They are writing phrases Subjective: The grader must understand English to tell if a student’s answer communicates clearly.
Answers: Accuracy: tests grammar structures taught in the text Recognition: Just find the answer in the box and write it. Objective: One correct answer
Answers: Fluency: This requires the students to use discourse cues to order the conversation. It goes beyond simple accurate sentences. Discourse cues are essential for effective communication. Recognition: They just copy the sentences Objective: There is one correct order.
Go over slide with participants…. If necessary, go back and look at the examples on the previous slides and on HO 4.
Now, we’ll look into some more difficult formats in more detail
Get examples so you can be sure they understand what the stem (first part) and the options (a, b, c) are.
REF 3: From Kehoe, Jerard. Writing Multiple-Choice Test Items. ERIC/AE Digest Series EDO-TM-95-3, October 1995. http://www.ericdigests.org/1997-1/test.html Go through the points one-by-one…checking comprehension as you go
Continued
You might want to discuss how many options they feel they should use. Kehoe recommends 3-4, but some really formal exams use 5 options.
These are the different orders possible…. Talk about which seem best and why. Do the multiple-choice handout (HO5) in pairs (30 minutes). These are the poor examples from the Burton article. Have participants think about what is wrong with them and how they could be improved. Then go over them. Use REF 4 to correct… REF 4: Burton, Steven J., Richard R. Sudweeks, Paul F. Merrill, Bud Wood. How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty, Brigham Young University Testing Services and The Department of Instructional Science. 1991. http://testing.byu.edu/info/handbooks/betteritems.pdf When they finish HO5, have them write a multiple-choice section using their content analyses and the section divisions they recently did. When they finish, have them compare and criticize their work. Be careful they are using multiple choice for reasonable sections…(20 minutes)
Go over with participants. Ask for their opinions. From (REF5) Designing Test Questions, Grayson H. Walker Teaching Resource Center, The University of Tennessee at Chattanooga, http://www.utc.edu/Administration/WalkerTeachingResourceCenter/FacultyDevelopment/Assessment/test-questions.html
Go over… From same source as previous slide
Get examples from them…
Macmillan English Dictionary…on line Go over the definition and be sure everyone understands.
Look for examples
Go over Macmillan English Dictionary…on line
Find examples When you finish the discussion , have them write two exam sections using their content analyses and the section divisions they recently did. They decide on which types of sections they want to write. When they finish, have them compare and criticize their work. Be careful they are using formats that are reasonable for what they are testing…(20 minutes)
Go over this with the participants. It will be the hardest section to write since they have very little experience with communicative exams.
This is all logical, but we often forget when we are writing exams [click] Production items should have more points because the student is required to do more [click] Recognition items are easier (it’s easier to recognize a correct answer than to think of it) so they should be worth less [click] Production sections also require more points because you should give partial credit [click] Keep away from complex fractions Have them assign point values to the sections they have written.
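One way to check point values against these guidelines is sketched below in Python. It is only an illustration: the set of “simple” per-item values, the three credit levels and the helper names `section_points` and `grade_production` are assumptions made for the example, not a prescribed grading scheme.

```python
from fractions import Fraction

# Per-item values that are easy to add up by hand.
# This particular set is an assumption made for the sketch.
SIMPLE_VALUES = {Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)}


def section_points(num_items, points_each):
    """Return the total for a section, rejecting awkward per-item values."""
    value = Fraction(points_each)
    if value not in SIMPLE_VALUES:
        raise ValueError(f"{points_each} points per item is hard to add up by hand")
    return num_items * value


def grade_production(points_each, credit_levels):
    """Add up a production section graded with partial credit.

    Each entry in credit_levels is 0 (no credit), 1 (half credit)
    or 2 (full credit) for one item.
    """
    value = Fraction(points_each)
    return sum(Fraction(level, 2) * value for level in credit_levels)


# Hypothetical exam: 10 recognition items at 1 point, 5 production items at 2 points.
recognition_total = section_points(10, "1")
production_total = section_points(5, "2")
# One student's production section: full, half, none, full, half credit.
student_score = grade_production("2", [2, 1, 0, 2, 1])
print(recognition_total, production_total, student_score)
```

Making production items worth twice as much as recognition items keeps half credit at a whole number, so no one has to add fractions while grading.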
Go over these points one-by-one and let participants comment. Some are a bit controversial. (The instructions are not the exam. This is the basis of this slide…) Instructions should be as simple as possible. They aren’t testing the student’s knowledge of English, the items are. [click] Instructions can be very intimidating for lower level students. Write them on the exam in Spanish. They are not testing English, the items are. [click] In higher levels, if the instructions are in English and a student just can’t understand what to do, translate them. The instructions are NOT the test. [click] Always give the same instructions using the same wording. Students will learn them and save time. [click] Be careful when using examples. Some structures are based on patterns and examples can give away the pattern. For example, want someone to do something, if clauses, question formation, etc. [click] And finally….the instructions are NOT the exam Have them write instructions for each of their sections. Compare and discuss. (10 minutes)
It’s easy to correct an “A, b, c” section, but how do you correct a “complete the conversation” section so that it really tests communicative ability and competence?
Go over the summary
These are examples of partial credit. Go over them one-by-one. Ask participants if they would give partial credit or not if they were grading communication… First example: Does it communicate? Would you be able to answer the student’s questions? Probably. If it were a C1 student on the first exam, I’d give credit, but write in the corrections. On later exams, I’d probably not be as generous. Second example: The meaning is different. In the correct answer, you didn’t invite me; in the student’s answer you might do so in the future. Although the grammar is correct, it doesn’t communicate and I wouldn’t give any credit. Third example: Click slowly, discussing as you go. The first one communicates and if it were the first time students had worked with the past I’d accept it and just write in the correction. They use “yesterday” to indicate the tense. [click] This could also be accepted if it were in answer to the question “What did you do yesterday?” When correcting these sections, you have to ask, “Does the answer communicate the idea required by the conversation?” If so, accept it or give partial credit depending on the level of the students.
This is the exam section used in the activity (HO7). Go over it and get possible answers before they begin working. Have them work in pairs, grading the students’ work. (15 minutes) Then go over the worksheet, comparing how they graded.
Therefore, in oral work we have to practice both abilities.
I was coordinating the English language program at a large private university in Mexico. The courses were institutional—we supplied the programs, texts and exams for all students. The exams we will be seeing here were for the BA program. The students in this program have 5 hours of English courses a week: 4 in a classroom and one hour in a language lab. They usually have different teachers for the classroom and the lab. The oral exams are given in the lab. This puts less pressure on the students when practicing oral English in the classroom since they know that teacher will not be grading their production.
Since we were giving both fluency and accuracy practice in class, we felt this should be reflected in both the written and oral exams. In the written exams, besides the more traditional formats, students are also asked to produce parts of a conversation. These responses are graded on whether or not they communicate a logical idea. Minor errors in grammar are not counted, but errors in function are, since it is possible for a student to make a number of grammatical errors and still communicate, while an error in function usually impedes communication.
The oral exams are based on role plays. The accuracy exam is given first since students need preparation time. The fluency exam has no preparation. Both role plays are presented only for the teacher. The other students are either preparing their presentations or have already left. Students spend an average of 3 minutes per pair doing their role plays which are designed to be short. This allows the teacher to test the entire group of 30 in a 50 minute class period.
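The timing claim in this note can be checked with a quick calculation. The sketch below assumes that pairs present strictly one after another and that the 3-minute average holds. It does not model preparation time, and the function name `minutes_needed` is made up for the example.

```python
import math


def minutes_needed(num_students, minutes_per_pair):
    """Total presentation time if pairs perform one after another."""
    pairs = math.ceil(num_students / 2)  # an odd student out still needs a slot
    return pairs * minutes_per_pair


# Figures from the note: a group of 30, about 3 minutes per pair, a 50-minute period.
needed = minutes_needed(30, 3)
fits = needed <= 50
print(needed, fits)
```

With these figures the role plays take 45 minutes, which leaves little slack in the period. A larger group or longer role plays would not fit.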
As students come into the lab they are put into pairs and given a role play card. They are told they will have AT LEAST five minutes to prepare. They are also told that as soon as they present their role plays for the teacher, they can leave. This encourages many of them to work fast. While they are preparing they can use any resource they want: textbooks, dictionaries or even ask questions. When they are ready, they come to the front of the class. They can have no notes or papers with them, so even if they wrote out their roles, they can’t read them. After they have performed the accuracy role play, the teacher gives them, either individually or in pairs, instructions for the fluency role play. They are given no preparation time.
This is the 5-point scale we used for grading. It was developed from numerous popular scales and adapted to our specific needs. Students can only get 0 if they do not present the exam. Most students get between 3 and 4 points on both scales.
These are some of the role plays that were used for testing accuracy. They are cut into cards. The role plays vary according to level and reflect functions practiced in the textbook. There are various cycles in use for each exam to discourage dishonesty, but in reality, since students are given preparation time, it doesn’t matter much if they get one of the cards before class since they can’t know which of the role plays they will be asked to do.
This is an example of the kinds of role plays used for the fluency part of the exam. The teacher can choose which role play to use to add variety and to not let the students know which they will get.
In order to make grading easier, this is the form used by the teachers to record the points. In conclusion, teachers and students have responded favorably to this kind of testing since they feel it allows comprehensive oral testing in a relatively friendly environment.