An Overview of the ORCA Project: Online Reading Comprehension Assessment
Donald J. Leu
University of Connecticut
Portions of this material are based on work supported by the U. S. Department of Education under Award No. R305G050154 and R305A090608.
Opinions expressed herein are solely those of the authors and do not necessarily represent the position of the U. S. Department of Education.
Institute of Education Sciences, U.S. Department of Education
Donald J. Leu, The University of Connecticut - Jonna Kulikowich, The Pennsylvania State University - Nell Sedransk, National Institute of Statistical Sciences - Julie Coiro, University of Rhode Island
With: Heidi Everett-Cacopardo, J. Greg McVerry; W. Ian O’Byrne, Lisa Zawilinski, University of Connecticut
Michael Hillinger, LexIcon Systems
PROBLEM:
We Lack Assessments of Online Reading Comprehension For Schools
RESEARCH QUESTIONS
1. How reliable and valid are scores on measures of online reading comprehension for each of three assessment formats: Closed Internet, Open Internet, Multiple Choice?
2. To what extent is each of the four major components of online reading comprehension (Locate, Evaluate, Synthesize, Communicate) well represented in the three assessment formats?
3. What are the relationships between measures of online reading comprehension ability, and the components thereof, and four measures of contextual validity: a) offline reading comprehension ability; b) access to the Internet at home; c) level of technology integration at school; and d) the district’s economic status?
4. In the eyes of key stakeholders, which assessment format is most practical for school districts and states?
GOALS:
Develop Valid, Reliable, and Practical Assessments of Online Reading
Comprehension
TIMELINE:
A Four-Year Project
Partnerships with Departments of Education in Connecticut, Maine, and North Carolina
THEORETICAL FRAMEWORK:
A Dual-Level Theory Of New Literacies
(Leu, O’Byrne, Zawilinski, McVerry, & Everett-Cacopardo, 2009)
• Online reading comprehension ≠ offline reading comprehension (Castek, 2008; Coiro & Dobler, 2007; RAND Reading Study Group, 2002).
• The new literacies of online reading comprehension are essential for students in the twenty-first century (International Reading Association, 2009; Partnership for 21st Century Skills, 2006).
• We require valid, reliable, and practical assessments to inform online reading comprehension instruction (Kirsch, 2007; Leu et al., 2008; O’Byrne et al., in press; Quellmalz & Haertel, 2008).
OPERATIONAL DECISIONS
NEW LITERACIES
1. New Literacies include the new skills, strategies, dispositions, and
social practices that are required by new technologies for
information and communication;
2. New Literacies are central to full participation in a global
community;
3. New Literacies regularly change as their defining technologies and
social practices change; and
4. New Literacies are multifaceted, and our understanding of them
benefits from multiple points of view.
THE NEW LITERACIES OF ONLINE READING
COMPREHENSION
Knowledge Domain: Science – Human Body Systems – Eyes, Ears, Heart, Lung
Assessment Activity: Information Problem Solving Scenarios
• What is a safe volume level for listening to music? (Learn more about a topic)
• What effect do energy drinks have on heart health? (Learn more about a topic)
• Do video games harm your eyesight? (Take a position on an issue)
• Can Chihuahua dogs cure asthma? (Take a position on an issue)
Grade Level: 7th Grade
Communication Tools: Wikis, E-mail, Facebook Text Messages
Assessment Formats: Multiple Choice, Open Internet, Closed-Simulated-Internet
Levels of Informational Space: More Restricted and Less Restricted
YEAR 1: DEVELOPMENT OF ITEMS IN THREE FORMATS
Open Internet, Closed Simulated Internet, Multiple Choice
YEAR 2: COGNITIVE LABS / PILOT STUDY IN CONNECTICUT AND MAINE
Pilot Study: 800 students in Connecticut, sampled to represent the population of 7th graders in the state
800 students in Maine, sampled to represent the population of 7th graders in the state
YEAR 3: VALIDATION STUDY IN CONNECTICUT, MAINE, AND NORTH CAROLINA
Validation Study: 800 students in Connecticut, sampled to represent the population of 7th graders in the state
800 students in Maine, sampled to represent the population of 7th graders in the state
800 students in North Carolina, sampled to represent the population of 7th graders in the state
YEAR 4: PRACTICALITY STUDY WITH THE MEMBERS OF THE GOVERNING BOARD OF
THE REGIONAL EDUCATIONAL LAB – NORTHEAST AND ISLANDS (REL-NEI)
Developing And Evaluating Three Formats For the Assessment of
Online Reading Comprehension: The ORCA Project
Results From an Initial Practicality Survey Designed to Inform Development of
Online Reading Comprehension Assessments
Sally Drew & W. Ian O’Byrne, University of Connecticut
A history of resistance to educational innovation
with technology (Cuban, Kirkpatrick, & Peck,
2001) leads us to believe that no assessment of
online reading comprehension will be used
unless it is practical.
A 22-item instrument based on previous instruments of practicality (Center for Applied Linguistics, 2008; Irving, 2001; Swierzbin, Anderson, Spicuzza, Walz, & Thurlow, 1999; Shepard, 1977; Stufflebeam, 1974)
Administered to a set of diverse and highly
knowledgeable school leaders who serve on the
Governing Board of the Regional Educational
Lab – Northeast and Islands (REL-NEI).
Respondents include:
•State Commissioner of Ed. [2]
•State Asst. Commissioner [5]
•Superintendent of Schools [3]
•Other (Ed. Consultant, Director of Curriculum &
Instruction, Teacher’s Federation
Representative) [8]
EASE OF SCORING
94% of respondents claim that ease of scoring is essential or at least very important. 64% of respondents feel that ease of scoring is essential or very important, while 24% feel that it is somewhat important. Significantly, 94% of respondents claim that an assessment of reading comprehension must be easy to interpret.
Other major themes include:
Ease of scoring and interpretation are mandatory or the
instrument will not be used by teachers (6 responses)
Correctly interpreted information needed to inform instruction
(5 responses)
Interpretations need to be correct/valid and reported to
students/parents/schools (2 responses)
When asked to rank order the issues of practicality, 53% of
respondents moderately value ease of scoring, while 35%
claim ease of scoring to be the least important issue.
MEANINGFUL & RELEVANT DATA IS OBTAINED
The majority of respondents claim that the instrument should
represent either a simulated Internet environment (41%), or
the actual Internet (41%). 6% of respondents prefer multiple
choice.
Other major themes include:
24% of respondents claim that the speed or automaticity with which students read and solve informational problems online is essential, while 77% claim it is somewhat important.
100% of respondents claim that the accuracy with which students read and solve informational problems online is essential.
94% of respondents claim that measurement of specific
online reading comprehension skills (question, locate,
evaluate, synthesize, communicate) is essential.
When asked to rank order the issues of practicality, 59% of
respondents viewed the extent to which relevant information
is obtained to be of highest importance.
AVAILABILITY OF THE TECHNOLOGY
In measuring students’ online reading comprehension, the
availability of the technology is a key factor. 65% of
respondents claim that a typical school district would have
adequate technology resources to enable students to take an
assessment online.
Other major themes include:
Access to technology (6 responses)
Scheduling of students/computers/other assessments (2
responses)
Training of staff/students (2 responses)
When asked to rank order the issues of practicality, 65% of
respondents claim that the availability of technology is the
most important or close to the most important, while 24%
claim it to be of little to no importance.
EASE OF USE
59% of respondents claim that ease of use and interpretation
is essential, while 35% claim that it is important, and 6% claim
that it is not vital to the practicality of the instrument.
Other major themes include:
Time/cost invested by staff/students in training,
administration, scoring, and interpretation (7 responses)
Easy to use and interpret instructions (4 responses)
Results able to inform instruction, curricula, and other
assessments (3 responses)
When asked to rank order the issues of practicality, 47% of
respondents valued ease of use in an assessment as being
fairly important, while 24% responded that it is very important.
ADMINISTRATION TIME
35% of respondents claim that school districts should devote 40 minutes per year to assessments of online reading comprehension, while 29% claim that school districts should devote 80 minutes per year. 29% of respondents claim that such assessments should be integrated regularly into instruction, administered monthly, or given as much as needed.
Other major themes include:
A need for frequent, “on-demand” assessments (5
responses)
A demand for assessments of this kind (4 responses)
Guidance on implementing assessments in instruction (3 responses)
When asked to rank order the issues of practicality, an
overwhelming 82% of respondents claim that administration
time is the least or second to least important issue.
RANK ORDER OF ISSUES OF PRACTICALITY
Percentages represent rank-ordered values for 16 respondents, with 1 identifying the most important category and 5 the least important.

                                   1      2      3      4      5
Availability of technology        24%    41%     6%    12%    12%
Ease of use                       12%    24%    47%    12%     0%
Administration time                0%     6%     6%    41%    41%
Ease of scoring                    0%     6%    24%    29%    35%
Meaningful/relevant information   59%    18%    12%     0%     5%
Using Cognitive Labs to Refine Item Design for Assessments of Online Reading
Comprehension in Real-Time Unbounded Internet Environments
Julie Coiro, University of Rhode Island; Lisa Zawilinski, University of Connecticut and Carita Kiili, University of Jyväskylä
PHASE 2: INSTANT MESSAGE/EARLY FACEBOOK
• Interface: Instant Message and early Facebook prototypes were used to give
prompts and collect responses
• Purposes:
1. To explore three formats of Open ORCA (Hybrid, Notepad, IM) and a range of communication tools (wikis, discussion boards, email, and blogs), using Survey Monkey, IM, and outside interfaces to capture student responses
2. To get feedback on clarity of wording, task authenticity, and timing to complete
tasks
3. To use students’ skilled searching pathways in the Open environment to inform algorithms of websites to be included in the ORCA-Closed environment
4. To use students’ responses to inform rubric development and operationalize scorepoints
• Population: N=39 students (1-2 times each)
REVISIONS INFORMED BY PHASE 2 COGNITIVE LAB EXPERIENCES
TASK DEFINITION/SCOREPOINTS
• Contextualize tasks with real purpose and authentic communication tool use so that they less resemble a testing situation.
• Preface actual assessment with directions that these problems are only scenarios
so students are not led to believe they are really true.
• Refine evaluation tasks to break down into independent scorepoints for author’s
name, authority, agenda, and reliability of claims made.
• Use prompts to parse out synthesis tasks across the whole task (within a website,
across two websites; across all four websites) and ask students to “use your own
words” and “explain why these are important facts”.
• Focus communication score points on tool use, tone, organization, and clarity so as not to confound them with synthesis.
INTERFACE DESIGN and TIMING
• We still need to resolve the issue that some students skip embedded introductory links or go right to the task without reading explicit requests after the introduction.
• Embed timed supports to move students along if they can’t locate information, but we need to test the appropriate wait time before scaffolds appear.
• Locating score points will require hand scoring in ORCA-Open until we build a list of
relevant websites to match with the computer, but web pages will still come and go.
• Scoring the nuances of synthesis (combining information from two sources in your
own words) will require hand scoring in ORCA-Open.
OVERVIEW AND KEY TERMINOLOGY
The purpose of these cognitive labs was to develop, test, and refine a set of eight
online information problem solving scenarios that represent alternative forms of an
Online Reading Comprehension Assessment (ORCA).
•An ORCA-Open is designed to assess real-time reading processes and products
required as students Locate, Evaluate, Synthesize, and Communicate (LESC)
information while reading in the Open Internet, a dynamic and unbounded online
digital information environment.
•Each scenario (i.e., a LESC) requires students to locate, evaluate, synthesize, and communicate information that focuses on a different body part (e.g., lungs, heart, eyes, or ears) and a related science topic (e.g., asthma, heart-healthy snacks, decorative contact lenses, safe music volume levels). See Figure 1.
For this project, an ORCA-Open consists of 32 items, which are grouped into two
scenarios, or LESCs. Each LESC includes 16 items designed to measure reading
processes and products in an unbounded, real-time Open Internet environment related
to four components of online reading:
•Locate tasks (4 items) require students to use search engines, efficiently read search
results, and identify websites with information that can be used to solve the information
problem scenario.
•Evaluate tasks (4 items) require students to identify a website’s author and evaluate
his/her level of expertise, consider the quality of the evidence an author provides, and
evaluate the reliability of author claims related to the problem scenario.
•Synthesis tasks (4 items) require students to integrate information intratextually
(across multiple ideas within one website) and intertextually (across multiple websites)
in their own words, take a position on the issues involved, and use evidence from
multiple online sources to support their thinking.
•Communicate tasks (4 items) require students to access information in an email or
wiki space and respond with information they have learned about the scenario in an
appropriately crafted, visually organized, and clear message.
An ORCA-Open consists of one 16-item restricted task and one 16-item unrestricted task; this combination of restricted and unrestricted items provides information about online reading comprehension proficiency when reading for both types of online purposes (a minimal sketch of this structure follows the definitions below).
•An ORCA-Open Restricted Task is an online reading task for which the information
space to locate relevant claims is restricted to a particular set of online resources
found on the Open Internet related to a topic.
•An ORCA-Open Unrestricted Task is an online reading task for which the
information space to locate relevant claims is left open to any online sources found on
the Open Internet related to a topic.
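To make this composition concrete, here is a minimal sketch in Python (with hypothetical class names and placeholder prompts, not the project's actual item bank) of how a 32-item ORCA-Open decomposes into two LESC scenarios, each with four items per component:

```python
from dataclasses import dataclass, field
from typing import List

# The four components of online reading comprehension measured by each LESC.
COMPONENTS = ["Locate", "Evaluate", "Synthesize", "Communicate"]

@dataclass
class Item:
    component: str   # one of COMPONENTS
    prompt: str      # task prompt shown to the student

@dataclass
class LESC:
    """One information problem solving scenario: 16 items, 4 per component."""
    topic: str                        # e.g., "safe music volume levels"
    restricted: bool                  # restricted vs. unrestricted information space
    items: List[Item] = field(default_factory=list)

def build_orca_open(restricted_topic: str, unrestricted_topic: str) -> List[LESC]:
    """Assemble a 32-item ORCA-Open: one restricted and one unrestricted LESC."""
    scenarios = []
    for topic, restricted in [(restricted_topic, True), (unrestricted_topic, False)]:
        items = [Item(component=c, prompt=f"{c} task {i + 1} for {topic}")
                 for c in COMPONENTS for i in range(4)]
        scenarios.append(LESC(topic=topic, restricted=restricted, items=items))
    return scenarios

if __name__ == "__main__":
    orca = build_orca_open("decorative contact lenses", "safe music volume levels")
    assert sum(len(s.items) for s in orca) == 32  # two 16-item LESCs
```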
PHASE 1: SURVEY MONKEY PROTOTYPE
• Interface: Survey Monkey used to give prompts & collect responses
• Purpose: To pilot early prototypes of Open LESC Topics and get
student feedback about how tasks were defined, clarity of directions,
vocabulary challenges, and requests for interface design and topics.
• Population: N=8 students (2 times each)
REVISIONS INFORMED BY PHASE 1 COGNITIVE LAB EXPERIENCES
TASK DEFINITION/SCOREPOINTS
1. Reorganized lengthy introduction into a numeric list of steps preceded by two sentences to set the scenario.
2. Clarified synthesis task to “Tell us, in your own words, what you learned from things you read” to move from compiling information to thinking and synthesizing. Also revised synthesis prompts with cues to “use evidence to support your thinking” to force the use of reading online (rather than only referencing prior knowledge).
3. Refined wording for aspects of critical evaluation (e.g., relevance, reliability, author’s purpose & level of expertise) and continued to grapple with which aspects were most important to capture.
INTERFACE DESIGN and TIMING ISSUES
1. Added “during reading” note-taking space within the Survey Monkey format (as opposed to a separate Word document) but felt it did not authentically capture synthesis.
2. To avoid students closing windows needed later in the task, added wording to locate task steps that said, “Leave the website open. You may need it later.”
3. To avoid data loss, we need to design a plan to help students get back into the capture tool more readily if they accidentally close out of it.
4. Student requests for typical composing functions in online communication tools (tabbing, bold, bullets, numbered lists) were not possible in Survey Monkey. However, this finding led to discussion and concerns about aligning authentic communication purposes and tools.
5. To ensure two LESCs were completed in the one-hour time limit, we told students when they began that they would have approximately 30 minutes for each task and should complete it as quickly as possible.
PHASE 3: COMPLETED ORCA-OPEN FACEBOOK INTERFACE
• Interface: Simulated Facebook interface used to give prompts and collect responses; Population: N=6 (we will continue to collect more in early 2011)
• Purpose: To test all 8 LESC versions with the Facebook interface and timed prompts.
[Screenshot panels: Evaluate, Locate, Synthesize, Communicate]
PRELIMINARY REVISION IDEAS INFORMED BY PHASE 3 COGNITIVE LAB EXPERIENCES
TASK DEFINITION/SCOREPOINTS
1. Clarify directions for initial locating task that just asks for the URL - or leave out the context setting at the
beginning so students don’t begin synthesizing yet.
2. Test out range of critical evaluation tasks (e.g., source evaluation, reliability, and point of view) on a larger
sample before we make final decisions about which item(s) to include.
3. Test on a larger sample the second part of the synthesis task to “tell us why this information is important” to see
if we need to ask this separate from just listing the claims made.
4. Re-consider multiple “acceptable locations” for wiki post for scorepoints.
INTERFACE DESIGN and TIMING
1. We still need to resolve confusion with introductory scenario - do we force students to view the context setting
wiki post (like we do with the email)? Or do we give them directions prior to starting the tasks?
2. Do we turn off the “Instant Google” feature to keep similar to the closed environment or do we keep it and deal
with adjusting score points (that may continue to need revision) as Google gets smarter and strategies change?
3. To avoid surprise, preface tasks with directions that prompts may come from several different people within the
environment.
4. Consider adding more false information to wiki to encourage text revision rather than adding new information.
5. Consider adding email attachment feature (access with a hyperlink) into the email task
6. Continue testing timed support prompts with a range of students to get an acceptable time limit before help is
offered.
Cognitive Lab Cycle
The Challenges and Opportunities of a Closed Internet Environment for Assessing Online Reading
Comprehension
Michael Hillinger & Mark Lorah, LexIcon Systems
Goal
Create environments in which students can Locate, Evaluate, Synthesize, and Communicate information within real and simulated web spaces.
Approach
Response Capture Object (RCO)
Pose questions and record responses.
Challenge: Create an ecologically valid experience. Early ORCAs used off-the-shelf survey software. This was easy to configure but conveyed a clear distinction between the task requirements and the web environment. Our goal was to present the task in a more familiar environment, blending task requirements and information source into a single web experience.
Solution: Embed the questions into a Facebook-like interface. Facebook is a social networking tool that presents statements and invites responses. With multiple modalities including Newsfeed, IM, and email, the interface is familiar to our audience. Using comments and feedback from other “students” and using the student’s name in all posts personalizes the experience.
-------------------------
Challenge: Balance flexible responses with ease of scoring. The primary data are the student responses to LESC questions, and students are encouraged to use their own words. This precludes easily scored responses such as multiple choice.
Solution: The solution is an ongoing issue. One approach is to provide string matching analysis for responses that require known text, such as a URL. However, this version will likely still need significant hand scoring.
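As an illustration of the string-matching idea, here is a minimal sketch in Python (the accepted-URL list and function names are hypothetical, not the project's scoring code) that normalizes a student's URL response before comparing it with a list of acceptable pages:

```python
from urllib.parse import urlparse

# Hypothetical list of acceptable pages for one Locate item.
ACCEPTED_URLS = [
    "http://www.example.org/hearing/safe-volume.html",
    "http://kidshealth.example.org/ear/music-volume",
]

def normalize(url: str) -> str:
    """Reduce a URL to a comparable form: lowercase host, no scheme,
    no leading 'www.', and no trailing slash."""
    url = url.strip()
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.netloc.lower().removeprefix("www.")
    return host + parsed.path.rstrip("/")

def score_locate_response(student_url: str) -> int:
    """Return 1 if the response matches an accepted page, else 0."""
    targets = {normalize(u) for u in ACCEPTED_URLS}
    return int(normalize(student_url) in targets)

print(score_locate_response("https://WWW.example.org/hearing/safe-volume.html/"))  # 1
```

Responses that paraphrase or reword information, as the synthesis items require, fall outside what simple matching can handle, which is why hand scoring remains likely.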
-------------------------
Challenge: Capture process measures. As important as the student response is the process that led to it. What search terms are used, which web sites receive the most attention, what page links are clicked?
Solution: Closed ORCA captures a variety of measures. The RCO for the open ORCA will capture a time-
stamped record of actions within the Facebook interface. Because the closed ORCA also includes the web sites,
there will be the opportunity to gather data on a wider variety of measures.
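A minimal sketch of what such a time-stamped action record could look like (the event names and fields here are hypothetical; the actual RCO logging format is not shown in this poster):

```python
import json
import time

class ActionLog:
    """Collects time-stamped student actions for later process analysis."""

    def __init__(self, student_id: str):
        self.student_id = student_id
        self.events = []

    def record(self, action: str, **details) -> None:
        # e.g., action = "search", "open_page", "click_link", "submit_response"
        self.events.append({"t": time.time(), "action": action, **details})

    def to_json(self) -> str:
        return json.dumps({"student": self.student_id, "events": self.events}, indent=2)

log = ActionLog("S017")
log.record("search", query="safe volume for music")
log.record("open_page", url="http://www.example.org/hearing/safe-volume.html")
log.record("submit_response", item="locate_1", response="example.org/hearing/safe-volume.html")
print(log.to_json())
```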
[Diagram: Open ORCA components – RCO-Facebook, RCO-Wiki, RCO-email]
RCO—Facebook:Newsfeed
The Response Capture Object Facebook component is a
flexible presentation and response tool that emulates a
newsfeed stream. All of the content is predefined in easy-to-
modify XML files. This shows the newsfeed component.
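A minimal sketch of what such a predefined content file might look like, and how it could be loaded (the element names and attributes here are hypothetical, not the project's actual XML schema):

```python
import xml.etree.ElementTree as ET

# Hypothetical XML defining a newsfeed prompt stream; the real ORCA schema may differ.
NEWSFEED_XML = """
<newsfeed scenario="safe-music-volume">
  <post author="Ms. Rivera" delay_seconds="0">
    Our class wiki needs a short answer: what is a safe volume level for
    listening to music? Search online and post what you find.
  </post>
  <post author="Jordan" delay_seconds="120">
    Did you find a website yet? Copy the web address (URL) here.
  </post>
</newsfeed>
"""

root = ET.fromstring(NEWSFEED_XML)
for post in root.findall("post"):
    text = " ".join(post.text.split())
    print(f'[{post.get("delay_seconds")}s] {post.get("author")}: {text}')
```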
[Diagram: Closed ORCA components – RCO-Facebook, RCO-Wiki, RCO-email]
RCO—Email
The Response Capture Object—Email allows a multiple-entry inbox, text formatting, and attachments. All of the content is predefined in XML files.
RCO—Wiki
The Response Capture Object—Wiki allows a multiple-entry inbox, text formatting, and attachments. All of the content is predefined in XML files.
Closed Environment (CE)
Provide stable internet search space.
Challenge: Create a realistic exploration environment. The web is a dynamic and complex environment. A simple simulation will not
afford the same level of realism.
Solution: Use a real browser with standard HTML protocols. Using the existing mechanisms, we create HTML pages in a Closed
Environment (CE) on the web. An open source search engine is the core for our “Gloogle” search page. There are currently about 200
pages in the CE to cover the 8 topics. These include both target pages and distracters. The search engine results can be “tuned” using
metadata on each page to approximate the search results on the web.
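One plausible way to tune rankings with per-page metadata, sketched below in Python (the metadata fields and scoring rule are hypothetical; the actual open source engine behind “Gloogle” and its configuration are not shown in this poster), is to attach keywords and a boost weight to each page and fold them into a relevance score:

```python
# Hypothetical per-page metadata used to bias search results in the Closed Environment.
PAGES = [
    {"url": "/heart/energy-drinks.html", "boost": 2.0,   # target page
     "keywords": ["energy drinks", "heart", "caffeine"]},
    {"url": "/heart/snack-ads.html", "boost": 0.5,       # distracter page
     "keywords": ["snacks", "advertisement"]},
]

def score(page: dict, query: str) -> float:
    """Naive relevance: count query terms found in the page keywords, times the boost."""
    terms = query.lower().split()
    hits = sum(any(term in keyword for keyword in page["keywords"]) for term in terms)
    return hits * page["boost"]

def gloogle(query: str):
    """Return Closed Environment pages ranked by boosted score, best first."""
    return sorted(PAGES, key=lambda page: score(page, query), reverse=True)

for page in gloogle("energy drinks heart health"):
    print(page["url"], score(page, "energy drinks heart health"))
```

Adjusting the boost and keywords on each page is what would let a fixed set of roughly 200 pages approximate the mix of target pages and distracters a student would see in live search results.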
-------------------------
Challenge: Balance realism against resources. Even the simplest page on the web can contain animations, paid ads, and multiple
links to other pages containing the same. Reproducing this, even with our limited number of pages, would require thousands of web
pages.
Solution: Limit links and use screen captures for non-critical information. Most of the information on the target web pages is not
relevant to answering the question. The sites in the CE will have live links to pages judged to be key. Clicking on other links will lead to
dead ends or no response. All clicks will be recorded to assess how many live and static links are used.
The closed ORCA uses the Response Capture Object (RCO) connected to the
closed environment (CE).
The open ORCA uses the Response Capture Object (RCO) connected to the
web.
Components
Closed and Open ORCAs
The Open ORCA relies on student access to the web for responding to the questions in each topic. While the open format requires relatively low development effort, it has significant shortcomings that prevent its widespread adoption as an assessment standard. These include reliance on web content that can change without warning and limited opportunity to track student actions.
The alternative is the Closed ORCA. In this approach, the web exploration is confined to a
Closed Environment (CE) of web pages using a specially developed search engine. The
closed ORCA provides a stable collection of web pages that can be adapted to specific
requirements. It also provides a mechanism for more complete tracking of student exploration.
Both the Open and Closed ORCAs share a common set of data presentation and collection
tools collectively called the Response Capture Object.
RCO—Facebook:IM
This shows the Facebook Instant Messenger (IM) component.
Closed Environment— “Gloogle” Search
Engine
Using an open source search tool provides flexibility and realistic search
results.
Web pages
The CE pages use more screen capture images and fewer live links than the live web
pages
[Screenshots: Closed Environment pages compared with live web pages]
Using Cognitive Labs to Refine Item Design for Multiple Choice Assessments of Online Reading
Comprehension
Heidi Everett-Cacopardo & J. Gregory McVerry
Portions of this material are based on work supported by the U. S. Department of Education under Award No. R305G050154 and R305A090608.
Opinions expressed herein are solely those of the authors and do not necessarily represent the position of the U. S. Department of Education.
Research Question
• How valid are scores on a multiple choice measure of online
reading comprehension focused on Locating, Evaluating,
Synthesizing, and Communicating?
Procedures
• Focus group of experts used for item development and first round
of validations.
• Structurally prompted think alouds (Afflerbach, 2002) utilized
during cognitive labs (Ericsson & Simon, 1999) with students.
Data Sources
• Researcher field notes and screen capture videos
Analysis
• General inductive methods (Merriam, 1998) used to identify
patterns in student responses and use of comprehension processes.
Goal:
Create a 32 item multiple choice assessment.
Item Development:
Create an initial battery of 64 items
(4 content areas x 4 constructs x 4 items for each construct)
Scientific Advisory Board would then choose 32 best items.
Item difficulty by construct:

Easy
• Locate: identify/define key terms
• Evaluate: determine appropriate links to learn more about the author
• Synthesize: combine information from two sentences in the same paragraph/text/image
• Communicate: use a particular feature within a communication tool

Medium 1
• Locate: work with the concept as a system
• Evaluate: identify relevant information about the author’s level of expertise
• Synthesize: combine two modes of information
• Communicate: select the most appropriate communication tool

Medium 2
• Locate: work with the concept in relation to other organs
• Evaluate: compare/contrast information
• Synthesize: combine information across two web pages/screen shots
• Communicate: communicate information with a particular tool using appropriate tone/discourse

Difficult
• Locate: given a scenario, locate information in the text/image
• Evaluate: given a scenario, select questionable information on a website
• Synthesize: given a scenario, read across both websites/modes
• Communicate: label/organize information correctly to share with a particular audience
Example Item: Synthesis Medium
Cognitive Lab Results
No cognitive labs were conducted. The focus group determined that scenario-based items with fewer content demands would be a more valid measure of online reading comprehension.
Conclusions
Decided to develop the open format first; parallel MC items would then be developed.
Goal:
Create a 16 item scenario based multiple choice assessment.
Item Development
Developed an 8 item MC assessment around the problem: Can
Chihuahua dogs cure asthma?
Question stem and answer choices provided on sheet of paper.
Screenshot of websites/search results given on separate sheets.
Students focused on answering the problem and were not able to keep in mind the task that needed to be completed within each step.
Example Item: Locate
Example Item: Evaluate
A student suggested providing an additional page from the website (the “about” page) to help students orient themselves.
Cognitive Lab Results
Cognitive labs were conducted with three students. Participants were engaged by the topic but struggled with vocabulary.
Conclusions
Decided to test a different item format and to further develop the open ORCA to help determine the skills assessed and identify distractors.
Goal:
Create a 16 item multiple choice assessment for each content area.
Item Development:
Create an initial battery of 64 items (4 content areas x 4 constructs x 4 items).
Choose the two best items per content area and construct for the final 32-item assessment.
An algorithm was developed to choose websites and distractors.
Cognitive Lab Results:
Cognitive labs are currently underway. Preliminary results suggest students forget the question stem when reading multiple passages and want uniformity in item format.
Future Directions:
Compare formats: response capture object with static images
or hyperlinked mock websites.
Example Item: Locate
Example Item: Evaluate
Example Item: Communicate
Example Item: Synthesis
Student Response: A because the… by the
newspaper in Boston… this is the only one
related to Boston.
Student Response: C
because it says what she actually
has done physically and what she
does for health.
Student Response: C because…well this is
mostly about heart problems and this one has
some things about heart problems but if it had to
be about both of them…then I would choose one
that is more vague but it’s about heart problems.
Student Response: A because if you are
sending a new message you would want
compose but you are responding so reply.
More Related Content

What's hot

Mini thesis complete
Mini thesis   completeMini thesis   complete
Mini thesis completeMohamad Hilmi
 
Article critique etec500
Article critique etec500Article critique etec500
Article critique etec500Sarah Richer
 
Viviano, rich article critique
Viviano, rich article critiqueViviano, rich article critique
Viviano, rich article critiquerichviviano
 
Learning What They Want: Chinese students' perceptions of electronic library ...
Learning What They Want: Chinese students' perceptions of electronic library ...Learning What They Want: Chinese students' perceptions of electronic library ...
Learning What They Want: Chinese students' perceptions of electronic library ...Dr. Monica D.T. Rysavy
 
Using Learning analytics to support learners and teachers at the Open University
Using Learning analytics to support learners and teachers at the Open UniversityUsing Learning analytics to support learners and teachers at the Open University
Using Learning analytics to support learners and teachers at the Open UniversityBart Rienties
 
Article critique FRIT 7232
Article critique FRIT 7232Article critique FRIT 7232
Article critique FRIT 7232acastel1984
 
Emerging Technologies SAAIR presentation
Emerging Technologies SAAIR presentationEmerging Technologies SAAIR presentation
Emerging Technologies SAAIR presentationDaniela Gachago
 
Current OER Research from the OER Research Fellows
Current OER Research from the OER Research Fellows	Current OER Research from the OER Research Fellows
Current OER Research from the OER Research Fellows Open Education Consortium
 
REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...
REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...
REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...Michael Barbour
 
E-learning Research Article Presentation
E-learning Research Article PresentationE-learning Research Article Presentation
E-learning Research Article PresentationLiberty Joy
 
AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...
AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...
AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...Bart Rienties
 
Article Review
Article ReviewArticle Review
Article Reviewfatinnah
 
Inaugural lecture: The power of learning analytics to give students (and teac...
Inaugural lecture: The power of learning analytics to give students (and teac...Inaugural lecture: The power of learning analytics to give students (and teac...
Inaugural lecture: The power of learning analytics to give students (and teac...Bart Rienties
 
Newcastle Anatomy and TEL Symposium 2018
Newcastle Anatomy and TEL Symposium 2018Newcastle Anatomy and TEL Symposium 2018
Newcastle Anatomy and TEL Symposium 2018JamesPickering11
 
An Example of a Qualitative Research Design
An Example of a Qualitative Research DesignAn Example of a Qualitative Research Design
An Example of a Qualitative Research Designdianakamaruddin
 
Wingate article critique summary
Wingate article critique summaryWingate article critique summary
Wingate article critique summaryNicole Wingate
 

What's hot (19)

ELIB1304 pdf
ELIB1304 pdfELIB1304 pdf
ELIB1304 pdf
 
Mini thesis complete
Mini thesis   completeMini thesis   complete
Mini thesis complete
 
Defense_Presentation
Defense_PresentationDefense_Presentation
Defense_Presentation
 
Article critique etec500
Article critique etec500Article critique etec500
Article critique etec500
 
Viviano, rich article critique
Viviano, rich article critiqueViviano, rich article critique
Viviano, rich article critique
 
Learning What They Want: Chinese students' perceptions of electronic library ...
Learning What They Want: Chinese students' perceptions of electronic library ...Learning What They Want: Chinese students' perceptions of electronic library ...
Learning What They Want: Chinese students' perceptions of electronic library ...
 
Using Learning analytics to support learners and teachers at the Open University
Using Learning analytics to support learners and teachers at the Open UniversityUsing Learning analytics to support learners and teachers at the Open University
Using Learning analytics to support learners and teachers at the Open University
 
Article critique FRIT 7232
Article critique FRIT 7232Article critique FRIT 7232
Article critique FRIT 7232
 
Emerging Technologies SAAIR presentation
Emerging Technologies SAAIR presentationEmerging Technologies SAAIR presentation
Emerging Technologies SAAIR presentation
 
Current OER Research from the OER Research Fellows
Current OER Research from the OER Research Fellows	Current OER Research from the OER Research Fellows
Current OER Research from the OER Research Fellows
 
REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...
REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...
REL Southeast 2015 - Designing Data Systems for the Hard Questions: Data El...
 
Linux In Education
Linux In EducationLinux In Education
Linux In Education
 
E-learning Research Article Presentation
E-learning Research Article PresentationE-learning Research Article Presentation
E-learning Research Article Presentation
 
AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...
AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...
AI in Education Amsterdam Data Science (ADS) What have we learned after a dec...
 
Article Review
Article ReviewArticle Review
Article Review
 
Inaugural lecture: The power of learning analytics to give students (and teac...
Inaugural lecture: The power of learning analytics to give students (and teac...Inaugural lecture: The power of learning analytics to give students (and teac...
Inaugural lecture: The power of learning analytics to give students (and teac...
 
Newcastle Anatomy and TEL Symposium 2018
Newcastle Anatomy and TEL Symposium 2018Newcastle Anatomy and TEL Symposium 2018
Newcastle Anatomy and TEL Symposium 2018
 
An Example of a Qualitative Research Design
An Example of a Qualitative Research DesignAn Example of a Qualitative Research Design
An Example of a Qualitative Research Design
 
Wingate article critique summary
Wingate article critique summaryWingate article critique summary
Wingate article critique summary
 

Similar to Online Reading Comprehension Assessment

Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...
Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...
Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...CITE
 
Council of Chief Librarians Survey Results & Executive Summary
Council of Chief Librarians Survey Results & Executive SummaryCouncil of Chief Librarians Survey Results & Executive Summary
Council of Chief Librarians Survey Results & Executive SummaryScott Lee
 
RIDE 2010 presentation - Formative assessment practices in distance learning:...
RIDE 2010 presentation - Formative assessment practices in distance learning:...RIDE 2010 presentation - Formative assessment practices in distance learning:...
RIDE 2010 presentation - Formative assessment practices in distance learning:...Centre for Distance Education
 
Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...
Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...
Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...IL Group (CILIP Information Literacy Group)
 
Feedback as dialogue and learning technologies: can e-assessment be formative?
Feedback as dialogue and learning technologies: can e-assessment be formative?Feedback as dialogue and learning technologies: can e-assessment be formative?
Feedback as dialogue and learning technologies: can e-assessment be formative?Centre for Distance Education
 
New Orleans Keynote
New Orleans KeynoteNew Orleans Keynote
New Orleans Keynotedjleu
 
Fuzzy Measurement of University Students Importance Indexes by Using Analytic...
Fuzzy Measurement of University Students Importance Indexes by Using Analytic...Fuzzy Measurement of University Students Importance Indexes by Using Analytic...
Fuzzy Measurement of University Students Importance Indexes by Using Analytic...IRJESJOURNAL
 
WORD keynote
WORD keynoteWORD keynote
WORD keynotedjleu
 
ONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTS
ONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTSONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTS
ONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTSPaul Reilly
 
Success Factors for the Use of Open Educational Resources - A Quantitative Su...
Success Factors for the Use of Open Educational Resources - A Quantitative Su...Success Factors for the Use of Open Educational Resources - A Quantitative Su...
Success Factors for the Use of Open Educational Resources - A Quantitative Su...ICDEcCnferenece
 
Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...
Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...
Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...Tanya Joosten
 
Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...
Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...
Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...OEPScotland
 

Similar to Online Reading Comprehension Assessment (20)

Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...
Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...
Assessment of Students' Information Literacy: A Case Study of a Secondary Sch...
 
Council of Chief Librarians Survey Results & Executive Summary
Council of Chief Librarians Survey Results & Executive SummaryCouncil of Chief Librarians Survey Results & Executive Summary
Council of Chief Librarians Survey Results & Executive Summary
 
RIDE 2010 presentation - Formative assessment practices in distance learning:...
RIDE 2010 presentation - Formative assessment practices in distance learning:...RIDE 2010 presentation - Formative assessment practices in distance learning:...
RIDE 2010 presentation - Formative assessment practices in distance learning:...
 
E assessment
E assessmentE assessment
E assessment
 
Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...
Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...
Dubicki & DaCosta - Taking a peek under the academic kilt to see whether info...
 
Feedback as dialogue and learning technologies: can e-assessment be formative?
Feedback as dialogue and learning technologies: can e-assessment be formative?Feedback as dialogue and learning technologies: can e-assessment be formative?
Feedback as dialogue and learning technologies: can e-assessment be formative?
 
New Orleans Keynote
New Orleans KeynoteNew Orleans Keynote
New Orleans Keynote
 
Chapter3
Chapter3Chapter3
Chapter3
 
Fuzzy Measurement of University Students Importance Indexes by Using Analytic...
Fuzzy Measurement of University Students Importance Indexes by Using Analytic...Fuzzy Measurement of University Students Importance Indexes by Using Analytic...
Fuzzy Measurement of University Students Importance Indexes by Using Analytic...
 
Leu keynotenli2010
Leu keynotenli2010Leu keynotenli2010
Leu keynotenli2010
 
WORD keynote
WORD keynoteWORD keynote
WORD keynote
 
ONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTS
ONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTSONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTS
ONLINE RESOURCES TO SUPPORT DISSERTATION STUDENTS
 
Success Factors for the Use of Open Educational Resources - A Quantitative Su...
Success Factors for the Use of Open Educational Resources - A Quantitative Su...Success Factors for the Use of Open Educational Resources - A Quantitative Su...
Success Factors for the Use of Open Educational Resources - A Quantitative Su...
 
Adult learners
Adult learnersAdult learners
Adult learners
 
Recent Studies in OER Adoption
Recent Studies in OER Adoption Recent Studies in OER Adoption
Recent Studies in OER Adoption
 
DEN Impact Study 2011
DEN Impact Study 2011DEN Impact Study 2011
DEN Impact Study 2011
 
A Quiet Crisis
A Quiet CrisisA Quiet Crisis
A Quiet Crisis
 
Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...
Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...
Promoting Effective Teaching and Learning Ecosystems via Research Proven Prac...
 
Impact of et
Impact of etImpact of et
Impact of et
 
Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...
Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...
Awareness of OER and OEP in Scottish Higher Education Institutions Survey Res...
 

More from Greg Mcverry

More from Greg Mcverry (10)

#Mnlioverview
#Mnlioverview#Mnlioverview
#Mnlioverview
 
Aera 2011 COIL poster
Aera 2011 COIL  posterAera 2011 COIL  poster
Aera 2011 COIL poster
 
Castalk
CastalkCastalk
Castalk
 
Final Figs And Tables 1
Final Figs And Tables 1Final Figs And Tables 1
Final Figs And Tables 1
 
Nrcpoetry
NrcpoetryNrcpoetry
Nrcpoetry
 
Lra09critevalroughdraft
Lra09critevalroughdraftLra09critevalroughdraft
Lra09critevalroughdraft
 
IRTpdpaper
IRTpdpaperIRTpdpaper
IRTpdpaper
 
Poetrypacket
PoetrypacketPoetrypacket
Poetrypacket
 
Beyond Verbocentricity
Beyond VerbocentricityBeyond Verbocentricity
Beyond Verbocentricity
 
AERA TICA Instrument Validation
AERA TICA Instrument ValidationAERA TICA Instrument Validation
AERA TICA Instrument Validation
 

Recently uploaded

4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsPooky Knightsmith
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleCeline George
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 

Recently uploaded (20)

Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 

Online Reading Comprehension Assessment

  • 1. An Overview of the ORCA Project: Online Reading Comprehension Assessment
Donald J. Leu, University of Connecticut
Portions of this material are based on work supported by the U.S. Department of Education under Award Nos. R305G050154 and R305A090608. Opinions expressed herein are solely those of the authors and do not necessarily represent the position of the U.S. Department of Education. Institute of Education Sciences, U.S. Department of Education.
Donald J. Leu, University of Connecticut; Jonna Kulikowich, The Pennsylvania State University; Nell Sedransk, National Institute of Statistical Sciences; Julie Coiro, University of Rhode Island
With: Heidi Everett-Cacopardo, J. Greg McVerry, W. Ian O'Byrne, Lisa Zawilinski, University of Connecticut; Michael Hillinger, LexIcon Systems

PROBLEM: We Lack Assessments of Online Reading Comprehension for Schools

RESEARCH QUESTIONS
1. How reliable and valid are scores on measures of online reading comprehension for each of three assessment formats: Closed Internet, Open Internet, and Multiple Choice?
2. To what extent is each of the four major components of online reading comprehension (Locate, Evaluate, Synthesize, Communicate) well represented in the three assessment formats?
3. What are the relationships between measures of online reading comprehension ability, and the components thereof, and four measures of contextual validity: a) offline reading comprehension ability; b) access to the Internet at home; c) level of technology integration at school; and d) the district's economic status?
4. In the eyes of key stakeholders, which assessment format is most practical for school districts and states?

GOALS: Develop Valid, Reliable, and Practical Assessments of Online Reading Comprehension

TIMELINE: A Four-Year Project
Partnerships with Departments of Education in Connecticut, Maine, and North Carolina

THEORETICAL FRAMEWORK: A Dual-Level Theory of New Literacies (Leu, O'Byrne, Zawilinski, McVerry, & Everett-Cacopardo, 2009)
• Online reading comprehension ≠ offline reading comprehension (Castek, 2008; Coiro & Dobler, 2007; RAND Reading Study Group, 2002).
• The new literacies of online reading comprehension are essential for students in the twenty-first century (International Reading Association, 2009; Partnership for 21st Century Skills, 2006).
• We require valid, reliable, and practical assessments to inform online reading comprehension instruction (Kirsch, 2007; Leu et al., 2008; O'Byrne et al., in press; Quellmalz & Haertel, 2008).

OPERATIONAL DECISIONS

NEW LITERACIES
1. New Literacies include the new skills, strategies, dispositions, and social practices that are required by new technologies for information and communication;
2. New Literacies are central to full participation in a global community;
3. New Literacies regularly change as their defining technologies and social practices change; and
4. New Literacies are multifaceted, and our understanding of them benefits from multiple points of view.

THE NEW LITERACIES OF ONLINE READING COMPREHENSION
Knowledge Domain: Science – Human Body Systems – Eyes, Ears, Heart, Lungs
Assessment Activity: Information Problem Solving Scenarios
• What is a safe volume level for listening to music? (Learn more about a topic)
• What effect do energy drinks have on heart health? (Learn more about a topic)
• Do video games harm your eyesight? (Take a position on an issue)
• Can Chihuahua dogs cure asthma? (Take a position on an issue)
Grade Level: 7th Grade
Communication Tools: Wikis, E-mail, Facebook, Text Messages
Assessment Formats: Multiple Choice, Open Internet, Closed Simulated Internet
Levels of Informational Space: More Restricted and Less Restricted

YEAR 1: DEVELOPMENT OF ITEMS IN THREE FORMATS
Open Internet, Closed Simulated Internet, Multiple Choice

YEAR 2: COGNITIVE LABS / PILOT STUDY IN CONNECTICUT AND MAINE
Pilot Study: 800 students in Connecticut and 800 students in Maine, each sample drawn to represent the population of 7th graders in the state.

YEAR 3: VALIDATION STUDY IN CONNECTICUT, MAINE, AND NORTH CAROLINA
Validation Study: 800 students in each of Connecticut, Maine, and North Carolina, each sample drawn to represent the population of 7th graders in the state.

YEAR 4: PRACTICALITY STUDY WITH MEMBERS OF THE GOVERNING BOARD OF THE REGIONAL EDUCATIONAL LAB – NORTHEAST AND ISLANDS (REL-NEI)
  • 2. Developing and Evaluating Three Formats for the Assessment of Online Reading Comprehension: The ORCA Project
Results from an Initial Practicality Survey Designed to Inform Development of Online Reading Comprehension Assessments
Sally Drew & W. Ian O'Byrne, University of Connecticut

A history of resistance to educational innovation with technology (Cuban, Kirkpatrick, & Peck, 2001) leads us to believe that no assessment of online reading comprehension will be used unless it is practical.
A 22-item instrument based on previous instruments of practicality (Center for Applied Linguistics, 2008; Irving, 2001; Swierzbin, Anderson, Spicuzza, Walz, & Thurlow, 1999; Shepard, 1977; Stufflebeam, 1974) was administered to a set of diverse and highly knowledgeable school leaders who serve on the Governing Board of the Regional Educational Lab – Northeast and Islands (REL-NEI). Respondents include:
• State Commissioner of Ed. [2]
• State Asst. Commissioner [5]
• Superintendent of Schools [3]
• Other (Ed. Consultant, Director of Curriculum & Instruction, Teacher's Federation Representative) [8]

EASE OF SCORING
94% of respondents claim that ease of scoring is essential, or at least very important. 64% of respondents feel that ease of scoring is essential or very important, while 24% feel it is somewhat important. Significantly, 94% of respondents claim that an assessment of reading comprehension must be easy to interpret. Other major themes include:
• Ease of scoring and interpretation are mandatory or the instrument will not be used by teachers (6 responses)
• Correctly interpreted information is needed to inform instruction (5 responses)
• Interpretations need to be correct/valid and reported to students/parents/schools (2 responses)
When asked to rank order the issues of practicality, 53% of respondents moderately value ease of scoring, while 35% claim ease of scoring to be the least important issue.

MEANINGFUL & RELEVANT DATA IS OBTAINED
The majority of respondents claim that the instrument should represent either a simulated Internet environment (41%) or the actual Internet (41%); 6% of respondents prefer multiple choice. Other major themes include:
• 24% of respondents claim the speed or automaticity with which students read and solve informational problems online is essential, while 77% claim it is somewhat important.
• 100% of respondents claim that the accuracy with which students read and solve informational problems online is essential.
• 94% of respondents claim that measurement of specific online reading comprehension skills (question, locate, evaluate, synthesize, communicate) is essential.
When asked to rank order the issues of practicality, 59% of respondents viewed the extent to which relevant information is obtained to be of highest importance.

AVAILABILITY OF THE TECHNOLOGY
In measuring students' online reading comprehension, the availability of the technology is a key factor. 65% of respondents claim that a typical school district would have adequate technology resources to enable students to take an assessment online. Other major themes include:
• Access to technology (6 responses)
• Scheduling of students/computers/other assessments (2 responses)
• Training of staff/students (2 responses)
When asked to rank order the issues of practicality, 65% of respondents claim that the availability of technology is the most important or close to the most important, while 24% claim it to be of little to no importance.
EASE OF USE
59% of respondents claim that ease of use and interpretation is essential, while 35% claim that it is important, and 6% claim that it is not vital to the practicality of the instrument. Other major themes include:
• Time/cost invested by staff/students in training, administration, scoring, and interpretation (7 responses)
• Easy-to-use and easy-to-interpret instructions (4 responses)
• Results able to inform instruction, curricula, and other assessments (3 responses)
When asked to rank order the issues of practicality, 47% of respondents valued ease of use in an assessment as fairly important, while 24% responded that it is very important.

ADMINISTRATION TIME
35% of respondents claim that school districts should devote 40 minutes per year to assessments of online reading comprehension, while 29% claim that school districts should devote 80 minutes per year. Another 29% claim that such assessments should be integrated regularly into instruction, administered monthly, or given as much time as needed. Other major themes include:
• A need for frequent, "on-demand" assessments (5 responses)
• A demand for assessments of this kind (4 responses)
• Guidance on implementation of assessments into instruction (3 responses)
When asked to rank order the issues of practicality, an overwhelming 82% of respondents claim that administration time is the least or second to least important issue.

RANK ORDER OF ISSUES OF PRACTICALITY
Percentages represent rank-ordered values for 16 respondents. In the matrix below, 1 identifies the most important issue and 5 the least important.

                                   1      2      3      4      5
Availability of technology        24%    41%     6%    12%    12%
Ease of use                       12%    24%    47%    12%     0%
Administration time                0%     6%     6%    41%    41%
Ease of scoring                    0%     6%    24%    29%    35%
Meaningful/relevant information   59%    18%    12%     0%     5%
  • 3. Using Cognitive Labs to Refine Item Design for Assessments of Online Reading Comprehension in Real-Time Unbounded Internet Environments
Julie Coiro, University of Rhode Island; Lisa Zawilinski, University of Connecticut; Carita Kiili, University of Jyväskylä

PHASE 2: INSTANT MESSAGE/EARLY FACEBOOK
• Interface: Instant Message and early Facebook prototypes were used to give prompts and collect responses
• Purposes:
1. To explore three formats of Open ORCA (Hybrid, Notepad, IM) and a range of communication tools (wikis, discussion boards, email, and blogs), using Survey Monkey and IM to capture student responses and outside interfaces
2. To get feedback on clarity of wording, task authenticity, and time to complete tasks
3. To use students' skilled searching pathways in the Open environment to inform the algorithms for websites to be included in the ORCA-Closed environment
4. To use students' responses to inform rubric development and operationalize scorepoints
• Population: N = 39 students (1-2 sessions each)

REVISIONS INFORMED BY PHASE 2 COGNITIVE LAB EXPERIENCES
TASK DEFINITION/SCOREPOINTS
• Contextualize tasks with a real purpose and authentic communication tool use so they less resemble a testing situation.
• Preface the actual assessment with directions explaining that these problems are only scenarios, so students are not led to believe they are really true.
• Refine evaluation tasks to break them down into independent scorepoints for the author's name, authority, agenda, and the reliability of claims made.
• Use prompts to parse out synthesis tasks across the whole task (within a website, across two websites, across all four websites) and ask students to "use your own words" and "explain why these are important facts."
• Focus communication scorepoints on tool use, tone, organization, and clarity so as not to confound them with synthesis.
INTERFACE DESIGN AND TIMING
• We still need to resolve the issue that some students skip embedded introductory links or go right to the task without reading explicit requests after the introduction.
• Embed timed supports to move students along if they can't locate information; we still need to test the appropriate wait time before scaffolds appear.
• Locating scorepoints will require hand scoring in ORCA-Open until we build a list of relevant websites for the computer to match against, and even then web pages will still come and go.
• Scoring the nuances of synthesis (combining information from two sources in your own words) will require hand scoring in ORCA-Open.

OVERVIEW AND KEY TERMINOLOGY
The purpose of these cognitive labs was to develop, test, and refine a set of eight online information problem solving scenarios that represent alternative forms of an Online Reading Comprehension Assessment (ORCA).
• An ORCA-Open is designed to assess the real-time reading processes and products required as students Locate, Evaluate, Synthesize, and Communicate (LESC) information while reading in the Open Internet, a dynamic and unbounded online digital information environment.
• Each scenario (i.e., LESC) requires students to locate, evaluate, synthesize, and communicate information that focuses on a different body part (e.g., lungs, heart, eyes, or ears) and a related science topic (e.g., asthma, heart-healthy snacks, decorative contact lenses, safe music volume levels). See Figure 1.
For this project, an ORCA-Open consists of 32 items, which are grouped into two scenarios, or LESCs. Each LESC includes 16 items designed to measure reading processes and products in an unbounded, real-time Open Internet environment related to four components of online reading:
• Locate tasks (4 items) require students to use search engines, efficiently read search results, and identify websites with information that can be used to solve the information problem scenario.
• Evaluate tasks (4 items) require students to identify a website's author and evaluate his or her level of expertise, consider the quality of the evidence an author provides, and evaluate the reliability of author claims related to the problem scenario.
• Synthesis tasks (4 items) require students to integrate information intratextually (across multiple ideas within one website) and intertextually (across multiple websites) in their own words, take a position on the issues involved, and use evidence from multiple online sources to support their thinking.
• Communicate tasks (4 items) require students to access information in an email or wiki space and respond with information they have learned about the scenario in an appropriately crafted, visually organized, and clear message.
An ORCA-Open consists of one 16-item restricted task and one 16-item unrestricted task; this combination provides information about online reading comprehension proficiency when reading for both types of online purposes.
• An ORCA-Open Restricted Task is an online reading task for which the information space used to locate relevant claims is restricted to a particular set of online resources found on the Open Internet related to a topic.
• An ORCA-Open Unrestricted Task is an online reading task for which the information space used to locate relevant claims is left open to any online sources found on the Open Internet related to a topic.

PHASE 1: SURVEY MONKEY PROTOTYPE
• Interface: Survey Monkey was used to give prompts and collect responses
• Purpose: To pilot early prototypes of Open LESC topics and get student feedback about how tasks were defined, clarity of directions, vocabulary challenges, and requests for interface design and topics.
• Population: N = 8 students (2 sessions each)

REVISIONS INFORMED BY PHASE 1 COGNITIVE LAB EXPERIENCES
TASK DEFINITION/SCOREPOINTS
1. Reorganized the lengthy introduction into a numbered list of steps preceded by two sentences to set the scenario.
2. Clarified the synthesis task to "Tell us, in your own words, what you learned from the things you read" to move students from compiling information to thinking and synthesizing. Also revised synthesis prompts with cues to "use evidence to support your thinking" to force the use of online reading (rather than only referencing prior knowledge).
3. Refined wording for aspects of critical evaluation (e.g., relevance, reliability, author's purpose, and level of expertise) and continued to grapple with which aspects were most important to capture.
INTERFACE DESIGN AND TIMING ISSUES
1. Added a "during reading" note-taking space within the Survey Monkey format (as opposed to a separate word-processing document), but felt it did not authentically capture synthesis.
2. To avoid students closing windows needed later in the task, added wording to the locate task steps that said, "Leave the website open. You may need it later."
3. To avoid data loss, we need to design a plan to help students get back into the capture tool more readily if they accidentally close out of it.
4. Student requests for typical composing functions in online communication tools (tabbing, bold, bullets, numbered lists) were not possible in Survey Monkey. However, this finding led to discussion and concerns about aligning authentic communication purposes and tools.
5. To ensure two LESCs were completed within the one-hour time limit, we told students when they began that they would have approximately 30 minutes for each task and that they should complete it as quickly as possible.

PHASE 3: COMPLETED ORCA-OPEN FACEBOOK INTERFACE
• Interface: A simulated Facebook interface was used to give prompts and collect responses.
• Population: N = 6 (we will continue to collect more data in early 2011).
• Purpose: To test all 8 LESC versions with the Facebook interface and timed prompts.
[Screenshots: Evaluate, Locate, Synthesize, and Communicate tasks in the Facebook interface]

PRELIMINARY REVISION IDEAS INFORMED BY PHASE 3 COGNITIVE LAB EXPERIENCES
TASK DEFINITION/SCOREPOINTS
1. Clarify directions for the initial locating task that asks only for the URL, or leave out the context setting at the beginning so students don't begin synthesizing yet.
2. Test a range of critical evaluation tasks (e.g., source evaluation, reliability, and point of view) on a larger sample before making final decisions about which item(s) to include.
3. Test on a larger sample the second part of the synthesis task ("tell us why this information is important") to see if it needs to be asked separately from simply listing the claims made.
4. Reconsider multiple "acceptable locations" for the wiki post when defining scorepoints.
INTERFACE DESIGN AND TIMING
1. We still need to resolve confusion with the introductory scenario: do we force students to view the context-setting wiki post (as we do with the email), or do we give them directions prior to starting the tasks?
2. Do we turn off the "Instant Google" feature to keep the environment similar to the closed environment, or do we keep it and adjust scorepoints (which may continue to need revision) as Google gets smarter and strategies change?
3. To avoid surprise, preface tasks with directions that prompts may come from several different people within the environment.
4. Consider adding more false information to the wiki to encourage text revision rather than only adding new information.
5. Consider adding an email attachment feature (accessed with a hyperlink) to the email task.
6. Continue testing timed support prompts with a range of students to get an acceptable time limit before help is offered (see the sketch below).
[Figure: Cognitive Lab Cycle]
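The timed support prompts mentioned in both the Phase 2 and Phase 3 notes are still being tuned, and the poster does not specify how wait times are implemented. As a rough sketch only, the Python snippet below (with invented wait-time values and function names) shows one way a prompt scheduler might decide when to surface a support message if a student has not yet submitted a locate response.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical wait times (in seconds) before each timed support appears.
# The poster notes that appropriate wait times still need to be tested with students.
SCAFFOLD_WAIT_TIMES = [120, 240]  # e.g., a hint after 2 minutes, a stronger prompt after 4

@dataclass
class TaskState:
    started_at: float            # when the student began the locate task (seconds)
    has_response: bool = False   # has the student submitted a URL yet?
    scaffolds_shown: int = 0     # how many timed supports have appeared so far

def next_scaffold(state: TaskState, now: float) -> Optional[str]:
    """Return the next support prompt to show, or None if nothing is due yet."""
    if state.has_response or state.scaffolds_shown >= len(SCAFFOLD_WAIT_TIMES):
        return None
    elapsed = now - state.started_at
    if elapsed >= SCAFFOLD_WAIT_TIMES[state.scaffolds_shown]:
        state.scaffolds_shown += 1
        return f"Having trouble finding a website? Try different search terms. (support {state.scaffolds_shown})"
    return None

# A student who has not responded after 3 minutes gets the first support,
# but the second support is not yet due.
state = TaskState(started_at=0.0)
print(next_scaffold(state, now=180.0))  # -> first support message
print(next_scaffold(state, now=200.0))  # -> None
```

Whatever the mechanism, the open question from the cognitive labs is the wait-time values themselves, which the team planned to test with a wider range of students.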
  • 4. The Challenges and Opportunities of a Closed Internet Environment for Assessing Online Reading Comprehension
Michael Hillinger & Mark Lorah, LexIcon Systems

GOAL
Create environments in which students can Locate, Evaluate, Synthesize, and Communicate information within real and simulated web spaces.

APPROACH
Response Capture Object (RCO): pose questions and record responses.

Challenge: Create an ecologically valid experience. Early ORCAs used off-the-shelf survey software. This was easy to configure but conveyed a clear distinction between the task requirements and the web environment. Our goal was to present the task in a more familiar environment, blending task requirements and information source into a single web experience.
Solution: Embed the questions into a Facebook-like interface. Facebook is a social networking tool that presents statements and invites responses. With multiple modalities, including Newsfeed, IM, and email, the interface is familiar to our audience. Using comments and feedback from other "students," and using the student's name in all posts, personalizes the experience.

Challenge: Balance flexible responses with ease of scoring. The primary data are the student responses to LESC questions, and students are encouraged to use their own words. This precludes easily scored responses such as multiple choice.
Solution: This remains an ongoing issue. One approach is to provide string-matching analysis for responses that require known text, such as a URL (see the sketch below). However, this version will likely still need significant hand scoring.

Challenge: Capture process measures. As important as the student response is the process that led to it. What search terms are used, which websites receive the most attention, which page links are clicked?
Solution: The closed ORCA captures a variety of measures. The RCO for the open ORCA will capture a time-stamped record of actions within the Facebook interface. Because the closed ORCA also includes the websites, there will be an opportunity to gather data on a wider variety of measures.

[Figure: Open ORCA and Closed ORCA each include RCO-Facebook, RCO-Wiki, and RCO-email components.]
RCO-Facebook: Newsfeed. The Response Capture Object Facebook component is a flexible presentation and response tool that emulates a newsfeed stream. All of the content is predefined in easy-to-modify XML files.
RCO-Email. The Response Capture Object email component allows a multiple-entry inbox, text formatting, and attachments. All of the content is predefined in XML files.
RCO-Wiki. The Response Capture Object wiki component allows a multiple-entry inbox, text formatting, and attachments. All of the content is predefined in XML files.
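The poster names string matching on known text (such as a URL) as one scoring approach but gives no implementation detail. The Python sketch below is a minimal illustration under that assumption; the accepted URLs are invented placeholders, and unmatched responses would still fall to hand scoring.

```python
from urllib.parse import urlparse

# Hypothetical answer key: pages judged relevant to a scenario.
# Real target URLs are not listed on the poster.
ACCEPTED_URLS = {
    "example.org/hearing/safe-volume",
    "example.org/audiology/headphones",
}

def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial differences don't fail the match."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url          # students often omit the scheme
    parsed = urlparse(url.lower())
    host = parsed.netloc.removeprefix("www.")
    path = parsed.path.rstrip("/")
    return f"{host}{path}"

def score_locate_response(response: str) -> int:
    """Return 1 if the response matches an accepted URL, else 0 (flag for hand scoring)."""
    return int(normalize(response) in {normalize(u) for u in ACCEPTED_URLS})

print(score_locate_response("https://www.Example.org/hearing/safe-volume/"))  # 1
print(score_locate_response("myblog.example.com/post"))                        # 0
```

Normalizing the host and path before comparison is one way to keep trivially different student entries (scheme, capitalization, trailing slash) from failing the match.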
CLOSED ENVIRONMENT (CE)
Provide a stable Internet search space.

Challenge: Create a realistic exploration environment. The web is a dynamic and complex environment; a simple simulation will not afford the same level of realism.
Solution: Use a real browser with standard HTML protocols. Using the existing mechanisms, we create HTML pages in a Closed Environment (CE) on the web. An open source search engine is the core for our "Gloogle" search page. There are currently about 200 pages in the CE to cover the 8 topics, including both target pages and distractors. The search engine results can be "tuned" using metadata on each page to approximate search results on the web (see the sketch at the end of this panel).

Challenge: Balance realism against resources. Even the simplest page on the web can contain animations, paid ads, and multiple links to other pages containing the same. Reproducing this, even with our limited number of pages, would require thousands of web pages.
Solution: Limit links and use screen captures for non-critical information. Most of the information on the target web pages is not relevant to answering the question. The sites in the CE will have live links to pages judged to be key; clicking on other links will lead to dead ends or no response. All clicks will be recorded to assess how many live and static links are used.

COMPONENTS: CLOSED AND OPEN ORCAs
The closed ORCA uses the Response Capture Object (RCO) connected to the Closed Environment (CE). The open ORCA uses the Response Capture Object (RCO) connected to the web.
The Open ORCA relies on student access to the web for responding to the questions in each topic. While the open format requires relatively low development effort, it has significant shortcomings that prevent its widespread adoption as an assessment standard, including reliance on web content that can change without warning and limited opportunity to track student actions.
The alternative is the Closed ORCA. In this approach, web exploration is confined to a Closed Environment (CE) of web pages using a specially developed search engine. The closed ORCA provides a stable collection of web pages that can be adapted to specific requirements. It also provides a mechanism for more complete tracking of student exploration.
Both the Open and Closed ORCAs share a common set of data presentation and collection tools, collectively called the Response Capture Object.

RCO-Facebook: IM. [Figure shows the Facebook Instant Messenger component.]
Closed Environment: "Gloogle" Search Engine. Using an open source search tool provides flexibility and realistic search results.
Web Pages. The CE pages use more screen-capture images and fewer live links than the live web pages. [Figure compares Closed Environment pages with live web pages.]
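The poster states that search results in the Closed Environment can be "tuned" with metadata on each page, but it does not name the open source engine or its ranking mechanism. As an illustration only, the Python sketch below uses an invented corpus of CE pages with keyword metadata and a boost weight to show how target pages could be made to outrank distractors for a scenario query.

```python
# Illustrative only: a tiny ranking function over Closed Environment pages.
# Page metadata (keywords, boost) is invented; the real CE uses an
# open source search engine and per-page metadata not described in detail.
CE_PAGES = [
    {"url": "ce/heart/energy-drinks-study.html",
     "keywords": {"energy", "drinks", "heart", "health"}, "boost": 2.0},
    {"url": "ce/heart/sports-ads.html",
     "keywords": {"energy", "drinks", "sports"}, "boost": 0.5},   # distractor
    {"url": "ce/ears/safe-volume.html",
     "keywords": {"music", "volume", "hearing"}, "boost": 1.0},
]

def gloogle_search(query: str, pages=CE_PAGES, top_k: int = 5):
    """Rank CE pages by keyword overlap with the query, scaled by a tunable boost."""
    terms = set(query.lower().replace("?", " ").split())
    scored = []
    for page in pages:
        overlap = len(terms & page["keywords"])
        if overlap:
            scored.append((overlap * page["boost"], page["url"]))
    scored.sort(reverse=True)
    return [url for _, url in scored[:top_k]]

print(gloogle_search("What effect do energy drinks have on heart health?"))
# The boosted study page outranks the sports-ad distractor.
```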
  • 5. Using Cognitive Labs to Refine Item Design for Multiple Choice Assessments of Online Reading Comprehension
Heidi Everett-Cacopardo & J. Gregory McVerry

RESEARCH QUESTION
• How valid are scores on a multiple choice measure of online reading comprehension focused on Locating, Evaluating, Synthesizing, and Communicating?

PROCEDURES
• Focus group of experts used for item development and the first round of validation.
• Structurally prompted think-alouds (Afflerbach, 2002) utilized during cognitive labs (Ericsson & Simon, 1999) with students.

DATA SOURCES
• Researcher field notes and screen-capture videos

ANALYSIS
• General inductive methods (Merriam, 1998) used to identify patterns in student responses and use of comprehension processes.

GOAL: Create a 32-item multiple choice assessment.
Item Development: Create an initial battery of 64 items (4 content areas x 4 constructs x 4 items for each construct); the Scientific Advisory Board would then choose the 32 best items. Item difficulty by construct:
Easy
• Locate: identify/define key terms
• Evaluate: determine appropriate links to learn more about the author
• Synthesize: combine information from two sentences in the same paragraph/text/image
• Communicate: use a particular feature within a communication tool
Medium 1
• Locate: work with the concept as a system
• Evaluate: identify relevant information about the author's level of expertise
• Synthesize: combine two modes of information
• Communicate: select the most appropriate communication tool
Medium 2
• Locate: work with the concept in relation to other organs
• Evaluate: compare/contrast information
• Synthesize: combine information across two web pages/screenshots
• Communicate: communicate information with a particular tool using appropriate tone/discourse
Difficult
• Locate: given a scenario, locate information in the text/image
• Evaluate: given a scenario, select questionable information on a website
• Synthesize: given a scenario, read across both websites/modes
• Communicate: label/organize information correctly to share with a particular audience
Example Item: Synthesis, Medium
Cognitive Lab Results: No cognitive labs were conducted. The focus group determined that scenario-based items with lower content demands would be a more valid measure of online reading comprehension.
Conclusions: Made the decision to develop the open format first; parallel MC items would then be developed.

GOAL: Create a 16-item scenario-based multiple choice assessment.
Item Development: Developed an 8-item MC assessment around the problem "Can Chihuahua dogs cure asthma?" The question stem and answer choices were provided on a sheet of paper; screenshots of websites and search results were given on separate sheets. Students focused on answering the problem and were not able to keep in mind the task that needed to be completed within each step.
Example Item: Locate; Example Item: Evaluate
One student suggested providing an additional page from the website so they could orient themselves with the "about" page.
Cognitive Lab Results: Cognitive labs were conducted with three students. Participants were engaged by the topic but struggled with vocabulary.
Conclusions: Made the decision to test a different item format and to further develop the open ORCA to help determine the skills assessed and identify distractors.

GOAL: Create a 16-item multiple choice assessment for each content area.
Item Development: Create an initial battery of 64 items (4 content areas x 4 constructs x 4 items); choose the two best per cell for the final 32-item assessment (a small sketch of this blueprint arithmetic appears after this panel).
An algorithm was developed to choose websites and distractors.
Cognitive Lab Results: Cognitive labs are currently underway. Preliminary results suggest students forget the question stem when reading multiple passages and want uniformity in item format.
Future Directions: Compare formats: a response capture object with static images versus hyperlinked mock websites.
[Example items: Locate, Evaluate, Communicate, Synthesis]
Student responses from the cognitive labs:
• "A, because the… by the newspaper in Boston… this is the only one related to Boston."
• "C, because it says what she actually has done physically and what she does for health."
• "C, because… well, this is mostly about heart problems and this one has some things about heart problems, but if it had to be about both of them… then I would choose one that is more vague, but it's about heart problems."
• "A, because if you are sending a new message you would want compose, but you are responding, so reply."
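The item-development blueprints described in this panel reduce to simple arithmetic: 4 content areas x 4 constructs x 4 candidate items gives a 64-item pool, and keeping the two best items per cell yields the 32-item form. The Python sketch below just makes that count explicit; the item ratings are placeholders standing in for the Scientific Advisory Board's judgments.

```python
from itertools import product

# Blueprint facts from the poster: 4 content areas x 4 constructs x 4 candidate items.
CONTENT_AREAS = ["eyes", "ears", "heart", "lungs"]
CONSTRUCTS = ["Locate", "Evaluate", "Synthesize", "Communicate"]
CANDIDATES_PER_CELL = 4
KEEP_PER_CELL = 2   # "choose two best" -> 4 x 4 x 2 = 32 items

# Placeholder ratings; in practice expert judgments would supply these.
candidate_pool = {
    (area, construct): [
        {"item_id": f"{area}-{construct}-{i}", "rating": i}
        for i in range(CANDIDATES_PER_CELL)
    ]
    for area, construct in product(CONTENT_AREAS, CONSTRUCTS)
}

print("candidate items:", sum(len(v) for v in candidate_pool.values()))  # 64

final_form = [
    item
    for cell in candidate_pool.values()
    for item in sorted(cell, key=lambda x: x["rating"], reverse=True)[:KEEP_PER_CELL]
]
print("items on final form:", len(final_form))  # 32
```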