Economic Perspectives on Standardized Testing
Richard P. Phelps
© 2002, by Richard P. Phelps
Economic Perspectives on
Standardized Testing: Outline
1. Why can’t economists and psychologists just get along?
2. Overview of economic theory as it pertains to education & testing
3. Human capital theory and the economics of information
4. Supply & demand; benefits & costs; goods & bads
5. The cost of standardized testing (from society’s point of view)
6. The benefits of standardized testing (information)
7. The benefits of standardized testing (motivation)
8. Optimal testing system structures
9. Optimal testing industry structures
10. Discussion
Topic 1: Why can’t economists and psychologists just get
along?
1) Why can’t economists and
psychologists just get along?
[answer: sometimes they do]
• Tversky and Kahneman, two cognitive psychologists, asked
themselves why rational economic man patronizes casinos, where the
odds are against him.
• Their experiments revealed that tolerance of (or attraction to) risk
varies widely among individuals, and that most people weigh small risks
against low-probability, but very large, gains “sub-optimally.”
• Tversky’s and Kahneman’s work is now required reading for any
economics major
• Experimental economics, which strongly resembles cognitive
psychology in its methods, is now the fastest growing area of research
in the field.
1) Why can’t economists and
psychologists just get along?
[answer: sometimes they do not]
Test Utility research
• Thousands of studies conducted by I/O psychologists from the 1960s
through the 1980s
• Dozens of meta-analyses
• Even a few meta-analyses of the meta-analyses
• Few economists, then or now, even aware of the field
Decline in interest in Test Utility research
• Regulatory ruling against validity generalization in late 1980s by Civil
Rights office in Reagan administration
• National Research Council forms committee with curious membership
to critique a single Test Utility study (critique interpreted by many as a
condemnation of all Test Utility research)
Topic 2: Overview of economic theory as it pertains to
education & testing
2) Economic theory as it
pertains to education
in general
Traditionally, education economics conducted in 2 fields
Labor Economics
• Labor markets for teachers and graduates
• Returns (in wages) to investment (in years) in education
Public Finance
• Returns (in achievement, attainment) to investment (in tax revenues)
• Funding equity, adequacy, efficiency, & intra-metropolitan migration
2) Economic theory as it
pertains to testing
in particular
Human Capital Theory
• Higher wages over the long term can more than compensate for the
earnings foregone while still in school
• …assumed a strong correlation between accumulation (years in
school, any school) and earning power (applicable knowledge and
skills)
Economics of Information
• Basic economic assumption of “perfect information” is simplistic
• When buyer and seller have “asymmetric” information, classic
economic assumptions are not appropriate
Topic 3: Human capital theory and the economics of
information
3) Human capital
theory:
seminal works
• Human Capital (1964), Gary Becker
• Schooling, Experience, and Earnings (1974), Jacob Mincer
• Dozens of World Bank reports
3) Economics of
Information:
seminal works
• “The Market for Lemons” (1970) George Akerlof
– When buyers can evaluate a purchase based only on a quality assessment of the
entire group, sellers have an incentive to market poor quality merchandise and,
over time, the average quality of goods declines. Often-used counters to quality
decline are: guarantees, brand names, franchising, and credentials.
• “Economics of Imperfect Information” (1976) Rothschild, Stiglitz,
Grossman
– Perfectly competitive markets have perfect information. In markets without
perfect information, there is little incentive for private individuals to fill the
breach (Consumer Reports is an exception, and not very profitable). Thus,
there can be a role for government in promoting market efficiency by providing
information.
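The quality decline described in the “Market for Lemons” bullet can be illustrated with a minimal sketch. It is not Akerlof's model, only a toy version under stated assumptions: buyers pay the average quality of whatever is on offer, and any seller whose good is worth more than that price withdraws. The function name `lemons_unraveling` and the quality values are hypothetical.

```python
def lemons_unraveling(qualities, rounds=10):
    """Return the price (average quality on offer) in each round.

    Toy assumption: buyers cannot judge individual items, so they pay
    the group average; sellers of above-average goods then withdraw.
    """
    on_offer = sorted(qualities)
    history = []
    for _ in range(rounds):
        if not on_offer:
            break
        price = sum(on_offer) / len(on_offer)  # buyers pay expected quality
        history.append(price)
        # Sellers whose goods are worth more than the price leave the market.
        remaining = [q for q in on_offer if q <= price]
        if len(remaining) == len(on_offer):
            break  # nobody else leaves; the market has settled
        on_offer = remaining
    return history


# Hypothetical quality scores for ten sellers.
prices = lemons_unraveling([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
print(prices)  # [55.0, 30.0, 20.0, 15.0, 10.0]
```

In this sketch the average quality on offer falls every round until only the worst good remains. Guarantees, brand names, and credentials work by letting buyers assess individual items rather than the group.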
3) Screening,
signaling,
filtering,
credentialing, I
• Education and Jobs: The Great Training Robbery (1970), Ivar Berg
– Employers pay for credentials, not human capital; they know little to nothing of
the quality of education programs, only the perception thereof
• Generating Inequality (1972) Lester Thurow
– Employers want “trainable” employees, and judge that those who could endure
schooling are probably more trainable than those who could not
• Work of Piore and Doeringer on “Market Segmentation”
– Neither education nor education credentials matter in “secondary” labor markets,
only in “primary” market, with career ladders
3) Screening,
signaling,
filtering,
credentialing, II
• Market Signaling (1973), Michael Spence
– Diplomas are a signaling device to employers, who take a gamble with every
new hire; evidence that the graduate is hoping employers will conclude that
certain human capital has been obtained, but not proof that it has
• “On the Weak versus the Strong Version of the Screening Hypothesis”
(1979) George Psacharopoulos
– Weak: employers pay only higher starting wages for “better” credentials
– Strong: employers continue to pay higher wages for “better” credentials even
after they become familiar with each employee’s actual productivity
• “Higher Education as a Filter” (1973) Kenneth Arrow
• “The Theory of Screening” (1975) Joseph Stiglitz
3) Empirical and
theoretical
work on
standards
• Burton Weisbrod (1964)
– Discovered that 90% of adults are hired within the boundaries of a school district
other than the one from which they graduated
– So, employers are not familiar with and have no influence over the education
standards used to train virtually all their employees
• John Bishop (1980s)
– It is unreasonable to expect a teacher to be both a sympathetic coach and a
neutral judge. External exams let them be coaches exclusively, which is in
keeping with what most of them probably want anyway.
• Robert Costrell (1994)
– School district incentives are to inflate grades and socially promote. If they
maintain tough standards, they only hurt their own children in later competition
against graduates of other districts where standards are lax and grades inflated.
– Standards must be enforced externally, or they will not be.
Topic 4: Supply & demand; benefits & costs; goods & bads
4) Benefits & costs;
goods & bads
• Economists are (small d) democrats
– what is a “good” or a benefit is relative to each individual; the researcher
does not get to decide what is good or bad for the consumer; consumers
decide for themselves
– but, we’d all like more money (freely exchangeable) and more free time
• Economists assume we all want more of something (even if it is
spiritual enlightenment), and that we can’t always get it
• Benefits have two phases: creation and capture
– Not all potential benefits are realized, or “captured”
– (e.g.,) You do very well and learn very much at a college with a terrible
reputation, and then cannot get a job because of that reputation
4) The demand for
standardized testing
• Phelps (1998) - 40 years of public opinion poll data
– The adult public is not ignorant about standardized tests, since all have
taken many, for better or for worse
– Support for high-stakes standardized testing is overwhelming, and has
been consistently so for decades
– Most stakeholders, including students and parents, are strongly
supportive. Teachers are usually supportive, but don’t like being judged
for outcomes over which they have little control. Education professors
are strongly opposed. Administrators have been on the fence, may now
be opposed.
– The year 2000 “testing backlash” was a heavily hyped public relations
creature, completely unsupported by the objective evidence.
4) “Natural Experiments” in test
demand and valuation:
a) countries liberalize education,
b) drop test requirements,
c) find that standards deteriorate,
d) then revert to testing
• Many Western European and North American states
(1960s – 1970s)
• Many Post-Colonial, Newly-Independent states
(1940s – 1970s)
• Ex-Communist Eastern European states (1990s –
2000s)
4) Trends in test adding/dropping, OECD countries: 1974-1999
[Figure: Cumulative net change in number of tests in 30 countries and provinces, 1974 to 1999. X-axis: year; y-axis: number of tests (0 to 60).]
4) Countries adding or dropping large-scale, external testing, by type of testing: 1974-1999
Number of countries or provinces...
Type of testing                                     ...adding testing   ...dropping testing
Assessments                                         17                  0
Upper secondary exit exams                          12*                 0
University entrance exams                           5                   0
Subject-area end-of-course exams                    6                   0
Lower secondary exit or entrance exams              4                   2
Inclusion of voc/prof tracks in exit exam system    3                   0
Primary/secondary-level achievement testing         2                   1
Diagnostic testing                                  2                   0
TOTAL                                               51                  3
4) Countries with nationally standardized high-stakes exit exams, by level of education
Primary school:
Belgium (French); Italy; Netherlands; Russia; Singapore; Switzerland (some cantons)
Lower secondary school:
Belgium (French); Canada: Quebec; China; Czech Republic; Denmark; France; Hungary; Iceland; Ireland; Italy; Japan; Korea; Netherlands; New Zealand; Norway; Portugal; Russia; Singapore; Sweden; Switzerland; United Kingdom: England & Wales, Scotland
Upper secondary school:
Belgium: (Flemish) & (French); Canada: Alberta, British Columbia, Manitoba, New Brunswick, Newfoundland, Quebec; China; Denmark; Finland; France; Germany; Hungary; Iceland; Italy; Japan; Netherlands; Norway; Portugal; Russia; Singapore; Sweden; Switzerland; United Kingdom: England & Wales, Scotland
4) Demand for testing is not unlimited – saturation is possible
School district response to state test mandates (1991)
State and local tests' purpose and content are…    Percent of districts substituting state test
…exactly the same or very similar                   82
…somewhat or moderately similar                     69
…not at all similar or very little                  41
SOURCE: U.S. GAO, 1993.
Topic 5: The Cost of Standardized Testing
(from society’s point of view)
5) Cost jargon
• Marginal cost (the cost of the next unit): For a test, it is the cost that
is incurred due to the addition of a test, and only that cost.
– (e.g., during test administration, the school building must be maintained,
but such would be the case without a test, too. The test is not responsible
for this cost.)
– Subject-matter instruction occurs whether or not there is external testing,
so it also is not a cost of the test.
• Opportunity cost (cost of foregone opportunities (i.e., instead of
doing this, you could have been at work making money)): For a test,
the time a teacher spends preparing for, monitoring, or scoring a test is
time he could have been planning his course, grading homework, etc.
– If the teacher makes productive use of the time while students are taking a
test, there are no opportunity costs.
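The two definitions above can be made concrete with a small sketch. All names and dollar figures below are hypothetical, chosen only to mirror the slide's examples: building maintenance and subject-matter instruction are flagged as costs that exist with or without the test, so they are excluded from the marginal cost.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    name: str
    dollars_per_student: float
    incurred_without_test: bool  # True if the cost would exist with no test


def marginal_cost(items):
    """Sum only the costs that exist because the test was added."""
    return sum(i.dollars_per_student for i in items if not i.incurred_without_test)


def opportunity_cost(hours_diverted, hourly_value):
    """Value of the time taken away from other productive work."""
    return hours_diverted * hourly_value


# Hypothetical per-student figures, for illustration only.
items = [
    CostItem("building maintenance", 3.0, incurred_without_test=True),
    CostItem("subject-matter instruction", 40.0, incurred_without_test=True),
    CostItem("test booklets and scoring", 8.0, incurred_without_test=False),
]
print(marginal_cost(items))  # 8.0

# A teacher who works productively while students take the test
# diverts zero hours, so the opportunity cost is zero.
print(opportunity_cost(hours_diverted=0, hourly_value=30.0))  # 0.0
```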
5) Average all-inclusive per-student costs of two test types in states having both: 1990-91
                         Type of test
Cost factors             Multiple-choice   Performance
Start-up development     $2                $10
Ongoing, annual costs    $16               $33
SOURCE: U.S. GAO, 1993, p.43
5) Average per-student costs of two test types in states having both, with adjustments: 1990-91
Columns: (A) all systemwide tests; (B) sample of 11 state performance tests; (C) sample of 6 multiple-choice tests in those same states
                                                               (A)    (B)    (C)
All-inclusive marginal cost                                    $15    $33    $16
…minus adjustment for regular school year administration       -7     -15    -7
…minus adjustment for replacement of preexisting tests         -6     -12    -12
Marginal cost after adjustments                                $5     $11    $2
SOURCE: Phelps, 2000.
5) “Economies”
jargon
• The unit cost of producing your product declines the more of an
“economy” you have (because fixed/overhead costs get spread out)
– Scale – you can sell at lower cost because you make so many of them
– Scope – you can sell at lower cost because you make other stuff that is
similar, or in similar ways
– Learning – you figure out ways to be more efficient and productive as
you gain experience
• There are many “economies” (just like validities)
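The scale effect described above (fixed or overhead costs spread over more units) can be sketched in a few lines. The function `unit_cost` and the figures are hypothetical and are not taken from the GAO or Phelps data. Scope and learning economies are not modelled here.

```python
def unit_cost(fixed_cost, variable_cost_per_unit, units):
    """Average cost per unit when a fixed cost is spread over `units`."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + variable_cost_per_unit


# Hypothetical test program: $1,000,000 to develop, $5 per student to
# print, administer, and score.
for students in (10_000, 100_000, 1_000_000):
    print(students, unit_cost(1_000_000, 5.0, students))
# 10000 105.0
# 100000 15.0
# 1000000 6.0
```

As volume grows, the per-student cost approaches the variable cost alone, which is why a large established producer can undercut a small entrant.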
Economies of scale in state performance testing
Some economies of scope in state performance testing
5) General structure of testing costs
[Figure: grid of testing cost structures. One axis, scorers are: GROUPS of teachers or professional scorers; INDIVIDUAL teachers or professional scorers; a COMPUTER. Other axis, students take tests: EN MASSE; in GROUPS; ONE at a TIME.]
5) Slack capacity in U.S. students’ time = opportunity for windfall gain?
Average number of hours per day devoted to…
Region/Country         Sports   TV watching   Playing or socializing   Studying
USA                    2.2      2.6           2.5                      2.3
East Asia (N = 5)      0.9      2.4           1.3                      3.1
West Europe (N = 4)    1.6      2.0           2.4                      2.8
East Europe (N = 7)    1.6      2.6           2.5                      2.9
Topic 6: The Benefits of Standardized Testing
-- Information
6) Information
benefits of testing
• For whom? Could be anyone – student, parent, teacher, school,
public, postsecondary institution, employer, …
• Information can be used beneficially in:
– Diagnosis (of student, teacher, school, ….)
– Alignment (to standards, schedule, each other, …)
– Learning for teachers
– Goodwill with public
– Decisions (promotion, placement, selection, …)
6) Information
benefits of testing
– how are they
measured?
• Predictive validity (fairly measurable)
• Allocative efficiency (fairly measurable)
– (the greater the range restriction the higher the
allocative efficiency?)
• Alignment (not so easy to measure)
• Goodwill (not at all easy to measure)
Topic 7: The Benefits of Standardized Testing
-- Motivation
7) Motivational benefits of
testing
– how are they measured?
• In controlled experiments:
– Ex. A) One group is told the test at the end of the course comes with a
reward; control group told it does not count
– Ex. B) One group is tested throughout course; control group is not
• In large-scale studies--Graduates from regions with high-
stakes tests compared to their non-tested counterparts:
• By their relative performance on another, common test
• Their relative wages after graduation
• Their relative rates of dropout, persistence, attainment, …
• “Backwash Effect” (e.g., students in states with high-stakes
high school graduation tests perform better even on the 8th-grade
level IAEP, TIMSS, or NAEP)
7) Large-scale studies
finding benefits to the
use of external, high-
stakes examinations
• John Bishop (1980s+) several studies -- IAEP, TIMSS,
SAT, NY State, Canada, …
• Winfield; Fredericksen; Bishop; Jacobson (minimum
comp. states)
• Others: Graham, Husted (SAT); Grissmer, Flanagan
(NAEP); Phelps (TIMSS+); Carnoy (NAEP); Rosenshine
(NAEP); Braun (NAEP); Wenglinsky
7) Smaller-scale
studies finding
benefits to the use of
high-stakes
examinations
• Controlled experiments – Tuckman, Trimble; Webb; Wolf, Smith;
Egeland; Jones; Brown, Walberg; Tuckman; Khalaf, Hanna; others….
• Evaluations -- Anderson, Muir, Bateson, Blackmore, Rogers;
Heyneman; G.A.O.; Achieve; Stake, Theobald; Bond, Cohen; Calder;
Glassnap, Pogio, Miller; others…
• Case studies – S.R.E.B.; Schleisman; Neville; Goldberg, Roswell;
Schlawin; Delong; Lerner; Jett, Shafer; others…
7) Bishop's estimates of dollar value of high-stakes exams on student outcomes
Columns: (A) difference in standard deviation units; (B) difference in grade-level-equivalent units; (C) difference per student in net present value, in 1993 dollars*

Canada: High-stakes testing provinces vs. others
  (A) .233 (in math); .183 (in science)
  (B) .75 (in math); .67 (in science)
  (C) $13,370 (in math); $11,940 (in science)
USA: New York State vs. rest of U.S.
  (A) .164 (in SAT Verbal + Math)
  (B) .75 (verbal + math)
  (C) $13,370
IAEP: High-stakes testing countries vs. others
  (A) .586 (in math)
  (B) 2.0 (in math); .7 (in science)
  (C) $35,650 (in math); $12,480 (in science)
TIMSS: High-stakes testing countries vs. others
  (A) n/a
  (B) .9 (in math); 1.3 (in science)
  (C) $16,040 (in math); $23,170 (in science)

* Based on male-female average, averaged across six longitudinal studies, cited in Bishop, 1995a, Table 2, counting only
general academic achievement, not accounting for technical abilities.
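The dollar figures above are stated in net present value. As a reminder of what that means, here is a minimal sketch of the standard discounting formula. It does not reproduce Bishop's calculation; the annual gain, time horizon, and discount rate below are hypothetical placeholders.

```python
def net_present_value(annual_gain, years, discount_rate, delay=0):
    """Discount a constant annual gain received in years delay+1 .. delay+years.

    Each year's gain is divided by (1 + discount_rate) ** t, so gains
    that arrive later count for less today.
    """
    return sum(
        annual_gain / (1 + discount_rate) ** t
        for t in range(delay + 1, delay + years + 1)
    )


# Hypothetical: a $500 yearly earnings gain over a 40-year career,
# discounted at 3 percent a year.
undiscounted = 500 * 40
discounted = net_present_value(500, years=40, discount_rate=0.03)
print(undiscounted, round(discounted, 2))
```

The discounted total is always below the simple sum when the rate is positive, so an estimate quoted in net present value is the more conservative of the two.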
Topic 8: Optimal testing system structures
8) Single or multiple
target systems
• Becker and Rosen (1990)
– A “single target” examination (e.g., minimum competency) is problematic
• Set too high, slower kids will be discouraged and drop out
• Set too low, and advanced kids will be bored and may work less
– Examination systems should have multiple targets
• Empirical Studies of 1970s—1980s Minimum Competency Exams (e.g., Ligon,
Mangino, Babcock Johnstone, Brightman, Davis)
– Performance of lowest students did improve, but that of advanced students either
stayed flat, or decreased
• Jonathan Jacobson (1992)
– Longitudinal analysis of students from minimum competency states showed that
slowest students gained and middle students lost
– Probably, the test induced resource flows to the slow students and away from the
middle students
8) Examples of
multiple target
systems
• Hierarchical, or “tiered,” systems – British system, New York State
– All students must pass exams with broad, common requirements, but at choice of
levels (Advanced or Ordinary; Competency or Honors)
– British just recently changed, creating a hybrid that looks more like continental
exam systems
• Branched or parallel track systems – Most of Continental Europe
– Students choose (or the choice is made for them) where to concentrate their efforts,
and they are tested mostly on that concentration
– First branching (junior high level) into academic, general, vocational
– Second branching (high school level) into subject area or vocational concentration
8) Some current
research on testing
system structure
• John Bishop
– Suspects that standardized end-of-course or end-of-year examinations may be the
optimal form of standardized testing.
– Why? – perhaps because they combine the best of both worlds
• standardized and external
• concise, targeted, with very strong alignment between curriculum and test
• Value-added systems
– Concerns for volatility and fairness mandate that the testing be frequent – at least
annual
• Tests are not the only quality control measure; the question is how to optimize
the whole set (Phelps, Just for the Kids, others…)
8) The more high-stakes decision points, the better
the student performance ?
[Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country. X-axis: number of quality control measures used (0 to 20); y-axis: average percent correct, grades 7 & 8 (0 to 80); series: top-performing countries, bottom-performing countries.]
SOURCE: Phelps, 2001
8) Quality control has proportionally greater effect in
poorer countries
[Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country. X-axis: number of quality control measures used (per GDP/capita); y-axis: average percent correct, grades 7 & 8 (per GDP/capita).]
SOURCE: Phelps, 2001
Topic 9: Optimal testing industry structures
9) The industry
structure game,
in theory
• Selfish consumers want a perfectly competitive industry
– Lots of producers, cutthroat competition
– Easy producer entry to, exit from industry
– Low prices, lots of choice and information
• Selfish producers want to be monopolists
– Raise prices, lower quality
– Block new entrants, withhold information
9) The industry
structure game,
in practice
• Consumers want stable suppliers, salespeople they know, brand
names they can trust
– So, sure, they want competition, choice, and low prices…
– But, they do not want to have to try out a new brand of detergent after
every visit to the grocery store
• Producers try to avoid monopoly, or else get regulated or split up
– e.g., Microsoft pushes Apple and Corel to the brink of bankruptcy, then
tosses each of them a lifeline to keep them in business (barely)
– So, the goal is to approach having a monopoly without quite having one
9) Competitive
strategy theory
• In industries with steep economies (of scale, scope, learning, ….)
there is only room for so many producers
– If you do not have the relevant “economies” in your firm, you had better
focus on a specialty niche that makes you unique, or else get out
• (e.g.) General Electric/RCA Consumer Electronics (1987)
– Crowded field: Sony, Zenith, Philips, Toshiba, Mitsubishi, others
• Sony - technological edge, reputation for quality, could charge high prices
• Niche players – Mitsubishi (big screen TVs); Sharp (flat panels)
• Low cost players – Koreans had entered market, Chinese were purchasing the
facilities of bankrupt American firms (e.g., Admiral, Philco, Sylvania)
• Japanese manufacturers were building assembly plants in US and Mexico in
order to lower their shipping costs for large sets
– GE was “stuck in the middle” – could not compete on cost or quality and
had no unique niche – they sold out
9) Possible sources of
competitive advantage in
the testing industry
• Advantages related to scale economies
– Huge item banks take time to accumulate and test and they are
copyrighted (‘sunk costs’ => barrier to entry)
– Established client base, relationships
• Advantages related to scope economies
– Much psychometric expertise is equally useful across a variety of tests
– Customers’ needs largely similar across states, countries
– Good brand name provides instant cachet in new markets
• Advantages related to learning economies
– Experience working with, knowledge of clients
– Experience gained with a new type of product will lower cost for
subsequent, similar projects
9) Niche markets in
educational testing (where
“economies” may be of
little help)
• Custom-made performance tests, “built from scratch”
• Some special education and psychological testing that requires
one-on-one administration, highly-specialized protocols, or
licensed test administrators
• Some vocational-occupational testing that employs “hands on”
demonstrations observed by specialists
• Oral interviews
Topic 10: Discussion
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 

Economic perspectives on testing

  • 1. Economic Perspectives on Standardized Testing. Richard P. Phelps. © 2002, by Richard P. Phelps
  • 2. Economic Perspectives on Standardized Testing: Outline 1. Why can’t economists and psychologists just get along? 2. Overview of economic theory as it pertains to education & testing 3. Human capital theory and the economics of information 4. Supply & demand; benefits & costs; goods & bads 5. The cost of standardized testing (from society’s point of view) 6. The benefits of standardized testing (information) 7. The benefits of standardized testing (motivation) 8. Optimal testing system structures 9. Optimal testing industry structures 10. Discussion
  • 3. Topic 1: Why can’t economists and psychologists just get along?
  • 4. 1) Why can’t economists and psychologists just get along? [answer: sometimes they do] • Tversky and Kahneman, two cognitive psychologists, asked themselves why rational economic man patronizes casinos, where the odds are against him. • Their experiments revealed that tolerance of (or, attraction to) risk varies widely among individuals, and most weigh small risks against low-probability, but very large, gains “sub-optimally” • Tversky’s and Kahneman’s work is now required reading for any economics major • Experimental economics, which strongly resembles cognitive psychology in its methods, is now the fastest growing area of research in the field.
  • 5. 1) Why can’t economists and psychologists just get along? [answer: sometimes they do not] Test Utility research • Thousands of studies conducted by I/O psychologists from the 1960s through the 1980s • Dozens of meta-analyses • Even a few meta-analyses of the meta-analyses • Few economists, then or now, even aware of the field Decline in interest in Test Utility research • Regulatory ruling against validity generalization in late 1980s by Civil Rights office in Reagan administration • National Research Council forms committee with curious membership to critique a single Test Utility study (critique interpreted by many as a condemnation of all Test Utility research)
  • 6. Topic 2: Overview of economic theory as it pertains to education & testing
  • 7. 2) Economic theory as it pertains to education in general Traditionally, education economics conducted in 2 fields Labor Economics • Labor markets for teachers and graduates • Returns (in wages) to investment (in years) in education Public Finance • Returns (in achievement, attainment) to investment (in tax revenues) • Funding equity, adequacy, efficiency, & intra-metropolitan migration
  • 8. 2) Economic theory as it pertains to testing in particular Human Capital Theory • Higher wages over the long term can more than compensate for the earnings foregone while still in school • …assumed a strong correlation between accumulation (years in school, any school) and earning power (applicable knowledge and skills) Economics of Information • Basic economic assumption of “perfect information” is simplistic • When buyer and seller have “asymmetric” information, classic economic assumptions are not appropriate
  • 9. Topic 3: Human capital theory and the economics of information
  • 10. 3) Human capital theory: seminal works • Human Capital (1964), Gary Becker • Schooling, Experience, and Earnings (1974), Jacob Mincer • Dozens of World Bank reports
  • 11. 3) Economics of Information: seminal works • “The Market for Lemons” (1970) George Akerlof – When buyers can evaluate a purchase based only on a quality assessment of the entire group, sellers have an incentive to market poor quality merchandise and, over time, the average quality of goods declines. Often-used counters to quality decline are: guarantees, brand names, franchising, and credentials. • “Economics of Imperfect Information” (1976) Rothschild, Stiglitz, Grossman – Perfectly competitive markets have perfect information. In markets without perfect information, there is little incentive for private individuals to fill the breach (Consumer Reports is an exception, and not very profitable). Thus, there can be a role for government to promote market efficiency, by providing information.
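The quality decline described in the "Market for Lemons" bullet can be illustrated with a small simulation. This is only a stylized sketch, not Akerlof's model as published: it assumes that buyers offer the average quality of the goods currently on the market, and that a seller withdraws any good whose quality exceeds that offer. The function name `lemons_unraveling` and the quality values are hypothetical.

```python
def lemons_unraveling(qualities, max_rounds=20):
    """Return the price offered in each round of a stylized lemons market.

    Assumptions (illustrative only): buyers see just the group average,
    and sellers pull any good worth more than the price offered.
    """
    on_market = sorted(qualities)
    prices = []
    for _ in range(max_rounds):
        if not on_market:
            break
        price = sum(on_market) / len(on_market)  # buyers pay the group average
        prices.append(price)
        remaining = [q for q in on_market if q <= price]  # better goods exit
        if len(remaining) == len(on_market):
            break  # nobody else withdraws; the market has settled
        on_market = remaining
    return prices


# Hypothetical goods with quality 10, 20, ..., 100
prices = lemons_unraveling(range(10, 101, 10))
print(prices)
```

Under these assumptions the offered price falls every round until only the lowest-quality good remains, which is the pattern that guarantees, brand names, and credentials are meant to counter.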
  • 12. 3) Screening, signaling, filtering, credentialing, I • Education and Jobs: The Great Training Robbery (1970), Ivar Berg – Employers pay for credentials, not human capital; they know little to nothing of the quality of education programs, only the perception thereof • Generating Inequality (1972) Lester Thurow – Employers want “trainable” employees, and judge that those who could endure schooling are probably more trainable than those who could not • Work of Piore and Doeringer on “Market Segmentation” – Neither education nor education credentials matter in “secondary” labor markets, only in “primary” market, with career ladders
  • 13. 3) Screening, signaling, filtering, credentialing, II • Market Signaling (1973), Michael Spence – Diplomas are a signaling device to employers, who take a gamble with every new hire; evidence that the graduate is hoping employers will conclude that certain human capital has been obtained, but not proof that it has • “On the Weak versus the Strong Version of the Screening Hypothesis” (1979) George Psacharopoulos – Weak: employers pay only higher starting wages for “better” credentials – Strong: employers continue to pay higher wages for “better” credentials even after they become familiar with each employee’s actual productivity • “Higher Education as a Filter” (1973) Kenneth Arrow • “The Theory of Screening” (1975) Joseph Stiglitz
  • 14. 3) Empirical and theoretical work on standards • Burton Weisbrod (1964) – Discovered that 90% of adults are hired within the boundaries of a school district other than the one from which they graduated – So, employers are not familiar with and have no influence over the education standards used to train virtually all their employees • John Bishop (1980s) – It is unreasonable to expect a teacher to be both a sympathetic coach and a neutral judge. External exams let them be coaches exclusively, which is in keeping with what most of them probably want anyway. • Robert Costrell (1994) – School district incentives are to inflate grades and socially promote. If they maintain tough standards, they only hurt their own children in later competition against graduates of other districts where standards are lax and grades inflated. – Standards must be enforced externally, or they will not be.
  • 15. Topic 4: Supply & demand; benefits & costs; goods & bads
  • 16. 4) Benefits & costs; goods & bads • Economists are (small d) democrats – what is a “good” or a benefit is relative to each individual; the researcher does not get to decide what is good or bad for the consumer; consumers decide for themselves – but, we’d all like more money (freely exchangeable) and more free time • Economists assume we all want more of something (even if it is spiritual enlightenment), and that we can’t always get it • Benefits have two phases: creation and capture – Not all potential benefits are realized, or “captured” – (e.g.,) You do very well and learn very much at a college with a terrible reputation, and then cannot get a job because of that reputation
  • 17. 4) The demand for standardized testing • Phelps (1998) - 40 years of public opinion poll data – The adult public is not ignorant about standardized tests, since all have taken many, for better or for worse – Support for high-stakes standardized testing is overwhelming, and has been consistently so for decades – Most stakeholders, including students and parents, are strongly supportive. Teachers are usually supportive, but don’t like being judged for outcomes over which they have little control. Education professors are strongly opposed. Administrators have been on the fence, may now be opposed. – The year 2000 “testing backlash” was a heavily hyped public relations creation, completely unsupported by the objective evidence.
  • 18. 4) “Natural Experiments” in test demand and valuation: a) countries liberalize education, b) drop test requirements, c) find that standards deteriorate, d) then revert back to testing • Many Western European and North American states (1960s – 1970s) • Many Post-Colonial, Newly-Independent states (1940s – 1970s) • Ex-Communist Eastern European states (1990s – 2000s)
  • 19. 4) Trends in test adding/dropping, OECD countries: 1974-1999 [Figure: cumulative net change in number of tests in 30 countries and provinces, 1974 to 1999; x-axis: year; y-axis: number of tests]
  • 20. 4) Countries adding or dropping large-scale, external testing, by type of testing: 1974-1999
      Number of countries or provinces...
      Type of testing | ...adding testing | ...dropping testing
      Assessments | 17 | 0
      Upper secondary exit exams | 12* | 0
      University entrance exams | 5 | 0
      Subject-area end-of-course exams | 6 | 0
      Lower secondary exit or entrance exams | 4 | 2
      Inclusion of voc/prof tracks in exit exam system | 3 | 0
      Primary/secondary-level achievement testing | 2 | 1
      Diagnostic testing | 2 | 0
      TOTAL | 51 | 3
  • 21. 4) Countries with nationally standardized high-stakes exit exams, by level of education
      Primary school: Belgium (French); Italy; Netherlands; Russia; Singapore; Switzerland (some cantons)
      Lower secondary school: Belgium (French); Canada: Quebec; China; Czech Republic; Denmark; France; Hungary; Iceland; Ireland; Italy; Japan; Korea; Netherlands; New Zealand; Norway; Portugal; Russia; Singapore; Sweden; Switzerland; United Kingdom: England & Wales, Scotland
      Upper secondary school: Belgium: (Flemish) & (French); Canada: Alberta, British Columbia, Manitoba, New Brunswick, Newfoundland, Quebec; China; Denmark; Finland; France; Germany; Hungary; Iceland; Italy; Japan; Netherlands; Norway; Portugal; Russia; Singapore; Sweden; Switzerland; United Kingdom: England & Wales, Scotland
  • 22. 4) Demand for testing is not unlimited – saturation is possible. School district response to state test mandates (1991)
      State and local tests' purpose and content are… | Percent of districts substituting state test
      …exactly the same or very similar | 82
      …somewhat or moderately similar | 69
      …not at all similar or very little | 41
      SOURCE: U.S. GAO, 1993.
  • 23. Topic 5: The Cost of Standardized Testing (from society’s point of view)
  • 24. 5) Cost jargon • Marginal cost (the cost of the next unit): For a test, it is the cost that is incurred due to the addition of a test, and only that cost. – (e.g., during test administration, the school building must be maintained, but such would be the case without a test, too. The test is not responsible for this cost.) – Subject-matter instruction occurs whether or not there is external testing, so it also is not a cost of the test. • Opportunity cost (cost of foregone opportunities (i.e., instead of doing this, you could have been at work making money)): For a test, the time a teacher spends preparing for, monitoring, or scoring a test is time he could have been planning his course, grading homework, etc. – If the teacher makes productive use of the time while students are taking a test, there are no opportunity costs.
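The two cost definitions above can be sketched in a few lines. All dollar figures, hours, and names below are hypothetical and chosen only to show the logic: costs that would be incurred with or without the test (building maintenance, subject-matter instruction) cancel out of the marginal cost, and teacher time counts as an opportunity cost only to the extent it is diverted from other productive work.

```python
def marginal_cost(costs_with_test, costs_without_test):
    """Cost attributable to adding the test, and only that cost."""
    return sum(costs_with_test.values()) - sum(costs_without_test.values())


def opportunity_cost(hours_diverted, value_per_hour):
    """Value of the work a teacher could have done in the diverted time."""
    return hours_diverted * value_per_hour


# Hypothetical per-student costs (not taken from the slides)
without_test = {"building_maintenance": 40, "instruction": 500}
with_test = {
    "building_maintenance": 40,        # incurred either way
    "instruction": 500,                # incurred either way
    "test_materials_and_scoring": 12,  # exists only because of the test
}

test_marginal = marginal_cost(with_test, without_test)
teacher_cost = opportunity_cost(hours_diverted=3, value_per_hour=30)
no_diversion = opportunity_cost(hours_diverted=0, value_per_hour=30)

print(test_marginal, teacher_cost, no_diversion)
```

The last call mirrors the slide's final point: if the teacher uses the testing time productively, no hours are diverted and the opportunity cost is zero.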
  • 25. 5) Average all-inclusive per-student costs of two test types in states having both: 1990-91
      Type of test
      Cost factors | Multiple-choice | Performance
      Start-up development | $2 | $10
      Ongoing, annual costs | $16 | $33
      SOURCE: U.S. GAO, 1993, p.43
  • 26. 5) Average per-student costs of two test types in states having both, with adjustments: 1990-91
      Cost item | All systemwide tests | Sample of 11 state performance tests | Sample of 6 multiple-choice tests in those same states
      All-inclusive marginal cost | $15 | $33 | $16
      …minus adjustment for regular school year administration | -7 | -15 | -7
      ...minus adjustment for replacement of preexisting tests | -6 | -12 | -12
      Marginal cost after adjustments | $5 | $11 | $2
      SOURCE: Phelps, 2000.
  • 27. 5) “Economies” jargon • The unit cost of producing your product declines the more of an “economy” you have (because fixed/overhead costs get spread out) – Scale – you can sell at lower cost because you make so many of them – Scope – you can sell at lower cost because you make other stuff that is similar, or in similar ways – Learning – you figure out ways to be more efficient and productive as you gain experience • There are many “economies” (just like validities)
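The claim that unit cost declines as fixed or overhead costs are spread over more units can be written as unit cost = fixed cost / quantity + variable cost per unit. The sketch below assumes that simple two-part cost structure; the development cost and per-student cost are hypothetical figures, not estimates from the slides.

```python
def unit_cost(fixed_cost, variable_cost_per_unit, quantity):
    """Average cost per unit when a fixed cost is spread over `quantity` units."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return fixed_cost / quantity + variable_cost_per_unit


# Hypothetical test program: $1,000,000 to develop, $5 per student to administer
FIXED = 1_000_000
VARIABLE = 5

costs = {n: unit_cost(FIXED, VARIABLE, n) for n in (10_000, 100_000, 1_000_000)}
for students, cost in costs.items():
    print(f"{students:>9,} students: ${cost:,.2f} per student")
```

As the number of students tested grows, the per-student cost approaches the variable cost alone, which is the scale economy the slide describes.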
  • 28. Economies of scale in state performance testing
  • 29. Some economies of scope in state performance testing
  • 30. 5) General structure of testing costs
      Scorers are... GROUPS of teachers or professional scorers | INDIVIDUAL teachers or professional scorers | a COMPUTER
      Students take tests... EN MASSE | in GROUPS | ONE at a TIME
  • 31. 5) Slack capacity in U.S. students’ time = opportunity for windfall gain? Average number of hours per day devoted to…
      Region/Country | Sports | TV watching | Playing or socializing | Studying
      USA | 2.2 | 2.6 | 2.5 | 2.3
      East Asia (N = 5) | 0.9 | 2.4 | 1.3 | 3.1
      West Europe (N = 4) | 1.6 | 2.0 | 2.4 | 2.8
      East Europe (N = 7) | 1.6 | 2.6 | 2.5 | 2.9
  • 32. Topic 6: The Benefits of Standardized Testing -- Information
  • 33. 6) Information benefits of testing • For whom? Could be anyone – student, parent, teacher, school, public, postsecondary institution, employer, … • Information can be used beneficially in: – Diagnosis (of student, teacher, school, ….) – Alignment (to standards, schedule, each other, …) – Learning for teachers – Goodwill with public – Decisions (promotion, placement, selection, …)
  • 34. 6) Information benefits of testing – how are they measured? • Predictive validity (fairly measurable) • Allocative efficiency (fairly measurable) – (the greater the range restriction the higher the allocative efficiency?) • Alignment (not so easy to measure) • Goodwill (not at all easy to measure)
  • 35. Topic 7: The Benefits of Standardized Testing -- Motivation
  • 36. 7) Motivational benefits of testing – how are they measured? • In controlled experiments: – Ex. A) One group is told the test at the end of the course comes with a reward; control group told it does not count – Ex. B) One group is tested throughout course; control group is not • In large-scale studies--Graduates from regions with high-stakes tests compared to their non-tested counterparts: • By their relative performance on another, common test • Their relative wages after graduation • Their relative rates of dropout, persistence, attainment, … • “Backwash Effect” (e.g., students in states with high-stakes high school graduation tests perform better even on the 8th-grade-level IAEP, TIMSS, or NAEP)
  • 37. 7) Large-scale studies finding benefits to the use of external, high-stakes examinations • John Bishop (1980s+) several studies -- IAEP, TIMSS, SAT, NY State, Canada, … • Winfield; Fredericksen; Bishop; Jacobson (minimum comp. states) • Others: Graham, Husted (SAT); Grissmer, Flanagan (NAEP); Phelps (TIMSS+); Carnoy (NAEP); Rosenshine (NAEP); Braun (NAEP); Wenglinsky
  • 38. 7) Smaller-scale studies finding benefits to the use of high-stakes examinations • Controlled experiments – Tuckman, Trimble; Webb; Wolf, Smith; Egeland; Jones; Brown, Walberg; Tuckman; Khalaf, Hanna; others…. • Evaluations -- Anderson, Muir, Bateson, Blackmore, Rogers; Heyneman; G.A.O.; Achieve; Stake, Theobald; Bond, Cohen; Calder; Glassnap, Pogio, Miller; others… • Case studies – S.R.E.B.; Schleisman; Neville; Goldberg, Roswell; Schlawin; Delong; Lerner; Jett, Shafer; others…
  • 39. 7) Bishop's estimates of dollar value of high-stakes exams on student outcomes
      Comparison | Difference (in standard deviation units) | Difference (in grade-level-equivalent units) | Difference per student (in net present value) in 1993 dollars*
      Canada: High-stakes testing provinces vs. others | .233 (in math), .183 (in science) | .75 (in math), .67 (in science) | $13,370 (in math), $11,940 (in science)
      USA: New York State vs rest of U.S. | .164 (in SAT Verbal + Math) | .75 (verbal + math) | $13,370
      IAEP: High-stakes testing countries vs. others | .586 (in math) | 2.0 (in math), .7 (in science) | $35,650 (in math), $12,480 (in science)
      TIMSS: High-stakes testing countries vs. others | n/a | .9 (in math), 1.3 (in science) | $16,040 (in math), $23,170 (in science)
      * Based on male-female average, averaged across six longitudinal studies, cited in Bishop, 1995a, Table 2, counting only general academic achievement, not accounting for technical abilities.
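One way to read the table in slide 39 is to divide each dollar figure by its grade-level-equivalent (GLE) difference. The short check below does only that arithmetic on the numbers shown in the table; the interpretation that a single per-GLE dollar value underlies all rows is an inference from the ratios, not something the slide states, and the variable names are hypothetical.

```python
# (comparison, GLE difference, net present value in 1993 dollars), from slide 39
rows = [
    ("Canada, math", 0.75, 13_370),
    ("Canada, science", 0.67, 11_940),
    ("New York State, SAT verbal + math", 0.75, 13_370),
    ("IAEP, math", 2.0, 35_650),
    ("IAEP, science", 0.7, 12_480),
    ("TIMSS, math", 0.9, 16_040),
    ("TIMSS, science", 1.3, 23_170),
]

# Implied dollars per one grade-level equivalent for each row
dollars_per_gle = {label: value / gle for label, gle, value in rows}

for label, ratio in dollars_per_gle.items():
    print(f"{label:<36} ${ratio:,.0f} per GLE")
```

Every row works out to roughly $17,800 per grade-level equivalent, which suggests the dollar column scales the GLE column by a near-constant factor.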
  • 40. Topic 8: Optimal testing system structures
  • 41. 8) Single or multiple target systems • Becker and Rosen (1990) – A “single target” examination (e.g., minimum competency) is problematic • Set too high, slower kids will be discouraged and drop out • Set too low, and advanced kids will be bored and may work less – Examination systems should have multiple targets • Empirical Studies of 1970s—1980s Minimum Competency Exams (e.g., Ligon, Mangino, Babcock Johnstone, Brightman, Davis) – Performance of lowest students did improve, but that of advanced students either stayed flat, or decreased • Jonathan Jacobson (1992) – Longitudinal analysis of students from minimum competency states showed that slowest students gained and middle students lost – Probably, the test induced resource flows to the slow students and away from the middle students
  • 42. 8) Examples of multiple target systems • Hierarchical, or “tiered,” systems – British system, New York State – All students must pass exams with broad, common requirements, but at choice of levels (Advanced or Ordinary; Competency or Honors) – British just recently changed, creating a hybrid that looks more like continental exam systems • Branched or parallel track systems – Most of Continental Europe – Students choose (or the choice is made for them) where to concentrate their efforts, and they are tested mostly on that concentration – First branching (junior high level) into academic, general, vocational – Second branching (high school level) into subject area or vocational concentration
  • 43. 8) Some current research on testing system structure • John Bishop – Suspects that standardized end-of-course or end-of-year examinations may be the optimal form of standardized testing. – Why? – perhaps because they combine the best of both worlds • standardized and external • concise, targeted, with very strong alignment between curriculum and test • Value-added systems – Concerns for volatility and fairness mandate that the testing be frequent – at least annual • Tests are not the only quality control measure; how to optimize the whole set? (Phelps, Just for the Kids, others…)
  • 44. 8) The more high-stakes decision points, the better the student performance? [Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country; x-axis: number of quality control measures used; y-axis: average percent correct (grades 7 & 8); series: top-performing countries, bottom-performing countries] SOURCE: Phelps, 2001
  • 45. 8) Quality control has proportionally greater effect in poorer countries [Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country; x-axis: number of quality control measures used (per GDP/capita); y-axis: average percent correct, grades 7 & 8 (per GDP/capita)] SOURCE: Phelps, 2001
  • 46. Topic 9: Optimal testing industry structures
  • 47. 9) The industry structure game, in theory • Selfish consumers want a perfectly competitive industry – Lots of producers, cutthroat competition – Easy producer entry to, exit from industry – Low prices, lots of choice and information • Selfish producers want to be monopolists – Raise prices, lower quality – Block new entrants, withhold information
  • 48. 9) The industry structure game, in practice • Consumers want stable suppliers, salespeople they know, brand names they can trust – So, sure, they want competition, choice, and low prices… – But, they do not want to have to try out a new brand of detergent after every visit to the grocery store • Producers try to avoid monopoly, or else get regulated or split up – e.g., Microsoft pushes Apple and Corel to the brink of bankruptcy, then tosses each of them a lifeline to keep them in business (barely) – So, the goal is to approach having a monopoly without quite having one
  • 49. 9) Competitive strategy theory • In industries with steep economies (of scale, scope, learning, ….) there is only room for so many producers – If you do not have the relevant “economies” in your firm, you had better focus on a specialty niche that makes you unique, or else get out • (e.g.) General Electric/RCA Consumer Electronics (1987) – Crowded field: Sony, Zenith, Philips, Toshiba, Mitsubishi, others • Sony - technological edge, reputation for quality, could charge high prices • Niche players – Mitsubishi (big screen TVs); Sharp (flat panels) • Low cost players – Koreans had entered market, Chinese were purchasing the facilities of bankrupt American firms (e.g., Admiral, Philco, Sylvania) • Japanese manufacturers were building assembly plants in US and Mexico in order to lower their shipping costs for large sets – GE was “stuck in the middle” – could not compete on cost or quality and had no unique niche – they sold out
  • 50. 9) Possible sources of competitive advantage in the testing industry • Advantages related to scale economies – Huge item banks take time to accumulate and test and they are copyrighted (‘sunk costs’ => barrier to entry) – Established client base, relationships • Advantages related to scope economies – Much psychometric expertise is equally useful across a variety of tests – Customers’ needs largely similar across states, countries – Good brand name provides instant cachet in new markets • Advantages related to learning economies – Experience working with, knowledge of clients – Experience gained with a new type of product will lower cost for subsequent, similar projects
  • 51. 9) Niche markets in educational testing (where “economies” may be of little help) • Custom-made performance tests, “built from scratch” • Some special education and psychological testing that requires one-on-one administration, highly-specialized protocols, or licensed test administrators • Some vocational-occupational testing that employs “hands on” demonstrations observed by specialists • Oral interviews