2. Helping teachers set reasonable and rigorous goals
Presenter - John Cronin, Ph.D.
Contacting us:
Rebecca Moore: 503-548-5129
E-mail: rebecca.moore@nwea.org
3. What you’ll learn
• The purposes of teacher evaluation. There can be different purposes for
different educators.
• The value of differentiating educator evaluations and the risks associated
with not differentiating.
• The value of multiple measurements, and the importance of defining a
purpose for each measurement.
• The information needed to determine whether a goal is attainable.
• The difference between “aspirational” and “evaluative” goals and the
value of each.
• Thoughts on strategies for addressing the difficult to measure.
4. What’s the purpose?
• The Connecticut system requires a collaborative process.
• For most educators the purpose of evaluation is formative.
• For a small minority of educators the purpose is summative,
and goals may involve demonstrating basic competence.
• Leaders should be transparent about the purpose of the
process for each educator.
• Perfect consistency isn’t necessarily a requirement.
5. Differences between principal and
teacher evaluation
Principals
Thus principals should bestaff.
• Inherit a pre-existing
evaluated on their ability to improve
• Have limited control over
growth or maintain high levels of
staffing conditions
growth over time rather than their
students’with this intact school
• Work growth within a
year.
group from year to year.
6. Differences between principal and
teacher evaluation
Teachers
• Have new groups of students each year.
• Generally work with those students for one school year only.
• New teachers generally become more effective in their first three years.
7. The difference between formative and
summative evaluation
• Formative evaluation is intended to give educators
useful feedback to help them improve their job
performance. For most educators formative
evaluation should be the focus of the process.
• Summative evaluation is a judgment of educator
performance that informs future decisions about
employment, including granting of tenure,
performance pay, and protection from layoff.
8. Purposes of summative evaluation
• An accurate and defensible judgment of an
educator’s job performance.
• Ratings of performance that provide
meaningful differentiation across educators.
• Goals of evaluation
– Support professional improvement
– Retain your top educators
– Dismiss ineffective educators
9. If evaluators do not differentiate their ratings, then all differentiation comes from the test.
11. Results of Tennessee Teacher Evaluation Pilot
[Chart: percentage of teachers (0–60%) at each rating level, 1 through 5, comparing value-added results with observation results]
12. Results of Georgia Teacher Evaluation Pilot
[Chart: distribution of evaluator ratings: Ineffective, Minimally Effective, Effective, Highly Effective]
13. Connecticut expectations around
teacher observation
• First- and second-year teachers –
– Required – 3 formal observations
– Recommended – 3 formal observations and 3 informal observations
• Below standard and developing –
– Required – 3 formal observations
– Recommended – 3 formal observations and 5 informal observations
• Proficient and exemplary –
– Required – 3 formal observations – 1 in class
– Recommended – 3 formal observations – 1 in class
14. Reliability of evaluation weights in predicting year-to-year stability of student growth gains

Model (measure weights) | Reliability coefficient (relative to state test value-added gain) | Proportion of test variance explained
Model 1 – State test 81%, Student surveys 17%, Classroom observations 2% | .51 | 26.0%
Model 2 – State test 50%, Student surveys 25%, Classroom observations 25% | .66 | 43.5%
Model 3 – State test 33%, Student surveys 33%, Classroom observations 33% | .76 | 57.7%
Model 4 – Classroom observations 50%, State test 25%, Student surveys 25% | .75 | 56.2%

Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
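One relationship in this table is worth making explicit: the "proportion of test variance explained" column is simply the reliability coefficient squared. A minimal Python check, with the coefficients copied from the table (model labels abbreviated):

```python
# Check that "proportion of variance explained" is the square of the
# reliability coefficient (r^2) for each MET model in the table.
models = {
    "Model 1 (test 81% / surveys 17% / observations 2%)": 0.51,
    "Model 2 (test 50% / surveys 25% / observations 25%)": 0.66,
    "Model 3 (equal thirds)": 0.76,
    "Model 4 (observations 50% / test 25% / surveys 25%)": 0.75,
}

for name, r in models.items():
    print(f"{name}: r = {r:.2f}, variance explained = {r ** 2:.1%}")
```

The printed percentages match the table's right-hand column to within rounding.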
15. Reliability of a variety of teacher observation implementations

Observation by | Reliability coefficient (relative to state test value-added gain) | Proportion of test variance explained
Principal – 1 observation | .51 | 26.0%
Principal – 2 observations | .58 | 33.6%
Principal and other administrator | .67 | 44.9%
Principal and three short observations by peer observers | .67 | 44.9%
Two principal observations and two peer observations | .66 | 43.6%
Two principal observations and two different peer observers | .69 | 47.6%
Two principal observations, one peer observation, and three short observations by peers | .72 | 51.8%

Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
16. Why should we care about goal
setting in education?
Because we want students
to learn more!
• Research view
–Setting goals improves performance
17. The testing to teacher evaluation process
Testing → Metric (Growth Score) → Analysis (Value-Added) → Evaluation (Rating)
19. One district’s change in 5th grade math performance relative to Kentucky cut scores
[Chart: number of students moving up, down, or showing no change relative to the cut scores, by fall RIT score]
20. Number of students who achieved the normal mathematics growth in that district
[Chart: number of students who met vs. failed the growth target, by the student’s score in fall]
21. Issues in the use of growth measures
Measurement design of the
instrument
Many assessments are not
designed to measure growth.
Others do not measure growth
equally well for all students.
22. Tests are not equally accurate for all students
[Charts: measurement accuracy across the score range for the California STAR test and NWEA MAP]
23. Issues with rubric based instruments
• Rubrics should be granular enough to show
growth. Four point rubrics may not be
adequate.
• High and low scores should be written in a
manner that sets a reasonable floor and
ceiling.
25. Inconsistency occurs because of
• Differences in test design.
• Differences in testing conditions.
• Differences in the models being applied to evaluate growth.
26. The reliability problem – inconsistency in testing conditions
[Diagram: a test-retest design in which Test 1 and Test 2 are each administered at Time 1 and again at Time 2]
27. The reliability problem – inconsistency in testing conditions
[Diagram: the same test-retest design repeated across multiple groups, each taking Test 1 and Test 2 at Time 1 and again at Time 2]
28. The problem with spring-spring testing
[Timeline: 3/11 through 3/12. A spring-to-spring growth window spans the last months of Teacher 1’s year (3/11–6/11), the summer, and most of Teacher 2’s year (9/11–3/12)]
29. The testing to teacher evaluation process
Testing → Metric (Growth Score) → Analysis (Value-Added) → Evaluation (Rating)
30. Issues in the use of growth and value-added measures
Differences among value-added models
• Los Angeles Times Study
• Los Angeles Times Study #2
31. Issues in the use of value-added measures
Control for statistical error
All models attempt to address this issue. Nevertheless, many teachers’ value-added scores will fall within the range of statistical error.
32. Issues in the use of growth and value-added measures
Control for statistical error
• New York City
• New York City #2
33. Range of teacher value-added estimates
[Chart: Mathematics Growth Index Distribution by Teacher – Validity Filtered. Each line represents a single teacher: the average growth index score (green line) plus or minus the standard error of the growth index estimate (black line), ordered into quintiles Q1–Q5. Students with tests of questionable validity and teachers with fewer than 20 students were removed. The vertical axis (average growth index score and range) runs from -12.00 to 12.00.]
34. What’s a SMART goal?
• Specific
• Measurable
• Attainable
• Relevant
• Time-Bound
35. Specific
• What: What do I want to accomplish?
• Why: What are the reasons or purpose for
pursuing the goal?
• Which: What are the requirements and
constraints for achieving the goal?
36. SMART goal resources
• National Staff Development Council. Provides
a nice process for developing SMART goals.
• Arlington Public Schools. Excellent and
detailed examples of SMART goals across
subject disciplines, including art and music.
• The Handbook for Smart School Teams. Anne
Conzemius and Jan O’Neill.
37. Issues with local tests and goal
setting
• Validity and reliability of assessments.
• Teachers/administrators are unlikely to set
goals that are inconsistent with their current
performance.
• It is difficult to set goals without prior
evidence or context.
40. Data should be triangulated
• Classroom assessment data to standardized
test data.
• Domain data (mathematics) to sub-domain
data (fractions and decimals) to granular data
(division with fractions).
42. Measurable
Types of Goals
• Performance – 75% of the students in my 7th grade
mathematics class will achieve the qualifying score
needed for placement in 8th grade Algebra.
• Growth – 65% of my students will show growth on the
OAKS mathematics test that is greater than the state
reported norm.
• Improvement – Last year 40% of my students showed
growth on the OAKS mathematics test that was greater
than the norm. This year 50% of my students will show
greater than normal growth.
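Each of these goal types reduces to a simple count over student results. A sketch in Python with invented class data; the qualifying score and growth norm below are illustrative placeholders, not actual OAKS values:

```python
# Hypothetical class data: each student's score and growth this year.
# QUALIFYING_SCORE and GROWTH_NORM are assumed placeholder thresholds.
scores = [238, 241, 229, 245, 252, 233, 240, 247, 236, 244]
growth = [8, 11, 5, 13, 9, 4, 12, 10, 7, 14]

QUALIFYING_SCORE = 240   # placement cut for 8th grade Algebra (assumed)
GROWTH_NORM = 9          # state-reported norm growth (assumed)

# Performance goal: share of students at or above the qualifying score.
performance = sum(s >= QUALIFYING_SCORE for s in scores) / len(scores)

# Growth goal: share of students exceeding the norm growth.
growth_rate = sum(g > GROWTH_NORM for g in growth) / len(growth)

print(f"Performance: {performance:.0%} met the qualifying score")
print(f"Growth: {growth_rate:.0%} exceeded the norm")
```

An improvement goal is the same growth calculation compared against last year's rate rather than a fixed target.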
43. Attainable
The goals set should be reasonable and
rigorous. At minimum, they should
represent a level of performance that a
competent educator could be expected
to achieve.
44. An analogy to baseball
[Chart: WAR (Wins Above Replacement) for center fielders, ranging from about 6.4 for a superstar (Mike Trout, Los Angeles Angels) through 1.7 for a median major leaguer (Gregor Blanco, San Francisco Giants) and 0.0 for a marginal major leaguer (Chris Young, Oakland A’s), down to -1.3]
45. The difference between aspirational and
evaluative goals
Aspirational – I will meet my target weight by losing
50 pounds during the next year and sustain that
weight for one year.
Proficient – I intend to lose 15 pounds in the next six
months, which will move me from the “obese” to the
“overweight” category, and sustain that weight for one
year.
Marginal – I will lose weight in the next six months.
46. Ways to evaluate the attainability of a goal
• Prior performance
• Performance of peers within the system
• Performance of a norming group
47. One approach to evaluating the attainment
of goals.
Students in La Brea Elementary School show
mathematics growth equivalent to only 2/3 of the
average for students in their grade.
Level 4 – (Aspirational) – Students in La Brea Elementary School will
improve their mathematics growth equivalent to 1.5 times the average
for their grade.
Level 3 – (Proficient) Students in La Brea Elementary School will
improve their mathematics growth to be equivalent to the average for
their grade.
Level 2 – (Marginal) Students in La Brea Elementary School will
improve their mathematics growth relative to last year.
Level 1 – (Unacceptable) Students in La Brea Elementary School
do not improve their mathematics growth relative to last year.
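The four-level scheme above can be expressed as a small rating function. A sketch, assuming growth is measured as a ratio of the school's growth to the grade-level average (1.0 = average; the thresholds come from the La Brea example):

```python
def rate_growth_goal(growth_ratio: float, last_year_ratio: float) -> int:
    """Map a school's growth (as a multiple of the grade-level average)
    to the four-level scale from the La Brea example.

    Level 4 (aspirational): 1.5x the grade-level average or better.
    Level 3 (proficient):   at least the grade-level average.
    Level 2 (marginal):     improved relative to last year.
    Level 1 (unacceptable): no improvement relative to last year.
    """
    if growth_ratio >= 1.5:
        return 4
    if growth_ratio >= 1.0:
        return 3
    if growth_ratio > last_year_ratio:
        return 2
    return 1

# La Brea started at roughly 2/3 of average growth (ratio ~0.67).
print(rate_growth_goal(0.8, 0.67))   # improved, but still below average
```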
48. Is this goal attainable?
62% of students at John Glenn Elementary met or exceeded proficiency in Reading/Literature last year. Their goal is to improve their rate to 82% this year. Is the goal attainable?
[Chart: Oregon schools’ change in Reading/Literature proficiency from 2009-10 to 2010-11 among schools that started with 60% proficiency rates. Number of schools with growth above each threshold: > -30%: 362; > -20%: 351; > -10%: 291; > 0%: 173; > 10%: 73; > 20%: 14; > 30%: 3]
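With the Oregon counts above, the attainability question becomes a base-rate calculation: how many comparable schools actually achieved a gain of 20 points or more? A sketch, reading the counts as cumulative (an assumption based on the chart):

```python
# Cumulative counts of Oregon schools (starting near 60% proficiency)
# whose year-over-year change exceeded each threshold, from the chart.
schools_above = {-30: 362, -20: 351, -10: 291, 0: 173, 10: 73, 20: 14, 30: 3}

total = schools_above[-30]        # essentially all schools in the group
achieved_20 = schools_above[20]   # gained more than 20 points

print(f"{achieved_20} of {total} schools "
      f"({achieved_20 / total:.1%}) gained more than 20 points")
```

Under that reading, only about 4% of comparable schools managed a 20-point gain, which suggests the 62%-to-82% goal is aspirational rather than evaluative.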
49. Is this goal attainable and rigorous?
45% of the students at La Brea Elementary showed average growth or better last year. Their goal is to improve that rate to 50% this year. Is their goal reasonable?
[Chart: percent of students with average or better annual growth (0–100%) across schools in the Repus school district]
50. The selection of metrics matters
Students at LaBrea Elementary School
will show growth equivalent to 150% of
grade level.
Students at Etsaw Middle School will
show growth equivalent to 150% of grade
level.
52. Percent of a Year’s Growth
[Chart: percent of a year’s growth in mathematics (0–200%) by grade, grades 2 through 9]
53. Assessing the difficult to measure
• Encourage use of performance assessment and rubrics.
• Encourage outside scoring
– Use of peers in other buildings, professionals in the field,
contest judges
• Make use of resources
– Music educator, art educator, vocational professional
associations
– Available models – AP art portfolio.
– Use your intermediate agency
– Work across buildings
• Make use of classroom observation.
54. The outcome is important, but why the
outcome occurred is equally important!
Success can’t be replicated if you don’t know why you
succeeded.
Failure can’t be reversed if you don’t know why you failed.
55. The outcome is important, but why the
outcome occurred is equally important!
• Establish checkpoints and check-in beyond
what’s required when possible.
• Collect implementation information
– Classroom observations - visits
– Teacher journal
– Student work – artifacts
These processes can be done by teacher peers!
56. Thank you for attending
Presenter - John Cronin, Ph.D.
Contacting us:
NWEA Main Number: 503-624-1951
E-mail: rebecca.moore@nwea.org