Chapter Fourteen


Reading Guide
Measurement and assessment

 Measurement: an evaluation expressed
  in quantitative (numerical) terms
 Assessment: procedures used to obtain
  information about student performance
Assessment
     Assessment includes
        many different
      possible activities,
        not just tests!




But, since tests are a large part of current assessment practices, this whole
chapter is dedicated to them.
Standardized tests

               Assessment instruments given to large
                 samples of students under uniform
                 conditions and scored according to
                 uniform procedures


For example: ACT, SAT, GRE, Praxis, and many of the achievement tests you took in
school.
Norm-referenced testing: testing in which scores are compared to
           guys named “Norm.” Just kidding. Testing in which scores are
           compared with the average performance of others.

            Norms
           An individual’s score on a standardized test is
            compared to some kind of “normal” group.
           Norming group: the representative group of
            individuals whose standardized test scores are
            compiled for the purpose of constructing
            national norms.
           National norms: scores on standardized tests
            earned by representative groups of students
            from around the nation to which an individual’s
            score is compared.

Your score on a standardized test such as the SAT or ACT was
compared to the scores of the norming group.
Criterion-referenced test
interpretations
 Testing in which scores are compared to
  a set performance standard.
 The test you took to get a driver’s license
  was criterion-referenced.
 Criterion-referenced tests measure
  mastery of concepts or skills.
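
The difference between the two interpretations can be sketched in a few lines of Python. This is only an illustration: the cutoff of 80, the norming-group scores, and the function names are made up for the example and do not come from any real test.

```python
from statistics import mean

def criterion_referenced(score, cutoff):
    """Compare a score to a fixed performance standard (pass or fail)."""
    return score >= cutoff

def norm_referenced(score, norming_scores):
    """Compare a score to the average of a norming group.

    Returns how many points the score is above (+) or below (-) that average.
    """
    return score - mean(norming_scores)

# Hypothetical numbers, chosen only for illustration.
norming_group = [60, 65, 70, 75, 80]   # the group average is 70
my_score = 78

passed = criterion_referenced(my_score, cutoff=80)   # False: 78 is below the standard
difference = norm_referenced(my_score, norming_group)  # 8 points above the group average
print(passed, difference)
```

The same score of 78 fails the criterion-referenced standard yet looks good in the norm-referenced comparison, which is why the kind of interpretation matters.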
Descriptive statistics

 Frequency distributions
 Measures of central tendency
 Measures of variability
 Normal distribution
Frequency distribution

Scores on 50 point test (this arrangement doesn’t tell you much):
48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43, 42, 42, 42, 42,
41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35

This arrangement shows you more about what happened. Remember the general
shape that is defined by the X’s. It will be important.

                                             X
                                   X    X    X
                              X    X    X    X
               X    X    X    X    X    X    X    X    X
X    X    X    X    X    X    X    X    X    X    X    X    X    X
35   36   37   38   39   40   41   42   43   44   45   46   47   48



  A frequency distribution is a distribution of test scores that shows a
  simple count of the number of people who obtained each score.
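
As a rough sketch of how such a count could be produced, the snippet below tallies the 30 scores listed above with Python's `collections.Counter`. The function name `frequency_distribution` is just an illustrative choice.

```python
from collections import Counter

# The 30 scores from the 50-point test above.
scores = [48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43, 42,
          42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35]

def frequency_distribution(values):
    """Return {score: number of people with that score}, lowest score first."""
    return dict(sorted(Counter(values).items()))

freq = frequency_distribution(scores)

# Print a sideways version of the X chart: one X per person.
for score, count in freq.items():
    print(f"{score} | {'X' * count}")
```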
Histogram
[Figure: bar graph of the score frequencies. Horizontal axis: scores, 35 to 47. Vertical axis: 0 to 5.]

The same frequency distribution can be represented by a histogram: a bar graph of
a frequency distribution.

By the way, if you were a teacher, what would you think of this group of scores?
Thinking about the scores
 Clearly no one had mastery of the material—out
  of a 50 point test, no one answered all
  questions correctly.
 So, the next question is, does this have
  something to do with the nature of the students,
  the nature of the teaching that went on, or the
  nature of the assessment? Why did so few
  students do well on this test?
 An assessment is not just information gathered
  about students; it also represents something
  about teaching and potentially something about
  the assessment procedures themselves.
Measures of Central Tendency

                Mean: average score
                Median: middle score
                Mode: most frequent score




Measures of central tendency are quantitative descriptions of how a
group performed as a whole.
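
For a concrete feel, here is a small sketch that computes all three measures for the 30 test scores from the frequency distribution, using Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, median, mode

# The 30 scores from the 50-point test used earlier.
scores = [48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43, 42,
          42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35]

average = mean(scores)         # sum of the scores divided by 30, about 41.87
middle = median(scores)        # middle of the sorted list: 42
most_frequent = mode(scores)   # the score that occurs most often: 44

print(round(average, 2), middle, most_frequent)
```

Here the three measures land close together (roughly 42, 42, and 44), which is typical when the scores pile up in a single hump.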
More about measures of central tendency
So how come we can’t just use the average?
Imagine a classroom that has one genius and 20 ordinary people. The genius’s
score will artificially raise the average of the class. Median and mode are
not as affected by a single “outlying” score.
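
The effect of one outlying score can be sketched with invented numbers. The 20 "ordinary" scores and the genius's score of 100 below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median

ordinary = [68, 69, 70, 71, 72] * 4   # 20 hypothetical "ordinary" scores
genius = 100                          # one hypothetical outlying score
whole_class = ordinary + [genius]

mean_before = mean(ordinary)          # 70
mean_after = mean(whole_class)        # 1500 / 21, about 71.43
median_before = median(ordinary)      # 70
median_after = median(whole_class)    # still 70

print(mean_before, round(mean_after, 2))
print(median_before, median_after)
```

One extra score pulls the mean up by almost a point and a half, while the median does not move at all.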
And another thing….
                               X                           X
                               X                           X
                               X     X                     X
                          X    X     X               X     X    X
               X     X    X    X     X    X     X    X     X    X
               X     X    X    X     X    X     X    X     X    X     X    X
               60    61   62   63    64   65    66   67    68   69    70   71


You might have a frequency distribution that looks like this. For example, if you
measured the heights of a group of men and women, you would have a “bump” for the
average height of the men and another “bump” for the average height of the women.
This situation would mean that you have two modes (63 and 68 inches in this case)
and your collection of numbers would be “bimodal.” Keep in mind that the “average”
for the whole group in this case would be somewhere between the men and the women.
The average wouldn’t really represent what is going on here very well.
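
A short sketch of the same point in Python: the counts below are read off the X chart above (one X per person), and `statistics.multimode` (available in Python 3.8 and later) returns every value tied for most frequent.

```python
from statistics import mean, multimode

# Height in inches -> number of people, read off the X chart above.
counts = {60: 2, 61: 2, 62: 3, 63: 6, 64: 4, 65: 2,
          66: 2, 67: 3, 68: 6, 69: 3, 70: 1, 71: 1}

# Expand the counts back into one height per person.
heights = [height for height, n in counts.items() for _ in range(n)]

modes = multimode(heights)   # [63, 68]: two modes, so the data are bimodal
average = mean(heights)      # about 65.2, in the dip between the two bumps

print(modes, round(average, 1))
```

The average of about 65 inches sits where relatively few people actually are, which is the sense in which it misrepresents this group.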
Measures of variability

 Range: the distance between the top
  and bottom score in a distribution of
  scores.
 Standard deviation: a statistical
  measure of the spread of scores
 Variability: degree of difference from
  mean.
Standard deviation




Who is the better shot? Standard deviation could tell us because it is essentially the
typical deviation from the mean. We would measure the distance between each
hole and the center of the target and then average those distances to come up with
a rough stand-in for the standard deviation (the exact calculation squares each
distance, averages the squares, and takes the square root). The archer with the
lowest standard deviation would be the better archer.
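
To make the calculation concrete, the sketch below works out the range and the standard deviation of the 30 test scores used earlier. It assumes the population form of the standard deviation (dividing by the number of scores); a sample standard deviation would divide by one less. The function names are illustrative.

```python
import math
from statistics import pstdev

# The 30 scores from the 50-point test used earlier.
scores = [48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43, 42,
          42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35]

def score_range(values):
    """Distance between the top and bottom score."""
    return max(values) - min(values)

def standard_deviation(values):
    """Population standard deviation, computed step by step."""
    avg = sum(values) / len(values)
    squared_deviations = [(v - avg) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / len(values))

spread = score_range(scores)      # 48 - 35 = 13
sd = standard_deviation(scores)   # roughly 3.1 points

# The standard library's pstdev gives the same answer.
print(spread, round(sd, 2), round(pstdev(scores), 2))
```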
The Normal Distribution

Imagine the SAT scores for the millions of people who take it. Some people do
extremely well. Some do extremely poorly. Most do average. If you graphed the SAT
scores of everyone who takes it the way we graphed scores earlier in this presentation
(with the X’s), you would get a shape like this. This is called the “bell curve” or the
“normal distribution.”




Normal distribution is a distribution of scores in which the mean, median, and mode
are equal and the scores distribute themselves symmetrically in a bell-shaped curve.
Normal distribution




There is an interesting relationship between the percent of people with a score and
standard deviation (roughly, the typical deviation from the average score). About 68%
(blue on this graph) of the test takers fall within one standard deviation of the mean.
This is true for any measure that yields a normal distribution (height of women or
height of men, IQ, ACT, SAT, number of pushups people can do, etc.).
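
The 68% figure can be checked with Python's `statistics.NormalDist` (Python 3.8 and later). This is a minimal sketch; `share_within` is an illustrative helper name, and a mean of 0 with a standard deviation of 1 is used because the result is the same for any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

one_sd = share_within(1)   # about 0.68
print(f"{one_sd:.1%} of scores fall within one standard deviation of the mean")
```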
Normal distribution




Statisticians have taken advantage of this relationship between average and standard
deviation in the development of standardized test scoring.
Interpreting Standardized Test
Results
 Raw scores
 Percentiles
 Stanines
 Grade equivalents
 Standard scores
Raw score

 The number of items an individual answered
  correctly on a standardized test or subtest.
 This is a number that has not been interpreted.
  The statistical procedures which follow allow us
  to interpret a raw score in relation to other
  people who have taken the test.
 You cannot make any interpretation of a simple
  raw score—you need to know the information
  that follows in order to interpret the score.
Percentiles
 Percentile or percentile rank: a ranking that compares an
  individual’s score with the scores of all the others who
  have taken the test.
 Your percentile rank tells you how you did in relation to
  others. A percentile rank of 80 means you did as well as or
  better than 80% of those who took the test.
 Percentile is not the same as percentage (the number of
  items answered correctly out of the total number of items).
  Percentile is a comparison of people.
 Percentile bands are ranges of percentile scores on
  standardized tests. They allow for the fact that test
  scores are an estimation rather than a perfect measure.
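
The difference between percentile and percentage can be sketched with the 30 test scores from earlier. The sketch follows the "as well as or better than" wording above; test publishers may count ties somewhat differently, so treat the exact rule as an assumption.

```python
# The 30 scores from the 50-point test used earlier.
scores = [48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43, 42,
          42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35]

def percentile_rank(score, all_scores):
    """Percent of test takers who scored at or below the given score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)

def percent_correct(score, total_items):
    """Percent of the items answered correctly."""
    return 100 * score / total_items

rank = percentile_rank(44, scores)   # 25 of the 30 scores are 44 or lower: about 83
pct = percent_correct(44, 50)        # 44 of 50 items correct: 88.0

print(round(rank, 1), pct)
```

The same raw score of 44 is 88 percent correct but sits at roughly the 83rd percentile, because the second number compares people rather than items.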
Grade equivalents

 A score that is determined by comparing
  an individual’s score on a standardized
  test to the scores of students in a
  particular age group; the first digit
  represents the grade and the second the
  month of that school year.
 A first grader who has the grade
  equivalent of 2.4 on a reading test has
  the same score as the AVERAGE 2nd
  grader in the fourth month of school.
Grade equivalents

 When a student scores above grade
 level on a test, that does not mean the
 student is ready for higher level work. It
 simply means that the student has
 mastered the work at his/her grade level.
Standard score
 A description of performance on a standardized test that uses standard
  deviation as a basic unit.
 Z-score: the number of standard deviation units from the mean (average).
 T-score: standard score that defines 50 as the mean and a standard
  deviation as 10.

Imagine various scores of this sort and then go back to the chart that shows
you the percentage of people in relation to standard deviation. That will
show you the utility of this concept.
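
As a worked illustration, both standard scores follow directly from the definitions above. The test mean of 500, standard deviation of 100, and raw score of 650 are hypothetical values chosen for easy arithmetic.

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units the raw score is from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Standard score with a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

# Hypothetical test: mean 500, standard deviation 100, raw score 650.
z = z_score(650, mean=500, sd=100)   # 1.5 standard deviations above the mean
t = t_score(z)                       # 65.0

print(z, t)
```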
More about standard scores
Stanines
 A stanine (standard nine) is a description of an
  individual’s standardized test performance that
  uses a scale ranging from 1-9 points.
 These 9 points are spread evenly across the
  range of the bell curve and are standardized in
  terms of how many standard deviations they
  cover (each of stanines 2 through 8 spans half
  a standard deviation).
 While stanines are a simple way to report
  scores and they reflect the fact that students
  who score closely in percentile ranks are really
  not much different, they also may be overly
  reductive—they reduce a whole test to a single
  digit.
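
One common way to turn a z-score into a stanine is sketched below. It assumes the usual convention that stanines have a mean of 5 and a standard deviation of 2, so that stanine 5 covers z-scores from -0.25 up to 0.25; how a score landing exactly on a boundary is assigned is a choice made here, not something taken from the slides.

```python
import math

def stanine(z):
    """Convert a z-score to a stanine (1 to 9).

    Assumes stanines have mean 5 and standard deviation 2, with
    everything beyond the ends of the scale folded into 1 or 9.
    """
    raw = math.floor(2 * z + 5.5)    # round 2z + 5 to the nearest whole number
    return max(1, min(9, raw))       # clamp to the 1-9 scale

print(stanine(0))      # 5: an exactly average score
print(stanine(1.5))    # 8
print(stanine(-3))     # 1: very low scores all share the bottom stanine
```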
Stanines
Interpreting test scores

 Reliability
 Error
 Confidence interval
 Validity
 Absence of bias
Reliability
Consistency of test results




How reliable is your bathroom scale? If it reads 100 pounds, then 75 pounds, then
105 pounds for the same object, then it’s not reliable.
Standard error of measurement
 True score: the hypothetical average of an individual’s scores if repeated
  testing under ideal conditions were possible.
 Standard error of measurement: an estimate of how far an individual’s
  observed score is likely to be from the true score. It is used to build the
  range of scores within which the true score is likely to fall (that range is
  also called a confidence interval, score band, or profile band).

This is how statisticians get around the idea that no test situation is
perfect.
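
A sketch of how such a score band might be computed. The formula used here, standard deviation times the square root of (1 minus reliability), is the standard one from classical test theory and is not given in the slides; the standard deviation of 15, reliability of 0.91, and observed score of 100 are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def score_band(observed, sem, n_sems=1):
    """Range of scores within n_sems standard errors of the observed score."""
    return (observed - n_sems * sem, observed + n_sems * sem)

# Hypothetical test: standard deviation 15, reliability coefficient 0.91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)   # about 4.5
low, high = score_band(observed=100, sem=sem)                  # about 95.5 to 104.5

print(round(sem, 2), round(low, 2), round(high, 2))
```

A band of one standard error on each side is expected to contain the true score about 68% of the time; a wider band of two standard errors raises that to roughly 95%.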
Validity

 An evaluation of the adequacy and
  appropriateness of the interpretations
  and uses of assessment results.
 Focuses on USE of test results, not the
  test itself.
 Three types: content, predictive, and
  construct
Content validity

 A test’s ability to representatively sample the
  content that is taught and measure the extent to
  which learners understand it.
 In other words, how close is the test to what
  was actually taught? That is its content validity.
   If it is not close to the curriculum, then its
  results do not measure how much students
  understood the curriculum and it is not
  appropriate to use it for that purpose.
Predictive validity

 An indicator of a test’s ability to gauge
  future performance.
 You find this by correlating the test score
  with grades. For example, the extent to
  which the SAT predicts college grades is
  .42 (which is not very high—a perfect
  correlation is 1.0).
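
For readers who want to see what "correlating" involves, here is a minimal sketch of the Pearson correlation coefficient. The six test scores and later grade point averages are invented for illustration and are not real SAT data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

# Hypothetical test scores and later grade point averages for six students.
test_scores = [400, 450, 500, 550, 600, 650]
later_gpas = [2.9, 2.4, 3.1, 2.8, 3.6, 3.0]

r = pearson_r(test_scores, later_gpas)
print(round(r, 2))   # positive, but well short of a perfect 1.0
```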
Construct validity

 An indicator of the logical connection between a
  test and what it is designed to measure.
 A reading test that has students actually read a
  passage has construct validity because of the
  logical connection between the everyday task
  (reading) and the process on the test (reading
  and responding to questions about a text).
 A reading test does not typically assess
  mathematical knowledge, and would not have
  construct validity for that purpose.
Bias

 Content
 Testing procedures
 Test use
Cultural minorities and high-stakes tests
         Tests can be harmful to people from
           cultural minorities in the US since people
           from many of these cultures tend to
           score lower than non-minority students.



Assessment bias: qualities of an assessment instrument that offend or
unfairly penalize a group of students because of the students’ gender,
economic class, race, ethnicity, etc.
Content

 Content tends to be biased toward white
  middle class students.
 For example, the 4th grade proficiency
  test writing prompt that asked students
  to write about going camping. Students
  without this kind of experience would not
  be able to write well about this topic.
Procedures

 Students’ cultures can affect their
  response to the whole procedure of
  testing.
 Time limits can be unfair to students for
  whom English is a foreign language or
  for students with learning disabilities.
Test use

 Test results can be used to discriminate
  against groups of students.
Eliminating bias

 Examine test content and analyze the
  results. For example, if most students
  get a certain answer wrong, they were
  probably not taught that concept.
 Adapt testing procedures if possible and
  teach students about the test.
 Use more than just standardized tests
  for making decisions—use alternative
  assessment data as well.
Creating bias-free tests

 Culture-fair/culture-free test: a test
  without cultural bias.
 This is very difficult to create
Purposes of standardized
testing
 Student assessment: how students in one
    classroom compare to students across the
    nation (or even around the world)
   Diagnosis: a student’s specific strengths and
    weaknesses
   Selection and placement: standardized tests
    may determine whether or not a student is
    invited to take advanced classes
   Program evaluation: how students in a
    particular school or program compare with
    students across the nation
   Accountability: how students of a particular
    teacher score on a test.
Types of Standardized Tests

 Achievement
 Diagnostic
 Intelligence
 Aptitude
Achievement

What achievement tests have you taken?

 Determining the extent to which students have mastered a content area
 Comparing the performance of students with others across the country
 Tracking student progress over time
 Determining if students have the background knowledge to begin
  instruction in particular areas
 Identifying learning problems

  Achievement tests are designed to measure and communicate how much students
  have learned in specified content areas.
Diagnostic tests

                Usually given individually
                Usually have more subtests and
                 measure knowledge of a particular area
                 in detail
                Provide information that teachers can
                 use in order to instruct the child or
                 address weaknesses


Diagnostic tests are designed to provide a detailed description of learners’ strengths
and weaknesses in specified skill areas.
Aptitude tests

 Standardized tests designed to predict
  the potential for future learning and
  measure general abilities developed
  over long periods of time.
 Different from intelligence tests because
  aptitude is just one aspect of
  intelligence.
 ACT and SAT are aptitude tests
  (measuring your aptitude for college-
  level work).
Intelligence tests

 A type of aptitude test
 Standardized tests designed to measure
  an individual’s ability to acquire
  knowledge, capacity to think and reason
  in the abstract, and ability to solve novel
  problems.
 Individually administered tests are
  typically more accurate than group-
  administered intelligence tests.
Teacher’s job

 Make sure the test content matches
  learning goals
 Prepare students for the test
 Administer tests according to instructions
 Communicate results to students and
  their caregivers so their results can be
  used for educational decision-making
Issues in Standardized Testing

 Accountability
 Testing teachers
Accountability
 Accountability: the process of requiring students to demonstrate that they
  have met specified standards and holding teachers responsible for students’
  performance.
 High stakes tests: standardized tests designed to measure the extent to
  which standards are being met (minimum competency testing).
 Adequate yearly progress: objectives for yearly improvement for all students
  and for specific groups, such as students from major ethnic and racial
  groups, students with disabilities, students from low-income families, and
  students whose English is limited.


Standards-based education: the process of focusing curricula and instruction
on pre-determined goals.
Accountability

 When you have high stakes testing
  (where the results of the test determine
  something major about a person’s life
  such as whether or not he/she
  graduates), teachers often “teach to the
  test.” Instruction becomes boring and
  students with a low tolerance for busy
  work struggle.
 Some say these tests help us to know
  which schools are effective.
Standardized testing with
alternative formats
 These are time-consuming to score but
  they address critics’ concerns that
  multiple choice formats are a limited way
  to know what a student knows.
Implications for teachers

 You will have to deal with standardized testing
  —it’s here to stay.
 You will need more content knowledge as well
  as knowledge of how children learn.
 You need to know how to interpret
  standardized test results.
 It’s possible to find a creative way to deal with
  standardized testing (see p. 544 in your book
  for an example).
New directions in assessment

 Authentic assessment: measurement of
  important abilities using procedures that
  simulate the application of these abilities
  to real-life problems.
 Constructed response formats:
  assessment procedures that require the
  student to create an answer instead of
  selecting an answer from a set of
  choices.
Praxis and new directions

         Praxis II uses constructed responses
         Praxis III is supposed to be authentic
            assessment (you actually teach a lesson
            while being observed).

These practices are not without their critics—you don’t have a lot of time to
do the constructed responses in Praxis II and the pass rate for Praxis III is
very high—perhaps the universities are doing a good job of preparing pre-
service teachers after all…
Vocabulary
 Accountability
 Achievement tests
 Adequate yearly progress
 Aptitude tests
 Assessment bias
 Authentic assessment
 Confidence interval
 Construct validity
 Constructed responses
 Culture-fair/culture-free test
 Diagnostic tests
 Frequency distribution
 Grade equivalent
 High stakes test
 Histogram
 Intelligence tests
 Mean
 Measures of central tendency
 Median
 Mode
 National norms
 Normal distribution
 Norming group
 Percentile
 Percentile bands
 Predictive validity
 Range
 Reliability
 Standard deviation
 Standard error of measurement
 Standardized tests
 Standard score
 Standards-based education
 Stanine
 True score
 T-score
 Validity
 Variability
 Z-score

mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 

Dernier (20)

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 

Standardized tests

  • 2. Measurement and assessment
      • Measurement: an evaluation expressed in quantitative (numerical) terms
      • Assessment: procedures used to obtain information about student performance
  • 3. Assessment
      Assessment includes many different possible activities, not just tests! But, since tests are a large part of current assessment practices, this whole chapter is dedicated to them.
  • 4. Standardized tests
      • Assessment instruments given to large samples of students under uniform conditions and scored according to uniform procedures
      For example: ACT, SAT, GRE, Praxis, and many of the achievement tests you took in school.
  • 5. Norms
      Norm-referenced testing: testing in which scores are compared to guys named “Norm.” Just kidding. Testing in which scores are compared with the average performance of others.
      • An individual’s score on a standardized test is compared to some kind of “normal” group.
      • Norming group: the representative group of individuals whose standardized test scores are compiled for the purpose of constructing national norms.
      • National norms: scores on standardized tests earned by representative groups of students from around the nation to which an individual’s score is compared.
      Your score on tests such as the SAT or ACT was compared to the scores of the norming group.
  • 6. Criterion-referenced test interpretations
      • Testing in which scores are compared to a set performance standard.
      • The test you took to get a driver’s license was criterion-referenced.
      • Criterion-referenced tests measure mastery of concepts or skills.
  • 7. Descriptive statistics
      • Frequency distributions
      • Measures of central tendency
      • Measures of variability
      • Normal distribution
  • 8. Frequency distribution
      Scores on 50 point test: 48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43, 42, 42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35
      This arrangement doesn’t tell you much.
      [Figure: the same scores stacked as X’s above a score axis running from 35 to 48]
      This arrangement shows you more about what happened. Remember the general shape that is defined by the X’s. It will be important.
      A frequency distribution is a distribution of test scores that shows a simple count of the number of people who obtained each score.
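The count described on this slide can be sketched in a few lines of Python. This is only an illustration: the score list is copied from the slide, while the function name `frequency_distribution` and the text plot are assumptions added here.

```python
from collections import Counter

# Scores listed on the slide (30 students, 50-point test).
scores = [48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43,
          42, 42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35]


def frequency_distribution(values):
    """Map each score to the number of people who obtained it, lowest score first."""
    return dict(sorted(Counter(values).items()))


freq = frequency_distribution(scores)

# Print one row per score, with one X per student who earned that score.
for score, count in freq.items():
    print(f"{score:>2} | {'X' * count}")
```

Printed sideways like this, the rows of X’s trace the same general shape the slide asks you to remember.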
  • 9. Histogram
      [Figure: histogram of the scores; horizontal axis shows scores from 35 to 47, vertical axis shows counts from 0 to 5]
      The same frequency distribution can be represented by a histogram: a bar graph of a frequency distribution. By the way, if you were a teacher, what would you think of this group of scores?
  • 10. Thinking about the scores
      • Clearly no one had full mastery of the material: out of a 50 point test, no one answered all questions correctly.
      • So, the next question is, does this have something to do with the nature of the students, the nature of the teaching that went on, or the nature of the assessment? Why did so few students do well on this test?
      • An assessment is not just information gathered about students; it also represents something about teaching and potentially something about the assessment procedures themselves.
  • 11. Measures of Central Tendency
      • Mean: average score
      • Median: middle score
      • Mode: most frequent score
      Measures of central tendency are quantitative descriptions of how a group performed as a whole.
  • 12. More about measures of central tendency
      So how come we can’t just use the average? Imagine a classroom that has one genius and 20 ordinary people. The genius’s score will artificially raise the average of the class. Median and mode are not as affected by a single “outlying” score.
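A small sketch of the one-genius classroom, using Python's standard `statistics` module. The scores are hypothetical values chosen for illustration; they are not from the slides.

```python
from statistics import mean, median

# Hypothetical scores for 20 ordinary students on a 100-point test.
ordinary = [58, 59, 60, 60, 61] * 4
# One hypothetical outlying score (the "genius").
genius = 100
classroom = ordinary + [genius]

print("without genius:", mean(ordinary), median(ordinary))
print("with genius:   ", round(mean(classroom), 2), median(classroom))
```

With these numbers the single outlying score pulls the mean up by almost two points, while the median stays where it was.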
  • 13. And another thing…
      [Figure: frequency distribution of heights stacked as X’s above an axis running from 60 to 71 inches, with two peaks]
      You might have a frequency distribution that looks like this. For example, if you measured the heights of a group of men and women, you would have a “bump” for the average height of men and another “bump” for the height of the women. This situation would mean that you have two modes (63 and 68 inches in this case) and your collection of numbers would be “bimodal.” Keep in mind that “average” for the whole group in this case would be somewhere between the men and the women. The average wouldn’t really represent what is going on here very well.
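To see a bimodal set in code, here is a minimal sketch using `statistics.multimode` (Python 3.8 or later). The heights are made up to produce modes at 63 and 68 inches; the exact counts in the slide's figure are not recoverable, so this data is an assumption.

```python
from statistics import mean, multimode

# Hypothetical heights in inches (not the counts from the slide's figure).
heights = [61, 62, 63, 63, 63, 64, 65, 66, 67, 68, 68, 68, 69, 70]

modes = multimode(heights)  # every value tied for most frequent
average = mean(heights)

print("modes:", modes)
print("mean: ", average)
```

The mean lands between the two bumps, at a height that few people in this made-up group actually have.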
  • 14. Measures of variability
      • Range: the distance between the top and bottom score in a distribution of scores.
      • Standard deviation: a statistical measure of the spread of scores.
      • Variability: degree of difference from the mean.
  • 15. Standard deviation
      Who is the better shot? Standard deviation could tell us because it is essentially the typical deviation from the mean (strictly, the square root of the average squared deviation). We would measure the distance between each hole and the center of the target, square those distances, average them, and take the square root to come up with the standard deviation. The archer with the lowest standard deviation would be the better archer.
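The archery comparison can be sketched as follows. This uses the population form of the standard deviation (divide by the number of values), and the arrow positions are hypothetical horizontal offsets from the center of the target, in inches.

```python
from math import sqrt


def standard_deviation(values):
    """Population standard deviation: square root of the mean squared deviation from the mean."""
    m = sum(values) / len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Hypothetical horizontal offsets from the target center, in inches.
archer_a = [-1, 0, 1, 0, -1, 1]    # tight grouping
archer_b = [-6, 4, 5, -5, 6, -4]   # scattered shots

print(round(standard_deviation(archer_a), 2))
print(round(standard_deviation(archer_b), 2))
```

Both archers average out to the center, so the mean alone cannot separate them; the smaller standard deviation identifies the tighter grouping.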
  • 16. The Normal Distribution Imagine the SAT scores for the millions of people who take it. Some people do extremely well. Some do extremely poorly. Most do average. If you graphed the SAT scores of everyone who takes it the way we graphed scores earlier in this presentation (with the X’s), you would get a shape like this. This is called the “bell curve” or the “normal distribution.” Normal distribution is a distribution of scores in which the mean, median, and mode are equal and the scores distribute themselves symmetrically in a bell-shaped curve.
  • 17. Normal distribution There is an interesting relationship between the percent of people with a score and standard deviation (roughly, the typical deviation from the average score). About 68% (blue on this graph) of the test takers fall within one standard deviation of the mean. This is true for any measure that yields a normal distribution (height of women or height of men, IQ, ACT, SAT, number of pushups people can do, etc.).
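The 68% figure can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `share_within` is an assumption made for this sketch.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

Because the calculation is done in standard deviation units, the same proportions apply to any normally distributed measure, whatever its mean and spread.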
  • 18. Normal distribution Statisticians have taken advantage of this relationship between average and standard deviation in the development of standardized test scoring.
  • 19. Interpreting Standardized Test Results
      • Raw scores
      • Percentiles
      • Stanines
      • Grade equivalents
      • Standard scores
  • 20. Raw score
      • The number of items an individual answered correctly on a standardized test or subtest.
      • This is a number that has not been interpreted. The statistical procedures which follow allow us to interpret a raw score in relation to other people who have taken the test.
      • You cannot make any interpretation of a simple raw score; you need to know the information that follows in order to interpret the score.
  • 21. Percentiles
      • Percentile or percentile rank: a ranking that compares an individual’s score with the scores of all the others who have taken the test.
      • Your percentile rank tells you how you did in relation to others. A percentile rank of 80 means you did as well as or better than 80% of those who took the test.
      • Percentile is not the same as percentage (the number of correct answers out of the total number of items). Percentile is a comparison of people.
      • Percentile bands are ranges of percentile scores on standardized tests. They allow for the fact that test scores are an estimate rather than a perfect measure.
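A short sketch of the difference between percentile rank and percentage. It follows the slide's "as well as or better than" wording, so ties count in the test taker's favor; other definitions of percentile rank exist. The function names and the sample scores are assumptions for illustration.

```python
def percentile_rank(score, all_scores):
    """Percent of test takers whose score is at or below the given score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)


def percentage_correct(raw_score, total_items):
    """Percent of items answered correctly; says nothing about other test takers."""
    return 100 * raw_score / total_items


# Hypothetical group of 30 scores on a 50-item test.
group = [48, 47, 46, 45, 45, 44, 44, 44, 44, 44, 43, 43, 43, 43,
         42, 42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 38, 38, 37, 36, 35]

print(round(percentile_rank(44, group), 1))  # comparison with people
print(percentage_correct(44, 50))            # comparison with the item count
```

The same raw score of 44 gives two different numbers because one compares the test taker with other people and the other compares the score with the number of items.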
  • 22. Grade equivalents
      • A score that is determined by comparing an individual’s score on a standardized test to the scores of students in a particular age group; the first digit represents the grade and the second the month of that school year.
      • A first grader who has the grade equivalent of 2.4 on a reading test has the same score as the AVERAGE 2nd grader in the fourth month of school.
  • 23. Grade equivalents
      • When a student scores above grade level on a test, that does not mean the student is ready for higher level work. It simply means that the student has mastered the work at his/her grade level.
  • 24. Standard score
      • A description of performance on a standardized test that uses standard deviation as a basic unit.
      • Z-score: the number of standard deviation units from the mean (average).
      • T-score: standard score that defines 50 as the mean and a standard deviation as 10.
      Imagine various scores of this sort and then go back to the chart that shows you the percentage of people in relation to standard deviation. That will show you the utility of this concept.
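The two definitions on this slide translate directly into arithmetic. The sketch below assumes a hypothetical test with a mean of 50 raw points and a standard deviation of 5; the function names are also assumptions.

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units the raw score lies from the mean."""
    return (raw - mean) / sd


def t_score(z):
    """Rescale a z-score so that the mean is 50 and one standard deviation is 10."""
    return 50 + 10 * z


# Hypothetical test: mean of 50 raw points, standard deviation of 5.
z = z_score(60, mean=50, sd=5)
print(z, t_score(z))
```

A raw score two standard deviations above the mean becomes a T-score of 70, and a score exactly at the mean becomes 50.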
  • 26. Stanines
      • A stanine (standard nine) is a description of an individual’s standardized test performance that uses a scale ranging from 1 to 9 points.
      • These 9 points are spread evenly across the range of the bell curve and are standardized in terms of how many standard deviations they cover.
      • While stanines are a simple way to report scores and they reflect the fact that students who score closely in percentile ranks are really not much different, they also may be overly reductive: they reduce a whole test to a single digit.
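One way to picture how stanines divide the bell curve is to convert a z-score to a stanine. The cut points below are the conventional ones (stanine 5 centered on the mean, each middle stanine half a standard deviation wide); they are not given on the slides, so treat them as an assumption of this sketch.

```python
from bisect import bisect_right

# Conventional stanine boundaries in z-score units (assumed, not from the slides).
# Stanines 2 through 8 each span half a standard deviation; 1 and 9 take the tails.
CUT_POINTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]


def stanine(z):
    """Convert a z-score to a stanine from 1 to 9."""
    return bisect_right(CUT_POINTS, z) + 1


for z in (-2.0, -0.5, 0.0, 1.0, 2.0):
    print(z, stanine(z))
```

Z-scores of 0.3 and 0.7 both map to stanine 6, which shows the single-digit reduction the slide warns about.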
  • 28. Interpreting test scores
      • Reliability
      • Error
      • Confidence interval
      • Validity
      • Absence of bias
  • 29. Reliability
      Consistency of test results. How reliable is your bathroom scale? If it reads 100 pounds, then 75 pounds, then 105 pounds for the same object, then it’s not reliable.
  • 30. Standard error of measurement
      • True score: the hypothetical average of an individual’s scores if repeated testing under ideal conditions were possible.
      • Standard error of measurement: the range of scores within which an individual’s true score is likely to fall (also called confidence interval, score band, or profile band).
      This is how statisticians get around the idea that no test situation is perfect.
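The slides do not say how the score band is computed. A common formula from classical test theory, used here as an assumption, is SEM = SD × √(1 − reliability), with the band drawn one SEM on either side of the observed score. The numbers in the example are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate: SD times the square root of (1 - reliability)."""
    return sd * sqrt(1 - reliability)


def score_band(observed, sem, width=1.0):
    """Range from `width` SEMs below the observed score to `width` SEMs above it."""
    return (observed - width * sem, observed + width * sem)


# Hypothetical test: standard deviation 15, reliability coefficient 0.91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = score_band(observed=100, sem=sem)
print(round(sem, 2), round(low, 1), round(high, 1))
```

A more reliable test gives a smaller SEM and therefore a narrower band around the observed score.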
  • 31. Validity
      • An evaluation of the adequacy and appropriateness of the interpretations and uses of assessment results.
      • Focuses on USE of test results, not the test itself.
      • Three types: content, predictive, and construct
  • 32. Content validity
      • A test’s ability to representatively sample the content that is taught and measure the extent to which learners understand it.
      • In other words, how close is the test to what was actually taught? That is its content validity. If it is not close to the curriculum, then its results do not measure how much students understood the curriculum and it is not appropriate to use it for that purpose.
  • 33. Predictive validity
      • An indicator of a test’s ability to gauge future performance.
      • You find this by correlating the test score with grades. For example, the correlation between SAT scores and college grades is .42 (which is not very high; a perfect correlation is 1.0).
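Correlating test scores with later grades can be sketched with a hand-written Pearson correlation. The admission scores and grade point averages below are invented for illustration and are not the data behind the .42 figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical admission test scores and first-year grade point averages.
admission_scores = [450, 520, 580, 610, 700]
first_year_gpa = [2.9, 2.4, 3.1, 3.6, 3.0]

r = pearson_r(admission_scores, first_year_gpa)
print(round(r, 2))
```

A value near 1.0 would mean the test ranks students almost exactly as their later grades do; a moderate value means the test predicts only part of the picture.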
  • 34. Construct validity
      • An indicator of the logical connection between a test and what it is designed to measure.
      • A reading test that has students actually read a passage has construct validity because of the logical connection between the everyday task (reading) and the process on the test (reading and responding to questions about a text).
      • A reading test does not typically assess mathematical knowledge, and would not have construct validity for that purpose.
  • 35. Bias
      • Content
      • Testing procedures
      • Test use
  • 36. Cultural minorities and high-stakes tests
      • Tests can be harmful to people from cultural minorities in the US since people from many of these cultures tend to score lower than non-minority students.
      Assessment bias: qualities of an assessment instrument that offend or unfairly penalize a group of students because of the students’ gender, economic class, race, ethnicity, etc.
  • 37. Content
      • Content tends to be biased toward white middle class students.
      • For example, the 4th grade proficiency test writing prompt that asked students to write about going camping. Students without this kind of experience would not be able to write well about this topic.
  • 38. Procedures
      • Students’ cultures can affect their response to the whole procedure of testing.
      • Time limits can be unfair to students for whom English is a foreign language or for students with learning disabilities.
  • 39. Test use
      • Test results can be used to discriminate against groups of students.
  • 40. Eliminating bias
      • Examine test content and analyze the results. For example, if most students get a certain answer wrong, they were probably not taught that concept.
      • Adapt testing procedures if possible and teach students about the test.
      • Use more than just standardized tests for making decisions; use alternative assessment data as well.
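Analyzing the results item by item, as the first point suggests, might look like the sketch below. The response data, the function names, and the 50% threshold for "most students" are all assumptions made for illustration.

```python
def proportion_correct(responses):
    """For each item, the fraction of students who answered it correctly.

    `responses` holds one list per student, with True for a correct answer.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[i] for student in responses) / n_students
            for i in range(n_items)]


def items_most_missed(responses, threshold=0.5):
    """Indexes of items that fewer than `threshold` of the students got right."""
    return [i for i, p in enumerate(proportion_correct(responses)) if p < threshold]


# Hypothetical results: 4 students, 3 items.
results = [
    [True, False, True],
    [True, False, True],
    [True, True, False],
    [False, False, True],
]

print(proportion_correct(results))
print(items_most_missed(results))
```

An item flagged this way is a prompt to look at what was taught and how the item is worded, not proof of either problem.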
  • 41. Creating bias-free tests
      • Culture-fair/culture-free test: a test without cultural bias.
      • This is very difficult to create.
  • 42. Purposes of standardized testing
      • Student assessment: how students in one classroom compare to students across the nation (or even around the world)
      • Diagnosis: a student’s specific strengths and weaknesses
      • Selection and placement: standardized tests may determine whether or not a student is invited to take advanced classes
      • Program evaluation: how students in a particular school or program compare with students across the nation
      • Accountability: how students of a particular teacher score on a test
  • 43. Types of Standardized Tests
      • Achievement
      • Diagnostic
      • Intelligence
      • Aptitude
  • 44. Achievement
      • Determining the extent to which students have mastered a content area
      • Comparing the performance of students with others across the country
      • Tracking student progress over time
      • Determining if students have the background knowledge to begin instruction in particular areas
      • Identifying learning problems
      Achievement tests are designed to measure and communicate how much students have learned in specified content areas.
      What achievement tests have you taken?
  • 45. Diagnostic tests
      • Usually given individually
      • Usually have more subtests and measure knowledge of a particular area in detail
      • Provide information that teachers can use in order to instruct the child or address weaknesses
      Diagnostic tests are designed to provide a detailed description of learners’ strengths and weaknesses in specified skill areas.
  • 46. Aptitude tests
      • Standardized tests designed to predict the potential for future learning and measure general abilities developed over long periods of time.
      • Different from intelligence tests because aptitude is just one aspect of intelligence.
      • ACT and SAT are aptitude tests (measuring your aptitude for college-level work).
  • 47. Intelligence tests
      • A type of aptitude test
      • Standardized tests designed to measure an individual’s ability to acquire knowledge, capacity to think and reason in the abstract, and ability to solve novel problems.
      • Individually administered tests are typically more accurate than group-administered intelligence tests.
  • 48. Teacher’s job
      • Make sure the test content matches learning goals
      • Prepare students for the test
      • Administer tests according to instructions
      • Communicate results to students and their caregivers so their results can be used for educational decision-making
  • 49. Issues in Standardized Testing
      • Accountability
      • Testing teachers
  • 50. Accountability
      • Accountability: the process of requiring students to demonstrate that they have met specified standards and holding teachers responsible for students’ performance.
      • High stakes tests: standardized tests designed to measure the extent to which standards are being met (minimum competency testing).
      Adequate yearly progress: objectives for yearly improvement for all students and for specific groups, such as students from major ethnic and racial groups, students with disabilities, students from low-income families, and students whose English is limited.
      Standards-based education: the process of focusing curricula and instruction on pre-determined goals.
  • 51. Accountability
      • When you have high stakes testing (where the results of the test determine something major about a person’s life such as whether or not he/she graduates), teachers often “teach to the test.” Instruction becomes boring and students with a low tolerance for busy work struggle.
      • Some say these tests help us to know which schools are effective.
  • 52. Standardized testing with alternative formats
      • These are time-consuming to score but they address critics’ concerns that multiple choice formats are a limited way to know what a student knows.
  • 53. Implications for teachers
      • You will have to deal with standardized testing; it’s here to stay.
      • You will need to know more content knowledge as well as about how children learn.
      • You need to know about how to interpret standardized tests.
      • It’s possible to find a creative way to deal with standardized testing (see p. 544 in your book for an example).
  • 54. New directions in assessment
      • Authentic assessment: measurement of important abilities using procedures that simulate the application of these abilities to real-life problems.
      • Constructed response formats: assessment procedures that require the student to create an answer instead of selecting an answer from a set of choices.
  • 55. Praxis and new directions
      • Praxis II uses constructed responses
      • Praxis III is supposed to be authentic assessment (you actually teach a lesson while being observed).
      These practices are not without their critics: you don’t have a lot of time to do the constructed responses in Praxis II, and the pass rate for Praxis III is very high. Perhaps the universities are doing a good job of preparing pre-service teachers after all…
  • 56. Vocabulary
      Accountability, Grade equivalent, Percentile bands, Confidence interval, Median, Standardized tests, Validity, Achievement tests, Construct validity, High stakes test, Mode, Predictive validity, Standard score, Variability, Adequate yearly progress, Constructed responses, Histogram, National norms, Range, Standards-based education, Z-score, Aptitude tests, Culture-fair/culture-free test, Intelligence tests, Normal distribution, Reliability, Stanine, Assessment bias, Diagnostic tests, Mean, Norming group, Standard deviation, True score, Authentic assessment, Frequency distribution, Measures of central tendency, Percentile, Standard error of measurement, T-score