Test Equating Using IRT
Presented by: Muhammad Munsif
Presented to: Dr. Nasir Mehmood
Course: Test Theories and Designs
M.Phil. Education (Evening 2019-2021)
munsifsail@gmail.com
TEST EQUATING
Equating is part of item analysis.
It is used for large-scale tests: standardized or criterion-referenced
tests such as the SAT, IELTS, and GAT.
TEST EQUATING
Equating is a statistical process in which scores from
different test forms are adjusted so that they can be used
interchangeably.
“Equating is a statistical process to adjust scores on
different test forms to make them comparable” (Kim &
Hanson, 2002).
TEST EQUATING
Each year a separate test is used to measure the abilities of
students. If the performance of the incoming students in a given
year is better than the performance of the students in the
preceding year, one can explain the difference in two ways.
1. Either the incoming students are more proficient or
2. The test they took was easier.
TEST EQUATING
In order to exclude the second explanation and make sure that the
observed trends are a valid reflection of changes in students’
abilities over time, the assessment mechanism needs to equate
scores from different test forms and adjust for variations in form
difficulty.
Test equating is a procedure “to provide comparable
scores on multiple forms of the same test, consequently
avoiding some of the possible inequities that could occur if
one examinee took a more difficult form of a test than that
taken by another examinee”
Conditions/Properties for Test Equating
Dorans (1990) describes four conditions necessary for
equating to be considered successful.
1. The test forms that are being equated must
measure the same construct and must be built to the
same content specifications.
2. The equating must satisfy equity: it should be a matter of
indifference to an examinee which form is taken.
3. The equating transformation, or adjustment to scores,
must be symmetric: converting from Form X to Form Y must be
the inverse of converting from Form Y to Form X.
4. The equating must satisfy group invariance: the
transformation used to equate test scores should work
equally well in all subpopulations of interest.
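The symmetry condition can be checked numerically. The sketch below uses linear equating (matching means and standard deviations), which is a standard method but is not one discussed in these slides; the function name and the score samples are hypothetical and serve only to illustrate the property.

```python
from statistics import mean, pstdev

def linear_equate(score, scores_from, scores_to):
    """Map a score from one form's scale to another by matching mean and SD."""
    mu_from, mu_to = mean(scores_from), mean(scores_to)
    sd_from, sd_to = pstdev(scores_from), pstdev(scores_to)
    return mu_to + (sd_to / sd_from) * (score - mu_from)

# Hypothetical raw scores on two forms (illustrative data only).
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

y_equiv = linear_equate(16, form_x, form_y)      # Form X -> Form Y
x_back = linear_equate(y_equiv, form_y, form_x)  # Form Y -> Form X

# Symmetry: converting to Form Y and back recovers the original score.
print(round(y_equiv, 6), round(x_back, 6))
```

A regression of Y on X would not pass this check, because the regression of X on Y is not its inverse; that is one reason regression is not treated as equating.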
TEST EQUATING USING IRT
IRT models were developed in the context of dichotomously
scored (0, 1) data, e.g., MCQs or other items scored as
right/wrong.
Later, IRT models were expanded to accommodate
polytomously scored items, including Likert-type items,
performance ratings, and essay questions.
The mathematical function underlying IRT models
relates an examinee’s probability of a correct response
to the underlying level of the construct, commonly
termed theta (θ) in the IRT literature.
Although theta is often referred to as “ability,” it could
represent any construct capable of being modeled in
terms of response probabilities.
IRT MODELS
IRT models are not limited to educational applications.
They can be used with any type of latent construct.
Different IRT models are distinguished by the parameters
used to model the relation between theta and the response
probability.
IRT MODELS
The one-parameter model includes only a difficulty parameter, usually
symbolized as b.
The two-parameter model adds a discrimination parameter, a.
The three-parameter model adds a pseudo-guessing parameter, c.
This parameter adjusts the model to allow for the possibility that an
examinee might obtain the correct answer by guessing or other
construct-irrelevant behavior. Thus, even an examinee with very low
ability could have a probability greater than zero of obtaining the
correct answer.
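The three models can be written as one logistic function in which the simpler models are special cases. The sketch below assumes the common logistic form P(θ) = c + (1 - c) / (1 + exp(-a(θ - b))) without the optional scaling constant D = 1.7 that some texts include; the function name and the example parameter values are illustrative assumptions.

```python
import math

def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the logistic IRT family.

    b: difficulty, a: discrimination, c: pseudo-guessing.
    a=1, c=0 gives a one-parameter model; c=0 gives the two-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# One-parameter model: when theta equals b, the probability is 0.5.
p_1pl = irt_probability(theta=0.0, b=0.0)

# Two-parameter model: higher discrimination makes the curve steeper.
p_2pl = irt_probability(theta=1.0, b=0.0, a=2.0)

# Three-parameter model: a very low-ability examinee still has a
# probability close to c (here 0.2) rather than close to zero.
p_3pl_low = irt_probability(theta=-6.0, b=0.0, a=1.0, c=0.2)

print(p_1pl, round(p_2pl, 3), round(p_3pl_low, 3))
```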
IRT EQUATING DESIGNS
First determine that test forms are sufficiently similar in
content and difficulty.
Select an equating design.
An equating design is essentially a design for
administering the two (or more) test forms.
Single-Group Design
Both forms are administered to the same sample.
Advantage
Score differences between the forms cannot be attributed to differences in the samples.
Disadvantages
Increased testing time
Fatigue effects
Practice effects
A safer way to get around these potential problems
Split the group of test takers in two.
In one group, the new test form is taken first and the base
form second.
In the other group, the order of the two forms is switched.
This design is known as the single-group counterbalanced
design.
Random-Groups Design
The test forms are typically spiraled by creating stacks of
tests in which the two (or more) forms are alternated (e.g., X,
Y, X, Y, X, Y).
Tests are then passed out to examinees in alternating order.
Advantage: Examinees do not have to take two test forms.
Differences in average scores on the two forms can be taken
as indications of differences in test difficulty.
This makes the equating process fairly straightforward.
Disadvantage
It requires a large number of test takers because the
group is being split in half. According to Livingston
(2014), the random-groups design can require 5 to
15 times as many examinees as the single-group
counterbalanced design to yield the same level of
equating accuracy.
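Spiraling and the resulting difficulty comparison can be sketched in a few lines. The examinee IDs and raw scores below are hypothetical, and the helper names are assumptions made for illustration; the difference in form means is read as a difference in difficulty only because alternating assignment is taken to produce equivalent groups.

```python
from itertools import cycle
from statistics import mean

def spiral_forms(examinees, forms=("X", "Y")):
    """Hand out forms in alternating order: X, Y, X, Y, ..."""
    return dict(zip(examinees, cycle(forms)))

def mean_by_form(assignment, scores, form):
    """Average raw score of the examinees who took the given form."""
    return mean(scores[e] for e, f in assignment.items() if f == form)

# Hypothetical examinees in seating order and their raw scores.
examinees = ["e1", "e2", "e3", "e4", "e5", "e6"]
scores = {"e1": 24, "e2": 20, "e3": 26, "e4": 23, "e5": 25, "e6": 20}

assignment = spiral_forms(examinees)

# With equivalent groups, a positive gap suggests Form Y was harder.
difficulty_gap = (mean_by_form(assignment, scores, "X")
                  - mean_by_form(assignment, scores, "Y"))
print(assignment, difficulty_gap)
```

With only three examinees per form the estimate is far too noisy to use, which is the sample-size disadvantage described above.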
Common-Item Nonequivalent Groups Design
In many cases, new forms of a test are developed and
administered some time after the base form of the test
was administered.
In such cases, it is clearly not possible to randomly split
those taking the two forms into groups.
For nonequivalent groups, differences in group ability are separated
from differences in form difficulty by including a set of common
items (anchor/linking items) on both forms.
These common items are included on both test forms and
are used to provide information about group differences on
the construct being measured.
Although the groups may differ in ability, they should not be
too different; otherwise the equating process will not be
able to completely adjust for differences across forms.
This is one of the drawbacks of the common-item
nonequivalent groups design.
The ability to interpret score differences in the
common-item nonequivalent groups design depends
heavily on the common-item set.
The common items should be in the same or very similar
locations on both tests because the order in which items
are presented can affect their level of difficulty.
Common items should be dispersed throughout the test,
rather than included in a separate block.
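One way the common items are used in IRT equating is to place the new form's parameters on the base form's scale. The sketch below uses the mean/sigma linking method, a standard technique that these slides do not name; it assumes each form was calibrated separately, and the anchor difficulty values and function names are hypothetical.

```python
from statistics import mean, pstdev

def mean_sigma_constants(b_anchor_new, b_anchor_base):
    """Slope A and intercept B that map the new form's scale to the base scale.

    Both lists hold difficulty estimates for the same anchor items,
    one list per separate calibration.
    """
    A = pstdev(b_anchor_base) / pstdev(b_anchor_new)
    B = mean(b_anchor_base) - A * mean(b_anchor_new)
    return A, B

def to_base_scale(theta, b, a, A, B):
    """Transform ability, difficulty, and discrimination to the base scale."""
    return A * theta + B, A * b + B, a / A

# Hypothetical anchor-item difficulties from the two calibrations.
b_new = [-1.0, 0.0, 1.0]
b_base = [-0.5, 1.0, 2.5]

A, B = mean_sigma_constants(b_new, b_base)
theta_s, b_s, a_s = to_base_scale(theta=0.4, b=-1.0, a=1.2, A=A, B=B)

# The product a * (theta - b) is unchanged, so response probabilities
# are the same before and after the transformation.
print(round(A, 6), round(B, 6), round(a_s * (theta_s - b_s), 6))
```

Because the constants depend only on the anchor items, a small or unrepresentative anchor set makes A and B unstable, which is why the design depends so heavily on the common-item set.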
Question
What is Test Equating?
Name the IRT models.
Name test equating designs.
Question
Could the following pairs of tests be equated? Why or
why not?
a. Two parallel versions of a reading comprehension test
b. Two personality tests that are based on different
theories of personality
c. Two tests of motor development, one for children up to
6 months of age and one for children aged 3 to 3 ½

