Item Analysis:
Classical and Beyond
SCROLLA Symposium
Measurement Theory and Item Analysis
Heriot-Watt University
12th February 2003.
Why is item analysis relevant?
Item analysis provides a way of measuring
the quality of questions: seeing how
appropriate they were for the candidates
and how well they measured their ability.
It also provides a way of re-using items
in different tests with prior knowledge of
how they will perform.
What kinds of item analysis
are there?
Item Analysis
• Classical
• Latent Trait Models
  - Rasch
  - Item Response Theory (IRT1, IRT2, IRT3, IRT4)
Classical Analysis
Classical analysis is the easiest and most widely used
form of analysis. The statistics can be computed by
generic statistical packages (or at a push by hand)
and need no specialist software.
Analysis is performed on the test as a whole rather
than on the item, and although item statistics can be
generated, they apply only to that group of students
on that collection of items.
Classical Analysis
Assumptions
Classical test analysis assumes that any
test score comprises a “true” value plus
random error. Crucially, it assumes that
this error is normally distributed,
uncorrelated with the true score, and
zero-mean.
x_obs = x_true + e,   where e ~ N(0, σ_err)
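To make these assumptions concrete, here is a minimal simulation sketch in Python (NumPy assumed; all values are invented for illustration). Observed scores are generated as true scores plus normal error, and the error can be checked to be roughly zero-mean and uncorrelated with the true scores.

```python
import numpy as np

rng = np.random.default_rng(42)

n_candidates = 1000
true_scores = rng.normal(loc=50, scale=10, size=n_candidates)  # hypothetical "true" values
error = rng.normal(loc=0, scale=5, size=n_candidates)          # e ~ N(0, sigma_err)
observed = true_scores + error                                 # x_obs = x_true + e

print(round(error.mean(), 3))                           # close to 0
print(round(np.corrcoef(true_scores, error)[0, 1], 3))  # close to 0 (uncorrelated)
```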
Classical Analysis Statistics
• Difficulty (item level statistic)
• Discrimination (item level statistic)
• Reliability (test level statistic)
Classical Analysis
Difficulty
The difficulty of a (1 mark) question in
classical analysis is simply the proportion
of people who answered the question
incorrectly. For multiple mark questions,
it is the average mark expressed as a
proportion.
Difficulty is given on a scale of 0-1: the higher the
proportion, the greater the difficulty.
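As a minimal sketch of the calculation, assuming a NumPy matrix with one row per candidate and one column per item (the scores and maximum marks below are invented):

```python
import numpy as np

# Rows = candidates, columns = items; dichotomous items are scored 0/1,
# multi-mark items hold the raw mark awarded.
scores = np.array([
    [1, 0, 2],
    [1, 1, 3],
    [0, 0, 1],
    [1, 1, 3],
])
max_marks = np.array([1, 1, 3])   # maximum available mark per item

# Average mark as a proportion of the maximum, then invert so that
# a higher value means a harder item (as defined above).
facility = scores.mean(axis=0) / max_marks
difficulty = 1 - facility
print(difficulty)
```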
Classical Analysis
Discrimination
The discrimination of an item is the
(Pearson) correlation between candidates’
marks on the item and their total test
marks.
Being a correlation it can vary from –1 to
+1 with higher values indicating
(desirable) high discrimination.
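A minimal sketch of this item-total correlation, using the same invented candidates-by-items layout as above:

```python
import numpy as np

scores = np.array([
    [1, 0, 2],
    [1, 1, 3],
    [0, 0, 1],
    [1, 1, 3],
])
totals = scores.sum(axis=1)   # each candidate's total test mark

# Pearson correlation between each item's marks and the total marks.
discrimination = np.array([
    np.corrcoef(scores[:, i], totals)[0, 1] for i in range(scores.shape[1])
])
print(discrimination)
```

A common refinement (not shown on the slide) is the corrected item-total correlation, which removes the item's own mark from the total before correlating, so that the item is not partly correlated with itself.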
Classical Analysis
Reliability
Reliability is a measure of how well the test “holds
together”. For practical reasons, internal
consistency estimates are the easiest to obtain;
these indicate the extent to which the items
correlate with one another.
This is measured on a scale of 0-1: the closer to 1,
the higher the reliability.
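The slides do not name a particular coefficient; one widely used internal-consistency estimate is Cronbach's alpha. A minimal sketch, assuming the same candidates-by-items score matrix layout as before (invented data):

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a candidates-by-items score matrix."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

scores = np.array([
    [1, 0, 2],
    [1, 1, 3],
    [0, 0, 1],
    [1, 1, 3],
])
print(round(cronbach_alpha(scores), 2))
```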
Classical Analysis
vs
Latent Trait Models
• Classical analysis has the test (not the item) as its
basis. Although the statistics generated are often
generalised to similar students taking a similar test,
they only really apply to those students taking that
test.
• Latent trait models aim to look beyond that, at the
underlying traits which produce the test
performance. They are measured at item level and
provide sample-free measurement.
Latent Trait Models
• Latent trait models have been around since the
1940s, but were not widely used until the 1960s.
Although theoretically possible, it is practically
infeasible to use them without specialist software.
• They aim to measure the underlying ability (or trait)
which produces the test performance, rather than
measuring performance per se.
• This leads to them being sample-free: because the
statistics are not dependent on the test situation
which generated them, they can be used more
flexibly.
Rasch
vs
Item Response Theory
Mathematically, Rasch is identical to the most basic
IRT model (IRT1); however, there are some
important differences which make it a more viable
proposition for practical testing:
• In Rasch the model is superior: data which does
not fit the model is discarded.
• Rasch does not permit abilities to be estimated for
extreme items and persons (e.g. candidates scoring
zero or full marks).
• Rasch eschews the use of Bayesian priors to assist
parameter estimation.
IRT - the generalised model
P_g(θ) = c_g + (1 − c_g) · e^(D·a_g·(θ − b_g)) / (1 + e^(D·a_g·(θ − b_g)))
Where
a_g = the gradient of the ICC at the point θ = b_g
(item discrimination)
b_g = the ability level at which the ICC gradient is greatest
(item difficulty)
c_g = the probability of a candidate with very low ability
answering correctly (guessing)
IRT - Item Characteristic Curves
• An ICC is a plot of the probability of a correct
response against candidate ability. The higher
the ability, the higher the chance that the
candidate will respond correctly.
[Figure: example ICC annotated with
c - intercept (lower asymptote);
a - gradient;
b - ability at the point of maximum gradient]
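A sketch of how such a curve might be drawn (matplotlib assumed available; the item parameters are the same invented values as above):

```python
import numpy as np
import matplotlib.pyplot as plt

theta = np.linspace(-4, 4, 200)            # ability range
a, b, c, D = 1.0, 0.0, 0.25, 1.7           # discrimination, difficulty, guessing, scaling
z = D * a * (theta - b)
p = c + (1 - c) * np.exp(z) / (1 + np.exp(z))

plt.plot(theta, p)
plt.xlabel("Ability (theta)")
plt.ylabel("P(correct)")
plt.ylim(0, 1)
plt.title("Item characteristic curve (3PL)")
plt.show()
```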
IRT - About the Parameters
Difficulty
• Although there is no “correct” difficulty for
any one item, it is clearly desirable that
the difficulty of the test is centred around
the average ability of the candidates.
• The higher the “b” parameter, the more
difficult the question - note that b is
inversely related to the probability
of the question being answered correctly.
IRT - About the Parameters
Discrimination
• In IRT (unlike Rasch) maximal
discrimination is sought. Thus the higher
the “a” parameter the more desirable the
question.
• Note however that differences in the
discrimination of questions can lead to the
relative difficulty of questions changing
across the ability range (their ICCs cross).
IRT - About the Parameters
Guessing
• A high “c” parameter suggests that
candidates with very little ability may
choose the correct answer.
• This is rarely a valid parameter outwith
multiple-choice testing, and the value
should not vary excessively from the
reciprocal of the number of choices.
IRT - Parameter Estimation
• Before being used (in an item bank or for
measurement) items must first be calibrated - that
is, their parameters must be estimated.
• There are two main procedures - Joint Maximum
Likelihood (JML) and Marginal Maximum Likelihood (MML).
JML is most common for IRT1 and IRT2, while MML is
used more frequently for IRT3.
• Bayesian estimation and estimated bounds may be
imposed on the data to avoid one parameter
degrading, or high-discrimination items being
over-valued.
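As a toy sketch of the joint maximum likelihood idea for the simplest case (Rasch/IRT1), using simulated responses and SciPy's general-purpose optimiser; real calibration would normally use dedicated software such as the packages listed below. Note how extreme candidates (zero or full marks) are removed first, echoing the Rasch point made earlier.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(0)

# Simulate 0/1 responses from a Rasch model: rows = candidates, columns = items.
true_theta = rng.normal(size=100)                      # candidate abilities
true_b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])         # item difficulties
X = (rng.random((100, 5)) < expit(true_theta[:, None] - true_b[None, :])).astype(float)

# Drop extreme candidates (zero or full marks): their abilities have no
# finite maximum-likelihood estimate.
keep = (X.sum(axis=1) > 0) & (X.sum(axis=1) < X.shape[1])
X = X[keep]
n, k = X.shape

def neg_log_lik(params):
    theta, b = params[:n], params[n:]
    b = b - b.mean()                                   # anchor: mean item difficulty = 0
    p = expit(theta[:, None] - b[None, :])
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(X * np.log(p) + (1 - X) * np.log(1 - p)).sum()

res = minimize(neg_log_lik, np.zeros(n + k), method="L-BFGS-B")
b_hat = res.x[n:] - res.x[n:].mean()
print(np.round(b_hat, 2))                              # estimated item difficulties
```

This optimises all person and item parameters together, which is the essence of JML; MML instead integrates the abilities out against an assumed population distribution before maximising over the item parameters.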
Resources - Classical Analysis
Software
• Standard statistical packages (Excel; SPSS; SAS)
• ITEMAN (available from www.assess.com)
Reading
• Matlock-Hetzel (1997) Basic Concepts in Item and Test
Analysis, available at www.ericae.net/ft/tamu/Espy.htm
Resources - IRT
Software
• BILOG (available at www.assess.com)
• Xcalibre (available at www.assess.com)
Reading
• Lord (1980) Applications of Item Response Theory
to Practical Testing Problems
• Baker (2001) The Basics of Item Response Theory,
available at http://ericae.net/irt/baker/
  8. Close your meeting with a brief summary. Thank participants for their input. Review action plans, assignments, and accountability measures. Thank the group for following the process, working together as a team, and committing to problem solving and goal achievement. If you used Meeting Minder to keep track of action items, PowerPoint will automatically add a slide with the action items to the end of your presentation. Use this slide to review the action items and get buy-in from participants. Facilitation is leadership in action. Close your presentation with a strong comment to reinforce the success of the team. Empower the group to work toward goal achievement. Apply effective facilitation skills and you will have productive, profitable meetings.