Psychometrics for Clinical Skills Assessment
Serkan Toy, PhD
Director of Evaluation and Program Development
Graduate Medical Education
Children’s Mercy Hospital – Kansas City
Outline
• What is psychometrics?
– Measurement
– Construct development
• Reliability
• Validity
• A few other issues to consider
– Checklists vs. Global ratings
Psychometrics
• Educational & Psychological
Measurement
– measurement of knowledge, abilities,
attitudes, and personality traits
– mainly concerned with the construction and
validation of measurement instruments (e.g.,
cognitive tests, surveys/questionnaires, and
personality assessments)
Measurement
• Assigning numerals to observations based on
some pre-defined criteria
• Or assigning a value to one object/observation
in relation to another
– Intelligence - IQ
– Personality - Big Five
– Academic or procedural performance
Measurement
Subjective vs. Objective
Qualitative vs. Quantitative
Measurement
Global vs. Analytic
Construct
Well-defined vs. Ill-defined
What do we already know about the construct in
question?
• Epistemological Beliefs
• Self efficacy
• Academic Performance
• Procedural Competency
Defining the Construct
• Factor Analysis
• Cluster Analytic Approach
• Cognitive/Procedural Task Analysis
– Hierarchical Task Analysis
– Delphi Technique - expert consensus (face validity)
– Angoff Method - determining passing (cut-off) scores
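The Angoff method named above can be sketched in a few lines. In its basic form, each expert judge estimates, for every item, the probability that a minimally competent (borderline) candidate would perform it correctly; the cut score is the sum over items of the judges' mean estimate. The function name `angoff_cut_score` and the judges' estimates below are hypothetical, chosen only for illustration.

```python
from statistics import mean


def angoff_cut_score(judge_estimates):
    """Return a basic Angoff cut score.

    judge_estimates: one list per judge; each entry is that judge's estimated
    probability (0 to 1) that a borderline candidate completes the item.
    """
    n_items = len(judge_estimates[0])
    if any(len(row) != n_items for row in judge_estimates):
        raise ValueError("every judge must rate every item")
    # Average across judges for each item, then sum across items.
    item_means = [mean(row[i] for row in judge_estimates) for i in range(n_items)]
    return sum(item_means)


# Invented estimates: 3 judges, 4 checklist items.
estimates = [
    [0.9, 0.6, 0.5, 0.8],
    [0.8, 0.7, 0.4, 0.9],
    [1.0, 0.5, 0.6, 0.7],
]
cut = angoff_cut_score(estimates)
print(f"Cut score: {cut:.2f} out of {len(estimates[0])} points")
```

In practice judges usually discuss and revise their estimates over more than one round; this sketch covers only the final arithmetic.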
Key Concepts in Assessment
• Reliability
• Validity
Reliability
“The more consistent the scores are over
different raters and occasions, the more reliable
the assessment is thought to be” (Moskal & Leydens,
2000 as cited in Jonsson & Svingby, 2007)
• Test-retest
• Inter-rater
• Binary vs. Likert scale
• Rubrics and calibration process
Jonsson, A. & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational
consequences. Educational Research Review, 2, 130-144.
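For binary checklist items scored by two raters, one common way to quantify inter-rater agreement is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not name a specific statistic at this point, so the choice of kappa is an assumption, and the ratings below are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items with categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same code.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Invented data: 10 checklist items, 1 = performed, 0 = not performed.
rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 8 of 10 items, but kappa is noticeably lower than 0.80 because both raters mark most items as performed, so much of that agreement could arise by chance.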
Rubrics for Assessment
• “An assessment tool that describes levels of
performance on a particular task” (Jonsson & Svingby,
2007)
• Analytic, topic-specific rubrics seem to enhance reliable
scoring of performance especially when accompanied by
examples and/or rater training
• Example:
– Objective assessment of surgical competence in
gynaecological laparoscopy: development and validation of
a procedure-specific rating scale (Larsen et al., 2008)
Larsen C.R., Soerensen J.L., Grantcharov T.P., Dalsgaard T, Schouenborg L., Ottosen C.,
Schroeder T.V., Ottesen B.S. (2008) Effect of virtual reality training on laparoscopic surgery:
randomised controlled trial. BMJ, 14, 338, b1802.
Example rating scale (Larsen et al. 2008)
Scored 1 to 5; descriptors anchor scores 1, 3, and 5
• Economy of movements
– 1: Many unnecessary movements
– 3: Efficient motion but some unnecessary motion
– 5: Maximum economy of movements
• Economy of time
– 1: Too long time used to perform sufficiently
– 3: Intermediate time used to perform sufficiently
– 5: Minimal time used to perform sufficiently
• Errors: respect for tissue
– … … …
• Flow of operation
– … … …
Validity, Traditionally
• The three C’s of the “Trinitarian” view
– Content
– Criterion
– Construct
Conceptual Change in Validity
“Validity is not a property of the test or
assessment as such, but rather of the
meaning of the test scores. These scores
are a function not only of the items or
stimulus conditions, but also of the persons
responding as well as the context of the
assessment. In particular, what needs to be
valid is the meaning or interpretation of the
score; as well as any implications for action
that this meaning entails” (p. 741).
Messick, S. (1995) Validity of psychological assessment: validation of inferences from
persons’ responses and performance as scientific inquiry into score meaning. American
Psychologist, 50, 741–9.
Validity
• Construct Validity
– Content / Face
– Convergent
– Discriminant
– Predictive
The question is: What can we validly conclude
about a trainee who receives a score of “X”
versus one who receives a score of “Y”?
Construct Validity
• In competency assessment instruments, validity
is usually established by examining whether
they distinguish between groups logically
presumed to differ on the competency being
measured
– Experienced practitioners vs. trainees or
– Peer nominated superior performers vs. average
performers
Scofield, M. E., & Yoxtheimer, L. Y. (1983). Psychometric issues in the
assessment of clinical competencies. Journal of Counseling Psychology.
30, 413-420.
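The known-groups comparison described above can be illustrated with a two-sample test on scores from experienced practitioners and trainees. The sketch below computes Welch's t statistic (which does not assume equal variances) and Cohen's d using only the standard library; the choice of Welch's test is an assumption, the scores are invented, and the p-value would still have to be read from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom for two independent groups."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented checklist totals for two groups presumed to differ in competency.
experienced = [24, 26, 23, 27, 25]
trainees = [18, 21, 19, 22, 20]
t_stat, df = welch_t(experienced, trainees)
d = cohens_d(experienced, trainees)
print(f"t = {t_stat:.2f}, df = {df:.1f}, d = {d:.2f}")
```

A large, significant difference in the expected direction supports the interpretation that the scores reflect the intended competency; it does not by itself rule out that the instrument is tracking something else that also differs between the groups, such as familiarity with the simulator.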
Other Issues to Consider
• Global ratings vs. Checklist scores
• Required sample size for validity testing
• Training the trainees
• Scoring the videotaped performances
– Individual procedural vs. qualitative skills
– Team performance
• Formative and summative assessment at the
same time
– Training (intervention) or assessment
(measurement)?
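For the sample-size question raised above, a rough starting point for a two-group comparison (for example, novices vs. experienced trainees) is the normal-approximation formula n per group = 2 × ((z for 1 − α/2 + z for power) / d)², where d is the expected standardized difference. The slides do not prescribe a method, so this formula, the helper name `n_per_group`, and the default α and power are assumptions; a t-based calculation gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-sample comparison of means."""
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = std_normal.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Conventional "medium", "large", and very large standardized differences.
for d in (0.5, 0.8, 1.0):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

The calculation makes the trade-off concrete: halving the expected effect size roughly quadruples the number of trainees needed.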
Global ratings vs. Checklist scores
• High correlation between global ratings
and checklist scores
• Both seem to differentiate similarly
between more experienced trainees and
novices
Examples
• Kim J., Neilipovitz D., Cardinal P., Chiu M. (2009) A comparison of global rating scale
and checklist scores in the validation of an evaluation tool to assess performance in
the resuscitation of critically ill patients during simulated emergencies (abbreviated as
‘‘CRM simulator study IB’’). Simulation in Healthcare, 4, 6–16.
• Morgan P.J., Cleave-Hogg D., Guest C.B. (2001) A comparison of global ratings and
checklist scores from an undergraduate assessment using an anesthesia simulator.
Academic Medicine, 76, 1053–5.
A Comparison of Global Rating Scale and Checklist Scores in the
Validation of an Evaluation Tool to Assess Performance in the
Resuscitation of Critically Ill Patients During Simulated Emergencies
(Abbreviated as "CRM Simulator Study IB")
Kim, John MD, MEd, FRCPC; Neilipovitz, David MD, FRCPC; Cardinal, Pierre MD, FRCPC; Chiu, Michelle MD,
FRCPC
• 32 PGY-1 & 28 PGY-3 → 2 simulation scenarios on Crisis Resource
Management (CRM)
• Ottawa Global Rating Scale → 7-point anchored ordinal scale – 5 CRM
categories and overall performance score
• Ottawa CRM Checklist → 12 items in 5 CRM categories – max 30 points
• 3 raters → blinded to year of training, rated each videotaped
performance (no order specified for use of evaluation tools)
Kim et al. 2009 - Continued
Reliability: Inter-rater reliability → Intraclass Correlation Coefficient (ICC)
• Ottawa GRS: S1 = 0.59 & S2 = 0.61
– Subcategories showed similar ICCs except resource utilization &
communication (range 0.24 to 0.38)
• Cumulative CRM checklist: S1 = 0.63 & S2 = 0.55
– Again, subcategories showed similar ICCs except resource utilization &
communication (again range 0.24 to 0.38)
Validity:
• Content validation (face validity) → Delphi process
• Response process → Resident orientation & rater training
• Comparison of scores by PGY level → t test (ANOVA)
– Both the checklist and GRS overall & subcategory scores showed
statistically significant differences between PGY-1 and PGY-3 (more
experienced residents receiving higher scores)
– ANOVA showed similar results by each scenario as well as per rater
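To make the inter-rater ICC figures above more concrete, the sketch below computes one common variant, the two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), from a performances-by-raters table. The slides do not say which ICC form Kim et al. used, so this choice is an assumption, and the ratings are illustrative, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per rated performance, one column per rater.
    """
    n = len(ratings)     # number of performances
    k = len(ratings[0])  # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: 6 performances (rows) scored by 4 raters (columns).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(ratings)
print(f"ICC(2,1) = {icc:.2f}")
```

In this example the raters rank the performances fairly consistently but differ a lot in overall leniency, which pulls an absolute-agreement ICC down; a consistency-type ICC on the same data would be higher.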
A Comparison of Global Ratings and Checklist Scores from an
Undergraduate Assessment Using an Anesthesia Simulator
Morgan, Pamela J. MD; Cleave-Hogg, Doreen PhD; Guest, Cameron B. MD, MEd
• 140 final-year medical students → 15-minute faculty-facilitated simulation
scenario
– Conducted in the 2nd week of the 2-week anesthesia rotation
– Faculty followed a script and each session was videotaped
– Each student completed 1 of 6 scenarios (each with similar learning
objectives)
• 25-point criterion checklist for each scenario (not performed = 0;
performed = 1)
• 5-point global rating (clear failure = 1 to superior performance = 5)
• 10 faculty attended a workshop on performance protocols
– Randomly assigned to a rating pair to evaluate 25 to 34 videotaped
performances
Morgan et al. 2001 - Continued
• Correlation between checklist scores and global ratings → Pearson
r = 0.74
• Global ratings correlated more highly with technical skills and
judgment than with knowledge
– Knowledge → r = 0.24
– Technical skills → r = 0.51
– Judgment → r = 0.53
• Single-rater reliability (consistency)
– Mean ICC for checklist scores → 0.77 (range 0.58 to 0.93)
– Mean ICC for global ratings → 0.62 (range 0.40 to 0.77)
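The Pearson correlation reported above between checklist scores and global ratings can be computed directly from paired scores. The sketch below is a plain standard-library implementation; the paired scores are invented and are not the Morgan et al. data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented paired scores for six students.
checklist = [12, 15, 18, 20, 22, 17]  # checklist totals (max 25)
global_rating = [2, 3, 3, 4, 5, 3]    # 5-point global ratings
r = pearson_r(checklist, global_rating)
print(f"r = {r:.2f}")
```

Because a global rating is an ordinal 5-point scale, a rank-based coefficient such as Spearman's rho is a reasonable alternative when the scores are far from normally distributed.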
Other Issues to Consider
• Required sample size for validity testing
• Training the trainees
• Scoring the videotaped performances
– Individual procedural vs. qualitative skills
– Team performance
• Formative and summative assessment at the same time
– Training (intervention) or assessment (measurement)?
• Assessment tools to link simulated performance to actual
patient outcomes

Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 

Dernier (20)

COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 

Psychometrics for Clinical Skills Assessment

  • 1. Psychometrics for Clinical Skills Assessment Serkan Toy, PhD Director of Evaluation and Program Development Graduate Medical Education Children’s Mercy Hospital – Kansas City
  • 2. Outline • What is psychometrics? – Measurement – Construct development • Reliability • Validity • A few other issues to consider – Checklists vs. Global ratings
  • 3. Psychometrics • Educational & Psychological Measurement – measurement of knowledge, abilities, attitudes, and personality traits – mainly concerned with the construction and validation of measurement instruments (e.g., cognitive tests, surveys/questionnaires, and personality assessments)
  • 4. Measurement • Assigning numerals to observations based on some pre-defined criteria • Or assigning a value to one object/observation in relation to another – Intelligence - IQ – Personality - Big Five – Academic or procedural performance
  • 7. Construct: Well-defined vs. Ill-defined. What do we already know about the construct in question? • Epistemological Beliefs • Self-efficacy • Academic Performance • Procedural Competency
  • 8. Defining the Construct • Factor Analysis • Cluster Analytic Approach • Cognitive/Procedural Task Analysis – Hierarchical Task Analysis – Delphi Technique - expert consensus (face validity) – Angoff Method - determining passing (cut-off) scores
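As a rough illustration of how the Angoff method mentioned above arrives at a cut-off score: each judge estimates, item by item, the probability that a minimally competent examinee would perform the item correctly, and the cut score is the sum of the item-wise mean estimates. The sketch below assumes that basic (unmodified) form of the method; the function name, the three judges, and the four items are hypothetical and not taken from the slides.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return the Angoff cut score for a panel of judges.

    ratings[j][i] is judge j's estimated probability (0 to 1) that a
    minimally competent examinee performs item i correctly.
    """
    n_items = len(ratings[0])
    if any(len(judge) != n_items for judge in ratings):
        raise ValueError("every judge must rate every item")
    # Average the judges' estimates for each item, then sum across items.
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return sum(item_means)


# Hypothetical panel: 3 judges rating a 4-item checklist.
ratings = [
    [0.9, 0.6, 0.7, 0.5],
    [0.8, 0.7, 0.6, 0.4],
    [1.0, 0.5, 0.8, 0.6],
]
cut_score = angoff_cut_score(ratings)
print(f"Cut score: {cut_score:.2f} out of {len(ratings[0])} items")
```

In practice judges usually discuss and revise their estimates over more than one round; that step is left out of this sketch.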
  • 9. Key Concepts in Assessment • Reliability • Validity
  • 10. Reliability “The more consistent the scores are over different raters and occasions, the more reliable the assessment is thought to be” (Moskal & Leydens, 2000 as cited in Jonsson & Svingby, 2007) • Test-retest • Inter-rater • Binary vs. Likert scale • Rubrics and calibration process Jonsson, A. & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2, 130-144.
  • 11. Rubrics for Assessment • “An assessment tool that describes levels of performance on a particular task” (Jonsson & Svingby, 2007) • Analytic, topic-specific rubrics seem to enhance reliable scoring of performance especially when accompanied by examples and/or rater training • Example: – Objective assessment of surgical competence in gynaecological laparoscopy: development and validation of a procedure-specific rating scale Larsen et al. (2008) Jonsson, A. & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2, 130-144. Larsen C.R., Soerensen J.L., Grantcharov T.P., Dalsgaard T, Schouenborg L., Ottosen C., Schroeder T.V., Ottesen B.S. (2008) Effect of virtual reality training on laparoscopic surgery: randomised controlled trial. BMJ, 14, 338, b1802.
  • 12. Larsen et al. 2008: example rating scale (scored 1 to 5; the descriptors for each item are listed from the low end to the high end of the scale) • Economy of movements: Many unnecessary movements | Efficient motion but some unnecessary motion | Maximum economy of movements • Economy of time: Too long time used to perform sufficiently | Intermediate time used to perform sufficiently | Minimal time used to perform sufficiently • Errors: respect for tissue: … | … | … • Flow of operation: … | … | …
  • 13. Validity, Traditionally • The three C’s of the “Trinitarian” view – Content – Criterion – Construct
  • 14. Conceptual Change in Validity “Validity is not a property of the test or assessment as such, but rather of the meaning of the test scores. These scores are a function not only of the items or stimulus conditions, but also of the persons responding as well as the context of the assessment. In particular, what needs to be valid is the meaning or interpretation of the score; as well as any implications for action that this meaning entails” (p. 741). Messick, S. (1995) Validity of psychological assessment: validation of inferences from persons’ responses and performance as scientific inquiry into score meaning. American Psychologist, 50, 741–9.
  • 15. Validity • Construct Validity – Content / Face – Convergent – Discriminant – Predictive The question is: What can we validly conclude about a trainee who receives a score of “X” vs. that of receiving “Y”?
  • 16. Construct Validity • In competency assessment instruments, validity is usually established by examining whether they distinguish between groups logically presumed to differ on the competency being measured – Experienced practitioners vs. trainees or – Peer-nominated superior performers vs. average performers Scofield, M. E., & Yoxtheimer, L. Y. (1983). Psychometric issues in the assessment of clinical competencies. Journal of Counseling Psychology, 30, 413-420.
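The known-groups comparison described above can be sketched as a two-group test on instrument scores. The example below computes Welch's t statistic and Cohen's d (pooled standard deviation) using only the standard library; the scores, group sizes, and function names are hypothetical, and the choice of Welch's t rather than another two-group test is an assumption made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups (unequal variances allowed)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical checklist totals for two groups presumed to differ in competency.
experienced = [24, 26, 23, 27, 25]
trainees = [18, 20, 17, 21, 19]

t_stat = welch_t(experienced, trainees)
effect = cohens_d(experienced, trainees)
print(f"Welch t = {t_stat:.2f}, Cohen's d = {effect:.2f}")
```

A large positive t and d here would be consistent with the instrument separating the two groups; a p-value would additionally require the t distribution, which is outside this sketch.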
  • 17. Other Issues to Consider • Global ratings vs. Checklist scores • Required sample size for validity testing • Training the trainees • Scoring the videotaped performances – Individual procedural vs. qualitative skills – Team performance • Formative and summative assessment at the same time – Training (intervention) or assessment (measurement)?
  • 18. Global ratings vs. Checklist scores • High correlation between global ratings and checklist scores • Both seem to differentiate similarly between more experienced trainees and novices Examples • Kim J., Neilipovitz D., Cardinal P., Chiu M. (2009) A comparison of global rating scale and checklist scores in the validation of an evaluation tool to assess performance in the resuscitation of critically ill patients during simulated emergencies (abbreviated as ‘‘CRM simulator study IB’’). Simulation in Healthcare, 4, 6–16. • Morgan P.J., Cleave-Hogg D., Guest C.B. (2001) A comparison of global ratings and checklist scores from an undergraduate assessment using an anesthesia simulator. Academic Medicine, 76, 1053–5.
  • 19. A Comparison of Global Rating Scale and Checklist Scores in the Validation of an Evaluation Tool to Assess Performance in the Resuscitation of Critically Ill Patients During Simulated Emergencies (Abbreviated as "CRM Simulator Study IB") Kim, John MD, MEd, FRCPC; Neilipovitz, David MD, FRCPC; Cardinal, Pierre MD, FRCPC; Chiu, Michelle MD, FRCPC • 32 PGY-1 & 28 PGY-3 → 2 simulation scenarios on Crisis Resource Management (CRM) • Ottawa Global Rating Scale → 7-point anchored ordinal scale – 5 CRM categories and overall performance score • Ottawa CRM Checklist → 12 items in 5 CRM categories, max 30 points • 3 raters → blinded to year of training, rated each videotaped performance (no order specified for use of evaluation tools)
  • 20. Kim et al. 2009 - Continued Reliability: Inter-rater reliability → Intraclass Correlation Coefficient (ICC) • Ottawa GRS: S1 = 0.59 & S2 = 0.61 – Subcategories showed similar ICC except “resource utilization & communication” (range 0.24 to 0.38) • Cumulative CRM checklist: S1 = 0.63 & S2 = 0.55 – Again, subcategories showed similar ICC except “resource utilization & communication” (again range 0.24 to 0.38) Validity: • Content validation (face validity) → Delphi process • Response process → Resident orientation & rater training • Comparison of scores by PGY → t-test (ANOVA) – Both the checklist and GRS overall & subcategory scores showed statistically significant differences between PGY-1 and PGY-3 (more experienced residents receiving higher scores) – ANOVA showed similar results by each scenario as well as per rater
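To make the inter-rater ICC figures above more concrete, here is a minimal sketch of a single-rater intraclass correlation computed from a subjects-by-raters score table using two-way ANOVA mean squares. The slides do not say which ICC form the study used, so the two forms shown (absolute agreement, often written ICC(2,1), and consistency, often written ICC(3,1), in the Shrout and Fleiss notation) are assumptions, and the data are hypothetical.

```python
def icc_single_rater(scores):
    """Return (agreement, consistency) single-rater ICCs.

    scores[i][j] is the score given to subject i by rater j;
    every subject must be scored by every rater.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement penalizes systematic rater differences; consistency does not.
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    consistency = (msr - mse) / (msr + (k - 1) * mse)
    return agreement, consistency


# Hypothetical data: 4 performances, 3 raters who rank identically
# but differ by a constant offset (a lenient, a middle, and a strict rater).
scores = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
agreement, consistency = icc_single_rater(scores)
print(f"agreement ICC = {agreement:.3f}, consistency ICC = {consistency:.3f}")
```

With this toy data the raters order the performances identically, so the consistency form is at its maximum while the agreement form is lower because of the constant offsets. This is one reason the form of ICC matters when comparing reported values.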
  • 21. A Comparison of Global Ratings and Checklist Scores from an Undergraduate Assessment Using an Anesthesia Simulator Morgan, Pamela J. MD; Cleave-Hogg, Doreen PhD; Guest, Cameron B. MD, MEd • 140 final-year medical students → 15-minute faculty-facilitated simulation scenario – Conducted in 2nd week of the 2-week anesthesia rotation – Faculty followed a script and each session was videotaped – Each student completed 1 of 6 scenarios (each with similar learning objectives) • 25-point criterion checklist for each scenario (not performed = 0; performed = 1) • 5-point global rating (clear failure = 1 to superior performance = 5) • 10 faculty attended a workshop on performance protocols – Randomly assigned to a rating pair to evaluate 25 to 34 videotaped performances
  • 22. Morgan et al. 2001 - Continued • Correlation between checklist scores and global ratings → Pearson r = 0.74 • Global ratings correlated more highly with technical skills and judgment than with knowledge – Knowledge → r = 0.24 – Technical skills → r = 0.51 – Judgment → r = 0.53 • Single-rater reliability (consistency) – Mean ICC for checklist scores → 0.77 (range 0.58 to 0.93) – Mean ICC for global ratings → 0.62 (range 0.40 to 0.77)
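For readers who want to see how a checklist-versus-global-rating correlation like the one above is obtained, the following sketch computes Pearson's r from paired scores. The paired values are hypothetical and are not the study's data; the function name is illustrative. As a point of arithmetic, an r of 0.74 corresponds to r² ≈ 0.55, that is, roughly 55% shared variance between the two scoring methods.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 pairs)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores for six students:
# checklist totals (0 to 25) and global ratings (1 to 5).
checklist_scores = [12, 15, 18, 20, 22, 25]
global_ratings = [2, 2, 3, 4, 4, 5]

r = pearson_r(checklist_scores, global_ratings)
print(f"r = {r:.2f}, shared variance r^2 = {r * r:.2f}")
```

Because global ratings are ordinal, a rank-based coefficient such as Spearman's rho is sometimes preferred; the slides report Pearson's r, so that is what the sketch shows.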
  • 23. Other Issues to Consider • Required sample size for validity testing • Training the trainees • Scoring the videotaped performances – Individual procedural vs. qualitative skills – Team performance • Formative and summative assessment at the same time – Training (intervention) or assessment (measurement)? • Assessment tools to link simulated performance to actual patient outcomes