Mmig talk jan 245 2011

The importance of construct validity in designing serious games

Brock Dubbels

Overview of Talk
• Great ideas – we want them to get better
– Why video games?
• You mean we have to show evidence?
– The Vegas Effect
• What can be done?
– Methods, validity, and return on investment
• Great ideas pt 2 – we know how to help

“More than eight in ten (83%) young people have a
video game console at home, and 56% have two or
more."
--Gen M: Media in the Lives of 8-18 Year-olds (Executive Summary, p. 36)

The market

• Actual market in 2009
– 52 Billion
– Growth actually closer to 25%
• 2014 Projection
– 86 Billion
– Projected growth of 9.4%
• Serious games projection
– 400 to 500 Million

PricewaterhouseCoopers, 2007

Get on the bandwagon?

From a consumer perspective, serious games should be judged on
how they deliver learning outcomes, not what they generate in
sales in the marketplace.

• Games, by their very nature, assess, measure, and evaluate.
– They can be very much like the tools used in psychological assessments and evaluations.

Informative Assessment
• An informative assessment stresses
meaningful, timely, and continuous feedback
about learning during learning.
• It is intended to provide the problems in contexts
where the learner can learn through trial and
error with feedback based upon the criteria for
competence.
• Research findings from over 4,000 studies indicate
that informative assessment has the most
significant impact on achievement (Wiliam, 2007).

Branching Dialog

Blue star for Constructive Conversation—
you constructively redirected.

What if we made a game about
going to Las Vegas?

The Vegas Effect
Should everything that happens in games, stay in games?
It is not enough to invoke games and play.
Serious games should provide evidence that they delivered.
This should be quantifiable in performance metrics.

Learning
• Perceptual Learning
• Conceptual Learning
• Validity

Surface (face) Validity
• Games are often built on this.
– It looks like it measures what it is supposed to
measure.
– It appears to be a good project
– Does game-like delivery cut costs?

Construct Validity
• When we claim construct validity, we are essentially claiming that our
observed pattern—how things operate in reality—corresponds with our
theoretical pattern—how we think the world works.

– There are four main ways of assessing construct validity:
• Internal validity is related to what actually happens in a study.
– Has the independent variable really had an effect on the dependent variable?
– Or was the effect on the dependent variable caused by some other confounding variable?
• Convergent and discriminant validity
• External validity refers to whether the findings of a study really can be generalized beyond the present study.
– Can we use the same measures in the game and in the work environment?
– Will the measures we use in the work environment be effectively modeled for in-game learning that transfers?
• We can break external validity down into two types:
– Population validity refers to the extent to which the findings can be generalized to other populations of people.
– Ecological validity refers to the extent to which the findings can be generalized beyond the present situation.

Pre-Design Methods
• Elicitation – do you really need a game?
• Cognitive Ethnography
• Identification of
– quality of life and cost variables
– theoretic processes, & documentation variables for
kiosk.
• Fidelity
• Assessment, measurement, and evaluation
methods.

Tension in workflow
• Software Design
– Typically based upon an economic consideration.
• How will this solve a problem?
• What are the first steps in production?
– The focus is on stages of production:
• Business Partner Relations, Function, Behavior, Structure, & Non-Function (qualities).
• Research Design
– Typically based upon answering a testable question.
• How will this solve a problem?
• How do I know this?
– The focus is on method and hypothesis testing:
• Construct validity, reliability, and probability.

Nomological Networks
• This is an attempt to provide better assurance
of construct validity. To do this, the researcher
should provide a theoretical framework for
what is being measured, an empirical
framework for how it is to be measured, and
specification of the linkage between these two
frameworks.
– (Cronbach & Meehl, 1955)

Multi Trait Multi Method

Construct    Method
ADL          Kiosk
CRB          Rubric
SDT          Inventories

Campbell and Fiske (1959)
Great explanation here:
http://luna.cas.usf.edu/~mbrannic/files/pmet/mtmm1a.htm
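
A minimal sketch of the Campbell and Fiske comparison, assuming each construct (ADL, CRB, SDT) is scored by two methods, labeled here "game" and "workplace". The method labels, the six-person sample, and every score are made up for illustration; the talk does not give data. Correlations between the same construct measured two ways are treated as convergent evidence, and correlations between different constructs as discriminant evidence.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical scores for six trainees; keys are (construct, method).
scores = {
    ("ADL", "game"):      [3, 5, 2, 6, 4, 7],
    ("ADL", "workplace"): [4, 5, 2, 7, 4, 6],
    ("CRB", "game"):      [6, 2, 4, 4, 2, 6],
    ("CRB", "workplace"): [7, 2, 4, 5, 3, 6],
    ("SDT", "game"):      [5, 6, 2, 2, 4, 5],
    ("SDT", "workplace"): [6, 6, 2, 3, 4, 5],
}

convergent, discriminant = {}, {}
for (k1, v1), (k2, v2) in combinations(scores.items(), 2):
    r = pearson(v1, v2)
    if k1[0] == k2[0]:
        # Same construct, different method: should correlate highly.
        convergent[(k1, k2)] = r
    else:
        # Different constructs: should correlate weakly.
        discriminant[(k1, k2)] = r

# One simple reading of the Campbell and Fiske criterion: every convergent
# correlation should exceed every discriminant correlation in magnitude.
passes = min(convergent.values()) > max(abs(r) for r in discriminant.values())

for pair, r in convergent.items():
    print("convergent", pair, round(r, 3))
print("criterion met:", passes)
```

With real data the same loop would run over the in-game scores and the kiosk, rubric, and inventory scores collected in the facility.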

Cognitive Ethnography Design/
Usability Perspectives for MTMM validity measure

The state of long term care
• 55% white, 35% Black, 10% Hispanic
• Most workers are economically disadvantaged
• Low levels of educational attainment.
• Physically and emotionally demanding work, but often among the lowest paid in
the service industry.
• Viewed as an unpleasant occupation: primarily a maid service taking care of
incontinent, cognitively unaware old people.
• 45% attrition rate in first 90 days. Some reports show 100% turnover per year.
• Great shortage; with shortages come reductions in quality of care.
• Expected growth rate of 85% with Baby Boomers retiring.
• Regulation tends to emphasize entry training, with limited attention to continued
career growth or development.
• Supervisors with “good people skills” and promotion of worker autonomy are the most
important predictors of higher job satisfaction and lower turnover rates.
» From “Who Will Care for Us?” US Dept. Health & Human Services

The User Story
• Functionally capable, but
not skilled.
– Soft skills
– Documentation skills
• High school education +/-1
• Like popular culture.
• A bit irreverent about the
job, but this is a coffee-break
coping mechanism
• Limited care load while
training.
• Enjoy popular culture –
soaps, drama, etc.

Hypothesis
• Will improved people skills and increased
worker autonomy reduce attrition by
improving the perceived quality of life?
– Will improved perceived quality of life (PQoL):
– Increase well-being in residents and nursing assistants?
– Reduce pain management?
– Reduce catastrophic care?
– Improve confidence and accuracy in information gathering and
reporting?

Theoretical perspective
• Improve soft skills and documentation.
• Quality of Life Measures
– How these are affected through: Presence, Constructive Conversation, Active
Listening
» These are used for game mechanics and coding dialog.

• Reducing attrition, improving Perceived Quality of Life
(PQoL), and improving documentation will reduce costs and
allow for more hires and better wages
– Reduce: pain meds, attendant care, catastrophic care.
• Documentation
– Reduce elicitation from medical staff
– Improve medical staff's objective knowledge of daily living skills

Analysis Tools
• In order to measure whether the game does what it was
designed to do:
– Analysis criteria must exist inside and outside of the game for
evaluation.
– Same underlying measures from game
• Tools from ADL, SDT, Complex Relationship Building
• Inside:
– Scoring system with weighted dialog
– Story content equated to kiosk input
– Play aloud
• Outside:
– Observational scoring tool for preceptors
– Survey for self-report
– Resident survey
– Kiosk
– Care plans
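
A minimal sketch of what a weighted dialog scoring system could look like, assuming each dialog choice is tagged with one of the three soft-skill constructs and rated against a rubric. The weights, the 0 to 3 rating scale, the rule that a top-rated choice earns a blue star, and the sample dialog lines are all hypothetical; the talk does not specify them.

```python
from dataclasses import dataclass

# Hypothetical weights for the three constructs used to code dialog.
WEIGHTS = {
    "presence": 1.0,
    "constructive_conversation": 2.0,
    "active_listening": 1.5,
}


@dataclass
class DialogChoice:
    text: str        # the line the player selected
    construct: str   # which soft skill the choice demonstrates
    quality: int     # rubric rating: 0 (poor) to 3 (optimal path)


def score_session(choices):
    """Sum weighted rubric ratings per construct."""
    totals = {name: 0.0 for name in WEIGHTS}
    for choice in choices:
        totals[choice.construct] += WEIGHTS[choice.construct] * choice.quality
    return totals


def blue_stars(choices):
    """Count choices that followed the optimal path (assumed to be rating 3)."""
    return sum(1 for choice in choices if choice.quality == 3)


# Made-up play session.
session = [
    DialogChoice("Sit down beside the resident", "presence", 2),
    DialogChoice("Redirect the complaint to a plan", "constructive_conversation", 3),
    DialogChoice("Repeat back what was said", "active_listening", 1),
    DialogChoice("Offer two choices for lunch", "constructive_conversation", 3),
]

totals = score_session(session)
stars = blue_stars(session)
print(totals, stars)
```

Because a preceptor's observational rubric outside the game can use the same construct names, per-construct totals from play and from observation can be compared directly.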

Four questions
1. Can I take any credit for any changes that
have happened in an individual's learning?
2. Does this have a connection to my
instructional activities?
3. Do these instructional activities equate to a
return on investment?
4. How do I know this?

Theory of interaction
• The central cog in Figure 6, Psychological Needs, is
modeled from Self-Determination Theory (Deci & Ryan, 2000).
• The base measure, or bottom cog, came from the Activities of
Daily Living (Roper, Logan, & Tierney, 1980; 2000) and is
hypothesized to be influenced through interpersonal relations.
• The interpersonal relations were modeled from an operationalization
of Complex Relationship Building (Bulechek, Butcher, & Dochterman,
2008).

SDT
• Self-Determination Theory
– SDT is a macro theory, and it is concerned with supporting our natural or intrinsic tendencies to behave in effective and healthy ways.
– The key game play element here was as a larger category for scoring criteria and providing accessible terms for training.
– In the work environment, these inventories are used to provide an opportunity to create external, environmental, and population validity.
• Basic Psychological Needs Scale
– General
• 21 items
• (Baard, Deci, & Ryan, 2004)
– Relationships
• 9 items
• (La Guardia, Ryan, Couchman, & Deci, 2000)
– Work
• 21 items
• (Deci, Ryan, Gagne, Leone, Usunov, & Kornazheva, 2001)

ADL
• Activities of Daily Living
– The facility had already identified 8 items for identification in their kiosk software.
– The key game play element here was modeling the facility kiosks in the game and scoring the resident interaction scenarios with how the CNAs document their observations.
– In the work environment, the kiosks are already used to collect data, and this provides an opportunity to create population validity and provide ROI analysis for care plans.
• The term “activities of daily living” refers to a set of common, everyday tasks, performance of which is required for personal self-care and independent living. The most often used measure of functional ability is the Katz Activities of Daily Living Scale (Katz et al., 1963; Katz, 1983).
– Wiener, Hanley, Clark, & Van Nostrand (1990, pg. 1)

Complex Relationship Building
• Complex Relationship Building
– Identified from Nursing Interventions Classification (NIC) (Bulechek, Butcher, & Dochterman, 2008)
– The key game play element was modeling interactions with the residents and providing an optimal path for interaction.
– Although this caregiving practice is supposed to take an hour, the CNA must choose how to apportion that hour.
– In the work environment, these activities are used to provide language and action for continuous improvement. They are part of rubrics for observation, and a further opportunity to create population validity.
• A nursing intervention from the Nursing Interventions Classification (NIC), defined as establishing a therapeutic relationship with a patient to promote insight and behavioral change.
• NIC identifies a 1 hour intervention.
• There are 31 activities identified.
– These activities represent the focus of the game design and scoring process.
– 3 operationalized processes were taught to mediate this:
• Constructive Conversation
• Presence
• Active Listening

Design Decisions
• FPH – the first person healer
• Flash / with database
– Web-host as well as installed with data upload.
• Time serves as game element
– Functional task selection vs. Interpersonal Communication
• Dialog driven
– Dialog supported by video cut scenes with voice narration
• Mini games such as room clean up
• Reward system (blue stars)
• Preceptor / Optimal Path / Debrief
• Interface with task log, resident information, clock, pause.
• Use real people’s faces as avatars
• Increase engagement through subtle but tasteful irreverence.

Outcome Analysis
• General Linear Model
• Quality of Life Variables
– Operationalized in soft skills and PQoL construct
• Presence, Active Listening, and Constructive Conversation
• Longitudinal study
– Pre, Game Play, Post
• Compare performance in: surveys, objective observer data, game
play, non-game play controls, self-report.
• Game play – construct sub-level scoring, i.e. number of
residents, rewards, optimal path decisions.
– Institutional data pre / post
• Compare catastrophic care, pain meds, independence, attrition
– Use game play, survey and observational tools as covariates.
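
One common form of a general linear model for a pre/post design with a control group is an ANCOVA-style regression: post-score = b0 + b1 * played_game + b2 * pre-score. The sketch below fits that model by ordinary least squares using only the standard library. The talk does not specify the model, so the predictors, the variable names, and the data are assumptions; the scores are generated without noise so that the recovered coefficients are easy to check.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up observational scores: four controls, then four game players.
pre = [10, 12, 14, 16, 11, 13, 15, 17]
played_game = [0, 0, 0, 0, 1, 1, 1, 1]
# Noise-free by construction: post = 2 + 3 * played_game + 0.5 * pre.
post = [2 + 3 * g + 0.5 * p for g, p in zip(played_game, pre)]

# Design matrix columns: intercept, group indicator, pre-score covariate.
design = [[1.0, g, p] for g, p in zip(played_game, pre)]
intercept, game_effect, pre_slope = ols(design, post)
print(round(intercept, 3), round(game_effect, 3), round(pre_slope, 3))
```

Here `game_effect` is the difference in post-scores attributable to game play after adjusting for the pre-score. Institutional outcomes such as attrition or pain medication use could take the place of `post`, with additional covariates added as extra columns in `design`.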

Take home
• Can you pose a testable question (a hypothesis)?
– Tension between design process and measurement
• Needs – behavior, function, non-function, structure.
• Construct validity – are you measuring what you think you are measuring? Theoretically?
Conceptually?
• Assessments, measures, & evaluations
• Mixed Methods approaches such as cognitive ethnography can provide an
opportunity to create a nomological network.
– MTMM provides an analysis tool that can be constructed to identify
convergent and discriminant validity.
• Spend time understanding the sample population
– Beliefs, likes, skills, & abilities.
– Irreverence increases engagement, but reduces the happiness of the business partner.
• Usability testing should align with construct
• Again, emphasis on validity
– Without it, there is no capability for ROI analysis
