2. Topics Discussed in this Chapter
Data collection
Measuring instruments
Terminology
Interpreting data
Types of instruments
Technical issues
Validity
Reliability
Selection of a test
3. Data Collection
Scientific inquiry requires the collection,
analysis, and interpretation of data
Data – the pieces of information that are
collected to examine the research topic
Issues related to the collection of this
information are the focus of this chapter
4. Data Collection
Terminology related to data
Constructs – abstractions that cannot be
observed directly but are helpful when
trying to explain behavior
Intelligence
Teacher effectiveness
Self concept
Obj. 1.1 & 1.2
5. Data Collection
Data terminology (continued)
Operational definition – the ways by which
constructs are observed and measured
Weschler IQ test
Virgilio Teacher Effectiveness Inventory
Tennessee Self-Concept Scale
Variable – a construct that has been
operationalized and has two or more
values
Obj. 1.1 & 1.2
6. Data Collection
Measurement scales
Nominal – categories
Gender, ethnicity, etc.
Ordinal – ordered categories
Rank in class, order of finish, etc.
Interval – equal intervals
Test scores, attitude scores, etc.
Ratio – absolute zero
Time, height, weight, etc.
Obj. 2.1
7. Data Collection
Types of variables
Categorical or quantitative
Categorical variables reflect nominal scales
and measure the presence of different qualities
(e.g., gender, ethnicity, etc.)
Quantitative variables reflect ordinal, interval, or
ratio scales and measure different quantities of
a variable (e.g., test scores, self-esteem
scores, etc.)
Obj. 2.2
8. Data Collection
Types of variables
Independent or dependent
Independent variables are purported causes
Dependent variables are purported effects
Two instructional strategies, co-operative groups and
traditional lectures, were used during a three week social
studies unit. Students’ exam scores were analyzed for
differences between the groups.
The independent variable is the instructional approach (of
which there are two levels)
The dependent variable is the students’ achievement
Obj. 2.3
9. Measurement Instruments
Important terms
Instrument – a tool used to collect data
Test – a formal, systematic procedure for
gathering information
Assessment – the general process of
collecting, synthesizing, and interpreting
information
Measurement – the process of quantifying
or scoring a subject’s performance
Obj. 3.1 & 3.2
10. Measurement Instruments
Important terms (continued)
Cognitive tests – examining subjects’
thoughts and thought processes
Affective tests – examining subjects’
feelings, interests, attitudes, beliefs, etc.
Standardized tests – tests that are
administered, scored, and interpreted in a
consistent manner
Obj. 3.1
11. Measurement Instruments
Important terms (continued)
Selected response item format – respondents
select answers from a set of alternatives
Multiple choice
True-false
Matching
Supply response item format – respondents
construct answers
Short answer
Completion
Essay
Obj. 3.3 & 11.3
12. Measurement Instruments
Important terms (continued)
Individual tests – tests administered on an
individual basis
Group tests – tests administered to a group
of subjects at the same time
Performance assessments – assessments
that focus on processes or products that
have been created
Obj. 3.6
13. Measurement Instruments
Interpreting data
Raw scores – the actual score made on a
test
Standard scores – statistical
transformations of raw scores
Percentiles (0.00 – 99.9)
Stanines (1 – 9)
Normal Curve Equivalents (0.00 – 99.99)
Obj. 3.4
14. Measurement Instruments
Interpreting data (continued)
Norm-referenced – scores are interpreted
relative to the scores of others taking the
test
Criterion-referenced – scores are
interpreted relative to a predetermined
level of performance
Self-referenced – scores are interpreted
relative to changes over time
Obj. 3.5
15. Measurement Instruments
Types of instruments
Cognitive – measuring intellectual
processes such as thinking, memorizing,
problem solving, analyzing, or reasoning
Achievement – measuring what students
already know
Aptitude – measuring general mental
ability, usually for predicting future
performance
Obj. 4.1 & 4.2
16. Measurement Instruments
Types of instruments (continued)
Affective – assessing individuals’ feelings,
values, attitudes, beliefs, etc.
Typical affective characteristics of interest
Values – deeply held beliefs about ideas, persons, or
objects
Attitudes – dispositions that are favorable or unfavorable
toward things
Interests – inclinations to seek out or participate in
particular activities, objects, ideas, etc.
Personality – characteristics that represent a person’s
typical behaviors
Obj. 4.1 & 4.5
17. Measurement Instruments
Types of instruments (continued)
Affective (continued)
Scales used for responding to items on affective tests
Likert
Positive or negative statements to which subjects
respond on scales such as strongly disagree,
disagree, neutral, agree, or strongly agree
Semantic differential
Bipolar adjectives (i.e., two opposite adjectives) with a
scale between each adjective
Dislike: ___ ___ ___ ___ ___ :Like
Rating scales – rankings based on how a subject would
rate the trait of interest
Obj. 5.1
18. Measurement Instruments
Types of instruments (continued)
Affective (continued)
Scales used for responding to items on
affective tests (continued)
Thurstone – statements related to the trait of interest
to which subjects agree or disagree
Guttman – statements representing a uni-
dimensional trait
Obj. 5.1
19. Measurement Instruments
Issues for cognitive, aptitude, or affective
tests
Problems inherent in the use of self-report
measures
Bias – distortions of a respondent’s performance or
responses based on ethnicity, race, gender, language,
etc.
Responses to affective test items
Socially acceptable responses
Accuracy of responses
Response sets
Alternatives include the use of projective tests
Obj. 4.3, 4.4
21. Technical Issues
Validity – extent to which interpretations
made from a test score are appropriate
Characteristics
The most important technical characteristic
Situation specific
Does not refer to the instrument but to the
interpretations of scores on the instrument
Best thought of in terms of degree
Obj. 6.1 & 7.1
22. Technical Issues
Validity (continued)
Four types
Content – to what extent does the test measure
what it is supposed to measure
Item validity
Sampling validity
Determined by expert judgment
Obj. 7.1 & 7.2
23. Technical Issues
Validity (continued)
Criterion-related
Predictive – to what extent does the test predict a
future performance
Concurrent - to what extent does the test predict a
performance measured at the same time
Estimated by correlations between two tests
Construct – the extent to which a test measures
the construct it represents
Underlying difficulty defining constructs
Estimated in many ways
Obj. 7.1, 7.3, & 7.4
24. Technical Issues
Validity (continued)
Consequential – to what extent are the
consequences that occur from the test
harmful
Estimated by empirical and expert judgment
Factors affecting validity
Unclear test directions
Confusing and ambiguous test items
Vocabulary that is too difficult for test takers
Obj. 7.1, 7.5, & 7.7
25. Technical Issues
Factors affecting validity (continued)
Overly difficult and complex sentence
structure
Inconsistent and subjective scoring
Untaught items
Failure to follow standardized
administration procedures
Cheating by the participants or someone
teaching to the test items
Obj. 7.7
26. Technical Issues
Reliability – the degree to which a test
consistently measures whatever it is
measuring
Characteristics
Expressed as a coefficient ranging from 0 to 1
A necessary but not sufficient characteristic of
a test
Obj. 6.1, 8.1, & 8.7
27. Technical Issues
Reliability (continued)
Six reliability coefficients
Stability – consistency over time with the same
instrument
Test – retest
Estimated by a correlation between the two
administrations of the same test
Equivalence – consistency with two parallel
tests administered at the same time
Parallel forms
Estimated by a correlation between the parallel tests
Obj. 8.1, 8.2, 8.3, & 8.7
28. Technical Issues
Reliability (continued)
Six reliability coefficients (continued)
Equivalence and stability – consistency over
time with parallel forms of the test
Combines attributes of stability and equivalence
Estimated by a correlation between the parallel forms
Internal consistency – artificially splitting the
test into halves
Several coefficients – split halves, KR 20, KR 21, Cronbach
alpha
All coefficients provide estimates ranging from 0 to 1
Obj. 8.1, 8.4, 8.5, & 8.7
29. Technical Issues
Reliability (continued)
Six reliability coefficients
Scorer/rater – consistency of observations
between raters
Inter-judge – two observers
Intra-judge – one judge over two occasions
Estimated by percent agreement between
observations
Obj. 8.1, 8.6, & 8.7
30. Technical Issues
Reliability (continued)
Six reliability coefficients (continued)
Standard error of measurement (SEM) – an
estimate of how much difference there is
between a person’s obtained score and his or
her true score
Function of the variation of the test and the reliability
coefficient (e.g., KR 20, Cronbach alpha, etc.)
Estimated by specifying an interval rather than a
point estimate of a person’s score
Obj. 8.1, 8.7, & 9.1
31. Selection of a Test
Sources of test information
Mental Measurement Yearbooks (MMY)
The reviews in MMY are most easily accessed through
your university library and the services to which they
subscribe (e.g., EBSCO)
Provides factual information on all known tests
Provides objective test reviews
Comprehensive bibliography for specific tests
Indices: titles, acronyms, subject, publishers, developers
Buros Institute
Obj. 10.1 & 12.1
32. Selection of a Test
Sources (continued)
Tests in Print
Tests in Print is a subsidiary of the Buros Institute
The reviews in it are most easily accessed through your
university library and the services to which they
subscribe (e.g., EBSCO)
Bibliography of all known commercially produced tests
currently available
Very useful to determine availability
Tests in Print
Obj. 10.1 & 12.1
33. Selection of a Test
Sources (continued)
ETS Test Collection
Published and unpublished tests
Includes test title, author, publication date, target
population, publisher, and description of purpose
Annotated bibliographies on achievement, aptitude,
attitude and interests, personality, sensory motor, special
populations, vocational/occupational, and miscellaneous
ETS Test Collection
Obj. 10.1 &12.1
34. Selection of a Test
Sources (continued)
Professional journals
Test publishers and distributors
Issues to consider when selecting tests
Psychometric properties
Validity
Reliability
Length of test
Scoring and score interpretation
Obj. 10.1, 11.1, & 12.1
35. Selection of a Test
Issues to consider when selecting tests
Non-psychometric issues
Cost
Administrative time
Objections to content by parents or others
Duplication of testing
Obj. 11.1
36. Selection of a Test
Designing your own tests
Get help from others with experience in
developing tests
Item writing guidelines
Avoid ambiguous and confusing wording and sentence
structure
Use appropriate vocabulary
Write items that have only one correct answer
Give information about the nature of the desired answer
Do not provide clues to the correct answer
See Writing Multiple Choice Items
Obj. 11.2
37. Selection of a Test
Test administration guidelines
Plan ahead
Be certain that there is consistency across
testing sessions
Be familiar with any and all procedures
necessary to administer a test
Obj. 11.4