Defining Constituents, Data Vizzes and Telling a Data Story
Reliability and validity
Reliability: That quality of measurement methods that suggests that the same data would have
been collected each time in repeated observations of the same phenomenon. Reliability,
however, does not ensure accuracy any more than precision does. For example, the question
“Did you attend religious services last week?” would have higher reliability than the question
“About how many times have you attended religious services in your life?”
Reliability means that scores from an instrument are stable and consistent. Scores should be
nearly the same when researchers administer the instrument several times on different occasions.
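One common way to check stability across administrations is a test-retest correlation. The sketch below is only an illustration: the two score lists are hypothetical, and the helper name pearson_r is an assumption introduced here, not something from the original notes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five respondents at two administrations.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

# A value close to 1 suggests the instrument ranks people consistently over time.
test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))
```

A high correlation here speaks only to consistency; it says nothing about whether the instrument measures the intended concept.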
Validity refers to the extent to which an empirical measure adequately reflects the real meaning
of the concept under consideration. A measure of social class should measure social class, not
political orientations. A measure of political orientations should measure political orientations,
not sexual permissiveness. Validity means that we are actually measuring what we say we are
measuring. For example, your IQ would seem a more valid measure of your intelligence than
would the number of hours you spend in the library.
Reliability is generally easier to understand as it is a measure of consistency. If scores are not
reliable, they are not valid; scores need to be stable and consistent first before they can be
meaningful. Additionally, the more reliable the scores from an instrument, the more valid the
scores may be. The ideal situation exists when scores are both reliable and valid.
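The earlier point that reliability does not ensure accuracy can be made concrete with a small sketch. It assumes a hypothetical miscalibrated bathroom scale; the readings and the true weight are invented for illustration only.

```python
from statistics import mean, pstdev

true_weight_kg = 70.0  # assumed true value, for illustration only

# Hypothetical repeated readings from a scale that is off by about 5 kg.
readings = [75.1, 74.9, 75.0, 75.1, 74.9]

spread = pstdev(readings)               # small spread: readings are consistent
bias = mean(readings) - true_weight_kg  # large bias: readings are not accurate

print(f"spread = {spread:.2f} kg, bias = {bias:.2f} kg")
```

The readings agree closely with one another, so the scale is reliable, yet every reading is wrong by roughly the same amount.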
Threats to internal validity
The first category addresses threats related to participants in the study and their experiences:
History: Time passes between the beginning of the experiment and the end, and events may
occur (e.g., additional discussions about the hazards of smoking besides the treatment lecture)
between the pretest and posttest that influence the outcome.
Maturation: Individuals develop or change during the experiment (i.e., become older, wiser,
stronger, and more experienced), and these changes may affect their scores between the pretest
and posttest. A careful selection of participants who mature or develop in a similar way (e.g.,
individuals at the same grade level) for both the control and experimental groups helps guard
against this problem.
Regression: When researchers select individuals for a group based on extreme scores, those
individuals' scores will naturally move toward the average on the posttest (doing better after an
extremely low pretest, or worse after an extremely high one) regardless of the treatment.
For example, the selection of heavy smokers for an experiment will probably contribute to lower
rates of smoking after treatment because the teens selected started with high rates at the
beginning of the experiment. The selection of individuals who do not have extreme scores on
entering characteristics (e.g., moderate smokers or average scores on pretests) may help solve
this problem.
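The regression threat can be illustrated with a small simulation in which no treatment is applied at all. This is a sketch under stated assumptions: each person has a stable underlying level, both tests add independent random noise, and all numbers and variable names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N = 10_000
# Assumed model: a stable underlying level plus independent noise on each test.
true_level = [random.gauss(50, 10) for _ in range(N)]
pretest = [t + random.gauss(0, 10) for t in true_level]
posttest = [t + random.gauss(0, 10) for t in true_level]  # no treatment applied

# Select roughly the top 10% of pretest scorers (the "extreme" group).
cutoff = sorted(pretest)[int(0.9 * N)]
extreme = [i for i in range(N) if pretest[i] >= cutoff]

pre_mean = sum(pretest[i] for i in extreme) / len(extreme)
post_mean = sum(posttest[i] for i in extreme) / len(extreme)

# The extreme group's average drops on the posttest without any intervention.
print(f"pretest mean = {pre_mean:.1f}, posttest mean = {post_mean:.1f}")
```

Because part of an extreme pretest score is chance, the selected group's posttest average falls back toward the overall mean, which could be mistaken for a treatment effect.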
Selection: “People factors” may introduce threats that influence the outcome, such as selecting
individuals who are brighter, more receptive to a treatment, or more familiar with a treatment
(e.g., teen smokers ready to quit) for the experimental group. Random selection may partly
address this threat.
Mortality: When individuals drop out during the experiment for any number of reasons (e.g.,
time, interest, money, friends, parents who do not want them participating in an experiment
about smoking), drawing conclusions from scores may be difficult. Researchers need to choose a
large sample and compare those who drop out with those who remain in the experiment on the
outcome measure.
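The comparison of dropouts with those who remain can be sketched as follows. The records are hypothetical, and the sketch assumes that a pretest version of the outcome measure (cigarettes per day) is available for everyone, since dropouts have no posttest score.

```python
from statistics import mean

# Hypothetical records: (participant id, pretest cigarettes per day, completed study?)
records = [
    ("p01", 12, True), ("p02", 20, False), ("p03", 8, True),
    ("p04", 15, True), ("p05", 22, False), ("p06", 10, True),
]

completers = [score for _, score, done in records if done]
dropouts = [score for _, score, done in records if not done]

attrition_rate = len(dropouts) / len(records)
# Positive gap: dropouts smoked more at pretest than those who stayed.
gap = mean(dropouts) - mean(completers)

print(f"attrition = {attrition_rate:.0%}, pretest gap = {gap:.2f}")
```

A large gap suggests that the people who stayed differ systematically from those who left, so posttest results for completers alone should be interpreted with caution.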
The next category addresses threats related to treatments used in the study:
Diffusion of treatments: When the experimental and control groups can communicate with each
other, the control group may learn about the treatment from the experimental group, which
creates a threat to internal validity. The treatments (experimental and nonexperimental) that
the control and experimental groups receive need to remain different. As much as
possible, experimental researchers need to keep the two groups separate in an experiment.
Compensatory equalization: When only the experimental group receives a treatment, an
inequality exists that may threaten the validity of the study. The benefits (i.e., the goods or
services believed to be desirable) of the experimental treatment need to be equally distributed
among the groups in the study. To counter this problem, researchers use comparison groups (e.g.,
one group receives the health hazards lecture, whereas the other receives a handout about the
problems of teen smoking) so that all groups receive some benefits during an experiment.
Compensatory rivalry: If you publicly announce assignments to the control and experimental
groups, compensatory rivalry may develop between the groups because the control group feels
that it is the “underdog.” Researchers can try to avoid this threat by attempting to reduce the
awareness and expectations of the presumed benefits of the experimental treatment.
Resentful demoralization: When a control group is used, individuals in this group may become
resentful and demoralized because they perceive that they receive a less desirable treatment than
other groups. One remedy to this threat is for experimental researchers to provide a treatment to
this group after the experiment has concluded (e.g., after the experiment, all classes receive the
lecture on the health hazards of smoking).
Threats to external validity
Threats to external validity are problems that threaten our ability to draw correct inferences
from the sample data to other persons, settings, treatment variables, and measures.
Interaction of selection and treatment: This threat to external validity involves the inability to
generalize beyond the groups in the experiment, such as other racial, social, geographical, age,
gender, or personality groups.
Interaction of setting and treatment: This threat to external validity arises from the inability to
generalize from the setting where the experiment occurred to another setting. For example, you
cannot generalize the treatment effect you obtain from studying entire school districts to specific
high schools. The practical solution to an interaction of setting and treatment is for the researcher
to analyze the effect of a treatment for each type of setting.
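Analyzing the effect of a treatment for each type of setting can be sketched as a grouped comparison. The rows below are hypothetical, as are the setting labels "urban" and "rural"; the effect is taken here, as a simplifying assumption, to be the difference in mean posttest scores between treatment and control within each setting.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (setting, group, posttest score)
rows = [
    ("urban", "treatment", 14), ("urban", "control", 10),
    ("urban", "treatment", 16), ("urban", "control", 12),
    ("rural", "treatment", 11), ("rural", "control", 10),
    ("rural", "treatment", 9), ("rural", "control", 8),
]

# Collect scores per setting and per group.
scores = defaultdict(lambda: defaultdict(list))
for setting, group, score in rows:
    scores[setting][group].append(score)

# Treatment-minus-control difference in means, computed separately per setting.
effect_by_setting = {
    setting: mean(groups["treatment"]) - mean(groups["control"])
    for setting, groups in scores.items()
}

print(effect_by_setting)  # urban: 4, rural: 1
```

When the per-setting effects differ this much, a single pooled estimate would misrepresent at least one of the settings.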
Interaction of history and treatment: This threat to external validity develops when the
researcher tries to generalize findings to past and future situations. Experiments may take place
at a special time (e.g., at the beginning of the school year) and may not produce similar results if
conducted earlier (e.g., students attending school in the summer may be different from students
attending school during the regular year) or later (e.g., during semester break). One solution is to
replicate the study at a later time rather than trying to generalize results to other times.