Evaluation and Program Planning 24 (2001) 129–143
www.elsevier.com/locate/evalprogplan



Assessing the subsequent effect of a formative evaluation on a program

J. Lynne Brown a,*, Nancy Ellen Kiernan b

a Penn State University, Department of Food Science, 203B Borland, University Park, Pennsylvania, PA, USA
b Penn State University, College of Agricultural Sciences, 401 Agricultural Administration Building, University Park, Pennsylvania, PA 814-863-3439, USA

Received 30 June 1999; received in revised form 1 September 2000; accepted 31 October 2000



Abstract
   The literature on formative evaluation focuses on its conceptual framework, methodology and use. Permeating this work is a consensus that a program will be strengthened as a result of a formative evaluation although little empirical evidence exists in the literature to demonstrate the subsequent effects of a formative evaluation on a program. This study begins to fill that gap. To do this, we outline the initial program and formative evaluation, present key findings of the formative evaluation, describe how these findings influenced the final program and summative evaluation, and then compare the findings to those of the formative. The study demonstrates that formative evaluation can strengthen the implementation and some impacts of a program, i.e. knowledge and some behaviors. The findings also suggest that when researchers are faced with negative feedback about program components in a formative evaluation, they need to exercise care in interpreting and using this feedback. © 2001 Elsevier Science Ltd. All rights reserved.
Keywords: Formative evaluation; Summative evaluation; Impact; Assessing feedback




* Corresponding author. Tel.: +1-814-863-3973; fax: +1-814-863-6132.
E-mail address: f9a@psu.edu (J.L. Brown).
0149-7189/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S0149-7189(01)00004-0

1. Introduction

   Formative evaluation commands a formidable place in the evaluation literature. Highly regarded, the process was used to improve educational films in the 1920's (Cambre, 1981). Academic areas as diverse as agricultural safety (Witte, Peterson, Vallabhan, Stephenson, Plugge, Givens et al., 1992/93) and cardiovascular disease (Jacobs, Luepker, Mittelmark, Folsom, Pirie, Mascioli et al., 1986) draw on the process today, using findings to improve a program; among educators in particular, it is 'almost universally embraced' (Weston, 1986, p. 5). Surprisingly, the subsequent effect of using the findings of formative evaluation has not received systematic attention. This paper addresses that gap.
   The literature focuses attention on three aspects of formative evaluation, the first of which is its conceptualization. Over time, researchers clarified the concept. They distinguished it from other forms of evaluation especially summative, the fundamental difference being the rationale and use of the data (Baker & Alkin, 1973; Markle, 1989; Patton, 1994; Chambers, 1994; Weston, 1986); labeled it formative evaluation (Scriven, 1967) and accepted that designation (Rossi & Freeman, 1982; Patton, 1982; Fitz-Gibbon & Morris, 1978); debated its frequency and timing in the program cycle (Markle, 1979; Thiagarajan, 1991; Russell & Blake, 1988; Chambers, 1994); scrutinized its overlap with process evaluation (Patton, 1982; Stufflebeam, 1983; Scheirer & Rezmovic, 1983; Dehar, Casswell & Duignan, 1993; Scheirer, 1994; Chen, 1996); and expanded its epistemological framework, linking it to developmental programs (Patton, 1996). As the conceptual framework evolved, the perceived value of formative evaluation has only increased.
   Second, the literature focuses on methods and design strategies to conduct formative evaluation. That focus appears first, in handbooks or articles describing methods and design strategies for either an entire program (Rossi & Freeman, 1982; Patton, 1978; Fitz-Gibbon & Morris, 1978) or a segment of a program such as the materials (Weston, 1986; Bertrand, 1978), instruction (Tessmer, 1993), electronic delivery like television (Baggaley, 1986), or interactive technology (Flagg, 1990; Chen & Brown, 1994). The focus on method and strategies appears secondly, in case studies which illuminate a particular method or strategy tailored to the exigencies of a particular situation such as a community (Jacobs et al., 1986; Johnson, Osganian, Budman, Lytle, Barrera, Bonura et al., 1994; McGraw, Stone, Osganian, Elder, Johnson, Parcel et al., 1994; McGraw, McKinley, McClements, Lasater, Assaf & Carleton, 1989) or worksite (Kishchuk, Peters, Towers, Sylvestre, Bourgault & Richard,
1994). Over time, the focus on methods and strategies illuminated critical decisions needed to design a valid formative evaluation. The decisions include: (1) who should participate: experts (Geis, 1987), learners from the targeted audience (Weston, 1986; Russell & Blake, 1988), learners with different aptitudes (Wager, 1983), instructors representative of those in the field (Weston, 1987; Peterson & Bickman, 1988), or drop outs from a program (Rossi & Freeman, 1982); (2) how many to include and in what form: one or a group (Wager, 1983; Dick, 1980); (3) type of data to collect: qualitative or quantitative (Dennis, Fetterman & Sechrest, 1994; Peterson & Bickman, 1988; Flay, 1986); (4) data collection techniques (Weston, 1986; Tessmer, 1993) and (5) similarity of pilot sessions relative to actual learning situations (Rossi & Freeman, 1982; Weston, 1986). Not surprisingly, the conviction permeating the literature on methods and strategies is that formative evaluation will lead to a stronger, more effective program.
   Third, attention in the literature dwells on the immediate use of formative evaluation findings. Academic areas such as nutrition (Cardinal & Sachs, 1995), cancer prevention for agricultural workers (Parrott, Steiner & Goldenhar, 1996), and child health (Seidel, 1993) have evaluated a program in its formative stage. In case studies such as these, researchers hail the evaluation process, describing the immediate effects of the evaluation, i.e., the problems identified and/or changes to be made in a modified version of the program (Potter et al., 1990; Finnegan, Rooney, Viswanath, Elmer, Graves, Baxter et al., 1992; Kishchuk et al., 1994; Iszler, Crockett, Lytle, Elmer, Finnegan, Luepker et al., 1995). These researchers are not consistent when reporting the immediate effects of a formative evaluation. Some do not include data; some do not outline the problems the process identified; and some do not describe the changes they made. What is consistent, however, is the message from these researchers: formative evaluation led them to make changes that should lead to a stronger program.
   In summary, much has been written about formative evaluation: its conceptual framework, its methods, and its use. Throughout this literature, there is strong consensus on the value of formative evaluation, some calling its value 'obvious' (Baggaley, 1986, p. 34) and 'no longer questioned' (Chen & Brown, 1994, p. 192). Many educators contend, however, that formative evaluation is not used enough (Flagg, 1990; Kaufman, 1980; Geis, 1987; Foshee, McLeroy, Sumner & Bibeau, 1986). Indeed, some evaluations reflect no previous attempt at formative evaluation (Foshee et al., 1988; Glanz, Sorensen & Farmer, 1996; Pelletier, 1996; Schneider, Ituarte & Stokols, 1993; Wilkinson, Schuler & Skjolaas, 1993; Hill, May, Coppolo & Jenkins, 1993). Other formative evaluations are limited: using few people, non-representative samples, or selected materials (Tessmer, 1993).
   Part of the explanation for limited use of formative evaluation may lie in the lack of empirical evidence in the literature demonstrating its subsequent effect. Few researchers take the next step and demonstrate this by comparing data from the initial program with data from the final program to show whether the changes resulted in an improvement in program implementation and impacts. Reviewing over 60 years of work in formative evaluation, scholars (Flagg, 1990; Dick, 1980; Dick & Carey, 1985; Weston, 1986) found that the 'evidence is supportive but meager' (Geis, 1987, p. 6). Furthermore, most evidence (Baker & Alkin, 1973; Baghdadi, 1981; Kandaswamy, Stolovitch & Thiagarajan, 1976; Nathenson & Henderson, 1980; Scanlon, 1981; Wager, 1983; Montague, Ellis & Wulfeck, 1983; Cambre, 1981) relates to only a component of a program, the educational materials, not to an entire program. Some landmark studies examine the impact of an entire program in its formative stage such as the use of negative income tax strategies as a substitute for welfare (Kershaw & Fair, 1976; Rossi & Lyall, 1976; Robins, Spiegelman, Weiner & Bell, 1980; Hausman & Wise, 1985) and the Department of Labor's LIFE effort to decrease arrest rates among released felons with increased employment (Lenihan, 1976), but only a few, such as those reported by Fairweather and Tornatzky (1977), actually document that the changes made as a result of a formative evaluation resulted in a change in the impact of the final program. Given that researchers hail formative evaluation as important, the lack of evidence about its subsequent effect points to a surprising gap in the literature.
   The purpose of this paper is to examine the subsequent effect of a formative evaluation to see whether the changes resulting from it improved the final program sufficiently to distinguish between the impact of two program delivery methods. To do this, we: (a) outline the initial program and its formative evaluation, (b) present the key findings of the formative evaluation, (c) describe how the formative findings influenced the design of the revised program and its evaluation, and then (d) compare the results of the initial and revised program, something rarely done in the formative evaluation literature. In doing this, we provide a comprehensive look at the implementation of both a formative and summative evaluation. In conclusion, we identify issues that evaluators wishing to improve the design of a formative evaluation need to consider. In addition, we identify problems we encountered in attempting to assess the effectiveness of a formative evaluation.

2. Stage one: The initial program

2.1. Background

   Combining federal, state and local funding, land grant colleges support educational health promotion programs for individuals and communities offered by county-based co-operative extension family living agents. Prior to our study, agents reported poor attendance at evening and weekend meetings but rarely offered daytime programs at
worksites. Instead, agents used correspondence lessons to reach people unwilling to attend meetings. However, group interaction is more likely to facilitate changes in behavior (Glanz & Seewald-Klein, 1986), possibly through the support offered by sharing experiences. While it was easier for agents to mail correspondence lessons (an impersonal delivery method), we postulated that using a group meeting to motivate the use of each lesson in a series before it was distributed was more likely to promote change in food/health behaviors. To test this hypothesis, we designed a two-stage impact study to evaluate two methods of delivering lessons biweekly at worksites: distribution alone vs distribution in conjunction with a half-hour group meeting.
   Agents delivering the program would work with new delivery sites, new clientele, new content, and new delivery methods. Because of this unfamiliarity and because this program had to fit into worksite environments with differing work-shift patterns, lunch patterns, physical settings, personnel departments, and required advertising, we conducted a formative evaluation of the initial program impact and its implementation. We included participants and instructors in the evaluation, using a variety of qualitative and quantitative methods.

2.2. Target health problem and audience

   Four print lessons in the initial program addressed prevention of osteoporosis, a recently proclaimed public health problem (National Institutes of Health, 1985) most often affecting white, elderly women. Prevention requires lifelong adequate calcium intake and exercise. According to NHANES II data, 75% of American women fail to consume the recommended daily amount of calcium (Walden, 1989).
   We targeted working women, ages 21–45, with at least one child at home because these women are building bone mass which peaks at age 35–45. Mothers can also provide nutrition activities (Gillespie & Achterberg, 1989) that teach children how to protect bone health.

2.3. Lesson content and organization

   The lessons, based on the Health Belief Model (Janz & Becker, 1984), encouraged participants to eat calcium-rich foods and to walk for exercise by focusing on personal susceptibility, disease severity, benefits of prevention, and overcoming barriers to health protecting actions. Because many in the target audience disliked drinking fluid milk (based on an initial survey) or could have reactions to milk (Houts, 1988), each lesson introduced a different calcium-rich food (non-fat dry milk, plain yogurt, canned salmon or tofu) and menu ideas. Each also included scientific background on the lifestyle–osteoporosis link, a self assessment worksheet, a featured food fact sheet, suggestions for involving children in food preparation, and calcium-rich recipes. Rightwriter (1990) analysis indicated a 12th grade reading level.

2.4. Delivery method

   We tested two bi-weekly delivery methods for the lessons. Group-delivery (G), based on the discussion–decision methods of Lewin (1943), was a 30 min motivational session in which participants discussed adopting a behavior suggested in each lesson (i.e. trying recipes, walking for exercise, involving children in food preparation, and eating calcium-rich foods, not supplements). Participants could taste a recipe using the featured calcium-rich food and vote by raised hands on their willingness to individually adopt the suggested behavior. An agent served as facilitator/motivator and distributed a lesson at the end of each session. The other method, impersonal-delivery (I), consisted of either the agent or a company contact person simply distributing the required lesson to participants according to schedule.

2.5. Staff training

   To ensure consistency, all agents received guidelines for recruitment of worksites and participants, a program content review, a printed program delivery script and instructions for instrument administration.

2.6. Recruitment

   Seven agents representing three rural and four urban/suburban counties interviewed personnel managers at businesses within their county and recruited 48 worksites where women comprised over 30% of the work force. Once worksites were randomly assigned to a delivery method (G or I), agents systematically recruited participants within a month.

3. Stage one: Formative evaluation

   We delineate the data collection methods, the evaluation design, and analyses.

3.1. Evaluating program implementation

   Our goals were to assess: (a) participant characteristics relative to the prescribed target audience; (b) participant attention to, and use of, the lessons; (c) participant reaction to advertising, lessons content and structure, delivery method, and time between lessons and (d) agent reaction to delivering the program and its content.
   To address goal (a), we included demographic questions in the first questionnaire administered. To address (b), we designed a response sheet for each lesson which asked participants how completely they had read the lesson, how easy it was to read, and how useful it was, and whether they completed the worksheet, tried suggestions or recipes, and shared lesson materials. To address (c), we developed focus group questions for participants, and, for (d), questions for agents attending a debriefing.
   We conducted four focus groups among participants
within a month of the intervention, two each for group-
delivery and impersonal-delivery. Each focus group derived
from a purposeful sample of thirty participants composed of
two-thirds completers and one-third non-completers. The
agent telephoned all selected and those who chose to attend
became the sample. We held the debriefing with all agents
within a month also. Data consisted of tape recordings and
written notes.

3.2. Evaluating program impact

   Our goal was to examine changes in knowledge, attitudes,
and behaviors (KAB) needed to prevent osteoporosis using
appropriate scales, changes in calcium intake using a food
frequency questionnaire (FFQ), and changes in exercise
pattern using specific questions. Our hypothesis was that persons
in group-delivery would exhibit greater changes in attitude
and behavior scores, calcium intake, and exercise pattern
than those in impersonal-delivery. We anticipated similar
changes in knowledge for both delivery methods because
the same lessons were used; the meeting focused primarily
on motivation.
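
The hypothesis above can be read as a comparison of change scores (post-test minus pre-test) between the two delivery methods. The following is a minimal Python sketch of that comparison, not the authors' analysis: the function names and all example scores are invented for illustration, and Welch's t statistic is used only as a simple stand-in for a formal test.

```python
from math import sqrt
from statistics import mean, variance


def change_scores(pre, post):
    """Return the post-minus-pre change for each matched participant."""
    if len(pre) != len(post):
        raise ValueError("pre and post must be matched by participant")
    return [after - before for before, after in zip(pre, post)]


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference in means of two independent samples."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical attitude scale scores on a 1-5 scale; NOT data from the study.
group_pre = [3.0, 3.2, 2.8, 3.5]
group_post = [3.4, 3.5, 3.0, 3.9]
impersonal_pre = [3.1, 2.9, 3.3, 3.0]
impersonal_post = [3.3, 3.0, 3.6, 3.1]

group_change = change_scores(group_pre, group_post)
impersonal_change = change_scores(impersonal_pre, impersonal_post)

# A positive difference means larger gains under group-delivery.
difference = mean(group_change) - mean(impersonal_change)
t_stat = welch_t(group_change, impersonal_change)

print(f"mean change: G={mean(group_change):.3f}, I={mean(impersonal_change):.3f}")
print(f"difference={difference:.3f}, Welch t={t_stat:.2f}")
```

With only two measurement points and two delivery methods, comparing mean change between groups asks the same question as a time-by-method interaction in a repeated measures analysis; a real analysis would also need to account for covariates and for the clustering of participants within worksites.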
   To assess changes, we developed the KAB scales using nutrition expert and target audience reviews and internal consistency testing with 65 of our target audience prior to use in Stage One. The final formative instrument contained a 20 item knowledge scale (KR-20 = 0.80); a 22 item attitude scale (α = 0.78); and a 16 item behavior scale (α = 0.75) all addressing concepts in the lessons.
   We used a modified version of the Block food frequency questionnaire (Brown & Griebler, 1993) that included the four foods featured in the osteoporosis lessons to assess calorie and calcium intake. To examine exercise behavior directly, we asked participants if they exercised regularly within the last several months each time they completed the KAB scales; after the lessons we also asked if this exercise pattern was new, and, if new, if it was due to the lessons.

3.3. Formative evaluation design

   We employed a pre-test (T0), 8 week intervention, post-test (T1) design to compare group-delivery (G) and impersonal-delivery (I) (Fig. 1). We arranged the 48 worksites in four blocks reflecting business types (white collar, educational/municipal, health care and blue collar) and assigned them randomly to either delivery. Although eleven worksites withdrew prior to the intervention, primarily due to company changes, the proportion of business types in each delivery method was unaffected.

Fig. 1. Model of formative and summative evaluation design.

   Participants completed pre KAB and FFQ instruments at a meeting 1 week prior to receiving lesson one; the last lesson included post KAB and FFQ instruments, which participants returned at an optional post program meeting 1 week later or by mail according to Dillman (1978). Question order in each KAB scale differed at each measurement to diminish recall bias. Each lesson included the response sheet that participants returned prior to receiving the next lesson.

3.4. Formative evaluation data analyses

   We used χ² analysis to compare categorical and ANOVA to compare continuous implementation data, between lessons and between delivery methods, from response sheets returned. We examined tape recording transcripts and focus group and debriefing notes for repeated themes (Krippendorff, 1980).
   Data from those completing both KAB instruments were analyzed and scale scores determined allowing only one missing value. Each individual's knowledge score was the sum of correct answers. Each attitude and behavior statement required a response on a 5-point Likert scale. Each individual's attitude and behavior scale score was the mean of all their responses to those questions.
   Data from those completing both FFQs were coded, entered, and analyzed using FFQ software (Health Habits and History Questionnaire, 1989). Because nutrient value distributions were not normal, the data were transformed using log_e prior to statistical analysis (SAS Proprietary software, 1989).
   Non-directional t-tests for independent samples were used to test significance of continuous and ordinal data (mean age, education, KAB scores and calcium) between delivery methods (G vs I) at each time point (T0, T1). Categorical demographic and exercise data were compared using χ² analysis. ANOVA for repeated measures and ANCOVA
were used to test significance of mean KAB scores and calcium intake of matched individuals across time. The covariates of mean income and employment status were used in testing changes in KAB scores. Significance was assumed at ≤0.05.

4. Key findings from the formative evaluation of the initial program

   We outline program implementation findings for goals (a)–(d) and impact findings.

4.1. Program implementation

4.1.1. Goal (a): Target audience
   Ultimately, 275/489 (56%) women completed post questionnaires that met analysis criteria. Completers and non-completers did not differ in demographic characteristics (data not shown). When comparing delivery methods, completers differed significantly only in two factors: percent employed full time (91.6% in G vs 81.6% in I) and percent of families with incomes over $35,000 (57.7% vs 42.4%).

4.1.2. Goal (b): Participant's attention to and use of lessons
   Response sheets returned dropped over the four lessons; Method G dropped from 81% of initial registrants for lesson one to 41% for lesson four and method I from 95 to 67%. Otherwise, the two delivery methods did not differ significantly in attention to, and use of lessons.
   Respondents that reported reading all lesson materials fell from 85% for lesson one to 62–64% for lesson four (Fig. 2). Regardless of delivery method, respondents rated all lessons, on a scale of 1–5, fairly easy to read (1.4 ± 0.6 where 1 = easy to read), and useful (hovering at 2.1 ± 0.8 where 1 is very useful). About 70% reported completing worksheet one, 28% worksheet two, 80% worksheet three, and 50% worksheet four.

Fig. 2. Percent reading all the lesson in each evaluation.

   The response sheets assessed whether participants tried recipes, involved children in food preparation, and shared lesson materials and revealed no significant differences between delivery methods. Although 37% of method G tried lesson one recipes compared to 20% in method I, thereafter, percentages were lower and similar between delivery groups. Those involving children varied from 11% for lesson one to 2% for lesson four and those sharing recipes with friends between 16 and 22% (Fig. 3).

4.1.3. Goal (c): Participant reactions
   Fifty women (27 from G and 23 from I) participated in the focus groups. Participants from both delivery methods were more likely to remember personal contacts and paycheck flyers than other advertisements. They recommended changes in lesson format, recipes, worksheets, and calcium-rich foods featured. Many found the lesson booklet cumbersome, the menus unhelpful, the worksheets in two lessons long, and some featured foods difficult to adopt. Some participants wanted the emphasis on drinking milk. They suggested including menus and microwave instructions in the recipes. With some exceptions, women reported it was difficult to involve children in food preparation or that their children were grown.
   However, some feedback was unique to a delivery method. Group-delivery participants wanted more lecture, more question and answer time, and less motivational discussion. They could not recall voting to try a behavior (critical to the discussion–decision method) but liked the food tasting activity. Impersonal-delivery participants also wanted question and answer time and reminders to complete each lesson, but disagreed about the period between lessons.
   Participants from both delivery methods revealed that they had limited time to try recipes and had not yet put learned health-promoting actions into practice. They
disliked the long KAB questionnaire and completing the second FFQ, only 2 months after the initial one, when they had not yet initiated changes in eating habits.

Fig. 3. Percent reporting sharing lesson recipe with friends.

4.1.4. Goal (d): Agent reactions
   All agents participated in both delivery methods. They reported that the advertising materials did not clearly define the target audience and that in-person appeals and an enthusiastic site contact improved recruitment. Despite managing shifts, they preferred the interaction and participant interest in group-delivery and the opportunity for daytime programs. But agents using group-delivery resisted being motivators and asked to provide lectures, perceiving that participants wanted prescriptive advice. They felt the recipes needed improvement. Agents echoed the lack of emphasis on drinking milk, a political issue in counties with a dairy industry.

4.2. Program impact

   As hoped, changes over time for KAB were significant. As expected, the hypothesis that changes in knowledge would not differ by delivery method was supported. Unexpectedly, the hypothesis that those in group-delivery would show greater gains in attitude, behavior, calcium intake, and exercise pattern than those in impersonal-delivery was not supported. For the KAB measures, time by delivery method interaction was not




significant (Figs. 4–6). Group delivery did not affect knowledge, attitude, or behavior scores any more than impersonal delivery. Changes in calcium intake and exercise pattern were not significantly different between delivery groups (data not shown).

Fig. 4. Change in knowledge score over time.
Fig. 5. Change in attitude score over time.
Fig. 6. Change in behavior score over time.

5. Stage two: The revised program and summative evaluation

   The changes made in stage two in the program content, recruitment, delivery method, and evaluation design and instruments for the summative evaluation are shown in Table 1. Almost all reflect key findings of the stage one formative evaluation.

5.1. Revised program lesson content and recruitment

   We changed the lesson content to address the concerns outlined above. We asked six county agents, representing three rural and three suburban counties, to recruit four worksites each, a total of 24. We clarified the target audience in advertising materials and directed agents toward in-person recruiting. We lowered the lessons' reading level to accommodate participants from more blue collar worksites where mothers, ages 21–45, were a significant part of the work force to ensure enrolling more working women with young children.

5.2. Revised program delivery method

   The initial program implementation and impact data

Table 1
Major changes in educational program, evaluation design and evaluation instruments prompted by results of the formative evaluation

Type of change                               From                                         To

Program lesson content
• Layout of each lesson                      Booklet                                      Folder with pull-out fact sheets
• Calcium rich foods                         Emphasize four non-traditional foods         Emphasize fluid milk and four non-traditional foods
• Worksheets                                 Lesson 1: 7 day exercise diary               Lesson 1: 3 day exercise diary
                                             Lesson 4: long contract to make one          Lesson 4: short contract to make one behavior change
                                             behavior change
• Fact sheet: food activities for children   Suggestions to involve children in food      Retain and give added emphasis in lessons and group meeting
                                             activities
• Recipes                                    Six per lesson with conventional             Keep four most popular, but add microwave instructions and
                                             instructions                                 menu suggestions; emphasize testing on weekends
• Reading level                              12th grade                                   8th grade
Program recruitment
• Recruitment of worksites                   Work force must have a high percentage       Target blue collar worksites; work force must have a high
                                             of working women                             percentage of working mothers
• Advertising for target audience            Print material and in-person recruitment     Emphasis on in-person recruitment; clarify target audience in
                                                                                          all recruitment material
Program delivery
• Delivery method                            Group: motivational discussion about         Group: lecture stressing 2–3 main points of lesson followed by
                                             overcoming barriers to suggested             pep talk about suggested behavior followed by group vote on
                                             behaviors ending with group vote on          trying the behavior [try recipes, start walking program, involve
                                             trying the behavior [try recipes, start      kids in kitchen, use foods not supplements] plus food tasting
                                             walking program, involve kids in             with revised recipes
                                             kitchen, use foods not supplements]
                                             plus food tasting
                                             Impersonal: pass out lessons on schedule     Impersonal: pass out lessons on schedule
Evaluation design
• Intervention design                        Comparison of two delivery methods           Comparison of two delivery methods with a control
                                             Pre–post measures, T0 & T1 (KAB and          Pre–post 4 month post measures: T0, T1 & T2 (KAB); T0 &
                                             FFQ)                                         T2 (FFQ)
                                             Response sheet in each lesson; no            Response sheet in each lesson; provide incentive for return
                                             incentive to return
Evaluation instruments
• Impact instrument scales                   KAB questionnaire                            KAB questionnaire
                                             • 20 knowledge questions                     • 14 knowledge questions
                                             KR-20 = 0.80                                 KR-20 = 0.725
                                             • 22 attitude questions                      • 16 attitude questions
                                             α = 0.78                                     α = 0.80
                                             • 16 behavior questions                      • 14 behavior questions
                                             α = 0.75                                     α = 0.80
• Response sheets                            • Try any suggestion for child activity;     For both questions, add the response: no, but plan to
                                             responses: yes, no
                                             • Try any recipe; responses: yes, no
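
The KR-20 coefficients reported for the knowledge scale in Table 1 can be illustrated with a minimal sketch. It assumes dichotomous (0/1) scoring of each knowledge item and uses the population variance of total scores; the response matrix and the names `kr20` and `hypothetical_responses` are illustrative assumptions, not the study's data or code.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for rows of 0/1 item scores (one row per respondent)."""
    n_items = len(responses[0])
    n_resp = len(responses)
    # Proportion of respondents answering each item correctly.
    p = [sum(row[i] for row in responses) / n_resp for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Variance of total scores (population form; an assumption of this sketch).
    total_var = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Hypothetical data: five respondents, four knowledge items (not study data).
hypothetical_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(hypothetical_responses)
```

Dropping an item and recomputing the coefficient is one way less discriminating items could be screened, although the exact procedure used here is not described.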



indicated the group delivery method did not affect attitudes and behaviors, possibly because agents were uncomfortable and did not conduct the meeting according to directions. To rectify this, using Pelz (1959), six agents designed four new 30 min meeting scripts that included two to three main points, retained the food tasting (with new recipes), and eliminated the motivational discussion. A suggested action was still promoted at the end of the meeting and a group vote taken on adoption. Agents were trained to use these scripts and distributed the lessons biweekly.

5.3. Summative evaluation design

   Participants' comments and the poor formative completion rate led us to use a pre (T0), immediate post (T1), and 4 month post (T2) summative evaluation design (Fig. 1). We asked participants to complete the KAB instrument at all time points, but the FFQ only at T0 and T2, a 6 month interval, expecting the T2 measure would detect changes which initial program participants claimed took time to implement.

   To improve our ability to detect changes, we compared three intervention groups (two experimental, group-delivery and impersonal-delivery, and one control). The controls received four correspondence lessons addressing cancer prevention, identical in design to the osteoporosis lessons the experimental groups received. The osteoporosis and cancer lessons differed only in diet–disease context, beneficial nutrients and foods, and emphasis on exercise

in the osteoporosis lessons. In sum, those in group-delivery received the modified group meeting and osteoporosis lessons; those in impersonal-delivery only the osteoporosis lessons, and the controls only the cancer lessons.

   We divided the 24 worksites into five blocks reflecting relative pay scale and type of worker. These were assigned purposefully to the three intervention groups such that there was an equal representation of all five blocks in the two experimental groups while the controls lacked representation from one of two lower pay blocks. Three companies withdrew prior to recruitment.

   All participants completed pre-test instruments at a meeting 1 week prior to receiving the first lesson. The post-test KAB instrument, distributed with the last lesson, was collected at a concluding meeting 2 weeks later. Three months later the final instruments were distributed to all participants by the agent or by mail using a modified Dillman Method.

5.4. Evaluating revised program implementation

   To assess demographics, we included questions in the pre-test instrument of all three intervention groups. To assess attention to, and use of, the lessons, we included a response sheet in each lesson for the two experimental groups only. We added a third possible response (no, but I plan to) to questions about children's activities or recipes to capture behavioral intention.

5.5. Evaluating revised program impact

   As in the formative evaluation, we hypothesized that those in group-delivery would exhibit greater changes than those in impersonal-delivery in attitude and behavior scores, exercise pattern, and calcium intake. In addition, we hypothesized that: (a) both experimental groups would exhibit greater changes in knowledge than controls and (b) those in impersonal-delivery would exhibit greater changes than controls in attitude and behavior scores, exercise pattern, and calcium intake.

   Based on formative participant comments, we shortened the KAB scales to reduce participant burden, hoping to improve completion. Using the formative instruments completed by the 677 women registrants for stage one (mean age 43.34 ± 11.58), we used internal consistency testing to eliminate less discriminating items, producing the scales in Table 1. Items retained in the KAB instrument (76% of the original) represented all content areas in the formative, improving αs for two scales. However, no new questions were added. Neither the question about exercise regularity nor the FFQ was changed for the summative evaluation.

5.6. Summative evaluation data analyses

   In contrast to the formative, implementation data of those completing all four response sheets were used, providing a more accurate assessment of the responses of those completing the program.

   Impact analysis methods were similar to those used in the formative analyses with these modifications: (a) we used only data of those completing all three KAB or FFQ instruments; (b) we allowed up to two missing answers on the knowledge scale; (c) we tested the significance of continuous and ordinal data among the three delivery groups at three time points (T0, T1, T2) and (d) age served as the covariate for ANOVA and ANCOVA. We determined statistically significant differences among values at time points using pair-wise tests of differences between least-squares means. A Bonferroni adjustment was used to control the overall error rate. Significance was assumed at p ≤ 0.05.

   Finally, we compared categorical and continuous demographic characteristics (mean age and education) between the formative and the summative evaluation completers using χ² analysis and non-directional t-tests.

6. Summative evaluation findings and comparison with the formative

   First, we examine the implementation findings. Then we examine impacts over time comparing the results to the control, looking at differences between the two delivery methods. In each instance we compare the summative findings with those of the formative.

6.1. Program implementation

6.1.1. Goal (a): Target audience
   Completion rates were better and participant demographics were closer to those desired in the summative compared to the formative. In the summative, 70% of initial registrants completed all three KAB measures. Almost 90% completed the KAB instruments at T0 and T1, in contrast to 56% in the formative. Eighty percent completed both FFQ instruments in the summative compared to 51% in the formative. Table 2 lists the demographics of completers in both evaluations, finding them similar in family income, race, marital status, awareness of relatives with osteoporosis, initial exercise pattern, and calcium intake per 1000 calories. Those completing the summative evaluation, however, were significantly more likely than those in the formative to be younger, have only a high school education, work full time, and have at least one child at home.

   In the summative, the two experimental groups and controls differed significantly in only age (data not shown). Those in group-delivery had a mean age of 39.1 ± 9.95 compared to 40.6 ± 10.10 in impersonal-delivery and 40.9 ± 9.61 in the control.

6.1.2. Goal (b): Participant's attention to, and use of, lessons
   In the summative evaluation, 78% of group-delivery and 63% of impersonal-delivery registrants returned all four

Table 2
Demographic characteristics of those completing each evaluation phase. SD = standard deviation; yr. = years

Variable                                              Formative                                    Summative

Mean age ± SD a                                       N = 275, 43.02 ± 11.12                       N = 247, 40.15 ± 9.88
Family income                                         N = 255                                      N = 232
≤ 10–19.9000                                          41 (16.1%)                                   45 (19.4%)
20–34.9000                                            92 (36.1%)                                   83 (35.8%)
35–50,000 +                                           122 (47.8%)                                  104 (44.8%)
Employment status a                                   N = 272                                      N = 245
% full time                                           235 (86.4%)                                  232 (94.7%)
Mean educational level ± SD (yr.) b                   N = 268, 13.40 ± 1.97                        N = 246, 12.70 ± 1.54
Race                                                  N = 272                                      N = 246
% white                                               260 (95.6%)                                  239 (97.2%)
Marital status                                        N = 272                                      N = 246
Married                                               200 (73.5%)                                  175 (71.1%)
Single                                                40 (14.7%)                                   27 (11.0%)
Other                                                 32 (11.7%)                                   44 (17.9%)
Percent with at least one child at home b             N = 274, 129 (47.1%)                         N = 247, 163 (66.0%)
Relatives with osteoporosis                           N = 246                                      N = 247
Yes                                                   49 (19.9%)                                   35 (14.2%)
No                                                    167 (67.9%)                                  166 (67.2%)
Don't know                                            30 (12.2%)                                   46 (18.6%)
Exercise regularly in past 6 months?                  N = 269                                      N = 246
Yes                                                   94 (34.9%)                                   90 (36.6%)
No                                                    175 (65.1%)                                  156 (63.4%)
                                                      N = 248                                      N = 244
Calories (mean ± SD)                                  1623.8 ± 584.4                               1798.0 ± 711.2
Calcium in mg                                         805.4 ± 415.5                                895.3 ± 549.2
Calcium in mg/1000 calories                           497.1 ± 178.9                                490.2 ± 206.3

 a p < 0.01.
 b p < 0.001.
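
The significance markers in Table 2 can be roughly checked from the summary statistics alone. The sketch below is an illustration under stated assumptions, not the analysis actually run: it uses an unequal-variance (Welch) t statistic for mean age and an uncorrected Pearson χ² statistic for the 2 × 2 table of children at home, and the function names `welch_t` and `chi_square_2x2` are hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary data (unequal variances assumed)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Mean age ± SD from Table 2: formative vs summative completers.
t_age = welch_t(43.02, 11.12, 275, 40.15, 9.88, 247)

# At least one child at home (yes, no) from Table 2: formative, then summative.
chi2_child = chi_square_2x2(129, 274 - 129, 163, 247 - 163)

print(f"t (age) = {t_age:.2f}, chi-square (child at home) = {chi2_child:.2f}")
```

Both statistics exceed the usual two-sided critical values (about 2.58 for p < 0.01 with large samples, and 10.83 for p < 0.001 with one degree of freedom), which agrees with the footnotes attached to these rows.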



response sheets, providing the sample for analysis. However, formative and summative return rates are not comparable because we did not restrict the formative sample to those completing all four.

   In both evaluations, we asked participants if they read all, parts of, skimmed, or did not read each lesson. Similar to the formative, those reading the whole lesson declined to 65–70% by lesson four. Unlike the formative, where there was no difference between delivery groups in percent reading the lessons, in the summative significantly more in group-delivery skimmed and fewer read lessons one and two than in impersonal-delivery (Fig. 2).

   Completion rates of the worksheets did not differ between evaluations with one exception. More completed worksheet two, a revised exercise diary, in the summative than in the formative (40 vs 28%, respectively).

   In both evaluations, respondents were asked how easy to read and how useful each lesson was. Respondents in both evaluations provided nearly identical ratings of ease of reading regardless of delivery method, suggesting the lower reading level of the summative materials made it easier for the less educated participants. Respondents in both evaluations provided nearly identical ratings of perceived usefulness for lessons one and two; however, in the summative, group-delivery participants rated lessons three and four, highlighting canned salmon and tofu respectively, significantly more useful than did impersonal-delivery participants, whereas in the formative, the ratings were identical.

   Questions about materials use revealed that the summative results differed from the formative, regardless of the delivery method, in that:

• 3–10% tried a recipe (data not shown) compared to 10–37% in the formative. Yet, in the summative, two thirds or more in both delivery groups indicated they planned to try a recipe, an option not available in the formative.
• 11–21% involved children in food preparation compared to 2–11% in the formative. Additionally, in contrast to the formative, the summative exposed differences among delivery groups, in that the percent sharing a recipe with friends was significantly greater in group-delivery than impersonal-delivery for lessons two (20 and 6% respectively) and four (24 and 10%) while no significant differences between delivery methods were evident in the formative (Fig. 3). Lessons two and four featured less familiar calcium-rich foods.
• Consistently more in group-delivery shared other lesson materials with friends than in impersonal-delivery (16–22% vs 9–11%). This distinction was not seen in the formative where about 15% in both delivery methods shared materials (data not shown).

6.2. Program impact

6.2.1. Change in knowledge
   As expected, our hypothesis that both group-delivery and impersonal-delivery of osteoporosis lessons would lead to greater knowledge gain than in controls was supported (Fig. 4). Indeed, the gain in knowledge in group-delivery was significantly greater than that in impersonal-delivery and both were significantly greater than that of the controls at both T1 and T2. This was not seen in the formative evaluation.

6.2.2. Change in attitude
   Our hypothesis that those in the experimental groups would show greater gains in attitude than controls was not supported in the summative evaluation (Fig. 5). Our hypothesis that gains in attitude would differ between delivery methods was not supported either. The initial mean attitude scores in the summative were significantly higher than those in the formative (3.9–4.0 vs 3.0–3.1, respectively), perhaps due to increased media focus on osteoporosis or to differences in participants, and could have limited our ability to improve attitudes and thus to detect significant changes. We did not detect the influence of any previous worksite educational activities, however.

6.2.3. Change in behavior
   Our hypothesis that experimental groups would show greater gains in behavior than controls was partially supported in the summative (Fig. 6). Gains in behavior for group-delivery were significantly greater than the control at T2. Administering the behavior scale 4 months after the intervention (T2) identified further changes in participant behavior in group-delivery that were not seen in controls, even when controlling for age, a possible explanation. Our hypothesis that those in group-delivery would show greater gains in food-related behavior than those in the impersonal-delivery was not supported in the summative evaluation.

6.2.4. Changes in calcium intake and exercise pattern
   Our hypothesis that those in experimental groups would show greater gains in calcium intake than controls was not supported in the summative. Our hypothesis that those in group-delivery would show greater gains than those in impersonal-delivery was also not supported. The summative delivery methods did not have a greater differential impact than those of the formative.

   Our hypothesis that experimental groups would report greater change in exercise patterns than controls was partially supported (data not shown). In the summative evaluation, the number at T2 who reported that exercising regularly was a new pattern was significantly greater in group-delivery than in controls. Our hypothesis that exercise patterns would differ between delivery methods was not supported. The summative delivery methods did not have a greater impact than those of the formative although the difference was in the right direction.

7. Discussion

   The purpose of this study was to test whether the changes made in a program as a result of a formative evaluation strengthened the implementation and impact of the program in the summative. We found that implementation improved but only certain measures of impact improved enough to distinguish the effects of each delivery method. As a result we suggest the following for evaluators to consider in designing a formative evaluation and in attempting to assess its effectiveness.

7.1. Improving the design of a formative evaluation

7.1.1. Interpreting (focus group) feedback from participants
   As a result of participants' comments in the formative focus groups, we extensively altered the lessons for the summative and this did lead to greater use of a formerly underused worksheet, continued ease of reading with participants with lower educational levels, and more involvement of children in food preparation.

   Focus group participants wanted less motivational discussion and more lecture in the meetings. In response, we increased lecture time in the group-delivery method. The influence of the group-delivery method was underscored because group-delivery participants were more likely to skim than to read the lessons. Even with less reliance on reading, group-delivery participants found two lessons significantly more useful and were more likely to share lesson materials with others than those in impersonal-delivery. The agent presentation in group-delivery was clearly critical to sell the lessons. In this case, following participant recommendations appeared to improve impact on knowledge gained, but not in most areas of behavior change.

   When the focus group participants indicated they did not like the group-delivery motivational discussion–decision session, they also indicated a preferred alternative. Given the extensive literature on assessing participant perspectives (Basch, 1987), we assumed we needed to alter the delivery method to one they liked (especially since the agents echoed this preference) and ultimately emphasized lecture over motivation in the summative. In hindsight, we should have asked the focus group participants how to change the motivational aspects of the session with its emphasis on behavior modification, rather than abandon this based on their negative feedback.

   A researcher using focus groups should not assume that, if participants express dislike of a particular delivery method and suggest another, the original method should be dropped without considering the downstream effects. Be prepared to probe to learn why it was disliked and how it might be modified, especially if participants are suggesting a more passive path to learning. By making the assumption

participants were right, we limited focus group inquiry and            to plan the summative. In hindsight, we should have taken a
the type of information collected for program planning in              proactive stance once these unanticipated barriers surfaced
the summative.                                                         in the initial focus group, and posed questions to later focus
   We revised the recipes, believing these could be used as            groups based on the comments offered in previous ones.
the context to show participants how to use calcium-rich
foods and as a device to facilitate behavior change. In the            7.1.3. Interpreting feedback from program instructors
focus groups, we listened openly to complaints about the                  As a result of the formative, we allowed instructors to
recipes and asked participants how to improve them. Some               change the presentation in the group delivery method,
participants indicated they came to the program for the                believing they would have greater ownership and thus
recipes but disliked those provided. Despite these partici-            impact, if they designed a presentation with which they
pant-guided revisions, reported use of recipes did not                 were more comfortable. The remedy the instructors devel-
improve in the summative, although a large number indi-                oped, a lecture rather than a motivational discussion, led to
cated they planned to.                                                 an emphasis on knowledge rather than on behavior change
   In the formative, we assumed that when a component of               in the presentation. This may partially explain why the two
the program i.e., the recipes, was poorly received by                  delivery methods did not differ significantly in final
participants that this merely needed improvement and thus we           behavior scale scores. And it may also explain the significantly
specifically asked participants for suggestions. In hindsight,         greater gain in knowledge in the group- than the
the question we should have asked first, given the                     impersonal-delivery.
complaints, was one that tested our assumption that partici-              The formative evaluation provided feedback about the
pants valued recipes, i.e. should recipes be included in the           presentation. We assumed that when a component of the
lessons at all? And, if the answer was no, we should have              program i.e., the presentation, was unacceptable to instruc-
been ready to discuss with focus group participants, alter-            tors that it should be changed and we asked for suggestions.
native devices to motivate behavior change.                            The acceptable change did not lead to greater impact.
   Researchers designing formative evaluations need to be              Although instructor suggestions carry great face validity,
alert for such a methodological inconsistency: why did                 researchers need to be wary because instructors may provide
participants give us suggestions to improve the recipes                suggestions that shift the aims of the program. In hindsight,
when, in fact, many had not used these (10–30% in the                  what we should have done was to propose to the instructors,
formative). This might be explained by the inclination of              alternative presentation methods that retained an emphasis
people to provide answers to questions when they are asked.            on behavior change. If none were acceptable, we should
People have a tendency to tell more than they can know                 have queried our assumption that agents were the appropri-
(Nisbett & Wilson, 1977).                                              ate instructors for the group presentation.
   In summary, researchers using focus groups must be
prepared to probe assertions by participants that some
component of a program is unsatisfactory. Probing should               7.1.4. Incorporating a control group
investigate both how that component might be improved                     We included impact measures in the formative to gain a
and retained as well as what might be substituted and                  quantitative estimate of the effects of the initial program but
why. In particular, asking a question about the fundamental            we did not include a control group because we expected the
usefulness of some component of a program in a formative               contrast between group- and impersonal-delivery to be
evaluation may not be easy for researchers as they may have            robust. Due to instructor dif®culties with the stage one
considerable ownership and resources invested in that                  group-delivery, this contrast did not materialize. Although
component. However, when faced with negative feedback                  we had evidence that participants were learning from the
about a component of the initial program, researchers                  print materials in both delivery methods, we could not
should investigate both options to insure suf®cient data for           demonstrate that the change seen was better than with no
an informed decision about summative activities.                       intervention. Hence we suggest that evaluators include a
                                                                       control group when testing the impact of a new program
7.1.2. Assessing barriers to changing behaviors                        offered via different delivery methods in a formative
   Because participants came to this program on calcium-               evaluation.
rich foods, we assumed that they would be open to change
and that testing the suggested foods at family meals would             7.1.5. Watching for serendipitous effects of formative
be acceptable. Some focus group participants indicated it              evaluation
was dif®cult to introduce these foods because of family                   Including impact measures lead to unexpected participant
member aversion to change. On hearing this, we assumed                 feedback about the instruments and their mode of adminis-
that altered recipes would be suf®cient to overcome opposi-            tration. This feedback proved invaluable in improving
tion and explored this in all focus groups. By not making a            program implementation in the summative, underscoring
conscious effort to uncover social barriers to changing food           the usefulness of impact measures in formative evaluations.
choices for families, we limited the information we gained             In hindsight, we recommend including these measures in the
formative to assess impact as well as to obtain feedback about the instruments and their implementation.

7.2. Assessing the effectiveness of the formative evaluation
   The need exists to demonstrate the subsequent effects of formative evaluation in order to improve formative evaluation design. Our study provides some guidance.

7.2.1. Altering the evaluation design
   As a result of the formative, we added a control group and a 3-month post-intervention measurement in the summative. Without either of these, we would not have been able to observe some significant differences between the two delivery methods in the summative. Three months after the intervention, group delivery produced significantly greater changes in behavior scores and exercise pattern than seen in the controls while the impersonal-delivery method did not. This design change also revealed that the group method led to significantly greater knowledge gain than the impersonal method and both gains were greater than those of controls. Clearly our target audience was not likely to adopt behavior changes based on receiving just the printed materials.
   However beneficial these changes in evaluation design were in illuminating important findings about the impact of the final program, implementing them only in stage two prevented a rigorous comparison between formative and summative evaluations which might have allowed us to see more clearly the effect of the formative evaluation. Lack of the T2 measure in the formative meant we could not be sure what effect the initial program had 3 months later and lack of a control group meant we could not fully interpret the formative impact data. In future, we recommend that researchers who want to assess the effectiveness of a formative evaluation make the design elements of the formative and summative steps parallel. These design details may seem more appropriate for a summative evaluation but they are necessary to see the effects of the formative.

7.2.2. Altering the evaluation instruments
   As a result of the formative, we shortened the KAB questionnaire somewhat for the summative to address participant complaints. We believe this contributed to more participants completing all the evaluation instruments in stage two. Thus implementation improved in the summative. However, this improvement carried a price. Subsequently, we were not able to conduct the most rigorous comparison of the formative KAB results to those of the summative.

8. Conclusions
   The literature on formative evaluation focuses on its conceptual framework, its methodology and use. Permeating this work is a consensus that a program will be strengthened as a result of a formative evaluation although little empirical evidence exists in the literature to demonstrate the subsequent effects of formative evaluation on a program. This study begins to fill that gap.
   This study demonstrates that formative evaluation can strengthen the implementation and impact of an educational intervention designed to compare the impact of two program delivery methods. The modifications made as a result of the formative appear to have significantly improved knowledge gained but resulted in only modest improvements in behaviors in the final program.
   Our retrospective analysis of our experience supports the inclusion of a control group and impact measures at several time points in a formative evaluation of a program implemented in a new environment for agency personnel. The findings also suggest that when researchers are faced with negative feedback about the components of the program in a formative evaluation, they need to exercise care in interpreting feedback and in revising those components.
   In addition, gathering evidence of the subsequent effect of using data from formative evaluations is critical. Otherwise we cannot begin to examine whether the methods and processes we take for granted in the formative evaluation are valid and appropriate. However, evaluators must carefully plan the evaluations, making sure that instruments and evaluation design are parallel in order to carry out these comparisons. We offer our findings as a stimulus to consider such studies.

References
Baggaley, J. (1986). Formative evaluation of educational television. Canadian Journal of Educational Communication, 15 (1), 29–43.
Baghdadi, A. A. (1981). A comparison between two formative evaluation methods. Dissertation Abstracts International, 41 (8), 3387A.
Baker, E. L., & Alkin, M. C. (1973). Formative evaluation of instructional development. AV Communication Review, 21 (4), 389–418.
Basch, C. E. (1987). Focus group interview: An underutilized research technique for improving theory and practice in health education. Health Education Quarterly, 14 (4), 411–448.
Bertrand, J. (1978). Communications pretesting (Media Monograph No. Six). Chicago: University of Chicago, Community and Family Study Center.
Brown, J. L., & Griebler, R. (1993). Reliability of a short and long version of the Block food frequency form for assessing changes in calcium intake. Journal of the American Dietetic Association, 93 (7), 784–789.
Cambre, M. (1981). Historical overview of formative evaluation of instructional media products. Educational Communication & Technology Journal, 29 (1), 3–25.
Cardinal, B. J., & Sachs, M. L. (1995). Prospective analysis of stage-of-exercise movement following mail-delivered, self-instructional exercise packets. American Journal of Health Promotion, 9 (6), 430–432.
Chambers, F. (1994). Removing confusion about formative and summative evaluation: Purpose versus time. Evaluation and Program Planning, 17 (1), 9–12.
Chen, H. T. (1996). A comprehensive typology for program evaluation. Evaluation Practice, 17 (2), 121–130.
Chen, C. H., & Brown, S. W. (1994). The impact of feedback during interactive video instruction. International Journal of Instructional Media, 21 (3), 191–197.
Dehar, M. A., Casswell, S., & Duignan, P. (1993). Formative and process evaluation of health promotion and disease prevention programs. Evaluation Review Journal, 17 (2), 204–220.
Dick, W. (1980). Formative evaluation in instructional development. Journal of Instructional Development, 3 (3), 2–6.
Dick, W., & Carey, L. (1985). The systematic design of instruction (2nd ed.), Glenview, IL: Scott, Foresman.
Dillman, D. A. (1978). Mail and telephone surveys: The total design method, New York: John Wiley and Sons.
Dennis, M. L., Fetterman, D. M., & Sechrest, L. (1994). Integrating qualitative and quantitative evaluation methods in substance abuse research. Evaluation and Program Planning, 17 (4), 419–427.
Fairweather, G., & Tornatzky, L. G. (1977). Experimental methods for social policy research, New York: Pergamon.
Finnegan Jr, J. R., Rooney, B., Viswanath, K., Elmer, P., Graves, K., Baxter, J., Hertog, J., Mullis, R., & Potter, J. (1992). Process evaluation of a home-based program to reduce diet-related cancer risk. The 'WIN at Home Series'. Health Education Quarterly, 19 (2), 233–248.
Fitz-Gibbon, C. T., & Morris, L. L. (1978). How to design a program evaluation, Beverly Hills, CA: Sage Publications.
Flagg, B. N. (1990). Formative evaluation for educational technologies, Hillsdale, NJ: Lawrence Erlbaum Associates.
Flay, B. R. (1986). Efficacy and effectiveness trials (and other phases of research) in the development of health promotion programs. Preventive Medicine, 15, 451–474.
Foshee, V., McLeroy, K. R., Sumner, S. K., & Bibeau, D. L. (1986). Evaluation of worksite weight loss programs: A review of data and issues. Journal of Nutrition Education, 18 (1), S38–S43.
Geis, G. L. (1987). Formative evaluation: Developmental testing and expert review. Performance & Instruction, May/June, 1–8.
Gillespie, A., & Achterberg, C. (1989). Comparison of family interaction patterns related to food and nutrition. Journal of the American Dietetic Association, 89 (4), 509–512.
Glanz, K., & Seewald-Klein, T. (1986). Nutrition at the worksite: An overview. Journal of Nutrition Education, 18 (1), S1–S12.
Glanz, K., Sorensen, G., & Farmer, A. (1996). The health impact of worksite nutrition and cholesterol intervention programs. American Journal of Health Promotion, 10 (6), 453–470.
Hausman, J. A., & Wise, D. A. (1985). Social experimentation, Chicago: The University of Chicago Press.
Hill, M., May, J., Coppolo, D., & Jenkins, P. (1993). Long term effectiveness of a respiratory awareness program for farmers. National Institute for Farm Safety, Inc. NIFS Paper No. 93-3. Columbia, MO. NIFS Summer Meeting, Coeur d'Alene, Idaho.
Health Habits And History Questionnaire: Diet History And Other Risk Factors (1989). Personal computer system packet. Version 2.2. Washington, D.C.: National Cancer Institute, Division of Cancer Prevention and Control, National Institutes of Health.
Houts, S. A. (1988). Lactose intolerance. Food Technology, 42 (3), 110–113.
Iszler, J., Crockett, S., Lytle, L., Elmer, P., Finnegan, J., Luepker, R., & Laing, B. (1995). Formative evaluation for planning a nutrition intervention: Results from focus groups. Journal of Nutrition Education, 27 (3), 127–132.
Jacobs Jr, D. R., Luepker, R. V., Mittelmark, M. B., Folsom, A. R., Pirie, P. L., Mascioli, S. R., Hannan, P. J., Pechacek, T. F., Bracht, N. F., Carlaw, R. W., Kline, F. G., & Blackburn, H. (1986). Community-wide prevention strategies: Evaluation design of the Minnesota heart health program. Journal of Chronic Disease, 39 (10), 775–788.
Janz, N. K., & Becker, M. H. (1984). The health belief model: A decade later. Health Education Quarterly, 11 (1), 1–47.
Johnson, C. C., Osganian, S. K., Budman, S. B., Lytle, L. A., Barrera, E. P., Bonura, S. R., Wu, M. C., & Nader, P. R. (1994). CATCH: Family process evaluation in a multicenter trial. Health Education Quarterly, Supplement, 2, S91–S106.
Kandaswamy, S., Stolovitch, H. D., & Thiagarajan, S. (1976). Learner verification and revision: An experimental comparison of two methods. AV Communication Review, 24 (3), 316–328.
Kaufman, R. (1980). A formative evaluation of formative evaluation: The state of the art concept. Journal of Instructional Development, 3 (3), 1–2.
Kershaw, D., & Fair, J. (1976). The New Jersey income maintenance experiment, New York: Academic Press.
Kishchuk, N., Peters, C., Towers, A. M., Sylvestre, M., Bourgault, C., & Richard, L. (1994). Formative and effectiveness evaluation of a worksite program promoting healthy alcohol consumption. American Journal of Health Promotion, 8 (5), 353–362.
Krippendorff, K. (1980). Content analysis: An introduction to its methodology, Beverly Hills, CA: Sage.
Lenihan, K. (1976). Opening the second gate, Washington, DC: U.S. Government Printing Services.
Lewin, K. (1943). Forces behind food habits and methods of change. In The Problem of Changing Food Habits. National Research Council Bulletin 108. (pp. 35–65). Washington, D.C.: National Academy of Sciences.
Markle, S. M. (1979). Evaluating instructional programs: How much is enough? NSPI Journal, Feb, 22–24.
Markle, S. M. (1989). The ancient history of formative evaluation. Performance and Instruction, Aug, 27–29.
McGraw, S. A., McKinley, S. A., McClements, L., Lasater, T. M., Assaf, A., & Carleton, R. A. (1989). Methods in program evaluation: The process evaluation system of the Pawtucket Heart Health Program. Evaluation Review, 13 (5), 459–483.
McGraw, S. A., Stone, E. J., Osganian, S. K., Elder, J. P., Johnson, C. C., Parcel, G. S., Webber, L. S., & Luepker, R. V. (1994). Design of process evaluation within the child and adolescent trial for cardiovascular health (CATCH). Health Education Quarterly, Supplement, 2, S5–S26.
Montague, W. E., Ellis, J. A., & Wulfeck, W. H. (1983). Instructional quality inventory: A formative evaluation tool for instructional development. Performance and Instruction Journal, 22 (5), 11–14.
Nathenson, M. B., & Henderson, E. S. (1980). Using student feedback to improve learning materials, London: Croom Helm.
National Institutes of Health (1985). Surgeon general's report on nutrition and health. U.S. Department of Health and Human Services, Public Health Service (Chapter 7, pp. 311–343). Washington, DC: U.S. Government Printing Service.
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84 (3), May.
Parrott, R., Steiner, C., & Goldenhar, L. (1996). Georgia's harvesting healthy habits: A formative evaluation. The Journal of Rural Health, 12 (4), 291–300.
Patton, M. Q. (1978). Utilization-focused evaluation, Beverly Hills, CA: Sage.
Patton, M. Q. (1982). Practical evaluation, Beverly Hills, CA: Sage.
Patton, M. Q. (1994). Developmental evaluation. Evaluation Practice, 15 (3), 311–319.
Patton, M. Q. (1996). A world larger than formative and summative. Evaluation Practice, 17 (2), 131–144.
Pelletier, K. R. (1996). A review and analysis of the health and cost-effective outcome studies of comprehensive health promotion and disease prevention programs at the worksite: 1993–1995 update. American Journal of Health Promotion, 10 (5), 380–388.
Pelz, E. B. (1959). Some factors in group decision. In E. E. Maccoby, T. M. Newcomb & E. L. Hartley, Readings in social psychology (3rd ed.) (pp. 212–219). New York: Holt, Rinehart and Winston, Inc.
Peterson, K. A., & Bickman, L. (1988). Program personnel: The missing ingredient in describing the program environment. In K. J. Conrad & C. Roberts-Gray, Evaluating program environments, San Francisco, CA: Jossey-Bass, Inc.
Potter, J. D., Graves, K. L., Finnegan, J. R., Mullis, R. M., Baxter, J. S., Crockett, S., Elmer, P. J., Gloeb, B. D., Hall, N. J., Hertog, J., Pirie, P., Richardson, S. L., Rooney, B., Slavin, J., Snyder, M. P., Splett, P., &
    Viswanath, K. (1990). The cancer and diet intervention project: a community-based intervention to reduce nutrition-related risk of cancer. Health Education Research, 5 (4), 489–503.
Rightwriter (1990). Version 3.1. Sarasota, FL: RightSoft, Inc.
Robins, P. K., Spiegelman, R. G., Weiner, S., & Bell, J. G. (1980). A guaranteed annual income: Evidence from a social experiment, New York: Academic Press.
Rossi, P. H., & Lyall, K. (1976). Reforming public welfare, New York: Russell Sage.
Rossi, P. H., & Freeman, H. E. (1982). Evaluation: A systematic approach (p. 69). Beverly Hills, CA: Sage Publications.
Russell, J. D., & Blake, B. L. (1988). Formative and summative evaluation of instructional products and learners. Educational Technology, 28 (9), 22–28.
SAS Proprietary Software Release 6.09. (1989). Cary, N.C.: SAS Institute, Inc.
Scanlon, E. (1981). Evaluating the effectiveness of distance learning: A case study. In F. Percival & H. Ellington, Aspects of educational technology: Vol. XV: Distance learning and evaluation (pp. 164–171). London: Kogan Page.
Scheirer, M. A. (1994). Designing and using process evaluation. In J. S. Wholey, H. Hatry & K. Newcomer, Handbook of practical program evaluation (pp. 40–68). San Francisco: Jossey-Bass.
Scheirer, M. A., & Rezmovic, E. L. (1983). Measuring the degree of program implementation. Evaluation Review, 7 (5), 599–633.
Schneider, M. L., Ituarte, P., & Stokols, D. (1993). Evaluation of a community bicycle helmet promotion campaign: What works and why. American Journal of Health Promotion, 7 (4), 281–287.
Scriven, M. (1967). The methodology of evaluation. In R. Tyler, R. Gagne & M. Scriven, Perspectives of curriculum evaluation (pp. 39–83). Chicago: Rand McNally.
Seidel, R. E. (1993). Notes from the field in communication for child survival, Washington, DC: USAID.
Stufflebeam, D. L. (1983). The CIPP model for program evaluation. In G. Madaus, M. Scriven & D. Stufflebeam, Evaluation models: Viewpoints on educational and human services evaluation, Boston: Kluwer-Nijhoff.
Tessmer, M. (1993). Planning and conducting formative evaluations, London: Kogan Page.
Thiagarajan, S. (1991). Formative evaluation in performance technology. Performance Improvement Quarterly, 4 (2), 22–34.
Wager, J. C. (1983). One-to-one and small group formative evaluation: An examination of two basic formative evaluation procedures. Performance and Instruction, 22 (5), 5–7.
Walden, O. (1989). The relationship of dietary and supplemental calcium intake to bone loss and osteoporosis. Journal of the American Dietetic Association, 89 (3), 397–400.
Weston, C. B. (1986). Formative evaluation of instructional materials: An overview of approaches. Canadian Journal of Educational Communication, 15 (1), 5–17.
Weston, C. B. (1987). The importance of involving experts and learners in formative evaluation. Canadian Journal of Educational Communications, 16 (1), 45–58.
Wilkinson, T. L., Schuler, R. T., & Skjolaas, C. A. (1993). The effect of safety training and experience of youth tractor operators. National Institute for Farm Safety, Inc. NIFS Paper No. 93-6. Columbia, MO. NIFS Summer Meeting, Coeur d'Alene, Idaho.
Witte, K., Peterson, T. R., Vallabhan, S., Stephenson, M. T., Plugge, C. D., Givens, V. K., Todd, J. D., Bechtold, M. G., Hyde, M. K., & Jarrett, R. (1992/3). Preventing tractor-related injuries and deaths in rural populations: Using a persuasive health message framework in formative evaluation research. International Quarterly of Community Health Education, 13 (3), 219–251.

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 

2

Evaluation and Program Planning 24 (2001) 129–143
www.elsevier.com/locate/evalprogplan
0149-7189/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0149-7189(01)00004-0

Assessing the subsequent effect of a formative evaluation on a program
J. Lynne Brown a,*, Nancy Ellen Kiernan b
a Penn State University, Department of Food Science, 203B Borland, University Park, Pennsylvania, PA, USA
b Penn State University, College of Agricultural Sciences, 401 Agricultural Administration Building, University Park, Pennsylvania, PA 814-863-3439, USA
* Corresponding author. Tel.: +1-814-863-3973; fax: +1-814-863-6132. E-mail address: f9a@psu.edu (J.L. Brown).
Received 30 June 1999; received in revised form 1 September 2000; accepted 31 October 2000

Abstract
The literature on formative evaluation focuses on its conceptual framework, methodology and use. Permeating this work is a consensus that a program will be strengthened as a result of a formative evaluation although little empirical evidence exists in the literature to demonstrate the subsequent effects of a formative evaluation on a program. This study begins to fill that gap. To do this, we outline the initial program and formative evaluation, present key findings of the formative evaluation, describe how these findings influenced the final program and summative evaluation, and then compare the findings to those of the formative. The study demonstrates that formative evaluation can strengthen the implementation and some impacts of a program, i.e. knowledge and some behaviors. The findings also suggest that when researchers are faced with negative feedback about program components in a formative evaluation, they need to exercise care in interpreting and using this feedback. © 2001 Elsevier Science Ltd. All rights reserved.
Keywords: Formative evaluation; Summative evaluation; Impact; Assessing feedback

1. Introduction

Formative evaluation commands a formidable place in the evaluation literature. Highly regarded, the process was used to improve educational films in the 1920's (Cambre, 1981). Academic areas as diverse as agricultural safety (Witte, Peterson, Vallabhan, Stephenson, Plugge, Givens et al., 1992/93) and cardiovascular disease (Jacobs, Luepker, Mittelmark, Folsom, Pirie, Mascioli et al., 1986) draw on the process today, using findings to improve a program; among educators in particular, it is 'almost universally embraced' (Weston, 1986, p. 5). Surprisingly, the subsequent effect of using the findings of formative evaluation has not received systematic attention. This paper addresses that gap.

The literature focuses attention on three aspects of formative evaluation, the first of which is its conceptualization. Over time, researchers clarified the concept. They distinguished it from other forms of evaluation, especially summative, the fundamental difference being the rationale and use of the data (Baker & Alkin, 1973; Markle, 1989; Patton, 1994; Chambers, 1994; Weston, 1986); labeled it formative evaluation (Scriven, 1967) and accepted that designation (Rossi & Freeman, 1982; Patton, 1982; Fitz-Gibbon & Morris, 1978); debated its frequency and timing in the program cycle (Markle, 1979; Thiagarajan, 1991; Russell & Blake, 1988; Chambers, 1994); scrutinized its overlap with process evaluation (Patton, 1982; Stufflebeam, 1983; Scheirer & Rezmovic, 1983; Dehar, Casswell & Duignan, 1993; Scheirer, 1994; Chen, 1996); and expanded its epistemological framework, linking it to developmental programs (Patton, 1996). As the conceptual framework evolved, the perceived value of formative evaluation has only increased.

Second, the literature focuses on methods and design strategies to conduct formative evaluation. That focus appears first in handbooks or articles describing methods and design strategies for either an entire program (Rossi & Freeman, 1982; Patton, 1978; Fitz-Gibbon & Morris, 1978) or a segment of a program such as the materials (Weston, 1986; Bertrand, 1978), instruction (Tessmer, 1993), electronic delivery like television (Baggaley, 1986), or interactive technology (Flagg, 1990; Chen & Brown, 1994). The focus on method and strategies appears second in case studies which illuminate a particular method or strategy tailored to the exigencies of a particular situation such as a community (Jacobs et al., 1986; Johnson, Osganian, Budman, Lytle, Barrera, Bonura et al., 1994; McGraw, Stone, Osganian, Elder, Johnson, Parcel et al., 1994; McGraw, McKinley, McClements, Lasater, Assaf & Carleton, 1989) or worksite (Kishchuk, Peters, Towers, Sylvestre, Bourgault & Richard,
1994). Over time, the focus on methods and strategies illuminated critical decisions needed to design a valid formative evaluation. The decisions include: (1) who should participate: experts (Geis, 1987), learners from the targeted audience (Weston, 1986; Russell & Blake, 1988), learners with different aptitudes (Wager, 1983), instructors representative of those in the field (Weston, 1987; Peterson & Bickman, 1988), or drop outs from a program (Rossi & Freeman, 1982); (2) how many to include and in what form: one or a group (Wager, 1983; Dick, 1980); (3) type of data to collect: qualitative or quantitative (Dennis, Fetterman & Sechrest, 1994; Peterson & Bickman, 1988; Flay, 1986); (4) data collection techniques (Weston, 1986; Tessmer, 1993) and (5) similarity of pilot sessions relative to actual learning situations (Rossi & Freeman, 1982; Weston, 1986). Not surprisingly, the conviction permeating the literature on methods and strategies is that formative evaluation will lead to a stronger, more effective program.

Third, attention in the literature dwells on the immediate use of formative evaluation findings. Academic areas such as nutrition (Cardinal & Sachs, 1995), cancer prevention for agricultural workers (Parrott, Steiner & Goldenhar, 1996), and child health (Seidel, 1993) have evaluated a program in its formative stage. In case studies such as these, researchers hail the evaluation process, describing the immediate effects of the evaluation, i.e., the problems identified and/or changes to be made in a modified version of the program (Potter et al., 1990; Finnegan, Rooney, Viswanath, Elmer, Graves, Baxter et al., 1992; Kishchuk et al., 1994; Iszler, Crockett, Lytle, Elmer, Finnegan, Luepker et al., 1995). These researchers are not consistent when reporting the immediate effects of a formative evaluation. Some do not include data; some do not outline the problems the process identified; and some do not describe the changes they made. What is consistent, however, is the message from these researchers: formative evaluation led them to make changes that should lead to a stronger program.

In summary, much has been written about formative evaluation: its conceptual framework, its methods, and its use. Throughout this literature, there is strong consensus on the value of formative evaluation, some calling its value 'obvious' (Baggaley, 1986, p. 34) and 'no longer questioned' (Chen & Brown, 1994, p. 192). Many educators contend, however, that formative evaluation is not used enough (Flagg, 1990; Kaufman, 1980; Geis, 1987; Foshee, McLeroy, Sumner & Bibeau, 1986). Indeed, some evaluations reflect no previous attempt at formative evaluation (Foshee et al., 1988; Glanz, Sorensen & Farmer, 1996; Pelletier, 1996; Schneider, Ituarte & Stokols, 1993; Wilkinson, Schuler & Skjolaas, 1993; Hill, May, Coppolo & Jenkins, 1993). Other formative evaluations are limited: using few people, non-representative samples, or selected materials (Tessmer, 1993).

Part of the explanation for limited use of formative evaluation may lie in the lack of empirical evidence in the literature demonstrating its subsequent effect. Few researchers take the next step and demonstrate this by comparing data from the initial program with data from the final program to show whether the changes resulted in an improvement in program implementation and impacts. Reviewing over 60 years of work in formative evaluation, scholars (Flagg, 1990; Dick, 1980; Dick & Carey, 1985; Weston, 1986) found that the 'evidence is supportive but meager' (Geis, 1987, p. 6). Furthermore, most evidence (Baker & Alkin, 1973; Baghdadi, 1981; Kandaswamy, Stolovitch & Thiagarajan, 1976; Nathenson & Henderson, 1980; Scanlon, 1981; Wager, 1983; Montague, Ellis & Wulfeck, 1983; Cambre, 1981) relates to only a component of a program, the educational materials, not to an entire program. Some landmark studies examine the impact of an entire program in its formative stage such as the use of negative income tax strategies as a substitute for welfare (Kershaw & Fair, 1976; Rossi & Lyall, 1976; Robins, Spiegelman, Weiner & Bell, 1980; Hausman & Wise, 1985) and the Department of Labor's LIFE effort to decrease arrest rates among released felons with increased employment (Lenihan, 1976), but only a few, such as those reported by Fairweather and Tornatzky (1977), actually document that the changes made as a result of a formative evaluation resulted in a change in the impact of the final program. Given that researchers hail formative evaluation as important, the lack of evidence about its subsequent effect points to a surprising gap in the literature.

The purpose of this paper is to examine the subsequent effect of a formative evaluation to see whether the changes resulting from it improved the final program sufficiently to distinguish between the impact of two program delivery methods. To do this, we: (a) outline the initial program and its formative evaluation, (b) present the key findings of the formative evaluation, (c) describe how the formative findings influenced the design of the revised program and its evaluation, and then (d) compare the results of the initial and revised program, something rarely done in the formative evaluation literature. In doing this, we provide a comprehensive look at the implementation of both a formative and summative evaluation. In conclusion, we identify issues that evaluators wishing to improve the design of a formative evaluation need to consider. In addition, we identify problems we encountered in attempting to assess the effectiveness of a formative evaluation.

2. Stage one: The initial program

2.1. Background

Combining federal, state and local funding, land grant colleges support educational health promotion programs for individuals and communities offered by county-based co-operative extension family living agents. Prior to our study, agents reported poor attendance at evening and weekend meetings but rarely offered daytime programs at
worksites. Instead, agents used correspondence lessons to reach people unwilling to attend meetings. However, group interaction is more likely to facilitate changes in behavior (Glanz & Seewald-Klein, 1986), possibly through the support offered by sharing experiences. While it was easier for agents to mail correspondence lessons (an impersonal delivery method), we postulated that using a group meeting to motivate the use of each lesson in a series before it was distributed was more likely to promote change in food/health behaviors. To test this hypothesis, we designed a two-stage impact study to evaluate two methods of delivering lessons biweekly at worksites: distribution alone vs distribution in conjunction with a half-hour group meeting.

Agents delivering the program would work with new delivery sites, new clientele, new content, and new delivery methods. Because of this unfamiliarity and because this program had to fit into worksite environments with differing work-shift patterns, lunch patterns, physical settings, personnel departments, and required advertising, we conducted a formative evaluation of the initial program impact and its implementation. We included participants and instructors in the evaluation, using a variety of qualitative and quantitative methods.

2.2. Target health problem and audience

Four print lessons in the initial program addressed prevention of osteoporosis, a recently proclaimed public health problem (National Institutes of Health, 1985) most often affecting white, elderly women. Prevention requires lifelong adequate calcium intake and exercise. According to NHANES II data, 75% of American women fail to consume the recommended daily amount of calcium (Walden, 1989).

We targeted working women, ages 21–45, with at least one child at home because these women are building bone mass which peaks at age 35–45. Mothers can also provide nutrition activities (Gillespie & Achterberg, 1989) that teach children how to protect bone health.

2.3. Lesson content and organization

The lessons, based on the Health Belief Model (Janz & Becker, 1984), encouraged participants to eat calcium-rich foods and to walk for exercise by focusing on personal susceptibility, disease severity, benefits of prevention, and overcoming barriers to health protecting actions. Because many in the target audience disliked drinking fluid milk (based on an initial survey) or could have reactions to milk (Houts, 1988), each lesson introduced a different calcium-rich food (non-fat dry milk, plain yogurt, canned salmon or tofu) and menu ideas. Each also included scientific background on the lifestyle–osteoporosis link, a self assessment worksheet, a featured food fact sheet, suggestions for involving children in food preparation, and calcium-rich recipes. Rightwriter (1990) analysis indicated a 12th grade reading level.

2.4. Delivery method

We tested two bi-weekly delivery methods for the lessons. Group-delivery (G), based on the discussion–decision methods of Lewin (1943), was a 30 min motivational session in which participants discussed adopting a behavior suggested in each lesson (i.e. trying recipes, walking for exercise, involving children in food preparation, and eating calcium-rich foods, not supplements). Participants could taste a recipe using the featured calcium-rich food and vote by raised hands on their willingness to individually adopt the suggested behavior. An agent served as facilitator/motivator and distributed a lesson at the end of each session. The other method, impersonal-delivery (I), consisted of either the agent or a company contact person simply distributing the required lesson to participants according to schedule.

2.5. Staff training

To ensure consistency, all agents received guidelines for recruitment of worksites and participants, a program content review, a printed program delivery script and instructions for instrument administration.

2.6. Recruitment

Seven agents representing three rural and four urban/suburban counties interviewed personnel managers at businesses within their county and recruited 48 worksites where women comprised over 30% of the work force. Once worksites were randomly assigned to a delivery method (G or I), agents systematically recruited participants within a month.

3. Stage one: Formative evaluation

We delineate the data collection methods, the evaluation design, and analyses.

3.1. Evaluating program implementation

Our goals were to assess: (a) participant characteristics relative to the prescribed target audience; (b) participant attention to, and use of, the lessons; (c) participant reaction to advertising, lesson content and structure, delivery method, and time between lessons; and (d) agent reaction to delivering the program and its content.

To address goal (a), we included demographic questions in the first questionnaire administered. To address (b), we designed a response sheet for each lesson which asked participants how completely they had read the lesson, how easy it was to read, and how useful it was, and whether they completed the worksheet, tried suggestions or recipes, and shared lesson materials. To address (c), we developed focus group questions for participants, and, for (d), questions for agents attending a debriefing.

We conducted four focus groups among participants
within a month of the intervention, two each for group-delivery and impersonal-delivery. Each focus group derived from a purposeful sample of thirty participants composed of two-thirds completers and one-third non-completers. The agent telephoned all selected and those who chose to attend became the sample. We held the debriefing with all agents within a month also. Data consisted of tape recordings and written notes.

3.2. Evaluating program impact

Our goal was to examine changes in knowledge, attitudes, and behaviors (KAB) needed to prevent osteoporosis using appropriate scales, changes in calcium intake using a food frequency questionnaire (FFQ), and changes in exercise pattern using specific questions. Our hypothesis was that persons in group-delivery would exhibit greater changes in attitude and behavior scores, calcium intake, and exercise pattern than those in impersonal-delivery. We anticipated similar changes in knowledge for both delivery methods because the same lessons were used; the meeting focused primarily on motivation.

To assess changes, we developed the KAB scales using nutrition expert and target audience reviews and internal consistency testing with 65 of our target audience prior to use in Stage One. The final formative instrument contained a 20 item knowledge scale (KR-20 = 0.80); a 22 item attitude scale (α = 0.78); and a 16 item behavior scale (α = 0.75), all addressing concepts in the lessons.

We used a modified version of the Block food frequency questionnaire (Brown & Griebler, 1993) that included the four foods featured in the osteoporosis lessons to assess calorie and calcium intake. To examine exercise behavior directly, we asked participants if they exercised regularly within the last several months each time they completed the KAB scales; after the lessons we also asked if this exercise pattern was new, and, if new, if it was due to the lessons.

3.3. Formative evaluation design

We employed a pre-test (T0), 8 week intervention, post-test (T1) design to compare group-delivery (G) and impersonal-delivery (I) (Fig. 1). We arranged the 48 worksites in four blocks reflecting business types (white collar, educational/municipal, health care and blue collar) and assigned them randomly to either delivery. Although eleven worksites withdrew prior to the intervention, primarily due to company changes, the proportion of business types in each delivery method was unaffected.

Fig. 1. Model of formative and summative evaluation design.

Participants completed pre KAB and FFQ instruments at a meeting 1 week prior to receiving lesson one; the last lesson included post KAB and FFQ instruments, which participants returned at an optional post program meeting 1 week later or by mail according to Dillman (1978). Question order in each KAB scale differed at each measurement to diminish recall bias. Each lesson included the response sheet that participants returned prior to receiving the next lesson.

3.4. Formative evaluation data analyses

We used χ² analysis to compare categorical and ANOVA to compare continuous implementation data, between lessons and between delivery methods, from response sheets returned. We examined tape recording transcripts and focus group and debriefing notes for repeated themes (Krippendorff, 1980).

Data from those completing both KAB instruments were analyzed and scale scores determined allowing only one missing value. Each individual's knowledge score was the sum of correct answers. Each attitude and behavior statement required a response on a 5-point Likert scale. Each individual's attitude and behavior scale score was the mean of all their responses to those questions.

Data from those completing both FFQs were coded, entered, and analyzed using FFQ software (Health Habits and History Questionnaire, 1989). Because nutrient value distributions were not normal, the data were transformed using log_e prior to statistical analysis (SAS Proprietary software, 1989).

Non-directional t-tests for independent samples were used to test significance of continuous and ordinal data (mean age, education, KAB scores and calcium) between delivery methods (G vs I) at each time point (T0, T1). Categorical demographic and exercise data were compared using χ² analysis. ANOVA for repeated measures and ANCOVA
were used to test significance of mean KAB scores and calcium intake of matched individuals across time. The covariates of mean income and employment status were used in testing changes in KAB scores. Significance was assumed at ≤0.05.

4. Key findings from the formative evaluation of the initial program

We outline program implementation findings for goals (a)–(d) and impact findings.

4.1. Program implementation

4.1.1. Goal (a): Target audience

Ultimately, 275/489 (56%) women completed post questionnaires that met analysis criteria. Completers and non-completers did not differ in demographic characteristics (data not shown). When comparing delivery methods, completers differed significantly only in two factors: percent employed full time (91.6% in G vs 81.6% in I) and percent of families with incomes over $35,000 (57.7% vs 42.4%).

4.1.2. Goal (b): Participant's attention to and use of lessons

Response sheets returned dropped over the four lessons; method G dropped from 81% of initial registrants for lesson one to 41% for lesson four and method I from 95 to 67%. Otherwise, the two delivery methods did not differ significantly in attention to, and use of, lessons.

Respondents that reported reading all lesson materials fell from 85% for lesson one to 62–64% for lesson four (Fig. 2). Regardless of delivery method, respondents rated all lessons, on a scale of 1–5, fairly easy to read (1.4 ± 0.6 where 1 = easy to read), and useful (hovering at 2.1 ± 0.8 where 1 is very useful). About 70% reported completing worksheet one, 28% worksheet two, 80% worksheet three, and 50% worksheet four.

Fig. 2. Percent reading all the lesson in each evaluation.

The response sheets assessed whether participants tried recipes, involved children in food preparation, and shared lesson materials and revealed no significant differences between delivery methods. Although 37% of method G tried lesson one recipes compared to 20% in method I, thereafter, percentages were lower and similar between delivery groups. Those involving children varied from 11% for lesson one to 2% for lesson four and those sharing recipes with friends between 16 and 22% (Fig. 3).

4.1.3. Goal (c): Participant reactions

Fifty women (27 from G and 23 from I) participated in the focus groups. Participants from both delivery methods were more likely to remember personal contacts and paycheck flyers than other advertisements. They recommended changes in lesson format, recipes, worksheets, and calcium-rich foods featured. Many found the lesson booklet cumbersome, the menus unhelpful, the worksheets in two lessons long, and some featured foods difficult to adopt. Some participants wanted the emphasis on drinking milk. They suggested including menus and microwave instructions in the recipes. With some exceptions, women reported it was difficult to involve children in food preparation or that their children were grown.

However, some feedback was unique to a delivery method. Group-delivery participants wanted more lecture, more question and answer time, and less motivational discussion. They could not recall voting to try a behavior (critical to the discussion–decision method) but liked the food tasting activity. Impersonal-delivery participants also wanted question and answer time and reminders to complete each lesson, but disagreed about the period between lessons.

Participants from both delivery methods revealed that they had limited time to try recipes and had not yet put learned health-promoting actions into practice. They
disliked the long KAB questionnaire and completing the second FFQ, only 2 months after the initial one, when they had not yet initiated changes in eating habits.

Fig. 3. Percent reporting sharing lesson recipe with friends.

4.1.4. Goal (d): Agent reactions

All agents participated in both delivery methods. They reported that the advertising materials did not clearly define the target audience and that in-person appeals and an enthusiastic site contact improved recruitment. Despite managing shifts, they preferred the interaction and participant interest in group-delivery and the opportunity for daytime programs. But agents using group-delivery resisted being motivators and asked to provide lectures, perceiving that participants wanted prescriptive advice. They felt the recipes needed improvement. Agents echoed the lack of emphasis on drinking milk, a political issue in counties with a dairy industry.

4.2. Program impact

Fig. 4. Change in knowledge score over time.

As hoped, changes over time for KAB were significant. As expected, the hypothesis that changes in knowledge would not differ by delivery method was supported. Unexpectedly, the hypothesis that those in group-delivery would show greater gains in attitude, behavior, calcium intake, and exercise pattern than those in impersonal-delivery was not supported. For the KAB measures, time by delivery method interaction was not
significant (Figs. 4–6). Group delivery did not affect knowledge, attitude, or behavior scores any more than impersonal delivery. Changes in calcium intake and exercise pattern were not significantly different between delivery groups (data not shown).

Fig. 5. Change in attitude score over time.

Fig. 6. Change in behavior score over time.

5. Stage two: The revised program and summative evaluation

The changes made in stage two in the program content, recruitment, delivery method, and evaluation design and instruments for the summative evaluation are shown in Table 1. Almost all reflect key findings of the stage one formative evaluation.

5.1. Revised program lesson content and recruitment

We changed the lesson content to address the concerns outlined above. We asked six county agents, representing three rural and three suburban counties, to recruit four worksites each, a total of 24. We clarified the target audience in advertising materials and directed agents toward in-person recruiting. We lowered the lessons' reading level to accommodate participants from more blue collar worksites where mothers, ages 21–45, were a significant part of the work force, to ensure enrolling more working women with young children.

5.2. Revised program delivery method

The initial program implementation and impact data
indicated the group delivery method did not affect attitudes and behaviors, possibly because agents were uncomfortable and did not conduct the meeting according to directions. To rectify this, using Pelz (1959), six agents designed four new 30 min meeting scripts that included two to three main points, retained the food tasting (with new recipes), and eliminated the motivational discussion. A suggested action was still promoted at the end of the meeting and a group vote taken on adoption. Agents were trained to use these scripts and distributed the lessons biweekly.

Table 1
Major changes in educational program, evaluation design and evaluation instruments prompted by results of the formative evaluation

Type of change | From | To

Program lesson content
Layout of each lesson | Booklet | Folder with pull-out fact sheets
Calcium rich foods | Emphasize four non-traditional foods | Emphasize fluid milk and four non-traditional foods
Worksheets | Lesson 1: 7 day exercise diary | Lesson 1: 3 day exercise diary
 | Lesson 4: long contract to make one behavior change | Lesson 4: short contract to make one behavior change
Fact sheet: food activities for children | Suggestions to involve children in food activities | Retain and give added emphasis in lessons and group meeting
Recipes | Six per lesson with conventional instructions | Keep four most popular, but add microwave instructions and menu suggestions; emphasize testing on weekends
Reading level | 12th grade | 8th grade

Program recruitment
Recruitment of worksites | Work force must have a high percentage of working women | Target blue collar worksites; work force must have a high percentage of working mothers
Advertising for target audience | Print material and in-person recruitment | Emphasis on in-person recruitment; clarify target audience in all recruitment material

Program delivery
Delivery method | Group: motivational discussion about overcoming barriers to suggested behaviors ending with group vote on trying the behavior [try recipes, start walking program, involve kids in kitchen, use foods not supplements] plus food tasting | Group: lecture stressing 2–3 main points of lesson followed by pep talk about suggested behavior followed by group vote on trying the behavior [try recipes, start walking program, involve kids in kitchen, use foods not supplements] plus food tasting with revised recipes
 | Impersonal: pass out lessons on schedule | Impersonal: pass out lessons on schedule

Evaluation design
Intervention design | Comparison of two delivery methods | Comparison of two delivery methods with a control
 | Pre–post measures, T0 & T1 (KAB and FFQ) | Pre–post 4 month post measures: T0, T1 & T2 (KAB), T0 & T2 (FFQ)
 | Response sheet in each lesson; no incentive to return | Response sheet in each lesson; provide incentive for return

Evaluation instruments
Impact instrument scales | KAB questionnaire | KAB questionnaire
 | 20 knowledge questions, KR-20 = 0.80 | 14 knowledge questions, KR-20 = 0.725
 | 22 attitude questions, α = 0.78 | 16 attitude questions, α = 0.80
 | 16 behavior questions, α = 0.75 | 14 behavior questions, α = 0.80
Response sheets | Try any suggestion for child activity: responses yes, no | For both questions, add the response: no, but plan to
 | Try any recipe: responses yes, no |

5.3. Summative evaluation design

Participants' comments and the poor formative completion rate led us to use a pre (T0), immediate post (T1), and 4 month post (T2) summative evaluation design (Fig. 1). We asked participants to complete the KAB instrument at all time points, but the FFQ only at T0 and T2, a 6 month interval, expecting the T2 measure would detect changes which initial program participants claimed took time to implement.

To improve our ability to detect changes, we compared three intervention groups (two experimental, group-delivery and impersonal-delivery, and one control). The controls received four correspondence lessons addressing cancer prevention, identical in design to the osteoporosis lessons the experimental groups received. The osteoporosis and cancer lessons differed only in diet–disease context, beneficial nutrients and foods, and emphasis on exercise
in the osteoporosis lessons. In sum, those in group-delivery received the modified group meeting and osteoporosis lessons; those in impersonal-delivery only the osteoporosis lessons, and the controls only the cancer lessons.

We divided the 24 worksites into five blocks reflecting relative pay scale and type of worker. These were assigned purposefully to the three intervention groups such that there was an equal representation of all five blocks in the two experimental groups while the controls lacked representation from one of two lower pay blocks. Three companies withdrew prior to recruitment.

All participants completed pre-test instruments at a meeting 1 week prior to receiving the first lesson. The post-test KAB instrument, distributed with the last lesson, was collected at a concluding meeting 2 weeks later. Three months later the final instruments were distributed to all participants by the agent or by mail using a modified Dillman Method.

5.4. Evaluating revised program implementation

To assess demographics, we included questions in the pre-test instrument of all three intervention groups. To assess attention to, and use of the lessons, we included a response sheet in each lesson for the two experimental groups only. We added a third possible response (no, but I plan to) to questions about children's activities or recipes to capture behavioral intention.

5.5. Evaluating revised program impact

As in the formative evaluation, we hypothesized that those in group-delivery would exhibit greater changes than those in impersonal-delivery in attitude and behavior scores, exercise pattern, and calcium intake. In addition, we hypothesized that: (a) both experimental groups would exhibit greater changes in knowledge than controls and (b) those in impersonal-delivery would exhibit greater changes than controls in attitude and behavior scores, exercise pattern, and calcium intake.

Based on formative participant comments, we shortened the KAB scales to reduce participant burden, hoping to improve completion. Using the formative instruments completed by the 677 women registrants for stage one (mean age 43.34 ± 11.58), we used internal consistency testing to eliminate less discriminating items, producing the scales in Table 1. Items retained in the KAB instrument (76% of the original) represented all content areas in the formative, improving αs for two scales. However, no new questions were added. Both the question about exercise regularity and the FFQ were not changed for the summative evaluation.

5.6. Summative evaluation data analyses

In contrast to the formative, implementation data of those completing all four response sheets were used, providing a more accurate assessment of the responses of those completing the program.

Impact analysis methods were similar to those used in the formative analyses with these modifications: (a) we used only data of those completing all three KAB or FFQ instruments; (b) we allowed up to two missing answers on the knowledge scale; (c) we tested the significance of continuous and ordinal data among the three delivery groups at three time points (T0, T1, T2) and (d) age served as the covariate for ANOVA and ANCOVA. We determined statistically significant differences among values at time points using pair-wise tests of differences between least-squares means. A Bonferroni adjustment was used to control the overall error rate. Significance was assumed at p ≤ 0.05.

Finally, we compared categorical and continuous demographic characteristics (mean age and education) between the formative and the summative evaluation completers using χ² analysis and non-directional t-tests.

6. Summative evaluation findings and comparison with the formative

First, we examine the implementation findings. Then we examine impacts over time comparing the results to the control, looking at differences between the two delivery methods. In each instance we compare the summative findings with those of the formative.

6.1. Program implementation

6.1.1. Goal (a): Target audience

Completion rates were better and participant demographics were closer to those desired in the summative compared to the formative. In the summative, 70% of initial registrants completed all three KAB measures. Almost 90% completed the KAB instruments at T0 and T1, in contrast to 56% in the formative. Eighty percent completed both FFQ instruments in the summative compared to 51% in the formative. Table 2 lists the demographics of completers in both evaluations, finding them similar in family income, race, marital status, awareness of relatives with osteoporosis, initial exercise pattern, and calcium intake per 1000 calories. Those completing the summative evaluation were, however, significantly more likely than those in the formative to be younger, have only a high school education, work full time, and have at least one child at home.

In the summative, the two experimental groups and controls differed significantly in only age (data not shown). Those in group-delivery had a mean age of 39.1 ± 9.95 compared to 40.6 ± 10.10 in impersonal-delivery and 40.9 ± 9.61 in the control.

6.1.2. Goal (b): Participant's attention to, and use of, lessons

In the summative evaluation, 78% of group-delivery and 63% of impersonal-delivery registrants returned all four
response sheets, providing the sample for analysis. However, formative and summative return rates are not comparable because we did not restrict that sample to those completing all four.

Table 2
Demographic characteristics of those completing each evaluation phase. SD = standard deviation; yr. = years

Variable | Formative | Summative
Mean age ± SD [a] | N = 275, 43.02 ± 11.12 | N = 247, 40.15 ± 9.88
Family income | N = 255 | N = 232
  ≤10–19,900 | 41 (16.1%) | 45 (19.4%)
  20–34,900 | 92 (36.1%) | 83 (35.8%)
  35–50,000+ | 122 (47.8%) | 104 (44.8%)
Employment status [a] | N = 272 | N = 245
  % full time | 235 (86.4%) | 232 (94.7%)
Mean educational level ± SD (yr.) [b] | N = 268, 13.40 ± 1.97 | N = 246, 12.70 ± 1.54
Race | N = 272 | N = 246
  % white | 260 (95.6%) | 239 (97.2%)
Marital status | N = 272 | N = 246
  Married | 200 (73.5%) | 175 (71.1%)
  Single | 40 (14.7%) | 27 (11.0%)
  Other | 32 (11.7%) | 44 (17.9%)
Percent with at least one child at home [b] | N = 274, 129 (47.1%) | N = 247, 163 (66.0%)
Relatives with osteoporosis | N = 246 | N = 247
  Yes | 49 (19.9%) | 35 (14.2%)
  No | 167 (67.9%) | 166 (67.2%)
  Don't know | 30 (12.2%) | 46 (18.6%)
Exercise regularly in past 6 months? | N = 269 | N = 246
  Yes | 94 (34.9%) | 90 (36.6%)
  No | 175 (65.1%) | 156 (63.4%)
 | N = 248 | N = 244
  Calories (mean ± SD) | 1623.8 ± 584.4 | 1798.0 ± 711.2
  Calcium in mg | 805.4 ± 415.5 | 895.3 ± 549.2
  Calcium in mg/1000 calories | 497.1 ± 178.9 | 490.2 ± 206.3

[a] p < 0.01. [b] p < 0.001.

In both evaluations, we asked participants if they read all, parts of, skimmed, or did not read each lesson. Similar to the formative, those reading the whole lesson declined to 65–70% by lesson four. Unlike the formative where there was no difference between delivery groups in percent reading the lessons, in the summative, significantly more in group-delivery skimmed and fewer read lessons one and two than in impersonal-delivery (Fig. 2).

Completion rates of the worksheets did not differ between evaluations with one exception. More completed worksheet two, a revised exercise diary, in the summative than in the formative (40 vs 28%, respectively).

In both evaluations, respondents were asked how easy to read and how useful each lesson was. Respondents in both evaluations provided nearly identical ratings of ease of reading regardless of delivery method, suggesting the lower reading level of the summative materials made it easier for the less educated participants. Respondents in both evaluations provided nearly identical ratings of perceived usefulness for lessons one and two; however, in the summative, group-delivery participants rated lessons three and four, highlighting canned salmon and tofu respectively, significantly more useful than did impersonal-delivery participants, whereas in the formative, the ratings were identical.

Questions about materials use revealed that the summative results differed from the formative, regardless of the delivery method, in that:

• 3–10% tried a recipe (data not shown) compared to 10–37% in the formative. Yet, in the summative, two thirds or more in both delivery groups indicated they planned to try a recipe, an option not available in the formative.
• 11–21% involved children in food preparation compared to 2–11% in the formative. Additionally, in contrast to the formative, the summative exposed differences among delivery groups, in that the percent sharing a recipe with friends was significantly greater in group-delivery than impersonal-delivery for lessons two (20 and 6% respectively) and four (24 and 10%) while no significant differences between delivery methods were evident in the formative (Fig. 3). Lessons two and four featured less familiar calcium-rich foods.
• Consistently more in group-delivery shared other lesson materials with friends than in impersonal-delivery (16–22% vs 9–11%). This distinction was not seen in the formative where about 15% in both delivery methods shared materials (data not shown).
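
The demographic comparison in Table 2 rests on the χ² analysis named in Section 5.6. As a rough illustration only, the sketch below recomputes a Pearson χ² statistic for one row of Table 2 (at least one child at home: 129 of 274 formative completers versus 163 of 247 summative completers). The function name `chi_square_2x2` is hypothetical, and the use of an uncorrected Pearson test is an assumption; the paper does not say whether a continuity correction was applied or which software was used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of margins)
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Counts taken from Table 2: completers with at least one child at home.
formative_yes, formative_n = 129, 274
summative_yes, summative_n = 163, 247

chi2, p = chi_square_2x2(
    formative_yes, formative_n - formative_yes,
    summative_yes, summative_n - summative_yes,
)
print(f"chi-square = {chi2:.2f}, p = {p:.5f}")
```

Under these assumptions the statistic comes out near 18.9, above the 10.83 critical value for p = 0.001 with one degree of freedom, which agrees with the p < 0.001 footnote attached to that row of Table 2.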
6.2. Program impact

6.2.1. Change in knowledge

As expected, our hypothesis that both group-delivery and impersonal-delivery of osteoporosis lessons would lead to greater knowledge gain than in controls was supported (Fig. 4). Indeed, the gain in knowledge in group-delivery was significantly greater than that in impersonal-delivery and both were significantly greater than that of the controls at both T1 and T2. This was not seen in the formative evaluation.

6.2.2. Change in attitude

Our hypothesis that those in the experimental groups would show greater gains in attitude than controls was not supported in the summative evaluation (Fig. 5). Our hypothesis that gains in attitude would differ between delivery methods was not supported either. The initial mean attitude scores in the summative were significantly higher than those in the formative (3.9–4.0 vs 3.0–3.1, respectively), perhaps due to increased media focus on osteoporosis or to differences in participants, and could have limited our ability to improve attitudes and thus to detect significant changes. We did not detect the influence of any previous worksite educational activities, however.

6.2.3. Change in behavior

Our hypothesis that experimental groups would show greater gains in behavior than controls was partially supported in the summative (Fig. 6). Gains in behavior for group-delivery were significantly greater than the control at T2. Administering the behavior scale 4 months after the intervention (T2) identified further changes in participant behavior in group-delivery that were not seen in controls, even when controlling for age, a possible explanation. Our hypothesis that those in group-delivery would show greater gains in food-related behavior than those in the impersonal-delivery was not supported in the summative evaluation.

6.2.4. Changes in calcium intake and exercise pattern

Our hypothesis that those in experimental groups would show greater gains in calcium intake than controls was not supported in the summative. Our hypothesis that those in group-delivery would show greater gains than those in impersonal-delivery was also not supported. The summative delivery methods did not have a greater differential impact than those of the formative.

Our hypothesis that experimental groups would report greater change in exercise patterns than controls was partially supported (data not shown). In the summative evaluation, the number at T2 who reported that exercising regularly was a new pattern was significantly greater in group-delivery than in controls. Our hypothesis that exercise patterns would differ between delivery methods was not supported. The summative delivery methods did not have a greater impact than those of the formative although the difference was in the right direction.

7. Discussion

The purpose of this study was to test whether the changes made in a program as a result of a formative evaluation strengthened the implementation and impact of the program in the summative. We found that implementation improved but only certain measures of impact improved enough to distinguish the effects of each delivery method. As a result we suggest the following for evaluators to consider in designing a formative evaluation and in attempting to assess its effectiveness.

7.1. Improving the design of a formative evaluation

7.1.1. Interpreting (focus group) feedback from participants

As a result of participants' comments in the formative focus groups, we extensively altered the lessons for the summative and this did lead to greater use of a formerly underused worksheet, continued ease of reading with participants with lower educational levels, and more involvement of children in food preparation.

Focus group participants wanted less motivational discussion and more lecture in the meetings. In response, we increased lecture time in the group-delivery method. The influence of the group-delivery method was underscored because group-delivery participants were more likely to skim than to read the lessons. Even with less reliance on reading, group-delivery participants found two lessons significantly more useful and were more likely to share lesson materials with others than those in impersonal-delivery. The agent presentation in group-delivery was clearly critical to sell the lessons. In this case, following participant recommendations appeared to improve impact on knowledge gained, but not in most areas of behavior change.

When the focus group participants indicated they did not like the group-delivery motivational discussion–decision session, they also indicated a preferred alternative. Given the extensive literature on assessing participant perspectives (Basch, 1987), we assumed we needed to alter the delivery method to one they liked (especially since the agents echoed this preference) and ultimately emphasized lecture over motivation in the summative. In hindsight, we should have asked the focus group participants how to change the motivational aspects of the session with its emphasis on behavior modification, rather than abandon this based on their negative feedback.

A researcher using focus groups should not assume that if participants express dislike of a particular delivery method and suggest another, one should drop the original method without considering the downstream effects. Be prepared to probe to learn why it was disliked and how it might be modified, especially if participants are suggesting a more passive path to learning. By making the assumption
participants were right, we limited focus group inquiry and the type of information collected for program planning in the summative.

We revised the recipes, believing these could be used as the context to show participants how to use calcium-rich foods and as a device to facilitate behavior change. In the focus groups, we listened openly to complaints about the recipes and asked participants how to improve them. Some participants indicated they came to the program for the recipes but disliked those provided. Despite these participant-guided revisions, reported use of recipes did not improve in the summative, although a large number indicated they planned to.

In the formative, we assumed that when a component of the program, i.e. the recipes, was poorly received by participants, it merely needed improvement, and thus we specifically asked participants for suggestions. In hindsight, the question we should have asked first, given the complaints, was one that tested our assumption that participants valued recipes, i.e. should recipes be included in the lessons at all? And, if the answer was no, we should have been ready to discuss with focus group participants alternative devices to motivate behavior change.

Researchers designing formative evaluations need to be alert for such a methodological inconsistency: why did participants give us suggestions to improve the recipes when, in fact, many had not used these (10–30% in the formative)? This might be explained by the inclination of people to provide answers to questions when they are asked. People have a tendency to tell more than they can know (Nisbett & Wilson, 1977).

In summary, researchers using focus groups must be prepared to probe assertions by participants that some component of a program is unsatisfactory. Probing should investigate both how that component might be improved and retained as well as what might be substituted and why. In particular, asking a question about the fundamental usefulness of some component of a program in a formative evaluation may not be easy for researchers as they may have considerable ownership and resources invested in that component. However, when faced with negative feedback about a component of the initial program, researchers should investigate both options to ensure sufficient data for an informed decision about summative activities.

7.1.2. Assessing barriers to changing behaviors

Because participants came to this program on calcium-rich foods, we assumed that they would be open to change and that testing the suggested foods at family meals would be acceptable. Some focus group participants indicated it was difficult to introduce these foods because of family member aversion to change. On hearing this, we assumed that altered recipes would be sufficient to overcome opposition and explored this in all focus groups. By not making a conscious effort to uncover social barriers to changing food choices for families, we limited the information we gained to plan the summative. In hindsight, we should have taken a proactive stance once these unanticipated barriers surfaced in the initial focus group, and posed questions to later focus groups based on the comments offered in previous ones.

7.1.3. Interpreting feedback from program instructors

As a result of the formative, we allowed instructors to change the presentation in the group delivery method, believing they would have greater ownership, and thus impact, if they designed a presentation with which they were more comfortable. The remedy the instructors developed, a lecture rather than a motivational discussion, led to an emphasis on knowledge rather than on behavior change in the presentation. This may partially explain why the two delivery methods did not differ significantly in final behavior scale scores. And it may also explain the significantly greater gain in knowledge in the group- than the impersonal-delivery.

The formative evaluation provided feedback about the presentation. We assumed that when a component of the program, i.e. the presentation, was unacceptable to instructors, it should be changed, and we asked for suggestions. The acceptable change did not lead to greater impact. Although instructor suggestions carry great face validity, researchers need to be wary because instructors may provide suggestions that shift the aims of the program. In hindsight, what we should have done was to propose to the instructors alternative presentation methods that retained an emphasis on behavior change. If none were acceptable, we should have queried our assumption that agents were the appropriate instructors for the group presentation.

7.1.4. Incorporating a control group

We included impact measures in the formative to gain a quantitative estimate of the effects of the initial program but we did not include a control group because we expected the contrast between group- and impersonal-delivery to be robust. Due to instructor difficulties with the stage one group-delivery, this contrast did not materialize. Although we had evidence that participants were learning from the print materials in both delivery methods, we could not demonstrate that the change seen was better than with no intervention. Hence we suggest that evaluators include a control group when testing the impact of a new program offered via different delivery methods in a formative evaluation.

7.1.5. Watching for serendipitous effects of formative evaluation

Including impact measures led to unexpected participant feedback about the instruments and their mode of administration. This feedback proved invaluable in improving program implementation in the summative, underscoring the usefulness of impact measures in formative evaluations. In hindsight, we recommend including these measures in the
formative to assess impact as well as to obtain feedback about the instruments and their implementation.

7.2. Assessing the effectiveness of the formative evaluation

The need exists to demonstrate the subsequent effects of formative evaluation in order to improve formative evaluation design. Our study provides some guidance.

7.2.1. Altering the evaluation design

As a result of the formative, we added a control group and a 3-month post-intervention measurement in the summative. Without either of these, we would not have been able to observe some significant differences between the two delivery methods in the summative. Three months after the intervention, group delivery produced significantly greater changes in behavior scores and exercise pattern than seen in the controls while the impersonal-delivery method did not. This design change also revealed that the group method led to significantly greater knowledge gain than the impersonal method and both gains were greater than that of controls. Clearly our target audience was not likely to adopt behavior changes based on receiving just the printed materials.

However beneficial these changes in evaluation design were in illuminating important findings about the impact of the final program, implementing them only in stage two prevented a rigorous comparison between formative and summative evaluations which might have allowed us to see more clearly the effect of the formative evaluation. Lack of the T2 measure in the formative meant we could not be sure what effect the initial program had 3 months later and lack of a control group meant we could not fully interpret the formative impact data. In future, we recommend that if researchers want to assess the effectiveness of a formative evaluation, the design elements of formative and summative steps be parallel. These design details may seem more appropriate for a summative evaluation but they are necessary to see the effects of the formative.

7.2.2. Altering the evaluation instruments

As a result of the formative, we shortened the KAB questionnaire somewhat to address participant complaints in the summative. We believe this contributed to more participants completing all the evaluation instruments in stage two. Thus implementation improved in the summative. However, this improvement carried a price. Subsequently, we were not able to conduct the most rigorous comparison of the formative KAB results to those of the summative.

8. Conclusions

The literature on formative evaluation focuses on its conceptual framework, its methodology and use. Permeating this work is a consensus that a program will be strengthened as a result of a formative evaluation although little empirical evidence exists in the literature to demonstrate the subsequent effects of formative evaluation on a program. This study begins to fill that gap.

This study demonstrates that formative evaluation can strengthen the implementation and impact of an educational intervention designed to compare the impact of two program delivery methods. The modifications made as a result of the formative appear to have significantly improved knowledge gained but resulted in only modest improvements in behaviors in the final program.

Our retrospective analysis of our experience supports the inclusion of a control group and impact measures at several time points in a formative evaluation of a program implemented in a new environment for agency personnel. The findings also suggest that when researchers are faced with negative feedback about the components of the program in a formative evaluation, they need to exercise care in interpreting feedback and in revising those components.

In addition, the need to gather evidence of the subsequent effect of using data from formative evaluations is critical. Otherwise we cannot begin to examine whether the methods and processes we take for granted in the formative evaluation are valid and appropriate. However, evaluators must carefully plan the evaluations, making sure that instruments and evaluation design are parallel in order to carry out these comparisons. We offer our findings as stimulus to consider such studies.

References

Baggaley, J. (1986). Formative evaluation of educational television. Canadian Journal of Educational Communication, 15 (1), 29–43.
Baghdadi, A. A. (1981). A comparison between two formative evaluation methods. Dissertation Abstracts International, 41 (8), 3387A.
Baker, E. L., & Alkin, M. C. (1973). Formative evaluation of instructional development. AV Communication Review, 21 (4), 389–418.
Basch, C. E. (1987). Focus group interview: An underutilized research technique for improving theory and practice in health education. Health Education Quarterly, 14 (4), 411–448.
Bertrand, J. (1978). Communications pretesting (Media Monograph No. Six). Chicago: University of Chicago, Community and Family Study Center.
Brown, J. L., & Griebler, R. (1993). Reliability of a short and long version of the Block food frequency form for assessing changes in calcium intake. Journal of the American Dietetic Association, 93 (7), 784–789.
Cambre, M. (1981). Historical overview of formative evaluation of instructional media products. Educational Communication & Technology Journal, 29 (1), 3–25.
Cardinal, B. J., & Sachs, M. L. (1995). Prospective analysis of stage-of-exercise movement following mail-delivered, self-instructional exercise packets. American Journal of Health Promotion, 9 (6), 430–432.
Chambers, F. (1994). Removing confusion about formative and summative evaluation: Purpose versus time. Evaluation and Program Planning, 17 (1), 9–12.
Chen, H. T. (1996). A comprehensive typology for program evaluation. Evaluation Practice, 17 (2), 121–130.
Chen, C. H., & Brown, S. W. (1994). The impact of feedback during interactive video instruction. International Journal of Instructional Media, 21 (3), 191–197.
Dehar, M. A., Casswell, S., & Duignan, P. (1993). Formative and process evaluation of health promotion and disease prevention programs. Evaluation Review Journal, 17 (2), 204–220.
Dick, W. (1980). Formative evaluation in instructional development. Journal of Instructional Development, 3 (3), 2–6.
Dick, W., & Carey, L. (1985). The systematic design of instruction, (2nd ed) Glenview, IL: Scott, Foresman.
Dillman, D. A. (1978). Mail and telephone surveys: The total design method, New York: John Wiley and Sons.
Dennis, M. L., Fetterman, D. M., & Sechrest, L. (1994). Integrating qualitative and quantitative evaluation methods in substance abuse research. Evaluation and Program Planning, 17 (4), 419–427.
Fairweather, G., & Tornatzky, L. G. (1977). Experimental methods for social policy research, New York: Pergamon.
Finnegan Jr, J. R., Rooney, B., Viswanath, K., Elmer, P., Graves, K., Baxter, J., Hertog, J., Mullis, R., & Potter, J. (1992). Process evaluation of a home-based program to reduce diet-related cancer risk. The 'WIN at Home Series'. Health Education Quarterly, 19 (2), 233–248.
Fitz-Gibbon, C. T., & Morris, L. L. (1978). How to design a program evaluation, Beverly Hills, CA: Sage Publications.
Flagg, B. N. (1990). Formative evaluation for educational technologies, Hillsdale, NJ: Lawrence Erlbaum Associates.
Flay, B. R. (1986). Efficacy and effectiveness trials (and other phases of research) in the development of health promotion programs. Preventative Medicine, 15, 451–474.
Foshee, V., McLeroy, K. R., Sumner, S. K., & Bibeau, D. L. (1986). Evaluation of worksite weight loss programs: A review of data and issues. Journal of Nutrition Education, 18 (1), S38–S43.
Geis, G. L. (1987). Formative evaluation: Developmental testing and expert review. Performance & Instruction, May/June, 1–8.
Gillespie, A., & Achterberg, C. (1989). Comparison of family interaction patterns related to food and nutrition. Journal of the American Dietetic Association, 89 (4), 509–512.
Glanz, K., & Seewald-Klein, T. (1986). Nutrition at the worksite: An overview. Journal of Nutrition Education, 18 (1), S1–S12.
Glanz, K., Sorensen, G., & Farmer, A. (1996). The health impact of worksite nutrition and cholesterol intervention programs. American Journal of Health Promotion, 10 (6), 453–470.
Hausman, J. A., & Wise, D. A. (1985). Social experimentation, Chicago: The University of Chicago Press.
Hill, M., May, J., Coppolo, D., & Jenkins, P. (1993). Long term effectiveness of a respiratory awareness program for farmers. National Institute for Farm Safety, Inc. NIFS Paper No. 93-3. Columbia, MO. NIFS Summer Meeting, Coeur d'Alene, Idaho.
Health Habits And History Questionnaire: Diet History And Other Risk Factors (1989). Personal computer system packet. Version 2.2. Washington, D.C.: National Cancer Institute, Division of Cancer Prevention and Control, National Institutes of Health.
verification and revision: An experimental comparison of two methods. AV Communication Review, 24 (3), 316–328.
Kaufman, R. (1980). A formative evaluation of formative evaluation: The state of the art concept. Journal of Instructional Development, 3 (3), 1–2.
Kershaw, D., & Fair, J. (1976). The New Jersey income maintenance experiment, New York: Academic Press.
Kishchuk, N., Peters, C., Towers, A. M., Sylvestre, M., Bourgault, C., & Richard, L. (1994). Formative and effectiveness evaluation of a worksite program promoting healthy alcohol consumption. American Journal of Health Promotion, 8 (5), 353–362.
Krippendorff, K. (1980). Content analysis: An introduction to its methodology, Sage: Beverly Hills, CA.
Lenihan, K. (1976). Opening the second gate, Washington, DC: U.S. Government Printing Services.
Lewin, K. (1943). Forces behind food habits and methods of change. In The Problem of Changing Food Habits. National Research Council Bulletin 108. (pp. 35–65). Washington, D.C.: National Academy of Sciences.
Markle, S. M. (1979). Evaluating instructional programs: How much is enough? NSPI Journal, Feb, 22–24.
Markle, S. M. (1989). The ancient history of formative evaluation. Performance and Instruction, Aug, 27–29.
McGraw, S. A., McKinley, S. A., McClements, L., Lasater, T. M., Assaf, A., & Carleton, R. A. (1989). Methods in program evaluation: The process evaluation system of the Pawtucket Heart Health Program. Evaluation Review, 13 (5), 459–483.
McGraw, S. A., Stone, E. J., Osganian, S. K., Elder, J. P., Johnson, C. C., Parcel, G. S., Webber, L. S., & Luepker, R. V. (1994). Design of process evaluation within the child and adolescent trial for cardiovascular health (CATCH). Health Education Quarterly, Supplement, 2, S5–S26.
Montague, W. E., Ellis, J. A., & Wulfeck, W. H. (1983). Instructional quality inventory: A formative evaluation tool for instructional development. Performance and Instruction Journal, 22 (5), 11–14.
Nathenson, M. B., & Henderson, E. S. (1980). Using student feedback to improve learning materials, London: Croom Helm.
National Institutes of Health (1985). Surgeon general's report on nutrition and health. U.S. Department of Health and Human Services, Public Health Service (Chapter 7, pp. 311–343). Washington, DC: U.S. Government Printing Service.
Nisbett, R. E. & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3), May.
Parrott, R., Steiner, C., & Godenhar, L. (1996). Georgia's harvesting healthy habits: A formative evaluation. The Journal of Rural Health, 12 (4), 291–300.
Patton, M. Q. (1978). Utilization-focused evaluation, Beverly Hills, CA: Sage.
Patton, M. Q. (1982). Practical evaluation, Beverly Hills, CA: Sage.
Houts, S. A. (1988). Lactose intolerance. Food Technology, 42 (3), 110–
Patton, M. Q. (1994). Developmental evaluation.
Evaluation Practice, 15 113. (3), 311±319. Iszler, J., Crockett, S., Lytle, L., Elmer, P., Finnegan, J., Luepker, R., & Patton, M. Q. (1996). A world larger than formative and summative. Laing, B. (1995). Formative evaluation for planning a nutrition inter- Evaluation Practice, 17 (2), 131±144. vention: Results from focus groups. Journal of Nutrition Education, 27 Pelletier, K. R. (1996). A review and analysis of the health and cost-effec- (3), 127±132. tive outcome studies of comprehensive health promotion and disease Jacobs Jr, D. R., Luepker, R. V., Mittelmark, M. B., Folsom, A. R., Pirie, P. prevention programs at the worksite: 1993±1995 update. American L., Mascioli, S. R., Hannan, P. J., Pechacek, T. F., Bracht, N. F., Carlaw, Journal of Health Promotion, 10 (5), 380±388. R. W., Kline, F. G., & Blackburn, H. (1986). Community-wide Pelz, E. B. (1959). Some factors in group decision. In E. E. Macoby, T. M. prevention strategies: Evaluation design of the Minnesota heart health Newcomb & E. L. Hartley, Readings in social psychology (3rd ed). program. Journal of Chronic Disease, 39 (10), 775±788. (pp. 212±219). New York: Holt, Rinehart and Winston, Inc. Janz, N. K., & Becker, M. H. (1984). The health belief model: A decade Peterson, K. A., & Bickman, L. (1988). Program personnel: The missing later. Health Education Quarterly, 11 (1), 1±47. ingredient in describing the program environment. In J. Kendon, Johnson, C. C., Osganian, S. K., Budman, S. B., Lytle, L. A., Barrera, E. P., Conrad Roberts-Gray & Cynthia Roberts-Gray, Evaluating program Bonura, S. R., Wu, M. C., & Nader, P. R. (1994). CATCH: Family environments, San Francisco, CA: Jossey-Bass, Inc. process evaluation in a multicenter trial. Health Education Quarterly, Potter, J. D., Graves, K. L., Finnegan, J. R., Mullis, R. M., Baxter, J. S., Supplement, 2, S91±S106. Crockett, S., Elmer, P. J., Gloeb, B. D., Hall, N. J., Hertog, J., Pirie, P., Kandaswamy, S., Stolovitch, H. D., & Thiagarajan, S. (1976). 
Learner Richardson, S. L., Rooney, B., Slavin, J., Snyder, M. P., Splett, P., &
  • 15. J.L. Brown, N.E. Kiernan / Evaluation and Program Planning 24 (2001) 129±143 143 Viswanath, K. (1990). The cancer and diet intervention project: a Seidel, R. E. (1993). Notes from the ®eld in communication for child survi- community-based intervention to reduce nutrition-related risk of val, Washington, DC: USAID. cancer. Health Education Research, 5 (4), 489±503. Stuf¯ebeam, D. L. (1983). The CIPP model for program evaluation. In G. Rightwriter (1990). Version 3.1. Sarasota, FL: RightSoft, Inc. Madaus, M. Scriven & D. Stuf¯ebeam, Evaluation models: Viewpoints Robins, P. K., Spiegelman, R. G., Weiner, S., & Bell, J. G. (1980). A on educational and human services evaluationBoston: Kluwer-Nijhoff. guaranteed annual income: Evidence from a social experiment, New Tessmer, M. (1993). Planning and conducting formative evaluations, York: Academic Press. London: Kogan Page. Rossi, P. H., & Lyall, K. (1976). Reforming public welfare, New York: Thiagarajan, S. (1991). Formative evaluation in performance technology. Russell Sage. Performance Improvement Quarterly, 4 (2), 22±34. Rossi, P. H., & Freeman, H. E. (1982). Evaluation: A systematic approach Wager, J. C. (1983). One-to-one and small group formative evaluation: An (p. 69). Beverly Hills, CA: Sage Publications. examination of two basic formative evaluation procedures. Perfor- Russell, J. D., & Blake, B. L. (1988). Formative and summative evaluation mance and Instruction, 22 (5), 5±7. of instructional products and learners. Educational Technology, 28 (9), Walden, O. (1989). The relationship of dietary and supplemental calcium 22±28. intake to bone loss and osteoporosis. Journal of the American Dietetic SAS Proprietary Software Release 6.09. (1989). Cary, N.C.: SAS Institute, Association, 89 (3), 397±400. Inc. Weston, C. B. (1986). Formative evaluation of instructional materials: An Scanlon, E. (1981). Evaluating the effectiveness of distance learning: A overview of approaches. 
Canadian Journal of Educational Communi- case study. In F. Percival & H. Ellington, Aspects of educational tech- cation, 15 (1), 5±17. nology: Vol. XV: Distance learning and evaluation (pp. 164±171). Weston, C. B. (1987). The importance of involving experts and learners in London: Kogan Page. formative evaluation. Canadian Journal of Educational Communica- Scheirer, M. A. (1994). Designing and using process evaluation. In J. S. tions, 16 (1), 45±58. Wholey, H. Hatry & K. Newcomer, Handbook of practical program Wilkinson, T. L., Schuler, R. T., & Skjolaas, C. A. (1993). The effect of evaluation (pp. 40±68). San Francisco: Jossey-Bass. safety training and experience of youth tractor operators. National Insti- Scheirer, M. A., & Rezmovic, E. L. (1983). Measuring the degree of tute for Farm Safety, Inc. NIFS Paper No. 93±6. Columbia, MO. NIFS program implementation. Evaluation Review, 7 (5), 599±633. Summer Meeting, Coeur d'Alene, Idaho. Schneider, M. L., Ituarte, P., & Stokols, D. (1993). Evaluation of a commu- Witte, K., Peterson, T. R., Vallabhan, S., Stephenson, M. T., Plugge, C. D., nity bicycle helmet promotion campaign: What works and why. Amer- Givens, V. K., Todd, J. D., Bechtold, M. G., Hyde, M. K., & Jarrett, R. ican Journal of Health Promotion, 7 (4), 281±287. (1992/3). Preventing tractor-related injuries and deaths in rural popula- Scriven, M. (1967). The methodology of evaluation. In R. Tyler, R. Gagne tions: Using a persuasive health message framework in formative & M. Scriven, Perspectives of curriculum evaluation (pp. 39±83). evaluation research. International Quarterly of Community Health Chicago: Rand McNally. Education, 13 (3), 219±251.