The Most Attractive Hyderabad Call Girls Kothapet 𖠋 6297143586 𖠋 Will You Mis...
Randomized Control Trials:rationale and requirements
1. International Initiative for Impact Evaluation
Randomized Control Trials:
rationale and requirements
Howard White
IFAD
Rome, September 2012
Howard White www.3ieimpact.org
2. Impact
evaluation:
an example
Why did the
Bangladesh The case of the
Integrated Bangladesh
Nutrition Integrated
Program Nutrition
(BINP) fail? Project (BINP)
Howard White www.3ieimpact.org
4. The theory of change
Target group Exposure to Behaviour change
for nutritional sufficient to change
nutritional counselling child nutrition
Target group counselling is results in
participate in the relevant knowledge Improved
program one acquisition and nutritional
(mothers of behaviour outcomes
young change
children)
Children are Food is Supplementary
correctly delivered to feeding is
identified to those enrolled supplemental, i.e.
be enrolled in no leakage or
the program substitution
Food is of sufficient
quantity and quality
Howard White www.3ieimpact.org
5. The theory of change
Target group Exposure to Behaviour change
for
nutritional
Right target
nutritional
counselling
sufficient to change
child nutrition
Target group
participate in
program
counselling is
the relevant
one
group for
results in
knowledge
acquisition and
Improved
nutritional
(mothers of
young nutritional
behaviour
change
outcomes
children)
PARTICIPATION
Children are
correctly
RATES WERE UP
counselling
Food is
delivered to
Supplementary
feeding is
identified 30% LOWER
TO to those enrolled supplemental, i.e.
be enrolled in no leakage or
FOR WOMEN
the program substitution
LIVING WITH THEIR
MOTHER-IN-LAW Food is of sufficient
quantity and quality
Howard White www.3ieimpact.org
6. The theory of change
Target group
for
Exposure to
nutritional
Knowledge
Behaviour change
sufficient to change
Target group
nutritional
counselling is
counselling
results in acquired and
child nutrition
participate in the relevant knowledge Improved
program
(mothers of
one acquisition and
behaviour
used nutritional
outcomes
young change
children)
Children are Food is Supplementary
correctly delivered to feeding is
identified to those enrolled supplemental, i.e.
be enrolled in no leakage or
the program substitution
Food is of sufficient
quantity and quality
Howard White www.3ieimpact.org
7. The theory of change
Target group Exposure to Behaviour change
for nutritional sufficient to change
Target group
nutritional
counselling is
The right
counselling
results in
child nutrition
participate in the relevant knowledge Improved
program
(mothers of
one children are
acquisition and
behaviour
nutritional
outcomes
young
children) enrolled in the
change
Children are Food is Supplementary
correctly
identified to
programme
delivered to
those enrolled
feeding is
supplemental, i.e.
be enrolled in no leakage or
the program substitution
Food is of sufficient
quantity and quality
Howard White www.3ieimpact.org
8. The theory of change
Target group
for
Exposure to
nutritional
Supplementary
Behaviour change
sufficient to change
Target group
nutritional
counselling is
counselling
results in feeding is
child nutrition
participate in the relevant knowledge Improved
program
(mothers of
one acquisition and
behaviour
supplementary nutritional
outcomes
young change
children)
Children are Food is Supplementary
correctly delivered to feeding is
identified to those enrolled supplemental, i.e.
be enrolled in no leakage or
the program substitution
Food is of sufficient
quantity and quality
Howard White www.3ieimpact.org
9. Lessons from BINP
• Apparent successes can turn out to be
failures
• Outcome monitoring does not tell us
impact and can be misleading
• A theory based impact evaluation shows if
something is working and why
• Quality of match for rigorous study
• Independent study got different findings
from project commissioned study
Howard White www.3ieimpact.org
10. So, what is impact evaluation?
Howard White www.3ieimpact.org
11. What is impact evaluation?
Impact evaluations answer the question as
to what extent the intervention being
evaluated altered the state of the world
= the (outcome) indicator with the
We can see this
intervention compared to what it would have
been in the absence of the we can’t see this
But intervention
So we use a
= Yt(1) – Yt(0) comparison
group
Howard White www.3ieimpact.org
12. The core of large n designs
Before After
Project
Comparison
Howard White www.3ieimpact.org
13. What do we need to measure impact?
Irrigation in Andhra
Pradesh
(paddy
yields, tons/ha)
Before After
Project (treatment) 4.4
comparison
The majority of evaluations have just this information …
which means we can say absolutely nothing about impact
Howard White www.3ieimpact.org
14. Before versus after single difference comparison
Before versus after = 4.4 – 3.2 = 1.2
Before After
Project (treatment) 3.2 4.4
comparison
This ‘before versus after’ approach is outcome
monitoring, which has become popular recently.
Outcome monitoring has its place, but it is not
impact evaluation
Howard White www.3ieimpact.org
15. Post-treatment comparison
comparison
Single difference = 4.4 – 2.9 = 1.5
Before After
Project (treatment) 4.4
comparison 2.9
But we don’t know if they were
similar before… though there are
ways of doing this (statistical
matching = quasi-experimental
approaches)
Howard White www.3ieimpact.org
16. Double difference =
(4.4-3.2)-(2.9-3.1) = 1.2 – (-0.2) = 1.4
Before After
Project (treatment) 3.2 4.4
comparison 3.1 2.9
Conclusion: Longitudinal (panel) data, with a
comparison group, allow for the strongest
impact evaluation design (though still need
matching). SO WE NEED BASELINE DATA
FROM PROJECT AND COMPARISON AREAS
Howard White www.3ieimpact.org
17. So what?
• Comparison groups are nothing new
• What is new is attention to threats to
validity of comparison group from
– Selection bias
– Contamination
– Spill over effects (e.g. from FFS)
Howard White www.3ieimpact.org
18. The problem of selection bias
• Program participants are not chosen at random,
but selected through
– Program placement
– Self selection
• This is a problem if the correlates of selection
are also correlated with the outcomes of interest,
since those participating would do better (or
worse) than others regardless of the intervention
Howard White www.3ieimpact.org
19. Selection bias from program placement
• A productivity enhancement programme is targeted
at poor and marginal farmers
• These farmers have less land and other assets like
capital, literacy, access to labour and so on… so
their outcomes (productivity) will be lower than that
of non-participants, maybe even with the project
• Hence productivity for project farmers will be lower
than the average for other farmers
• The comparison group has to be drawn from a
group of similarly deprived farmers
Howard White www.3ieimpact.org
20. Selection bias from self-selection
• A farmer field school programme recruits farmers from a
community on a voluntary basis
• But those farmers who join are likely to be ‘more
progressive, i.e. more interested in changing practices
• So those farmers who join the programme are more
likely to adopt new practices and have better outcomes
than those who don’t join… even in the absence of the
programme
And it may be that those
communities in the programme may
be better performing than non-
programme communities as a result
of either self-selection or progamme
placement
Howard White www.3ieimpact.org
21. Examples of selection bias
• Hospital delivery in Bangladesh (0.115 vs 0.067)
• Secondary education and teenage pregnancy in
Zambia
• Male circumcision and HIV/AIDS in Africa
Howard White www.3ieimpact.org
22. HIV/AIDs and
circumcision:
geographical
overlay
Howard White www.3ieimpact.org
23. Main point
There is ‘selection’ in who benefits from
nearly all interventions. So need to get
a comparison group which has the
same characteristics as those selected
for the intervention.
Howard White www.3ieimpact.org
24. Randomization (RCTs)
• Randomization addresses
the problem of selection
bias by the random
allocation of the treatment
• Unit of assignment may
not be the same level as
the unit of analysis, e.g.
– Randomize across villages
but measure individual
learning outcomes
– Randomize across sub-
districts but measure village-
level outcomes
Howard White www.3ieimpact.org
25. The importance of power
• The fewer units over which you randomize the higher your
standard errors
• But you need to randomize across a ‘reasonable number’ of
units
At least 30 for simple
randomized design (though
possible imbalance
considered a problem for n <
200)
Can possibly be as few as 10
for matched pair
randomization, though
literature is not clear on this
Howard White www.3ieimpact.org
26. The larger the sample the more likely it is
that treatment and control are comparable
Table 1 Average characteristics by different sample sizes (n)
Rural (%) Years of education Number of
household
members
Treatment Control Treatment Control Treatment Control
n=2 100 0 12.0 9.0 9.0 5.0
n=20 70 80 6.4 5.8 6.4 6.7
n=50 72 60 5.8 5.3 6.4 6.5
n=200 65 61 6.0 5.0 6.7 6.5
n=2,000 66 64 5.2 5.4 6.5 6.5
Howard White www.3ieimpact.org
27. Clustering
Need to
randomize
over enough
units
Treatment
Control
Design: 20 treatment
villages, 20 control villages.
Different districts (chosen at
random)
Howard White www.3ieimpact.org
28. Conducting an RCT
• Has to be an ex-ante design
• Has to be politically feasible, and confidence that
program managers will maintain integrity of the
design
• Perform power calculation to determine sample
size (and therefore cost)
• Collect baseline data to:
– Test quality of the match
– Conduct difference in difference analysis
Howard White www.3ieimpact.org
29. Overcoming resistance to randomization
• There is probably an untreated population
anyway
• Need not randomly allocate whole programme
just a bit
• Exploit different designs which make less
difference to the programme
• Don’t need ‘no treatment’ control
• RCTs are not unethical, spending money on
programmes that don’t work is unethical
Howard White www.3ieimpact.org
30. Some different ways to randomize
Pipeline Raised threshold
By analogy, could expand
eligible area and randomize
within that
Howard White www.3ieimpact.org
31. More ways to randomize
Don’t need randomize
across whole eligible Encouragement design
population
No universal scheme is
universally adopted.
There are three groups: (a)
always adopt, (b) never
adopt, and (c) may adopt
with encouragement
An encouragement design
provides an incentive to
group (c) to adopt in
Just use these guys
for the RCT treatment versus no
incentive in control
Howard White www.3ieimpact.org
32. Different types of design
Don’t need a ‘no
treatment’ control Factorial Design
In medicine the control gets the
standard practice of care ie the
existing treatment. This comparison
is often the one of most interest to
policy makers
So everyone can get basic
package, with some addition in the
control to ‘make it work better’
Howard White www.3ieimpact.org
33. Thinking about RCT designs
• What are my
- Unit of analysis (what outcomes are you measuring?
- Unit of assignment?
• Do I have sufficient units of assignment (i.e. power
calculation)
• How many ‘treatment arms’ will I have?
• What do the comparison group get?
• What sort of spillovers might there be?
• How likely is contamination of treatment or control?
• How much of the programme am I going to randomize
and how (e.g. pipeline)?
• Who needs to agree to a RCT? Have they? Cultural
factors?
Howard White www.3ieimpact.org
34. Issues in CCT RCTs
• The need for a theory-based approach
• So need to assess implementation (ie
process evaluation elements)
• Measuring farm level activities and
outcomes
• Measuring overall CC outcomes, e.g. CO2
offset (imputation?)
• Which researchers do you use?
Howard White www.3ieimpact.org
35. Issues in CCT RCTs
• The need for a theory-based approach
• So need to assess implementation (ie
process evaluation elements)
• Measuring farm level activities and
outcomes
• Measuring overall CC outcomes, e.g. CO2
offset (imputation?)
• Which researchers do you use?
Howard White www.3ieimpact.org
36. Thank you
Visit www.3ieimpact.org
Howard White www.3ieimpact.org