1. About J-PAL and What is Evaluation
1. INTRODUCTION TO IMPACT EVALUATION AND RANDOMIZED CONTROL TRIALS
PRESENTATION 1: J-PAL OVERVIEW
PARTICIPATION AND REGULATORY COMPLIANCE
PROJECT LAUNCH WORKSHOP
8-9 JULY 2014, HANOI
Héctor Salazar Salame, Executive Director J-PAL SEA
3. About J-PAL
• J-PAL is a network of 92 academics at universities around
the world who use randomized evaluations to answer
critical policy questions in the fight against poverty.
• Mission: To reduce poverty by ensuring that policy is
based on scientific evidence, and research is translated into
action
– Research: Conducting rigorous impact evaluations
– Policy Outreach: Translating findings into action
– Training: Building the capacity of partners and practitioners to
conduct rigorous evaluations
4. Where Does J-PAL Work?
502 evaluations in 57 countries worldwide
5. J-PAL Program Areas
1. Agriculture
2. Education
3. Environment & Energy
4. Finance & Microfinance
5. Health
6. Labor Markets
7. Political Economy & Governance
7. J-PAL Southeast Asia
• Launched in 2013
• H.E. President Susilo Bambang
Yudhoyono delivered keynote at launch
event
• Seed funding from the Australian
Department of Foreign Affairs & Trade
• Based at UI’s Institute for Economic
and Social Research (LPEM)
• Led by professors at MIT & Harvard
8. Presentations Overview
1. What is evaluation? Why Evaluate?
2. Why randomize?
3. How to randomize
4. Evaluation from Start to Finish
14. How can impact evaluation help us
improve programs and policies?
• Often little hard evidence on what works
• Better evidence lets us do more with a given budget
• If people knew money was going to programs that
worked, it could help increase the pot for anti-poverty
programs
• Instead of asking “do antipoverty programs work?”, we
should be asking:
– Which programs work best, why and when?
– Which concepts work, why and when?
– How can we scale up what works?
• Add to our body of evidence
– part of a well-thought-out evaluation (research) strategy
15. Components of Program Evaluation
• Needs Assessment: What is the problem?
• Program Theory Assessment: How, in theory, does the program fix the problem?
• Process Evaluation: Does the program work as planned?
• Impact Evaluation: Were its goals achieved? The magnitude?
• Cost Effectiveness: Given magnitude and cost, how does it compare to alternatives?
18. The Need
• Worldwide
– Nearly 2 million children die each year from
diarrhea
– 20% of all child deaths (under 5 years old) are from
diarrhea
• Indonesia
– 36 percent of children are stunted
• Diarrhea and poor nutrition are seen as major causes
19. The Problem
• 13% of world population lacks access to
“improved water sources”
• Lack of access to water purification solutions
• People’s reported value for clean water translates to
willingness to pay nearly $1 per averted diarrhea episode,
$24 per DALY (Kremer et al 2009)
23. Will this alone solve the Problem?
• Water quality helps little without hygiene (Esrey, 1996)
– 42% live without a toilet at home
• People are more willing to pay for convenient water than
clean water
• Less than 10% of households purchase treatment
– In Zambia, $0.18 per month for a family of six
– In Kenya, $0.30 per month
• 25% of households reported boiling their drinking water the
prior day
27. Program Theory Assessment
How will the program address the needs put forth in
your needs assessment?
– What are the prerequisites to meet the needs?
– How and why are those requirements currently lacking or
failing?
– How does the program intend to target or circumvent
shortcomings?
– What services will be offered?
Tools
– Theory of Change
– Logical Framework (LogFrame)
28. Theory of Change
Sample assumptions (from the diagram):
• Containers used to collect/store water are clean, and are kept clean within the household afterwards
• HH members are consuming water from the clean source and not from others potentially contaminated
• The source is properly maintained, and there is a consistent water supply
• Changed behavior stays consistent over time
29. Log Frame
Objectives Hierarchy | Indicators | Sources of Verification | Assumptions / Threats
Impact (Goal / Overall objective): Lower rates of diarrhea | Rates of diarrhea | Household survey | Waterborne disease is the primary cause of diarrhea
Outcome (Project Objective): Households drink cleaner water | (Δ in) drinking water source; E. coli CFU/100ml | Household survey, water quality test at home storage | Shift away from dirty sources; no recontamination
Outputs: Source water is cleaner; families collect cleaner water | E. coli CFU/100ml | Water quality test at source | Continued maintenance; knowledge of maintenance practices
Inputs (Activities): Water source protection is built | Protection is present, functional | Source visits / surveys | Sufficient materials, funding, manpower
Source: Roduner, Schlappi (2008), Logical Framework Approach and Outcome Mapping: A Constructive Attempt of Synthesis
[Diagram labels: the logframe levels map onto needs assessment, process evaluation, and impact evaluation]
34. Process Evaluation: Demand-side
• Do households collect water from improved
source?
– For multi-source households, increase in use of
improved source by 21 percentage points
• Does storage become re-contaminated?
• Do people drink the “clean” water?
– No significant changes in transport, storage or treatment
behavior
37. Did we achieve our goals?
• Primary outcome (impact): did spring
protection reduce diarrhea?
• Also distributional questions: what was the
impact for households with good v. bad
sanitation practices?
38. How to measure impact?
• What would have happened in the absence of the program?
• Take the difference between:
what happened (with the program)
− what would have happened (without the program)
= IMPACT of the program
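A minimal sketch in Python of that subtraction, assuming the counterfactual is estimated with a comparison group (the outcome values below are hypothetical, not from any study):

```python
# Impact = mean outcome with the program - mean outcome without it.
# Hypothetical diarrhea-incidence data, for illustration only.
treatment_outcomes = [0.20, 0.15, 0.25, 0.10]   # observed with the program
comparison_outcomes = [0.30, 0.35, 0.25, 0.30]  # stand-in for "without the program"

impact = (sum(treatment_outcomes) / len(treatment_outcomes)
          - sum(comparison_outcomes) / len(comparison_outcomes))
print(f"Estimated impact: {impact:+.3f}")  # negative = reduction in incidence
```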
39. Constructing the Counterfactual
• Counterfactual is often constructed by selecting
a group not affected by the program
• Randomized:
– Use random assignment of the program to create a
control group which mimics the counterfactual.
• Non-randomized:
– Argue that a certain excluded group mimics the
counterfactual.
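As an illustration of the randomized approach (a sketch with made-up unit names, not any study's actual code), random assignment splits eligible units by lottery, so the control group mimics the counterfactual on average:

```python
import random

# Hypothetical eligible units; any unit-level program could be assigned this way.
units = [f"spring_{i}" for i in range(1, 21)]
random.shuffle(units)  # the lottery

treatment = units[:10]  # receive the program
control = units[10:]    # excluded for now; mimics the counterfactual
```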
40. How does impact differ from process?
• When we answer a process question, we need
to describe what happened.
• When we answer an impact question, we
need to compare what happened to what
would have happened without the program
44. Spring Cleaning Sample
Total Population (562 springs)
→ Target Population (200); Not in evaluation (0)
→ Evaluation Sample (200)
→ Random Assignment:
– Year 1 (50)
– Year 2 (50)
– Years 3,4 (100)
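A minimal sketch of this phase-in allocation (assumed mechanics, not the evaluators' code): springs are randomly ordered, then cut into treatment waves, so later waves serve as controls for earlier ones:

```python
import random

springs = list(range(1, 201))  # the 200-spring evaluation sample
random.shuffle(springs)        # random assignment via random ordering

waves = {
    "year_1": springs[:50],      # protected in year 1
    "year_2": springs[50:100],   # protected in year 2
    "years_3_4": springs[100:],  # protected in years 3-4; controls until then
}
```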
45. Impact
• 66% reduction in source water E. coli concentration
• 24% reduction in household E. coli concentration
• 25% reduction in incidence of diarrhea
47. Choosing among alternatives
Intervention | Impact on Diarrhea
Spring protection (Kenya) | 25% reduction in diarrhea incidence for ages 0-3
Source chlorine dispensers (Kenya) | 20-40% reduction in diarrhea
Home chlorine distribution (Kenya) | 20-40% reduction in diarrhea
Hand-washing (Pakistan) | 53% drop in diarrhea incidence for children under 15
Piped water (urban Morocco) | 0.27 fewer days of diarrhea per child per week
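Choosing among alternatives ultimately requires cost per unit of impact, not impact alone. A sketch of that comparison (all cost and impact figures below are hypothetical placeholders, not from the studies cited):

```python
# Hypothetical: (cases of diarrhea averted per 1,000 children,
#                cost in USD per 1,000 children)
interventions = {
    "spring_protection": (250, 4000),
    "chlorine_dispensers": (300, 1500),
    "hand_washing": (530, 2500),
}

# Rank interventions by cost per case averted.
for name, (cases_averted, cost) in interventions.items():
    print(f"{name}: ${cost / cases_averted:.2f} per case averted")
```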
51. When to do a randomized evaluation?
• When there is an important question you
want/need to know the answer to
• Timing: not too early and not too late
• Program is representative, not gold plated
– Or tests a basic concept you need tested
• Time, expertise, and money to do it right
• Develop an evaluation strategy to prioritize
52. When NOT to do an RE
• When the program is premature and still requires
considerable “tinkering” to work well
• When the project is on too small a scale to randomize
into two “representative groups”
• If a positive impact has been proven using rigorous
methodology and resources are sufficient to cover
everyone
• After the program has already begun and you are not
expanding elsewhere
53. How an impact evaluation can fail
• No one is asking the questions that the study
sets out to answer
• Measures the wrong outcomes
• Leaves too many important questions
unanswered
• Produces biased results
54. Programs and their Evaluations:
where do we start?
Intervention
• Start with a problem
• Verify that the problem
actually exists
• Generate a theory of why the
problem exists
• Design the program
• Think about whether the
solution is cost effective
Program Evaluation
• Start with a question
• Verify the question hasn’t
been answered
• State a hypothesis
• Design the evaluation
• Determine whether the value
of the answer is worth the
cost of the evaluation
• If you ask the right question, you’re more likely to care
55. Components of Program Evaluation
• Needs Assessment: What is the problem?
• Program Theory Assessment: How, in theory, does the program fix the problem?
• Process Evaluation: Does the program work as planned?
• Impact Evaluation: Were its goals achieved? The magnitude?
• Cost Effectiveness: Given magnitude and cost, how does it compare to alternatives?
J-PAL’s network has conducted 347 evaluations in 51 countries
This graph is about a month old (August 2012)
We have 5 regional offices in Cambridge, Santiago, Paris, Chennai, Cape Town. And now, of course, Jakarta!
First let’s narrow down our definition of Evaluation
Evaluation is a very big term and could mean many things…
In general, we’ll be talking about program evaluation
So that means, not the type of evaluation that’s more administrative in nature…
Performance evaluations, audits, etc…
Unless those are part of a new policy or program that we wish to evaluate…
Programs are still a general term
Could include Policies, or more generally, “interventions”
What distinguishes impact evaluation?
What makes “randomized evaluation” distinct?
Where does monitoring fit in?
The quiz you took at the beginning… that’s part of an evaluation. You’ll take one at the end as well. And you’ll also give us some course feedback
something we’ll look at after this whole course is done and use it to make design and implementation changes to the course
But it’s not something we consider part of the course design itself. It’s not really meant to serve as a pedagogical device.
The clickers. That’s more part of monitoring. It’s specifically designed as part of the pedagogy.
It gives us instant feedback based on which we make mid-course adjustments, corrections, etc.
Part of the pedagogy is that we have a specific decision tree that the implementers (in this case, lecturers) use based on the results of the survey.
OK, so we will not focus on monitoring.
This question is larger than a question of aid
Aid accounts for less than 10% of development spending.
Governments have their own budgets, their own programs.
Now I had singled out impact evaluation. Because you’re all here to learn about randomized evaluations, which is a method of impact evaluation.
But an impact evaluation is not particularly useful if it doesn’t come as part of a larger package.
We will revisit this again….
Indonesia
7,500 deaths from diarrhea each year http://www.asianscientist.com/health-medicine/simple-hygeine-skills-indonesia-countdown-2015/
5% of deaths in Indonesia
In Indonesia: that’s something like 20% and 30% in rural areas
** This picture shows a young boy collecting water at a naturally occurring spring.
-- As you can see, some wood has been placed around the eye of this spring, but the water pools at the collection point where it can easily be contaminated with surface water run-off. In an agricultural area with incomplete sanitation coverage, this makes it easy for fecal matter (from either humans or livestock) to contaminate the collected water.
-- You can also imagine in this picture how contamination in transport and storage might occur. Children sometimes collect water and can easily touch it in open containers. If this kid here has fecal matter on his hands and makes contact with the spring water (which is likely), he could easily contaminate it. Similar things can happen within the home. When water is scooped out of the top of storage containers with a dipper, it is hard to avoid touching the water.
Quantity & Hygiene: washing hands, bodies, dishes, clothes: transmission of disease
Convenience:
Kenya: Door-to-door delivery of chlorine increased take-up
Kenya: 50% discounts for chlorine did not
Study in Morocco (Tangiers): people willing to pay for pipes even though they have access to clean water
Demand seems low despite low price.
Latrines
Information campaigns
Piped water
I will cover this more tomorrow in the Measurement lecture
Behavior change: you want to be careful about attributing behavior change
When we answer a process question, we need to describe what happened.
This can be done from reading documents, interviewing people, admin records, etc.
We typically don’t require a comparison group to do this
When we answer an impact question, we need to compare what happened to what would have happened without the program
There are various ways to get at this, but all of them have in common that they need to re-create what did not happen.
For this, evaluators typically create a comparison group
So this is what you’ll cover in the WHY Randomize lecture tomorrow afternoon.
First, what is the difference between Random Sampling and Random Assignment? (the sketch at the end of this note illustrates the distinction)
First, there were 562 springs:
not seasonably dry, no upstream contaminants
Randomly selected 200 to participate
200 split into 4 groups: 50 in each group
This is called a phase in, you’ll cover this on Thursday morning’s How-to Randomize lecture
Of the 200 springs, 16 were later deemed “ineligible”
And another 10 did not comply with their allocation assignment
8 refused
2 went ahead and protected
You’ll cover this in threats on Thursday afternoon with Shawn Cole
There were on average 30 households considered “spring-users” (based on proximity) per chosen spring
Of this, 7-9 were randomly selected to be surveyed
How did they decide on the 200 springs? On the 7-9 HH/spring?
Friday morning, with Ben Olken, you’ll cover sample size
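A minimal sketch of that distinction, using the numbers from these notes (assumed mechanics, for illustration only): random sampling decides whom you measure; random assignment decides who gets the program.

```python
import random

# Random SAMPLING: which 200 of the 562 eligible springs enter the study.
all_springs = list(range(562))
evaluation_sample = random.sample(all_springs, 200)

# Random ASSIGNMENT: which of those 200 get protection first (phase-in).
random.shuffle(evaluation_sample)
year_1_treated = evaluation_sample[:50]

# Random SAMPLING again: ~30 user households per spring, 7-9 surveyed.
households_at_spring = list(range(30))
surveyed = random.sample(households_at_spring, 8)
```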
So what intervention should we invest in?
Three big issues here:
Inconsistent outcome measures
Different contexts
Cost!
Now we’ve gone through the lifecycle of a program evaluation (or several)
When there is an important question you want/need to know the answer to
Common program with not much evidence
Uncertainty about which alternative strategy to use
Key question that underlies a lot of different programs
About to roll out a big new program, important design questions
Timing (not too early and not too late):
Test once basic kinks have been taken out
you should try to be reasonably sure this is the state of the program that would be scaled up.
No point in using rigorous evaluation to find there are still problems in management and logistics
a simple process evaluation could uncover the exact same facts.
Before rolled out on a major scale
Then it may be too late to have a control group
If found ineffective, the money will have already been wasted
Program is representative not gold plated
Unless we are testing a proof of concept
Time, expertise, and money to do it right
Develop an evaluation plan to prioritize
The basic process wrinkles are being ironed out: a process evaluation shows many logistical issues need to be sorted out.
This is a sample size issue: to be discussed on day 3
This is an ethical issue. We need a control group. We don’t want to deny anyone services that are scientifically-proven to be beneficial, simply for the purpose of experimentation (we will discuss this tomorrow: day 2)
If selection has already occurred, it cannot be randomized
Before thinking about evaluations, we should think about what it is that we’re evaluating… Here I’ll generically call it, an “intervention”
You’d be surprised how many policies are implemented that address non-existent problems. One of our evaluations in India is of a policy called continuous and comprehensive evaluation…
Now let’s ask a very specific question…