delivered at World Bank, part of Development Data Group Learning Series
Washington DC, 2016-03-07
Response rates do not always provide an accurate depiction of data quality. Research based on a large multi-country survey indicates that when interviewers play a substantial role in sample selection, interviewer manipulation may artificially generate high response rates. For example, when using the random walk selection technique, interviewers should select every kth household, but they have substantial leeway in deciding which household is the kth one, and may preferentially select those where someone is home. Or, when rostering a household to select a random respondent, interviewers may leave off household members who are seldom at home. If many interviewers engage in such behaviors, a high response rate may in fact be the result of biased sample selection and therefore indicate low data quality.
There are two lessons from these findings. First, response rates should not be used as the sole or primary proxy for data quality. Second, whenever possible, interviewers’ role in sample selection should be minimized. The talk concludes with a review of alternative sampling methods that take advantage of geospatial data such as satellite photos, drone imagery and handheld GPS devices. The ideal sampling techniques are ones that minimize interviewer discretion and allow for verification of interviewer performance.
1. www.rti.org. RTI International is a registered trademark and a trade name of Research Triangle Institute.
Response Rates Impact Data Quality,
but Not How You Might Think
Based on 2 papers:
Eckman, S and Koch, A. “The Relationship between Response Rates, Sampling Method and Data Quality: Evidence from the European Social Survey” Under Review
Eckman, S, Himelein, K and Dever, J. “Innovative Sample Designs Using GIS Technology” forthcoming in Advances in Comparative Survey Methods: Multicultural, Multinational and Multiregional Context.
Stephanie Eckman, RTI Fellow
2. Motivation
Relationship between RR & Data Quality
High response rates signal data are good quality
Response rates uncorrelated with data quality
– High RR survey no more accurate than low (Keeter et al, 2000)
– Merkle & Edelman (2002)
– Groves & Peytcheva (2008)
4. RRs do not Correlate with Nonresponse Bias
Groves & Peytcheva 2008
5. Motivation
Relationship between RR & Data Quality
High response rates signal data are good quality
Response rates uncorrelated with data quality
– High RR survey not more accurate than low (Keeter et al, 2000)
– Merkle & Edelman (2002)
– Groves & Peytcheva (2008)
But maybe high response rates are a sign that data are crap?
6. Data Quality
Total Survey Error Framework
– Undercoverage
– Nonresponse
– Measurement error
– Editing error
– Processing error
– etc.
Misrepresentation error
– Undercoverage + Nonresponse
Tradeoff between undercoverage & NR
– Eckman & Kreuter 2017
Image: http://makeagif.com/dkjuuc
7. European Social Survey
7 waves
30+ countries
Central Committee sets standards
– Core questionnaire
– Minimum effective sample size
– Paradata collection
– Documentation
– Face to face attempts
– RR standard 70%
Our data: 136 country-rounds in first 6 waves
8. Sampling Methods in Analysis

Sampling Method       Includes              Field Staff Involvement in Selecting       n
                                            Household        Person
Individual Register   Individual Register   None             None                      70
Household Register    Household Register    None             Interviewer               41
                      Address Register      None             Interviewer
Household Walk        Listing               Lister           Lister                    25
                      Random Walk           Interviewer      Interviewer
10. 2 Measures of Data Quality
External measure:
– How different is ESS from Labor Force Survey?
– On 6 categorical variables: age, gender, HH size, marital status, etc.
– Index of dissimilarity measures how different 2 surveys are
– Average over 6 variables
– Assumes LFS is higher quality
Internal measure:
– 50% of all respondents from gender-heterogeneous couples should be women
– |Z| > 1.96 indicates significant deviation from 50%

Z = (% female − 50) / sqrt(50 · 50 / n)

D = 0.5 · Σ |Y_ESS − Y_LFS|   (summed over the categories of each variable)
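The two measures on this slide can be sketched in a few lines of Python. Function and variable names here are mine, not from the paper:

```python
import math

def dissimilarity_index(p_ess, p_lfs):
    """Index of dissimilarity between two categorical distributions.

    p_ess, p_lfs: dicts mapping category -> proportion (each summing to 1).
    Returns 0.5 * sum of absolute differences; 0 means the surveys match exactly.
    """
    cats = set(p_ess) | set(p_lfs)
    return 0.5 * sum(abs(p_ess.get(c, 0.0) - p_lfs.get(c, 0.0)) for c in cats)

def gender_z(pct_female, n):
    """Z statistic for deviation of the observed % female from the expected 50%.

    |Z| > 1.96 flags a significant deviation at the 5% level.
    """
    return (pct_female - 50.0) / math.sqrt(50.0 * 50.0 / n)
```

For example, a country-round with 60% female respondents from gender-heterogeneous couples and n = 400 gives Z = 4.0, well past the 1.96 cutoff.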
11. 2 DVs, 2 IVs
Dependent variables: misrepresentation error
– External measure
– Internal measure
Independent variables
– RR
– Sampling method
Joint effect of RR and sampling method on data quality
14. Implications
High RRs might signal that you have problems with your data
– When interviewers select samples
– Interviewers seem to manipulate selection process to keep RRs high
Note that the ESS implements random walk better than other surveys do
– Listing should be done by someone other than interviewer
Other problems with random walk
– Walker effects
– No probabilities of selection
15. Possible Solutions
What are some alternatives to random walk?
– Satellite Photos
– Reverse Geocoding
– Qibla Method
– Geosampling
– Listing with Drones
16. GIS Resources
Turn by turn directions on phone
Satellite images
– Daytime images
– Small-sat revolution
– Nighttime lights
Other remote sensing data
How can we exploit these resources for sampling?
– And avoid random walk's problems
22. Qibla Method
Qibla is Arabic for “in the direction of Mecca”
Given random starting coordinate
– Interviewer walks in the direction of Mecca
– Selects first HH encountered
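The walking direction in the Qibla method is the initial great-circle bearing from the starting coordinate toward the Kaaba in Mecca. A minimal sketch, using the standard bearing formula (the coordinates and function name are my own, not from the talk):

```python
import math

# Approximate coordinates of the Kaaba in Mecca.
KAABA_LAT, KAABA_LON = 21.4225, 39.8262

def initial_bearing(lat1, lon1, lat2=KAABA_LAT, lon2=KAABA_LON):
    """Initial great-circle bearing in degrees (0 = north, 90 = east)
    from (lat1, lon1) toward (lat2, lon2)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360
```

From a point due south of Mecca on the same longitude, the bearing is 0 degrees (due north), as expected.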
25. Geosampling
Select first stage units
– Administrative units
– Or 1km squares
Select second stage units
– Smaller squares
Visit and interview all households in smaller unit
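The two-stage selection above can be sketched as follows, assuming a simple lat/lon bounding box and degree-based grid cells (cell sizes, subdivisions, and names are illustrative; a real implementation would work in projected coordinates):

```python
import random

def two_stage_grid_sample(min_lat, min_lon, max_lat, max_lon,
                          cell_deg=0.01, subdivisions=10, rng=None):
    """Two-stage areal sample: select one first-stage grid cell
    (roughly 1 km on a side near the equator when cell_deg ~ 0.01),
    then one smaller square inside it. All households in the selected
    sub-square would be visited and interviewed."""
    rng = rng or random.Random()
    # First stage: index the grid cells covering the bounding box.
    n_rows = max(1, int((max_lat - min_lat) / cell_deg))
    n_cols = max(1, int((max_lon - min_lon) / cell_deg))
    r, c = rng.randrange(n_rows), rng.randrange(n_cols)
    cell_lat, cell_lon = min_lat + r * cell_deg, min_lon + c * cell_deg
    # Second stage: select one of subdivisions^2 smaller squares in the cell.
    sub = cell_deg / subdivisions
    sr, sc = rng.randrange(subdivisions), rng.randrange(subdivisions)
    lat0, lon0 = cell_lat + sr * sub, cell_lon + sc * sub
    return (lat0, lon0, lat0 + sub, lon0 + sub)
```

Because every square has the same area, this yields an equal-probability sample of sub-squares with known selection probabilities.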
27. Geosampling: Second & Third Stage
Eliminates separate listing step
Still vulnerable to interviewer manipulation
Possible QC by interviewer GPS tracks? (Himelein et al, 2014)
28. Use of UAVs for Listing
RTI has tested listing from drone images
– Galapagos & Guatemala
Amer et al 2016
32. Conclusions
Ideal method:
– Removes influence of interviewer
– Results in equal probability sample of HUs
– With known probabilities
No alternative is perfect
– High involvement of interviewers
– High data requirements
Drones may prove useful
Going to explore connection between RR and data quality (UC + NR)
Data collection independent in each country
5 sample types used in ESS
Ordered here by interviewer involvement in selection (low to high)
Respondent selection via roster + Kish table or via birthday method
Recoded into 3
Very few surveys reach 70%
19% in 1
12% in 2
51% in 3
Higher means worse quality
No easy way to put a std err on this
Purposefully using strong language (cause, effect)
Gonna do some prelim analyses and then get into models
Naïve linear regression lines
1 -- nearly all of the country-rounds using the individual register sampling method have low external measures: these samples have relatively low misrepresentation error.
2 -- country-rounds using the household register sampling methods have slightly higher measures on average (meaning the samples are less representative)
3 -- values are also high.
4 -- most fall inside the [0, 1.96] region, meaning that the observed deviations from 50% female may be explained entirely by sampling error.
5 -- 46% of the country-rounds show gender ratios that are significantly different from 50%.
6 -- 76% of the country-rounds show significant deviations from 50% female.
Other thing we’re doing – testing slopes of reg lines
-- sig and + in 3,6
-- others not sig
Random effects by country
Also tried:
RR in tertiles, quintiles, deciles – results unchanged
binary indicator of significant internal bias – results unchanged
Fixed effects models
External
-- strong effect of sample type (but no diff 2 vs 3)
-- no effect of RR
Internal
-- strong effect of sample type (but no diff 2 vs 3)
-- small + effect of RR in model 7: high RR -> high misrep
Sample type matters for data quality
RR does not
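The model setup in these notes (misrepresentation error regressed on RR and sampling-method group) can be sketched as a least-squares fit on synthetic data. The coefficients and variable names are purely illustrative; the actual analysis also includes random effects by country:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 136  # country-rounds, as in the ESS analysis

# Synthetic data: response rate and sampling-method group (1-3).
rr = rng.uniform(0.3, 0.8, n)
group = rng.integers(1, 4, n)
t2, t3 = (group == 2).astype(float), (group == 3).astype(float)

# Illustrative "truth" mirroring the finding: sample type shifts the
# misrepresentation measure, response rate does not.
y = 0.10 + 0.05 * t2 + 0.08 * t3 + 0.0 * rr

# Design matrix: intercept, RR, dummies for groups 2 and 3 (group 1 is baseline).
X = np.column_stack([np.ones(n), rr, t2, t3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With this noise-free construction, the fit recovers a zero RR coefficient and nonzero sample-type effects, the qualitative pattern reported in the talk.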
When there is no register
Assuming clusters already selected
Many of these solutions make use of GIS data
Planet has 149 satellites, images the entire earth every day
Other data: LIDAR, PhoDAR
To go back to the photo we looked at earlier…
Could number the structures and select from the image
Give interviewers an image showing selected units
Software figures out what closest structure is
Software or interviewer figures out what closest structure is
Probability of selection??
Any direction would work
Similar to reverse geocoding
Gallup interested in piloting this
Selection region for structure 1 shown in blue
Good in theory, but blue area depends on position of all other buildings – how do we know this?
This is similar to reverse geocoding
Many points lead to selections outside the area – what to do?
New problem we didn’t have in reverse geocoding
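The probability-of-selection problem noted above can be made concrete: a structure's chance of selection equals the relative area of its nearest-point (Voronoi) region, which depends on where all the other structures sit. A Monte Carlo sketch, using planar distances and illustrative names:

```python
import random

def nearest_structure(point, structures):
    """Index of the structure closest to a point (planar distance;
    a sketch only, real use would need projected coordinates)."""
    px, py = point
    return min(range(len(structures)),
               key=lambda i: (structures[i][0] - px) ** 2 + (structures[i][1] - py) ** 2)

def estimate_probs(structures, bbox, draws=20000, rng=None):
    """Monte Carlo estimate of each structure's selection probability:
    the share of random points for which it is the nearest structure,
    i.e. the relative area of its Voronoi cell within the bbox."""
    rng = rng or random.Random(0)
    x0, y0, x1, y1 = bbox
    counts = [0] * len(structures)
    for _ in range(draws):
        p = (rng.uniform(x0, x1), rng.uniform(y0, y1))
        counts[nearest_structure(p, structures)] += 1
    return [c / draws for c in counts]
```

Two structures placed symmetrically in a unit square each get probability near 0.5; in realistic layouts the probabilities are unequal and unknowable without mapping every structure, which is the new problem these notes point to.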
Challenges in Implementation
Satellite images incomplete, outdated, or unavailable
Satellite image resolution low and captures only rooftops
Difficult to determine if structure is a business, group quarters, vacant, controlled access, etc.
Environmental changes (landslides, etc.) and new buildings not captured
GPS accuracy varies across countries
Detailed rural road networks unavailable in the majority of cases, with accessibility issues due to elevation and natural barriers (e.g. ravines)
Improve analysis within Galapagos
Use local staff to extract information from drone imagery
Compare consistency between drones and Geo-listing
Estimate percent error across methodologies
Combine methodologies to improve/update imagery
Extend to Guatemala to explore some urban and rural settings
Assess use in conflict affected and fragile locations
Recommendations and guide to use for
Census updates
Sampling
Field work support
Not yet at the point where drones can replace humans in data collection!
How building looks on street view
How building looks from drone