www.srcentre.com.au
Improving inferences from poor quality samples
Social Research Centre / Centre for Social Research and Methods Workshop
16 August, 2017
Presenters
Social Research Centre Pty Ltd
Darren W. Pennay
Dina Neiger, PhD
Paul J. Lavrakas, PhD
Darren Pennay is an Adjunct Senior Research Fellow with the ANU Centre for Social Research and Methods and
an Adjunct Professor with the Institute for Social Science Research at the University of Queensland
Dina Neiger is a Centre Visitor with the ANU Centre for Social Research and Methods
Paul Lavrakas is a Senior Fellow, NORC at the University of Chicago; Senior Research Fellow, Office of Survey
Research, Michigan State University; and Senior Methodological Advisor for the ANU’s Social Research Centre
The Social Research Centre
➢ Based in Melbourne, Australia
o owned by the Australian National University in Canberra
o operates as a for-profit research services company.
o profits returned to ANU
o Co-founders of the ANU Centre for Social Research and Methods
➢ Conduct social and public policy research for government, not-for-profit and academic
organisations
➢ Primary data collection involving survey research, qualitative research, statistical
methods, data processing and analytics, survey methodology
➢ 65 staff, plus 120-seat CATI call centre
What is Total Survey Quality?
➢ Many national statistical offices, including those of Australia, Canada, New Zealand and the USA, operate within a Total Survey Quality (TSQ) framework to determine whether or not a survey is “fit-for-purpose”.
➢ Accuracy is not enough.
Dimensions of TSQ Framework
Dimension                     Description
Accuracy                      Total survey error is minimised
Credibility                   Data are considered trustworthy by the survey user communities
Comparability                 Consistent with past studies in terms of demographic, spatial, and temporal comparisons
Usability / Interpretability  Documentation is clear and metadata are well organised
Relevance                     Data satisfy user needs
Accessibility                 Access to the data is user friendly
Timeliness / Punctuality      Data deliverables adhere to schedules
Completeness                  Data are rich enough to satisfy the analysis objectives without undue burden on respondents
Coherence                     Estimates from different sources can be reliably combined
Source: Biemer, P. (2010) Public Opinion Quarterly 74 (5): p.819.
Survey Cycle from a Design Perspective
[Diagram: Representation side – Target Population → Sampling Frame → Designated Sample → Final Sample; Measurement side – Specification → Measurement → Response; both paths converge on the Final Dataset and the Final Results & Conclusions]
Total Survey Error Framework
[Diagram: Errors of representation – Coverage Error (Target Population → Sampling Frame), Sampling Error (→ Designated Sample), Nonresponse Error (→ Final Sample), Adjustment Error (→ Final Dataset); Errors of measurement – Specification Error, Measurement Error, Processing Error; both branches contribute Inferential Error to the Final Results & Conclusions]
What is a probability sample?
Textbook Definition: A probability sample is one in which every unit in the
population of interest has a known, non-zero probability of being selected for
the sample.
Key elements that differentiate the process from a non-probability
sampling process:
o Selection into the sample is via a random process.
o Every unit in the population of interest has a chance of being sampled for the research.
o The probability of selection is known.
These design features enable us to:
o Calculate standard errors
o Calculate confidence intervals
o Generalise to the target population of interest
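These design features are what make classical inference possible. As a minimal sketch (assuming simple random sampling; a real survey would need design-based variance estimators that reflect weighting and clustering), the standard error and approximate 95% confidence interval for an estimated proportion are:

```python
import math

def proportion_se_ci(p_hat, n, z=1.96):
    """Standard error and ~95% CI for a proportion under simple random sampling."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return se, (p_hat - z * se, p_hat + z * se)

# Illustrative values only: an estimate of 52% from a sample of n = 1,000
se, (lo, hi) = proportion_se_ci(0.52, 1000)
```

Non-probability samples lack known selection probabilities, so this machinery has no formal justification there.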
Trends in response rates
Non-response and non-response bias
The general decline in response rates is evident
across nearly all types of surveys, in the United
States and abroad. At the same time, greater effort
and expense are required to achieve even the
diminished response rates of today.
These challenges have led many to question
whether surveys are still providing accurate and
unbiased information [and whether or not
probabilistic surveys are better than the
alternatives?]
Pew Research Center, 2012
Trends in response rates (U.S. data)
There is no sign of an increase in
nonresponse bias since 2012.
On 13 demographic, lifestyle, and health
questions that were compared with
benchmarks from high response rate federal
surveys, estimates from phone polls are just
as accurate, on average, in 2016 as they
were in 2012. The average (absolute)
difference between the Center telephone
estimates and the benchmark survey
estimates was 2.7 percentage points in
2016, compared with 2.8 points in 2012.
Trends in response rates (U.S. data)
[Chart slide]

Impact of declining response rates on telephone survey estimates in the U.S.
[Chart slide]
Impact of low response rates on survey estimates
Pew conclusions 2012
➢ Despite the growing difficulties in obtaining a high level of participation in most surveys, well-designed telephone polls that include landlines and cell phones reach a cross-section of adults that mirrors the American public, both demographically and in many social behaviours.
Pew conclusions 2016
➢ Telephone polls still provide accurate data on a wide range of social, demographic and political variables, but some weaknesses persist.
What is the situation in Australia?
A literature review and industry consultation commissioned by RICA and undertaken by
Bednall et al. (2013) concluded the following:
➢ Telephone response rates:
o Telephone response rates have been in gradual decline over the last decade.
o Among cold-call general community surveys, telephone response rates are typically below 10%.
o Co-operation rates (the ratio of obtained interviews to refusals) are typically below 0.2 (that is, below one interview to five refusals).
o Telephone interviews from client lists have a higher response rate – typically above 20%, with co-operation rates above 1.0.
o It would appear that some topics, such as financial services, may induce a lower level of co-operation.
o Government-sponsored surveys have higher response rates, at times over 50%, but even here a sharp decline in response rates over time was observed for one long-running monitor. Co-operation rates were also higher in government-sponsored surveys.
What is the situation in Australia – SRC Landline surveys
[Chart: Non-contacts as a % of usable sample, 2011–2015, for five landline samples (national and Victorian)]
[Chart: Refusals as a % of contacts, 2011–2015, same samples]
[Chart: Interviews as a % of interviews plus refusals, 2011–2015, same samples]
What is the situation in Australia (cont.)?
! Not saying that low response rates are inconsequential!
! Poorly designed and poorly executed surveys, especially those with very
limited call routines (e.g. polls/trackers enumerated over one or two
evenings only), those with non-coverage errors (e.g. landline telephone
surveys), and those that use non-probability sampling methods (e.g.
online panels) are much more likely to produce biased results.
! HOWEVER, well designed and well executed probability-based surveys
still produce estimates that can be relied on in many situations (and with
known standard errors and confidence intervals!)
This appears to be the main situation regarding non-response
The sky is not falling in
A Word of Warning on Probability Sampling
Rivers again ….
[Probability surveys work] if the nonresponse rate is small. Cochran
(1977, p. 363), in his classic text, concluded that the upper limit is
approximately 10 percent nonresponse, which is difficult to achieve today in
even the best funded surveys.
Lohr (2010, p. 355) warns that “many examples exist of surveys with a 70%
response rate whose results are flawed.”
Source: Rivers, D. (2013). Comment [on the AAPOR Task Force Report on Non-probability Sampling]. Journal of Survey Statistics and Methodology, 1(2), 111-117. doi:10.1093/jssam/smt009
The rise of non-probability online panels
The take-up of non-probability online panels
The situation in Australia
➢ In 2014-15, 86% of Australian households had access to the internet at home (ABS).
➢ Online research is continuing to grow in popularity domestically and internationally.
Percent of research industry turnover
Year     2009  2010  2012  2013  2015
Online     24    39    34    34    40
CATI       26    28    25    31    12
Source: ESOMAR / Research Industry Council of Australia
➢ Online research globally – 28% in 2014.
➢ 50+ commercial ‘online research panels’ in Australia; the first of these established in the late 1990s.
➢ All ‘until now’ use non-probability sampling methods.
The rise of non-probability panels - globally
[Chart: Online Research Revenues 1999-2016E, in $US millions (vertical axis $0 to $8,000M), for the US, Europe and the rest of the world]
Pros and cons of non-probability panels
Pros
➢ Reduced cost
➢ Improved timeliness
➢ Respondent convenience
➢ Reduced social desirability bias
➢ Can target ‘hard to reach’ populations
➢ Multimedia functionality
➢ Computerised questionnaire scripts
Cons
➢ Non-coverage
➢ Self-selection
➢ Reliance on computer-literate respondents
➢ Non-probability sampling precludes:
o Calculating standard errors
o Calculating confidence intervals
o Generalising to the target population of interest
How do they compare – the SRC Online Panels Benchmarking Survey
➢ Three surveys based on probability samples of the Australian population aged 18 years
and over and five surveys of persons aged 18 years and over administered to members
of non-probability online panels.
➢ Survey questionnaire included a range of demographic questions and questions about
health, wellbeing and use of technology for which high-quality population benchmarks
were available.
➢ Same questions were used across the eight surveys.
➢ 9 minutes average interview length for online and telephone.
o 12 page booklet for the hard copy version.
➢ Fieldwork Oct – Dec 2015.
➢ Data and documentation available from the Australian Data Archive.
https://www.ada.edu.au/ada/01329
Results – substantive health characteristics
Percentage point distance from benchmarks (probability samples: ABS, ANU Poll, RDD; non-probability panels: P1-P5)

Substantive variable                        Benchmark    ABS  ANU Poll    RDD     P1     P2     P3     P4     P5
Life satisfaction (8 out of 10)                32.6     -2.0     -2.0    1.9  -11.9  -11.6   -4.5   -9.2   -7.9
Psychological distress - Kessler 6 (Low)       82.2    -10.6    -11.6   -8.1  -25.9  -23.5  -22.2  -25.0  -23.2
General Health Status (SF1) (Very good)        36.2      0.4     -2.0   -2.6   -4.1   -5.8   -5.3   -5.0    1.5
Private Health Insurance                       57.1      3.4      1.9    3.3   -8.9  -12.5   -3.7   -0.6   -2.6
Daily smoker                                   13.5     -4.1      3.5    1.6    9.8    6.7    3.9    2.7    4.3
Consumed alcohol in the last 12 months         81.9      3.6     -2.8   -4.0    2.4    5.3    3.9    4.2    1.5
Results – substantive health characteristics
Average error across six substantive measures

                                      Probability surveys       Non-probability panels
                                      ABS  ANU Poll    RDD     P1     P2     P3     P4     P5
Avge. error                          4.02      3.98   3.58   10.5   10.9   7.24   7.78   6.83
Largest absolute error              10.59     11.57   8.08  25.86  23.52  22.20  24.96  23.20
No. of significant differences
from benchmarks (out of 6)              2         3      2      4      6      3      3      3
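The “avge. error” row is the mean absolute percentage-point distance from the benchmarks. A minimal sketch of that calculation, using the RDD column’s six errors from the previous table:

```python
def avg_abs_error(errors):
    """Mean absolute percentage-point distance from benchmark values."""
    return sum(abs(e) for e in errors) / len(errors)

# RDD percentage-point errors on the six substantive health measures
rdd_errors = [1.9, -8.1, -2.6, 3.3, 1.6, -4.0]
avg = avg_abs_error(rdd_errors)  # approx. 3.58, matching the table
```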
Two questions …
1. How do we reduce non-response bias in low response rate probability surveys?
2. How do we reduce bias in survey estimates based on non-probability online panels?
Instructors
Dina Neiger, PhD
➢ Dina is a professional statistician with over 20 years’ experience and a track record of achievement in leadership and technical roles.
➢ Social Research Centre, Monash University, Australian Bureau of Statistics, Peter
MacCallum Cancer Centre.
➢ 1st Class Honours degree in Statistics and PhD in Business Systems from Monash
University with an emphasis in applied Operations Research and Process Engineering.
➢ Accredited Statistician (AStat) and member of the Statistical Society of Australia (SSA).
➢ Full member of the Australian Society for Operations Research (ASOR).
➢ Work includes calibration and blending methods to improve the accuracy of non-probability samples, establishment and maintenance of the first Australian online probability panel, and complex business survey design and weighting.
Instructors
Paul J. Lavrakas, PhD
➢ Research psychologist, research methodologist, and prolific author.
➢ Since 2007, Independent Consultant with clients in the USA, Australia, Belgium, Japan and Canada; Senior Fellow at NORC (University of Chicago) and the Office of Survey Research (Michigan State University).
➢ From 2000-2007, Chief Research Methodologist for Nielsen Media Research.
➢ 1978-2000 - Professor at Northwestern U. and Ohio State U. and founding faculty
director of a survey research center at each university.
➢ Australian roles - Senior Methodological Adviser, the Social Research Centre. Member
of the Scientific Advisory Board for the Centre for Social Research and Methods, ANU.
➢ President of the American Association for Public Opinion Research (2011-2014) and
continues to serve on a volunteer basis on many AAPOR task forces and committees.
Investigating Nonresponse Bias
by Level of Effort
Paul J. Lavrakas
Investigating Nonresponse Bias
➢ Methods have evolved in the past decade regarding how nonresponse bias can be investigated (Groves and Brick, 2007; Olson and Montaquila, 2012)
➢ One approach is to conduct analyses comparing early and middle responders vs. late responders on key measures in a survey, with the hypothesis/premise being that late responders are more like nonrespondents than are early and middle responders
➢ However, if no differences are found between late responders vs. early and middle responders, that is NOT taken as evidence that there is no difference between respondents and nonrespondents.
Case Study 1: University Faculty Health Benefits Survey
Study Background
➢ The purpose of this survey was to provide reliable and valid information about the
opinions of faculty towards the prospect of the university creating a new health care
service for faculty working on the main campus.
➢ Approximately 800 of 5,000 members of the faculty were randomly sampled, and a telephone survey was conducted by the university’s survey research center, which yielded a 47% completion rate.
➢ Nonresponse bias was investigated using multiple methods, with one being a Level of
Effort analysis.
Level of Effort Analyses
➢ Level of Effort was operationalized as the number of calls made to a given sampled
respondent who completed the interview
➢ Analyses were conducted to investigate whether the effort it took to complete an
interview was related to any of the substantive questions that were asked in the
questionnaire.
➢ From these analyses it was learned that none of the key questionnaire items had a statistically significant (p < .05) association with this effort measure
➢ For one of the questionnaire items – asking whether the respondent had heard about
the possible new health service prior to being interviewed – there was a marginally
significant correlation (p < .07) with effort, but the size of this correlation was extremely
small (r = 0.09).
➢ Thus, it was judged that there was no reliable evidence that any nonignorable nonresponse bias was present in the findings reported in the main body of the report.
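A level-of-effort analysis of this kind can be sketched as follows; the data here are made up for illustration, and a real analysis would also compute a significance test (e.g. via scipy.stats.pearsonr):

```python
import numpy as np

def level_of_effort_correlation(call_attempts, item):
    """Pearson correlation between calls-to-completion and a numeric survey item."""
    return float(np.corrcoef(call_attempts, item)[0, 1])

# Hypothetical respondents: calls needed to complete, and a 0/1 key item
calls = [1, 1, 2, 3, 3, 4, 5, 6, 7, 9]
item = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
r = level_of_effort_correlation(calls, item)  # negative here: harder-to-reach cases answer 0 more often
```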
Case Study 2: Voter Identification Survey
Study Background
➢ In 2012, there were several states in the USA whose state governments were controlled by conservative party members who passed legislation changing the requirements for the identification that a voter must show to prove that he or she was eligible to vote on the day of the Presidential Election.
➢ The purpose of the survey, which was funded by large labor unions that were known to
support liberal candidates/policies, was to identify how many and which types of
residents in the state were most likely to be disenfranchised by the new Voter ID
legislation.
Level of Effort Analysis 1
➢ One approach was to compare key data provided by those who initially refused, but
later agreed to complete the questionnaire after being recontacted with a “refusal
conversion” protocol, with data provided by the cohort that never refused.
➢ In this analysis it was found that for neither of the two key statistics measured by the survey – awareness of the new Voter ID legislation; whether someone had a valid photo ID to vote at their local polling place on November 6, 2012 – were there statistically significant differences (p < .05) between those who never refused and those who initially refused but later were converted.
➢ Thus, there was no evidence found to suggest that refusing nonrespondents who were
not converted would have given materially different answers to these key questions than
did the respondents.
Level of Effort Analysis 2
➢ A second approach was to take into account the number of call attempts it took to
complete an interview, and investigate if answers to the same two key variables
correlated with the level of effort expended.
➢ In this analysis it was found that as more effort was expended to reach a respondent,
the likelihood the respondent had a valid photo ID for voting purposes increased at a
statistically significant level (p < .007).
o Thus, the findings suggested that the unweighted survey data were somewhat biased in the
direction of underestimating the proportion of people with a valid photo ID
Level of Effort Analysis 2 (cont.)
➢ However, since the level of effort to achieve a completion correlated with various demographic characteristics of respondents (e.g., it took less effort to reach elderly respondents and more effort to reach young adults), and given that key demographic characteristics were taken into account in the weighting process, it was judged to be unlikely that the findings reported about the proportion of registered voters without a valid photo ID were subject to nonignorable nonresponse bias.
➢ Concerning awareness of the legislation, there was no statistically significant difference
associated with the level of effort expended to gain a completion.
o Thus, there was no evidence found to suggest that uncontacted nonrespondents would have given materially different answers to this key question than did the respondents.
Non-Response Follow-up Studies
Paul J. Lavrakas
Case Study: IMLS Nonresponse Follow-Up (NRFU) Study
➢ Another preferred method to investigate Nonresponse Bias is to conduct a follow-up
survey of a sample of the original survey’s nonresponders.
➢ Little has been reported regarding how to use the data from such follow-up studies
➢ Please note that my slides today are extracted from a paper presentation that I made in 2015 at the 70th annual conference of the American Association for Public Opinion Research (AAPOR), “Studying Nonresponse Bias with a Follow-up Survey of Initial Nonresponders in a Dual Frame RDD Survey”
2013 Public Needs for Library and Museum Services (PNLMS) Survey
➢ Sponsored by the Institute of Museum and Library Services (IMLS), a federal agency in the USA
➢ Conducted by M. Davis and Company, Inc., Philadelphia PA
➢ National dual-frame RDD survey of the general population of the USA
o Data gathered September-November 2013
o 3,537 interviews completed; 2,506 from LL frame; 1,031 from CP frame
• Landline sample AAPOR RR3 was 25%
• Cell Phone sample AAPOR RR3 was 10%
2014 IMLS Nonresponse Follow-up (NRFU) Survey
➢ Created shortened questionnaire with key variables
➢ Used a noncontingent and a contingent incentive protocol
➢ Used only best calibre interviewers
➢ 201 interviews were completed in January-February 2014 with a random sample of
nonresponders to the main survey; 100 from LL frame, 101 from CP frame
o Landline follow-up sample AAPOR RR3 was 32%
o Cell Phone follow-up sample AAPOR RR3 was 16%
➢ The original sample and the follow-up sample had remarkably similar demographic characteristics, with the exception that the follow-up sample had proportionally more young adults aged 18-24 years and fewer older adults aged 65+ years (p < .03) than the original survey.
Analytic Approach: Phase 1
➢ Combining the two surveys via Weighting, in two Phases
o Phase 1: Begin by assigning weights, separately, to each of four groups
• Landline RDD sample (original study)
• Landline NR follow-up sample
• Cell phone RDD sample (original study)
• Cell phone NR follow-up sample
Analytic Approach: Phase 1
➢ For each of the four groups, Phase 1 included the following seven steps, conducted for each US Census region within each sample type:
1. Probability of Selection and Design Weight
2. Nonresponse Follow-up Adjustment
3. Unknown Eligibility Adjustment
4. Removal of Known Ineligible
5. Unit Nonresponse Adjustment
6. Adult Subsampling Adjustment
7. Multiple Phone Line Adjustment
➢ Steps 1 and 2 created the weights to start the weighting process
➢ Steps 3-7 were sequential adjustments to the starting weights
1. Probability of Selection and Design Weight
For each of the four samples:
➢ Using information from the sampling frame, the probability of selection was calculated
as the number of released telephone numbers in each Census region divided by
the total number of telephone numbers in the census region
➢ The design weight was the inverse of the probability of selection for the released
telephone numbers and zero for the non-released telephone numbers
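Using hypothetical counts, the two quantities on this slide can be sketched as:

```python
def design_weight(released_in_region, total_in_region, was_released=True):
    """Design weight: inverse probability of selection for released numbers,
    zero for non-released numbers."""
    if not was_released:
        return 0.0
    prob_selection = released_in_region / total_in_region
    return 1.0 / prob_selection

# Hypothetical region: 5,000 of 200,000 frame numbers released
w = design_weight(5_000, 200_000)  # 40.0
```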
2. Nonresponse Follow-up Adjustment
For each of the two NRFU samples:
➢ For the telephone numbers that were eligible for the nonresponse follow-up study, a
further adjustment to the design weight was required to account for the subsampling of
telephone numbers that were called from among all the eligible nonresponse follow-up
numbers
➢ This adjustment to the design weight was calculated as the number of follow-up
eligible telephone numbers divided by the number of follow-up eligible called
telephone numbers
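With hypothetical counts, this subsampling adjustment is a simple scaling of the design weight:

```python
def nrfu_adjusted_weight(design_wt, n_followup_eligible, n_followup_called):
    """Scale the design weight by (follow-up eligible numbers / follow-up
    eligible numbers actually called) to account for subsampling."""
    return design_wt * (n_followup_eligible / n_followup_called)

# Hypothetical: 10,000 follow-up-eligible numbers, 2,000 subsampled and called
w = nrfu_adjusted_weight(40.0, 10_000, 2_000)  # 200.0
```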
3. Unknown Eligibility Adjustment
As in any DFRDD survey, a considerable portion of the sampled numbers
ended the original field period with their eligibility unresolved
For each of the four samples:
➢ Using standard AAPOR final disposition codes, two groups of telephone numbers were
created in each Census region: (1) unknown eligibility telephone numbers and (2)
known eligibility telephone numbers
➢ Unknown eligibility adjustment factor was the sum of the follow-up adjusted design
weights for all telephone numbers divided by the sum of the follow-up adjusted
design weights for the known eligibility telephone numbers
➢ The unknown eligibility adjusted weight is the product of the unknown eligibility
adjustment factor and the follow-up adjusted design weight for the known eligibility
telephone numbers and zero (0.0) for the unknown eligibility telephone numbers
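A minimal sketch of this adjustment with hypothetical weights; note that the total weight is preserved while the unknown-eligibility numbers are zeroed out:

```python
def unknown_eligibility_adjust(weights, known):
    """Shift weight from unknown-eligibility numbers onto known-eligibility
    numbers. Factor = sum(all weights) / sum(known-eligibility weights);
    unknown-eligibility numbers get weight zero."""
    factor = sum(weights) / sum(w for w, k in zip(weights, known) if k)
    return [w * factor if k else 0.0 for w, k in zip(weights, known)]

# Hypothetical: four numbers with weight 10, one of unknown eligibility
adj = unknown_eligibility_adjust([10, 10, 10, 10], [True, True, True, False])
```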
4. Removal of Known Ineligible
As with all telephone surveys of the general public, many of the telephone
numbers that were released were found to be ineligible for various reasons,
including various non-working numbers (e.g., disconnected or temporarily
out of service, technical problems, etc.), and numbers that were not part of
the target population, e.g., fax, business, or other nonresidential numbers.
For each of the four samples:
➢ Using AAPOR final disposition codes, these known ineligible telephone numbers
were identified and removed from the weighting process
5. Unit Nonresponse Adjustment
As with all telephone surveys of the general public, for many of the
telephone numbers for which contact was made, it was determined that
there was at least one eligible adult but no data were collected
For each of the four samples:
➢ Using the final disposition codes for the eligible telephone numbers produced two groups
of eligible telephone numbers in each Census region, (1) eligible responding
telephone numbers and (2) eligible non-responding telephone numbers
➢ The unit nonresponse adjustment shifts the weights from the eligible nonrespondents to
the eligible respondents
➢ The nonresponse adjustment factor was the sum of the unknown eligibility adjusted
weights for all eligible telephone numbers divided by the sum of the unknown
eligibility adjusted weights for the responding telephone numbers
➢ The nonresponse adjusted weight was the product of the nonresponse adjustment factor
and the unknown eligibility adjusted weight for respondents, and zero (0.0) for eligible
nonrespondents
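This is the same redistribution pattern as the unknown-eligibility step, applied among eligible numbers. A minimal sketch with hypothetical weights:

```python
def unit_nonresponse_adjust(weights, responded):
    """Shift weight from eligible nonrespondents to eligible respondents.
    Factor = sum(weights of all eligible numbers) / sum(responder weights)."""
    factor = sum(weights) / sum(w for w, r in zip(weights, responded) if r)
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]

# Hypothetical: four eligible numbers, two responded
adj = unit_nonresponse_adjust([2.0, 2.0, 2.0, 2.0], [True, False, True, False])
```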
6. Landline Adult Subsampling Adjustment
Within eligible landline households there may have been more than one
eligible adult in residence; this was determined in the questionnaire
For each of the four samples:
➢ When there were two or more eligible adults associated with a landline telephone number, the adult subsampling adjustment factor was capped at two (2.0)
➢ The adult subsampling adjusted weight was the product of the adult subsampling
adjustment factor and the nonresponse adjusted weight.
7. Multiple Phone Line Adjustment
Most adults can be sampled on more than one phone line.
For each of the four samples:
➢ A multiplicity adjustment was implemented when there were two or more telephone
numbers, either landline or cell phone, at which the responding adult could have been
contacted; this multiplicity adjustment factor was capped at two (2.0), otherwise,
the multiplicity adjustment factor was one (1.0)
This completed Phase 1 of the separate weighting of each of the four
groups
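The multiplicity step can be sketched as below. The slide states only that the factor is capped at two; dividing the weight by the (capped) number of lines is the standard down-weighting for multi-line adults, so that direction is an assumption here:

```python
def multiplicity_adjust(weight, n_phone_lines):
    """Down-weight adults reachable on several phone lines: divide the weight
    by the number of lines, with the factor capped at two (2.0)."""
    factor = min(n_phone_lines, 2)  # capped at 2.0 per the protocol above
    return weight / factor

w_one = multiplicity_adjust(100.0, 1)    # factor 1.0 -> 100.0
w_three = multiplicity_adjust(100.0, 3)  # factor capped at 2.0 -> 50.0
```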
Phase 2 Combining Samples: Composite Weights
For each of the four samples:
➢ The composite weighting adjusted the multiplicity adjusted weight from the landline
respondents and cell phone respondents to account for overlap in the two samples
➢ If the person could have been reached by both frames the composite weight was 0.5;
if the person could only be reached by one frame it was 1.0
Phase 2 Combining Samples: Calibration
For each of the four samples:
➢ The final adjustment forced the weight totals from the survey data using the composite
weight to match external population control totals; the external control totals were
based on the following characteristics:
o An extrapolation of Blumberg and Luke’s 2013 NHIS findings for telephone service
ownership so that 50% were both landline and cell phone, 41% cell phone only, and 9%
landline only
o Socio-demographic characteristics from the most recent American Community Survey
• Gender (Male, Female)
• Age group (18-44, 45-64, 65+)
• Marital status (Never Married, Married, Separated/Divorced/Widowed)
• Hispanicity/Race (Hispanic, Non-Hispanic African American, Non-Hispanic White, Non-Hispanic Other)
• Education (Less than High School/High School, Some College, Associate or Bachelor Degree, Advanced Degree)
• Presence or absence of children (No, Yes)
Phase 2 Combining Samples: Calibration (cont.)
➢ The calibration methodology that was used was Iterative Proportional Fitting, i.e.,
raking or sample balancing;
➢ This method forces all of the different characteristics to simultaneously match the
control totals
o Each of the socio-demographic characteristics was used as a separate dimension in the
raking process
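A minimal raking (IPF) sketch, with toy respondents and control totals invented for illustration: each pass rescales the weights so one dimension's weighted category totals match its controls, and passes are repeated until all dimensions match simultaneously.

```python
import numpy as np

def rake(weights, dimensions, control_totals, n_iter=50):
    """Iterative proportional fitting over categorical dimensions."""
    w = np.asarray(weights, dtype=float)
    for _ in range(n_iter):
        for cats, targets in zip(dimensions, control_totals):
            cats = np.asarray(cats)
            for category, target in targets.items():
                mask = cats == category
                current = w[mask].sum()
                if current > 0:
                    w[mask] *= target / current  # match this category's control
    return w

# Toy example: four respondents raked to gender and age-group control totals
gender = ["m", "m", "f", "f"]
age = ["young", "old", "young", "old"]
w = rake([1, 1, 1, 1], [gender, age],
         [{"m": 50.0, "f": 50.0}, {"young": 40.0, "old": 60.0}])
```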
Using the Final Combined Data from the Four Samples
➢ After all this was done we had a weighted dataset containing all the interviews from
the original study and from the follow-up survey of nonresponders (n= 3,738)
➢ This allowed us to generate two types of population estimates for key behavioral
variables in the study
o Estimates based on the original survey only
o Estimates based on the combined surveys
➢ We then compared the differences between the two estimates to determine whether or
not a statistic was materially different when the data from the nonresponse follow-up
survey was taken into account
An Example of a Material Difference in a Key Statistic
➢ Percent of parents/guardians reporting having a child who visited a Zoo/Aquarium in the
past 30 days
➢ 8.1 percentage point difference
➢ Estimated 6.1M difference
                 Original Survey   Combined Surveys
Yes, Did Visit        19.2%             27.3%
Est. # in USA        14.45M            20.55M
The Extent of Differences Identified by the NR Follow-up Study
➢ For 10 of the 28 behavioral measures, the percentage found in the original survey
differed by more than two percentage points (2 pp) from the percentage found in the
combined survey dataset
➢ Of these, six of the behavioral measures differed by more than five percentage points (5
pp)
➢ All of these measures were for estimates associated with the reported behavior of
children (where the adult respondent served as a proxy for her/his child) and none of
them were for estimates of the reported behavior of the adult respondent
Nonresponse Bias exists at the level of the individual measure, and here we
found evidence of several measures that likely were highly biased
Response Propensity Modelling – Non-response Adjustments
Dina Neiger
Acknowledgement
Analysis by Andrew Ward, Principal Statistician, Social Research Centre
Thanks to Dr Siu-Ming Tam and Mr Paul Schubert, ABS, for their
collaboration on this work and their kind permission to use the survey data
for this presentation.
Slides based on presentation by Andrew Ward at the Australian Statistical
Conference in December 2016.
Community Trust in Statistics Survey
➢ Determine public awareness and trust of official statistics
➢ Dual-frame phone survey
➢ Also available for respondents and non-respondents: part-of-state (based on the landline prefix) or mobile
➢ Collected info from ~727 refusals – age, sex, awareness, trust
Challenge
➢ Incorporation of refusal information in the non-response adjustment
➢ Probability of response derived from a propensity model:
Base weight = Design weight / Probability of response
➢ Limited auxiliary information available for respondents and non-respondents
Non-response adjustment
Awareness / Trust
Non-respondents
(%)
Respondents
(%)
Have heard of and trust a great deal 12.4 20.5
Have heard of and tend to trust 28.5 54.6
Have heard of and tend to distrust 6.2 10.5
Have heard of and distrust a great deal 4.8 2.2
Have heard of but DK / Refused trust 26.3 2.9
Total awareness 78.2 90.7
Have not heard of 17.1 9.0
Don’t know / Refused 4.8 0.4
69
Non-response adjustment (cont'd)
➢ A logistic regression model predicts the response probability from information available for both respondents and non-respondents: location, age, sex, awareness of official statistics, trust in official statistics
➢ This gives a boost to the respondents most like the non-respondents
70
Non-response adjustment (cont'd)

Awareness / Trust                         Unweighted   Without propensity   With propensity   With - Without
                                             (%)          weight (%)          weight (%)           (pp)
Have heard of and trust a great deal         20.5           15.3                14.3              -1.0
Have heard of and tend to trust              54.6           52.6                47.5              -5.1
Have heard of and tend to distrust           10.5           11.1                 9.9              -1.2
Have heard of and distrust a great deal       2.2            2.4                 2.9              +0.5
Have heard of but DK / Refused trust          2.9            3.4                 7.5              +4.1
Total awareness                              90.7           84.8                82.1              -2.7
Have not heard of                             9.0           14.5                17.1              +2.6
Don't know / Refused                          0.4            0.7                 0.9              +0.2
71
Non-response adjustment (cont'd)
Assumptions:
➢ Refusals who provided basic information are representative of all non-contacts
➢ Refusals and respondents answer the non-response items in the same way
72
Non-response adjustment (cont'd)
Has face validity, though…
➢ For example, persons with a university education are typically over-represented in RDD surveys
➢ Therefore the awareness estimates may have been inflated
73
Techniques to reduce bias in non-probability samples
Dina Neiger
Acknowledgments
Data: The Social Research Centre; Cancer Council Victoria
Slides and ideas: Darren Pennay, Andrew C. Ward, Paul J. Lavrakas, Tina Petroulias, Sebastian Misson, Charles DiSogra, and the participants of the Inferences from Non-Probability Samples Workshop, Paris 2017
75
Key background references
➢ Yeager, D. S., Krosnick, J. A., Chang, L., Javitz, H. S., Levendusky, M. S., & Simpser, A. (2011). Comparing the accuracy of RDD telephone surveys and Internet surveys conducted with probability and non-probability samples. Public Opinion Quarterly.
➢ DiSogra, C., Cobb, C., Chan, E., & Dennis, J. M. (2011). Calibrating non-probability Internet samples with probability samples using early adopter characteristics. Section on Survey Research Methods – JSM 2011.
➢ Valliant, R., & Dever, J. A. (2011). Estimating propensity adjustments for volunteer web surveys. Sociological Methods & Research, 40(1), 105–137.
➢ Terhanian, G., Bremer, J., & Haney, C. (2014). A model-based approach for achieving a representative sample.
➢ Fahimi, M., Barlas, F. M., Thomas, R. K., & Buttermore, N. (2015). Scientific surveys based on incomplete sampling frames and high rates of nonresponse. Survey Practice, 8(5).
76
Usual starting point
➢ Design weight – the inverse of the probability of selection
o Probability sample: number of people in the household, number of landlines, etc.
o Non-probability sample: unknown; usually assigned a design weight of 1
➢ Calibration (post-stratification, raking, RIM)
o Uses external measures as "benchmarks" to adjust/weight (calibrate) the data, improving accuracy and forcing estimates to be consistent with other data sources
o Applied to both probability and non-probability samples for key population demographics (e.g. gender, age, education) so the sample reflects the population distribution
o If a non-probability sample uses quotas for the same demographics, then post-stratification will have minimal impact
77
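The raking step described above can be illustrated with a small iterative-proportional-fitting loop. A minimal sketch with invented margins and category codes (not the study's actual benchmarks):

```python
import numpy as np

# Hypothetical sample: a gender code (0/1) and an age group (0/1/2) per respondent.
rng = np.random.default_rng(1)
n = 500
gender = rng.integers(0, 2, n)
agegrp = rng.integers(0, 3, n)
w = np.ones(n)  # design weight = 1, as for a non-probability sample

# Illustrative population benchmarks (proportions within each margin sum to 1).
bench_gender = np.array([0.49, 0.51])
bench_age = np.array([0.30, 0.40, 0.30])

# Raking (iterative proportional fitting): rescale the weights to match
# each margin in turn, repeating until both margins are satisfied.
for _ in range(100):
    for cats, bench in ((gender, bench_gender), (agegrp, bench_age)):
        totals = np.array([w[cats == c].sum() for c in range(len(bench))])
        w = w * (bench * w.sum() / totals)[cats]

# Weighted margins now reproduce the benchmarks.
wg = np.array([w[gender == c].sum() for c in range(2)]) / w.sum()
wa = np.array([w[agegrp == c].sum() for c in range(3)]) / w.sum()
```

Note the connection to the quota point above: if the sample was already selected to match these margins, the raking factors are all close to 1 and the adjustment does little.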
Selecting benchmarks
Should be a known source of non-response bias and likely to affect the survey estimates.
Must have the information both for each sample member individually and for the population as a whole
➢ Identical or as close as possible (e.g. the same question wording in the survey and the census)
Stratified sample:
➢ Unequal probability of selection within population subgroups (markets / strata)
➢ Population totals within each subgroup/stratum must be incorporated in the weights
Common variables in general population surveys:
➢ Telephone status (for dual-frame surveys)
➢ Gender
➢ Age by education
➢ Country of birth (Australia vs other English-speaking countries vs non-English-speaking countries)
➢ Geography (state; metro vs regional)
78
Is your adjustment working?
Key considerations:
➢ Variance (probability samples) and bias
Variance:
➢ Yours is one of many possible samples: how different would your result be if you happened to select a different random sample?
➢ If you use too many benchmarks (or the data are severely biased), the weights become extreme, increasing the variance of your estimate and decreasing your confidence in the accuracy of your result
Bias:
➢ The difference between your survey result and the true value
➢ If you ignore a known source of non-response bias, the survey results will be further from the truth
➢ You need to know the truth in order to measure bias
➢ The bias-measurement dilemma: if you already know the truth, why bother with the survey?
What about weighting efficiency?
➢ A measure of how much work a weight is doing
➢ What is an acceptable minimum? There is no standard
➢ Can be misleading, especially in non-probability or highly biased samples (e.g. weight = 1)
79
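The slide does not name a formula, but one common operationalisation of weighting efficiency (treat this as one choice among several) is Kish's ratio of effective to actual sample size, with n_eff = (Σw)² / Σw²:

```python
import numpy as np

def weighting_efficiency(weights):
    """Kish weighting efficiency: effective sample size / actual sample size."""
    w = np.asarray(weights, dtype=float)
    n_eff = w.sum() ** 2 / (w ** 2).sum()   # Kish effective sample size
    return n_eff / w.size

# Equal weights do no work: efficiency is 100%.
print(weighting_efficiency([1, 1, 1, 1]))    # 1.0
# One extreme weight sharply shrinks the effective sample size.
print(weighting_efficiency([1, 1, 1, 10]))   # ≈ 0.41
```

This illustrates the "can be misleading" bullet: a constant weight of 1 scores a perfect 100% efficiency even for a severely biased sample, because the measure only reflects weight variability, not bias.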
Case study 1: The Online Panels Benchmarking Study
➢ Three surveys based on probability samples of the Australian population aged 18 years and over
➢ Five surveys of persons aged 18 years and over administered to members of non-probability online panels
➢ The questionnaire included a range of demographic questions and questions about health, wellbeing and use of technology
➢ The same questions were used across the eight surveys (a unified approach to questionnaire design, to try to minimise mode effects)
➢ 9 minutes average interview length for online and telephone (12-page booklet)
➢ Fieldwork: Oct – Dec 2015
➢ Data and documentation available from the Australian Data Archive
https://www.ada.edu.au/ada/01329
80
Results – substantive health characteristics (modal response)
81

Percentage point error relative to the benchmark. ABS, ANU Poll and DF RDD are probability samples; P1–P5 are non-probability panels.

Substantive variables                        Benchmark    ABS    ANU    DF      P1     P2     P3     P4     P5
                                               value            Poll   RDD
Life satisfaction (8 out of 10)                 32.6      -2.0   -2.0    1.9  -11.9  -11.6   -4.5   -9.2   -7.9
Psychological distress – Kessler 6 (Low)        82.2     -10.6  -11.6   -8.1  -25.9  -23.5  -22.2  -25.0  -23.2
General Health Status (SF1) (Very good)         36.2       0.4   -2.0   -2.6   -4.1   -5.8   -5.3   -5.0    1.5
Private Health Insurance                        57.1       3.4    1.9    3.3   -8.9  -12.5   -3.7   -0.6   -2.6
Daily smoker                                    13.5      -4.1    3.5    1.6    9.8    6.7    3.9    2.7    4.3
Consumed alcohol in the last 12 months          81.9       3.6   -2.8   -4.0    2.4    5.3    3.9    4.2    1.5
Impact on weighting
Impact of weighting on average absolute error: the average difference (percentage points) across all benchmarks between the official statistics and the survey estimates.
82

                            ABS    ANU Poll   RDD     P1      P2      P3     P4     P5
Unweighted (avge. error)    4.28   3.68       4.63    9.34    10.35   7.28   7.20   6.41
Weighted (avge. error)      4.02   3.98       3.58    10.50   10.90   7.24   7.78   6.83
Impact                      ✓      ✗          ✓       ✗       ✗       –      ✗      ✗
Model-based design weights
Design weight: Non-probability sample
➢ Current approaches:
o Best-case scenario:
• Model-based selection to address known biases (e.g. quotas)
o Worst-case scenario:
• "Participation is a combination of idiosyncratic contacts, low response rates, and non-coverage. This is not the kind of design one would choose if there were an affordable alternative."
• (Rivers, D. (2013). Comment [on the AAPOR Task Force report on non-probability sampling]. Journal of Survey Statistics and Methodology, 1(2), 111–117. doi:10.1093/jssam/smt009)
➢ Possible alternative:
o Adapt the response-propensity model to model the likelihood of being part of the non-probability sample
o Requires a probability reference sample
84
Probability reference sample
➢ The gold standard that we would like the non-probability sample to resemble as closely as possible
o Known to produce accurate estimates for key outcomes
o Ideally includes data that can be compared to independent benchmarks
➢ Needs
o Data items in common with the non-probability sample
• As a minimum: demographics, attitudes, and the outcomes of interest
o Comparable data
• Mode (depending on the questionnaire)
• Question wording
• Reference timeframe
85
Case study 2
LinA as a reference sample for the OPBS
➢ Life in Australia
o http://www.srcentre.com.au/our-research#life-in-aus
o Australia's first and only probability-based online panel
o Launched in December 2016
o 3,300 adults from across Australia, aged 18 years and over, randomly recruited via a dedicated dual-frame telephone survey; covers both the online and offline population
o Replicated the online panel benchmarking study in February 2016
➢ OPBS non-probability panels reweighted
o Design weight = 1
o Post-stratification weight matched to the LinA weighting
o Enrolment to vote added to the substantive characteristics
86
LinA as a reference sample
87

Percentage point difference from the benchmark:

Substantive variables                        Benchmark    LinA         Non-probability panels
                                             value (%)            1      2      3      4      5
Life satisfaction (8 out of 10)                32.6       -1.4  -11.1  -13.6   -5.9  -10.3   -8.0
Psychological distress – Kessler 6 (Low)       82.2      -21.3  -26.5  -25.5  -22.7  -25.1  -23.4
General Health Status – SF1 (Very good)        36.2       -3.8   -4.2   -4.2   -4.0   -5.3    0.1
Private Health Insurance                       57.1        2.6   -8.0   -9.8   -3.9    0.1   -2.0
Daily smoker                                   13.5       -1.0    9.0    6.0    3.5    2.2    3.5
Consumed alcohol in the last 12 months         81.9        2.7   -1.1   -6.4   -3.6   -4.9   -1.7
Enrolled to vote                               78.5        9.0    8.4    7.6   10.7    8.8   13.1
Average absolute difference                               5.23   8.54   9.14   6.79   7.09   6.48
Back to model-based design weights
➢ Use the reference sample to calculate
o Propensity scores – the conditional probability that a respondent is in the non-probability sample rather than in the probability sample, given the observed characteristics of the respondent
• Logistic model (R PracTools package, Valliant et al. 2015)
• Non-probability cases = 1, reference cases = 0
• Design weight for the non-probability sample = inverse of the estimated probability of inclusion in the non-probability sample, given the weighted reference sample
• Beware: extreme weights, and the size and quality of the reference sample
➢ Result:
o Probability-based design weights for the non-probability sample
o Now ready to look at calibration/weighting adjustment methods
88
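The steps above can be sketched as follows. This is a toy illustration with simulated covariates rather than the real OPBS/LinA data, and the gradient-ascent fit stands in for the logistic model in the R PracTools workflow:

```python
import numpy as np

# Hypothetical stacked file: reference (probability) cases coded 0,
# non-probability cases coded 1, with a covariate observed in both samples.
rng = np.random.default_rng(2)
n_ref, n_np = 600, 400
x = np.concatenate([rng.normal(0.0, 1.0, n_ref), rng.normal(0.5, 1.0, n_np)])
in_np = np.concatenate([np.zeros(n_ref), np.ones(n_np)])

# Logistic model for P(case is non-probability | x), fitted by gradient ascent.
X = np.column_stack([np.ones(x.size), x])
beta = np.zeros(2)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (in_np - p) / x.size

p_np = 1 / (1 + np.exp(-X @ beta))

# Pseudo design weight for the non-probability cases = inverse of the
# estimated propensity, trimmed to guard against extreme weights.
w_np = 1 / p_np[in_np == 1]
w_np = np.minimum(w_np, np.quantile(w_np, 0.99))
```

The trimming step is one simple response to the "beware extreme weights" caveat; in practice the reference sample's own weights would also enter the fit.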
Calibration
No agreed approach – work in progress
➢ Starting point – known biases in the non-probability sample
➢ For example, for online panels
o Heavier internet users
o Heavier media users
o More interested in technology
o Early-adopter skew
o More health care card holders
➢ Common theme
o Standard demographics are not enough
o High quality “official” benchmarks not available
➢ Probability reference sample to the rescue!
o As long as items that are known/suspected biases in the non-probability sample are
included
90
Case study 2
➢ Significant differentiators
o Early adopter variables
o Internet usage variables
o Income
o Employment
o Remoteness
o Home ownership
o (Media consumption variables were not included in the survey)
➢ Incorporate one additional variable at a time and compare with benchmarks to evaluate the impact on bias
91
Options for EA inclusion in calibration
➢ Each EA variable is included as a 0/1 variable, where
o 1 means agreed or strongly agreed with that statement, and
o 0 means everything else (disagreed or strongly disagreed, or did not respond)
➢ Scale derived using Rasch model
o Bond, T. G. and C. M. Fox (2007). Applying the Rasch model: Fundamental measurement in
the human sciences. (2nd ed.) Mahwah, N.J.: Erlbaum.
➢ Categorical scale
o None – did not agree or strongly agree to any of 5 statements
o Some – agreed or strongly agreed with at least one and no more than 2 out of 5 statements
o High – agreed or strongly agreed with 3 or more out of 5 statements
➢ Agreement total calculated as the number of statements (min 0, max 5) that the
respondent agreed or strongly agreed with
92
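The dummy, agreement-total and categorical codings above can be sketched as follows (toy data; the response codes 4 = agree and 5 = strongly agree are an assumption about the scale, and the Rasch scaling is omitted because it needs a dedicated fitting routine):

```python
import numpy as np

# Hypothetical answers to the 5 EA statements on a 1-5 agree/disagree
# scale, with 0 marking non-response; 4 = agree, 5 = strongly agree.
rng = np.random.default_rng(3)
responses = rng.integers(0, 6, size=(8, 5))

# 0/1 coding: 1 = agreed or strongly agreed with the statement,
# 0 = everything else (disagreed, neutral, or did not respond).
ea = (responses >= 4).astype(int)

# Agreement total: number of statements agreed with (0-5).
total = ea.sum(axis=1)

# Categorical scale: none (0), some (1-2), high (3+).
category = np.select([total == 0, total <= 2], ["none", "some"], default="high")
```
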
Options for EA inclusion in calibration
Composite is better than individual variables
93

Impact of EA variables on bias (% change):

Scale option                   Ave abs error     Ave abs error       Ave RMSE
                               vs unweighted     vs std adjustment   vs std adjustment
EA1-EA4                            6.2               2.0                 7.8
EA1-EA5 Rasch Scale                2.5              -1.5                -0.3
EA1-EA5 Categorical Scale         -0.4              -4.4                -1.5
EA1-EA5 Agreement total           -1.6              -5.5                -2.4
Internet usage measures
Not useful when calibrating online to online
94

Impact of Internet Usage measures on bias (% change):

Calibration variables                       Ave abs error     Ave abs error       Ave RMSE
                                            vs unweighted     vs std adjustment   vs std adjustment
Look for information                             2.6              -1.5                 0.1
Access at home, Look for information             5.0               0.9                 2.0
Look for information, Post to blogs etc,
  Financial transactions, Social Media           4.5               0.3                 6.4
Access at home, Frequency of use                 5.5               1.3                 1.9
Socio-economic variables in calibration
Income – the single most influential variable for reducing bias and RMSE
Benefit from including both income and employment
95

Impact of Income, Employment and Home Ownership variables on bias (% change):

Calibration variables                              Ave abs error     Ave abs error       Ave RMSE
                                                   vs unweighted     vs std adjustment   vs std adjustment
Income                                                 -7.3              -11.0               -8.5
Income, Employment                                    -10.3              -13.9              -10.6
Income, Employment, Home Ownership                     -7.3              -10.9               -5.8
Income, Employment, Home Ownership, Remoteness         -5.9               -9.6               -4.6
Problem solved
Impact of calibration (LinA: Income, Employment, EA total agreement score)
97

               Average Absolute Error                               % Change in bias
               Unweighted   Std           Design weights and       From         From std
                            adjustments   key differentiators      unweighted   adjustment
Panel 1           9.2          9.8              7.8                  -15.1        -19.8
Panel 2          10.0         10.4              9.3                   -7.2        -10.9
Panel 3           7.7          7.7              5.3                  -31.0        -31.5
Panel 4           7.4          8.1              7.1                   -3.3        -11.9
Panel 5           7.4          7.4              7.8                   +5.4         +5.7
All Panels        8.3          8.7              7.5                  -10.4        -13.9

The benefit of adjustment is sample dependent!
Standard adjustments do not help with bias reduction!!
Comparison with independent benchmarks vs reference sample benchmarks
98

Mostly the change in bias is consistent, but there can be large differences for some samples

% change in bias from unweighted:
            Independent benchmarks   Reference sample
Panel 1           -15.1                   -26.9
Panel 2            -7.2                     5.7
Panel 3           -31.0                   -31.7
Panel 4            -3.3                    28.7
Panel 5             5.4                     1.0

% change in bias from std adjustment:
            Independent benchmarks   Reference sample
Panel 1           -19.8                   -30.6
Panel 2           -10.9                     0.6
Panel 3           -31.5                   -31.5
Panel 4           -11.9                   -12.4
Panel 5             5.7                    -8.3
What about blending?
➢ Adding a probability sample to a non-probability sample will further decrease the bias
➢ Confirmed that:
o Combining probability and non-probability data greatly reduces variability across panels and has a much larger impact than calibration alone
o Use the best available probability sample to combine with the non-probability sample, regardless of mode and response rate
➢ But…
➢ Costs and feasibility of running a parallel probability sample:
o Push-to-web – costs (incentives, follow-up), timeliness
o Listed vs RDD mobile – an option for blending
99
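The slides do not prescribe a blending rule, so the sketch below is an assumption/illustration: one simple option is to mix the two weighted estimates in proportion to each sample's Kish effective sample size.

```python
import numpy as np

def kish_n_eff(w):
    """Kish effective sample size of a weight vector: (sum w)^2 / sum w^2."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

def blend_proportion(y_prob, w_prob, y_np, w_np):
    """Blend probability and non-probability estimates of a proportion,
    mixing by each sample's effective sample size (one choice among many)."""
    a = kish_n_eff(w_prob) / (kish_n_eff(w_prob) + kish_n_eff(w_np))
    return (a * np.average(y_prob, weights=w_prob)
            + (1 - a) * np.average(y_np, weights=w_np))

# Equal-sized, equally weighted samples contribute 50/50.
est = blend_proportion([1, 0, 1, 1], [1] * 4, [0, 0, 1, 0], [1] * 4)
```

Under this rule a large but heavily weighted (low-efficiency) non-probability sample contributes less than its raw size suggests, which is consistent with the finding above that blending in even a modest probability sample stabilises the estimates.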
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 

Improving inferences from poor quality samples

  • 1. www.srcentre.com.au Improving inferences from poor quality samples Social Research Centre / Centre for Social Research and Methods Workshop 16 August, 2017
  • 2. www.srcentre.com.au Presenters Social Research Centre Pty Ltd Darren W. Pennay Dina Neiger, PhD Paul J. Lavrakas, PhD Darren Pennay is an Adjunct Senior Research Fellow with the ANU Centre for Social Research and Methods and an Adjunct Professor with the Institute for Social Science Research at the University of Queensland Dina Neiger is a Centre Visitor with the ANU Centre for Social Research and Methods Paul Lavrakas is a Senior Fellow, NORC at the University of Chicago; Senior Research Fellow, Office of Survey Research, Michigan State University; and Senior Methodological Advisor for the ANU’s Social Research Centre 2
  • 3. www.srcentre.com.au The Social Research Centre ➢ Based in Melbourne, Australia o owned by the Australian National University in Canberra o operates as a for-profit research services company. o profits returned to ANU o Co-founders of the ANU Centre for Social Research and Methods ➢ Conduct social and public policy research for government, not-for-profit and academic organisations ➢ Primary data collection involving survey research, qualitative research, statistical methods, data processing and analytics, survey methodology ➢ 65 staff, plus 120-seat CATI call centre 3
  • 4. www.srcentre.com.au What is Total Survey Quality? ➢ Many national statistical offices, including those of Australia, Canada, New Zealand and the USA, operate within a Total Survey Quality (TSQ) framework to determine whether or not a survey is “fit-for-purpose”. ➢ Accuracy is not enough. Dimensions of the TSQ framework:
    Accuracy – Total survey error is minimised
    Credibility – Data are considered trustworthy by the survey user communities
    Comparability – Consistent with past studies in terms of demographic, spatial, and temporal comparisons
    Usability / Interpretability – Documentation is clear and metadata are well organised
    Relevance – Data satisfy user needs
    Accessibility – Access to the data is user friendly
    Timeliness / Punctuality – Data deliverables adhere to schedules
    Completeness – Data are rich enough to satisfy the analysis objectives without undue burden on respondents
    Coherence – Estimates from different sources can be reliably combined
    Source: Biemer, P. (2010) Public Opinion Quarterly 74 (5): p. 819.
  • 5. www.srcentre.com.au Survey Cycle from a Design Perspective [Diagram: the survey lifecycle shown as two parallel paths – Representation (Target Population → Sampling Frame → Designated Sample → Final Sample → Final Dataset) and Measurement (Specification → Measurement → Response) – converging on the Final Results & Conclusions]
  • 6. www.srcentre.com.au Total Survey Error Framework [Diagram: the same survey lifecycle annotated with error sources – Errors of Representation (Coverage Error, Sampling Error, Nonresponse Error, Adjustment Error) along the representation path, Errors of Measurement (Specification Error, Measurement Error, Processing Error) along the measurement path, and Inferential Error at the Final Results & Conclusions]
  • 7. www.srcentre.com.au What is a probability sample? Textbook Definition: A probability sample is one in which every unit in the population of interest has a known, non-zero probability of being selected for the sample. Key elements which will differentiate the process from a non-probability sampling process: o Selection into the sample is via a random process. o Every unit in the population of interest has a chance of being sampled for the research. o The probability of selection is known. These design features enable us to: o Calculate standard errors o Calculate confidence intervals o Generalise to the target population of interest 7
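The standard errors and confidence intervals that probability sampling enables can be illustrated with a minimal sketch. This is an illustrative calculation for a proportion under simple random sampling, not something taken from the slides:

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Standard error and approximate 95% confidence interval for a
    proportion estimated from a simple random sample of size n."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return se, (p_hat - z * se, p_hat + z * se)

# Illustrative: 40% of n = 1,000 respondents report some attribute
se, (lo, hi) = proportion_ci(0.40, 1000)
```

With non-probability selection the inclusion probabilities are unknown, so this calculation has no formal justification, which is the point the slide is making.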
  • 10. The general decline in response rates is evident across nearly all types of surveys, in the United States and abroad. At the same time, greater effort and expense are required to achieve even the diminished response rates of today. These challenges have led many to question whether surveys are still providing accurate and unbiased information [and whether or not probabilistic surveys are better than the alternatives?] Pew Research Center, 2012 10
  • 11. www.srcentre.com.au Trends in response rates (U.S. data) There is no sign of an increase in nonresponse bias since 2012. On 13 demographic, lifestyle, and health questions that were compared with benchmarks from high response rate federal surveys, estimates from phone polls are just as accurate, on average, in 2016 as they were in 2012. The average (absolute) difference between the Center telephone estimates and the benchmark survey estimates was 2.7 percentage points in 2016, compared with 2.8 points in 2012. 11
  • 13. www.srcentre.com.au Impact of declining response rates on telephone survey estimates in the U.S. 13
  • 14. www.srcentre.com.au Impact of low response rates on survey estimates Pew conclusions 2012 ➢ Despite the growing difficulties in obtaining a high level of participation in most surveys, well-designed telephone polls that include landlines and cell phones reach a cross-section of adults that mirrors the American public, both demographically and in many social behaviours. Pew conclusions 2016 ➢ Telephone polls still provide accurate data on a wide range of social, demographic and political variables, but some weaknesses persist.
  • 15. www.srcentre.com.au What is the situation in Australia? A literature review and industry consultation commissioned by RICA and undertaken by Bednall et al. (2013) concluded the following: ➢ Telephone response rates: o As far as the telephone is concerned, response rates have been in gradual decline over the last decade. o Among cold-calling general community surveys, telephone response rates are typically below 10%. o Co-operation rates (the ratio of obtained interviews to refusals) are typically below 0.2 (that is, fewer than one interview per five refusals). o Telephone interviews from client lists have a higher response rate – typically above 20%, with co-operation rates above 1.0. o It would appear that some topics, such as financial services, may induce a lower level of co-operation. o Government sponsored surveys have higher response rates, at times over 50%, but even here a sharp decline in response rates over time for one long running monitor was observed. Co-operation rates were also higher in government sponsored surveys.
  • 16. www.srcentre.com.au What is the situation in Australia – SRC Landline surveys [Chart: non-contacts as a % of usable sample by survey year, 2011–2015, for five landline samples (three national, two Victorian); y-axis 0–60%]
  • 17. www.srcentre.com.au What is the situation in Australia – SRC Landline surveys [Chart: refusals as a % of contacts by survey year, 2011–2015, for five landline samples (three national, two Victorian); y-axis 0–60%]
  • 18. www.srcentre.com.au What is the situation in Australia – SRC Landline surveys [Chart: interviews as a % of interviews plus refusals by survey year, 2011–2015, for five landline samples (three national, two Victorian); y-axis 0–80%]
  • 19. www.srcentre.com.au What is the situation in Australia (cont.)? ! Not saying that low response rates are inconsequential! ! Poorly designed and poorly executed surveys, especially those with very limited call routines (e.g. polls/trackers enumerated over one or two evenings only), those with non-coverage errors (e.g. landline telephone surveys), and those that use non-probability sampling methods (e.g. online panels) are much more likely to produce biased results. ! HOWEVER, well designed and well executed probability-based surveys still produce estimates that can be relied on in many situations (and with known standard errors and confidence intervals!) 22
  • 20. www.srcentre.com.au This appears to be the main situation regarding non-response 23
  • 22. www.srcentre.com.au A Word of Warning on Probability Sampling Rivers again …. [Probability surveys work] if the nonresponse rate is small. Cochran (1977, p. 363), in his classic text, concluded that the upper limit is approximately 10 percent nonresponse, which is difficult to achieve today in even the best funded surveys. Lohr (2010, p. 355) warns that “many examples exist of surveys with a 70% response rate whose results are flawed.” (Rivers, D. (2013). Comment on the AAPOR Task Force Report on Non-probability Sampling. Journal of Survey Statistics and Methodology, 1(2), 111–117. doi:10.1093/jssam/smt009)
  • 23. www.srcentre.com.au The rise of non-probability online panels
  • 24. www.srcentre.com.au The take-up of non-probability online panels The situation in Australia ➢ In 2014-15, 86% of Australian households had access to the internet at home (ABS). ➢ Online research is continuing to grow in popularity domestically and internationally. [Chart: online vs CATI research as a percent of research industry turnover, 2009–2015; online research globally – 28% in 2014. Source: ESOMAR / Research Industry Council of Australia] ➢ 50+ commercial ‘online research panels’. ➢ First of these established in the late 1990s. ➢ All ‘until now’ use non-probability sampling methods.
  • 25. www.srcentre.com.au The rise of non-probability panels – globally [Chart: online research revenues, 1999–2016E, in $US millions, for the US, Europe and the Rest of World; y-axis $0–$8,000M]
  • 26. www.srcentre.com.au Pros and cons of non-probability panels Pros ➢ Reduced cost ➢ Improved timeliness ➢ Respondent convenience ➢ Reduced social desirability bias ➢ Can target ‘hard to reach’ populations ➢ Multimedia functionality ➢ Computerised questionnaire scripts Cons ➢ Non-coverage ➢ Self-selection ➢ Reliance on computer-literate respondents ➢ Non-probability sampling, which precludes: o Calculating standard errors o Calculating confidence intervals o Generalising to the target population of interest
  • 27. www.srcentre.com.au How do they compare – the SRC Online Panels Benchmarking Survey ➢ Three surveys based on probability samples of the Australian population aged 18 years and over and five surveys of persons aged 18 years and over administered to members of non-probability online panels. ➢ Survey questionnaire included a range of demographic questions and questions about health, wellbeing and use of technology for which high-quality population benchmarks were available. ➢ Same questions were used across the eight surveys. ➢ 9 minutes average interview length for online and telephone. o 12 page booklet for the hard copy version. ➢ Fieldwork Oct – Dec 2015. ➢ Data and documentation available from the Australian Data Archive. https://www.ada.edu.au/ada/01329 31
  • 28. www.srcentre.com.au Results – substantive health characteristics
    Percentage point error (distance from benchmark) for three probability surveys (ABS, ANU Poll, RDD) and five non-probability panels (P1–P5):
    Substantive variable (benchmark value)              ABS     ANU Poll   RDD    P1      P2      P3      P4      P5
    Life satisfaction, 8 out of 10 (32.6)               -2.0    -2.0       1.9    -11.9   -11.6   -4.5    -9.2    -7.9
    Psychological distress, Kessler 6 low (82.2)        -10.6   -11.6      -8.1   -25.9   -23.5   -22.2   -25.0   -23.2
    General health status, SF1 very good (36.2)         0.4     -2.0       -2.6   -4.1    -5.8    -5.3    -5.0    1.5
    Private health insurance (57.1)                     3.4     1.9        3.3    -8.9    -12.5   -3.7    -0.6    -2.6
    Daily smoker (13.5)                                 -4.1    3.5        1.6    9.8     6.7     3.9     2.7     4.3
    Consumed alcohol in the last 12 months (81.9)       3.6     -2.8       -4.0   2.4     5.3     3.9     4.2     1.5
  • 29. www.srcentre.com.au Results – substantive health characteristics
    Average error across the six substantive measures:
                                                        ABS     ANU Poll   RDD    P1      P2      P3      P4      P5
    Average error                                       4.02    3.98       3.58   10.5    10.9    7.24    7.78    6.83
    Largest absolute error                              10.59   11.57      8.08   25.86   23.52   22.20   24.96   23.20
    Significant differences from benchmarks (of 6)      2       3          2      4       6       3       3       3
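The summary statistics on this slide can be reproduced from the per-measure errors on the previous slide. A minimal sketch using the RDD column; note that because the tabled errors are rounded to one decimal place, the largest absolute error comes out as 8.1 rather than the published 8.08 (which uses unrounded data):

```python
def summarize_errors(errors):
    """Average and largest absolute percentage-point error across measures."""
    abs_errors = [abs(e) for e in errors]
    return sum(abs_errors) / len(abs_errors), max(abs_errors)

# Percentage-point errors for the RDD survey across the six measures
rdd_errors = [1.9, -8.1, -2.6, 3.3, 1.6, -4.0]
avg_err, max_err = summarize_errors(rdd_errors)  # ≈ 3.58 and 8.1
```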
  • 30. www.srcentre.com.au Two questions … 1. How do we reduce non-response bias in low response rate probability surveys? 2. How do we reduce bias in survey estimates based on non-probability online panels?
  • 31. www.srcentre.com.au Instructors Dina Neiger, PhD ➢ Dina is a professional statistician with over 20 years’ experience and a track record of achievement in leadership and technical roles. ➢ Social Research Centre, Monash University, Australian Bureau of Statistics, Peter MacCallum Cancer Centre. ➢ 1st Class Honours degree in Statistics and PhD in Business Systems from Monash University, with an emphasis in applied Operations Research and Process Engineering. ➢ Accredited Statistician (AStat) and member of the Statistical Society of Australia (SSA). ➢ Full member of the Australian Society for Operations Research (ASOR). ➢ Work includes calibration and blending methods to improve the accuracy of non-probability samples, establishment and maintenance of the first Australian online probability panel, and complex business survey design and weighting.
  • 32. www.srcentre.com.au Instructors Paul J. Lavrakas, PhD ➢ Research psychologist, research methodologist, and prolific author. ➢ Since 2007, Independent Consultant with clients in the USA, Australia, Belgium, Japan and Canada; and Senior Fellow at NORC (U-Chicago) and Office of Survey Research (Michigan State U.). ➢ From 2000-2007, Chief Research Methodologist for Nielsen Media Research. ➢ 1978-2000 – Professor at Northwestern U. and Ohio State U. and founding faculty director of a survey research center at each university. ➢ Australian roles – Senior Methodological Adviser, the Social Research Centre; member of the Scientific Advisory Board for the Centre for Social Research and Methods, ANU. ➢ President of the American Association for Public Opinion Research (2011-2014); continues to serve on a volunteer basis on many AAPOR task forces and committees.
  • 33. www.srcentre.com.au Investigating Nonresponse Bias by Level of Effort Paul J. Lavrakas
  • 34. www.srcentre.com.au Investigating Nonresponse Bias ➢ Methods have evolved in the past decade regarding how NR bias can be investigated (Groves and Brick, 2007; Olson and Montaquila, 2012) ➢ One approach is to conduct analyses comparing early and middle responders vs. late responders on key measures in a survey, the hypothesis/premise being that late responders are more like nonrespondents than are early and middle responders ➢ However, if no differences are found between late responders vs. early and middle responders, that is NOT taken as evidence that there are no differences between respondents and nonrespondents.
  • 35. www.srcentre.com.au Case Study 1: University Faculty Health Benefits Survey Study Background ➢ The purpose of this survey was to provide reliable and valid information about the opinions of faculty towards the prospect of the university creating a new health care service for faculty working on the main campus. ➢ Approximately 800 of 5,000 members of the faculty were randomly sampled, and a telephone survey was conducted by the university’s survey research center, which yielded a 47% completion rate. ➢ Nonresponse bias was investigated using multiple methods, one being a Level of Effort analysis.
  • 36. www.srcentre.com.au Level of Effort Analyses ➢ Level of Effort was operationalized as the number of calls made to a given sampled respondent who completed the interview ➢ Analyses were conducted to investigate whether the effort it took to complete an interview was related to any of the substantive questions that were asked in the questionnaire. ➢ From these analyses it was learned that none of the key questionnaire items had a statistically significant (p < .05) association with this effort measure ➢ For one of the questionnaire items – asking whether the respondent had heard about the possible new health service prior to being interviewed – there was a marginally significant correlation (p < .07) with effort, but the size of this correlation was extremely small (r = 0.09). ➢ Thus, it was judged that there was no reliable evidence that any nonignorable nonresponse bias was present in the findings reported in the main body of the report.
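A level-of-effort screen of this kind amounts to correlating the number of call attempts with each substantive item. A minimal sketch with hypothetical data (the study's own data are not reproduced here):

```python
def pearson_r(x, y):
    """Pearson correlation between level of effort (call attempts to
    completion) and responses to a survey item."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical: call attempts per completed case, and a yes/no item coded 1/0
calls = [1, 1, 2, 3, 4, 5, 6, 8]
item = [0, 1, 0, 1, 0, 1, 1, 1]
r = pearson_r(calls, item)
```

In practice the correlation would be tested for significance; a small, non-significant r, as in this case study, is read as no reliable evidence of nonignorable nonresponse bias.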
  • 37. www.srcentre.com.au Case Study 2: Voter Identification Survey Study Background ➢ In 2012, there were several states in the USA whose state governments were controlled by conservative party members who passed legislation changing the requirements for the identification that a voter must show to prove that he or she was eligible to vote on the day of the Presidential Election. ➢ The purpose of the survey, which was funded by large labor unions that were known to support liberal candidates/policies, was to identify how many and which types of residents in the state were most likely to be disenfranchised by the new Voter ID legislation.
  • 38. www.srcentre.com.au Level of Effort Analysis 1 ➢ One approach was to compare key data provided by those who initially refused, but later agreed to complete the questionnaire after being recontacted with a “refusal conversion” protocol, with data provided by the cohort that never refused. ➢ In this analysis it was found that for neither of the two key statistics measured by the survey – awareness of the new Voter ID legislation; whether someone had a valid photo ID to vote at their local polling place on November 6, 2012 – were there statistically significant differences (p < .05) between those who never refused and those who initially refused but later were converted. ➢ Thus, there was no evidence found to suggest that refusing nonrespondents who were not converted would have given materially different answers to these key questions than did the respondents.
  • 39. www.srcentre.com.au Level of Effort Analysis 2 ➢ A second approach was to take into account the number of call attempts it took to complete an interview, and investigate if answers to the same two key variables correlated with the level of effort expended. ➢ In this analysis it was found that as more effort was expended to reach a respondent, the likelihood the respondent had a valid photo ID for voting purposes increased at a statistically significant level (p < .007). o Thus, the findings suggested that the unweighted survey data were somewhat biased in the direction of underestimating the proportion of people with a valid photo ID 43
  • 40. www.srcentre.com.au Level of Effort Analysis 2 (cont.) ➢ However, since the level of effort to achieve a completion correlated with various demographic characteristics of respondents (e.g., it took less effort to reach elderly respondents and more effort to reach young adults), and given that key demographic characteristics were taken into account in the weighting process, it was judged to be unlikely that the findings reported about the proportion of registered voters without a valid photo ID were subject to nonignorable nonresponse bias. ➢ Concerning awareness of the legislation, there was no statistically significant difference associated with the level of effort expended to gain a completion. o Thus, there was no evidence found to suggest that uncontacted nonrespondents would have given materially different answers to this key question than did the respondents.
  • 42. www.srcentre.com.au Case Study: IMLS Nonresponse Follow-Up (NRFU) Study ➢ Another preferred method to investigate Nonresponse Bias is to conduct a follow-up survey of a sample of the original survey’s nonresponders. ➢ Little has been reported regarding how to use the data from such follow-up studies ➢ Please note, that my slides today are extracted from a paper presentation that I made in 2015 at the 70th annual conference of the American Association for Public Opinion Research (AAPOR), “Studying Nonresponse Bias with a Follow-up Survey of Initial Nonresponders in a Dual Frame RDD Survey” 46
  • 43. www.srcentre.com.au 2013 Public Needs for Library and Museum Services (PNLMS) Survey ➢ Sponsored by the Institute of Museum and Library Service (IMLS), a federal agency in the USA ➢ Conducted by M. Davis and Company, Inc., Philadelphia PA ➢ National dual-frame RDD survey of the general population of the USA o Data gathered September-November 2013 o 3,537 interviews completed; 2,506 from LL frame; 1,031 from CP frame • Landline sample AAPOR RR3 was 25% • Cell Phone sample AAPOR RR3 was 10% 47
  • 44. www.srcentre.com.au 2014 IMLS Nonresponse Follow-up (NRFU) Survey ➢ Created shortened questionnaire with key variables ➢ Used a noncontingent and a contingent incentive protocol ➢ Used only best calibre interviewers ➢ 201 interviews were completed in January-February 2014 with a random sample of nonresponders to the main survey; 100 from LL frame, 101 from CP frame o Landline follow-up sample AAPOR RR3 was 32% o Cell Phone follow-up sample AAPOR RR3 was 16% ➢ The original sample and the follow-up samples had remarkably similar demographic characteristics, with the exception that the follow-up sample had proportionally more young adults aged 18-24 years and fewer older adults aged 65+ years (p < .03) than the original survey.
  • 45. www.srcentre.com.au Analytic Approach: Phase 1 ➢ Combining the two surveys via Weighting, in two Phases o Phase 1: Begin by assigning weights, separately, to each of four groups • Landline RDD sample (original study) • Landline NR follow-up sample • Cell phone RDD sample (original study) • Cell phone NR follow-up sample 49
  • 46. www.srcentre.com.au Analytic Approach: Phase 1 ➢ For each of the four groups, Phase 1 included the following seven steps, conducted for each US Census region within each sample type: 1. Probability of Selection and Design Weight 2. Nonresponse Follow-up Adjustment 3. Unknown Eligibility Adjustment 4. Removal of Known Ineligibles 5. Unit Nonresponse Adjustment 6. Adult Subsampling Adjustment 7. Multiple Phone Line Adjustment ➢ Steps 1 and 2 created the weights to start the weighting process ➢ Steps 3-7 were sequential adjustments to the starting weights
  • 47. www.srcentre.com.au 1. Probability of Selection and Design Weight 51 For each of the four samples: ➢ Using information from the sampling frame, the probability of selection was calculated as the number of released telephone numbers in each Census region divided by the total number of telephone numbers in the census region ➢ The design weight was the inverse of the probability of selection for the released telephone numbers and zero for the non-released telephone numbers
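Step 1 can be sketched directly from the slide's definitions. The counts below are illustrative, not the study's:

```python
def design_weight(released_in_region, total_in_region, released=True):
    """Design weight: inverse of the probability of selection for a
    released telephone number in a Census region; zero if not released."""
    if not released:
        return 0.0
    prob_selection = released_in_region / total_in_region
    return 1.0 / prob_selection

# Illustrative: 2,000 of 100,000 frame numbers released in a region
w_design = design_weight(2_000, 100_000)  # probability 0.02 → weight 50
```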
  • 48. www.srcentre.com.au 2. Nonresponse Follow-up Adjustment 52 For each of the two NRFU samples: ➢ For the telephone numbers that were eligible for the nonresponse follow-up study, a further adjustment to the design weight was required to account for the subsampling of telephone numbers that were called from among all the eligible nonresponse follow-up numbers ➢ This adjustment to the design weight was calculated as the number of follow-up eligible telephone numbers divided by the number of follow-up eligible called telephone numbers
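For the two NRFU samples, Step 2 multiplies the design weight by the inverse of the follow-up subsampling rate. A minimal sketch with illustrative numbers:

```python
def nrfu_adjustment(eligible_followup_numbers, called_followup_numbers):
    """Subsampling adjustment: eligible follow-up numbers divided by the
    eligible follow-up numbers that were actually called."""
    return eligible_followup_numbers / called_followup_numbers

w_design = 50.0  # design weight from Step 1 (illustrative)
w_adjusted = w_design * nrfu_adjustment(5_000, 1_000)  # factor 5 → 250
```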
  • 49. www.srcentre.com.au 3. Unknown Eligibility Adjustment As in any DFRDD survey, a considerable portion of the sampled numbers ended the original field period with their eligibility unresolved For each of the four samples: ➢ Using standard AAPOR final disposition codes, two groups of telephone numbers were created in each Census region: (1) unknown eligibility telephone numbers and (2) known eligibility telephone numbers ➢ Unknown eligibility adjustment factor was the sum of the follow-up adjusted design weights for all telephone numbers divided by the sum of the follow-up adjusted design weights for the known eligibility telephone numbers ➢ The unknown eligibility adjusted weight is the product of the unknown eligibility adjustment factor and the follow-up adjusted design weight for the known eligibility telephone numbers and zero (0.0) for the unknown eligibility telephone numbers 53
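Step 3 reallocates the weight of unknown-eligibility numbers onto the known-eligibility numbers, so the weighted total is preserved. A minimal sketch with hypothetical weights:

```python
def unknown_eligibility_adjust(weights, known_eligibility):
    """Shift weight from unknown-eligibility numbers onto the
    known-eligibility numbers within a Census region."""
    total = sum(weights)
    known_total = sum(w for w, k in zip(weights, known_eligibility) if k)
    factor = total / known_total
    return [w * factor if k else 0.0 for w, k in zip(weights, known_eligibility)]

# Four numbers with equal starting weight; the last has unknown eligibility
w_unknown_adj = unknown_eligibility_adjust(
    [10.0, 10.0, 10.0, 10.0], [True, True, True, False])
```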
  • 50. www.srcentre.com.au 4. Removal of Known Ineligibles As with all telephone surveys of the general public, many of the telephone numbers that were released were found to be ineligible for various reasons, including non-working numbers (e.g., disconnected or temporarily out of service, technical problems, etc.) and numbers that were not part of the target population (e.g., fax, business, or other nonresidential numbers). For each of the four samples: ➢ Using AAPOR final disposition codes, these known ineligible telephone numbers were identified and removed from the weighting process
  • 51. www.srcentre.com.au 5. Unit Nonresponse Adjustment 55 As with all telephone surveys of the general public, for many of the telephone numbers for which contact was made, it was determined that there was at least one eligible adult but no data were collected For each of the four samples: ➢ Using the final disposition codes for the eligible telephone numbers produced two groups of eligible telephone numbers in each Census region, (1) eligible responding telephone numbers and (2) eligible non-responding telephone numbers ➢ The unit nonresponse adjustment shifts the weights from the eligible nonrespondents to the eligible respondents ➢ The nonresponse adjustment factor was the sum of the unknown eligibility adjusted weights for all eligible telephone numbers divided by the sum of the unknown eligibility adjusted weights for the responding telephone numbers ➢ The nonresponse adjusted weight was the product of the nonresponse adjustment factor and the unknown eligibility adjusted weight for respondents, and zero (0.0) for eligible nonrespondents
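Step 5 shifts weight from eligible nonrespondents onto eligible respondents. A minimal sketch with hypothetical weights:

```python
def nonresponse_adjust(weights, responded):
    """Shift the weight of eligible nonrespondents onto eligible
    respondents; nonrespondents end with zero weight."""
    factor = sum(weights) / sum(w for w, r in zip(weights, responded) if r)
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]

# Five eligible numbers, two responding: factor = 100 / 40 = 2.5
w_nr_adj = nonresponse_adjust([20.0] * 5, [True, True, False, False, False])
```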
  • 52. www.srcentre.com.au 6. Landline Adult Subsampling Adjustment Within eligible landline households there may have been more than one eligible adult in residence; this was determined in the questionnaire For each of the two landline samples: ➢ When there were two or more eligible adults associated with a landline telephone number, the adult subsampling adjustment factor was capped at two (2.0) ➢ The adult subsampling adjusted weight was the product of the adult subsampling adjustment factor and the nonresponse adjusted weight.
  • 53. www.srcentre.com.au 7. Multiple Phone Line Adjustment Most adults can be sampled on more than one phone line. For each of the four samples: ➢ A multiplicity adjustment was implemented when there were two or more telephone numbers, either landline or cell phone, at which the responding adult could have been contacted; this multiplicity adjustment factor was capped at two (2.0), otherwise, the multiplicity adjustment factor was one (1.0) This completed Phase 1 of the separate weighting of each of the four groups 57
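Steps 6 and 7 are both capped-factor adjustments. The slides give the cap (2.0) but not the arithmetic direction; the conventional treatment, assumed in this sketch, multiplies the weight by the (capped) number of eligible adults and divides it by the (capped) number of phone lines:

```python
def capped_factor(count, cap=2.0):
    """Adjustment factor equal to the count, capped at 2.0."""
    return min(float(count), cap)

w = 50.0                  # weight after Step 5 (illustrative)
w *= capped_factor(3)     # 3 eligible adults in the household → factor 2.0
w /= capped_factor(2)     # respondent reachable on 2 phone lines → factor 2.0
```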
  • 54. www.srcentre.com.au Phase 2 Combining Samples: Composite Weights 58 For each of the four samples: ➢ The composite weighting adjusted the multiplicity adjusted weight from the landline respondents and cell phone respondents to account for overlap in the two samples ➢ If the person could have been reached by both frames the composite weight was 0.5; if the person could only be reached by one frame it was 1.0
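The composite factor can be sketched directly from the slide:

```python
def composite_factor(reachable_by_landline, reachable_by_cell):
    """Dual-frame overlap adjustment: 0.5 if the respondent could have
    been reached through both frames, 1.0 if through only one."""
    return 0.5 if (reachable_by_landline and reachable_by_cell) else 1.0

w_dual = 100.0 * composite_factor(True, True)        # halved to 50.0
w_cell_only = 100.0 * composite_factor(False, True)  # unchanged, 100.0
```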
  • 55. www.srcentre.com.au Phase 2 Combining Samples: Calibration For each of the four samples: ➢ The final adjustment forced the weight totals from the survey data using the composite weight to match external population control totals; the external control totals were based on the following characteristics: o An extrapolation of Blumberg and Luke’s 2013 NHIS findings for telephone service ownership so that 50% were both landline and cell phone, 41% cell phone only, and 9% landline only o Socio-demographic characteristics from the most recent American Community Survey • Gender (Male, Female) • Age group (18-44, 45-64, 64+) • Marital status (Never been Married, Married, Separated/Divorced/Widow) • Hispanicity/Race (Hispanic, Non-Hispanic African American, Non-Hispanic White, Non-Hispanic Other) • Education (Less High School/High School, Some College, Associate or Bachelor Degree, Advanced Degree) • Presence or absence of children (No, Yes) 59
  • 56. www.srcentre.com.au Phase 2 Combining Samples: Calibration (cont.) ➢ The calibration methodology that was used was Iterative Proportional Fitting, i.e., raking or sample balancing; ➢ This method forces all of the different characteristics to simultaneously match the control totals o Each of the socio-demographic characteristics was used as a separate dimension in the raking process 60
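The raking loop described on this slide can be sketched compactly: cycle through the dimensions, and on each pass scale the weights so the weighted category totals match that dimension's control totals. A toy illustration, not the production weighting code:

```python
import numpy as np

def rake(weights, dims, targets, n_iter=50, tol=1e-8):
    """Iterative proportional fitting (raking / sample balancing).

    dims:    list of 1-D integer arrays, one category code per respondent
    targets: list of 1-D arrays of control totals, one per dimension
    """
    w = weights.astype(float).copy()
    for _ in range(n_iter):
        max_change = 0.0
        for codes, totals in zip(dims, targets):
            # Current weighted total in each category of this dimension
            current = np.bincount(codes, weights=w, minlength=len(totals))
            safe = np.where(current > 0, current, 1.0)
            factors = np.where(current > 0, np.asarray(totals) / safe, 1.0)
            w_new = w * factors[codes]
            max_change = max(max_change, float(np.abs(w_new - w).max()))
            w = w_new
        if max_change < tol:   # stop once all dimensions match simultaneously
            break
    return w

# Toy example: rake 6 respondents to gender and age-group control totals
gender = np.array([0, 0, 0, 1, 1, 1])   # 0 = male, 1 = female
age    = np.array([0, 1, 1, 0, 0, 1])   # 0 = 18-44, 1 = 45+
w = rake(np.ones(6), [gender, age],
         [np.array([49.0, 51.0]), np.array([55.0, 45.0])])
```

After convergence the weighted totals hit both sets of controls at once, which is exactly the "forces all of the different characteristics to simultaneously match" behaviour the slide describes.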
  • 57. www.srcentre.com.au Using the Final Combined Data from the Four Samples ➢ After all this was done we had a weighted dataset containing all the interviews from the original study and from the follow-up survey of nonresponders (n= 3,738) ➢ This allowed us to generate two types of population estimates for key behavioral variables in the study o Estimates based on the original survey only o Estimates based on the combined surveys ➢ We then compared the differences between the two estimates to determine whether or not a statistic was materially different when the data from the nonresponse follow-up survey was taken into account 61
• 58. www.srcentre.com.au An Example of a Material Difference in a Key Statistic ➢ Percent of parents/guardians reporting having a child who visited a Zoo/Aquarium in the past 30 days 62 Original Survey: Yes, did visit 19.2%; Est. # in USA 14.45M
• 59. www.srcentre.com.au An Example of a Material Difference in a Key Statistic ➢ Percent of parents/guardians reporting having a child who visited a Zoo/Aquarium in the past 30 days ➢ 8.1 percentage point difference ➢ Estimated 6.1M difference 63 Yes, did visit: Original Survey 19.2%, Combined Surveys 27.3%; Est. # in USA: Original Survey 14.45M, Combined Surveys 20.55M
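The slide's population counts can be reproduced from the percentages. The two figures imply a base of roughly 75.3 million parents/guardians; that base is backed out from the slide's numbers, not a figure the slide states.

```python
# Back out the implied base population from the original-survey figures
base = 14.45e6 / 0.192                 # ~75.3 million parents/guardians

original = 0.192 * base                # 14.45M (original survey)
combined = 0.273 * base                # ~20.55M (combined surveys)
diff_pp = (0.273 - 0.192) * 100        # 8.1 percentage points
diff_n = combined - original           # ~6.1 million
```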
  • 60. www.srcentre.com.au The Extent of Differences Identified by the NR Follow-up Study ➢ For 10 of the 28 behavioral measures, the percentage found in the original survey differed by more than two percentage points (2 pp) from the percentage found in the combined survey dataset ➢ Of these, six of the behavioral measures differed by more than five percentage points (5 pp) ➢ All of these measures were for estimates associated with the reported behavior of children (where the adult respondent served as a proxy for her/his child) and none of them were for estimates of the reported behavior of the adult respondent Nonresponse Bias exists at the level of the individual measure, and here we found evidence of several measures that likely were highly biased 64
  • 61. www.srcentre.com.au Response Propensity Modelling – Non-response Adjustments Dina Neiger
  • 62. www.srcentre.com.au Acknowledgement Analysis by Andrew Ward, Principal Statistician, Social Research Centre Thanks to Dr Siu-Ming Tam and Mr Paul Schubert, ABS, for their collaboration on this work and their kind permission to use the survey data for this presentation. Slides based on presentation by Andrew Ward at the Australian Statistical Conference in December 2016. 66
• 63. www.srcentre.com.au Community Trust in Statistics Survey Determine public awareness of, and trust in, official statistics Dual-frame phone survey Also available for respondents and non-respondents: part-of-state (based on the landline prefix) or mobile Collected info from ~727 refusals – age, sex, awareness, trust 67
• 64. www.srcentre.com.au Challenge ➢ Incorporation of refusal information in non-response adjustment Probability of response derived from propensity model Base weight = Design weight / Probability of response Limited auxiliary information available for respondents and non-respondents 68
  • 65. www.srcentre.com.au Non-response adjustment Awareness / Trust Non-respondents (%) Respondents (%) Have heard of and trust a great deal 12.4 20.5 Have heard of and tend to trust 28.5 54.6 Have heard of and tend to distrust 6.2 10.5 Have heard of and distrust a great deal 4.8 2.2 Have heard of but DK / Refused trust 26.3 2.9 Total awareness 78.2 90.7 Have not heard of 17.1 9.0 Don’t know / Refused 4.8 0.4 69
• 66. www.srcentre.com.au Non-response adjustment (cont’d) Logistic regression model predicts response probability from information available for both respondents and non-respondents: Location, Age, Sex, Awareness of official statistics, Trust in official statistics. Gives a boost to respondents most like non-respondents 70
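A minimal sketch of this response-propensity adjustment: fit a logistic model of response on auxiliary information available for both respondents and non-respondents, then divide the design weight by the fitted propensity. The data, the single predictor, and the Newton-step fit below are illustrative stand-ins, not the survey's actual model or variables.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one auxiliary item known for respondents AND refusals
# (say, awareness of official statistics, coded 0/1), plus a response
# indicator y; the generating propensities are assumed for illustration
n = 1000
aware = rng.integers(0, 2, n)
y = rng.random(n) < np.where(aware == 1, 0.7, 0.4)

# Logistic regression of response on the auxiliary item, via Newton steps
X = np.column_stack([np.ones(n), aware])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    hessian = X.T @ (X * (p * (1.0 - p))[:, None])
    beta += np.linalg.solve(hessian, X.T @ (y - p))

# Non-response adjustment: base weight = design weight / fitted propensity
propensity = 1.0 / (1.0 + np.exp(-X @ beta))
design_weight = np.ones(n)
base_weight = design_weight[y] / propensity[y]   # respondents only
```

Because the toy model is saturated (one binary predictor), the fitted propensities equal the observed response rates within each group, and the adjusted weights of the respondents sum back to the full sample size, which is the "boost to respondents most like non-respondents" in action.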
  • 67. www.srcentre.com.au Non-response adjustment (cont’d) Awareness / Trust Unweighted (%) Without propensity weight (%) With propensity weight (%) With % - Without % Have heard of and trust a great deal 20.5 15.3 14.3 -1 Have heard of and tend to trust 54.6 52.6 47.5 -5.1 Have heard of and tend to distrust 10.5 11.1 9.9 -1.2 Have heard of and distrust a great deal 2.2 2.4 2.9 0.5 Have heard of but DK / Refused trust 2.9 3.4 7.5 4.1 Total awareness 90.7 84.8 82.1 -2.7 Have not heard of 9.0 14.5 17.1 2.6 Don’t know / Refused 0.4 0.7 0.9 0.2 71
• 68. www.srcentre.com.au Non-response adjustment (cont’d) Assumptions Refusals who provided basic information are representative of all non-contacts Refusals and respondents answer non-response items in the same way 72
• 69. www.srcentre.com.au Non-response adjustment (cont’d) Has face-validity, though… For example, persons with university education are typically over-represented in RDD surveys, therefore awareness estimates may have been inflated 73
• 70. www.srcentre.com.au Techniques to reduce bias in non-probability samples Dina Neiger
• 71. www.srcentre.com.au Acknowledgments Data: The Social Research Centre; Cancer Council Victoria. Slides and Ideas: Darren Pennay, Andrew C Ward, Paul J Lavrakas, Tina Petroulias, Sebastian Misson, Charles DiSogra, and Inferences from Non-Probability Samples Workshop participants, Paris 2017 75
• 72. www.srcentre.com.au Key background references ➢ Yeager, D. S., Krosnick, J. A., Chang, L., Javitz, H. S., Levendusky, M. S., Simpser, A. (2011) o Comparing the Accuracy of RDD Telephone Surveys and Internet Surveys Conducted with Probability and Non-Probability Samples. Public Opinion Quarterly ➢ DiSogra, C., Cobb, C., Chan, E., Dennis, J. M. (2011) o Calibrating Non-Probability Internet Samples with Probability Samples Using Early Adopter Characteristics. Section on Survey Research Methods – JSM 2011 ➢ Valliant, R., Dever, J. A. (2011) o Estimating Propensity Adjustments for Volunteer Web Surveys. Sociological Methods & Research, 40(1), pp. 105-137 ➢ Terhanian, G., Bremer, J., Haney, C. (2014) o A Model-Based Approach for Achieving a Representative Sample ➢ Fahimi, M., Barlas, F. M., Thomas, R. K., Buttermore, N. (2015) o Scientific Surveys Based on Incomplete Sampling Frames and High Rates of Nonresponse. Survey Practice, 8(5) 76
• 73. www.srcentre.com.au Usual starting point ➢ Design weight – inverse of probability of selection o Probability sample: number of people in the household, number of landlines, etc. o Non-probability sample: unknown, usually given design weight=1 ➢ Calibration (post-stratification, raking, RIM) o Uses external measures as “benchmarks” to adjust/weight (calibrate) the data to improve accuracy and force estimates to be consistent with other data sources o Applied to both probability and non-probability samples for key population demographics e.g. gender, age, education to reflect population distribution o If a non-probability sample uses quotas for the same demographics then post-stratification will have minimal impact 77
• 74. www.srcentre.com.au Selecting benchmarks Benchmarks should be a known source of non-response bias and likely to have an effect on survey estimates. Must have information for each sample member individually and the population as a whole ➢ Identical or as close as possible (e.g. the same question in the survey and the census) Stratified sample: ➢ Unequal probability of selection within population subgroups (markets / strata) ➢ Population totals within each subgroup/stratum must be included as part of the weights Common variables in general population surveys: ➢ Telephone status (for dual frame surveys) ➢ Gender ➢ Age by education ➢ Country of birth (Australia vs Other English Speaking Countries vs Non-English Speaking Countries) ➢ Geography (State, Metro vs regional) 78
• 75. www.srcentre.com.au Is your adjustment working? Key considerations: ➢ Variance (probability samples) and bias Variance: ➢ Your sample is one of many possible samples: how different would your result be if you happened to select a different random sample? ➢ If you use too many benchmarks (or the data is severely biased), weights will become extreme, increasing the variance of your estimate and decreasing your confidence in the accuracy of your result Bias: ➢ Difference between your survey result and the true value ➢ If you ignore a known source of non-response bias then survey results will be further away from the truth ➢ Need to know the truth in order to measure bias ➢ Bias measurement dilemma – if you already know the truth, why bother with the survey? What about weighting efficiency? ➢ A measure of how much work a weight is doing ➢ What is an acceptable minimum? No standard ➢ Can be misleading, especially in non-probability or highly biased samples (e.g. weight=1) 79
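Weighting efficiency is usually operationalised as Kish's effective-sample-size ratio; a generic sketch, not tied to any of the surveys above:

```python
def weighting_efficiency(weights):
    """Kish's effective-sample-size ratio: (sum w)^2 / (n * sum w^2).

    1.0 means the weights do no work (all equal); values toward 0 mean
    extreme weights are inflating the variance of estimates.
    """
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / (len(weights) * s2)

weighting_efficiency([1, 1, 1, 1])   # 1.0: equal weights, no efficiency loss
weighting_efficiency([1, 1, 1, 5])   # ~0.57: one extreme weight
```

This is also why an efficiency of 1.0 can mislead for non-probability samples: weight=1 for every case scores perfectly while doing nothing about bias.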
• 76. www.srcentre.com.au Case study 1: The Online Panels Benchmarking Study ➢ Three surveys based on probability samples of the Australian population aged 18 years and over ➢ Five surveys of persons aged 18 years and over administered to members of non-probability online panels ➢ Survey questionnaire included a range of demographic questions and questions about health, wellbeing and use of technology ➢ Same questions were used across the eight surveys (unified approach to questionnaire design to try and minimise mode effects) ➢ 9-minute average interview length for online and telephone (12-page booklet) ➢ Fieldwork: Oct – Dec 2015 ➢ Data and documentation available from the Australian Data Archive https://www.ada.edu.au/ada/01329 80
  • 77. www.srcentre.com.au Results – substantive health characteristics (modal response) 81 Substantive variables Benchmark value Distance from benchmarks Probability Non-probability ABS ANU Poll DF RDD P1 P2 P3 P4 P5 Life satisfaction (8 out of 10) Percentage point error 32.6 -2.0 -2.0 1.9 -11.9 -11.6 -4.5 -9.2 -7.9 Psychological distress - Kessler 6 (Low) Percentage point error 82.2 -10.6 -11.6 -8.1 -25.9 -23.5 -22.2 -25.0 -23.2 General Health Status (SF1) (Very good) Percentage point error 36.2 0.4 -2.0 -2.6 -4.1 -5.8 -5.3 -5.0 1.5 Private Health Insurance Percentage point error 57.1 3.4 1.9 3.3 -8.9 -12.5 -3.7 -0.6 -2.6 Daily smoker Percentage point error 13.5 -4.1 3.5 1.6 9.8 6.7 3.9 2.7 4.3 Consumed alcohol in the last 12 months Percentage point error 81.9 3.6 -2.8 -4.0 2.4 5.3 3.9 4.2 1.5
  • 78. www.srcentre.com.au Impact on weighting Impact of weighting on average absolute error: The average difference (percentage points) across all benchmarks between the official statistics and the survey estimates. 82 Probability Surveys Non-probability Panels ABS ANU Poll RDD P1 P2 P3 P4 P5 Unweighted (avge. error) 4.28 3.68 4.63 9.34 10.35 7.28 7.20 6.41 Weighted (avge. error) 4.02 3.98 3.58 10.5 10.9 7.24 7.78 6.83 Impact ✓  ✓   -  
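The "average absolute error" summary in this table is simply the mean absolute percentage-point gap across benchmark items. A sketch, fed here with just three items for the ABS survey from the previous slide rather than the study's full benchmark set:

```python
def average_absolute_error(estimates, benchmarks):
    # Mean absolute percentage-point gap between the survey estimates
    # and the corresponding official benchmark values
    return sum(abs(e - b) for e, b in zip(estimates, benchmarks)) / len(benchmarks)

# Three items from the previous slide (ABS column): benchmark + error = estimate
benchmarks = [32.6, 82.2, 36.2]
estimates  = [30.6, 71.6, 36.6]   # errors of -2.0, -10.6 and +0.4
average_absolute_error(estimates, benchmarks)   # (2.0 + 10.6 + 0.4) / 3 ≈ 4.33
```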
• 80. www.srcentre.com.au Design weight: Non-probability sample ➢ Current approaches: o Best case scenario • Model-based selection to address known biases (e.g. quotas) o Worst case scenario • Participation is a combination of idiosyncratic contacts, low response rates, and non-coverage. • “This is not the kind of design one would choose if there were an affordable alternative.” • (Douglas Rivers, comment on the 2013 AAPOR Task Force Report on Non-probability Sampling. J Surv Stat Methodol 2013; 1(2): 111-117. doi: 10.1093/jssam/smt009) ➢ Possible alternative: o Adapt the response propensity model to model the likelihood of being part of the non-probability sample o Need a probability reference sample 84
• 81. www.srcentre.com.au Probability reference sample ➢ Gold standard that we’d like the non-probability sample to resemble as closely as possible o Known to produce accurate estimates for key outcomes o Ideally includes data that can be compared to independent benchmarks ➢ Needs o Common data items with the non-probability sample • As a minimum: demographic and attitudinal items and outcomes of interest o Comparable data • Mode (depending on the questionnaire) • Question wording • Reference timeframe 85
• 82. www.srcentre.com.au Case study 2: LinA as a reference sample for OPBS ➢ Life in Australia o http://www.srcentre.com.au/our-research#life-in-aus o Australia’s first and only probability-based online panel o Launched in December 2016 o 3,300 adults aged 18 years and over from across Australia, randomly recruited via a dedicated dual-frame telephone survey; covers the online and offline population o Replicated the online panel benchmarking study in February 2017 ➢ OPBS non-probability panels reweighted o Design weight=1 o Post-stratification weight matched to LinA weighting o Add enrolment to vote to the substantive characteristics 86
  • 83. www.srcentre.com.au LinA as a reference sample 87 Substantive variables Benchmark value (%) Distance from benchmarks (percentage point difference from benchmark) LinA Non-probability Panels 1 2 3 4 5 Life satisfaction (8 out of 10) 32.6 -1.4 -11.1 -13.6 -5.9 -10.3 -8.0 Psychological distress - Kessler 6 (Low) 82.2 -21.3 -26.5 -25.5 -22.7 -25.1 -23.4 General Health Status - SF1 (Very good) 36.2 -3.8 -4.2 -4.2 -4.0 -5.3 0.1 Private Health Insurance 57.1 2.6 -8.0 -9.8 -3.9 0.1 -2.0 Daily smoker 13.5 -1.0 9.0 6.0 3.5 2.2 3.5 Consumed alcohol in the last 12 months 81.9 2.7 -1.1 -6.4 -3.6 -4.9 -1.7 Enrolled to vote 78.5 9.0 8.4 7.6 10.7 8.8 13.1 Average absolute difference 5.23 8.54 9.14 6.79 7.09 6.48
• 84. www.srcentre.com.au Back to model-based design weights ➢ Use reference sample to calculate o Propensity scores – the conditional probability that a respondent is in the non-probability sample rather than in the probability sample, given observed characteristics of the respondent • Logistic model (R PracTools package, Valliant et al. 2015) • Non-probability cases=1, reference cases=0 • Design weight for the non-probability sample – inverse of the estimated probability of inclusion in the non-probability sample, given the weighted reference sample • Beware: extreme weights, size and quality of the reference sample ➢ Result: o Probability-based design weights for the non-probability sample o Now ready to look at calibration/weighting adjustment methods 88
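A toy sketch of the procedure described above: stack the non-probability cases (coded 1) with the weighted probability reference cases (coded 0), estimate the propensity of being in the non-probability sample, and take its inverse as the pseudo design weight. With a single binary covariate the logistic fit is saturated, so the sketch computes the cell proportions the model would reproduce; the data are invented for illustration.

```python
import numpy as np

# Toy covariate (e.g. heavy internet user yes/no) observed in both samples
nonprob_x = np.array([1, 1, 1, 0, 1, 1, 0, 1])   # skews toward heavy users
ref_x     = np.array([1, 0, 0, 1, 0, 0, 1, 0])   # probability reference sample
ref_w     = np.ones(8)                           # reference sample weights

pseudo_weights = np.empty(len(nonprob_x), dtype=float)
for cell in (0, 1):
    n_np  = (nonprob_x == cell).sum()
    n_ref = ref_w[ref_x == cell].sum()
    # Propensity of being a non-probability case within the cell; with one
    # categorical covariate this equals a saturated logistic model's fit
    p = n_np / (n_np + n_ref)
    pseudo_weights[nonprob_x == cell] = 1.0 / p
```

In this toy example the over-represented heavy users get a pseudo design weight of 1.5 and the under-represented light users 3.5, pulling the non-probability sample toward the reference sample's composition; the slide's warning about extreme weights kicks in when a cell's propensity approaches zero.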
• 86. www.srcentre.com.au No agreed approach – work in progress ➢ Starting point – known biases in the non-probability sample ➢ For example, for online panels o Heavier internet users o Heavier media users o More interested in technology o Early adopter skew o More health care card holders ➢ Common theme o Standard demographics are not enough o High quality “official” benchmarks not available ➢ Probability reference sample to the rescue! o As long as items covering the known/suspected biases in the non-probability sample are included 90
• 87. www.srcentre.com.au Case study 2 ➢ Significant differentiators o Early adopter variables o Internet usage variables o Income o Employment o Remoteness o Home ownership o Media consumption variables were not included in the survey ➢ Incorporate one additional variable at a time and compare with benchmarks to evaluate the impact on bias 91
• 88. www.srcentre.com.au Options for EA inclusion in calibration ➢ Each EA variable is included as a 0/1 variable where o 1 means agreed or strongly agreed with that statement and o 0 all else (disagreed or strongly disagreed, did not respond) ➢ Scale derived using the Rasch model o Bond, T. G. and C. M. Fox (2007). Applying the Rasch Model: Fundamental Measurement in the Human Sciences (2nd ed.). Mahwah, N.J.: Erlbaum. ➢ Categorical scale o None – did not agree or strongly agree with any of the 5 statements o Some – agreed or strongly agreed with at least one and no more than 2 of the 5 statements o High – agreed or strongly agreed with 3 or more of the 5 statements ➢ Agreement total calculated as the number of statements (min 0, max 5) that the respondent agreed or strongly agreed with 92
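The 0/1, agreement-total and categorical codings above are easy to make concrete; a sketch assuming verbatim lowercase response labels (the actual questionnaire labels may differ, and the Rasch-scale option is omitted as it needs a fitted measurement model):

```python
# Five early-adopter (EA) statements, responses on a 5-point agree scale
AGREE = {"agree", "strongly agree"}

def ea_indicators(responses):
    # 0/1 coding: 1 = agreed or strongly agreed, 0 = all else (incl. no response)
    return [1 if r in AGREE else 0 for r in responses]

def ea_total(responses):
    # Agreement total: number of the 5 statements agreed with (0-5)
    return sum(ea_indicators(responses))

def ea_category(responses):
    # None: 0 agreements; Some: 1-2; High: 3 or more
    t = ea_total(responses)
    return "None" if t == 0 else ("Some" if t <= 2 else "High")

ea_category(["agree", "neutral", "strongly agree", "disagree", "agree"])  # "High"
```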
• 89. www.srcentre.com.au Options for EA inclusion in calibration Composite is better than individual variables 93 [Chart: Impact of EA variables on bias (% change) for EA1-EA4, EA1-EA5 Rasch scale, EA1-EA5 categorical scale and EA1-EA5 agreement total; series: ave abs error compared to unweighted, ave abs error compared to std adjustment, ave RMSE compared to std adjustment]
• 90. www.srcentre.com.au Internet usage measures Not useful when calibrating online to online 94 [Chart: Impact of internet usage measures on bias (% change) for combinations of items (look for information; access at home; post to blogs etc; financial transactions; social media; frequency of use); series: ave abs error compared to unweighted, ave abs error compared to std adjustment, ave RMSE compared to std adjustment]
• 91. www.srcentre.com.au Socio-economic variables in calibration Income – single most influential variable to reduce bias and RMSE Benefit from including both income and employment 95 [Chart: Impact of income, employment, home ownership and remoteness variables on bias (% change); series: ave abs error compared to unweighted, ave abs error compared to std adjustment, ave RMSE compared to std adjustment]
• 93. www.srcentre.com.au Impact of calibration (LinA: Income, Employment, EA total agreement score) 97 Benefit of adjustment is sample dependent! Standard adjustments do not help with bias reduction. Average Absolute Error % Change in bias Unweighted Std adjustments Design weights and key differentiators From unweighted From std adjustment Panel 1 9.2 9.8 7.8 -15.1 -19.8 Panel 2 10.0 10.4 9.3 -7.2 -10.9 Panel 3 7.7 7.7 5.3 -31.0 -31.5 Panel 4 7.4 8.1 7.1 -3.3 -11.9 Panel 5 7.4 7.4 7.8 5.4 5.7 All Panels 8.3 8.7 7.5 -10.4 -13.9
• 94. www.srcentre.com.au Comparison with independent benchmarks vs reference sample benchmarks 98 Mostly the change in bias is consistent, but there can be large differences for some samples. % Change in bias from unweighted (Independent Benchmarks / Reference Sample): Panel 1 -15.1 / -26.9; Panel 2 -7.2 / 5.7; Panel 3 -31.0 / -31.7; Panel 4 -3.3 / 28.7; Panel 5 5.4 / 1.0. % Change in bias from std adjustment (Independent Benchmarks / Reference Sample): Panel 1 -19.8 / -30.6; Panel 2 -10.9 / 0.6; Panel 3 -31.5 / -31.5; Panel 4 -11.9 / -12.4; Panel 5 5.7 / -8.3.
• 95. www.srcentre.com.au What about blending? ➢ Adding a probability sample to the non-probability sample will further decrease the bias ➢ Confirmed that o Combining probability and non-probability data greatly reduces variability across panels and has a much larger impact than calibration alone o Use the best probability sample to combine with the non-probability sample, regardless of mode and response rate ➢ But… ➢ Costs and feasibility of running a parallel probability sample o Push to web – costs (incentives, follow up), timeliness o Listed vs RDD mobile – option for blending 99