At the Advertising Research Foundation's (ARF) 2011 annual re:think convention, a key issues forum presentation was held on Research Quality, entitled "Should We Dismantle the Factory? An Approach to Evaluate Data Collected from Multiple Sample Sources and Generated through Different Approaches." The presentation discussed the measurement of online research. George Terhanian, Ph.D. presented and Gian Fulgoni moderated.
15. TOPIC
Should We Dismantle the Factory?
An Approach to Evaluate Data Collected from Multiple Sample Sources and Generated through Different Approaches
DATE: 21 March 2011
PRESENTER: George Terhanian, Ph.D.
21. Advocates of online research sang its praises in the
1990s
“Online research is an unstoppable train. And it is
accelerating. Those who don’t get on board run the risk
of being left far behind.”
Humphrey Taylor & George Terhanian (1999). “Heady Days Are Here
Again. Online Polling is Rapidly Coming of Age.” Public Perspective
22. Critics compared those advocates to the sirens who
summoned Ulysses’ sailors to their doom
“A growing number of survey researchers are
unfortunately being led to the rocks like Ulysses’
sailors following the Siren call of cheap, but worthless,
data.”
Warren Mitofsky (1999). "Pollsters.com." Public Perspective
23. The remarkable growth of online research has
silenced most critics
[Chart: US and Europe online research spending, $Millions]
24. The continuing growth of online research is creating
supply challenges and stimulating innovation
[Diagram: Access Panels + Rivers + Communities]
25. It’s also heightening concerns about data quality and
representativeness—many buyers have many
questions about the new sample sources
"What are the biases?"
"How do I put the pieces together?"
26. Some argue that consistency is king but the
argument overlooks the importance of accuracy
[Chart: New Hampshire example]
31. How agencies regard potential survey respondents
has not changed much through the years either
32. Some agencies are beginning to collect many
different types of information, all of which can be
linked to specific individuals, and re-used
• Surveys
• Online behavioral measurement (e.g., sites visited, advertisements seen)
• Off-line behavioral measurement (e.g., footprint via GPS within phone)
• Video blogging
• Moderated discussions
• Information from CRM systems
33. Some experts feel that even more change is on its
way: the DIY market has already surpassed $500M
35. What about the problem of self-selection?
Garbage in, garbage out? Not necessarily…
36. Experimenters have been dealing with the
problem of self-selection for decades to make fair
comparisons and to estimate causal effects
• Smokers vs. Non-Smokers?
– Does smoking cause cancer? How do we know?
• Mastectomy vs. Conservation?
– Which one is more effective? How do we know?
• And so on? Possibly…
– Phone vs. F2F vs. Panel vs. River vs. Social
Community….
37. Figuring out how to replicate the randomization of
the coin flip is crucial
38. Ideas on how to address the survey self-selection
problem were introduced about 60 years ago
“Since it would not have been feasible for Kinsey, Pomeroy,
and Martin to take a large sample on a probability basis, a
reasonable probability sample would be, and would have
been, a small one and its purpose would be:
– to act as a check on the large sample, and
– possibly, to serve as a basis for adjusting the results of
the large sample.”
Cochran, Mosteller & Tukey, 1954, p. 2
39. Change a few words around, and you’ve nearly
solved the problem
Since it would not have been feasible...to take an ONLINE sample on a
probability basis, a reasonable probability sample would be, and would
have been, a TELEPHONE one and its purpose would be:
– to act as a check on the ONLINE sample, and
– possibly, to serve as a basis for adjusting the results of the
ONLINE sample
[Diagram: Census = Non-Online Users + Online Users. A logistic model predicts the probability, or propensity score, of participating in the RDD survey.]
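The adjustment described above can be sketched in a few lines. This is an illustrative example on synthetic data, not the study's actual model: the covariates, sample sizes, and fitting details are all assumptions. The idea is to pool the reference (phone/RDD) sample with the online sample, fit a logistic model for the probability of being in the reference sample, and weight each online respondent by the resulting propensity odds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pooled data: label 1 = reference (phone/RDD) sample,
# label 0 = online sample. X holds two standardized attitudinal
# covariates whose distributions differ between the two samples.
n_phone, n_online = 300, 300
X = np.vstack([rng.normal(0.5, 1, (n_phone, 2)),     # phone respondents
               rng.normal(-0.5, 1, (n_online, 2))])  # online respondents
y = np.r_[np.ones(n_phone), np.zeros(n_online)]

# Fit a logistic model P(in phone sample | X) by gradient ascent
# on the log-likelihood (a minimal stand-in for a standard solver).
Xb = np.c_[np.ones(len(X)), X]           # add an intercept column
w = np.zeros(Xb.shape[1])
for _ in range(2000):
    p = 1 / (1 + np.exp(-Xb @ w))        # predicted propensity scores
    w += 0.01 * Xb.T @ (y - p) / len(y)  # log-likelihood gradient step

# Weight each online respondent by the propensity odds p/(1-p):
# online cases that "look like" phone respondents are up-weighted.
p_online = 1 / (1 + np.exp(-Xb[n_phone:] @ w))
weights = p_online / (1 - p_online)
weights *= n_online / weights.sum()      # rescale to the online sample size
```

The rescaled `weights` would then multiply any existing demographic weights for the online sample, which is one common way to combine the two adjustments.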
40. What might this mean for the DIY market?
The same information but faster and cheaper?
$800 per question, n =1010, 5 days
$400 per question, n=1100, 2 days
$200 per question, n=1100, < 1 day
42. Data Collection Details (US Adults, 18+)
1. Telephone: 1,019 completed interviews among respondents residing
in private households in the continental US. They were contacted
through random-digit-dialing. $800 per question.
2. Online Panel: 1,100 completed interviews among members of
Toluna’s online panel who were invited by email. $400 per question.
3. Online River: 1,100 completed interviews among non-members of
Toluna’s online panel who were directed to the survey after clicking
through an advertisement or link on the Internet. $400 per question.
4. Social: 1,100 completed interviews among members of Toluna’s
social voting community who were invited through Toluna’s DIY
QuickSurvey (TQS) service. $200 per question.
Data Collection Dates:
Phone: February 23-27, 2011
Online: February 28-March 3, 2011
43. Topics Covered (the “dependent variables”)
1. General: Quality of health, Approval for Obama, Degree of
religiousness, Own passport, Possess driver’s license, Smoke
cigarettes, TV viewing per week
2. Attitudes Towards Privacy: AIDS screening at work,
unsolicited calls for selling purposes, cookies on computers to
track, airport searches based on visual profiles
3. Technology Ownership: Smartphone, Digital Camera, Tablet
computer, Game console, Satellite radio, eBook reader
4. Online Behaviors (since 1/1/11): Made purchase, Banked, Used
social network/media application, Uploaded picture, Watched
Video, Participated in Auction
44. Sample Selection and Weighting Adjustments
1. Telephone
– Sample Selection: 50/50 Male/Female ratio within major regions
– Weighting Factors: region, gender*age, race, ethnicity, education, income,
number of adults in household, lived in home without landline in past two
years. Referred to as “demographic weighting”.
2. Online Panel, River, Social
– Sample Selection: Minimum quotas for region, gender*age, race, ethnicity,
education
– Weighting Factors by Target Population:
1. General Population: Same as telephone--”demographic”
2. Online (only) Population: Same factors as telephone, with percentages
determined through telephone survey rather than Census--
”demographic”
3. Propensity Score: Several attitudes and opinions reflecting views
towards environment, new things, politics, personalization
45. Benchmark Choices and Assessment Approach
1. Benchmarks
– Official government data: Percent of adults with a driver’s
license
– Telephone responses
2. Point of Comparison
– The difference between each question’s modal response (i.e.,
the proportion of respondents who select it) and the
benchmark’s.
– The survey/source with the lowest (average) score relative to
the benchmark will be considered the most accurate.
46. General Questions, Dem & Propensity Score
Modal Response Benchmark Panel River Social
Health "Good" 32.3 31.6 33.7 31.5
Obama, "Approve" 46.6 51.8 46.6 50.7
"Moderately" Religious 40.2 37.5 41.4 36.8
Passport, "Do not Own" 56.7 54.6 59 54
Driver's License, "Have" 85.5 85.9 85 84.8
Smoke Cigarettes, "Not at all" 79.4 83 76.5 77.3
TV Watching, "2 hrs per Week" 25.5 19 20.3 19.9
Mean Deviation -- 3.0 1.9 2.8
47. Attitudes Towards Privacy, Dem & Propensity Score
Modal Response Benchmark Panel River Social
AIDS Screening, "Violation" 53.5 52.1 46 50.1
Unsolicited Calls, "Violation" 67.3 75.4 73.9 74.1
Cookies, "Violation" 65.8 64.9 68.6 64.1
Airport Searches, "No Violation" 60.4 60.1 61 62.8
Mean Deviation -- 2.7 4.4 3.6
48. Technology Ownership, Dem & Propensity Score
Modal Response Benchmark Panel River Social
Own Smartphone, "No" 58.8 65 63.7 62.7
Own Digital Camera, "Yes" 73.4 79.5 77.5 79.1
Own Tablet, "No" 91.4 92.6 91.7 92.3
Own Game Console, "No" 53 53 51.8 50.9
Own Satellite Radio, "No" 78.5 79.2 83.1 76.3
Own eBook Reader, "No" 90 89.5 89 88.9
Mean Deviation -- 2.5 2.7 2.7
49. Online Behaviors, Dem & Propensity Score
Modal Response Benchmark Panel River Social
Online purchase since Jan 1, "No" 56.1 56.9 59.7 56.2
Banked online since Jan 1, "Yes" 58.1 68.6 62.5 73
Used social media since Jan 1, "Yes" 64.1 70.5 71.7 67.3
Uploaded picture since Jan 1, "Yes" 61.8 55.2 56.9 57.3
Watched video since Jan 1, "Yes" 70.9 68.8 72.2 73.9
Online auction since Jan 1, "No" 86.4 82.8 82.3 82.6
Mean Deviation -- 5.0 4.3 4.9
51. Summary by Category, Source, & Weighting (1)
General Panel River Social Total
No Weighting 4.3 5.1 4.4 4.6
Demographic Weighting 3.6 3.9 4.2 3.9
Dem and Propensity Score 3.0 1.9 2.8 2.6
Attitudes Towards Privacy Panel River Social Total
No Weighting 5.3 6.9 6.2 6.1
Demographic Weighting 5.4 7.7 6.8 6.6
Dem and Propensity Scoring 2.7 4.4 3.6 3.6
52. Summary by Category, Source, & Weighting (2)
Technology Ownership Panel River Social Total
No Weighting 3.2 2.8 2.5 2.8
Demographic Weighting 2.6 2.5 2.6 2.6
Dem and Propensity Score 2.5 2.7 2.7 2.6
Online Behaviors Panel River Social Total
No Weighting 7.8 5.6 6.9 6.8
Demographic Weighting 5.9 5.3 6.5 5.9
Dem and Propensity Scoring 5.0 4.3 4.9 4.7
53. Overall Comparison to Benchmarks
All Questions Panel River Social Total
No Weighting 5.1 4.9 4.8 4.9
Demographic Weighting 4.3 4.6 4.8 4.6
Dem and Propensity Score 3.3 3.2 3.4 3.3
Cost per question vs. phone -50% -50% -75%
Time required vs. phone -60% -60% -80%
Time Required:
The estimates above for panel, river, and social are standard; for this
study, we stretched data collection over five days, as with telephone.