2. Presentation
Maja Fromseier Petersen, chief consultant and team leader for the survey
consultants
Master's degree in Cultural Geography from the University of Copenhagen
2003-2007: Statistics Denmark, statistics on entrepreneurs and globalization
2007-2013: Deputy manager at SFI Survey (the data collection department
of the National Institute of Social Research)
2014-: SFI Survey merged with Statistics Denmark to form the "Statistics
Denmark Survey department"
Contact info: mfm@dst.dk
3. The Danish Statistical system
[Diagram of the register system]
Person register (CPR), id: Person-No
Business register (CVR), id: CVR-No
Building and Housing Register (BBR), id: Address
Statistics shown: Health, Tax, Employment, Education, Social statistics, and more
Also shown: Web forms, Interview, Cadastral
4. SD Survey’s business model
SD Survey is an external sales office within Statistics Denmark
90% of the turnover of 5 mill. € comes from external customers
Positive result in 2015: 0.6 mill. €. Overhead: 2 mill. €
Contacts 300,000 people annually in 80 surveys
29 employees, 25-50 interviewers at the central CATI facility
and 200 nationwide interviewers
No product surveys and no private marketing
Covers the entire process: sample, questionnaire, weighting
and report
Three principles:
1. No loss
2. No (or not much) profit
3. Take care of the reputation of Statistics Denmark
5. Mission and products
Mission: Make SD methods and registers valuable
for others
Products:
Optimized sampling and data collection with:
telephone interviews,
personal interviews,
web forms,
paper interviews,
experiments and tests on the PC
6. Overview of the presentation
A new and well-documented web panel
Representative surveys
Recruiting to the web panel
Selection from the web panel
Non-response in the web panels
Calibration in the web panels
E-boks "panel"
What is E-boks and what is the E-boks population?
The E-boks test: Who's answering?
7. A new and good web panel
Researchers at universities, in government and at
research institutions also use cheap web panels
Idea:
Using Statistics Denmark's registers of the entire
population, it should be possible to establish a web
panel that comes as close as possible to the
classic representative surveys, by controlling the
bias
8. A new and good web panel
Accept the bias that comes from requiring access to the
Internet.
Requirements for the new web panel:
1. Recruited from representative surveys (simple
random sampling of the total population)
2. Be within the 95% confidence interval for all other
variables not used for selection
9. Representative surveys (1)
A 'sample' can be selected in many ways,
- for example everyone who is here today
Only a sample without bias can really be generalized
Three requirements for probability sampling:
1. Everyone in the target population can be selected
2. Selection with known probability
3. Weighting with the inverse selection probability
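
The third requirement can be illustrated with a minimal sketch. The function name `weighted_share` and the six respondents with their selection probabilities are hypothetical and only serve to show how inverse-probability weights change an estimate; this is not SD Survey's production weighting.

```python
def weighted_share(answers, selection_probs):
    """Estimate a population share from 0/1 answers using inverse-probability weights."""
    # Each respondent stands for 1/p people in the target population.
    weights = [1.0 / p for p in selection_probs]
    return sum(w * a for w, a in zip(weights, answers)) / sum(weights)

# Hypothetical sample: the first two respondents were selected with twice
# the probability of the others, so each of them counts for half as much.
answers = [1, 1, 0, 0, 0, 1]
selection_probs = [0.02, 0.02, 0.01, 0.01, 0.01, 0.01]

unweighted = sum(answers) / len(answers)
weighted = weighted_share(answers, selection_probs)
print(unweighted, weighted)
```

In this toy sample the over-sampled respondents all answer "yes", so the unweighted share is higher than the weighted one.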
10. Representative surveys (2)
Probability sampling:
The true value may just as well be higher as lower than
what we measure
The sample has a statistical uncertainty
The larger the sample, the smaller the confidence interval
But for all background/register variables: 19 out of
20 samples are within the classic 95% confidence
interval
Most important: if a sample is not representative,
a larger sample may just make the estimate more
wrong
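
One way to read the last point is that a fixed bias does not shrink with the sample size, while the confidence interval does, so the interval becomes more and more confidently wrong. The sketch below computes how often a nominal 95% interval would still contain the true share. The 3-point bias, the sample sizes and the function names are assumptions chosen for illustration, and the normal approximation is used throughout.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def coverage(bias, n, p=0.5, z=1.96):
    """Probability that the nominal 95% interval around a biased estimate
    still contains the true share p (normal approximation)."""
    se = math.sqrt(p * (1.0 - p) / n)
    shift = bias / se  # bias measured in standard errors
    return normal_cdf(z - shift) - normal_cdf(-z - shift)

# Hypothetical bias of 3 percentage points at three sample sizes.
for n in (400, 1600, 30000):
    print(n, round(coverage(bias=0.03, n=n), 3))
```

Without bias the coverage stays at about 95% for any sample size; with a fixed bias it falls towards zero as the sample grows.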
11. Representative surveys (3)
One of many monthly omnibus surveys
The sample is drawn from the Central Person Register (CPR); the size is
around 1,600, which gives a random error of up to 2.5%
Education (percent)               Population  Sample
Elementary school                 34.9        34.0
High school and vocational edu.   39.6        39.3
Short education                   4.2         4.0
Medium education                  12.4        13.2
Long education                    8.9         9.5
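
The "random error up to 2.5%" and the comparison in the table can be checked with the usual normal-approximation formula for a share. This is a sketch under assumptions: simple random sampling, and about 1,600 treated as the number of observations; `margin_of_error` is a name introduced here.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation 95% confidence interval for a share p."""
    return z * math.sqrt(p * (1.0 - p) / n)

n = 1600  # assumed number of observations

# Worst case (p = 0.5), in percentage points.
worst_case = 100 * margin_of_error(0.5, n)

# Population and sample shares (percent) from the education table.
education = {
    "Elementary school": (34.9, 34.0),
    "High school and vocational edu.": (39.6, 39.3),
    "Short education": (4.2, 4.0),
    "Medium education": (12.4, 13.2),
    "Long education": (8.9, 9.5),
}

inside = {}
for label, (population, sample) in education.items():
    moe = 100 * margin_of_error(population / 100, n)
    inside[label] = abs(sample - population) <= moe
    print(f"{label}: difference {sample - population:+.1f}, margin {moe:.1f}")
```

Under these assumptions the worst-case margin is just under 2.5 percentage points, and every education group in the sample lies within its margin.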
12. Representative surveys (4)
For any variable: 19 out of 20 samples are within
the classic 95% confidence interval
Ethnicity (percent)   Population  Sample
Danish origin         88.4        89.2
Immigrant             10.2        9.2
Descendants           1.5         1.6
13. Recruiting to Web panel (1)
For more than three years, DST has asked for e-mail addresses
in its representative samples. About 1,500 people answer
every month, and about 600 of them say yes.
There are currently about 30,000 in the database.
There are three sources of bias in this web panel:
1. Non-response in the initial representative sample
2. Those who do not have Internet access or an
e-mail address
3. Those who will not give their e-mail address
14. Recruiting to Web panel (2)
Examples of bias:
                     Population  Web panel
18-34 years          29.0%       23.1%
50-64 years          26.8%       30.6%
Elementary school    32.3%       22.2%
Long education       7.5%        10.4%
< 100,000 DKK        18.1%       12.3%
> 600,000 DKK        15.8%       20.9%
Random error with 30,000 observations: around 0.5 percent
15. Recruiting to Web panel (3)
More examples of bias:
                       Population  Web panel
Men                    50.1%       49.0%
South Denmark          21.1%       21.6%
Danes                  87.8%       94.3%
Western countries      4.7%        2.5%
Non-Western countries  7.5%        3.2%
One plan housing       67.9%       74.3%
16. Recruiting to Web panel (4)
More examples of bias:
                          Population  Web panel
Singles without children  28.1%       20.6%
Single parents            28.1%       20.6%
Unmarried                 37.7%       30.8%
Married                   48.3%       56.5%
Unemployed                20.2%       14.9%
Employed                  61.4%       68.3%
17. Recruiting to Web panel (5)
Conclusion for recruitment
The three sources of bias worsen the picture from the
normal first-step non-response on a number of
variables
Gender and geography are less affected, however
The web panel gives too socially positive a picture of
the population
18. Selection from the web panel (1)
Is it possible to correct the bias by selecting proportionally to register
variables?
                     Population  GAR     EFGAR
G: Men               50.1%       50.1%   50.1%
A: 18-34 years       29.0%       29.1%   29.1%
R: South DK          21.1%       21.1%   21.0%
E: Elementary edu.   32.3%       21.6%   31.6%
F: Single parents    28.1%       20.8%   28.1%
GAR (Gender, Age and Region) cannot correct for Education and
Family
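
Proportional selection can be sketched as drawing a fixed quota from each stratum of the panel, so that the selected group matches the population shares on the selection variables. The example below is a simplified illustration with a single selection variable (age); the panel of 1,000 members, the function `select_proportionally` and the record layout are hypothetical, while the 29.0% and 23.1% shares for 18-34 year olds are taken from the slides.

```python
import random
from collections import defaultdict

def select_proportionally(panel, population_shares, n, seed=0):
    """Draw about n panel members so that each stratum matches its population share."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for member in panel:
        by_stratum[member["stratum"]].append(member)
    selected = []
    for stratum, share in population_shares.items():
        quota = round(n * share)  # number to draw from this stratum
        selected.extend(rng.sample(by_stratum[stratum], quota))
    return selected

# Hypothetical panel of 1,000 where 18-34 year olds make up 23.1%.
panel = [{"id": i, "stratum": "18-34" if i < 231 else "35+"} for i in range(1000)]
population_shares = {"18-34": 0.29, "35+": 0.71}

chosen = select_proportionally(panel, population_shares, n=100)
share_young = sum(m["stratum"] == "18-34" for m in chosen) / len(chosen)
print(len(chosen), share_young)
```

The selected group matches the population on the selection variable by construction. Variables that are not part of the strata, such as education under GAR, are only corrected to the extent that they follow the selection variables.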
19. Selection from the web panel (2)
Examples of other factors that are not selected proportionally
EFGAR (Education, Family, Gender, Age and Region)
                 Population  GAR     EFGAR
Danes            87.8%       95.3%   93.6%
< 100,000 DKK    18.1%       13.3%   16.9%
Employed         61.4%       68.6%   67.0%
Income, ethnic background and employment are still biased with EFGAR
Income is somewhat better with EFGAR, but still biased.
20. Non-response in web panels
3,000 selected proportionally to EFGAR (Education, Family,
Gender, Age and Region)
                 Selected  Response  Population
18-34 years      28.8%     22.7%     29.0%
Elementary       31.6%     23.1%     32.3%
Men              50.1%     53.7%     50.1%
Single parents   27.9%     27.5%     28.1%
South DK         20.9%     18.9%     21.1%
Men answer more often in the web panel, unlike in other collection
methods
Education and age are wrong again, even though they were corrected in the
selection
21. Calibration in web panels
GREG calibration on gender, age and region, as well as education
and family type
                 Calibration  Population
18-34 years      29.0%        29.0%
Single parents   28.1%        28.1%
Elementary       32.3%        32.3%
< 150,000 DKK    23.9%        26.0%
Very biased non-response gives large variation in the weights.
What is calibrated on fits, of course. But income still does not.
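
The effect of calibration on the weights can be illustrated with raking (iterative proportional fitting), used here as a simpler stand-in for GREG and not as the method behind the table. The margins for age and education are the population and response shares from the slides; the joint counts of the 1,000 respondents, the function `rake` and the record layout are hypothetical.

```python
def rake(respondents, margins, iterations=50):
    """Adjust weights until the weighted shares match the target margins."""
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for variable, targets in margins.items():
            totals = {}
            for w, r in zip(weights, respondents):
                totals[r[variable]] = totals.get(r[variable], 0.0) + w
            grand_total = sum(totals.values())
            # Scale each category so its weighted share equals the target share.
            factors = {c: targets[c] * grand_total / totals[c] for c in targets}
            weights = [w * factors[r[variable]] for w, r in zip(weights, respondents)]
    return weights

# Hypothetical joint counts giving 22.7% aged 18-34 and 23.1% elementary education.
counts = {("18-34", "Elementary"): 40, ("18-34", "Other"): 187,
          ("35+", "Elementary"): 191, ("35+", "Other"): 582}
respondents = [{"age": a, "edu": e} for (a, e), k in counts.items() for _ in range(k)]

margins = {"age": {"18-34": 0.290, "35+": 0.710},
           "edu": {"Elementary": 0.323, "Other": 0.677}}
weights = rake(respondents, margins)

total = sum(weights)
share_young = sum(w for w, r in zip(weights, respondents) if r["age"] == "18-34") / total
share_elem = sum(w for w, r in zip(weights, respondents) if r["edu"] == "Elementary") / total
print(round(share_young, 3), round(share_elem, 3), round(max(weights) / min(weights), 2))
```

The calibrated variables match the population by construction, and the ratio between the largest and smallest weight shows how much the weights vary. A variable that is not among the margins, such as income, is not guaranteed to fit.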
22. To sum up on Statistics Denmark's
web panel
Even when recruited from representative surveys, there is great bias
Proportional selection on gender, age and region solves nothing
Proportional selection on more than gender, age and region is
necessary and helps
It is better both to select proportionally on and to weight by income
and education, plus age, gender and family
The non-response in the web collection gives new bias
Weighting restores the population, but there is big variation in the weights
One could call this web panel quasi-representative
But is it already outdated? And will it be replaced by our E-boks
panel?
23. An alternative web panel: E-boks
An alternative where it is possible to get rid of
"the first bias"
The base is (almost) the total population
24. E-boks
2001: E-boks opens, with a goal of 350,000 users.
2005: All state payslips in Denmark are now sent out via
E-boks.
2006: The number of users reaches one million.
2011: E-boks reaches 3 million users in Denmark and
is established in the Norwegian market. The E-boks
app is also introduced.
2015: E-boks launches E-boks.se. Meanwhile, E-boks
reaches 11 million users in the Nordics.
25. Panel of 4 million users
"All" Danish citizens aged 15+
Contact information is maintained by the
respondents
Link to CPR numbers and all register information
Limited to public-sector customers
26. Users of E-boks
Not registered: the elderly
Heavily registered: unemployed people on unemployment benefits,
families with children
Age        Registered for E-boks
95+        14.1%
85 to 94   25.8%
75 to 84   53.5%
65 to 74   81.6%
55 to 64   92.1%
45 to 54   95.3%
35 to 44   97.1%
25 to 34   97.6%
15 to 24   98.3%
Total      89.4%
27. Test of using E-boks
March 2016
Normal omnibus: response rate 61.4 pct.
Invitation by normal letter
Reminder by normal letter
CATI interviewing, and an extra written reminder for those
without a phone number
E-boks test: response rate 36 pct.
E-boks invitation
E-boks reminder
Second E-boks reminder
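33. E-boks test: Test of using E-boks -
labour market
[Bar chart: response rate by labour market status, Ordinary vs. E-boks; axis 0% to 80%]
Categories (legend order): 5. Outside labour market, 1. Students, 2. Employees basic level,
3. Employees medium level and higher, 4. Self-employed
Response rates in chart order (Ordinary / E-boks): 56% / 28%, 56% / 32%, 61% / 40%,
70% / 42%, 68% / 33%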
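34. E-boks test: Test of using E-boks -
family income
[Bar chart: response rate by family income, Ordinary vs. E-boks; axis 0% to 80%]
Categories (legend order): No income / unknown income, 0-200,000, 200-350,000,
350-500,000, 500-750,000, 750+
Response rates in chart order (Ordinary / E-boks): 46% / 27%, 48% / 29%, 60% / 28%,
64% / 44%, 70% / 46%, 73% / 39%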
35. E-boks test: conclusions on who's
answering
Those answering in the E-boks test:
Equal numbers of men and women
Those aged 50+ (until they become exempt), but the youngest
answer more often than in the normal set-up.
The highly educated (those with shorter higher education answer
more often than in the ordinary set-up)
The employed (employees at basic level answer more
often than in the ordinary set-up)
Those not living in the big cities
Those earning more than 350,000 DKK a year (around
47,000 euros)
36. Difference in reply pattern
Consumer price index - non-weighted
How do you think the prices are today, compared with
a year ago?
                      E-boks  Normal
Higher                61      50
Same                  34      44
uændret (unchanged)   5       6
Question from the National Police - non-weighted
To what extent are you concerned about crime and
violence in society?
                      E-boks  Normal
Very concerned        18      24
Somewhat concerned    34      32
A bit concerned       37      36
Not at all            11      7
37. Use of E-boks in Omnibus
The test has led to a change in the contact strategy for
the normal omnibus:
E-boks mail
Normal letter after 2 days
CATI interviewing and second reminder letter
E-boks mail 2-3 days before closing
Use of E-boks mail seems to have raised the
completion rate by about 2-3 pct. and lowered the
expenses for postage.
38. We need to be able to let the respondents answer
on phones / tablets
E-boks app -> more tablet / phone
Implications for survey software
          Q1 2015  Q1 2016
Desktop   73%      68%
Tablet    20%      23%
Mobile    7%       9%
39. E-boks omnibus with and without smileys
Share answering "Agree":
                                              Extra Omnibus  Extra Omnibus + Smileys
Daycares can demand that parents take their
children out for 3 weeks of summer holiday    28 pct.        36 pct.
0-class obligatory or not                     67 pct.        79 pct.
Answer scale: Fully agree, Agree, Both agree and disagree, Disagree, Fully disagree
40. Web panel and E-boks panel: conclusions and your questions
Even when the web panel is recruited from representative surveys, there
is great bias
The non-response in the web collection gives new bias
Weighting restores the population, but there is big variation in the
weights
One could call this web panel quasi-representative
E-boks solves the problem of the first sources of bias, but it is still
a problem that some respondents are hard to get to answer.
We still have to weight the data to restore the population.
What do you think is the future for these ways of collecting data?
Where should we focus and what should we be aware of?