Presented at CHI 2013
When you share content in an online social network, who is listening? Users have scarce information about who actually sees their content, making their audience seem invisible and difficult to estimate. However, understanding this invisible audience can impact both science and design, since perceived audiences influence content production and self-presentation online. In this paper, we combine survey and large-scale log data to examine how well users’ perceptions of their audience match their actual audience on Facebook. We find that social media users consistently underestimate their audience size for their posts, guessing that their audience is just 27% of its true size. Qualitative coding of survey responses reveals folk theories that attempt to reverse-engineer audience size using feedback and friend count, though none of these approaches are particularly accurate. We analyze audience logs for 222,000 Facebook users’ posts over the course of one month and find that publicly visible signals (friend count, likes, and comments) vary widely and do not strongly indicate the audience of a single post. Despite the variation, users typically reach 61% of their friends each month. Together, our results begin to reveal the invisible undercurrents of audience attention and behavior in online social networks.
Quantifying the Invisible Audience in Social Networks
1. Stanford HCI Group
Quantifying the
Invisible Audience
in Social Networks
Eytan Bakshy, Moira Burke, Brian Karrer
Facebook Data Science Team
Michael Bernstein
Stanford Computer Science Department
2. Sharing on a social
network is like
giving a talk from
behind a curtain.
6. Quantify the difference between
users’ estimated and actual audience
Measure audience size uncertainty
for 220,000 Facebook users
7. Our perception of audience size
affects our behavior
We guide our audience’s impression of us
[Goffman 1959]
We manage the boundaries of when to engage
[Altman 1975]
On social media, we speak to the audience that we
expect is listening
[Marwick and boyd 2011, Viégas 1999]
What if our audience size estimates are
inaccurate?
9. Perceived audience
vs. reality
- survey
- folk theories of audience
- desired audience size
Predictability of
audience size
- using friend count
- using feedback
11. Method
Data
220,000 U.S. Facebook users who share with
friends-only privacy
Collected audience information for their status
updates and link shares over 30 days
150,000,000 viewer-story pairs
16. Method
Audience size measurement
JavaScript tracking whether a
story remained in the browser
viewport for at least 900ms
Not a direct measure of
attention: users remember
~70% of posts they see
[Counts and Fisher 2011]
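The measurement rule above (a story counts as seen once it stays in the viewport for at least 900 ms) can be sketched as a small log-processing step. This is only an illustration: the event tuple layout and the function name `audience_sizes` are assumptions made for the example, not the actual logging schema used in the study.

```python
from collections import defaultdict

DWELL_THRESHOLD_MS = 900  # threshold stated on the slide


def audience_sizes(view_events):
    """Count distinct viewers per story whose viewport dwell met the threshold.

    view_events: iterable of (viewer_id, story_id, entered_ms, left_ms).
    This record layout is hypothetical, chosen only for the sketch.
    """
    viewers = defaultdict(set)
    for viewer_id, story_id, entered_ms, left_ms in view_events:
        if left_ms - entered_ms >= DWELL_THRESHOLD_MS:
            # A set keeps each viewer-story pair from being counted twice.
            viewers[story_id].add(viewer_id)
    return {story_id: len(ids) for story_id, ids in viewers.items()}


# Made-up events for illustration only.
events = [
    ("alice", "story1", 0, 1200),     # 1200 ms in viewport: counts
    ("bob", "story1", 500, 1100),     # 600 ms: too brief to count
    ("carol", "story1", 0, 900),      # exactly 900 ms: counts
    ("alice", "story1", 5000, 7000),  # repeat view by the same viewer
    ("bob", "story2", 0, 3000),
]
sizes = audience_sizes(events)
print(sizes)
```

Counting distinct viewer-story pairs rather than raw impressions matches the way the dataset is described on the Data slide, although the real pipeline may differ in details the slides do not state.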
17. Method
Survey
Recruited users with recent content (2-90 days ago)
via a request at the top of news feed
N=589; 61% female; mean age 33
Audience size survey
Show participants their most recent story
“How many people do you think saw it?”
“Describe how you came up with that number.”
“How many people do you wish saw this content?”
19. Method
Analysis
Compare participants’ actual audience size to
their estimated audience size
Consider your own most recent status update:
What percentage of your social network do you
think saw it?
21. Users underestimate by 4x
Results
[Figure: scatterplot with actual audience (% of friends) on the horizontal axis and perceived audience (% of friends) on the vertical axis, both 0% to 100%. Accurate estimations lie along the diagonal; points above it are overestimates and points below it are underestimates.]
Estimated: 20 friends = 6% of network
Actual: 78 friends = 24% of network
R² = 0.04
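The "4x" headline can be checked against the two figures on the slide. The short calculation below is a worked example, not the authors' analysis: it only divides the reported estimated and actual audience sizes. The abstract's 27% figure is close to this ratio but may have been aggregated differently (for example, per participant), which the slides do not specify.

```python
estimated_friends = 20  # estimated audience, from the slide
actual_friends = 78     # actual audience, from the slide

# Estimated audience as a share of the actual audience.
ratio = estimated_friends / actual_friends

# How many times larger the actual audience is than the estimate.
underestimate_factor = actual_friends / estimated_friends

# Network size implied by each pair of figures (6% and 24% of network).
network_from_estimate = estimated_friends / 0.06
network_from_actual = actual_friends / 0.24

print(f"estimate is {ratio:.0%} of actual audience")
print(f"actual audience is {underestimate_factor:.1f}x the estimate")
print(f"implied network size: {network_from_estimate:.0f} or {network_from_actual:.0f} friends")
```

Both percentages imply a network of roughly 325 to 335 friends, so the two lines on the slide are mutually consistent.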
26. Folk theories of audience
Results
Inductive coding of participants’ explanations of how they
estimated their audience (Fleiss’s kappa = 0.72)
Random guess 23%
Feedback (likes and comments) 21%
“I figured about half of the people who see it
will ‘like’ it, or comment on it”
Fraction of friend count 15%
“Maybe a third of my friends saw it.”
Login timing 9%
“Not a lot of people stay up late at night”
Friends seen active on the site 5%
Number of close friends and family 3%
Who might be interested in the topic 2%
Other 10%
32. Folk theories of audience
Results
No folk theory was more accurate than a
random guess
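The agreement statistic reported for the coding, Fleiss's kappa, compares observed agreement among raters with the agreement expected by chance. The sketch below implements the standard formula on a made-up table; the number of coders (three) and the category counts are assumptions for illustration, since the slides do not report them.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Every item must be rated by the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number of ratings")
    n_categories = len(counts[0])

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical table: 4 responses, 3 coders, 3 categories.
toy_counts = [
    [3, 0, 0],
    [0, 3, 0],
    [2, 1, 0],
    [0, 0, 3],
]
kappa = fleiss_kappa(toy_counts)
print(round(kappa, 3))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so the reported 0.72 sits toward the high end of that scale.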
33. Users want larger audiences
Results
“How many people do you wish saw this content?”
[Bar chart of responses on a five-point scale: far fewer, fewer, same, more, far more. Labeled values: 50%, 25%, 22%.]
Roughly half want a larger audience...
but they already have it.
35. Users underestimate their
audience by 4x
Common folk theories use
feedback and friend count
Users want larger audiences,
but already have them
36. Perceived audience
vs. reality
- survey
- folk theories of audience
- desired audience size
Predictability of
audience size
- using friend count
- using feedback
38. Can we predict a post’s
audience using public
signals?
using the full 220,000 user and 150,000,000 view dataset
39. 35% of friends see median post
Results
More friends means higher variability in audience
[Figure: density of the number of friends who saw a post (horizontal axis 0 to 200), shown for users at the 25th percentile (138 friends), 50th percentile (266 friends), and 75th percentile (484 friends) by friend count.]
42. Audience size is highly variable
Results
Highly variable: 50th percentile range is
20% of friends
[Figure: percent of friends who saw post (0% to 100%) against number of friends (0 to 800), with bands marking the 50th percentile range and the 90th percentile range.]
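One way to read the percentile ranges in this figure is as central intervals: the 50th percentile range would span the 25th to 75th percentiles, and the 90th percentile range the 5th to 95th. That reading is an assumption, as is the helper name `percentile_range`; the sketch below shows how such a range could be computed with Python's standard library on made-up values.

```python
from statistics import quantiles


def percentile_range(values, coverage):
    """Return the (low, high) bounds of the central `coverage` percent.

    coverage=50 gives the 25th to 75th percentiles; coverage=90 gives
    the 5th to 95th. Assumes this is what the slide's ranges mean.
    """
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    low_pct = (100 - coverage) // 2
    high_pct = 100 - low_pct
    return cuts[low_pct - 1], cuts[high_pct - 1]


# Made-up audience shares (% of friends who saw a post), illustration only.
audience_pct = [12, 18, 25, 27, 31, 35, 36, 40, 44, 52, 61]
low, high = percentile_range(audience_pct, 50)
print(f"middle 50% of posts reached {low:.0f}% to {high:.0f}% of friends")
```

Under this reading, a 50th percentile range that is 20% of friends wide means half of all posts fall within a band of that width, and the other half fall outside it.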
44. Audience size prediction
OLS regression
Model predictors            R²     Mean absolute error
Friend count                0.12   8% of friend count
Feedback                    0.13   8% of friend count
Friend count and feedback   0.27   7% of friend count
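The two columns of this table, R² and mean absolute error, can be illustrated with a single-predictor ordinary least squares fit. The sketch below is not the authors' model: the slides do not give the exact specification, the data values are made up, and the choice to predict the fraction of friends who saw a post from the number of likes is an assumption made to keep the error in "% of friend count" units.

```python
def fit_ols(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Share of variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def mean_absolute_error(ys, preds):
    """Average size of the prediction error, in the units of ys."""
    return sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)


# Made-up data: likes on a post and fraction of friends who saw it.
likes = [0, 1, 2, 5, 10]
fraction_seen = [0.20, 0.30, 0.35, 0.45, 0.40]

slope, intercept = fit_ols(likes, fraction_seen)
predictions = [intercept + slope * x for x in likes]
r2 = r_squared(fraction_seen, predictions)
mae = mean_absolute_error(fraction_seen, predictions)
print(f"R² = {r2:.2f}, MAE = {mae:.1%} of friend count")
```

A low R² alongside a moderate mean absolute error, as in the table, means the predictor explains little of the post-to-post variation even though the typical miss is only several percentage points of the friend list.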
45. Feedback is not predictive
Results
[Figure: percent of friends who saw post (0% to 100%) against unique friends liking the post (0 to 15).]
Rapid audience growth until the post receives
feedback from five unique friends
Posts with no likes or comments have especially
large variance: 90th percentile range is 2% to 55%
46. Audience size prediction
OLS regression
Even with access to all user-visible signals,
audience size is still unpredictable.
49. How predictable is a user’s
cumulative audience?
Consider the audience for all of a user’s posts over
30 days instead of a single post
50% of the users in our sample produced five or
more pieces of content during the month
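Cumulative audience can be read as the number of distinct friends who saw at least one of a user's posts during the month, which is why it can exceed the reach of any single post. The sketch below assumes that definition; the function name `cumulative_reach` and the viewer sets are made up for illustration.

```python
def cumulative_reach(viewers_by_post, friend_count):
    """Fraction of friends who saw at least one post.

    viewers_by_post: dict mapping post id to the set of friend ids who saw it.
    """
    # The union counts each friend once, however many posts they saw.
    all_viewers = set().union(*viewers_by_post.values())
    return len(all_viewers) / friend_count


# Hypothetical month of posts for a user with 10 friends.
month = {
    "post1": {"a", "b", "c"},
    "post2": {"b", "d"},
    "post3": {"a", "e"},
}
per_post = {post: len(v) / 10 for post, v in month.items()}
reach = cumulative_reach(month, friend_count=10)
print(per_post, reach)
```

Here no single post reaches more than 30% of friends, yet the three posts together reach 50%, because different posts are seen by partly different people.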
52. Fundamental mismatch between
perceived and actual audience
How might a 4x underestimate be impacting user
behavior?
Type of content shared, sharing volume, motivation
Ambiguous whether a more socially transparent
design would be desirable
54. Why underestimate audience size?
The wishful thinking hypothesis: more comfortable
to blame a noisy distribution channel than to blame
yourself for writing bad content
What role might be played by...
The availability heuristic?
Algorithmic feed filtering?
57. Your invisible audience is
larger than you probably think.
Users underestimate audience size by 4x
Median reach is 35% per post and 61% per month
Many want larger audiences but already have them