Aggregators rely on votes and links to select and present subsets of the large quantity of news and opinion items generated each day. Opinion and topic diversity in the output sets can provide individual and societal benefits, but simply selecting the most popular items may not yield as much diversity as is present in the overall pool of votes and links.
In this paper, we define three diversity metrics that address different dimensions of diversity: inclusion, alienation, and proportional representation. We then present the Sidelines algorithm – which temporarily suppresses a voter’s preferences after a preferred item has been selected – as one approach to increase the diversity of result sets. In comparison to collections of the most popular items, from user votes on Digg.com and links from a panel of political blogs, the Sidelines algorithm increased inclusion while decreasing alienation. For the blog links, a set with known political preferences, we also found that Sidelines improved proportional representation. In an online experiment using blog link data as votes, readers were more likely to find something challenging to their views in the Sidelines result sets. These findings can help build news and opinion aggregators that present users with a broader range of topics and opinions.
Paper at http://www.smunson.com/portfolio/projects/aggdiversity/Sidelines-ICWSM.pdf
Sidelines: An Algorithm for Increasing Diversity in News and Opinion Aggregators
1. Sidelines:
An Algorithm for Increasing Diversity in News
and Opinion Aggregators
Sean Munson, Daniel Zhou, Paul Resnick
School of Information, University of Michigan
2.
3. “front page stories from the last seven days shows that liberal
sites… have had multiple articles a day on the front page while
weeks will go by without a single major conservative blog
achieving popular status.”
– Simon Owens, Mediashift Blog
September 2008
4.
5. today
• Diversity goals
• Sidelines algorithm, based on votes and voters
• Diversity measures, based on votes, voters, and
affiliations
• Pilot test
– metrics
– user response
• Future work
6. diversity goals
• Make people feel represented
• Proportional representation of viewpoints
• Expose everyone to challenging viewpoints
7. approval voting
• Each voter can vote for
an unlimited number of
items, up to once each
• Select the k items with
the most votes
For news aggregator, votes weighted according to age
8. approval voting
• Each voter can vote for
an unlimited number of
items, up to once each
• Select the k items with
the most votes
Risk of tipping?
With approval voting, a small
majority may be able to claim
all the top k spots.
For news aggregator, votes weighted according to age
9. approval voting vs. sidelines
approval voting
• Each voter can vote for an unlimited number of items, up to once each
• Select the k items with the most votes
Risk of tipping? With approval voting, a small majority may be able to claim all the top k spots.
sidelines
• Each voter can vote for an unlimited number of items, up to once each
• Selection: repeat k times
1) Select item with the most votes
2) Voters for that item sidelined for next t turns
For news aggregator, votes weighted according to age
10. documents
A B C D E F
1 ✔ ✔ ✔ ✔
2 ✔ ✔ ✔
3 ✔ ✔ ✔
4 ✔ ✔ ✔
5 ✔ ✔
6 ✔ ✔
total 3 4 2 3 2 3
Result columns: Approval voting, Sidelines
11. documents
A B C D E F
1 ✔ ✔ ✔ ✔
2 ✔ ✔ ✔
3 ✔ ✔ ✔
4 ✔ ✔ ✔
5 ✔ ✔
6 ✔ ✔
total 3 4 2 3 2 3
Approval voting: B, A, D, F
12. documents
A B C D E F
1 ✔ ✔ ✔ ✔
2 ✔ ✔ ✔
3 ✔ ✔ ✔
4 ✔ ✔ ✔
5 ✔ ✔
6 ✔ ✔
total 3 4 2 3 2 3
Approval voting: B, A, D, F
Sidelines (wait of just 1 turn): B
13. documents
A B C D E F
1 ✔ ✔ ✔ ✔
2 ✔ ✔ ✔
3 ✔ ✔ ✔
4 ✔ ✔ ✔
5 ✔ ✔
6 ✔ ✔
total 0 0 2 0 2 0
Approval voting: B, A, D, F
Sidelines (wait of just 1 turn): B, C
14. documents
A B C D E F
1 ✔ ✔ ✔ ✔
2 ✔ ✔ ✔
3 ✔ ✔ ✔
4 ✔ ✔ ✔
5 ✔ ✔
6 ✔ ✔
total 3 4 0 3 0 3
Approval voting: B, A, D, F
Sidelines (wait of just 1 turn): B, C, A
15. documents
A B C D E F
1 ✔ ✔ ✔ ✔
2 ✔ ✔ ✔
3 ✔ ✔ ✔
4 ✔ ✔ ✔
5 ✔ ✔
6 ✔ ✔
total 0 1 2 1 2 1
Approval voting: B, A, D, F
Sidelines (wait of just 1 turn): B, C, A, E
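The worked example in slides 10 to 15 can be replayed with a minimal Python sketch of the two selection rules. Several things here are assumptions rather than details from the slides: the function names, alphabetical tie-breaking, stopping early when no active votes remain, and the exact ballots of voters 2 to 4 (the transcript lost the checkmark columns, so these are one assignment consistent with the row counts and the totals shown). The age weighting of votes mentioned for news aggregators is left out.

```python
from collections import Counter

# Ballots consistent with the slide totals (A=3, B=4, C=2, D=3, E=2, F=3).
# Voters 1, 5 and 6 follow from the totals; voters 2-4 are an assumed split.
BALLOTS = {
    1: {"A", "B", "D", "F"},
    2: {"A", "B", "D"},
    3: {"A", "B", "F"},
    4: {"B", "D", "F"},
    5: {"C", "E"},
    6: {"C", "E"},
}


def approval_top_k(ballots, k):
    """Pure popularity: the k items with the most votes (ties broken by name)."""
    counts = Counter(item for items in ballots.values() for item in items)
    return sorted(counts, key=lambda item: (-counts[item], item))[:k]


def sidelines(ballots, k, t=1):
    """Pick k items; voters for each pick sit out the next t turns."""
    selected = []
    sidelined_through = {}  # voter -> last turn in which the voter is ignored
    for turn in range(1, k + 1):
        counts = Counter()
        for voter, items in ballots.items():
            if sidelined_through.get(voter, 0) >= turn:
                continue  # this voter is currently sidelined
            for item in items:
                if item not in selected:
                    counts[item] += 1
        if not counts:
            break  # assumption: stop if no active votes remain
        winner = min(counts, key=lambda item: (-counts[item], item))
        selected.append(winner)
        for voter, items in ballots.items():
            if winner in items:
                sidelined_through[voter] = turn + t
    return selected


approval_result = approval_top_k(BALLOTS, 4)
sidelines_result = sidelines(BALLOTS, 4, t=1)
print(approval_result)   # ['B', 'A', 'D', 'F']
print(sidelines_result)  # ['B', 'C', 'A', 'E']
```

With a wait of one turn, voters 5 and 6 get C into the second slot and E into the fourth, whereas pure popularity gives all four slots to items backed by voters 1 to 4.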
24. Digg World and Business Category
Data from 11 October 2008 to 30 November 2008.
Daily average:
New stories 4600
Diggs (votes) 85000
Voters 24000
25. Digg World and Business Category
Pure Popularity Sidelines p
Inclusion 0.651 0.668 <0.001
Alienation 0.476 0.463 <0.001
No user groups, so we couldn’t calculate
Proportional Representation score.
26. Data source: Links from 500 Political Blogs
• Links treated as votes, blogs as voters
• 24 Oct – 25 Nov
• Blogs coded as liberal (52%), conservative (35%), or
independent (13%)
27. Edges indicate Jaccard similarity above average.
Multidimensional scaling layout according to Jaccard similarity.
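As a rough illustration of the similarity measure behind this graph, the sketch below computes Jaccard similarity between the sets of items each blog links to, and keeps the pairs whose similarity is above the average. The blog names and link sets are hypothetical placeholders, not data from the study, and the helper name `jaccard` is likewise an assumption.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty link sets: treat as no similarity
    return len(a & b) / len(union)


# Hypothetical link sets (blog -> items it linked to).
links = {
    "blog_x": {"item1", "item2", "item3"},
    "blog_y": {"item2", "item3", "item4"},
    "blog_z": {"item5"},
}

# Similarity for every unordered pair of blogs.
similarity = {
    (a, b): jaccard(links[a], links[b])
    for a, b in combinations(sorted(links), 2)
}
average = sum(similarity.values()) / len(similarity)

# Draw an edge only where similarity is above the average.
edges = [pair for pair, value in similarity.items() if value > average]
print(edges)  # [('blog_x', 'blog_y')]
```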
28. proportional representation
Pure popularity showed some evidence of tipping.
Some tipping in Sidelines as well, but significantly less (paired t-test, p<0.001)
[Chart: divKL by day, 25-Oct to 24-Nov, Pure Popularity vs. Sidelines; y-axis 0 to 0.07]
29. inclusion, alienation
Higher inclusion score for sidelines (0.445) than pure popularity (0.419) (paired t-test, p<0.001).
Significantly reduced S_alienation for sidelines (paired t-test, p<0.001)
[Chart: S_alienation by day, 25-Oct to 22-Nov, Pure Popularity vs. Sidelines; y-axis 0.7 to 0.85]
30. noticeable differences?
Asked 40 subjects to view
12-item result sets for
sidelines or pure popularity.
(Not told there were two
possibilities)
31. noticeable differences
Somewhat liberally-biased set of
readers had an 89% chance of
finding something challenging in
the sidelines result set (compared
with 50% for pure popularity).
32. mixed preferences for diversity
“I make a point of visiting websites with viewpoints
different than my own, so I would have been happy
with this.” (Sidelines)
“it’s good to know diverse opinions, but, on the other
hand, I can’t take too much of the opinions that
disagree with mine.” (Pure Popularity)
“I wouldn't use a news aggregator, but because it's
liberally biased [in agreement with subject’s views], I'm ok
with it.” (Pure Popularity)
33. applications
• News aggregators based on user votes.
• Other voting systems where diversity matters
(e.g. Google Moderator)
• Don’t need to know anything about content, user
groups, or long-term voting behavior
34.
36. future work
• Enhancements to sidelines algorithm
• Alternative algorithms
• Actual preferences & behavior for challenging
vs. affirming content
• Presentation to make people feel represented
(while still viewing on challenging items!)
If people do not feel represented (they feel they have not been heard, and they don’t see content that supports them), they may exit to places where they do. This can create balkanization and polarization. Sunstein and others have warned about the problems this may cause for democracy and society.
Making people feel represented can encourage people to speak up (who would have otherwise remained silent to promote social harmony). People may also be more open to hearing other views after they feel they have been heard.
Proportional representation of viewpoints. As Duncan mentioned yesterday, people are not very good at knowing when others agree or disagree with them. They tend to think that support for their point of view is broader than it is. Those in the minority may think they are in the majority, and when their candidate does not win or their idea is not selected, they may feel disenfranchised or concoct conspiracy theories about how the election was “rigged” or “stolen.” Proportionally representing ideas can help people realize when they are in the minority. This can increase legitimacy of public decisions. It also may encourage the majority to stop and listen to dissenting views.
Finally, exposing everyone to challenging viewpoints can lead to better problem solving as more ideas and viewpoints are included in the conversation. It can also help reduce polarization.
Not saying that approval voting is exactly what anyone uses!
Exclusion is just 1-inclusion.
S_alienation normalized by the maximum alienation so it always falls on the range [1/(|K|+1), 1]
Actually do need to know something about user groups and affiliations for this metric.
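To make the divKL score on the proportional representation slide more concrete, here is a hedged sketch of a Kullback-Leibler divergence between group shares among voters and group shares in a result set. The direction of the divergence and the way a result set is mapped to group shares are assumptions, since the slides do not spell them out. The voter shares are the blog coding percentages from the data source slide; the two result-set distributions are made-up illustrations, not measured values.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) over shared group labels, in nats."""
    total = 0.0
    for group, p_share in p.items():
        if p_share == 0:
            continue  # a zero-share group contributes nothing
        q_share = q.get(group, 0.0)
        if q_share == 0:
            return math.inf  # group present among voters but absent in results
        total += p_share * math.log(p_share / q_share)
    return total


# Shares of coded blogs (from the data source slide).
voter_shares = {"liberal": 0.52, "conservative": 0.35, "independent": 0.13}

# Hypothetical result-set shares, for illustration only.
tipped = {"liberal": 0.80, "conservative": 0.15, "independent": 0.05}
balanced = {"liberal": 0.55, "conservative": 0.33, "independent": 0.12}

print(kl_divergence(voter_shares, tipped))    # larger: further from proportional
print(kl_divergence(voter_shares, balanced))  # smaller: closer to proportional
```

A score of 0 means the result set matches the voter shares exactly, so lower values indicate less tipping.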
Blogs of one bias more likely to link to items linked by blogs with the same bias (Jaccard similarity).
Remember that the alienation score is how far down the list people had to go on average. So, a little over half the time, they didn't get an item at all, which counted as an alienation of 1. The other half of the time, on average, they had to go about 30-40% of the way down the list.
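One reading of these notes can be sketched in Python as follows. The assumptions are: inclusion is the fraction of voters with at least one voted-for item in the result set, and a voter's alienation is the rank of their first voted-for item divided by |K|+1, with rank |K|+1 when nothing they voted for appears. That normalization is chosen because it matches the stated range [1/(|K|+1), 1]. The ballots are an illustrative six-voter example, not study data, and the function names are hypothetical.

```python
import math

# Illustrative ballots (voter -> items voted for).
ballots = {
    1: {"A", "B", "D", "F"},
    2: {"A", "B", "D"},
    3: {"A", "B", "F"},
    4: {"B", "D", "F"},
    5: {"C", "E"},
    6: {"C", "E"},
}


def inclusion(ballots, result):
    """Fraction of voters with at least one of their items in the result set."""
    chosen = set(result)
    included = sum(1 for items in ballots.values() if chosen & items)
    return included / len(ballots)


def alienation(ballots, result):
    """Mean normalized rank of each voter's first voted-for item."""
    worst = len(result) + 1  # rank assigned when no voted-for item appears
    total = 0
    for items in ballots.values():
        ranks = [rank for rank, item in enumerate(result, start=1) if item in items]
        total += ranks[0] if ranks else worst
    return total / (worst * len(ballots))


popular = ["B", "A", "D", "F"]     # a pure popularity result set
diversified = ["B", "C", "A", "E"]  # a result set that includes C and E

print(inclusion(ballots, popular), alienation(ballots, popular))
print(inclusion(ballots, diversified), alienation(ballots, diversified))
```

In this toy case the second result set includes every voter and lowers the alienation score, because voters 5 and 6 find an item at rank 2 instead of finding none.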
Subjects were recruited primarily from the University of Michigan and were somewhat liberally biased.
Need to do more work in this area of actual user preferences and behavior with respect to diversity.
The Obama administration used Google Moderator on Change.gov during the transition to collect questions. In one category, most of the top questions were about the legalization of marijuana. Sidelines may have let the top question still be on this topic while letting other questions also make it to the first few questions.
Complement to content analysis approaches.
So we have an algorithm for increasing diversity without knowledge about content or voters’ political affiliations, and some potential applications for this algorithm. We also have some metrics for measuring diversity in result sets where people have voted on the candidate items. What’s next?
Enhancement: suppress votes based on users’ voting history, optimize parameters.
Other algorithms: clustering based on votes.