How will the role of data science in democracy be transformed as software expands the public’s ability to conduct its own experiments at scale? In the 1940s-70s, debates over authoritarian uses of statistics led to new paradigms in social psychology, management theory, and policy evaluation. Today, large-scale social experiments and predictive modeling are reviving these debates. Technology platforms now conduct hundreds of undisclosed experiments per day on pricing and advertising, and the algorithms that shape our social lives remain opaque to the public. Democratic methods for data science may offer an alternative to this corporate libertarian paternalism.
In this talk, hear about the history and future of democratic social experimentation, from Kurt Lewin and Karl Popper to Donald Campbell. You’ll also hear about CivilServant, software that supports communities in conducting their own experiments on algorithms and social behavior online.
http://cmsw.mit.edu/event/nathan-matias-authoritarian-democratic-data-science-experimenting-society/
Authoritarian and Democratic Data Science in an Experimenting Society
1. Authoritarian & Democratic
Data Science in an
Experimenting Society
MIT CMS/W, Feb 16, 2017
@natematias
natematias.com
civic.mit.edu/users/natematias
J. Nathan Matias
2.
3.
4. McMillen, Andrew. Wikipedia Is Not
Therapy: How the online encyclopedia
manages mental illness and suicide
threats in its volunteer community.
Backchannel. Illustration by Laurent Hrybyk
5. Goldman, Adam. (2016). The Comet Ping Pong Gunman Answers Our Reporter’s Questions. New
York Times
6. Report to Law Enforcement
Report to reddit Platform
Report to Community Moderators
Up-Vote or Down-Vote
7. negative feedback leads to significant
behavioral changes that are detrimental to
the community.
Not only do authors of negatively-evaluated
content contribute more, but also their future
posts are of lower quality, and are
perceived by the community as such.
Cheng, J., Danescu-Niculescu-Mizil, C. & Leskovec, J. (2014). How Community Feedback Shapes User
Behavior. ICWSM 2014.
9. Experiments Per Day on bing.com
Kohavi, R., Deng, A., Frasca, B., Walker, T., Xu, Y., & Pohlmann, N. (2013, August). Online controlled
experiments at large scale. In Proceedings of the 19th ACM SIGKDD international conference on
Knowledge discovery and data mining (pp. 1168-1176). ACM.
10. Geiger, S. (2015). Does facebook have civil servants? On governmentality and computational social
science. In Workshop on Ethics for Studying Sociotechnical Systems in a Big Data World. Vancouver,
British Columbia, Canada.
academic and industry researchers who
work for institutions that build and operate our
digitally mediated public spaces are either
directly doing governance work themselves
or building systems that have been delegated
governance work.
In this sense, researchers can be said to
form a core part of the elite civil service
and bureaucratic corps of our era
11.
12. MacKinnon, R. (2012). Consent of the networked: The worldwide struggle for Internet freedom.
Basic Books
Companies act as the new sovereigns of
cyberspace… most companies’ failure to take
responsibility for their power over citizens’
political lives, and their lack of
accountability in the exercise of that
power, corrodes the Internet’s democratic
potential
16. Tiziana Terranova (2000) Free Labor: Producing Culture for the Digital Economy. Social Text
the Internet is about the extraction of value
out of continuous, updateable work
[consumption & production of culture]
[….]
Such means of production need to be
cultivated by encouraging the worker to
participate in a culture of exchange, whose
flows are mainly kept within the company
17. Prahalad, C. K., and Venkat Ramaswamy. 2004. Co-Creation Experiences: The next Practice in Value
Creation. Journal of Interactive Marketing 18 (3): 5–14.
the market is becoming a forum for
conversations
managers need to invest in building new
infrastructure capabilities, as well as new
functional and governance capabilities
18. Gillespie, T. (2010). The politics of “platforms.” New Media & Society, 12(3), 347–364.
[platform] choices about what can appear,
how it is organized, how it is monetized, what
can be removed and why, and what the
technical architecture allows and prohibits, are
all real and substantive interventions into
the contours of public discourse.
19.
20. JoAnne Yates (1989) Control Through Communication: The Rise of System in American Management.
Johns Hopkins University Press
Systematic management attempted to
improve control over–and thus the efficiency
of–managers, workers, materials, and
production processes
21. JoAnne Yates (1989) Control Through Communication: The Rise of System in American Management.
Johns Hopkins University Press
Management Theories for Scaled Operations
Growth in Scale & Complexity of Industry
Comm & Info Technology
22. JoAnne Yates (1989) Control Through Communication: The Rise of System in American Management.
Johns Hopkins University Press
Systematically-Defined Roles
Stopwatch
23. Frank Bunker Gilbreth and Lillian M. Gilbreth (1910-1924) Original films of Frank & Lillian Gilbreth.
Source: Prelinger Archives, via Wikimedia Commons
24. JoAnne Yates (1989) Control Through Communication: The Rise of System in American Management.
Johns Hopkins University Press
Performance Monitoring Statistics
Systematically-Defined Roles
Stopwatch
25. Chandler Jr, A. D. (1977). The Visible Hand: The Managerial Revolution in American Business.
Harvard University Press.
For the middle and top managers, control
through statistics quickly became both a
science and an art. This need for accurate
information led to the devising of improved
methods for collecting, collating, and
analyzing a wide variety of data generated
by the day-to-day operations of the enterprise.
27. Valentine, R. (1916). The progressive relation between efficiency and consent. Bulletin of Taylor
Society, 2(1)
28. Valentine, R. (1916). The progressive relation between efficiency and consent. Bulletin of Taylor
Society, 2(1)
A free man—a consenting man— is the more
desirable worker…
organized consent as well as individual
consent is the basis of a more efficient
group.
…build up a finer texture of democracy
through self-training groups, constantly growing
in strength through the consideration of
scientifically-accurate data.
29. Marshall, Edward. (1913) Industrial Psychologist’ to Prevent Labor Troubles. The New York Times,
April 27, 1913: Magazine Section Part Five, 11
30. Adolf Hitler delivers a speech at the Kroll Opera House,
Dec 11, 1941. Image source: Wikimedia Commons
31. Lewin, K. (1944). The dynamics of group action. Educational Leadership, 1(4), 195–200.
Efficient democracy means
organization, but it means
organization and leadership on
different principles than
autocracy.
It is essential that a
democratic commonwealth
and its educational system
apply the rational procedures
of scientific investigation to
its own processes of group
living.
Kurt Lewin. Image source: Wikipedia
32. Burnes, B. (2007). Kurt Lewin and the Harwood Studies The Foundations of OD. The Journal of Applied
Behavioral Science, 43(2), 213-231.
Harwood Pajama Factory Experiments
• Increasing Productivity
• Reducing Employee Turnover
Conditions compared: continue autocratic management vs. workers discuss & vote on management changes
33. Wikimedia Commons
Coch, L., & French, J. (1948). Overcoming resistance to change. Human Relations, 1, 512–532.
35. Adelman, C. (1993). Kurt Lewin and the origins of action research. Educational action research, 1(1),
7-24.
the residents of the affected community
must be involved in the research process
from the beginning
36. we will build the Great Society. It is a
Society where no child will go unfed,
and no youngster will go unschooled
Johnson, Lyndon. 323 -
Remarks in Athens at Ohio
University. May 7, 1964
Image source: Wikipedia: First Lady
Lady Bird Johnson visits a Head
Start class in 1966
37. US National Security Agency System/360 85 Console in 1971. Image source: NSA via Wikimedia Commons
38. Campbell, D. T. (1998). The experimenting society. In The experimenting society: Essays in honor of
Donald T. Campbell (p. 35). New Brunswick: Transaction Publishers.
Can the open society be an
experimenting society?
39. Popper, K. (1947). The open society and its enemies. Routledge.
Closed Societies
“the learned should rule”
Open Societies
the public evaluates &
criticizes government
“so that bad or
incompetent rulers can
be prevented from doing
too much damage”
40. Popper, K. (1947). The open society and its enemies. Routledge.
the social engineer conceives as the
scientific basis of politics something like a
social technology
the Utopian engineer will have to be deaf to
many complaints; in fact, it will be part of his
business to suppress unreasonable
objections. But with it, he must invariably
suppress reasonable criticism also
41. Popper, K. (1947). The open society and its enemies. Routledge.
The piecemeal engineer will, accordingly,
adopt the method of searching for, and
fighting against, the greatest and most
urgent evils of society…
There will be a possibility of reaching a
reasonable compromise and therefore of
achieving the improvement by democratic
methods.
43. Williams, W., & Evans, J. W. (1969). The Politics of Evaluation: The Case of Head Start. The ANNALS
of the American Academy of Political and Social Science, 385(1), 118–132
the absolute power of analysis was
oversold
the conflicts in the system between the
analytical staff and the operators of the
programs was underestimated.
44. Williams, W. (1971). Social Policy Research and Analysis: The Experience in the Federal Social
Agencies. American Elsevier Publishing Company.
Government Social Scientists Should:
• Discard Neutrality
• Propose Policy
• Manage Policies
• Advocate for Policy
46. Campbell, D. T. (1998). The experimenting society. In The experimenting society: Essays in honor of
Donald T. Campbell (p. 35). New Brunswick: Transaction Publishers.
Participation in policy experiments is more
akin to participating in democratic political
decision making than to participating in the
psychology laboratory. These restrictions all
have costs in the validity of experimental
inference.
the task of first priority for the methodologists
of the experimenting society is to design
experimental arrangements that obviate
these difficulties
47. Campbell, D. T. (1998). The experimenting society. In The experimenting society: Essays in honor of
Donald T. Campbell (p. 35). New Brunswick: Transaction Publishers.
The Contagious Cross-Validation Model for
Local Programs…
national funding would support adoptions that
included locally designed cross-validating
evaluations…
48. Campbell, D. T. (1998). The experimenting society. In The experimenting society: Essays in honor of
Donald T. Campbell (p. 35). New Brunswick: Transaction Publishers.
it is those who have situation-specific
information who make the best critics, and
the best judges, of the plausibility of most of
the rival hypotheses…
we must provide these nonprofessional
observers with the self-confidence and
opportunity to publicly disagree with the
conclusions of the professional applied social
scientists.
49.
50. JoAnne Yates (1989) Control Through Communication: The Rise of System in American Management.
Johns Hopkins University Press
Management Theories for Scaled Operations
Growth in Scale & Complexity
Comm & Info Technology
51. Geiger, S. (2015). Does facebook have civil servants? On governmentality and computational social
science. In Workshop on Ethics for Studying Sociotechnical Systems in a Big Data World. Vancouver,
British Columbia, Canada.
In this sense, researchers can be said to
form a core part of the elite civil service
and bureaucratic corps of our era
53. MacKinnon, R. (2012). Consent of the networked: The worldwide struggle for Internet freedom.
Basic Books
Companies act as the new sovereigns of
cyberspace… most companies’ failure to take
responsibility for their power over citizens’
political lives, and their lack of
accountability in the exercise of that
power, corrodes the Internet’s democratic
potential
61. [Chart] 8,298 Moderation Actions, May 23–29, 2016, by automated systems vs. humans, across: Remove Post, Approve Post, Remove Comment, Approve Comment, Ban User, Unban User, Revise Wiki, Recategorize Post (x-axis: 0–5,000 actions)
62. Does making participants aware of rules by posting them increase norm-compliance of first-time commenters?
63. CivilServant
Community-Led Field Experiments in Community Governance Online
• Design Experiments
• Coordinate Policy Interventions
• Monitor Outcomes
• Estimate Experimental Results
65. CivilServant
Community-Led Field Experiments in Community Governance Online
x Only routine interventions
x No high-risk communities (markets, mental health)
x No groups that organize to harm others
66. CivilServant
Community-Led Field Experiments in Community Governance Online
Open Archive of Moderation Studies
Community Experiments
69. Matias, J. N. (2016) Posting Rules in Online Discussions Prevents Problems & Increases
Participation. CivilServant
“Sticking” a Rule Comment to Threads
Increased a Newcomer’s Probability of
Posting a First Comment Within the Rules
70. Posting the rules increases the incidence
rate of newcomer comments by 38.1% on
average.
If the community adopts sticky comments, it could prevent 1,838 people a month from engaging in unacceptable behavior. It would also gain 9,631 new commenters per month, on average.
Matias, J. N. (2016) Posting Rules in Online Discussions Prevents Problems & Increases
Participation. CivilServant
72. [Misleading Title] Bavaria passes new law to
make migrants respect ‘dominant’ local
culture
[Misleading Title | Not Appropriate Subreddit]
Spanish Terror Attack: Gunman enters
supermarket, shouts Allahu Akbar
[Editorialized Title] A last kiss for mama:
Jihadi parents bid young daughters
goodbye… before one walks into a
Damascus police station and is blown up
by remote detonator
Matias, J. N. (2016) Posting Rules in Online Discussions Prevents Problems & Increases
Participation. CivilServant
74. Can we increase the rate that commenters
question unreliable news without
making unreliable news trend on social
media algorithms?
75.
76. Encouraging Fact-Checking Causes Unreliable News To Receive 2x More Comments with Links on Average
Tabloid links in r/worldnews receive a 2.01 to 2.03x increase in the number of comments including links to further evidence when moderators use sticky comments to encourage fact-checking.
Source: J. Nathan Matias, MIT Media Lab. Experiment by r/worldnews, 11/27/2016 – 1/20/2017
n = 840 posts from sites that moderators consider tabloids, 2.4% of submissions on average.
This negative binomial model predicts incidence rates; the effect is larger for more popular posts.
Fact-checking: p = 0.0083. Fact-checking + Voting: p = 0.0073. *** p<0.001, ** p<0.01, * p<0.05
For full details on the findings, which were not yet peer reviewed by Jan 2017, see civilservant.io
[Chart: predicted incidence of comments with links — No Action Taken: 0.71; Suggest Fact-Checking: 1.44**; Suggest Fact-Checking & Voting: 1.46**]
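The effects on these slides are reported as incidence rate ratios from a negative binomial model. As a minimal sketch of what such a ratio means, using invented comment counts rather than the study’s data or code: when the model contains only a treatment indicator, the fitted incidence rate ratio reduces to the ratio of the two groups’ mean counts.

```python
from statistics import mean

def incidence_rate_ratio(treatment_counts, control_counts):
    """Ratio of mean event counts. In a count regression (Poisson or
    negative binomial) with a single treatment indicator, this equals
    the fitted incidence rate ratio for the treatment."""
    return mean(treatment_counts) / mean(control_counts)

# Invented per-post counts of comments with links, for illustration only.
control = [0, 1, 0, 2, 1, 0, 1, 1]   # no sticky comment
treated = [1, 2, 1, 2, 2, 1, 2, 1]   # sticky comment encouraging fact-checking

irr = incidence_rate_ratio(treated, control)   # 2.0 → a "2x increase"
pct_increase = (irr - 1) * 100
```

A full analysis would fit the negative binomial regression with covariates (e.g., post popularity), which is why the slides note the effect varies with popularity; the ratio-of-means view is only the simplest case.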
77. Encouraging Fact-Checking Causes Unreliable News To Be Promoted Less by reddit’s Algorithms on Average
Tabloid links in r/worldnews receive a 2.04x reduction in the scores that shape reddit’s rankings when moderators encourage fact-checking, but not when they also suggest voting.
Source: J. Nathan Matias, MIT Media Lab. Experiment by r/worldnews, 12/07/2016 – 1/20/2017
n = 696 posts from sites that moderators consider tabloids, 2.4% of submissions on average.
The reddit algorithms use the “score” to determine the ranking of a link. On average, between links of similar age, the submission with a higher score will be ranked more highly.
This negative binomial model predicts incidence rates; the effect is larger for more popular posts.
Fact-checking intervention p = 0.000562. Voting p = 0.198. *** p<0.001, ** p<0.01, * p<0.05
For full details on the findings, which were not yet peer reviewed by Jan 2017, see civilservant.io
[Chart: predicted score incidence rate after 24 hrs — No Action Taken: 103.07; Suggest Fact-Checking: 50.56***; Suggest Fact-Checking & Voting: 134.37]
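The note that higher-scoring links of similar age rank more highly reflects reddit’s historically open-sourced “hot” ranking, which combines the log of a post’s score with its submission time. The sketch below follows that published formula; treat it as an approximation of whatever reddit runs in production today.

```python
from math import log10

def hot(score, submitted_at):
    """reddit's historically open-sourced 'hot' rank: log-scaled score plus
    a bonus that grows with submission time. Because the score enters as
    log10, a post needs roughly 10x the score to outrank a post that is
    45,000 seconds (12.5 hours) older."""
    order = log10(max(abs(score), 1))
    sign = 1 if score > 0 else (-1 if score < 0 else 0)
    seconds = submitted_at - 1134028003  # seconds since reddit's epoch
    return round(sign * order + seconds / 45000, 7)
```

This is why a sticky comment that halves a tabloid post’s score can meaningfully reduce how long and how prominently the algorithm promotes it.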
79. Community Discussion
Policy Discussion:
What if lack of conflict & increased participation is bad?
Can this cause censorship if taken to an extreme?
How generalizable is this to other subs?
Intervention Design:
I imagine the wording is extremely important.
80. Community Discussion
Personal Stories of Outliers:
I don't think I've ever read any subreddit's rules ever.
Experiment Design & Implications:
I bet that the rules comment increases participation
because it makes it say “(1 comment)” on the forum
index so people click the link to read the comment
81. Community Discussion
Research Ethics:
Did you get the informed consent?
[IRBs] have no authority, legal or ethical, to make
decisions about consent.
you're objecting to this study as an excuse to critique
the moderators
82. CivilServant
Community-Led Field Experiments in Community Governance Online
Open Archive of Moderation Studies
Community Experiments
83.
84.
85. How Far Might Community Experiments Scale on the reddit Platform?
Data sampled July 2015: 1,795 eligible communities (3,000+ comments/month); 15,300 moderator roles
89. Ethan Zuckerman
Associate Professor of the Practice
Massachusetts Institute of Technology
Elizabeth Levy Paluck
Associate Professor, Department of Psychology
Woodrow Wilson School, Princeton University
Tarleton Gillespie
Principal Researcher
Microsoft Research
Merry Mou
M.Eng Student
Massachusetts Institute of Technology
91. Authoritarian & Democratic
Data Science in an
Experimenting Society
MIT CMS/W, Feb 16, 2017
@natematias
natematias.com
civic.mit.edu/users/natematias
J. Nathan Matias
Editor’s Notes
In the summer of 2015, professor Stephen Hawking held a question-answer session on reddit, a popular social news platform. The conversation reached tens of millions of people, and over 12,000 people submitted questions on topics ranging from cosmology to AI.
Hawking also received other comments, questions that mocked his illness and his personal life. These comments, while distasteful, illustrate some of the less risky kinds of online experiences.
Sometimes these experiences interact with mental health risks, as Wikipedians have discovered. Working together, Wikipedians have developed community support structures for contributors with mental illness and those at risk of suicide.
Compare these to the threats of violence that gaming commentator Anita Sarkeesian routinely received in early 2015, as people promised to find her address and cause her physical harm. And threats can sometimes become reality.
Digital risks are also related to physical harm. It could be this man, who responded to online conspiracy theories by bringing and firing an assault weapon into a DC pizzeria, or it could be the large number of domestic abuse cases that involve some kind of online harassment.
40-47% of internet users report experiencing some kind of online harassment, with 7% of Americans, 22.9 million people, experiencing the more severe forms, including physical threats, stalking, sexual assault, and sustained harassment which can draw out over years. In roughly half of cases, people know their harasser.
What might we do about the comments received by Stephen Hawking? The comment isn’t illegal, and it doesn’t violate reddit’s policies, so the site’s professional staff are unlikely to do anything. The joke does violate the policies of the community where it was posted, and it was removed by a volunteer moderator. Yet even before a moderator removed it, other readers used their power to vote on the comment, making it less prominent.
Voting systems like this have existed since the late 90s, but it was only in 2014 that researchers did the first causal research on their effects. Their quasi-experiment found that across four political news sites, down-voting someone leads to worse behavior that also drags down communities. It took 17 years to discover that down-voting can worsen rather than improve conversations online.
Nor is that an isolated case. Researchers recently looked back at work by Tumblr and Instagram to reduce self-harm and pro-anorexia communities and only learned four years later that their efforts increased rather than decreased encouragements toward self-harm.
Like any management practice or policy intervention, they can also fail.
Online platforms could have tested the effect of downvotes with a randomized trial, an A/B test that compared the potential outcomes between conversations with downvote buttons and conversations without them.
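A minimal sketch of such a randomized comparison follows; the conversations, the “civility” outcome, and the size of the downvote effect are all invented for illustration, not drawn from any platform’s actual data or infrastructure.

```python
import random
import statistics

def run_ab_test(conversations, outcome, seed=42):
    """Randomly assign each conversation to show or hide downvote buttons,
    then compare the average outcome between the two arms."""
    rng = random.Random(seed)
    arms = {"downvotes_shown": [], "downvotes_hidden": []}
    for convo in conversations:
        arm = rng.choice(list(arms))  # simple per-conversation randomization
        arms[arm].append(outcome(convo, downvotes=(arm == "downvotes_shown")))
    return {arm: statistics.mean(vals) for arm, vals in arms.items()}

# Hypothetical outcome: a made-up civility score, with an assumed
# negative effect of downvote buttons baked in for the demo.
def civility(convo, downvotes):
    return convo["base_civility"] - (0.5 if downvotes else 0.0)

conversations = [{"base_civility": 3.0} for _ in range(100)]
results = run_ab_test(conversations, civility)
```

Because assignment is random, any systematic difference between the two arms’ averages can be attributed to the downvote buttons rather than to pre-existing differences between conversations, which is the causal knowledge the quasi-experiments above could only approximate.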
This lack of causal knowledge on pressing public interest issues online is puzzling, especially since online platforms now conduct up to hundreds of randomized trials per day on questions related to sales and advertising.
Social experiments and other kinds of data science are now extremely common in our online environments, and the work researchers do is so important to our public life, that Stuart Geiger has argued that we “form a core part of the elite civil service and bureaucratic corps of our era.”
At the same time, social platforms face tremendous public distrust over experiments focused on public well-being, especially when that research is seen as something secret or deceptive. In 2014, when Facebook tested the effect of newsfeed adjustments on the sentiment of people’s future posts, they faced substantial public criticism.
I believe concerns about the ethics of online experiments go deeper than ethics, to public concerns about the emerging kinds of power that platforms currently exercise to observe, influence, and govern the everyday social lives of billions of people. Platform power may bring great benefits and mitigate serious social problems— but we need ways to evaluate the uses of that power for their contribution to the common good.
Today, I’m going to talk about the history and politics of the design of social experiments. I’m going to argue that if we care about reconciling values of democracy with the tremendous power of online platforms, we need a way to reconcile the way we do experimental research with those democratic values.
While the scale and details of platforms are recent, we have much to learn from other moments in US history where new technologies of data processing and the challenges of rapidly scaling an area of human endeavor prompted debates about democracy and experimentation: (1) the development of systematic management during the industrial revolution, and (2) the introduction of computerized, evidence-based policymaking in the 1960s.
I’ll end the talk by summarizing new work I’m doing to prototype large-scale community-led experiments online.
I choose management and policy histories because online social experiments merge questions of social policy with questions of business management. When reddit delegates moderation work to volunteers or to down-voting readers, they are making a business decision about the labor that maintains their system; they are also making a policy decision about how to govern.
Tiziana Terranova has called online platforms Social Factories; since our cultural and social relations generate value, platforms need ways to cultivate profitable versions of that cultural activity — in other words, they need to research management techniques that go beyond managing employees to managing users.
That management is also governance. In the business literature Prahalad wrote about it this way: since the market is becoming a forum for conversations, if firms wanted to generate value through what would soon be known as platforms, they would need to develop infrastructure *and* governance.
Indeed, by adopting the language of “platforms,” companies sit in the overlapping space between management and policymaking. As Gillespie pointed out, companies like YouTube make management decisions about attention, and those decisions influence public goods like public discourse that we typically consider to be a policy matter.
Because management and policy are deeply entwined online, the histories of management research and policy evaluation offer rich sources for imagining democratic futures of data science.
In 1939, three years after Charlie Chaplin’s film Modern Times, the experimental psychologist Kurt Lewin was approached by the Harwood clothing factory, which was struggling to understand the problems with its efforts to increase productivity through scientific management. They carefully defined and quantified worker tasks, tracked employee performance, gave workers good benefits, and even funded neighborhood institutions. But people kept performing poorly and leaving the company.
Over the next ten years, Lewin and his students would conduct a series of experiments at Harwood that would create new paradigms in management theory, launch social psychology as a major field, and transform the politics of social experimentation.
Harwood in the 1930s was a classic example of what JoAnne Yates calls the corporate welfare style of systematic management. Starting in the mid 1800s with railroads, US firms had increased the scale of production through management techniques that tried to improve control over business operations.
Here’s how these factors co-evolved, according to Yates. As companies grew from a few dozen people to hundreds or thousands of workers at multiple locations, firms developed theories of management for controlling labor at that scale, theories that came into being alongside related information technologies.
Information technologies like the stopwatch, pointed downward, and were used by managers to systematically define worker roles. These efficiency measures reduced the agency of individual workers, and helped set benchmarks for worker performance and pay.
Scientific management consultants like Frank and Lillian Gilbreth would use clocks to measure the units per hour of certain parts of a work process. They would then conduct experiments to test the most efficient, and sometimes safest, ways to do a task, in this case increasing the rate of clerical work by 61%.
As firms A/B tested work processes in a rudimentary way, they also developed information technologies for processing data on work performance. Evolving management theories depended on detailed records of each major task in the production process, leading to innovations in forms, card systems, file cabinets, systems for copying data, organization charts, memos, and methods for visualizing data on complex operations.
As Alfred Chandler argues in The Visible Hand, “control through statistics quickly became both a science and an art,” and the researchers who did this analytical work were some of the earliest management consultants.
Yet many workers resisted the move toward systematic management, especially machinists unions. Here at the Watertown Arsenal, machinists went on strike in 1911 after Taylorist management consultants tried to introduce stopwatch-timed work processes and the individual, piece-work pay system that typically accompanied the Taylorist style of scientific management.
Systematic managers considered democracy a risk and resisted collective bargaining, preferring to think of each person as an individual unit to be independently optimized. That idea was challenged by Robert Valentine, former head of the MIT writing program, who participated in the US Commission on Industrial Relations a year after the Watertown strike. Writing in 1916 in the Bulletin of the Taylor Society, Valentine mapped out the social structures that shaped factory work and argued that this focus on measuring and managing individuals failed to match the group structures of actual manufacturing experience.
In the article, Valentine argued that “organized consent as well as individual consent is the basis of a more efficient group,” that management experts should look to the new field of social psychology to develop management theories for organized consent that would allow democratic groups of workers to use the tools of scientific management.
Although Valentine’s ideas were celebrated in the New York Times magazine, management experts hated them. The Taylor Society published his article, and also published ten strong rebuttals from leading thinkers of scientific management.
Over the next two decades, a few groups tried small projects in organized consent, including bureaus of standards, worker shop conferences, union research units, and a brief effort at collectively-negotiated union-factory research that was cut short by the depression. But overall, managers dismissed the idea that democracy and efficiency were compatible, and they did so on what they thought was a scientific basis.
So when the pajama-makers at Harwood Manufacturing asked the psychologist Kurt Lewin, a Jewish-German refugee from Nazi Germany, to help them solve their productivity and turnover problems, most management thinkers would have dismissed democratic approaches.
But Lewin was worried that American factories and schools looked too much like the autocratic institutions that he had left behind.
As an experimental psychologist, Lewin was aware of the close relations between the fields of eugenics, statistics, and psychology. Writing from MIT in 1944, he looked back on the first half of his students’ work at Harwood and drew a line between autocratic and democratic organization, and the research that upheld those two approaches.
Lewin believed that social forces, not just individual capacities, influence human behavior. He also believed that by studying those social forces through experiments, out in the world, a “democratic commonwealth could apply scientific investigation to its own processes of group living.”
In the set of experiments at Harwood, Lewin’s student Alex Bavelas analyzed employee productivity and turnover in groups. He set aside a “treatment” group that discussed and voted on management changes, and compared its performance to teams that remained the same. We wouldn’t recognize the Harwood study as a valid experiment today, but it was a powerful early effort to bring experimental methods together with democratic governance.
Over the next ten years, Lewin’s students would conduct many such experiments, producing detailed charts like this one, which compared the performance of a “control group” to experiment groups that were supported to organize democratically in different ways.
Lewin’s students tried many combinations of worker discussions and votes on goal setting, intervention design, and even the analysis of experiment results.
Over time, Lewin argued for more than just democratic decision making —he came to argue that democratic processes of research itself, led by study participants, could make powerful contributions to fundamental theories of human behavior.
Although Lewin passed away in 1947, his theories of human behavior helped launch the field of social psychology, and his approach to research inspired a tradition of “action research” in the social sciences.
Another debate over democracy and social experiments occurred in the 1960s and 70s, as the US government began to invest in computer systems that could help evaluate national-scale social policies like Head Start, which were part of president Johnson’s Great Society initiative.
US investment in computers spread from the military to the rest of government in the 1960s, as Kennedy’s Secretary of Defense Robert McNamara, a former Ford CEO, argued that the military should be managed through data, just like factories. When Johnson declared a War on Poverty in 1964, the subtext was that social programs could adopt data-driven management from the military. That same year, IBM released the System/360, the first widely-available general purpose mainframe system, equally capable of handling data on military operations and social programs like Social Security.
Right in the middle of this transition, in 1971, we get a provocative question from Donald Campbell, an experimental methodologist who wrote the book on randomized trials that the US government was starting to use to evaluate social policy: “Can the open society be an experimenting society?”
Before I share Campbell’s answer, let’s try to understand what he means by this question. What is an “open society”? The answer takes us back to the Nazis.
Writing as an Austrian exile in New Zealand during the Second World War, the philosopher Karl Popper describes two kinds of governance: open and closed societies. In closed societies, authoritarians govern and manipulate the public towards utopian goals on the paternalistic principle that “the learned should rule.” In open societies, the public is encouraged to evaluate and criticize government decisions “so that bad or incompetent rulers can be prevented from doing too much damage.”
In The Open Society and Its Enemies, Popper yearns for a way to reconcile the social sciences with democracy without resorting to eugenics. So he describes two kinds of social engineers and social technology:
The Utopian engineer ignores complaints and suppresses criticism in pursuit of utopian goals,
while the piecemeal engineer tries to alleviate social ills through social research that informs democratic processes of compromise. When Campbell asks his question, he’s asking whether this second kind of social engineering is actually possible.
To understand why Campbell could question the compatibility of experimentation with an open society, consider the case of Head Start. Initially imagined as a series of pilot projects, Head Start became widely popular and quickly grew into a $100 million national program serving half a million children.
Three years into Head Start, the US Office of Economic Opportunity was asked to conduct a short-notice retrospective evaluation. To make things worse, President Nixon mentioned incomplete, preliminary results in a public speech, saying that “the long term effect of Head Start appears to be extremely weak.” The final, thrown-together evaluation had many weaknesses.
Poor Walter Williams, the Chief of the Research & Plans Division evaluating Head Start, came away from this experience demoralized.
Williams argued that if governments wanted statistically valid results in the future, government social scientists should... (slides)
Whether you call it Popper’s authoritarian, closed society or Lewin’s autocracy, Williams is arguing for the “governance of the learned.” This is the moment when Campbell asks his question in 1971: can the open society be an experimenting society? We can summarize his answer this way:
Fortunately, research is design, and we can redesign our methods to follow our values.
For example, contemporaries feared that the validity of experiments would be ruined if participants knew about the study, the famous “Hawthorne effect,” but Campbell saw experiment participation as democratic participation.
Reconciling those values was a design challenge that might be addressed by group consent.
Campbell’s contemporaries sought greater political power so they could improve the internal validity of extremely large, nationwide studies. Campbell pointed out that they could gain the benefits of external validity by supporting hundreds of locally-designed evaluations and replications. He imagined disputatious networks of local policy knowledge-makers who share and replicate each other’s findings.
Campbell argued that active guidance and oversight from citizens could make the open society an experimenting society, because… (read)
Unfortunately, most of Campbell’s contemporaries viewed this idea as expensive and impractical, and there have only been a handful of community-led social experiments in the last 45 years.
To sum up this historical picture, we’ve now looked at two moments in the history of management and policy where information technologies and experimental methods co-evolved to meet dramatic increases in the scale and ambition of efforts to manage and govern human behavior.
We’ve also looked at key debates over whether those powers and the experimental research behind them would reflect democratic or authoritarian values. What can we learn from those debates?
In the industrial revolution, firms went from a dozen people to hundreds or thousands. At the birth of policy evaluation, governments gained the capacity to monitor and intervene in the lives of millions of people. And today, online platforms observe and intervene in the social lives of over a billion people multiple times a day.
As data scientists accept the reality that we *are* the systematic managers and policy evaluators of our era, we urgently need to reconcile that power with democratic values.
One way of making sense of the power that citizens have in most experiments today is to borrow Arnstein’s ladder of citizen participation from urban planning. Online experiments typically place citizens in the category of non-participation, and experimenters who do care about public opinion are often concerned with deflecting criticism through tokenism rather than granting actual citizen power.
So when people like Rebecca MacKinnon ask how to apply the consent of the networked to platform power, we have the opportunity to ask the same of our experimental research.
Over the last year, I have been asking those questions through CivilServant, a project that supports community-led social experiments on online moderation.
Remember the cruel jokes about Stephen Hawking? While laws and platform policies govern those actions in theory, the comments were practically governed by the policies of the community where they were posted, one of perhaps millions of such communities across the social web.
In between platforms and communities have been moderators, the people who fill in the cracks of social interactions online, allowing platforms to disclaim responsibility for what happens on their systems and deflect complaints while also supporting communities to develop their own norms and practices. They have also been key organizers in social movements against platforms.
Moderators, who can range from two people to a thousand, are the founders, organizers, promoters, architects, maintainers, legislators, and enforcers of their communities. And they work together across communities.
On the social news site reddit, millions of people post content, comment on it, and vote on each others’ contributions across a wide range of communities.
Unlike most users online, who have very little say in the policies that govern our digital lives, online communities do exercise substantial power; that’s why moderators of the science community were able to remove the insulting comment sent to Stephen Hawking. And on reddit, moderators have privileged access to data, creating software that automates and coordinates their work.
That data access makes it possible for communities to test the effect of their policies with experiments.
This spring, we were approached by the New Reddit Journal of Science, a large community that hosts live Q&As with researchers like Stephen Hawking, alongside large discussions about peer-reviewed research.
Like most communities on reddit, this community has rules of participation
And they enforce those policies, conducting thousands of moderation actions per week.
To test the effects of those policies, moderators used novel software I have designed called CivilServant, which sets out to support communities to
design experiments
operate completely independently of online platforms
coordinate policy interventions
monitor outcomes
estimate experimental results
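To make the steps above concrete, here is a minimal sketch of what a community-led experiment loop might look like: new discussion threads are randomly assigned to a “treatment” (say, a sticky rules comment) or a control condition, and the effect is estimated as a simple difference in means. The function names and the simulated outcomes are purely illustrative assumptions for this talk, not the actual CivilServant codebase or API.

```python
import random
from statistics import mean

def assign_condition(thread_id, p_treatment=0.5):
    """Randomly assign a thread to the treatment or control condition."""
    return "treatment" if random.random() < p_treatment else "control"

def estimate_effect(observations):
    """Difference in mean outcome between treatment and control threads.

    observations: list of (condition, outcome) pairs, where outcome might be
    the fraction of comments in a thread later removed by moderators.
    """
    treated = [y for cond, y in observations if cond == "treatment"]
    control = [y for cond, y in observations if cond == "control"]
    return mean(treated) - mean(control)

# Illustrative run on simulated data (not real moderation outcomes):
random.seed(0)
observations = []
for thread_id in range(200):
    cond = assign_condition(thread_id)
    # simulate a slightly lower removal rate when the rules are posted
    base = 0.10 if cond == "treatment" else 0.14
    observations.append((cond, base + random.uniform(-0.02, 0.02)))

print(round(estimate_effect(observations), 3))
```

The point of the sketch is the design choice, not the arithmetic: because assignment is random, moderators can attribute outcome differences to their policy rather than to which threads happened to attract rule-breaking.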
Unlike most A/B tests on the web, CivilServant makes itself visible as a software agent, or social bot, that visibly participates in the research context, showing up in the list of moderators for a given subreddit. And if the community wants to end the experiment, they can kick the bot out of their community.
CivilServant opens up new ground in research ethics. Working with MIT’s committee on the use of humans as experimental subjects, we developed a “Meta-IRB” that specified the characteristics of experiments that I would be able to pursue, without requiring committee approval per experiment.
In parallel with Campbell’s idea of disputatious, contagious community cross-validation, as we support communities to conduct their own studies, each experiment adds to public knowledge and may come to support community replications.
After much discussion, moderators decided to test the effect of posting these messages to the top of discussion threads. Even though they already posted the rules to large question-answer threads, they didn’t expect that these comments would have any effect.
Examples of some of the articles that were removed by moderators.
Over time, we hope to make CivilServant a self-serve project for communities. Along the way, we’ll continue to test ways to make the work of experiments community-led.
Because large-scale management and governance of human behavior through experiments is an idea with a long history, we have an opportunity to learn from past struggles on factory management and government policy evaluation.
If we care about reconciling democratic values with using the tremendous power of online platforms for beneficial ends, we need a way to reconcile the way we do experimental research with those values. On one hand, we don’t want to wait 17 years before learning that our efforts at improving society have backfired. On the other, we want to make those efforts in a way that maintains an open society.
Fortunately, research is design, and we can redesign our methods to follow our values.
CivilServant is my attempt at community-led experiments, and in the coming months, I hope to have early results on the ways that communities and participants make sense of experiment participation as democratic participation in community policy decisions.