Talk presented at the ID360 Conference (http://identity.utexas.edu/id360), May 1, 2013. Paper: http://ssrn.com/abstract=2228728. Joint work with Jessica Hullman, Jeffrey P. Bigham, Michael S. Bernstein, Juho Kim, Walter S. Lasecki, Saeideh Bakhshi, Tanushree Mitra, and Robert C. Miller.
1. Amazon's Mechanical Turk is Not Anonymous
Matt Lease
School of Information @mattlease
University of Texas at Austin ml@ischool.utexas.edu ssrn.com/abstract=2190946
2. Roadmap
• What is Mechanical Turk?
• Mechanical Turk & Anonymity
• The Vulnerability
• Potential Risks
• Closing Thoughts
4. Amazon Mechanical Turk (MTurk)
• Online marketplace for paid crowd work
• On-demand, scalable, 24/7 global workforce
• Can perform all interactions via programmer’s API
• Requesters & Workers are seemingly anonymous…
5. Use Case 1: Data Processing
J. Pontin. Artificial Intelligence, With Help From the Humans. New York Times (March 25, 2007)
6. Use Case 2: Data Collection
(e.g., surveys, demographics, …)
Amazon's Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data? M. Buhrmester et al. (2011)
10. Brief Digression: Identity Fraud
• Compromised & exploited worker accounts
• Sybil attacks: use of multiple worker identities
• Script bots masquerading as human workers
Robert Sim, MSR Faculty Summit’12
11. Safeguarding Personal Data
• “What are the characteristics of MTurk workers?... the MTurk
system is set up to strictly protect workers’ anonymity….”
16. Fraudulent Abuse of Workers
“Do not do any HITs that involve: filling in
CAPTCHAs; secret shopping; test our web page;
test zip code; free trial; click my link; surveys or
quizzes (unless the requester is listed with a
smiley in the Hall of Fame/Shame); anything
that involves sending a text message; or
basically anything that asks for any personal
information at all—even your zip code. If you
feel in your gut it’s not on the level, IT’S NOT.
Why? Because they are scams...”
17. Workers’ Views: Survey & Forums
• “... my reviewer profile is linked to my Mturk number! I had
no idea...”
• “...Amazon needs to separate the Mturk numbers from
seller numbers to protect our privacy…”
• “I think this is outrageous though. Makes me concerned
about trusting privacy agreements.”
• “Mine pulled up my Amazon wish list which revealed my
identity. It seems to me that so called “anonymous” tasks
on mTurk (like surveys) are not anonymous after all.”
19. Risks to Workers
• Inadvertent disclosure of PII or private data
• Loss of blind hiring practices online
• Greater risk of exploitation, reputation damage,
loss of income, or even physical harm…
20. Risks to Researchers
• Exposing participants to undocumented risks
• Having disclosed WorkerIDs (e.g., online)
• Having not restricted access to the internally
– Potential harm to participants
– Lack of compliance with Federal/IRB governance
of human subjects research
– Being required to discard collected data
– Delays or inability to conduct future MTurk studies
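One way to reduce the risk of disclosing WorkerIDs when sharing study data is to replace each ID with a keyed hash before release. The sketch below is illustrative only and is not taken from the paper: the function names, the `WorkerId` field name, and the example ID are all hypothetical, and it assumes the secret key is stored privately and never published. Other fields in a dataset may still identify a participant, so this addresses only the ID column.

```python
import hashlib
import hmac


def pseudonymize_worker_id(worker_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hash (HMAC) of a WorkerID as a hex string.

    The same ID and key always give the same pseudonym, so repeat
    participants can still be linked within the released dataset.
    """
    digest = hmac.new(secret_key, worker_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymize_rows(rows, secret_key, id_field="WorkerId"):
    """Return copies of the rows with the ID field replaced by its pseudonym.

    `id_field` is a hypothetical column name; adjust it to the actual data.
    """
    cleaned_rows = []
    for row in rows:
        cleaned = dict(row)  # copy so the original records are left untouched
        cleaned[id_field] = pseudonymize_worker_id(row[id_field], secret_key)
        cleaned_rows.append(cleaned)
    return cleaned_rows


if __name__ == "__main__":
    key = b"keep-this-key-private"  # placeholder; use a random secret in practice
    data = [{"WorkerId": "A1EXAMPLEWORKER", "answer": "yes"}]
    print(pseudonymize_rows(data, key))
```

A keyed hash is used here rather than a plain hash because WorkerIDs are short strings that could otherwise be checked one by one against the released values.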
21. Risks to Amazon
• Workers/Requesters abandoning MTurk
• The Federal Trade Commission (FTC) has recently
begun to aggressively protect consumers from data
breaches by commercial entities, including the
release of supposedly “anonymous” data
– Inadequate protection of customer records: BJWC
– De-anonymized customer records: AOL, Netflix
– Did workers have a reasonable expectation of privacy
in their use of MTurk which has been violated?
23. Human-centered Privacy Protection
• Vulnerabilities are not purely technological
• Focusing on software is not enough: human
factors play a significant role in security of today’s
socio-technical, online systems
– Insufficient attention to human factors design
can compromise information security, despite having
the best algorithmic security protocols
• Privacy protection should be explicitly valued in
relation to other competing goals & stakeholder
interests to prevent it from being ignored or sacrificed
24. Brief Digression: Information Schools
• At 30 universities in N. America, Europe, Asia
• Study human-centered aspects of information
technologies: design, implementation, policy, …
www.ischools.org
Wobbrock et al., 2009
25. The Future of Crowd Work
@ ACM CSCW 2013
Kittur, Nickerson, Bernstein, Gerber,
Shaw, Zimmerman, Lease, and Horton
26. Matt Lease - ml@ischool.utexas.edu - @mattlease
Thank You!
Mechanical Turk is Not Anonymous
Matthew Lease, Jessica Hullman, Jeffrey P. Bigham, Michael S. Bernstein, Juho Kim, Walter S. Lasecki, Saeideh Bakhshi, Tanushree Mitra, and Robert C. Miller
Social Science Research Network
ssrn.com/abstract=2190946
ir.ischool.utexas.edu/crowd