3. Mechanical Turk Marketplace
400,000+ Workers
In 100+ Countries
Available 24/7
Programmatically Accessible
http://www.flickr.com/photos/diamond_rain/2543837414/
4. So there are basically
Workers and Requesters
http://www.flickr.com/photos/saad/1968774
http://www.flickr.com/photos/chicagobart/4181533461
5. Mechanical Turk as a Worker
Workers
Make money by working on Human Intelligence Tasks
Workers can work from home and choose their own work hours
http://www.flickr.com/photos/saad/1968774
10. How do I get the money?
U.S. Bank account
Amazon Gift Certificate
Bank Checks in Rupees
11. Mechanical Turk as a Requester
Requesters
Have access to a global, on-demand, 24 x 7 workforce
Can get thousands of HITs completed in minutes
Pay only when they are satisfied with the results
http://www.flickr.com/photos/chicagobart/4181533461
12. Requesting HITs
Requesters
• define and create your HITs
• load HITs to Mechanical Turk
Workers
• work on your HITs
• submit results
Requesters
• approve and pay for completed HITs
• use the results
13. Design HITs
Enter Properties
Design Layout
14. Design HITs - faster
Take a developer and use
• CSV files
• SOAP / REST
• Amazon Mechanical Turk developer tools
15. What would it look like?
http://mechanicalturk.amazonaws.com/
?Service=AWSMechanicalTurkRequester
&AWSAccessKeyId=[the Requester's Access Key ID]
&Version=2008-08-02
&Operation=CreateHIT
&Signature=[signature for this request]
&Timestamp=[your system's local time]
&Title=Location%20and%20Photograph%20Identification
&Description=Select%20the%20image%20that%20best%20represents
&Reward.1.Amount=5
&Reward.1.CurrencyCode=USD
&Question=[URL-encoded question data]
&AssignmentDurationInSeconds=30
&LifetimeInSeconds=604800
&Keywords=location,%20photograph,%20image,%20identification,%20opinion
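The request above can be assembled from a parameter list rather than by hand. The following is a minimal sketch, assuming Python's standard library only; the function name build_create_hit_url and the placeholder credential, signature, timestamp and question values are hypothetical, and computing a real signature is not shown.

```python
from urllib.parse import urlencode, quote

# Endpoint taken from the slide.
ENDPOINT = "http://mechanicalturk.amazonaws.com/"


def build_create_hit_url(access_key_id, signature, timestamp, question_data):
    """Build the CreateHIT query URL shown on the slide (hypothetical helper).

    All four arguments are placeholders supplied by the caller; this sketch
    does not compute the signature.
    """
    params = [
        ("Service", "AWSMechanicalTurkRequester"),
        ("AWSAccessKeyId", access_key_id),
        ("Version", "2008-08-02"),
        ("Operation", "CreateHIT"),
        ("Signature", signature),
        ("Timestamp", timestamp),
        ("Title", "Location and Photograph Identification"),
        ("Description", "Select the image that best represents"),
        ("Reward.1.Amount", "5"),
        ("Reward.1.CurrencyCode", "USD"),
        ("Question", question_data),
        ("AssignmentDurationInSeconds", "30"),
        # 604800 seconds is 7 days.
        ("LifetimeInSeconds", str(7 * 24 * 60 * 60)),
        ("Keywords", "location, photograph, image, identification, opinion"),
    ]
    # quote_via=quote encodes spaces as %20 (as on the slide) instead of "+".
    return ENDPOINT + "?" + urlencode(params, quote_via=quote)


# Placeholder values only; none of these are real credentials.
url = build_create_hit_url(
    "ACCESS_KEY_ID", "SIGNATURE", "TIMESTAMP", "<QuestionForm/>"
)
print(url)
```

Note that this sketch also percent-encodes the commas in Keywords, which the slide leaves unencoded; both forms decode to the same value.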
16. Publish HITs
HITs have to be paid for in advance
Amazon takes 10% on top
Funding sources: credit card, debit card, U.S. bank account, Amazon Payments account
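As a rough worked example of the prepayment, here is a minimal sketch that assumes the only charge on top of the rewards is the 10% mentioned on this slide; the function name prepaid_cost and the example numbers are hypothetical, and any other pricing rules are ignored.

```python
from decimal import Decimal

# "Amazon takes 10% on top" (from the slide).
FEE_RATE = Decimal("0.10")


def prepaid_cost(reward_usd, assignments):
    """Amount to fund in advance: total rewards plus the 10% fee.

    reward_usd is a string such as "0.05" so the amount stays exact.
    """
    rewards = Decimal(reward_usd) * assignments
    fee = rewards * FEE_RATE
    # Round to whole cents.
    return (rewards + fee).quantize(Decimal("0.01"))


# Hypothetical batch: 1,000 assignments at a 5 cent reward each.
print(prepaid_cost("0.05", 1000))  # prints 55.00
```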
17. Use Mechanical Turk for
Work that requires Human Judgment
Work that algorithms cannot completely solve
Work that has unpredictable or spiky volume
18. Improving Data Quality
Are these two businesses the same?
Peritor GmbH, Blücherstraße 22, 10961 Berlin, http://peritor.com
Peritor Consulting, Blücherstraße 22, Hof III Aufgang 6, 10961 Berlin
YES / NO
Background
Data is the company’s business
Accuracy and breadth are key to differentiation
Process
1 million data points to ingest each day
200 data sources
Problem
Data needs to be normalized, enhanced and de-duplicated
Algorithms could get data about 70% clean
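One way to picture the split between algorithm and worker is a matcher that decides the clear cases itself and turns the rest into a YES / NO question. This is a minimal sketch, assuming a simple string-similarity score from Python's difflib; the normalization rule, the function names and both thresholds are hypothetical and are not the process described on the slide.

```python
from difflib import SequenceMatcher


def normalize(record):
    """Lowercase each field and collapse whitespace (hypothetical rule)."""
    return tuple(" ".join(field.lower().split()) for field in record)


def similarity(a, b):
    """Similarity in [0, 1] between two records joined into single strings."""
    return SequenceMatcher(
        None, " ".join(normalize(a)), " ".join(normalize(b))
    ).ratio()


def route(a, b, same_above=0.95, different_below=0.40):
    """Decide clear cases automatically; send the rest to a human.

    Both thresholds are illustrative assumptions.
    """
    if normalize(a) == normalize(b):
        return "same"
    score = similarity(a, b)
    if score >= same_above:
        return "same"
    if score <= different_below:
        return "different"
    return "ask a worker"


# The two records from the slide.
first = ("Peritor GmbH", "Blücherstraße 22", "10961 Berlin")
second = ("Peritor Consulting", "Blücherstraße 22",
          "Hof III Aufgang 6", "10961 Berlin")

print(route(first, second))
```

With these thresholds the two Peritor records land in the middle band, so the pair would be shown to a worker as the YES / NO question above.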
19. Moderating User Generated Content
Is this image explicit?
http://www.flickr.com/photos/cmak/1521356521/
YES / NO
Background
User generated content is a key part of a web 2.0 experience
Process
Millions of photos uploaded every day
Problem
Need to ensure user generated content meets site guidelines
20. Categorization
What kind of dress is this?
http://www.flickr.com/photos/34801476@N00/296743627/
Cocktail / Bridal dress
Background
Consumers need to be able to quickly find a product when shopping online
The Business Process
Millions of new products are introduced every day
Products are sourced from hundreds of merchants and manufacturers, each with their own taxonomy
Problem
Need to properly categorize new products quickly in order to monetize