1. AWS Government, Education, and Nonprofits Symposium
Washington, DC | June 24, 2014 - June 26, 2014
Transformational Impact of Cloud Labor
John Hoskins & Daniel Gray
jhoskins@amazon.com
djgray@amazon.com
2. Crowdsourcing Best Practices
3. Crowdsourcing myths
4. The Myths
• It’s cheaper
– It’s actually more efficient
• It’s faster
– It’s actually more scalable
• It’s not accurate
– It’s actually more accurate
5. Crowdsourcing Best Practices
6.
• Consider the question carefully
– Workers answer what you ask
• Select your workers
– Perspective and skills vary
• Iterate and Optimize
– Adjust for optimal results
7. The question
You will get an answer to the question that you ask. Focus on asking the right question.
8. Choosing your workers
Workers are different, from language and cultural differences to varying skills. Test and monitor.
9. Monitor and Improve
Monitor key metrics, then adjust and measure how key attributes impact those metrics.
10. Key Metrics
• Accuracy
– Know your current accuracy
• Throughput
– Understand both turnaround and scale requirements
• Cost
– Measure against a budget, as cost can impact the other two
“Great service, Good food, Friendly staff – you can choose two”
11. Cost
Cost is impacted most by the efficiency of the other two metrics.
12. Accuracy
Error has two sources: human and systematic. Isolating human error and solving for systematic error gives a better chance of long-term success.
13. Accuracy
After solving for systematic error, choosing the best workers and monitoring them is the next step toward high accuracy and lower costs.
14. Throughput
Many factors impact throughput:
• Reputation
• Ergonomics
• Clarity
15. http://www.mturk.com
John Hoskins, Amazon Mechanical Turk
hoskins@amazon.com
16. Cost
Cost is impacted most by the efficiency of the other two metrics. Optimization of the task and workers lowers both the cost of getting the work done and the cost of adjudicating a result.
17. Thank You
Editor’s notes
Welcome to the Crowdsourcing Best Practices portion of today’s workshop.
First, I want to dispel some biases that enterprises new to crowdsourcing often bring with them.
Everyone envisions third-world workers doing tasks for 10% of normal cost – that’s not necessarily true. Task work will cost about the same to do with a crowd as it will with other options. Where the savings come in is the efficiency of the process: 100% utilization of human capital, no overhead, no fixed fees. These add up to large overall savings. Don’t focus on getting the task done cheaply; focus on the process costs – that’s where the savings present themselves.
It’s faster – yes, but it’s faster because it’s scalable to meet demand. Work is done in parallel and at scalable levels, creating an environment where large volumes of tasks get done in less time because immediately available workers scale to the need.
Finally, prospects always say it can’t possibly be as accurate as in-house experts, but experience shows that when implemented with automation and best practices, it’s actually more accurate. Most internal workflows are measured by sampling, which doesn’t uncover the outliers and exceptions and is subject to sampling error. In fact, many customers don’t really know the true accuracy of their current workflow. Automated crowdsourcing workflows provide a confidence score on every answer, giving you the metrics you need to measure and improve accuracy to maximum levels.
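In its simplest form, the per-answer confidence score mentioned above is just agreement among the workers who answered the same task. This is an illustrative sketch, not the marketplace's actual scoring algorithm; production systems also weight by worker quality:

```python
from collections import Counter

def answer_confidence(judgments):
    """Given several workers' answers to the same task, return the
    majority answer and the fraction of workers who agree with it.
    That fraction serves as a simple per-answer confidence score."""
    counts = Counter(judgments)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(judgments)
```

For example, three workers answering "cat", "cat", "dog" yield the answer "cat" with confidence 2/3; low-confidence answers can be routed to additional workers or to adjudication.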
When people ask me how to think about crowdsourcing their workflows – or what should change in their thinking – I always come back to these three things.
Consider the question – think about how you are disintegrating your work.
Select the best workers for your work
And constantly improve
So let’s talk about the question. At Amazon, it is our belief that the better you disintegrate the steps in your workflow – the closer you get to a binary question with one right answer – the easier it is to crowdsource. The more the question requires context or interpretation, the more possibility you’ve created for error. Asking the right question, or series of questions, is the foundation of a successful crowdsourced implementation. Sometimes what you think is one question might actually be more than one.
Consider cultural context: is it important to your task, or can you define your task well enough to eliminate it? Also, don’t think in terms of skills like programming or accounting; think in terms of skills like recognition for transcribing poor handwriting, or expressiveness for keywording. [tell story of me transcribing audio] Establish the task type so that workers can self-select. Workers don’t like to be wrong; they’ll avoid tasks they aren’t good at. Then, from the pool of workers choosing your tasks, find the better ones.
Finally, establish results goals and key metrics; measure and iterate to improve.
What are the common key metrics? You might have additional ones or different priorities, but these are common across our customers, and they are interrelated.
Accuracy – what are you getting today, and what do you need? Accuracy comes at a cost, so be realistic. [story about customers often not knowing their true accuracy]
Throughput – what are the process requirements, and what opportunity does improvement provide? Often the newfound speed of retrieving information opens the door to process improvements not considered in the base ROI. [tell CPG story]
Cost – think of cost differently, as it impacts the other two: more judgments arrive at greater confidence, at greater overall task cost; higher rewards attract more workers and improve throughput; and so on. Remember, savings come in the efficiencies. In some cases we’ve seen the task cost was actually higher than internal sources, but the efficiencies and speed provided significant business impact, negating the extra spent on tasks.
I put cost third intentionally. While overall it is a key metric in almost all cases, it has many facets; here I’m simply focusing on things that you can do to set the reward you pay at its optimal amount.
Task ergonomics play a huge role in worker efficiency. That impacts throughput as mentioned – but cost as well. Scrolling large windows, load times for data elements like videos and pictures, all of these cause the workers to take extra steps or pause – costing time – and to them, their time is money.
Finally, there’s the sociological aspect of the task: overall, workers like knowing the purpose – what you’re trying to accomplish – because it helps them understand how to answer. Workers are also attracted to fun tasks like reading tweets or looking at photos. I’m not saying only do fun tasks; I’m saying consider the boredom factor in pricing your tasks. Typical database cleansing in the marketplace pays a little better than photo moderation due to the boredom factor.
Although it can all be attributed to humans making mistakes, isolating and correcting the cause of the most common errors builds greater overall accuracy. Mistakes come in two forms: humans simply making an error (human error), and what I’ve termed systematic error (commonly called ????). Systematic error is typically caused by things like poor instructions, ambiguous data, or unclear questions. By establishing a good sample workforce, like Mechanical Turk Masters, you can begin to test for and improve systematic error. Look for outliers – large levels of disagreement – and root-cause the specific tasks to see if improvement can eliminate them.
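The outlier hunt described above can be sketched as a disagreement filter: tasks where trusted workers split heavily are candidates for systematic error (bad instructions, ambiguous data) rather than individual mistakes. The 0.4 threshold is an illustrative assumption to tune on your own data:

```python
from collections import Counter

def flag_systematic_errors(results, max_disagreement=0.4):
    """Flag tasks whose worker answers disagree heavily.
    `results` maps task_id -> list of answers from different workers.
    Disagreement is the fraction of workers outside the majority;
    tasks above the threshold are returned for root-cause review."""
    flagged = []
    for task_id, answers in results.items():
        top_votes = Counter(answers).most_common(1)[0][1]
        disagreement = 1 - top_votes / len(answers)
        if disagreement > max_disagreement:
            flagged.append(task_id)
    return flagged
```

A cluster of flagged tasks sharing the same input shape usually points at the task design, not the workers.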
After solving for systematic error, and with a clear picture of what to expect, you can now begin measuring your workers to see whether some are better than others. Look for accuracy on known answers (using the known-answer API) and high levels of agreement with other workers who have high gold-standard scores. Use that data to build a confidence score on each answer, establishing a key system metric to monitor.
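Combining gold-standard scores with agreement, as described above, can be sketched as a vote weighted by each worker's measured accuracy. This is a simplified illustration, not the marketplace's actual scoring scheme:

```python
def weighted_confidence(judgments, worker_accuracy):
    """Combine workers' answers into one answer plus a confidence score,
    weighting each vote by that worker's accuracy on known (gold) tasks.
    `judgments` is a list of (worker_id, answer) pairs;
    `worker_accuracy` maps worker_id -> fraction correct on gold tasks."""
    weights = {}
    for worker_id, answer in judgments:
        weights[answer] = weights.get(answer, 0.0) + worker_accuracy[worker_id]
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())
```

A worker who is 90% accurate on gold tasks thus counts for nearly twice the vote of one who is 50% accurate, so two strong workers can outvote a larger number of weak ones.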
Response times are impacted by many factors. Initially, you’re as new to the workers as they are to you, and how you establish that brand can impact your long-term throughput. Workers are looking for Requesters with clearly defined tasks that they know they can do accurately, and who adjudicate fairly and pay quickly. Think in terms of worker efficiencies: you are ultimately paying workers for their time, and doing things that allow them to be more efficient saves them time – something as simple as prepopulating a web search you want done. Finally, clarity of task impacts throughput: helping workers understand how you want the question answered and how to handle edge cases gives them greater confidence to answer correctly and avoid mistakes, thereby improving their desire to do the tasks.
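The prepopulated-web-search tip above is a one-liner worth doing at scale: embed a ready-made search link in the task so workers click instead of typing. The search-engine base URL here is just an example:

```python
from urllib.parse import urlencode

def prepopulated_search_url(query, base="https://www.google.com/search"):
    """Build a ready-made search link to embed in a task template,
    saving each worker the step of typing the query themselves.
    urlencode handles spaces and special characters in the query."""
    return f"{base}?{urlencode({'q': query})}"
```

A few seconds saved per task, multiplied across thousands of tasks and workers paid for their time, shows up directly in both throughput and cost.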