Amazon Mechanical Turk Requester Meetup Dahn Tamir, Knewton Inc.
Knewton - Introduction
How we use MTurk
Why MTurk?
What We Learned
Thank you! --- Questions? [email_address] 978-KNEWTON

Amazon MTurk Developer Meetup - Tamir

Editor's notes

  1. My name is Dahn Tamir, and I’ve used MTurk for everything from vetting names for my new daughter to a recent study of web browser preference by political affiliation (http://www.evilsoft.org/?p=151). This evening I’m going to focus on the work we’ve done at Knewton.
  2. Knewton is a venture-backed eLearning startup in the West Village. We prepare students for graduate entrance exams, and in the future will open our learning platform to publishers of other educational content. We've been using MTurk since we were in stealth mode a year ago and continue to be heavy users today.
  3. The core of our system is adaptivity, and adaptive testing requires response data from hundreds of users on thousands of test questions. We built groups of qualified workers and administered quizzes to establish the foundation for our testing engine. This is real science, overseen by the former director of research at Educational Testing Service. We have load-tested our online classroom via MTurk, proofed all our course material, and beta-tested the functionality of our learning and testing engines. We’ve also used MTurk for ratings and feedback on our name, logo, web design, price/feature analysis, video evaluation of teachers, and so on. We’ve collected and cleaned data on schools, potential partners, and marketing outlets. We also tap over 500 current college students to distribute flyers at their campuses, though this requires care because we don’t want to risk being seen as spammers. Finally, we pretest banner ads and landing pages on MTurk.
  4. How else can you get a thousand pages of text thoroughly proofread in 72 hours? But there's another dimension of speed beyond time to complete a project, and that's time to spin up and start getting responses. Because it's so fast and easy, we experiment a lot. Some things we try go nowhere, but the risk of trying is trivial. Calibrating our test engine was expected to cost tens of thousands of dollars, and we got it done for one-tenth of our budget. Through surveys and with custom qualifications we've established panels of workers by country, age, gender, education level, language ability, and so on, and can go to the right group for each task. Because we can afford to get many eyes on each task and because we can iterate, we end up with more complete and accurate results on everything we do than we'd have without the wisdom of the crowd. This point is huge to us. Saving time and money is great, but in some cases the improvement in quality is reason enough to use MTurk.
  5. It's inconceivable to many that people would be Turking for the money if they are only paid a dollar or two an hour. If you think of MTurk fundamentally as a way to get 10c worth of work from some bored person for 1c, you're selling the opportunity short. There are many highly capable Turkers who are perhaps temporarily out of the workforce because of medical disability, child rearing, a layoff, or because they’re in school. Our top 20 workers each have from 100 thousand to 500 thousand approved HITs, and overall we believe a very large fraction of work on MTurk is completed by a small number of huge, accurate producers. Getting those people working for you is key. Restricting by approval rate is useful, but we get better results by creating a pool of workers who have shown they can do good work on tasks relevant to us. A poor worker can have an artificially high approval rate and vice versa. And someone’s performance on other HITs may not predict performance on your work, for better or worse. Qualifications help. It pays to take time and care in building and testing HITs to ensure that everything looks and operates for the worker as you intend. Poorly constructed or poorly explained HITs just get poor results. We try to align the payment amount to the timing and difficulty of the task, and have paid from a penny to five dollars for a single HIT. It has also helped to break up complex tasks into separate HITs whenever possible. The increased effort of structuring two or three HITs really is worthwhile. Finally, for large projects it’s best to try a small sample first and expect to tweak the HIT a few times, then load your 50 thousand data points. Because most requesters use the approval-rate qualification, workers live in fear of unfair rejection. Good workers will avoid your tasks if the setup suggests a chance of rejection. For instance, it's not unreasonable to use the majority opinion as the "correct" answer on an image moderation task. But that does not mean you have to reject the response that was "wrong," especially as that response may actually be correct. We create goodwill with workers by paying for quality effort and tolerating the occasional "error." On the other hand, if we identify a scammer or careless worker, we simply reject their submissions and block them from future tasks. For simple and well-established uses, the automation metaphor of MTurk works fine. But if you’re trying to do anything even a little different, it pays to introduce yourself on the forums, establish yourself as a trustworthy employer, and solicit free advice. Once you are running HITs, take the time to be responsive to questions, concerns, and suggestions from your workers. These are real people, and your respect for their efforts will pay dividends in faster, more accurate results.
  6. I’d love to take your questions now, and also welcome you to contact me directly.
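
The majority-opinion approach described in note 5 can be sketched roughly as follows. This is a minimal illustration, not Knewton's actual code: the function names, the `(task_id, worker_id, label)` tuple layout, and the `max_rate` and `min_tasks` thresholds are all assumptions made for the example. It takes the majority label as the working answer for each task, pays everyone regardless, and only flags workers who disagree with the consensus so often that a manual review seems warranted.

```python
from collections import Counter, defaultdict


def majority_labels(responses):
    """Return {task_id: label} for tasks where one label has a strict majority.

    `responses` is a list of (task_id, worker_id, label) tuples (assumed layout).
    Tasks without a strict majority are left out rather than guessed.
    """
    by_task = defaultdict(list)
    for task_id, _worker_id, label in responses:
        by_task[task_id].append(label)

    consensus = {}
    for task_id, labels in by_task.items():
        label, count = Counter(labels).most_common(1)[0]
        if count * 2 > len(labels):
            consensus[task_id] = label
    return consensus


def disagreement_rates(responses, consensus):
    """Return {worker_id: fraction of their answers that differ from consensus}."""
    totals = Counter()
    misses = Counter()
    for task_id, worker_id, label in responses:
        if task_id not in consensus:
            continue  # no consensus, so nothing to compare against
        totals[worker_id] += 1
        if label != consensus[task_id]:
            misses[worker_id] += 1
    return {worker: misses[worker] / totals[worker] for worker in totals}


def workers_to_review(responses, max_rate=0.5, min_tasks=3):
    """List workers whose disagreement rate exceeds `max_rate` (hypothetical threshold).

    A single "wrong" answer never flags anyone: a worker needs at least
    `min_tasks` scored answers before the rate is considered at all.
    """
    consensus = majority_labels(responses)
    rates = disagreement_rates(responses, consensus)
    scored = Counter(w for t, w, _ in responses if t in consensus)
    return sorted(
        w for w, rate in rates.items()
        if scored[w] >= min_tasks and rate > max_rate
    )
```

A flagged worker is only a candidate for a closer look; as the note points out, the minority answer may be the correct one, so the decision to reject or block is better left to a person.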