SlideShare une entreprise Scribd logo
1  sur  6
Amazon Mechanical Turk Requester Meetup Dahn Tamir, Knewton Inc.
Knewton - Introduction ,[object Object],[object Object]
How we use MTurk ,[object Object],[object Object],[object Object],[object Object],[object Object]
Why Mturk? ,[object Object],[object Object],[object Object],[object Object]
What We Learned ,[object Object],[object Object],[object Object],[object Object],[object Object]
Thank you! --- Questions? [email_address] 978-KNEWTON

Contenu connexe

En vedette

Dev traning 2016 basics of PHP
Dev traning 2016   basics of PHPDev traning 2016   basics of PHP
Dev traning 2016 basics of PHPSacheen Dhanjie
 
1001 libros que leer
1001 libros que leer1001 libros que leer
1001 libros que leerEric Marzochi
 
Gionelly_Fernández_Herramientas web 2.0_blog
Gionelly_Fernández_Herramientas web 2.0_blogGionelly_Fernández_Herramientas web 2.0_blog
Gionelly_Fernández_Herramientas web 2.0_blogUniversidad Yacambu
 
Kaufman Research Interests
Kaufman Research InterestsKaufman Research Interests
Kaufman Research InterestsEric Kaufman
 
Workshop #5: Phygital - The Future of Seating by L+W
Workshop #5: Phygital - The Future of Seating by L+WWorkshop #5: Phygital - The Future of Seating by L+W
Workshop #5: Phygital - The Future of Seating by L+Wux singapore
 
Dt Wcdma Validação De Sites WCDMA - Parte 2
Dt Wcdma   Validação De Sites  WCDMA - Parte 2Dt Wcdma   Validação De Sites  WCDMA - Parte 2
Dt Wcdma Validação De Sites WCDMA - Parte 2marco.silva
 
How to choose an idea for your startup Dalton Caldwell Y Combinator
How to choose an idea for your startup  Dalton Caldwell Y CombinatorHow to choose an idea for your startup  Dalton Caldwell Y Combinator
How to choose an idea for your startup Dalton Caldwell Y CombinatorWebrazzi
 

En vedette (11)

Autoforma relj-fmmp (1)
Autoforma relj-fmmp (1)Autoforma relj-fmmp (1)
Autoforma relj-fmmp (1)
 
Dev traning 2016 basics of PHP
Dev traning 2016   basics of PHPDev traning 2016   basics of PHP
Dev traning 2016 basics of PHP
 
1001 libros que leer
1001 libros que leer1001 libros que leer
1001 libros que leer
 
Gionelly_Fernández_Herramientas web 2.0_blog
Gionelly_Fernández_Herramientas web 2.0_blogGionelly_Fernández_Herramientas web 2.0_blog
Gionelly_Fernández_Herramientas web 2.0_blog
 
Excel 2010
Excel 2010Excel 2010
Excel 2010
 
Scaling
ScalingScaling
Scaling
 
Kaufman Research Interests
Kaufman Research InterestsKaufman Research Interests
Kaufman Research Interests
 
Cv16
Cv16Cv16
Cv16
 
Workshop #5: Phygital - The Future of Seating by L+W
Workshop #5: Phygital - The Future of Seating by L+WWorkshop #5: Phygital - The Future of Seating by L+W
Workshop #5: Phygital - The Future of Seating by L+W
 
Dt Wcdma Validação De Sites WCDMA - Parte 2
Dt Wcdma   Validação De Sites  WCDMA - Parte 2Dt Wcdma   Validação De Sites  WCDMA - Parte 2
Dt Wcdma Validação De Sites WCDMA - Parte 2
 
How to choose an idea for your startup Dalton Caldwell Y Combinator
How to choose an idea for your startup  Dalton Caldwell Y CombinatorHow to choose an idea for your startup  Dalton Caldwell Y Combinator
How to choose an idea for your startup Dalton Caldwell Y Combinator
 

Dernier

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 

Dernier (20)

E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

Amazon MTurk Developer Meetup - Tamir

Notes de l'éditeur

  1. My name is Dahn Tamir, and I’ve used MTurk for everything from vetting names for my new daughter to a recent study of web browser preference by political affiliation (http://www.evilsoft.org/?p=151). This evening I’m going to focus on the work we’ve done at Knewton.
  2. Knewton is a venture-backed eLearning startup in the west village. We prepare students for graduate entrance exams, and in the future will open our learning platform to publishers of other educational content. We've been using MTurk since we were in stealth mode a year ago and continue to be heavy users today.
  3. The core of our system is adaptivity, and adaptive testing requires response data from hundreds of users on thousands of test questions. We built groups of qualified workers and administered quizzes to establish the foundation for our testing engine. This is real science; overseen by the former director of research at Educational Testing Service. We have load tested our online classroom via MTurk, proofed all our course material, and beta tested the functionality of our learning and testing engines. We’ve also used Mturk for ratings and feedback on our name, logo, web design, price/feature analysis, video evaluation of teachers, and so on We’ve collected and cleaned data on schools, potential partners and marketing outlets And while this requires care as we don’t want to risk being seen as spammers, we do for instance tap over 500 current college students to distribute flyers at their campuses. We also pretest banner ads and landing pages on Mturk.
  4. How else can you get a thousand pages of text thoroughly proofread in 72 hours? But there's another dimension of speed beyond time to complete a project, and that's time to spin up and start getting responses. Because it's so fast and easy, we experiment a lot. Some things we try go nowhere, but the risk of trying is trivial. Calibrating our test engine was expected to cost tens of thousands of dollars, and we got it done for one-tenth of our budget. Through surveys and with custom qualifications we've established panels of workers by country, age, gender, education level, language ability, and so on, and can go to the right group for each task. Because we can afford to get many eyes on each task and because can iterate, we end up with more complete and accurate results on everything we do than we'd have without the wisdom of the crowd. This point is huge to us. Saving time and money are great, but in some cases the improvement in quality is reason enough to use Mturk.
  5. It's inconceivable to many that people would be Turking for the money if they are only paid a dollar or two an hour. If you think of Mturk fundamentally as a way to get 10c worth of work from some bored person for 1c, you're selling the opportunity short. There are many highly capable Turkers who are perhaps temporarily out of the workforce because of medical disability, child rearing, a layoff, or because they’re in school. Our top 20 workers each have from 100 thousand to 500 thousand approved HITs, and overall we believe a very large fraction of work on MTurk is completed by a small number of huge, accurate producers. Getting those people working for you is key. Restricting by approval rate is useful, we get better results by creating a pool of workers who have shown they can do good work on tasks relevant to us. A poor worker can have an artificially high approval rate and vice versa. And someone’s performance on other HITs may not predict performance on your work, for better or worse. Qualifications help. It pays to take time and care in building and testing HITs to ensure that everything looks and operates for the worker as you intend. Poorly-constructed or poorly-explained HITs just get poor results. We try to align the payment amount to the timing and difficulty of the task, and have paid from a penny to five dollars for a single HIT. It’s also helped to break up complex tasks into separate HITs whenever possible. The increased effort of structuring two or three HITs really is worthwhile. Finally for large projects it’s best to try a small sample first and expect to tweak the HIT a few times—then load your 50 thousand data points. Because most requesters use the approval-rate qualification, workers live in fear of unfair rejection. Good workers will avoid your tasks if the setup suggests a chance of rejection. For instance, it's not unreasonable to use the majority opinion as the "correct" answer on an image moderation task. But that does not mean you have to reject the response that was "wrong," especially as that response may actually be correct. We create goodwill with workers by paying for quality effort and tolerating the occasional "error." On the other hand, if we identify a scammer or careless worker, we simply reject their submissions and block them from future tasks.   For simple and well-established uses, the automation metaphor of MTurk works fine. But if you’re trying to do anything even a little different, it pays to introduce yourself on the forums, establish yourself as a trustworthy employer and solicit free advice. Once you are running HITs, take the time to be responsive to questions, concerns and suggestions from your workers. These are real people and your respect for their efforts will pay dividends in faster, more accurate results.
  6. I’d love to take your questions now, and also welcome you to contact me directly.