Crowdsourcing: Challenges and Opportunities



                  Guoliang Li
               Tsinghua University
Tutorial Objectives

   What is crowdsourcing?
   How and when to use crowdsourcing?
   How to do experiments for crowdsourcing?
   What are research challenges of crowdsourcing?




Tutorial Outline

   Introduction
   Applications
   Platforms
   Challenges
   Opportunities




What is Crowdsourcing?

   Outsourcing – 外包
       A known agent (an employee)
   Crowdsourcing – 众包
       An undefined, generally large group of people via an open call
       The application of open source principles to fields outside of
        software
   Most successful story: Wikipedia
   [Figure: $ / customer vs. # of customers, with large businesses at the head of the curve]



Crowdsourcing Definition
   Coordinating a crowd (a large group of people on the web)
    to do micro-work (small contributions) that solves
    problems that software alone or a single user cannot solve
   A collection of mechanisms and associated methodologies
    for scaling and directing crowd activities to achieve goals




     Evolving & broadly defined
Example - Captcha

    Captcha: 200M every day
    ReCaptcha: 750M to date




Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham and Manuel Blum.
reCAPTCHA: Human-Based Character Recognition via Web Security Measures.
Science, 321: 1465-1468, 2008

NLP Example – Machine Translation
   Machine Translation
       Problem:
           Manual evaluation on translation quality is slow and expensive
       Crowdsourcing:
           Low cost of non-experts, $0.10 to translate a sentence
           High agreement/equivalent quality between non-experts and experts
           Complex tasks like human-mediated translation edit rate


   C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality
    Using Amazon’s Mechanical Turk”, EMNLP 2009.
   B. Bederson et al. Translation by Iterative Collaboration between Monolingual
    Users, GI 2010



IR Example – Image Search




 Tingxin Yan, Vikas Kumar, Deepak Ganesan: CrowdSearch: exploiting crowds
 for accurate real-time image search on mobile phones. MobiSys 2010:77-90

IR Example - Relevance and ads




Omar Alonso, Daniel E. Rose, Benjamin Stewart: Crowdsourcing for relevance
evaluation. SIGIR Forum (SIGIR) 42(2):9-15 (2008)

CV Example - Painting Similarity




   How similar is the artistic style in the paintings above?
       Very similar
       Similar
       Somewhat dissimilar
       Very dissimilar
   Adriana Kovashka and Matthew Lease. Human and Machine Detection of
    Stylistic Similarity in Art. CrowdConf 2010

Audio Example
   Mobile service that aids
    blind users with “visual
    questions” in near-realtime
   An iPhone application
Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller,
Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White,
Tom Yeh. VizWiz: nearly real-time answers to visual questions. UIST 2010
(Best Paper Award)


       http://www.cs.rochester.edu/u/jbigham/vizwiz/video/

DB Example - CrowdDB

   Use crowd to answer DB
    queries
       Where to use crowd?
       How to use crowd?
       How to support SQL?
       How to devise a system?
       Quality?




    Michael J. Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, Reynold
    Xin: CrowdDB: answering queries with crowdsourcing. SIGMOD 2011:61-72

Crowdsourcing Overview
   Requester
       People submit tasks
       Submit tasks to the platform; collect answers
   Platforms
       Task management
       Publish tasks; return answers
   Worker
       People work on tasks
       Find interesting tasks; return answers
Crowdsourcing vs Human Computation
   Human Computation
       Design a solution using both automated
        computers and human computers
       May be a closed set of workers
       Crowdsourcing greatly facilitates
        human computation
   Social computing
       Social behavior
   Collective intelligence
       May not involve humans


    Alexander J. Quinn, Benjamin B. Bederson: Human computation: a survey and
    taxonomy of a growing field. CHI 2011:1403-1412

A Growing Field - Tutorials
   WWW 2011: Managing Crowdsourced Human Computation
   VLDB 2011: Crowdsourcing Applications and Platforms
   SIGIR 2011: Crowdsourcing for Information Retrieval: Principles, Methods and
    Applications
   AAAI 2011: Human Computation: Core Research Questions and State of the Art
   WSDM 2011: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You
   CVPR 2010: Mechanical Turk for Computer Vision
   ECIR 2010: Crowdsourcing for Relevance Evaluation
   HCIC 2011: Quality Crowdsourcing for Human Computer Interaction Research
   CrowdConf 2011: Crowdsourcing for Fun and Profit




A Growing Field - Workshops
   KDD 2009: 1st Human Computation Workshop           - HCOMP 2009
   KDD 2010 : 2nd Human Computation Workshop - HCOMP 2010
   AAAI 2011 : 3rd Human Computation Workshop - HCOMP 2011
   AAAI 2012 : 4th Human Computation Workshop - HCOMP 2012
   SIGIR 2010 : Crowdsourcing for Search Evaluation
   SIGIR 2011: Crowdsourcing for Information Retrieval
   CVPR 2010 : Advancing Computer Vision with Humans in the Loop
   NAACL 2010: Creating Speech and Language Data with Amazon’s Mechanical Turk
   NIPS 2010 : Computational Social Science and Wisdom of the Crowds
   Ubicomp 2010 : Workshop on Ubiquitous Crowdsourcing
   WSDM 2010: Crowdsourcing for Search and Data Mining Workshop
   ICWE 2010 : Enterprise Crowdsourcing Workshop
   CHI 2011: Workshop on Crowdsourcing and Human Computation
   AMTA 2010: Collaborative Translation Technology, Crowdsourcing and the Translator
   EC 2011: Workshop on Social Computing and User Generated Content
Many Related Areas
   Human–Computer Interaction (HCI)
   Artificial Intelligence (AI)
   Machine Learning (ML)
   Information Retrieval (IR)
   Crowd Management (DB)
   Social Science
   Theory
   Statistics




Landscape




Crowdsourcing

   Platforms: are there any platforms?



Amazon Mechanical Turk (MTurk) – www.mturk.com




highly-available, cheap, programmable, a prototyping platform
for crowd computing
Tasks

   What are tasks?




Why Micro-tasks

   Cheap, Easy, and Fast
   Ready-to-use infrastructure
       Payments, workforce, interface widgets
   Allow early, iterative, frequent trials
       Test new ideas
   Many successful examples
       Image search
       reCaptcha




Human Intelligence Tasks (HITs)
   Human Intelligence Tasks – micro tasks

   Requesters create HITs
       via the web services API or the Dashboard
       assess results, pay per satisfactorily completed HIT


   Workers (sometimes called “Turkers”)
       log in, choose HITs, perform them.


   Currently >200,000 workers from 100 countries
   Millions of HITs completed

Requester & Worker


   Requester: Build HIT → Test HIT → Post HIT → Reject or approve
   Worker: Search for HITs → Accept HIT → Do work → Submit HIT

The Requester

   Sign up with your Amazon account
   Amazon payments
   Purchase prepaid HITs
   There is no minimum or up-front fee
   AMT collects a 10% commission
   The minimum commission charge is $0.005 per HIT
   Approve/reject answers




Dashboard

   Three tabs
       Design
       Publish
       Manage
   Design
       HIT Template
   Publish
       Make work available
   Manage
       Monitor progress



Amazon Web Services API
   Rich set of services
   Command line tools
   More flexibility than dashboard

   CreateHIT (Requirements, Pay rate, Description) – returns HIT Id and HIT Type Id
   SubmitAssignment (AssignmentId) – notifies Amazon that this assignment has
    been completed
   ApproveAssignment (AssignmentID) – Requester accepts assignment, money is
    transferred, also RejectAssignment
   GrantBonus (WorkerID, Amount, Message) – Give the worker the specified bonus
    and sends message, should have a failsafe
   NotifyWorkers (list of WorkerIds, Message) – e-mails message to the workers.
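
The calls above are easier to see in code. Below is a minimal illustrative sketch using the modern boto3 MTurk client (a present-day successor of the 2012-era API listed above); the sandbox endpoint, reward amounts, question file, and the looks_reasonable() check are assumptions made up for this example, not part of the tutorial.

```python
import boto3

# Connect to the MTurk requester sandbox (assumed endpoint; switch to the
# production endpoint to post real HITs).
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

def looks_reasonable(answer_xml: str) -> bool:
    """Hypothetical placeholder for a real quality check on the answer XML."""
    return "FreeText" in answer_xml

# CreateHIT: description, pay rate, and a question form (QuestionForm or
# ExternalQuestion XML, loaded here from a hypothetical file).
question_xml = open("question.xml").read()
hit = mturk.create_hit(
    Title="Judge a machine translation",
    Description="Does the translation preserve the meaning of the sentence?",
    Reward="0.05",                    # dollars, passed as a string
    MaxAssignments=3,                 # 3 workers per HIT, e.g. for majority voting
    LifetimeInSeconds=24 * 3600,      # how long the HIT stays on the market
    AssignmentDurationInSeconds=600,  # time a worker has after accepting
    Question=question_xml,
)
hit_id = hit["HIT"]["HITId"]

# Collect submitted assignments, then approve or reject each one.
assignments = mturk.list_assignments_for_hit(
    HITId=hit_id, AssignmentStatuses=["Submitted"]
)["Assignments"]
for a in assignments:
    if looks_reasonable(a["Answer"]):
        mturk.approve_assignment(AssignmentId=a["AssignmentId"])
        # GrantBonus: reward extra, optional effort.
        mturk.send_bonus(
            WorkerId=a["WorkerId"],
            BonusAmount="0.02",
            AssignmentId=a["AssignmentId"],
            Reason="Thanks for the optional feedback.",
        )
    else:
        mturk.reject_assignment(
            AssignmentId=a["AssignmentId"],
            RequesterFeedback="The answer did not follow the instructions.",
        )

# NotifyWorkers: e-mail a message to selected workers.
if assignments:
    mturk.notify_workers(
        Subject="New HITs available",
        MessageText="We just posted a new batch of translation HITs.",
        WorkerIds=[a["WorkerId"] for a in assignments],
    )
```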




Dashboard vs API

   Dashboard
       Easy to prototype
       Set up and launch an experiment in a few minutes
   API
       Ability to integrate AMT as part of a system
       Ideal if you want to run experiments regularly
       Schedule tasks




The Worker
   Sign up with your Amazon account
   Tabs
       Account: work approved/rejected
       HIT: browse and search for work
       Qualifications: browse and search for qualifications




Why Do Work on MTurk?

   Money ($$$)
   Fun (or avoid boredom)
   Socialize
   Earn acclaim/prestige
   Altruism
   Learn something new (e.g. English)
   Unintended by-product (e.g. re-Captcha)
   Create self-serving resource (e.g. Wikipedia)
   Multiple incentives are typically at work in parallel



Who Are My Workers?

   2008-2009 studies found less global and diverse than
    previously thought
       47% US, 34% India, 19% others
       Female
       Educated
       Bored
       Money is secondary




Survey on Workers
   “Mturk money is always necessary to make ends meet.”
     5% U.S. 13% India
   “Mturk money is irrelevant.”
     12% U.S. 10% India
   “Mturk is a fruitful way to spend free time and get some cash.”
     69% U.S. 59% India




MTurk
   Advantages
       More participants
       More diverse participants
       High speed
       Low cost
       Speed of experimentation
       Diversity
   Disadvantages / Limitations
       Lower quality feedback
       Less interaction
       Greater need for quality control
       Less focused user groups
       No control of users' environment
       Not designed for user studies
       Spam - uncertainty about user demographics, expertise
       Lots of problems and missing features

                      Crowdsourcing != MTurk

Pay-based Marketplaces / Vendors
   Amazon Mechanical Turk (since 2005, www.mturk.com)
   Crowdflower (since 2007, www.crowdflower.com)
   CloudCrowd (www.cloudcrowd.com/)
   DoMyStuff (www.domystuff.com/)
   Livework (https://www.livework.com/)
   Clickworker (www.clickworker.com/)
   SmartSheet (www.smartsheet.com/crowdsourcing)
   uTest (www.utest.com/)
   Elance (www.elance.com/)
   oDesk (www.odesk.com/)
   vWorker (www.vworker.com/)

Microtask Aggregators




Samasource.org








How and when to use crowdsourcing?
When to use crowdsourcing

   Computers cannot do the work
   A single person cannot do it alone
   The work can be split into small tasks




How to Do Experiments in Crowdsourcing

   Experimental Design
   Choose crowdsourcing platform
   Decompose your tasks into micro tasks
   Publish your tasks and wait for answers
   Aggregate workers’ answers




Tutorial Outline

 Introduction
 Applications
 Platforms
 Challenges
 Opportunities




Research Challenges in Crowdsourcing

   Task Management
       Task assignment, payment, discovery
   Human–Computer Interaction
       Payment / incentives, interface and interaction design,
        communication, reputation, recruitment, retention
   Quality Control / Data Quality
       Trust, reliability, spam detection, consensus labeling
   Human-Processing Unit (HPU) and CPU
       How to combine
   Scalability
       Large scale data


Task Management
   How to decompose a complex task?
   How to find tasks for workers?
   How much should a HIT pay?




Supporting Complex Tasks

   MTurk works only for small tasks
   How to support complex tasks? (see the sketch below)
       Task decomposition – large tasks are divided into small problems
       Distribute the job among multiple workers
       Collect all the answers and combine them
       Verify the performance of heterogeneous CPUs and HPUs
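
A minimal sketch of this decompose / distribute / combine pattern, with a hypothetical post_hit() standing in for whatever platform call actually publishes one micro-task and returns a worker's answer:

```python
from concurrent.futures import ThreadPoolExecutor

def decompose(document: str, chunk_size: int = 500) -> list[str]:
    """Split a large task (here: a long text) into small, independent chunks."""
    return [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]

def post_hit(chunk: str) -> str:
    """Hypothetical stand-in for posting one micro-task and waiting for its answer;
    a real system would call the crowdsourcing platform's API here."""
    return chunk.upper()  # placeholder "worker answer"

def combine(answers: list[str]) -> str:
    """Merge the partial answers back into one result."""
    return "".join(answers)

def crowdsource(document: str) -> str:
    chunks = decompose(document)
    # Distribute the micro-tasks among many workers in parallel.
    with ThreadPoolExecutor(max_workers=8) as pool:
        answers = list(pool.map(post_hit, chunks))
    return combine(answers)

print(crowdsource("a complex job that no single worker would finish alone"))
```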




CrowdForge – MapReduce framework for crowds




   Aniket Kittur, Boris Smus, Robert Kraut: CrowdForge : crowdsourcing
    complex work. CHI Extended Abstracts 2011:1801-1806


How to Discover Tasks?




   Task discovery is very important.
   Heavy-tailed distribution of completion times.

Task Assignment

   Push methods: system ➠ workers
       The system takes complete control over who is assigned which
        task.
       Worker expertise recording for task assignment (employer/task
        finds worker)
   Pull methods: workers ➠ system
       The system merely sets up the environment to allow workers to
        assign themselves (or each other) tasks.
       Task organization for task discovery (worker finds
        employer/task)



Task Recommendation

   Content-based recommendation
       find similarities between worker profiles and task
        characteristics (see the sketch below)
   Collaborative Filtering
       make use of preference information about tasks (e.g., ratings)
        to infer similarities between workers.
   Hybrid
       a mix of content-based and collaborative filtering methods.
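
As a toy illustration of the content-based option above, the sketch below ranks tasks for a worker by cosine similarity over keyword counts; the worker profile, task descriptions, and scoring are made-up assumptions, not a real recommender:

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

worker_profile = Counter("image label photo tag vision".split())
tasks = {
    "label-photos": Counter("label photo image animal".split()),
    "translate-sentences": Counter("translate sentence english spanish".split()),
}

# Recommend tasks whose descriptions are most similar to the worker's profile.
ranked = sorted(tasks, key=lambda t: cosine(worker_profile, tasks[t]), reverse=True)
print(ranked)  # ['label-photos', 'translate-sentences']
```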




Task Payments
   How much is a HIT?
   Delicate balance
       Too little, no interest
       Too much, attract spammers
       Paying a lot is a counter-incentive
       Money does not improve quality but (generally) increases
        participation
       Payment based on user effort - bonus
       Example: $0.04 (2 cents to answer a yes/no question, 2 cents more if you
        provide feedback that is not mandatory)

    Winter A. Mason, Duncan J. Watts: Financial incentives and the "performance of
    crowds". SIGKDD Explorations (SIGKDD) 11(2):100-108 (2009)

Optimization Goal

   3 main goals for a task to be done:
     Minimize Cost (cheap)
     Minimize Completion Time (fast)
     Maximize Quality (good)
   Many optimization problems to trade off the three goals




Human-Assisted Graph Search/Classification
   Given a DAG G,
     Containing unknown target nodes
     Find target nodes by asking humans search
      queries at nodes in G:
      "is there a target node reachable from the
       current node?" (see the pruning sketch below)
   Applications
       Classification, workflow debugging, interactive
        search

        Aditya G. Parameswaran, Anish Das Sarma, Hector Garcia-Molina,
        Neoklis Polyzotis, Jennifer Widom: Human-assisted graph search: it's
        okay to ask questions. PVLDB 4(5):267-278 (2011)
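
A minimal sketch of the pruning logic behind these reachability questions, assuming a single target node and truthful answers (the toy DAG and answers are invented; the actual paper additionally optimizes which k questions to ask, which this sketch does not):

```python
def reachable(dag: dict[str, list[str]], start: str) -> set[str]:
    """All nodes reachable from `start` in the DAG, including `start` itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, []))
    return seen

def prune(dag: dict[str, list[str]], answers: dict[str, bool]) -> set[str]:
    """Narrow down where a single target can be, given crowd answers to
    'is a target node reachable from this node?'"""
    candidates = set(dag) | {v for vs in dag.values() for v in vs}
    for node, is_reachable in answers.items():
        if is_reachable:                       # "yes": the target lies at or below this node
            candidates &= reachable(dag, node)
        else:                                  # "no": nothing at or below this node is the target
            candidates -= reachable(dag, node)
    return candidates

# Tiny taxonomy: thing -> {animal, artifact}, animal -> {dog, cat}
dag = {"thing": ["animal", "artifact"], "animal": ["dog", "cat"]}
# Crowd says: reachable from "animal" (yes), not reachable from "dog" (no).
print(sorted(prune(dag, {"animal": True, "dog": False})))  # ['animal', 'cat']
```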


CrowdGraphSearch - Application

   Classify an image




CrowdGraphSearch - Optimization

   Do not ask all nodes or serially
       Too expensive or too slow
   Given a limit of k questions, find the
    best nodes to ask in parallel
       Minimize the size of the remaining candidate set
       To classify into a huge taxonomy, first ask k
        questions
       Is the resulting superset small enough?




CrowdSearch: using the crowd to improve image search
    How to ensure a high enough accuracy, say over 95%?




    Tingxin Yan, Vikas Kumar, Deepak Ganesan: CrowdSearch: exploiting
    crowds for accurate real-time image search on mobile phones. MobiSys
    2010:77-90

CrowdSearch - Optimization

   To minimize money
       Increase delay
   To minimize delay
       Increase money
   Goal: return one validated
    image before the deadline,
    while minimizing the cost




CrowdSearch - Optimization




CrowdSearch - Optimization




Quality - Example
   Get people to look at sites and classify them as:
       G (general audience)
       PG (parental guidance)
       R (restricted)
       X (porn)




Quality Control

   The quality of workers' answers is an extremely important part of
    the experiment
   Approach it as “overall” quality – not just for workers
   Bi-directional channel
       You may think the worker is doing a bad job.
       The same worker may think you are a lousy requester.




Methods for Measuring Agreement
   What to look for
       Agreement, reliability, validity (a simple agreement computation is
        sketched below)


   Beforehand
       Qualification test
       Screening, selection, recruiting, training
   During
       Assess labels as workers produce them
       Reward, penalize, weight
   After
       Accuracy metrics
       Filter, weight
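
For the agreement part, a minimal sketch of two standard numbers computed over two workers' labels, raw observed agreement and Cohen's kappa (chance-corrected); the labels are made-up toy data:

```python
from collections import Counter

def observed_agreement(a: list[str], b: list[str]) -> float:
    """Fraction of items on which the two workers gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Chance-corrected agreement between two labelers."""
    po = observed_agreement(a, b)
    ca, cb, n = Counter(a), Counter(b), len(a)
    pe = sum((ca[lab] / n) * (cb[lab] / n) for lab in set(a) | set(b))  # agreement expected by chance
    return (po - pe) / (1 - pe) if pe < 1 else 1.0

worker1 = ["rel", "rel", "nonrel", "rel", "nonrel", "rel"]
worker2 = ["rel", "nonrel", "nonrel", "rel", "nonrel", "nonrel"]
print(observed_agreement(worker1, worker2))        # 0.666... (4 of 6 items match)
print(round(cohens_kappa(worker1, worker2), 2))    # 0.4
```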

Qualification Tests: Pros and Cons
   Qualification test
       create questions on the topic so the worker gets familiar before starting
        the assessments
   Advantages
       Great tool for controlling quality
       Adjust passing grade
   Disadvantages
       Hard to verify subjective tasks like judging relevance
       Slows down the experiment, difficult to “test” relevance
       Extra cost to design and implement the test
       Try creating task-related questions to get worker familiar with task
        before starting task in earnest
   No guarantees - passing the test is still not a guarantee of a good outcome
Quality Control

   Majority vote
       2 bad, 3 good → good
       3 bad, 2 good → bad
   Weighted majority vote (see the sketch below)
       Identify workers that always disagree with the majority
       Lower the weight of such workers
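
A minimal sketch of both voting rules, assuming each answer arrives as a (worker, label) pair; in practice the per-worker weights would be learned, e.g. from how often a worker agrees with the majority:

```python
from collections import defaultdict

def majority_vote(answers: list[tuple[str, str]]) -> str:
    """answers: (worker_id, label) pairs for one task."""
    counts = defaultdict(int)
    for _, label in answers:
        counts[label] += 1
    return max(counts, key=counts.get)

def weighted_majority_vote(answers, weights) -> str:
    """Same idea, but each worker's vote counts proportionally to its weight
    (e.g., down-weight workers who often disagree with the majority)."""
    scores = defaultdict(float)
    for worker, label in answers:
        scores[label] += weights.get(worker, 1.0)
    return max(scores, key=scores.get)

answers = [("w1", "good"), ("w2", "good"), ("w3", "bad"), ("w4", "bad"), ("w5", "bad")]
print(majority_vote(answers))                                               # bad (3 vs 2)
print(weighted_majority_vote(answers, {"w3": 0.2, "w4": 0.2, "w5": 0.2}))   # good
```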




Dealing with Bad Workers

   Pay for “bad” work instead of rejecting it?
       Pro: preserve reputation, admit if poor design at fault
       Con: promote fraud, undermine approval rating system
   Use bonus as incentive
     Pay the minimum $0.01 and $0.01 for bonus
     Better than rejecting a $0.02 task
   Worker blocking - spammer “caught”, block from future tasks
     May be easier to always pay, then block as needed




Emails after Rejection




Emails after Rejection
   WORKER: this is not fair , you made me work for 10 cents
    and i lost my 30 minutes of time ,power and lot more and
    gave me 2 rejections atleast you may keep it pending. please
    show some respect to turkers

   WORKER: I understood the problems. At that time my kid
    was crying and i went to look after. that's why i responded
    like that. I was very much worried about a hit being
    rejected. The real fact is that i haven't seen that instructions
    of 5 web page and started doing as i do the dolores labs hit,
    then someone called me and i went to attend that call.
    sorry for that and thanks for your kind concern.


Gold Sets / Honey Pots
   Gold derived from
       Experts
       Crowd using high quorum
   Interject trap questions (see the sketch below)
   Block workers caught by a trap and invalidate their answers
   Pros
       Often very effective
       Cost efficient
   Cons
       Not always applicable
       Digging gold is hard
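
A minimal sketch of the honey-pot idea: mix in trap questions with known ("gold") answers and invalidate the answers of any worker who misses too many of them. The gold set, submissions, and threshold are illustrative assumptions:

```python
GOLD = {"q17": "yes", "q42": "no"}   # trap questions with known answers
MAX_GOLD_ERRORS = 0                  # tolerance before a worker is dropped

# worker -> {question: answer}, mixing gold and real questions
submissions = {
    "w1": {"q17": "yes", "q42": "no", "q1": "yes"},
    "w2": {"q17": "no",  "q42": "no", "q1": "no"},   # fails a trap question
}

def gold_errors(answers: dict[str, str]) -> int:
    """How many gold questions this worker got wrong."""
    return sum(answers.get(q) != a for q, a in GOLD.items())

# Keep only workers who pass the traps, then drop the gold questions
# themselves so only answers to the real questions remain.
trusted = {w: ans for w, ans in submissions.items()
           if gold_errors(ans) <= MAX_GOLD_ERRORS}
clean = {w: {q: a for q, a in ans.items() if q not in GOLD}
         for w, ans in trusted.items()}
print(clean)   # {'w1': {'q1': 'yes'}}
```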


Quality Control
   As a worker
       I hate when instructions are not clear
       I’m not a spammer – I just don’t get what you want
       A good pay is ideal but not the only condition for engagement
   As a requester
       A task that would produce the right results and is appealing to
        workers
       I want your honest answer for the task
       I want qualified workers and I want the system to do some of
        that for me
   Managing crowds and tasks is a daily activity and more
    difficult than managing computers


Human-Computer Interaction (HCI)

   Getting input from users is important in HCI
       surveys
       rapid prototyping
       usability tests
       cognitive walkthroughs
       performance measures
       quantitative ratings




HCI - The UI hurts the market!
   Practitioners know that HITs on the 3rd page and beyond are not
    picked up by workers.

   Many such HITs are left to expire after months, never
    completed.

   Badly designed task discovery interface hurts every
    participant in the market!

   Better modeling of the market as a queuing system may reveal
    other such improvements

HCI
   Survey design
   Questionnaire design
       Right questions
       Examples
   UI design
   Implementation
   Generic tips
       Experiment should be self-contained.
       Keep it short and simple. Brief and concise.
       Be very clear with the relevance task.
       Engage with the worker. Avoid boring stuff.
       Always ask for feedback (open-ended question) in an input box.


Other Design Principles

   Text alignment
   Legibility
   Reading level: complexity of words and sentences
   Attractiveness (worker’s attention & enjoyment)
   Multi-cultural / multi-lingual
   Who is the audience (e.g. target worker community)
       Special needs communities (e.g. simple color blindness)




System Design




DB & Crowd

   How can crowd help databases?
       Fix broken data
       Add missing data
       Subjective comparison
   How can DB help crowd apps?
       Lazy data acquisition
       Game the workers market
       Semi-automatically create user interfaces
       Manage the data sourced from the crowd




CrowdDB

   Use crowd to answer DB
    queries
       Find missing data
       Make subjective comparison
   Recognize patterns
   Main operations
       Join
       Sort




Research Challenges

   Data Model
       Uncertainty
       User view vs System view
       How to get data?
   Query processing
       User-defined functions (UDF)
       CrowdSQL




Example DB Systems

   CrowdDB (Berkeley)
   Qurk (MIT)
   Scoop (Stanford)
   Hlog (Wisconsin)
   Freebase (Google)




Tutorial Outline

 Introduction
 Applications
 Platforms
 Challenges
 Opportunities




Opportunities

   Problems with the current platform
       Very rudimentary
       No tools for data analysis
       No integration with databases
       Very limited search and browse features
   Opportunities
       What is the database model for crowdsourcing?
       MapReduce with crowdsourcing
       Can you integrate human-computation into a language?
       Task management




Opportunities

   Quality control
     Human factors vs. outcomes
     Pricing tasks
     Predicting worker quality from observable properties (e.g.
      task completion time)
     HIT / Requester ranking or recommendation
     Expert search : who are the right workers given task nature
      and constraints
   Privacy
       Workers and requesters
       Tasks

Opportunities
   Crowdsourcing is cheap but not free, and cannot scale to the web
    without help
       How to scale out?
       Improve the accuracy and efficiency of human computation algorithms.
       Indexing & Pruning techniques
   Dealing with uncertainty
       Temporal and labeling uncertainty
       Learning algorithms
       Search evaluation
   Combining CPU + HPU
       MapReduce with human computation?
       Integration points with enterprise systems


Conclusion

   Crowdsourcing for relevance evaluation works
   Fast turnaround, easy to experiment, cheap
   Still have to design the experiments carefully!
   Quality
       Worker quality
       User feedback extremely useful
       Usability considerations
   Platform
       MTurk is a popular platform and others are emerging
       Lots of opportunities to improve current platforms
       Scale out

References - Tutorials
   AnHai Doan, Michael J. Franklin, Donald Kossmann, Tim Kraska: Crowdsourcing Applications
    and Platforms: A Data Management Perspective. PVLDB 4(12):1508-1509 (2011)
   Omar Alonso, Matthew Lease: Crowdsourcing for information retrieval: principles, methods,
    and applications. SIGIR 2011:1299-1300
   Omar Alonso, Matthew Lease: Crowdsourcing 101: putting the WSDM of crowds to work for
    you. WSDM 2011:1-2
   Panagiotis G. Ipeirotis, Praveen K. Paritosh: Managing crowdsourced human computation: a
    tutorial. WWW (Companion Volume) 2011:287-288
   Omar Alonso: Crowdsourcing for Information Retrieval Experimentation and Evaluation.
    CLEF 2011:2



   This tutorial (for research purposes only) reuses some slides from the above tutorials.
   Thanks to the authors.




References - Survey Papers

   Man-Ching Yuen, Irwin King, Kwong-Sak Leung: A Survey of
    Crowdsourcing Systems. SocialCom/PASSAT 2011:766-773
   Alexander J. Quinn, Benjamin B. Bederson: Human
    computation: a survey and taxonomy of a growing field. CHI
    2011:1403-1412
   Rajarshi Das, Maja Vukovic. Emerging theories and models of
    human computation systems: a brief survey. UbiCrowd
    2011.
   AnHai Doan, Raghu Ramakrishnan, Alon Y. Halevy:
    Crowdsourcing systems on the World-Wide Web. Commun.
    ACM (CACM) 54(4):86-96 (2011)


References - DB
   Adam Marcus, Eugene Wu, Samuel Madden, Robert C. Miller: Crowdsourced
    Databases: Query Processing with People. CIDR 2011:211-214
   Aditya G. Parameswaran, Neoklis Polyzotis: Answering Queries using Humans,
    Algorithms and Databases. CIDR 2011:160-166
   Salil S. Kanhere: Participatory Sensing: Crowdsourcing Data from Mobile
    Smartphones in Urban Spaces. Mobile Data Management 2011:3-6
   Michael J. Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, Reynold Xin:
    CrowdDB: answering queries with crowdsourcing. SIGMOD 2011:61-72
   Adam Marcus, Eugene Wu, David R. Karger, Samuel Madden, Robert C. Miller:
    Human-powered Sorts and Joins. PVLDB 5(1):13-24 (2011)
   Aditya G. Parameswaran, Anish Das Sarma, Hector Garcia-Molina, Neoklis
    Polyzotis, Jennifer Widom: Human-assisted graph search: it's okay to ask
    questions. PVLDB 4(5):267-278 (2011)
   Kurt D. Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor:
    Freebase: a collaboratively created graph database for structuring human
    knowledge. SIGMOD 2008:1247-1250
References - HCI
   Giordano Koch, Johann Füller, Sabine Brunswicker: Online Crowdsourcing in
    the Public Sector: How to Design Open Government Platforms. HCI
    2011:203-212
   Ido Guy, Adam Perer, Tal Daniel, Ohad Greenshpan, Itai Turbahn: Guess who?:
    enriching the social graph through a crowdsourcing game. CHI 2011:1373-
    1382
   Aniket Kittur, Boris Smus, Robert Kraut: CrowdForge: crowdsourcing complex
    work. CHI Extended Abstracts 2011:1801-1806
   Yasuaki Sakamoto,Yuko Tanaka, Lixiu Yu, Jeffrey V. Nickerson: The
    Crowdsourcing Design Space. HCI 2011:346-355
   Jon Noronha, Eric Hysen, Haoqi Zhang, Krzysztof Z. Gajos: Platemate:
    crowdsourcing nutritional analysis from food photographs. UIST 2011:1-12
   Aniket Kittur, Boris Smus, Susheel Khamkar, Robert E. Kraut: CrowdForge:
    crowdsourcing complex work. UIST 2011:43-52
   Jeffrey Heer, Michael Bostock: Crowdsourcing graphical perception: using
    mechanical turk to assess visualization design. CHI 2010:203-212

References - NLP

   Omar Zaidan, Chris Callison-Burch: Crowdsourcing Translation: Professional
    Quality from Non-Professionals. ACL 2011:1220-1229
   Derya Ozkan, Louis-Philippe Morency: Modeling Wisdom of Crowds Using
    Latent Mixture of Discriminative Experts. ACL (Short Papers) 2011:335-340
   Nitin Madnani, Martin Chodorow, Joel R. Tetreault, Alla Rozovskaya: They Can
    Help: Using Crowdsourcing to Improve the Evaluation of Grammatical Error
    Detection Systems. ACL (Short Papers) 2011:508-513
   Keith Vertanen, Per Ola Kristensson: The Imagination of Crowds:
    Conversational AAC Language Modeling using Crowdsourcing and Large Data
    Sources. EMNLP 2011:700-711
   Matteo Negri, Luisa Bentivogli,Yashar Mehdad, Danilo Giampiccolo, Alessandro
    Marchetti: Divide and Conquer: Crowdsourcing the Creation of Cross-Lingual
    Textual Entailment Corpora. EMNLP 2011:670-679




References - AI

   Yan Yan, Rómer Rosales, Glenn Fung, Jennifer G. Dy: Active
    Learning from Crowds. ICML 2011:1161-1168
   Peng Dai, Mausam, Daniel S. Weld: Decision-Theoretic Control of
    Crowd-Sourced Workflows. AAAI 2010
   Peter Welinder, Steve Branson, Serge Belongie, Pietro Perona: The
    Multidimensional Wisdom of Crowds. NIPS 2010:2424-2432
   Yen-ling Kuo, Jane Yung-jen Hsu: Resource-Bounded Crowd-
    Sourcing of Commonsense Knowledge. IJCAI 2011:2470-2475
   Edith Law, Haoqi Zhang: Towards Large-Scale Collaborative
    Planning: Answering High-Level Search Queries Using Human
    Computation. AAAI 2011



References - IR
   Gabriella Kazai: In Search of Quality in Crowdsourcing for Search Engine Evaluation. ECIR
    2011:165-176
   Omar Alonso, Ricardo A. Baeza-Yates: Design and Implementation of Relevance Assessments
    Using Crowdsourcing. ECIR 2011:153-164
   Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Duc
    Thanh Tran: Repeatable and reliable search system evaluation using crowdsourcing. SIGIR
    2011:923-932
   Matthew Lease,Vitor R. Carvalho, Emine Yilmaz: Crowdsourcing for search and data mining.
    SIGIR Forum (SIGIR) 45(1):18-24 (2011)
   Krishna Yeswanth Kamath, James Caverlee: Transient crowd discovery on the real-time social
    web. WSDM 2011:585-594
   Omar Alonso, Ralf Schenkel, Martin Theobald: Crowdsourcing Assessments for XML Ranked
    Retrieval. ECIR 2010:602-606
   Hao Ma, Raman Chandrasekar, Chris Quirk, Abhishek Gupta: Improving search engines using
    human computation games. CIKM 2009:275-284
   Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling: Crowdsourcing for book
    search evaluation: impact of hit design on comparative system ranking. SIGIR 2011:205-214


References - Theory

   Shuchi Chawla, Jason D. Hartline, Balasubramanian Sivan:
    Optimal crowdsourcing contests. SODA 2012:856-868




 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 

Dernier (20)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 

Crowdsourcing Challenges and Opportunities

  • 1. Crowdsourcing: Challenges and Opportunities Guoliang Li Tsinghua University
  • 2. Tutorial Objectives  What is crowdsourcing?  How and when to use crowdsourcing?  How to do experiments for crowdsourcing?  What are research challenges of crowdsourcing? Crowdsourcing @ HotDB2012 (2)
  • 3. Tutorial Outline  Introduction  Applications  Platforms  Challenges  Opportunities Crowdsourcing @ HotDB2012 (3)
  • 4. What is Crowdsourcing? $ / Customer  Outsourcing – 外包  A known agent (an employee) Large Businesses # of Customers  Crowdsourcing – 众包  An undefined, generally large group of people via a group call  The application of open source principles to fields outside of software  Most successfully story: Wikipedia Crowdsourcing @ HotDB2012 (4)
  • 5. Crowdsourcing Definition  Coordinating a crowd (a large group of people on the web) to do micro-work (small contributions) that solves problems (that software or one user can’t do)  A collection of mechanisms and associated methodologies for scaling and directing crowd activities to achieve goals volunteer fun social Evolving & broadly defined Crowdsourcing @ HotDB2012 (5)
  • 6. Example - Captcha  Captcha: 200M every day  ReCaptcha: 750M to date Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham and Manuel Blum. reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Science, 321: 1465-1468, 2008 Crowdsourcing @ HotDB2012 (6)
  • 7. NLP Example – Machine Translation  Machine Translation  Problem:  Manual evaluation on translation quality is slow and expensive  Crowdsourcing:  Low cost of non-experts, $0.10 to translate a sentence  High agreement/equivalent quality between non-experts and experts  Complex tasks like human-mediated translation edit rate  C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009.  B. Bederson et al. Translation by Iteractive Collaboration between Monolingual Users, GI 2010 Crowdsourcing @ HotDB2012 (7)
  • 8. IR Example – Image Search Tingxin Yan, Vikas Kumar, Deepak Ganesan: CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. MobiSys 2010:77-90 Crowdsourcing @ HotDB2012 (8)
  • 9. IR Example - Relevance and ads Omar Alonso, Daniel E. Rose, Benjamin Stewart: Crowdsourcing for relevance evaluation. SIGIR Forum (SIGIR) 42(2):9-15 (2008) Crowdsourcing @ HotDB2012 (9)
  • 10. CV Example - Painting Similarity  How similar is the artistic style in the paintings above?  Very similar  Similar  Somewhat dissimilar Human and Machine Detection of Stylistic Similarity in Art.  Very dissimilar Adriana Kovashka and Matthew Lease. CrowdConf 2010 Crowdsourcing @ HotDB2012 (10)
  • 11. Audio Example  Mobile service that aids blind users with “visual questions” in near-realtime  An iPhone application Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, Tom Yeh. VizWiz: nearly real-time answers to visual questions. UIST 2010 (Best Paper Award) http://www.cs.rochester.edu/u/jbigham/vizwiz/video/ Crowdsourcing @ HotDB2012 (11)
  • 12. DB Example - CrowdDB  Use crowd to answer DB queries  Where to use crowd?  How to use crowd?  How to support SQL?  How to devise a system?  Quality? Michael J. Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, Reynold Xin: CrowdDB: answering queries with crowdsourcing. SIGMOD 2011:61-72 Crowdsourcing @ HotDB2012 (12)
  • 13. Crowdsourcing Overview  Requester  People submit tasks Submit tasks Collect answers  Platforms Publish tasks  Task management Platforms Find interesting tasks Return answers  Worker  People work on tasks Crowdsourcing @ HotDB2012 (13)
  • 14. Crowdsourcing vs Human Computation  Human Computation  Design a solution using both automated computers and human computers  May be a closed set of workers  Crowdsourcing greatly facilitates human computation  Social computing  Social behavior  Collective intelligence  May not be human computation Alexander J. Quinn, Benjamin B. Bederson: Human computation: a survey and taxonomy of a growing field. CHI 2011:1403-1412 Crowdsourcing @ HotDB2012 (14)
  • 15. A Growing Field - Tutorials  WWW 2011: Managing Crowdsourced Human Computation  VLDB 2011: Crowdsourcing Applications and Platforms  SIGIR 2011: Crowdsourcing for Information Retrieval: : Principles, Methods and Applications  AAAI 2011: Human Computation: Core Research Questions and State of the Art  WSDM 2011: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You  CVPR 2010: Mechanical Turk for Computer Vision  ECIR 2010: Crowdsourcing for Relevance Evaluation  HCIC 2011: Quality Crowdsourcing for Human Computer Interaction Research  CrowdConf 2011: Crowdsourcing for Fun and Profit Crowdsourcing @ HotDB2012 (15)
  • 16. A Growing Field - Workshops  KDD 2009: 1st Human Computation Workshop - HCOMP 2009  KDD 2010: 2nd Human Computation Workshop - HCOMP 2010  AAAI 2011: 3rd Human Computation Workshop - HCOMP 2011  AAAI 2012: 4th Human Computation Workshop - HCOMP 2012  SIGIR 2010: Crowdsourcing for Search Evaluation  SIGIR 2011: Crowdsourcing for Information Retrieval  CVPR 2010: Advancing Computer Vision with Humans in the Loop  NAACL 2010: Creating Speech and Language Data with Amazon’s Mechanical Turk  NIPS 2010: Computational Social Science and Wisdom of the Crowds  Ubicomp 2010: Workshop on Ubiquitous Crowdsourcing  WSDM 2010: Crowdsourcing for Search and Data Mining Workshop  ICWE 2010: Enterprise Crowdsourcing Workshop  CHI 2011: Workshop on Crowdsourcing and Human Computation  AMTA 2010: Collaborative Translation Technology, Crowdsourcing and the Translator  EC 2011: Workshop on Social Computing and User Generated Content Crowdsourcing @ HotDB2012 (16)
  • 17. Many Related Areas  Human–Computer Interaction (HCI)  Artificial Intelligence (AI)  Machine Learning (ML)  Information Retrieval (IR)  Crowd Management (DB)  Social Science  Theory  Statistics Crowdsourcing @ HotDB2012 (17)
  • 18. Landscape Crowdsourcing @ HotDB2012 (18)
  • 19. Crowdsourcing Platforms  Are there any platforms? Crowdsourcing @ HotDB2012 (19)
  • 20. Amazon Mechanical Turk (Mturk) - www.mturk.com highly-available, cheap, programmable, a prototyping platform for crowd computing (20) Crowdsourcing @ HotDB2012
  • 21. Tasks  What are tasks? Crowdsourcing @ HotDB2012 (21)
  • 22. Why Micro-tasks  Cheap, Easy, and Fast  Ready to use infrastructures  Payments, workforce, interface widgets  Allow early, iterative, frequent trials  Test new ideas  Many successful examples  Image search  reCaptcha Crowdsourcing @ HotDB2012 (22)
  • 23. Human Intelligence Tasks (HITs)  Human Intelligence Tasks – micro tasks  Requesters create (HITs)  web services API/Dashboard  assess results, pay per HIT satisfactorily completed.  Workers (sometimes called “Turkers”)  log in, choose HITs, perform them.  Currently >200,000 workers from 100 countries  Millions of HITs completed Crowdsourcing @ HotDB2012 (23)
  • 24. Requester & Worker  Requester: Build HIT → Test HIT → Post HIT → Reject or approve HIT  Worker: Search for HITs → Accept HIT → Do work → Submit HIT Crowdsourcing @ HotDB2012 (24)
  • 25. The Requester  Sign up with your Amazon account  Amazon payments  Purchase prepaid HITs  There is no minimum or up-front fee  AMT collects a 10% commission  The minimum commission charge is $0.005 per HIT  Approve/reject answers Crowdsourcing @ HotDB2012 (25)
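The fee arithmetic above is easy to get wrong when budgeting an experiment. A minimal sketch, assuming the fee schedule quoted on this slide (10% commission, $0.005 minimum per assignment); current AMT pricing differs, so treat the numbers as illustrative:

```python
# Rough budget for an MTurk experiment, assuming the fee figures quoted above
# (10% commission, $0.005 minimum); real AMT pricing has changed since 2012.
def estimate_cost(num_hits, assignments_per_hit, reward,
                  commission=0.10, min_fee=0.005):
    fee = max(reward * commission, min_fee)          # commission per assignment
    return num_hits * assignments_per_hit * (reward + fee)

# 1,000 HITs x 5 workers each x $0.02 per answer  ->  $125.00
print(f"${estimate_cost(1000, 5, 0.02):.2f}")
```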
  • 26. Dashboard  Three tabs  Design  Publish  Manage  Design  HIT Template  Publish  Make work available  Manage  Monitor progress Crowdsourcing @ HotDB2012 (26)
  • 27. Amazon Web Services API  Rich set of services  Command line tools  More flexibility than dashboard  CreateHIT (Requirements, Pay rate, Description) – returns HIT Id and HIT Type Id  SubmitAssignment (AssignmentId) – notifies Amazon that this assignment has been completed  ApproveAssignment (AssignmentID) – Requester accepts assignment, money is transferred, also RejectAssignment  GrantBonus (WorkerID, Amount, Message) – Give the worker the specified bonus and sends message, should have a failsafe  NotifyWorkers (list of WorkerIds, Message) – e-mails message to the workers. Crowdsourcing @ HotDB2012 (27)
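For readers working from a modern stack, the same create/approve/bonus/notify cycle can be driven from Python. A hedged sketch using the boto3 MTurk client, which postdates this tutorial; the sandbox endpoint, the question.xml form, the reward amounts, and the looks_reasonable() check are placeholders, not values from the deck:

```python
# Requester-side loop sketched with boto3's MTurk client; all IDs, amounts,
# the question form, and looks_reasonable() are placeholders for illustration.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",  # sandbox
)

QUESTION_XML = open("question.xml").read()   # an ExternalQuestion/HTMLQuestion form

def looks_reasonable(answer_xml):
    return "<FreeText>" in answer_xml        # stand-in for a real quality check

# CreateHIT - returns HIT Id and HIT Type Id
hit = mturk.create_hit(
    Title="Judge image relevance",
    Description="Decide whether the image matches the query",
    Reward="0.02",
    MaxAssignments=5,
    AssignmentDurationInSeconds=300,
    LifetimeInSeconds=86400,
    Question=QUESTION_XML,
)["HIT"]

# Collect submitted assignments, then approve/reject and pay bonuses
workers = []
resp = mturk.list_assignments_for_hit(HITId=hit["HITId"],
                                      AssignmentStatuses=["Submitted"])
for a in resp["Assignments"]:
    if looks_reasonable(a["Answer"]):
        mturk.approve_assignment(AssignmentId=a["AssignmentId"])
        mturk.send_bonus(WorkerId=a["WorkerId"], BonusAmount="0.01",
                         AssignmentId=a["AssignmentId"],
                         Reason="Thanks for the optional feedback")
        workers.append(a["WorkerId"])
    else:
        mturk.reject_assignment(AssignmentId=a["AssignmentId"],
                                RequesterFeedback="Answer did not follow the instructions")

# NotifyWorkers - e-mail a message to selected workers
if workers:
    mturk.notify_workers(Subject="New batch posted",
                         MessageText="More HITs of the same type are available.",
                         WorkerIds=workers)
```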
  • 28. Dashboard vs API  Dashboard  Easy to prototype  Setup and launch an experiment in a few minutes  API  Ability to integrate AMT as part of a system  Ideal if you want to run experiments regularly  Schedule tasks Crowdsourcing @ HotDB2012 (28)
  • 29. The Worker  Sign up with your Amazon account  Tabs  Account: work approved/rejected  HIT: browse and search for work  Qualifications: browse and search for qualifications Crowdsourcing @ HotDB2012 (29)
  • 30. Why Do Work on MTurk?  Money ($$$)  Fun (or avoid boredom)  Socialize  Earn acclaim/prestige  Altruism  Learn something new (e.g. English)  Unintended by-product (e.g. re-Captcha)  Create self-serving resource (e.g. Wikipedia)  Multiple incentives are typically at work in parallel Crowdsourcing @ HotDB2012 (30)
  • 31. Who Are My Workers?  2008-2009 studies found less global and diverse than previously thought  47% US, 34% India, 19% others  Female  Educated  Bored  Money is secondary Crowdsourcing @ HotDB2012 (31)
  • 32. Survey on Workers  “Mturk money is always necessary to make ends meet.”  5% U.S. 13% India  “Mturk money is irrelevant.”  12% U.S. 10% India  “Mturk is a fruitful way to spend free time and get some cash.”  69% U.S. 59% India Crowdsourcing @ HotDB2012 (32)
  • 33. Mturk  Advantages:  More participants  More diverse participants  High speed  Low cost  Speed of experimentation  Diversity  Disadvantages/Limitations:  Lower quality feedback  Less interaction  Greater need for quality control  Less focused user groups  No control of users’ environment  Not designed for user studies  Spam  Uncertainty about user demographics, expertise  Lots of problems and missing features  Crowdsourcing != MTurk Crowdsourcing @ HotDB2012 (33)
  • 34. Pay-based Marketplaces / Vendors  Amazon Mechanical Turk (since 2005, www.mturk.com)  Crowdflower (since 2007, www.crowdflower.com)  CloudCrowd (www.cloudcrowd.com/)  DoMyStuff (www.domystuff.com/)  Livework (https://www.livework.com/)  Clickworker (www.clickworker.com/)  SmartSheet (www.smartsheet.com/crowdsourcing)  uTest (www.utest.com/)  Elance (www.elance.com/)  oDesk (www.odesk.com/)  vWorker (www.vworker.com/) Crowdsourcing @ HotDB2012 (34)
  • 35. Microtask Aggregators Crowdsourcing @ HotDB2012 (35)
  • 36. Samasource.org How and when to use crowdsourcing? Crowdsourcing @ HotDB2012 (36)
  • 37. How and when to use crowdsourcing?
  • 38. When to use crowdsourcing  Computers cannot do  A single person cannot do  The work can be split into small tasks Crowdsourcing @ HotDB2012 (38)
  • 39. How to Do Experiments in Crowdsourcing  Experimental Design  Choose crowdsourcing platform  Decompose your tasks into micro tasks  Publish your tasks and wait for answers  Aggregate workers’ answers Crowdsourcing @ HotDB2012 (39)
  • 40. Tutorial Outline  Introduction  Applications  Platforms  Challenges  Opportunities Crowdsourcing @ HotDB2012 (40)
  • 41. Research Challenges in Crowdsourcing  Task Management  Task assignment, payment, discover  Human–Computer Interaction  Payment / incentives, interface and interaction design, communication, reputation, recruitment, retention  Quality Control / Data Quality  Trust, reliability, spam detection, consensus labeling  Human-Processing Unit (HPU) and CPU  How to combine  Scalability  Large scale data Crowdsourcing @ HotDB2012 (41)
  • 42. Task Management  How to decompose a complex task?  How to find tasks for workers?  How much is the payment of a HIT? Crowdsourcing @ HotDB2012 (42)
  • 43. Supporting Complex Tasks  Mturk works only for small tasks  How to support complex tasks?  Task decomposition – large tasks are divided into small problems  Job distributed among multiple workers  Collect all answers and combine them  Verifying performance of heterogeneous CPUs and HPUs Crowdsourcing @ HotDB2012 (43)
  • 44. CrowdForge - MapReduce framework for crowds  Aniket Kittur, Boris Smus, Robert Kraut: CrowdForge : crowdsourcing complex work. CHI Extended Abstracts 2011:1801-1806 Crowdsourcing @ HotDB2012 (44)
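To make the partition/map/reduce idea concrete, here is a toy pipeline in the spirit of CrowdForge; ask_crowd() is a hypothetical stand-in for "post a micro-task and wait for the answer", simulated here so the sketch runs end to end without any crowd platform:

```python
# Toy partition/map/reduce over crowd tasks, in the spirit of CrowdForge.
# ask_crowd() is a hypothetical stand-in for posting a micro-task; it is
# simulated here so the example runs without a crowd platform.
def ask_crowd(prompt):
    return f"[crowd answer to: {prompt[:40]}...]"

def partition(topic):
    # Partition step: one worker proposes an outline; each heading becomes a sub-task.
    ask_crowd(f"List 3 section headings for an article about {topic}")
    return [f"{topic} - section {i}" for i in range(1, 4)]   # parsed headings (simplified)

def map_step(sections):
    # Map step: many workers each write one section (in parallel in a real system).
    return [ask_crowd(f"Write a short paragraph for: {s}") for s in sections]

def reduce_step(paragraphs):
    # Reduce step: a final worker merges the pieces into one coherent article.
    return ask_crowd("Combine these paragraphs into one article:\n" + "\n".join(paragraphs))

print(reduce_step(map_step(partition("the history of crowdsourcing"))))
```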
  • 45. How to Discover Tasks?  Task discovery is very important.  Heavy tailed distribution of completion times. Crowdsourcing @ HotDB2012 (45)
  • 46. Task Assignment  Push Methods: system ➠ workers  The system takes complete control over who is assigned which task.  Worker expertise recording for task assignment (employer/task finds worker)  Pull Methods: workers ➠ system  The system merely sets up the environment to allow workers to assign themselves (or each other) tasks.  Task organization for task discovery (worker finds employer/task) Crowdsourcing @ HotDB2012 (46)
  • 47. Task Recommendation  Content-Based recommendation  find similarities between worker profile and task characteristics.  Collaborative Filtering  make use of preference information about tasks (e.g., ratings) to infer similarities between workers.  Hybrid  a mix of content-based and collaborative filtering methods. Crowdsourcing @ HotDB2012 (47)
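A minimal sketch of the content-based option, assuming both worker profiles and tasks are described by keyword bags (the data below is invented); collaborative filtering would instead infer similarity from past ratings:

```python
# Content-based task recommendation: rank tasks by cosine similarity between a
# worker's profile keywords and each task's keywords. Profiles/tasks are made up.
from collections import Counter
import math

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

worker = Counter("image labeling vision photos".split())
tasks = {
    "Label street-scene photos": Counter("image labeling photos traffic".split()),
    "Translate product reviews": Counter("translation english spanish reviews".split()),
}
for name in sorted(tasks, key=lambda t: cosine(worker, tasks[t]), reverse=True):
    print(f"{cosine(worker, tasks[name]):.2f}  {name}")
```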
  • 48. Task Payments  How much is a HIT?  Delicate balance  Too little, no interest  Too much, attract spammers  Paying a lot is a counter‐incentive  Money does not improve quality but (generally) increase participation  Payment based on user effort- Bonus  Example: $0.04 (2 cents to answer a yes/no question, 2 cents if you provide feedback that is not mandatory) Winter A. Mason, Duncan J. Watts: Financial incentives and the "performance of crowds". SIGKDD Explorations (SIGKDD) 11(2):100-108 (2009) Crowdsourcing @ HotDB2012 (48)
  • 49. Optimization Goal  3 main goals for a task to be done:  Minimize Cost (cheap)  Minimize Completion Time (fast)  Maximize Quality (good)  Many optimization problems to tradeoff the three goals Crowdsourcing @ HotDB2012 (49)
  • 50. Human-Assisted Graph Search/Classification  Given a DAG G,  Containing unknown target nodes  Find target nodes by asking humans search queries at nodes in G  “is there a target node reachable from the current node?”  Applications  Classification, workflow debugging, interactive search Aditya G. Parameswaran, Anish Das Sarma, Hector Garcia-Molina, Neoklis Polyzotis, Jennifer Widom: Human-assisted graph search: it's okay to ask questions. PVLDB 4(5):267-278 (2011) Crowdsourcing @ HotDB2012 (50)
  • 51. CrowdGraphSearch - Application  Classify an image Crowdsourcing @ HotDB2012 (51)
  • 52. CrowdGraphSearch - Optimization  Do not ask all nodes or serially  Too expensive or too slow  Given a limit of k questions, find the best nodes to ask in parallel  Minimize the size of the candidate superset  To classify into a huge taxonomy, first ask k questions  Is the superset small enough? Crowdsourcing @ HotDB2012 (52)
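One simple way to choose which k nodes to ask, sketched below: prefer nodes whose reachable sets split the remaining candidates roughly in half, so each answer prunes as much as possible. This is an illustrative heuristic only, not the algorithm of Parameswaran et al.:

```python
# Toy question selection for human-assisted graph search: pick the k candidate
# nodes whose descendant sets split the remaining candidates most evenly.
def descendants(tree, node):
    out, stack = set(), [node]
    while stack:
        n = stack.pop()
        for c in tree.get(n, []):
            if c not in out:
                out.add(c)
                stack.append(c)
    return out | {node}

def pick_questions(tree, candidates, k):
    def balance(n):
        inside = len(descendants(tree, n) & candidates)
        return abs(inside - len(candidates) / 2)     # closer to a 50/50 split is better
    return sorted(candidates, key=balance)[:k]

# A tiny taxonomy: root -> {animal, vehicle}, animal -> {dog, cat}, vehicle -> {car}
tree = {"root": ["animal", "vehicle"], "animal": ["dog", "cat"], "vehicle": ["car"]}
candidates = {"animal", "vehicle", "dog", "cat", "car"}
print(pick_questions(tree, candidates, 2))
```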
  • 53. CrowdSearch: Using crowd to improve image search  How to ensure a high enough accuracy, say over 95%? Tingxin Yan, Vikas Kumar, Deepak Ganesan: CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones. MobiSys 2010:77-90 Crowdsourcing @ HotDB2012 (53)
  • 54. CrowdSearch - Optimization  To minimize money  Increase delay  To minimize delay  Increase money  Goal: return one validated image before the deadline, while minimizing the money spent Crowdsourcing @ HotDB2012 (54)
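The trade-off can be made concrete with a toy planner: validate candidates serially when the deadline allows (cheapest, since you stop at the first "yes"), and only raise parallelism as far as the deadline forces. The delay figures and the decision rule below are invented for illustration, not CrowdSearch's actual policy:

```python
# Toy cost/latency planner: how many validation tasks to post per round so the
# worst case still meets the deadline. Numbers and rule are illustrative only.
def tasks_per_round(num_candidates, per_task_delay, deadline):
    serial_time = num_candidates * per_task_delay
    if serial_time <= deadline:
        return 1                                  # enough slack: go one at a time
    rounds = max(1, int(deadline // per_task_delay))
    return -(-num_candidates // rounds)           # ceil: just enough parallelism

# 5 candidate images, ~60 s per validation, 120 s deadline -> 3 tasks per round
print(tasks_per_round(5, 60, 120))
```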
  • 55. CrowdSearch - Optimization Crowdsourcing @ HotDB2012 (55)
  • 56. CrowdSearch - Optimization Crowdsourcing @ HotDB2012 (56)
  • 57. Quality - Example  Get people to look at sites and classify them as:  G(general audience)  PG (parental guidance)  R (restricted)  X (porn) Crowdsourcing @ HotDB2012 (57)
  • 58. Quality Control  Quality of workers’ answers is extremely important part of the experiment  Approach it as “overall” quality – not just for workers  Bi-directional channel  You may think the worker is doing a bad job.  The same worker may think you are a lousy requester. Crowdsourcing @ HotDB2012 (58)
  • 59. Methods for Measuring Agreement  What to look for  Agreement, reliability, validity  Beforehand  Qualification test  Screening, selection, recruiting, training  During  Assess labels as workers produce them  Reward, penalize, weight  After  Accuracy metrics  Filter, weight Crowdsourcing @ HotDB2012 (59)
  • 60. Qualification Tests: Pros and Cons  Qualification test  create questions on topics so user gets familiar before starting assessments  Advantages  Great tool for controlling quality  Adjust passing grade  Disadvantages  Hard to verify subjective tasks like judging relevance  Slows down the experiment, difficult to “test” relevance  Extra cost to design and implement the test  Try creating task-related questions to get worker familiar with task before starting task in earnest  No guarantees - Still not a guarantee of good outcome Crowdsourcing @ HotDB2012 (60)
  • 61. Quality Control  Majority vote  2 bad, 3 good  good  3 bad, 2 good  bad  Weighted majority vote  Identify workers that always disagree with the majority  Lower down the weight of such workers Crowdsourcing @ HotDB2012 (61)
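Both rules are easy to state in code. A minimal sketch: plain majority vote, plus one simple iterative re-weighting scheme that down-weights workers who keep disagreeing with the consensus (an illustration, not a particular published estimator):

```python
# Majority vote plus a simple weighted variant that down-weights workers who
# repeatedly disagree with the consensus (illustrative, not from a specific paper).
from collections import Counter, defaultdict

def majority(votes_for_task):                      # e.g. {"w1": "good", "w2": "bad", ...}
    return Counter(votes_for_task.values()).most_common(1)[0][0]

def weighted_majority(votes, rounds=3):            # votes: {task: {worker: label}}
    weights = defaultdict(lambda: 1.0)
    consensus = {}
    for _ in range(rounds):
        for task, wv in votes.items():
            score = defaultdict(float)
            for worker, label in wv.items():
                score[label] += weights[worker]
            consensus[task] = max(score, key=score.get)
        # Re-estimate each worker's weight as agreement with the current consensus.
        for worker in {w for wv in votes.values() for w in wv}:
            answered = [(t, wv[worker]) for t, wv in votes.items() if worker in wv]
            weights[worker] = sum(l == consensus[t] for t, l in answered) / len(answered)
    return consensus, dict(weights)

votes = {
    "q1": {"w1": "good", "w2": "good", "w3": "bad"},
    "q2": {"w1": "bad",  "w2": "bad",  "w3": "good"},
    "q3": {"w1": "good", "w2": "good", "w3": "bad"},
}
print(majority(votes["q1"]))        # -> 'good'
print(weighted_majority(votes))     # w3's weight drops to 0 after the first round
```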
  • 62. Dealing with Bad Workers  Pay for “bad” work instead of rejecting it?  Pro: preserve reputation, admit if poor design at fault  Con: promote fraud, undermine approval rating system  Use bonus as incentive  Pay the minimum $0.01 and $0.01 for bonus  Better than rejecting a $0.02 task  Worker blocking - spammer “caught”, block from future tasks  May be easier to always pay, then block as needed Crowdsourcing @ HotDB2012 (62)
  • 63. Emails after Rejection Crowdsourcing @ HotDB2012 (63)
  • 64. Emails after Rejection  WORKER: this is not fair , you made me work for 10 cents and i lost my 30 minutes of time ,power and lot more and gave me 2 rejections atleast you may keep it pending. please show some respect to turkers  WORKER: I understood the problems. At that time my kid was crying and i went to look after. that's why i responded like that. I was very much worried about a hit being rejected. The real fact is that i haven't seen that instructions of 5 web page and started doing as i do the dolores labs hit, then someone called me and i went to attend that call. sorry for that and thanks for your kind concern. Crowdsourcing @ HotDB2012 (64)
  • 65. Gold Sets / Honey Pots  Gold derived from  Experts  Crowd using high quorum  Interject trap questions  Block users in trap and invalidate answers  Pros  Often very effective  Cost efficient  Cons  Not always applicable  Digging gold is hard Crowdsourcing @ HotDB2012 (65)
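A minimal honey-pot filter along these lines, with invented thresholds and data: interleave known-answer questions with real ones, then invalidate workers whose gold accuracy is too low:

```python
# Honey-pot filtering: mix known-answer "gold" questions into each batch, then
# drop workers whose gold accuracy falls below a threshold. Data is illustrative.
GOLD = {"g1": "yes", "g2": "no"}          # trap questions with known answers

answers = {   # worker -> {question: answer}, gold and real questions interleaved
    "w1": {"g1": "yes", "g2": "no",  "q7": "cat"},
    "w2": {"g1": "no",  "g2": "no",  "q7": "dog"},
}

def gold_accuracy(worker_answers):
    hits = [worker_answers[q] == a for q, a in GOLD.items() if q in worker_answers]
    return sum(hits) / len(hits) if hits else None

trusted = {w for w, ans in answers.items()
           if (acc := gold_accuracy(ans)) is not None and acc >= 0.8}
print(trusted)   # -> {'w1'}; w2's answers would be invalidated
```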
  • 67. Quality Control  As a worker  I hate when instructions are not clear  I’m not a spammer – I just don’t get what you want  A good pay is ideal but not the only condition for engagement  As a requester  A task that would produce the right results and is appealing to workers  I want your honest answer for the task  I want qualified workers and I want the system to do some of that for me  Managing crowds and tasks is a daily activity and more difficult than managing computers Crowdsourcing @ HotDB2012 (67)
  • 68. Human-Computer Interaction (HCI)  Getting input from users is important in HCI  surveys  rapid prototyping  usability tests  cognitive walkthroughs  performance measures  quantitative ratings Crowdsourcing @ HotDB2012 (68)
  • 69. HCI - The UI hurts the market!  Practitioners know that HITs on the 3rd page and beyond are rarely picked by workers.  Many such HITs are left to expire after months, never completed.  A badly designed task-discovery interface hurts every participant in the market!  Better modeling as a queuing system may reveal other such improvements Crowdsourcing @ HotDB2012 (69)
  • 70. HCI  Survey design  Questionnaire design  Right questions  Examples  UI design  Implementation  Generic tips  Experiment should be self-contained.  Keep it short and simple. Brief and concise.  Be very clear with the relevance task.  Engage with the worker. Avoid boring stuff.  Always ask for feedback (open-ended question) in an input box. Crowdsourcing @ HotDB2012 (70)
  • 71. Other Design Principles  Text alignment  Legibility  Reading level: complexity of words and sentences  Attractiveness (worker’s attention & enjoyment)  Multi-cultural / multi-lingual  Who is the audience (e.g. target worker community)  Special needs communities (e.g. simple color blindness) Crowdsourcing @ HotDB2012 (71)
  • 72. System Design Crowdsourcing @ HotDB2012 (72)
  • 73. DB & Crowd  How can crowd help databases?  Fix broken data  Add missing data  Subjective comparison  How can DB help crowd apps?  Lazy data acquisition  Game the workers market  Semi-automatically create user interfaces  Manage the data sourced from the crowd Crowdsourcing @ HotDB2012 (73)
  • 74. CrowdDB  Use crowd to answer DB queries  Find missing data  Make subjective comparison  Recognize patterns  Main operations  Join  Sort Crowdsourcing @ HotDB2012 (74)
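As a flavour of what a crowd-powered sort operator does, here is a toy version: gather pairwise preference votes and rank items by how often they win. The pairwise oracle is simulated, and real systems (CrowdDB, Qurk) optimize which comparisons to ask rather than posting all O(n^2) of them:

```python
# Toy "crowd comparator" sort: collect pairwise preference votes and rank items
# by wins. Purely illustrative of crowd-powered sort, not CrowdDB's optimizer.
from collections import Counter
from itertools import combinations

def ask_crowd_pair(a, b):
    # Stand-in for a HIT such as "Which picture is nicer, a or b?"; simulated here.
    return a if len(a) >= len(b) else b

def crowd_sort(items):
    wins = Counter()
    for a, b in combinations(items, 2):      # all pairs; real systems sample/batch
        wins[ask_crowd_pair(a, b)] += 1
    return sorted(items, key=lambda x: wins[x], reverse=True)

print(crowd_sort(["ok", "great", "fine"]))   # -> ['great', 'fine', 'ok']
```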
  • 75. Research Challenges  Data Model  Uncertainty  User view vs System view  How to get data?  Query processing  User-defined functions (UDF)  CrowdSQL Crowdsourcing @ HotDB2012 (75)
  • 76. Example DB Systems  CrowdDB (Berkeley)  Qurk (MIT)  Scoop (Stanford)  Hlog (Wisconsin)  Freebase (Google) Crowdsourcing @ HotDB2012 (76)
  • 77. Tutorial Outline  Introduction  Applications  Platforms  Challenges  Opportunities Crowdsourcing @ HotDB2012 (77)
  • 78. Opportunities  Problems with the current platform  Very rudimentary  No tools for data analysis  No integration with databases  Very limited search and browse features  Opportunities  What is the database model for crowdsourcing?  MapReduce with crowdsourcing  Can you integrate human-computation into a language?  Task management Crowdsourcing @ HotDB2012 (78)
  • 79. Opportunities  Quality control  Human factors vs. outcomes  Pricing tasks  Predicting worker quality from observable properties (e.g. task completion time)  HIT / Requestor ranking or recommendation  Expert search : who are the right workers given task nature and constraints  Privacy  Workers and requesters  Tasks Crowdsourcing @ HotDB2012 (79)
  • 80. Opportunities  Crowdsourcing is cheap but not free, and cannot scale to web without help  How to scale out?  Improve the accuracy and efficiency of human computation algorithms.  Indexing & Pruning techniques  Dealing with uncertainty  Temporal and labeling uncertainty  Learning algorithms  Search evaluation  Combining CPU + HPU  MapReduce with human computation?  Integration points with enterprise systems Crowdsourcing @ HotDB2012 (80)
  • 81. Conclusion  Crowdsourcing for relevance evaluation works  Fast turnaround, easy to experiment, cheap  Still have to design the experiments carefully!  Quality  Worker quality  User feedback extremely useful  Usability considerations  Platform  MTurk is a popular platform and others are emerging  Lots of opportunities to improve current platforms  Scale out Crowdsourcing @ HotDB2012 (81)
  • 82. References - Tutorials  AnHai Doan, Michael J. Franklin, Donald Kossmann, Tim Kraska: Crowdsourcing Applications and Platforms: A Data Management Perspective. PVLDB 4(12):1508-1509 (2011)  Omar Alonso, Matthew Lease: Crowdsourcing for information retrieval: principles, methods, and applications. SIGIR 2011:1299-1300  Omar Alonso, Matthew Lease: Crowdsourcing 101: putting the WSDM of crowds to work for you. WSDM 2011:1-2  Panagiotis G. Ipeirotis, Praveen K. Paritosh: Managing crowdsourced human computation: a tutorial. WWW (Companion Volume) 2011:287-288  Omar Alonso: Crowdsourcing for Information Retrieval Experimentation and Evaluation. CLEF 2011:2  This tutorial (for research purpose only) used some slides in the above tutorials.  Thank the authors. Crowdsourcing @ HotDB2012 (82)
  • 83. References - Survey Papers  Man-Ching Yuen, Irwin King, Kwong-Sak Leung: A Survey of Crowdsourcing Systems. SocialCom/PASSAT 2011:766-773  Alexander J. Quinn, Benjamin B. Bederson: Human computation: a survey and taxonomy of a growing field. CHI 2011:1403-1412  Rajarshi Das, Maja Vukovic. Emerging theories and models of human computation systems: a brief survey. UbiCrowd 2011.  AnHai Doan, Raghu Ramakrishnan, Alon Y. Halevy: Crowdsourcing systems on the World-Wide Web. Commun. ACM (CACM) 54(4):86-96 (2011) Crowdsourcing @ HotDB2012 (83)
  • 84. References - DB  Adam Marcus, Eugene Wu, Samuel Madden, Robert C. Miller: Crowdsourced Databases: Query Processing with People. CIDR 2011:211-214  Aditya G. Parameswaran, Neoklis Polyzotis: Answering Queries using Humans, Algorithms and Databases. CIDR 2011:160-166  Salil S. Kanhere: Participatory Sensing: Crowdsourcing Data from Mobile Smartphones in Urban Spaces. Mobile Data Management 2011:3-6  Michael J. Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, Reynold Xin: CrowdDB: answering queries with crowdsourcing. SIGMOD 2011:61-72  Adam Marcus, Eugene Wu, David R. Karger, Samuel Madden, Robert C. Miller: Human-powered Sorts and Joins. PVLDB 5(1):13-24 (2011)  Aditya G. Parameswaran, Anish Das Sarma, Hector Garcia-Molina, Neoklis Polyzotis, Jennifer Widom: Human-assisted graph search: it's okay to ask questions. PVLDB 4(5):267-278 (2011)  Kurt D. Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor: Freebase: a collaboratively created graph database for structuring human knowledge. SIGMOD 2008:1247-1250 Crowdsourcing @ HotDB2012 (84)
  • 85. References - HCI  Giordano Koch, Johann Füller, Sabine Brunswicker: Online Crowdsourcing in the Public Sector: How to Design Open Government Platforms. HCI 2011:203-212  Ido Guy, Adam Perer, Tal Daniel, Ohad Greenshpan, Itai Turbahn: Guess who?: enriching the social graph through a crowdsourcing game. CHI 2011:1373- 1382  Aniket Kittur, Boris Smus, Robert Kraut: CrowdForge: crowdsourcing complex work. CHI Extended Abstracts 2011:1801-1806  Yasuaki Sakamoto,Yuko Tanaka, Lixiu Yu, Jeffrey V. Nickerson: The Crowdsourcing Design Space. HCI 2011:346-355  Jon Noronha, Eric Hysen, Haoqi Zhang, Krzysztof Z. Gajos: Platemate: crowdsourcing nutritional analysis from food photographs. UIST 2011:1-12  Aniket Kittur, Boris Smus, Susheel Khamkar, Robert E. Kraut: CrowdForge: crowdsourcing complex work. UIST 2011:43-52  Jeffrey Heer, Michael Bostock: Crowdsourcing graphical perception: using mechanical turk to assess visualization design. CHI 2010:203-212 Crowdsourcing @ HotDB2012 (85)
  • 86. References - NLP  Omar Zaidan, Chris Callison-Burch: Crowdsourcing Translation: Professional Quality from Non-Professionals. ACL 2011:1220-1229  Derya Ozkan, Louis-Philippe Morency: Modeling Wisdom of Crowds Using Latent Mixture of Discriminative Experts. ACL (Short Papers) 2011:335-340  Nitin Madnani, Martin Chodorow, Joel R. Tetreault, Alla Rozovskaya: They Can Help: Using Crowdsourcing to Improve the Evaluation of Grammatical Error Detection Systems. ACL (Short Papers) 2011:508-513  Keith Vertanen, Per Ola Kristensson: The Imagination of Crowds: Conversational AAC Language Modeling using Crowdsourcing and Large Data Sources. EMNLP 2011:700-711  Matteo Negri, Luisa Bentivogli,Yashar Mehdad, Danilo Giampiccolo, Alessandro Marchetti: Divide and Conquer: Crowdsourcing the Creation of Cross-Lingual Textual Entailment Corpora. EMNLP 2011:670-679 Crowdsourcing @ HotDB2012 (86)
  • 87. References - AI  Yan Yan, Rómer Rosales, Glenn Fung, Jennifer G. Dy: Active Learning from Crowds. ICML 2011:1161-1168  Peng Dai, Mausam, Daniel S. Weld: Decision-Theoretic Control of Crowd-Sourced Workflows. AAAI 2010  Peter Welinder, Steve Branson, Serge Belongie, Pietro Perona: The Multidimensional Wisdom of Crowds. NIPS 2010:2424-2432  Yen-ling Kuo, Jane Yung-jen Hsu: Resource-Bounded Crowd- Sourcing of Commonsense Knowledge. IJCAI 2011:2470-2475  Edith Law, Haoqi Zhang: Towards Large-Scale Collaborative Planning: Answering High-Level Search Queries Using Human Computation. AAAI 2011 Crowdsourcing @ HotDB2012 (87)
  • 88. References - IR  Gabriella Kazai: In Search of Quality in Crowdsourcing for Search Engine Evaluation. ECIR 2011:165-176  Omar Alonso, Ricardo A. Baeza-Yates: Design and Implementation of Relevance Assessments Using Crowdsourcing. ECIR 2011:153-164  Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Duc Thanh Tran: Repeatable and reliable search system evaluation using crowdsourcing. SIGIR 2011:923-932  Matthew Lease,Vitor R. Carvalho, Emine Yilmaz: Crowdsourcing for search and data mining. SIGIR Forum (SIGIR) 45(1):18-24 (2011)  Krishna Yeswanth Kamath, James Caverlee: Transient crowd discovery on the real-time social web. WSDM 2011:585-594  Omar Alonso, Ralf Schenkel, Martin Theobald: Crowdsourcing Assessments for XML Ranked Retrieval. ECIR 2010:602-606  Hao Ma, Raman Chandrasekar, Chris Quirk, Abhishek Gupta: Improving search engines using human computation games. CIKM 2009:275-284  Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling: Crowdsourcing for book search evaluation: impact of hit design on comparative system ranking. SIGIR 2011:205-214 Crowdsourcing @ HotDB2012 (88)
  • 89. References - Theory  Shuchi Chawla, Jason D. Hartline, Balasubramanian Sivan: Optimal crowdsourcing contests. SODA 2012:856-868 Crowdsourcing @ HotDB2012 (89)