Crowdsourcing for Human Computer
Interaction Research


Ed H. Chi

Research Scientist
Google

(work done while at [Xerox] PARC with Aniket Kittur)
User studies

•  Getting input from users is important in HCI
   –    surveys
   –    rapid prototyping
   –    usability tests
   –    cognitive walkthroughs
   –    performance measures
   –    quantitative ratings
User studies

•  Getting input from users is expensive
   –  Time costs
   –  Monetary costs
•  Often have to trade off costs with sample size
Online solutions

•    Online user surveys
•    Remote usability testing
•    Online experiments
•    But still have difficulties
     –  Rely on practitioner for recruiting participants
     –  Limited pool of participants
Crowdsourcing

•  Make tasks available for anyone online to complete
•  Quickly access a large user pool, collect data, and
   compensate users
•  Example: NASA Clickworkers
    –  100k+ volunteers identified Mars craters from
       space photographs
    –  Aggregate results virtually indistinguishable from
       expert geologists

[Figure: crater identifications by experts vs. crowds]
http://clickworkers.arc.nasa.gov
Amazon's Mechanical Turk

•  Market for human intelligence tasks
•  Typically short, objective tasks
   –  Tag an image
   –  Find a webpage
   –  Evaluate relevance of search results
•  Users complete tasks for a few pennies each
Example task
Using Mechanical Turk for user studies

                      Traditional user studies    Mechanical Turk
Task complexity       Complex, long               Simple, short
Task subjectivity     Subjective, opinions        Objective, verifiable
User information      Targeted demographics,      Unknown demographics,
                      high interactivity          limited interactivity


    Can Mechanical Turk be useful for user studies?
Task

•  Assess quality of Wikipedia articles
•  Started with ratings from expert Wikipedians
    –  14 articles (e.g., "Germany", "Noam Chomsky")
    –  7-point scale
•  Can we get matching ratings with Mechanical Turk?
Experiment 1

•  Rate articles on 7-point scales:
   –  Well written
   –  Factually accurate
   –  Overall quality
•  Free-text input:
   –  What improvements does the article need?
•  Paid $0.05 each
Experiment 1: Good news

•  58 users made 210 ratings (15 per article)
   –  $10.50 total
•  Fast results
   –  44% within a day, 100% within two days
   –  Many completed within minutes
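
The total above follows directly from the design. A minimal Python sketch of the arithmetic, using only numbers stated on the slides (14 articles, 15 ratings per article, $0.05 per rating). The variable names are illustrative, and any platform fees are ignored as an assumption:

```python
# Budget for Experiment 1, computed in cents to avoid floating-point rounding.
ARTICLES = 14             # articles previously rated by expert Wikipedians
RATINGS_PER_ARTICLE = 15  # Mechanical Turk ratings collected per article
CENTS_PER_RATING = 5      # $0.05 paid per rating

total_ratings = ARTICLES * RATINGS_PER_ARTICLE
total_cents = total_ratings * CENTS_PER_RATING

print(f"{total_ratings} ratings, ${total_cents / 100:.2f} total")
```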
Experiment 1: Bad news

•  Correlation between turkers and Wikipedians
   only marginally significant (r=.50, p=.07)
•  Worse, 59% potentially invalid responses
                       Experiment 1
   Invalid comments        49%
   <1 min responses        31%

•  Nearly 75% of these done by only 8 users
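
A rough sketch of how suspect responses like these might be flagged and traced back to a handful of workers. The `Response` record, the 60-second cutoff (mirroring the "<1 min" row), and the three-word comment threshold are all assumptions made for illustration. The slides do not say how invalid comments were actually judged, and the data below is made up:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    worker_id: str
    seconds_on_task: float
    comment: str


def is_suspect(r: Response, min_seconds: float = 60.0, min_words: int = 3) -> bool:
    """Flag responses finished in under a minute or with a near-empty comment."""
    too_fast = r.seconds_on_task < min_seconds
    empty_comment = len(r.comment.split()) < min_words
    return too_fast or empty_comment


def suspect_counts_by_worker(responses: list[Response]) -> Counter:
    """Count suspect responses per worker to see whether a few workers dominate."""
    return Counter(r.worker_id for r in responses if is_suspect(r))


# Made-up example data, not from the study.
responses = [
    Response("w1", 20, "good"),
    Response("w1", 25, ""),
    Response("w2", 240, "Needs more references in the history section."),
    Response("w3", 45, "The lead paragraph is too long and unfocused."),
]
flagged = suspect_counts_by_worker(responses)
print(flagged.most_common())
```

Sorting the counter by frequency makes it easy to see when most of the suspect work comes from a small minority of workers.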
Not a good start
•  Summary of Experiment 1:
   –  Only marginal correlation with experts.
   –  Heavy gaming of the system by a minority
•  Possible Response:
   –  Can make sure these gamers are not rewarded
   –  Ban them from doing your HITs in the future
   –  Create a reputation system [Dolores Labs]
•  Can we change how we collect user input?
Design changes

•  Use verifiable questions to signal monitoring
   –  How many sections does the article have?
   –  How many images does the article have?
   –  How many references does the article have?
•  Make malicious answers as high cost as
   good-faith answers
   –  Provide 4-6 keywords that would give someone a
     good summary of the contents of the article
•  Make verifiable answers useful for completing
   task
   –  Used tasks similar to how Wikipedians described
      evaluating quality (organization, presentation,
      references)
•  Put verifiable tasks before subjective
   responses
   –  First do objective tasks and summarization
   –  Only then evaluate subjective quality
   –  Ecological validity?
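
One way a practitioner might use the verifiable questions after the fact is to compare each answer with a count made once by the researcher. The sketch below is an assumption-laden illustration, not the study's procedure: the `ArticleFacts` structure, the answer dictionary keys, and the tolerance of 1 are hypothetical, and the slides describe verifiable questions mainly as a signal of monitoring rather than as a rejection rule.

```python
from dataclasses import dataclass


@dataclass
class ArticleFacts:
    """Ground truth counted once by the researcher (hypothetical structure)."""
    sections: int
    images: int
    references: int


def passes_checks(answers: dict, facts: ArticleFacts, tolerance: int = 1) -> bool:
    """Accept a response only if each count is within `tolerance` of the truth
    and the worker supplied 4-6 keywords."""
    counts_ok = all(
        abs(answers[name] - getattr(facts, name)) <= tolerance
        for name in ("sections", "images", "references")
    )
    keywords_ok = 4 <= len(answers["keywords"]) <= 6
    return counts_ok and keywords_ok


# Made-up example values for one article.
facts = ArticleFacts(sections=12, images=5, references=40)
careful = {"sections": 12, "images": 5, "references": 39,
           "keywords": ["history", "economy", "politics", "geography"]}
careless = {"sections": 1, "images": 1, "references": 1, "keywords": []}

print(passes_checks(careful, facts), passes_checks(careless, facts))
```

A small tolerance allows for honest miscounts on long articles while still catching answers that were clearly typed without reading.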
Experiment 2: Results

   •  124 users provided 277 ratings (~20 per article)
   •  Significant positive correlation with Wikipedians (r=.66, p=.01)

   •  Smaller proportion malicious responses
   •  Increased time on task

                      Experiment 1        Experiment 2
Invalid comments         49%                  3%
<1 min responses         31%                  7%
Median time              1:30                4:06
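
For readers who want to reproduce this kind of comparison, here is a minimal sketch of a Pearson correlation between expert ratings and per-article mean turker ratings. Averaging per article before correlating is an assumption, since the slides do not spell out the aggregation. The numbers are made up, and the function does not compute a p-value:

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Made-up ratings on a 7-point scale for five articles; not the study's data.
expert = [6.0, 5.0, 3.0, 4.0, 2.0]
turker_mean = [5.5, 5.0, 3.5, 4.5, 3.0]
print(round(pearson_r(expert, turker_mean), 2))
```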
Generalizing to other user studies

•  Combine objective and subjective questions
   –  Rapid prototyping: ask verifiable questions about
      content/design of prototype before subjective
      evaluation
   –  User surveys: ask common-knowledge questions
      before asking for opinions
Limitations of Mechanical Turk

•  No control of users' environment
   –  Potential for different browsers, physical
      distractions
   –  General problem with online experimentation
•  Not designed for user studies
   –  Difficult to do between-subjects design
   –  Involves some programming
•  Users
   –  Uncertainty about user demographics, expertise
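
The "some programming" needed for a between-subjects design can be small. One possible approach, sketched here as an assumption rather than anything the slides describe, is to hash each worker ID so that a returning worker always lands in the same condition. The condition labels, the salt, and the worker IDs are placeholders:

```python
import hashlib

# Placeholder condition labels for illustration.
CONDITIONS = ["condition A", "condition B", "condition C"]


def assign_condition(worker_id: str, salt: str = "study-1") -> str:
    """Map a worker ID to one condition, always the same one for that worker."""
    digest = hashlib.sha256(f"{salt}:{worker_id}".encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(CONDITIONS)
    return CONDITIONS[index]


# The same (hypothetical) worker ID maps to the same condition on every visit.
for worker in ["W-001", "W-002", "W-001"]:
    print(worker, "->", assign_condition(worker))
```

This keeps the assignment stateless, but it cannot stop one person from using several accounts, and the resulting group sizes are only approximately balanced.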
Quick Summary

•  Mechanical Turk offers the practitioner a way to
   access a large user pool and quickly collect data at
   low cost
•  Good results require careful task design
  1.  Use verifiable questions to signal monitoring
  2.  Make malicious answers as high cost as good-faith
      answers
  3.  Make verifiable answers useful for completing task
  4.  Put verifiable tasks before subjective responses
Crowdsourcing for HCI Research


•  Does my interface/visualization work?
   –  WikiDashboard: transparency visualization for Wikipedia
   –  J. Heer's work at Stanford looking at perceptual effects
•  Coding of large amounts of user data
   –  What counts as a question on Twitter? (Sharoda Paul at PARC)
•  Decompose tasks into smaller tasks
   –  Digital Taylorism
   –  Frederick Winslow Taylor (1856-1915), 1911 book
      'The Principles of Scientific Management'
•  Incentive mechanisms
   –  Intrinsic vs. Extrinsic rewards
   –  Games vs. Pay
•  @edchi
•  chi@acm.org
•  http://edchi.net
What would make you trust Wikipedia more?

What is Wikipedia?

"Wikipedia is the best thing ever. Anyone in the world can write
anything they want about any subject, so you know you're getting the
best possible information."
– Steve Carell, The Office

What would make you trust Wikipedia more?

Nothing

What would make you trust Wikipedia more?

Wikipedia, just by its nature, is impossible to trust completely.
I don't think this can necessarily be changed.

WikiDashboard

•  Transparency of social dynamics can reduce conflict and
   coordination issues
•  Attribution encourages contribution
   –  WikiDashboard: Social dashboard for wikis
   –  Prototype system: http://wikidashboard.parc.com
•  Visualization for every wiki page showing edit history
   timeline and top individual editors
•  Can drill down into activity history for specific editors
   and view edits to see changes side-by-side

Citation: Suh et al., CHI 2008 Proceedings

Crowdsourcing Meetup (Stanford 2011)
Hillary Clinton

Top Editor - Wasted Time R

Surfacing information

•  Numerous studies mining Wikipedia revision
   history to surface trust-relevant information
   –  Adler & Alfaro, 2007; Dondio et al., 2006; Kittur et al., 2007;
      Viegas et al., 2004; Zeng et al., 2006




                                          Suh, Chi, Kittur, & Pendleton, CHI2008


•  But how much impact can this have on user
   perceptions in a system which is inherently
   mutable?
Hypotheses

1.  Visualization will impact perceptions of trust
2.  Compared to baseline, visualization will
    impact trust both positively and negatively
3.  Visualization should have most impact when
    there is high uncertainty about the article
   •    Low quality
   •    High controversy

Design

•  3 x 2 x 2 design: visualization x quality x controversy
•  Visualization
   –  High stability
   –  Low stability
   –  Baseline (none)

                  Controversial                  Uncontroversial
High quality      Abortion                       Volcano
                  George Bush                    Shark
Low quality       Pro-life feminism              Disk defragmenter
                  Scientology and celebrities    Beeswax

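
The 3 x 2 x 2 design can be enumerated in a few lines to confirm that it yields twelve cells. The factor levels below are taken from the slide, while the list names are illustrative:

```python
from itertools import product

VISUALIZATION = ["high stability", "low stability", "baseline"]
QUALITY = ["high quality", "low quality"]
CONTROVERSY = ["controversial", "uncontroversial"]

# Every combination of one level from each factor is one experimental cell.
cells = list(product(VISUALIZATION, QUALITY, CONTROVERSY))
for viz, quality, controversy in cells:
    print(f"{viz:15} | {quality:12} | {controversy}")
print(len(cells), "cells")
```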
Example: High trust visualization

Example: Low trust visualization

Summary info

          •  % from anonymous
             users
          •  Last change by
             anonymous or
             established user
          •  Stability of words

Graph

•  Instability

Method

•  Users recruited via Amazon's Mechanical Turk
   –    253 participants
   –    673 ratings
   –    7 cents per rating
   –    Kittur, Chi, & Suh, CHI 2008: Crowdsourcing user studies
•  To ensure salience and valid answers, participants
   answered:
   –    In what time period was this article the least stable?
   –    How stable has this article been for the last month?
   –    Who was the last editor?
   –    How trustworthy do you consider the above editor?




Results

[Figure: trustworthiness rating (1-7) for low- and high-quality articles, uncontroversial vs. controversial, under high-stability, baseline, and low-stability visualizations]

Main effects of quality and controversy:
• high-quality articles > low-quality articles (F(1, 425) = 25.37, p < .001)
• uncontroversial articles > controversial articles (F(1, 425) = 4.69, p = .031)

Results

[Figure: trustworthiness rating (1-7) by article quality and controversy for each visualization condition]

Interaction effects of quality and controversy:
• high-quality articles were rated equally trustworthy whether controversial
or not, while
• low-quality articles were rated lower when they were controversial than
when they were uncontroversial.

Results

[Figure: trustworthiness rating (1-7) by article quality and controversy for each visualization condition]

1.  Significant effect of visualization
   –  High > low, p < .001
2.  Viz has both positive and negative effects
   –  High > baseline, p < .001
   –  Low < baseline, p < .01
3.  No interaction of visualization with either
    quality or controversy
   –  Robust across conditions


More Related Content

Similar to Crowdsourcing for HCI Research with Amazon Mechanical Turk

Tutorial on Using Amazon Mechanical Turk (MTurk) for HCI Research
Tutorial on Using Amazon Mechanical Turk (MTurk) for HCI ResearchTutorial on Using Amazon Mechanical Turk (MTurk) for HCI Research
Tutorial on Using Amazon Mechanical Turk (MTurk) for HCI ResearchEd Chi
 
Session1 methods research_question
Session1 methods research_questionSession1 methods research_question
Session1 methods research_questionmilolostinspace
 
Understanding The Value Of User Research, Usability Testing, and Information ...
Understanding The Value Of User Research, Usability Testing, and Information ...Understanding The Value Of User Research, Usability Testing, and Information ...
Understanding The Value Of User Research, Usability Testing, and Information ...Kyle Soucy
 
Implimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled TechnologyImplimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled TechnologyIndiana Online Users Group
 
Evaluation and User Study in HCI
Evaluation and User Study in HCIEvaluation and User Study in HCI
Evaluation and User Study in HCIByungkyu (Jay) Kang
 
Validating Ideas Through Prototyping
Validating Ideas Through PrototypingValidating Ideas Through Prototyping
Validating Ideas Through PrototypingChris Risdon
 
Human computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveHuman computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveoralonso
 
Usability and User Experience Training Seminar
Usability and User Experience Training SeminarUsability and User Experience Training Seminar
Usability and User Experience Training Seminarlabecvar
 
COSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR ApplicationsCOSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR ApplicationsMark Billinghurst
 
Uconn Coiro Assessment 2008
Uconn Coiro Assessment 2008Uconn Coiro Assessment 2008
Uconn Coiro Assessment 2008Julie Coiro
 
Aect2018 workshop-v6ij-compressed
Aect2018 workshop-v6ij-compressedAect2018 workshop-v6ij-compressed
Aect2018 workshop-v6ij-compressedIsa Jahnke
 
Getting Started with User Research
Getting Started with User ResearchGetting Started with User Research
Getting Started with User ResearchDiane Loviglio
 

Similar to Crowdsourcing for HCI Research with Amazon Mechanical Turk (20)

Tutorial on Using Amazon Mechanical Turk (MTurk) for HCI Research
Tutorial on Using Amazon Mechanical Turk (MTurk) for HCI ResearchTutorial on Using Amazon Mechanical Turk (MTurk) for HCI Research
Tutorial on Using Amazon Mechanical Turk (MTurk) for HCI Research
 
Pragmatisk softwareinnovation, Ivan Aaen, AAU
Pragmatisk softwareinnovation, Ivan Aaen, AAUPragmatisk softwareinnovation, Ivan Aaen, AAU
Pragmatisk softwareinnovation, Ivan Aaen, AAU
 
Session1 methods research_question
Session1 methods research_questionSession1 methods research_question
Session1 methods research_question
 
ICS3211_lecture 9_2022.pdf
ICS3211_lecture 9_2022.pdfICS3211_lecture 9_2022.pdf
ICS3211_lecture 9_2022.pdf
 
ICS3211 Lecture 9
ICS3211 Lecture 9ICS3211 Lecture 9
ICS3211 Lecture 9
 
Understanding The Value Of User Research, Usability Testing, and Information ...
Understanding The Value Of User Research, Usability Testing, and Information ...Understanding The Value Of User Research, Usability Testing, and Information ...
Understanding The Value Of User Research, Usability Testing, and Information ...
 
Intro to UOSM2012
Intro to UOSM2012Intro to UOSM2012
Intro to UOSM2012
 
Implimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled TechnologyImplimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled Technology
 
Evaluation and User Study in HCI
Evaluation and User Study in HCIEvaluation and User Study in HCI
Evaluation and User Study in HCI
 
User Research
User ResearchUser Research
User Research
 
Validating Ideas Through Prototyping
Validating Ideas Through PrototypingValidating Ideas Through Prototyping
Validating Ideas Through Prototyping
 
Human computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveHuman computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspective
 
Usability and User Experience Training Seminar
Usability and User Experience Training SeminarUsability and User Experience Training Seminar
Usability and User Experience Training Seminar
 
COSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR ApplicationsCOSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR Applications
 
Uconn Coiro Assessment 2008
Uconn Coiro Assessment 2008Uconn Coiro Assessment 2008
Uconn Coiro Assessment 2008
 
Aect 2018 workshop
Aect 2018 workshopAect 2018 workshop
Aect 2018 workshop
 
Aect2018 workshop-v6ij-compressed
Aect2018 workshop-v6ij-compressedAect2018 workshop-v6ij-compressed
Aect2018 workshop-v6ij-compressed
 
Don’t make me think!
Don’t make me think!Don’t make me think!
Don’t make me think!
 
Getting Started with User Research
Getting Started with User ResearchGetting Started with User Research
Getting Started with User Research
 
asdfas
asdfasasdfas
asdfas
 

More from Ed Chi

2017 10-10 (netflix ml platform meetup) learning item and user representation...
2017 10-10 (netflix ml platform meetup) learning item and user representation...2017 10-10 (netflix ml platform meetup) learning item and user representation...
2017 10-10 (netflix ml platform meetup) learning item and user representation...Ed Chi
 
HCI Korea 2012 Keynote Talk on Model-Driven Research in Social Computing
HCI Korea 2012 Keynote Talk on Model-Driven Research in Social ComputingHCI Korea 2012 Keynote Talk on Model-Driven Research in Social Computing
HCI Korea 2012 Keynote Talk on Model-Driven Research in Social ComputingEd Chi
 
Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)Ed Chi
 
CIKM 2011 Social Computing Industry Invited Talk
CIKM 2011 Social Computing Industry Invited TalkCIKM 2011 Social Computing Industry Invited Talk
CIKM 2011 Social Computing Industry Invited TalkEd Chi
 
WikiSym 2011 Closing Keynote
WikiSym 2011 Closing KeynoteWikiSym 2011 Closing Keynote
WikiSym 2011 Closing KeynoteEd Chi
 
CSCL 2011 Keynote on Social Computing and eLearning
CSCL 2011 Keynote on Social Computing and eLearningCSCL 2011 Keynote on Social Computing and eLearning
CSCL 2011 Keynote on Social Computing and eLearningEd Chi
 
Replication is more than Duplication: Position slides for CHI2011 panel on re...
Replication is more than Duplication: Position slides for CHI2011 panel on re...Replication is more than Duplication: Position slides for CHI2011 panel on re...
Replication is more than Duplication: Position slides for CHI2011 panel on re...Ed Chi
 
Eddi: Topic Browsing of Twitter Streams
Eddi: Topic Browsing of Twitter StreamsEddi: Topic Browsing of Twitter Streams
Eddi: Topic Browsing of Twitter StreamsEd Chi
 
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...Ed Chi
 
Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...
Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...
Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...Ed Chi
 
Zerozero88 Twitter URL Item Recommender
Zerozero88 Twitter URL Item RecommenderZerozero88 Twitter URL Item Recommender
Zerozero88 Twitter URL Item RecommenderEd Chi
 
Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006
Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006
Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006Ed Chi
 
Model-Driven Research in Social Computing
Model-Driven Research in Social ComputingModel-Driven Research in Social Computing
Model-Driven Research in Social ComputingEd Chi
 
ASC Disaster Response Proposal from Aug 2007
ASC Disaster Response Proposal from Aug 2007ASC Disaster Response Proposal from Aug 2007
ASC Disaster Response Proposal from Aug 2007Ed Chi
 
Using Information Scent to Model Users in Web1.0 and Web2.0
Using Information Scent to Model Users in Web1.0 and Web2.0Using Information Scent to Model Users in Web1.0 and Web2.0
Using Information Scent to Model Users in Web1.0 and Web2.0Ed Chi
 
China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...
China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...
China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...Ed Chi
 
2010-03-10 PARC Augmented Social Cognition Research Overview
2010-03-10 PARC Augmented Social Cognition Research Overview2010-03-10 PARC Augmented Social Cognition Research Overview
2010-03-10 PARC Augmented Social Cognition Research OverviewEd Chi
 
2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica
2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica
2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica SinicaEd Chi
 
Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...
Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...
Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...Ed Chi
 
Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...
Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...
Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...Ed Chi
 

More from Ed Chi (20)

2017 10-10 (netflix ml platform meetup) learning item and user representation...
2017 10-10 (netflix ml platform meetup) learning item and user representation...2017 10-10 (netflix ml platform meetup) learning item and user representation...
2017 10-10 (netflix ml platform meetup) learning item and user representation...
 
HCI Korea 2012 Keynote Talk on Model-Driven Research in Social Computing
HCI Korea 2012 Keynote Talk on Model-Driven Research in Social ComputingHCI Korea 2012 Keynote Talk on Model-Driven Research in Social Computing
HCI Korea 2012 Keynote Talk on Model-Driven Research in Social Computing
 
Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)
 
CIKM 2011 Social Computing Industry Invited Talk
CIKM 2011 Social Computing Industry Invited TalkCIKM 2011 Social Computing Industry Invited Talk
CIKM 2011 Social Computing Industry Invited Talk
 
WikiSym 2011 Closing Keynote
WikiSym 2011 Closing KeynoteWikiSym 2011 Closing Keynote
WikiSym 2011 Closing Keynote
 
CSCL 2011 Keynote on Social Computing and eLearning
CSCL 2011 Keynote on Social Computing and eLearningCSCL 2011 Keynote on Social Computing and eLearning
CSCL 2011 Keynote on Social Computing and eLearning
 
Replication is more than Duplication: Position slides for CHI2011 panel on re...
Replication is more than Duplication: Position slides for CHI2011 panel on re...Replication is more than Duplication: Position slides for CHI2011 panel on re...
Replication is more than Duplication: Position slides for CHI2011 panel on re...
 
Eddi: Topic Browsing of Twitter Streams
Eddi: Topic Browsing of Twitter StreamsEddi: Topic Browsing of Twitter Streams
Eddi: Topic Browsing of Twitter Streams
 
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ...
 
Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...
Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...
Model-based Research in Human-Computer Interaction (HCI): Keynote at Mensch u...
 
Zerozero88 Twitter URL Item Recommender
Zerozero88 Twitter URL Item RecommenderZerozero88 Twitter URL Item Recommender
Zerozero88 Twitter URL Item Recommender
 
Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006
Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006
Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006
 
Model-Driven Research in Social Computing
Model-Driven Research in Social ComputingModel-Driven Research in Social Computing
Model-Driven Research in Social Computing
 
ASC Disaster Response Proposal from Aug 2007
ASC Disaster Response Proposal from Aug 2007ASC Disaster Response Proposal from Aug 2007
ASC Disaster Response Proposal from Aug 2007
 
Using Information Scent to Model Users in Web1.0 and Web2.0
Using Information Scent to Model Users in Web1.0 and Web2.0Using Information Scent to Model Users in Web1.0 and Web2.0
Using Information Scent to Model Users in Web1.0 and Web2.0
 
China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...
China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...
China HCI Symposium 2010 March: Augmented Social Cognition Research from PARC...
 
2010-03-10 PARC Augmented Social Cognition Research Overview
2010-03-10 PARC Augmented Social Cognition Research Overview2010-03-10 PARC Augmented Social Cognition Research Overview
2010-03-10 PARC Augmented Social Cognition Research Overview
 
2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica
2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica
2010-02-22 Wikipedia MTurk Research talk given in Taiwan's Academica Sinica
 
Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...
Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...
Information Seeking with Social Signals: Anatomy of a Social Tag-based Explor...
 
Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...
Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...
Slowing Growth of Wikipedia and Models of its Dynamic (Presented at Wikimedia...
 


Crowdsourcing for HCI Research with Amazon Mechanical Turk

  • 1. Crowdsourcing for Human Computer Interaction Research. Ed H. Chi, Research Scientist, Google (work done while at [Xerox] PARC with Aniket Kittur)
  • 2. User studies •  Getting input from users is important in HCI –  surveys –  rapid prototyping –  usability tests –  cognitive walkthroughs –  performance measures –  quantitative ratings
  • 3. User studies •  Getting input from users is expensive –  Time costs –  Monetary costs •  Often have to trade off costs with sample size
  • 4. Online solutions •  Online user surveys •  Remote usability testing •  Online experiments •  But still have difficulties –  Rely on practitioner for recruiting participants –  Limited pool of participants
  • 5. Crowdsourcing •  Make tasks available for anyone online to complete •  Quickly access a large user pool, collect data, and compensate users •  Example: NASA Clickworkers –  100k+ volunteers identified Mars craters from space photographs –  Aggregate results virtually indistinguishable from expert geologists –  http://clickworkers.arc.nasa.gov
  • 6. Amazon's Mechanical Turk •  Market for human intelligence tasks •  Typically short, objective tasks –  Tag an image –  Find a webpage –  Evaluate relevance of search results •  Users complete tasks for a few pennies each
  • 8. Using Mechanical Turk for user studies
                          Traditional user studies   Mechanical Turk
      Task complexity     Complex, long              Simple, short
      Task subjectivity   Subjective, opinions       Objective, verifiable
      User information    Targeted demographics,     Unknown demographics,
                          high interactivity         limited interactivity
      Can Mechanical Turk be usefully used for user studies?
  • 9. Task •  Assess quality of Wikipedia articles •  Started with ratings from expert Wikipedians –  14 articles (e.g., "Germany", "Noam Chomsky") –  7-point scale •  Can we get matching ratings with Mechanical Turk?
  • 10. Experiment 1 •  Rate articles on 7-point scales: –  Well written –  Factually accurate –  Overall quality •  Free-text input: –  What improvements does the article need? •  Paid $0.05 each
  • 11. Experiment 1: Good news •  58 users made 210 ratings (15 per article) –  $10.50 total •  Fast results –  44% within a day, 100% within two days –  Many completed within minutes
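The cost figure on slide 11 follows directly from the numbers on slides 9 to 11. The short check below is only an illustration of that arithmetic; the variable names are made up here, and amounts are held in integer cents to avoid floating-point rounding.

```python
# Figures taken from slides 9-11 of the deck.
ARTICLES = 14              # articles rated by expert Wikipedians
RATINGS_PER_ARTICLE = 15   # turker ratings collected per article
PAYMENT_CENTS = 5          # $0.05 paid per rating

total_ratings = ARTICLES * RATINGS_PER_ARTICLE
total_cost_cents = total_ratings * PAYMENT_CENTS

print(f"{total_ratings} ratings cost ${total_cost_cents / 100:.2f}")
# prints "210 ratings cost $10.50"
```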
  • 12. Experiment 1: Bad news •  Correlation between turkers and Wikipedians only marginally significant (r=.50, p=.07) •  Worse, 59% potentially invalid responses –  Invalid comments: 49% –  Responses completed in under 1 min: 31% •  Nearly 75% of these done by only 8 users
  • 13. Not a good start •  Summary of Experiment 1: –  Only marginal correlation with experts –  Heavy gaming of the system by a minority •  Possible response: –  Can make sure these gamers are not rewarded –  Ban them from doing your HITs in the future –  Create a reputation system [Dolores Labs] •  Can we change how we collect user input?
  • 14. Design changes •  Use verifiable questions to signal monitoring –  How many sections does the article have? –  How many images does the article have? –  How many references does the article have?
  • 15. Design changes •  Use verifiable questions to signal monitoring •  Make malicious answers as high cost as good-faith answers –  Provide 4-6 keywords that would give someone a good summary of the contents of the article
  • 16. Design changes •  Use verifiable questions to signal monitoring •  Make malicious answers as high cost as good-faith answers •  Make verifiable answers useful for completing task –  Used tasks similar to how Wikipedians described evaluating quality (organization, presentation, references)
  • 17. Design changes •  Use verifiable questions to signal monitoring •  Make malicious answers as high cost as good-faith answers •  Make verifiable answers useful for completing task •  Put verifiable tasks before subjective responses –  First do objective tasks and summarization –  Only then evaluate subjective quality –  Ecological validity?
  • 18. Experiment 2: Results •  124 users provided 277 ratings (~20 per article) •  Significant positive correlation with Wikipedians (r=.66, p=.01) •  Smaller proportion of malicious responses •  Increased time on task
                          Experiment 1   Experiment 2
      Invalid comments    49%            3%
      <1 min responses    31%            7%
      Median time         1:30           4:06
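The p-values reported for the two experiments can be sanity-checked from the correlations alone. The sketch below assumes, and this is an assumption not stated on the slides, that each correlation was computed over the 14 articles, giving 12 degrees of freedom. It converts r to a t statistic and integrates the Student's t density numerically using only the standard library. The function names are illustrative.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(r: float, n: int, steps: int = 2000) -> float:
    """Two-tailed p-value for a Pearson correlation r computed over n pairs."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the integral of the density from 0 to t (steps is even).
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The density is symmetric, so both tails together are 1 - 2 * area.
    return 1 - 2 * area

N_ARTICLES = 14  # assumption: one (turker mean, expert rating) pair per article
for label, r in [("Experiment 1", 0.50), ("Experiment 2", 0.66)]:
    print(f"{label}: r = {r:.2f}, p = {two_tailed_p(r, N_ARTICLES):.3f}")
```

Under that assumption the computed values come out close to the reported p=.07 and p=.01, which suggests the correlations were indeed taken across articles rather than across individual ratings.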
  • 19. Generalizing to other user studies •  Combine objective and subjective questions –  Rapid prototyping: ask verifiable questions about content/design of prototype before subjective evaluation –  User surveys: ask common-knowledge questions before asking for opinions
  • 20. Limitations of Mechanical Turk •  No control of users' environment –  Potential for different browsers, physical distractions –  General problem with online experimentation •  Not designed for user studies –  Difficult to do between-subjects design –  Involves some programming •  Users –  Uncertainty about user demographics, expertise
  • 21. Quick Summary •  Mechanical Turk offers the practitioner a way to access a large user pool and quickly collect data at low cost •  Good results require careful task design 1.  Use verifiable questions to signal monitoring 2.  Make malicious answers as high cost as good-faith answers 3.  Make verifiable answers useful for completing task 4.  Put verifiable tasks before subjective responses
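The slides use verifiable questions mainly to signal monitoring, and they flag responses finished in under a minute as suspect. One way those two signals could be combined into a screening step is sketched below. This is a hypothetical filter, not the procedure used in the study: the `Response` fields, the one-minute threshold as a hard cutoff, the off-by-one tolerance, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Response:
    worker_id: str
    seconds_on_task: float
    sections_reported: int  # answer to "How many sections does the article have?"
    quality_rating: int     # subjective rating on the 7-point scale

def is_plausible(resp: Response, true_sections: int,
                 min_seconds: float = 60.0, tolerance: int = 1) -> bool:
    """Keep a response only if it took long enough and the verifiable answer is close."""
    took_enough_time = resp.seconds_on_task >= min_seconds
    count_is_close = abs(resp.sections_reported - true_sections) <= tolerance
    return took_enough_time and count_is_close

# Made-up responses for an article that really has 8 sections.
responses = [
    Response("w1", 246.0, 8, 5),
    Response("w2", 35.0, 8, 7),   # finished too quickly
    Response("w3", 190.0, 2, 6),  # verifiable answer is far off
    Response("w4", 205.0, 9, 4),  # off by one, within tolerance
]

kept = [r for r in responses if is_plausible(r, true_sections=8)]
print([r.worker_id for r in kept])  # prints ['w1', 'w4']
```

A small tolerance is used because counts such as "number of sections" can legitimately differ by one depending on how a worker treats the lead or reference sections.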
  • 22. Crowdsourcing for HCI Research •  Does my interface/visualization work? –  WikiDashboard: transparency visualization for Wikipedia –  J. Heer's work at Stanford looking at perceptual effects •  Coding of large amounts of user data –  "What is a question?" in Twitter (Sharoda Paul at PARC) •  Decompose tasks into smaller tasks –  Digital Taylorism –  Frederick Winslow Taylor (1856-1915), 1911 book 'Principles Of Scientific Management' •  Incentive mechanisms –  Intrinsic vs. extrinsic rewards –  Games vs. pay
  • 24. What would make you trust Wikipedia more?
  • 25. What is Wikipedia? Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you're getting the best possible information. – Steve Carell, The Office
  • 26. What would make you trust Wikipedia more? Nothing
  • 27. What would make you trust Wikipedia more? Wikipedia, just by its nature, is impossible to trust completely. I don't think this can necessarily be changed.
  • 28. WikiDashboard •  Transparency of social dynamics can reduce conflict and coordination issues •  Attribution encourages contribution –  WikiDashboard: Social dashboard for wikis –  Prototype system: http://wikidashboard.parc.com •  Visualization for every wiki page showing edit history timeline and top individual editors •  Can drill down into activity history for specific editors and view edits to see changes side-by-side Citation: Suh et al. CHI 2008 Proceedings
  • 29. Hillary Clinton. Crowdsourcing Meetup (Stanford 2011)
  • 30. Top Editor - Wasted Time R
  • 31. Surfacing information •  Numerous studies mining Wikipedia revision history to surface trust-relevant information –  Adler & Alfaro, 2007; Dondio et al., 2006; Kittur et al., 2007; Viegas et al., 2004; Zeng et al., 2006 –  Suh, Chi, Kittur, & Pendleton, CHI 2008 •  But how much impact can this have on user perceptions in a system that is inherently mutable?
  • 32. Hypotheses 1.  Visualization will impact perceptions of trust 2.  Compared to baseline, visualization will impact trust both positively and negatively 3.  Visualization should have most impact when there is high uncertainty about the article •  Low quality •  High controversy
  • 33. Design •  3 x 2 x 2 design •  Visualization: high stability, low stability, baseline (none)
                      Controversial                  Uncontroversial
      High quality    Abortion                       Volcano
                      George Bush                    Shark
      Low quality     Pro-life feminism              Disk defragmenter
                      Scientology and celebrities    Beeswax
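The 3 x 2 x 2 design on slide 33 can be enumerated to see how many cells and stimuli it implies. The sketch below reads the article-to-cell assignment off the slide's table as best it can be recovered from the flattened text; the data structure and variable names are illustrative only.

```python
from itertools import product

VISUALIZATIONS = ["high stability", "low stability", "baseline (none)"]
QUALITIES = ["high quality", "low quality"]
CONTROVERSY = ["controversial", "uncontroversial"]

# Two articles per quality-by-controversy cell, as listed on the slide.
ARTICLES = {
    ("high quality", "controversial"): ["Abortion", "George Bush"],
    ("high quality", "uncontroversial"): ["Volcano", "Shark"],
    ("low quality", "controversial"): ["Pro-life feminism", "Scientology and celebrities"],
    ("low quality", "uncontroversial"): ["Disk defragmenter", "Beeswax"],
}

# Every combination of the three factors is one cell of the design.
cells = list(product(VISUALIZATIONS, QUALITIES, CONTROVERSY))

# Each article can be paired with each visualization condition.
stimuli = [
    (viz, article)
    for viz in VISUALIZATIONS
    for articles in ARTICLES.values()
    for article in articles
]

print(len(cells), "cells,", len(stimuli), "article-by-visualization stimuli")
# prints "12 cells, 24 article-by-visualization stimuli"
```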
  • 34. Example: High trust visualization
  • 35. Example: Low trust visualization
  • 36. Summary info •  % from anonymous users
  • 37. Summary info •  % from anonymous users •  Last change by anonymous or established user
  • 38. Summary info •  % from anonymous users •  Last change by anonymous or established user •  Stability of words
  • 40. Method •  Users recruited via Amazon's Mechanical Turk –  253 participants –  673 ratings –  7 cents per rating –  Kittur, Chi, & Suh, CHI 2008: Crowdsourcing user studies •  To ensure salience and valid answers, participants answered: –  In what time period was this article the least stable? –  How stable has this article been for the last month? –  Who was the last editor? –  How trustworthy do you consider the above editor?
  • 41. Results [Figure: trustworthiness rating (1-7) for the high stability, baseline, and low stability visualizations, by article quality (low, high) and controversy (uncontroversial, controversial)] Main effects of quality and controversy: •  high-quality articles > low-quality articles (F(1, 425) = 25.37, p < .001) •  uncontroversial articles > controversial articles (F(1, 425) = 4.69, p = .031)
  • 42. Results [Figure: trustworthiness rating (1-7) for the high stability, baseline, and low stability visualizations, by article quality (low, high) and controversy (uncontroversial, controversial)] Interaction effects of quality and controversy: •  high-quality articles were rated equally trustworthy whether controversial or not, while •  low-quality articles were rated lower when they were controversial than when they were uncontroversial.
  • 43. Results [Figure: trustworthiness rating (1-7) for the high stability, baseline, and low stability visualizations, by article quality (low, high) and controversy (uncontroversial, controversial)] 1.  Significant effect of visualization –  High > low, p < .001 2.  Viz has both positive and negative effects –  High > baseline, p < .001 –  Low < baseline, p < .01 3.  No interaction of visualization with either quality or controversy –  Robust across conditions