SlideShare une entreprise Scribd logo
1  sur  37
Believe it or not: Designing a Human-AI
Partnership for Mixed-Initiative Fact-Checking
Joint work with
An Thanh Nguyen (UT), Byron Wallace (Northeastern), & more!
Matt Lease
School of Information @mattlease
University of Texas at Austin ml@utexas.edu
Slides:
slideshare.net/mattlease
“The place where people & technology meet”
~ Wobbrock et al., 2009
“iSchools” now exist at over 65 universities around the world
www.ischools.org
What’s an Information School?
2Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Information Literacy
National Information Literacy Awareness Month,
US Presidential Proclamation, October 1, 2009.
“Though we may know how to find the information
we need, we must also know how to evaluate it.
Over the past decade, we have seen a crisis of
authenticity emerge. We now live in a world where
anyone can publish an opinion or perspective, whether
true or not, and have that opinion amplified…”
3Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
“Truthiness”
“Truthiness is tearing apart our country... It used to
be everyone was entitled to their own opinion, but
not their own facts. But that’s not the case anymore.”
– Stephen Colbert (Jan. 25, 2006)
4Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Journalist Fact-Checking
5Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Danny Sullivan,
April 7, 2017
6Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Crowdsourced Fact Checking
7
Automated Fact-Checking
8Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Fake News Challenge
9Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
10Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
11Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Challenges
• Fair, Accountable, & Transparent (AI)
– Why trust “black box” classifier?
– How do we reason about potential bias?
– Do people really only want to know true vs. false?
– How to integrate human knowledge/experience?
• Joint AI + Human Reasoning, Correct Errors, Personalization
• How to design strong Human + AI Partnerships?
– Horvitz, CHI’99: mixed-initiative design
– Dove et al., CHI’17 “Machine Learning As a Design Material”
12
MemeBrowser (Ryu et al., ACM HyperText’12)
http://odyssey.ischool.utexas.edu/mb
13Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
• Crowdsourced stance labels
– Hybrid AI + Human (near real-time) Prediction
• Joint model of stance, veracity, & annotators
– Interaction between variables
– Interpretable
• Source on github
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Nguyen et al., AAAI’18
14
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
This Work
15
Demo!
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
This Work
16
http://fcweb.pythonanywhere.com
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Primary Interface
17
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Source Reputation
18
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
System Architecture
• Google Search API
• Two logistic regression models
– Stance (Ferreira & Vlachos ’16) w/ same features
• average accuracy > 70% but variable over claims
– Veracity (Popat et al. ‘17)
– Scikit-learn, L1 regularization, Liblinear solver, &
default parameters
19
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Data: Train & Test
Emergent (Ferreira & Vlachos ’16)
Accuracy of prediction models
20
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
21
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
22
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
23
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
– 113 participants (58 control, 55 system) – MTurk
• 1. Asked to predict claim veracity
– Likert: Def. False, Prob. F., neutral, Prob. True, Def. T.
– Error: distance
• For a (definitely) false claim, PF -> error=1, DT -> error=4
• 2. Shown model’s claim prediction
24
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
25
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
26
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 1: Setup
• 2 Groups: Control vs. System
– 113 participants (58 control, 55 system) – MTurk
• 1. Asked to predict claim veracity
– Likert: Def. False, Prob. F., neutral, Prob. True, Def. T.
– Error: distance
• For a (definitely) false claim, PF -> error=1, DT -> error=4
• 2. Shown model’s claim prediction
– Asked if they want to change their answer?
27
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Before seeing model’s claim prediction
• “System” group:
– claims 1-2: > avg. prediction error
– claim 4: < error
– claims 3, 5: only small differences
• Human accuracy in claim prediction roughly
follows model’s accuracy in stance prediction
– i.e., helped when model correct, hurt when not
28
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
After seeing model’s claim prediction
• “System” group:
– Smaller changes for errors
– change answers > “control” group
• Human accuracy in claim prediction roughly
follows model’s accuracy in veracity prediction
– i.e., helped when model correct, hurt when not
29
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Study 1: Statistical Significance (1 of 2)
• Mixed-effects Generalized Linear Model (GLM)
• Before seeing model’s claim predictions
– CSP: # correct stance predictions seen
– WSP: # wrong stance predictions seen
– 2-tail test includes unlikely possibility that seeing
correct stance predictions increases human error
30
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Study 1: Statistical Significance (2 of 2)
• Mixed-effects Generalized Linear Model (GLM)
• After seeing model’s claim predictions
– CSP: # correct stance predictions seen
– WSP: # wrong stance predictions seen
– 2-tail test includes unlikely possibility that seeing
correct stance predictions increases human error
31
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 2: Setup
• 2 Groups: Control vs. Slider
32
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
User Study 2: Setup
• 2 Groups: Control vs. Slider
– 109 participants (51 control, 58 slider) – MTurk
• 1. Asked to predict claim veracity
– Likert: Def. False, Prob. F., neutral, Prob. True, Def. T.
and indicate confidence in prediction
– (-20,+20) score range: accuracy x confidence
• eg, a correct answer with 75% confidence -> 20x75%=15
33
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
• “Slider” group:
– Claims 1-3: > score on average
– Claims 4-5: < first quartiles, but medians same
• Some participants negatively impacted by slider
– No statistically significant difference on average
34
User Study 2: Results
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Discussion & Future Work
• Fact Checking & IR (Lease, DESIRES’18)
– How to diversify search results for controversial topics?
– Information evaluation (eg, vaccination & autism)
• Potential harm as well as good
– Potential added confusion, data / algorithmic bias
– Potential for personal “echo chamber”
– Adversarial applications
• Future Work
– Making personalization more visible
– Collaborative use, small and big
• 1st author Nguyen looking for a postdoc!
35
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Conclusion
• Fact-checking more than black-box prediction
– Interaction, exploration, trust
• We proposed a mixed-initiative human + AI
partnership for fact-checking
– Back-end AI + front-end interface/interaction
– Support AI + human collaboration
– Fair, Accountable, & Transparent (FAT) AI
36
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Thank You!
Slides: slideshare.net/mattlease
Lab: ir.ischool.utexas.edu
37

Contenu connexe

Tendances

Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...
Amit Sheth
 
Teaching, Assessment and Learning Analytics: Time to Question Assumptions
Teaching, Assessment and Learning Analytics: Time to Question AssumptionsTeaching, Assessment and Learning Analytics: Time to Question Assumptions
Teaching, Assessment and Learning Analytics: Time to Question Assumptions
Simon Buckingham Shum
 
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Amit Sheth
 

Tendances (20)

Introduction to Data Science and Large-scale Machine Learning
Introduction to Data Science and Large-scale Machine LearningIntroduction to Data Science and Large-scale Machine Learning
Introduction to Data Science and Large-scale Machine Learning
 
Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
 
Philosophy of Big Data: Big Data, the Individual, and Society
Philosophy of Big Data: Big Data, the Individual, and SocietyPhilosophy of Big Data: Big Data, the Individual, and Society
Philosophy of Big Data: Big Data, the Individual, and Society
 
Citizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsCitizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and Applications
 
Matching Mobile Applications for Cross Promotion
Matching Mobile Applications for Cross PromotionMatching Mobile Applications for Cross Promotion
Matching Mobile Applications for Cross Promotion
 
Urban Data Science at UW
Urban Data Science at UWUrban Data Science at UW
Urban Data Science at UW
 
2019 June 27 - Big data and data science
2019 June 27 - Big data and data science2019 June 27 - Big data and data science
2019 June 27 - Big data and data science
 
Data Science and Urban Science @ UW
Data Science and Urban Science @ UWData Science and Urban Science @ UW
Data Science and Urban Science @ UW
 
Big Data and the Quantified Self
Big Data and the Quantified SelfBig Data and the Quantified Self
Big Data and the Quantified Self
 
Smart IoT for Connected Manufacturing
Smart IoT for Connected ManufacturingSmart IoT for Connected Manufacturing
Smart IoT for Connected Manufacturing
 
Designing Cybersecurity Policies with Field Experiments
Designing Cybersecurity Policies with Field ExperimentsDesigning Cybersecurity Policies with Field Experiments
Designing Cybersecurity Policies with Field Experiments
 
Big Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DBig Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&D
 
Philosophy of Big Data
Philosophy of Big DataPhilosophy of Big Data
Philosophy of Big Data
 
2015 Kno.e.sis Center Annual Review
2015 Kno.e.sis Center Annual Review2015 Kno.e.sis Center Annual Review
2015 Kno.e.sis Center Annual Review
 
Intro to Data Science Concepts
Intro to Data Science ConceptsIntro to Data Science Concepts
Intro to Data Science Concepts
 
Teaching, Assessment and Learning Analytics: Time to Question Assumptions
Teaching, Assessment and Learning Analytics: Time to Question AssumptionsTeaching, Assessment and Learning Analytics: Time to Question Assumptions
Teaching, Assessment and Learning Analytics: Time to Question Assumptions
 
Learning Analytics vs Cognitive Automation
Learning Analytics vs Cognitive AutomationLearning Analytics vs Cognitive Automation
Learning Analytics vs Cognitive Automation
 
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
 
Strategic Network Formation in a Location-Based Social Network
Strategic Network Formation in a Location-Based Social NetworkStrategic Network Formation in a Location-Based Social Network
Strategic Network Formation in a Location-Based Social Network
 

Similaire à Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking

Injustice_Harm_Digital_Infrastructures.pptx
Injustice_Harm_Digital_Infrastructures.pptxInjustice_Harm_Digital_Infrastructures.pptx
Injustice_Harm_Digital_Infrastructures.pptx
MusfiqJony
 
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Alex Pinto
 
BB_54028_20160212 SVTLS AI_FINAL_shown
BB_54028_20160212 SVTLS AI_FINAL_shownBB_54028_20160212 SVTLS AI_FINAL_shown
BB_54028_20160212 SVTLS AI_FINAL_shown
Mark Holman
 
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Krishnaram Kenthapadi
 

Similaire à Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking (20)

Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairness
 
Injustice_Harm_Digital_Infrastructures.pptx
Injustice_Harm_Digital_Infrastructures.pptxInjustice_Harm_Digital_Infrastructures.pptx
Injustice_Harm_Digital_Infrastructures.pptx
 
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ...
 
How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?
 
BB_54028_20160212 SVTLS AI_FINAL_shown
BB_54028_20160212 SVTLS AI_FINAL_shownBB_54028_20160212 SVTLS AI_FINAL_shown
BB_54028_20160212 SVTLS AI_FINAL_shown
 
Bias in AI
Bias in AIBias in AI
Bias in AI
 
Social Science Applications of Agent Based Modelling
Social Science Applications of Agent Based ModellingSocial Science Applications of Agent Based Modelling
Social Science Applications of Agent Based Modelling
 
UT Dallas CS - Rise of Crowd Computing
UT Dallas CS - Rise of Crowd ComputingUT Dallas CS - Rise of Crowd Computing
UT Dallas CS - Rise of Crowd Computing
 
Crowdsourcing & ethics: a few thoughts and refences.
Crowdsourcing & ethics: a few thoughts and refences. Crowdsourcing & ethics: a few thoughts and refences.
Crowdsourcing & ethics: a few thoughts and refences.
 
Joe keating - world legal summit - ethical data science
Joe keating  - world legal summit - ethical data scienceJoe keating  - world legal summit - ethical data science
Joe keating - world legal summit - ethical data science
 
The Ethics of AI
The Ethics of AIThe Ethics of AI
The Ethics of AI
 
The law and ethics of data-driven artificial intelligence
The law and ethics of data-driven artificial intelligenceThe law and ethics of data-driven artificial intelligence
The law and ethics of data-driven artificial intelligence
 
Glantus Presentation: Ethical Data Science - BoI Analytics Connect 2018
Glantus Presentation: Ethical Data Science - BoI Analytics Connect 2018Glantus Presentation: Ethical Data Science - BoI Analytics Connect 2018
Glantus Presentation: Ethical Data Science - BoI Analytics Connect 2018
 
When recommendation go bad
When recommendation go badWhen recommendation go bad
When recommendation go bad
 
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021)
 
The AI Now Report The Social and Economic Implications of Artificial Intelli...
The AI Now Report  The Social and Economic Implications of Artificial Intelli...The AI Now Report  The Social and Economic Implications of Artificial Intelli...
The AI Now Report The Social and Economic Implications of Artificial Intelli...
 
Emerging Technologies in Data Sharing and Analytics at Data61
Emerging Technologies in Data Sharing and Analytics at Data61Emerging Technologies in Data Sharing and Analytics at Data61
Emerging Technologies in Data Sharing and Analytics at Data61
 
Privacy in AI/ML Systems: Practical Challenges and Lessons Learned
Privacy in AI/ML Systems: Practical Challenges and Lessons LearnedPrivacy in AI/ML Systems: Practical Challenges and Lessons Learned
Privacy in AI/ML Systems: Practical Challenges and Lessons Learned
 
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
 
Glantus Presentation Slides - Ethical Data Science - BoI Analytics Connect 2018
Glantus Presentation Slides - Ethical Data Science - BoI Analytics Connect 2018Glantus Presentation Slides - Ethical Data Science - BoI Analytics Connect 2018
Glantus Presentation Slides - Ethical Data Science - BoI Analytics Connect 2018
 

Plus de Matthew Lease

The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016
Matthew Lease
 

Plus de Matthew Lease (20)

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey Responses
 
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
 
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
 
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Systematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingSystematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s Clothing
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)
 
The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016
 
The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)The Rise of Crowd Computing (December 2015)
The Rise of Crowd Computing (December 2015)
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing Science
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
 
The Search for Truth in Objective & Subject Crowdsourcing
The Search for Truth in Objective & Subject CrowdsourcingThe Search for Truth in Objective & Subject Crowdsourcing
The Search for Truth in Objective & Subject Crowdsourcing
 
Toward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkToward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd Work
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
 
Crowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationCrowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine Evaluation
 
Crowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkCrowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical Turk
 
Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to Ethics
 
Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid SystemsCrowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
Crowdsourcing & Human Computation Labeling Data & Building Hybrid Systems
 
Mechanical Turk is Not Anonymous
Mechanical Turk is Not AnonymousMechanical Turk is Not Anonymous
Mechanical Turk is Not Anonymous
 
Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)
 

Dernier

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Dernier (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking

  • 1. Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Joint work with An Thanh Nguyen (UT), Byron Wallace (Northeastern), & more! Matt Lease School of Information @mattlease University of Texas at Austin ml@utexas.edu Slides: slideshare.net/mattlease
  • 2. “The place where people & technology meet” ~ Wobbrock et al., 2009 “iSchools” now exist at over 65 universities around the world www.ischools.org What’s an Information School? 2Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 3. Information Literacy National Information Literacy Awareness Month, US Presidential Proclamation, October 1, 2009. “Though we may know how to find the information we need, we must also know how to evaluate it. Over the past decade, we have seen a crisis of authenticity emerge. We now live in a world where anyone can publish an opinion or perspective, whether true or not, and have that opinion amplified…” 3Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 4. “Truthiness” “Truthiness is tearing apart our country... It used to be everyone was entitled to their own opinion, but not their own facts. But that’s not the case anymore.” – Stephen Colbert (Jan. 25, 2006) 4Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 5. Journalist Fact-Checking 5Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 6. Danny Sullivan, April 7, 2017 6Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 8. Automated Fact-Checking 8Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 9. Fake News Challenge 9Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 10. 10Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 11. 11Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 12. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Challenges • Fair, Accountable, & Transparent (AI) – Why trust “black box” classifier? – How do we reason about potential bias? – Do people really only want to know true vs. false? – How to integrate human knowledge/experience? • Joint AI + Human Reasoning, Correct Errors, Personalization • How to design strong Human + AI Partnerships? – Horvitz, CHI’99: mixed-initiative design – Dove et al., CHI’17 “Machine Learning As a Design Material” 12
  • 13. MemeBrowser (Ryu et al., ACM HyperText’12) http://odyssey.ischool.utexas.edu/mb 13Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
  • 14. • Crowdsourced stance labels – Hybrid AI + Human (near real-time) Prediction • Joint model of stance, veracity, & annotators – Interaction between variables – Interpretable • Source on github Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Nguyen et al., AAAI’18 14
  • 15. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking This Work 15 Demo!
  • 16. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking This Work 16 http://fcweb.pythonanywhere.com
  • 17. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Primary Interface 17
  • 18. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Source Reputation 18
  • 19. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking System Architecture • Google Search API • Two logistic regression models – Stance (Ferreira & Vlachos ’16) w/ same features • average accuracy > 70% but variable over claims – Veracity (Popat et al. ‘17) – Scikit-learn, L1 regularization, Liblinear solver, & default parameters 19
  • 20. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Data: Train & Test Emergent (Ferreira & Vlachos ’16) Accuracy of prediction models 20
  • 21. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System 21
  • 22. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System 22
  • 23. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System 23
  • 24. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System – 113 participants (58 control, 55 system) – MTurk • 1. Asked to predict claim veracity – Likert: Def. False, Prob. F., neutral, Prob. True, Def. T. – Error: distance • For a (definitely) false claim, PF -> error=1, DT -> error=4 • 2. Shown model’s claim prediction 24
  • 25. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System 25
  • 26. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System 26
  • 27. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 1: Setup • 2 Groups: Control vs. System – 113 participants (58 control, 55 system) – MTurk • 1. Asked to predict claim veracity – Likert: Def. False, Prob. F., neutral, Prob. True, Def. T. – Error: distance • For a (definitely) false claim, PF -> error=1, DT -> error=4 • 2. Shown model’s claim prediction – Asked if they want to change their answer? 27
  • 28. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Before seeing model’s claim prediction • “System” group: – claims 1-2: > avg. prediction error – claim 4: < error – claims 3, 5: only small differences • Human accuracy in claim prediction roughly follows model’s accuracy in stance prediction – i.e., helped when model correct, hurt when not 28
  • 29. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking After seeing model’s claim prediction • “System” group: – Smaller changes for errors – change answers > “control” group • Human accuracy in claim prediction roughly follows model’s accuracy in veracity prediction – i.e., helped when model correct, hurt when not 29
  • 30. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Study 1: Statistical Significance (1 of 2) • Mixed-effects Generalized Linear Model (GLM) • Before seeing model’s claim predictions – CSP: # correct stance predictions seen – WSP: # wrong stance predictions seen – 2-tail test includes unlikely possibility that seeing correct stance predictions increases human error 30
  • 31. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Study 1: Statistical Significance (2 of 2) • Mixed-effects Generalized Linear Model (GLM) • After seeing model’s claim predictions – CSP: # correct stance predictions seen – WSP: # wrong stance predictions seen – 2-tail test includes unlikely possibility that seeing correct stance predictions increases human error 31
  • 32. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 2: Setup • 2 Groups: Control vs. Slider 32
  • 33. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking User Study 2: Setup • 2 Groups: Control vs. Slider – 109 participants (51 control, 58 slider) – MTurk • 1. Asked to predict claim veracity – Likert: Def. False, Prob. F., neutral, Prob. True, Def. T. and indicate confidence in prediction – (-20,+20) score range: accuracy x confidence • eg, a correct answer with 75% confidence -> 20x75%=15 33
  • 34. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking • “Slider” group: – Claims 1-3: > score on average – Claims 4-5: < first quartiles, but medians same • Some participants negatively impacted by slider – No statistically significant difference on average 34 User Study 2: Results
  • 35. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Discussion & Future Work • Fact Checking & IR (Lease, DESIRES’18) – How to diversify search results for controversial topics? – Information evaluation (eg, vaccination & autism) • Potential harm as well as good – Potential added confusion, data / algorithmic bias – Potential for personal “echo chamber” – Adversarial applications • Future Work – Making personalization more visible – Collaborative use, small and big • 1st author Nguyen looking for a postdoc! 35
  • 36. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Conclusion • Fact-checking more than black-box prediction – Interaction, exploration, trust • We proposed a mixed-initiative human + AI partnership for fact-checking – Back-end AI + front-end interface/interaction – Support AI + human collaboration – Fair, Accountable, & Transparent (FAT) AI 36
  • 37. Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking Thank You! Slides: slideshare.net/mattlease Lab: ir.ischool.utexas.edu 37