Capturing the Ineffable:
Collecting, Analysing, and Automating
Web Document Quality Assessments
Davide Ceolin, Julia Noordegraaf, Lora Aroyo
• Introduction
• Nichesourcing Web Document Quality Assessments
• User studies
• Conclusion and Future Work
Outline
Introduction
Web Document Quality Assessment
• Source criticism
• Methodological practice from the humanities
• e.g., from the American Library Association:
• How was the source located?
• What type of source is it?
• Who is the author and what are the qualifications of the
author in regard to the topic that is discussed?
• When was the information published?
• In which country was it published?
• What is the reputation of the publisher?
• Does the source show a particular cultural or political bias?
• How does it apply to Web sources?
Web Document Quality Assessment
What is the quality of each of these documents?
• Authoritative source ✓, Accurate ✓, Precise ✓, Complete ✓, Neutral (?)
• Blog post (?), Accurate (?), Precise (?), Complete (?), Neutral ✗
• We adapt source criticism to Web documents and
aim to automate the process of quality
estimation by:
• Gathering quality assessments (mostly from experts).
• Looking for markers (document features) that correlate
with them.
Quality and Quality Markers
Objectives
• Analyse the consistency of quality assessments.
• Are quality assessments consistent among users, over
time, etc.?
• Analyse user ability to interpret document
features.
• Can the users estimate the quality of a document from
its sentiment or trustworthiness level?
• Analyse the predictability of quality assessments.
• Can we automatically estimate the quality of a
document?
Nichesourcing Web Document Quality Assessments
• Dataset: documents about vaccinations
• Initially, 50 docs, various sources (blogs, authorities, etc.)
• Features
• Information (automatically) extracted from documents
using AlchemyAPI & Web of Trust.
• Entities, Topics, Sentiment, Emotions, Trustworthiness.
• Quality dimensions
• Overall quality, accuracy, completeness, precision,
trustworthiness, readability, neutrality.
Dataset, Features, and Quality Dimensions
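The dataset, features, and quality dimensions listed on this slide could be held in a record like the one below. This is only a sketch: the class name, field names, and value types are assumptions made for illustration, and the slides do not describe the actual storage format or the value ranges returned by AlchemyAPI and Web of Trust.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Quality dimensions as named on the slide.
QUALITY_DIMENSIONS = (
    "overall_quality", "accuracy", "completeness", "precision",
    "trustworthiness", "readability", "neutrality",
)


@dataclass
class DocumentRecord:
    """One Web document with its automatically extracted features.

    Field names and types are hypothetical; only the feature
    categories themselves come from the slide.
    """
    document_id: str
    source_type: str                                   # e.g. "blog", "authority"
    entities: List[str] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)
    sentiment: float = 0.0                             # assumed numeric score
    emotions: Dict[str, float] = field(default_factory=dict)
    trustworthiness: float = 0.0                       # assumed numeric score


doc = DocumentRecord(
    document_id="doc-001",
    source_type="blog",
    entities=["measles", "Disneyland"],
    topics=["vaccination"],
    sentiment=-0.2,
    emotions={"fear": 0.4},
    trustworthiness=0.6,
)
```

Keeping the dimension names in one tuple makes it easy to check later that every assessment refers to a known dimension.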
• Setup:
• 6 documents per participant.
• Random selection.
• Even distribution of assessments.
• Scenario:
Suppose you are asked to write an article
about the debate on vaccinations triggered by the
measles outbreak in 2015 at Disneyland in
California.
WebQ: Nichesourcing Web Quality Assessments
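One way to combine random selection with an even distribution of assessments is to shuffle the documents once and hand out consecutive slices of the shuffled order, wrapping around at the end. The sketch below illustrates that idea only; the function name, the seed parameter, and the wrap-around scheme are assumptions, not the assignment procedure WebQ actually uses.

```python
import random
from collections import Counter


def assign_documents(doc_ids, n_participants, per_participant=6, seed=0):
    """Give each participant `per_participant` distinct documents.

    Documents are shuffled once, then dealt out in consecutive slices
    that wrap around, so per-document counts differ by at most one.
    """
    order = list(doc_ids)
    if per_participant > len(order):
        raise ValueError("not enough documents for one participant")
    random.Random(seed).shuffle(order)
    assignments = []
    for p in range(n_participants):
        start = p * per_participant
        batch = [order[(start + k) % len(order)] for k in range(per_participant)]
        assignments.append(batch)
    return assignments


# 50 documents and 20 participants, as in the first user study setup.
assignments = assign_documents(range(50), n_participants=20)
counts = Counter(d for batch in assignments for d in batch)
```

With 50 documents and 20 participants this yields 120 assignments, so each document is assessed either two or three times.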
• Documents are anonymized.
• Users choose documents that meet their quality
criteria based on features only.
• All feature values are shown, alone and together.
WebQ: Task 1
• Read each of the 6 articles.
• Assess it.
• Rate completeness, accuracy, etc.
• Likert scale 1-5.
• Annotate the article to explain the ratings.
• Articles are proxied & annotated through AnnotatorJS.
WebQ: Task 2
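Ratings collected in Task 2 could be validated and summarised per document and dimension roughly as follows. This is a hedged sketch: the tuple layout and function names are hypothetical, and the median is an assumed choice of summary for ordinal Likert data, not something the slides specify.

```python
from collections import defaultdict
from statistics import median


def validate_rating(value):
    """Accept only integer Likert ratings from 1 to 5."""
    if not isinstance(value, int) or not 1 <= value <= 5:
        raise ValueError(f"rating must be an integer in 1..5, got {value!r}")
    return value


def median_ratings(assessments):
    """Summarise (document_id, dimension, rating) tuples by their median."""
    grouped = defaultdict(list)
    for doc_id, dimension, rating in assessments:
        grouped[(doc_id, dimension)].append(validate_rating(rating))
    return {key: median(values) for key, values in grouped.items()}


summary = median_ratings([
    ("d1", "accuracy", 4),
    ("d1", "accuracy", 5),
    ("d1", "accuracy", 2),
    ("d2", "accuracy", 3),
])
```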
User Studies
• User Study 1
• Participants: 20 last-year UvA journalism
students.
• Duration: 60 minutes.
• User Study 2
• Participants: 20 RMA media scholars.
• Duration: 45 minutes.
• Improvements (learnt from user study 1).
Setup
• Data collected:
• 104 (US1) + 47 (US2) assessments.
• 238 (US1) + 89 (US2) annotations.
• No significant difference between the two user studies
(Wilcoxon signed-rank test).
• Assessments from the two studies are comparable.
• Assessment predictability (SVC)
• Up to 63% accuracy (5 classes)
• Up to 89% accuracy (2 classes)
• Promising predictability. We will try other algorithms.
Results
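The gap between the 5-class and 2-class accuracy figures is easier to see with a small worked example. The sketch below does not reproduce the SVC model itself; it only shows how accuracy is computed and how 1-5 ratings might be collapsed into two classes. The threshold of 4 and the sample ratings are assumptions for illustration, since the slides do not say how the two classes were formed.

```python
def to_binary(rating, threshold=4):
    """Collapse a 1-5 rating into 1 (high) or 0 (low); threshold is assumed."""
    return 1 if rating >= threshold else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels exactly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up ratings, not data from the study.
true_ratings = [5, 4, 3, 2, 1, 4]
predicted = [4, 4, 3, 1, 1, 5]

five_class = accuracy(true_ratings, predicted)
two_class = accuracy(
    [to_binary(r) for r in true_ratings],
    [to_binary(r) for r in predicted],
)
```

Predictions that miss by one point on the 5-point scale count as errors in the 5-class setting but often land on the same side of the threshold, which is one reason the 2-class accuracy tends to be higher.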
• Highest correlation with overall quality:
• Accuracy
• Trustworthiness
• Precision
• Completeness
• Given the task at hand, neutrality is not relevant.
• Weak correlation between Task 1 choices and overall quality ratings (Task 2).
• Users were mostly unable to interpret the features shown in Task 1.
Results
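A correlation between one quality dimension and overall quality can be computed on ordinal ratings with a rank correlation. The sketch below uses Spearman's coefficient as an assumed choice, since the slides do not name the coefficient that was used; the function names and sample ratings are illustrative only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up ratings for one dimension versus overall quality.
accuracy_ratings = [1, 2, 3, 4, 5]
overall_ratings = [1, 3, 2, 5, 4]
rho = spearman(accuracy_ratings, overall_ratings)
```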
Conclusion
• We collected Web document quality
assessments.
• WebQ – Nichesourcing application.
• 2 user studies with experts.
• Clearly defined task.
• Controlled dataset.
• We analysed the assessments, and automated
their prediction.
• The task matters more than subjectivity.
• Assessments are quite uniform and coherent.
• Features in isolation are not very meaningful.
• The application setup is important.
Conclusion
• We are currently working on, and plan to continue:
• Extending the dataset (currently ~1,500 documents).
• Scaling up the experiments and gathering more
assessments.
• Involving laymen via crowdsourcing.
• Extending the analyses.
• Utilising other automated reasoning approaches.
(Current and) Future Work
https://qupid-project.net/
d.ceolin@vu.nl
Thank you!
