CUbRIK research on social aspects

CUbRIK Summer School 2014
Mining, Analyzing and Exploiting Community Feedback on the Web
Sergiu Chelaru
L3S Research Center, Hannover

2-4/07/2014 CUbRIK Summer School 1
Community Feedback on the Web
Comments: a way to communicate with users and/or communities

Outline
Comment-Centric Feedback
Comment Ratings
Polarized Content
Controversial Comments
Trolls
Social Feedback
Query Result Characteristics
Social Features
Learning to Rank using Social Features
Community Sentiment in Web Queries
Analysis of Sentiment in Web Queries
Detecting Query Sentiment
Two Application Scenarios
Summary and Contributions

Comment-Centric Feedback
YouTube dataset
756 Google Zeitgeist keywords
50 videos, metadata, 500 comments
67k videos, 6 mil comments
Yahoo! News dataset
Yahoo! RSS Feed, Sept-Dec 2011
27k news stories
5.4 mil comments
Descriptive statistics for the YouTube and Yahoo! News corpora.

Comment-Centric Feedback
Distribution of the number of comments for videos in YouTube and news stories in Yahoo! News.

Comment Ratings
Distribution of comment ratings for (a) YouTube, and (b) Yahoo! News.

Term Analysis of Rated Comments
Top-50 terms according to their MI values for accepted comments (with high
comment ratings) vs. not accepted comments (with low comment ratings).
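A minimal sketch of ranking terms by mutual information (MI) between term occurrence and the accepted / not accepted class. It assumes comments are already tokenized; the function names and the toy data in the usage note are illustrative and not taken from the original study.

```python
import math
from collections import Counter

def mutual_information(n11, n10, n01, n00):
    """MI in bits between term occurrence and class membership.

    n11: accepted comments containing the term
    n10: not accepted comments containing the term
    n01: accepted comments without the term
    n00: not accepted comments without the term
    """
    n = n11 + n10 + n01 + n00
    mi = 0.0
    # Each tuple: (joint count, term marginal, class marginal).
    for joint, term_marg, class_marg in (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ):
        if joint > 0:
            mi += (joint / n) * math.log2(n * joint / (term_marg * class_marg))
    return mi

def top_terms_by_mi(accepted, not_accepted, k=50):
    """Return the k terms with the highest MI; inputs are lists of token lists."""
    df_acc = Counter(t for doc in accepted for t in set(doc))
    df_rej = Counter(t for doc in not_accepted for t in set(doc))
    scores = {}
    for term in set(df_acc) | set(df_rej):
        n11, n10 = df_acc[term], df_rej[term]
        scores[term] = mutual_information(
            n11, n10, len(accepted) - n11, len(not_accepted) - n10
        )
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

MI is symmetric with respect to the two classes, so a term would still have to be assigned to the class in which it is relatively more frequent. For example, with `accepted = [["love", "song"], ["love", "it"]]` and `not_accepted = [["hate", "song"], ["hate", "this"]]`, the two top-ranked terms are "love" and "hate".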

Examples of comments belonging to the category “accepted”.

Examples of comments belonging to the category “not accepted”.

Sentiment Analysis of Rated Comments
Do the language and sentiment used by the community influence comment ratings?
Three disjoint partitions:
5Neg: comments with rating score r ≤ -5
0Dist: comments with rating score r = 0
5Pos: comments with rating score r ≥ 5
Comparison of mean senti-values for comments with different kinds of community ratings in (a) YouTube and (b) Yahoo! News.
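The partitioning and the per-partition mean can be sketched as follows. This assumes each comment already carries a precomputed senti-value (how that value is obtained is not shown here); the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def rating_partition(rating):
    """Map a comment rating to 5Neg, 0Dist, 5Pos, or None if in no partition."""
    if rating <= -5:
        return "5Neg"
    if rating == 0:
        return "0Dist"
    if rating >= 5:
        return "5Pos"
    return None  # ratings such as -3 or 2 fall outside the three partitions

def mean_senti_by_partition(comments):
    """comments: iterable of (rating, senti_value) pairs (senti_value assumed given)."""
    buckets = defaultdict(list)
    for rating, senti in comments:
        partition = rating_partition(rating)
        if partition is not None:
            buckets[partition].append(senti)
    return {name: mean(values) for name, values in buckets.items()}
```

Comments whose rating lies strictly between the thresholds are ignored, which keeps the three partitions disjoint.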

Ratings and Polarized Content
Variance of Comment Ratings as Indicator for Polarizing Videos

Ratings and Polarized Content
Variance of Comment Ratings as Indicator for Polarizing Topics

Predicting Comment Ratings
Classify comments into accepted by the community and not accepted
AC_POS
AC_NEG
THRESH-0
Text processing: stopword removal, stemming
$(c_1, l_1), \ldots, (c_n, l_n)$
Rating thresholds for “accepted” vs “not accepted”
Different amounts for training set size T

Comment rating classification: BEPs for different training set sizes T and
different rating thresholds.

Precision-recall curves for
comment rating prediction.
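One common way to compute a precision-recall break-even point (BEP) from classifier scores is sketched below. It relies on the fact that, when the cutoff equals the number of positive examples, precision and recall coincide. This is an illustrative sketch, not necessarily the exact procedure used for the reported numbers.

```python
def break_even_point(scores, labels):
    """BEP: precision (= recall) at a cutoff equal to the number of positives.

    scores: classifier confidence per item (higher means more likely positive)
    labels: 1 for positive, 0 for negative
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("BEP is undefined without positive examples")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_pos])
    return hits / n_pos
```

For example, scores `[0.9, 0.8, 0.7, 0.6, 0.5]` with labels `[1, 0, 1, 1, 0]` contain three positives, two of which are ranked in the top three, so the BEP is 2/3.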

Controversial Comments
In many platforms
“For some reason, a lot of you thing that rich people pay
NO taxes? They pay taxes even though 50% of Americans
do not. What Obama wants to do is RAISE their taxes.
That’s not fair. Let’s make sure everyone pays taxes and
politicians use tax money in a sensible way before we
raise taxes on a few.”
10
15
comment_rating = #likes - #dislikes

Controversial Comments
Examples of comments belonging to the categories “controversial” and
“non-controversial”.

Term Analysis of Controversial Comments
“bank”: banks are criticized for their role in the financial crisis; such comments are approved by a large majority of users.
Top-20 terms according to their MI values for controversial vs. non-controversial comments.

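Analysis of likes and dislikes
Comment Approval Ratio
$\Phi_c = \frac{l_c}{l_c + d_c}$
$l_c$ ($d_c$): number of likes (dislikes) for a comment $c$
Controversy Interval
$0.5 - \delta_C \le \Phi_c \le 0.5 + \delta_C$, $\delta_C = 0.1$
Non-controversy Interval
$0.5 - \delta_{NC} \le \Phi_c \le 0.5 + \delta_{NC}$, $\delta_{NC} \in \{0.1, 0.2, 0.3\}$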
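The comment rating, the approval ratio, and the controversy test can be sketched directly from these definitions. The function names and the like/dislike counts in the usage note are hypothetical.

```python
def comment_rating(likes, dislikes):
    """Comment rating as defined on the slides: #likes - #dislikes."""
    return likes - dislikes

def approval_ratio(likes, dislikes):
    """Share of likes among all ratings received by a comment."""
    total = likes + dislikes
    if total == 0:
        raise ValueError("approval ratio is undefined for unrated comments")
    return likes / total

def is_controversial(likes, dislikes, delta_c=0.1):
    """True if the approval ratio lies within 0.5 +/- delta_c."""
    return abs(approval_ratio(likes, dislikes) - 0.5) <= delta_c
```

A comment with 12 likes and 13 dislikes has a rating of -1 and an approval ratio of 0.48, so it counts as controversial. A comment with 95 likes and 5 dislikes has an approval ratio of 0.95 and does not.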

Analysis of likes and dislikes
(a) Distribution of number of comments per comment approval intervals for distinct thresholds for the number of received ratings. (b) Controversy interval vs. accepted (positive) and not accepted (negative) intervals.

Predicting Controversial Comments
BEPs for controversial comment prediction.
Note that:
• BEPs relatively low
• Results implementable
• Trading recall for precision leads to applicable results: P = 0.859 for R = 0.1
Precision-recall curve for the classification of controversial comments for $\delta_{NC} = 0.4$

Trolls on the Social Web
Trolls: “posting disruptive, false or offensive comments to fool and provoke other users”
Study comment rating feedback for troll/non-troll users
Study methods for automatically detecting the presence of trolls
Slashdot No More Trolls: 200 trolls, 200 non trolls, 24 comments / user
YouTube dataset

Trolls on the Social Web
Johny1
Mexican, Puerto Rican, Cuban ... who cares?
I love that this Negro says/ sings: "If I WERE a boy."
I would feel awful about admitting being a Republican.
I hope Britney Slut will die of Swine flu.
I love that this Negro says/ sings: "If I WERE a boy."
All I want is that she doesn't rape valuable classical songs. Even a diva like this Beyoncé doesn't have the right to commit such a crime.
Johny2
you obviously have no idea what you are talking about.
Shut up you douchebag.
Moron. If the religious groups did not subject their will on to everyone, there would not even need to be an atheist title. No one would care.
Perhaps people with speak issues should be euthanized.
Kinda the point there, dipshit.
You are quite the ignorant fuckwit. They do look like crap, you have no idea what you're talking about. Most likely don't have the device either. Moron.
Examples of troll users in YouTube (Johny1) and Slashdot (Johny2).

Term Analysis of Troll Comments
Top-20 terms according to their MI values for troll vs. non-troll comments.

Trolls and Community Ratings
Comment rating distribution for comments from troll users and non-troll users in (a) YouTube, and (b) Slashdot.

Content-based Troll Prediction
Linear SVM, 2-fold cross validation
BEP: 0.68 for YouTube, 0.74 for Slashdot

Outline
Comment Ratings
Trolls
Social Feedback
Social Features

Social Feedback

Contribution
What are the characteristics of the YouTube query results with respect to the social features?
How effective is each individual feature for ranking the videos for a given query?
Can social features help improve video retrieval performance in a learning to rank (LETOR) framework?

Data Collection
Query Sets
1.4k popular queries ($Q_p$)
1.3k tail queries ($Q_t$)
Video Sets
$V_p$: 132k videos retrieved for $Q_p$
$V_t$: 63k videos retrieved for $Q_t$

Query Result Characteristics
Category distribution of (a) popular, and (b) tail queries

Number of results (reported by YouTube) for (a) popular, and (b) tail queries

Avg. no. of (a) views, (b) likes, (c) dislikes and (d) comments vs. video rank in the query results

Data Annotation
100 queries, 100 videos/query =>10k videos

Basic and Social Features
The list of all the basic and social features (F) employed in our work.

Effectiveness of Features
Fraction of queries for which a given feature yields the ranking with the highest NDCG@10 for (a) popular, and (b) tail queries

Video Retrieval Framework
7 LETOR algorithms
Feature Selection
GAS
MMR
(q, F, r)
5-fold cross validation
NDCG@10, NDCG@5
For k ∈ {1, ..., #features}:
Train queries + videos -> top-k feature selection -> build k-dimensional query-video pairs -> train 7 LETOR models
Test queries + videos -> run prediction models -> NDCG
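The evaluation metric used in this framework, NDCG@k, can be sketched as follows. The exponential gain (2^rel - 1) with a log2 discount is a widely used formulation and is an assumption here, as is computing the ideal ranking from the same list of judged results; the slides do not state which variant was used.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances[:k])
    )

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query; relevances are listed in ranked order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists the videos in decreasing relevance, such as `[3, 2, 1, 0]`, scores 1.0. Averaging `ndcg_at_k` over all test queries gives the per-model score reported for each fold.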

LETOR Results for Popular+Tail
Average NDCG@10 scores for LETOR algorithms using the basic and best-k features obtained with the GAS and MMR strategies for the popular and tail query sets (for bold cases, differences from the baseline are statistically significant). For GAS and MMR, we also denote the number of selected features (k) in parentheses.

Outline
Comment Ratings
Trolls
Social Feedback
Social Features

Contribution
Analysis of sentiment in Web queries
Study the applicability of state-of-the-art sentiment analysis methods for detecting the sentiment of the queries
Employ query sentiment detectors in two use cases, query recommendation and controversial topic discovery

What is Sentiment Analysis
1
Examples of positive (top) and negative (bottom) opinionated reviews for the movie Madagascar 3: Europe’s Most Wanted.

Data Collection
50 controversial topics from procon.org and Wikipedia (e.g., abortion, iPhone, marijuana)
AOL query log
31,053 queries
7,651 annotated queries
Templates for gathering queries (along with the number of manually annotated queries per template)

Sentiment in Web Queries
Queries and sentiment categories for the topic “George Bush”.

Sentiment in Query Results
Traces of bias in top-k query results
60 queries, 600 titles, 600 snippets
Sentiment distribution of (a) query result titles, and (b) query result snippets for the queries from each sentiment class.

Post-Retrieval Analysis
Post-retrieval behaviour of the user
MSN log, 5 topics, 1.5k queries, 79 opinionated, 222 clicked pages
Sentiment distribution of the clicked results for (a) positive queries, and (b) negative queries.

Detecting Query Sentiment
Study state-of-the-art methods to detect the sentiment class of a query
Feature vectors
Query text, top-10 result titles and snippets
TF-IDF weights, stemming, stopwords, negations
Classification approaches
Simple logistic regression (SLR)
Naive Bayes (mNB)
3 SVM types
3 types of one-vs-all (binary) classifiers
50/50 split for training/testing
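The feature vectors listed above can be illustrated with a minimal TF-IDF sketch. It covers only tokenization, stopword removal, and TF-IDF weighting; stemming and negation handling are omitted. The tiny stopword list, the raw-count TF with log(N/df) IDF, and the function names are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list, not the one used in the study.
STOPWORDS = {"the", "is", "a", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9']+", text.lower()) if t not in STOPWORDS]

def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document.

    A document here would be the query text, optionally concatenated with
    the titles and snippets of its top-10 results.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * math.log(n_docs / df[t]) for t, count in tf.items()})
    return vectors
```

Terms that occur in every document, such as "economy" in the two queries "economy is really bad" and "economy is great", receive weight 0, while class-bearing terms like "bad" keep a positive weight.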

Classification accuracy and AUC for the subjective vs. all classifiers trained with four different representations of the queries (QAll stands for QTextTitleSnippet).

Precision-recall curves and BEPs for (a) subjective vs. all, (b) positive vs. all, and (c) negative vs. all classifiers.

Recommender Methods
Improvement of recommendations by analyzing the sentiment of the suggested query
Our approach: opinionated suggestions
For a query q, generate query suggestions having the same sentiment class as q
Baseline: search engine suggestions
Issue q to a search engine (SE) (Nov-2011), collect suggested and related queries
Evaluation: compare the opinionated suggestions vs. the SE suggestions
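The idea of opinionated suggestions, keeping only candidates in the same sentiment class as the seed query, can be sketched as below. The keyword-based classifier is a toy stand-in for a trained query sentiment detector, and the word lists, class names, and function names are hypothetical.

```python
# Toy lexicons standing in for a trained query sentiment classifier.
POSITIVE = {"good", "great", "strong"}
NEGATIVE = {"bad", "terrible", "weak"}

def toy_sentiment_class(query):
    """Return 'negative', 'positive', or 'objective' (illustrative only)."""
    tokens = set(query.lower().split())
    if tokens & NEGATIVE:
        return "negative"
    if tokens & POSITIVE:
        return "positive"
    return "objective"

def opinionated_suggestions(query, candidates, classify=toy_sentiment_class):
    """Keep candidate queries whose sentiment class matches that of the seed query."""
    target = classify(query)
    return [c for c in candidates if c != query and classify(c) == target]
```

For the seed query "economy is really bad" and the candidates "economy is terrible", "economy is great", and "economy news", only "economy is terrible" is kept. In practice `classify` would be replaced by a learned model.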

Recommender Methods
User study
Suggested query: relevant/irrelevant/undecided
15 topics, 30 seed queries, 600 annotated suggestions
CS researchers, AMT workers
Query recommendation performance based on (a) in-house annotations, and (b) AMT annotations.

Recommender Methods
Search engine’s suggestions (provided as “related queries” and “auto-completions”, the latter are shown in italics) vs. opinionated suggestions for the query “economy is really bad”.

Controversial Topic Discovery
Classify sentiment in queries, infer controversial topics
A toy example illustrating controversial topic detection: the procedure outputs only “zen” as controversial, because its queries show very high variance in sentiment scores, and filters out “zendaya”, whose queries show less variance.
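A minimal sketch of this variance-based filter is given below. The per-query sentiment scores and the variance threshold are made-up illustrative values; the slides do not give the actual scores or cutoff.

```python
from statistics import pvariance

def controversial_topics(topic_scores, min_variance=0.25):
    """Rank topics by variance of their query sentiment scores.

    topic_scores: {topic: [sentiment score per query]}
    min_variance: hypothetical cutoff below which a topic is filtered out
    """
    variances = {
        topic: pvariance(scores)
        for topic, scores in topic_scores.items()
        if scores
    }
    ranked = sorted(variances.items(), key=lambda item: item[1], reverse=True)
    return [topic for topic, variance in ranked if variance >= min_variance]
```

With scores such as `{"zen": [1.0, -1.0, 0.9, -0.8], "zendaya": [0.6, 0.7, 0.5, 0.6]}`, only "zen" is returned, since its queries swing between strongly positive and strongly negative sentiment.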

Controversial Topic Discovery
Topics ranked with respect to the variance in sentiment scores of their queries.
Wicca: a modern pagan religion
cult, good, right
fake, evil, stupid

Summary and Contributions
In-depth analysis of 11 million comments
Studied dependencies between comment ratings and textual content
Explored the applicability of ML techniques to detect accepted and controversial comments
Studied users exhibiting offensive behaviour
Social Feedback
Analysed query/query result characteristics for popular and tail queries
Effectiveness of individual social features for LETOR

Summary and Contributions
Community Sentiment in Web Queries
Studied sentiment in Web search queries
Methods able to detect the sentiment class of a query
Application 1: Query recommendation method
Application 2: Controversial topic discovery method

Publications
Chelaru, S., Altingovde, I. S., Siersdorfer, S., and Nejdl, W. Analyzing, detecting, and exploiting sentiment in web queries. ACM Transactions on the Web 8, 1 (Dec. 2013), 6:1–6:28.
Chelaru, S., Altingovde, I. S., and Siersdorfer, S. Analyzing the polarity of opinionated queries. In ECIR ’12, Springer-Verlag, pp. 463–467.
Siersdorfer, S., Chelaru, S., Nejdl, W., and San Pedro, J. How useful are your comments?: analyzing and predicting YouTube comments and comment ratings. In WWW ’10, ACM, pp. 891–90
Siersdorfer, S., Chelaru, S., San Pedro, J., Altingovde, I. S., and Nejdl, W. Analyzing and mining comments and comment ratings on the social web. ACM Transactions on the Web 8 (June 2014), 17:1–17:39.
Chelaru, S., Orellana-Rodriguez, C., and Altingovde, I. S. Can social features help learning to rank YouTube videos? In WISE ’12, Springer-Verlag, pp. 552–566.
Chelaru, S., Orellana-Rodriguez, C., and Altingovde, I. How useful is social feedback for learning to rank YouTube videos? World Wide Web Journal (2013), 1–29.
Chelaru, S., Herder, E., Djafari Naini, K., and Siehndel, P. Recognizing skill networks and their specific communication and connection practices. In HT ’14 (Accepted Paper), ACM.
Demartini, G., Siersdorfer, S., Chelaru, S., and Nejdl, W. Analyzing political trends in the blogosphere. In ICWSM ’11.

Thanks
Questions?
