This document describes the StanceCat shared task on detecting stance and author gender in tweets about Catalan independence. The task comprised stance detection (favor, against, neutral) and gender detection over a corpus of 10,800 tweets in Catalan and Spanish annotated for both attributes. Ten teams participated with 31 runs, applying classifiers such as SVMs and neural networks. Results showed F-measures below 50%, with more errors for tweets by male authors and frequent confusion between stance classes. The dataset was released to facilitate further research on the computational analysis of socio-political debates.
Stance and Gender Detection in Tweets on Catalan Independence. Ibereval@SEPLN 2017
1. Stance and Gender Detection in
Tweets on Catalan Independence
@Ibereval 2017
Viviana Patti, Cristina Bosco, Università degli Studi di Torino
Mariona Taulé, M. Antònia Martí, Universitat de Barcelona
Francisco Rangel, Autoritas & Universitat Politècnica de València
Paolo Rosso, Universitat Politècnica de València
2. StanceCat
• Introduction and Motivations:
Stance vs. Polarity detection
• StanceCat: Task Description
• TW-CaSe corpus
• Evaluation Metrics
• Overview of the submitted approaches
http://stel.ub.edu/Stance-IberEval2017/index.html
3. StanceCat: Introduction & Motivation
The rise of social media encourages users to voice and
share their views, generating a large amount of
social data
These data offer a great opportunity to investigate
communicative behaviors and conversational contexts,
and to extract knowledge about
real-life domains
This shared task is situated within the wider context of
research on communication in online political debates
on Twitter, with the aim of developing resources and
tools for under-resourced languages
4. StanceCat: Introduction & Motivation
Online debates are a large source of informal, opinion-
sharing dialogue on current socio-political issues
Several works rely on finer-grained sentiment analysis
techniques to analyze such debates.
Some of these works are dedicated to classifying users’
stance, i.e. detecting the pro or con positions that users
take towards a particular target entity within a debate
Dual-sided debates: participants can take one of two
polarizing sides
5. StanceCat: Introduction & Motivation
Stance detection, formalized as the task of identifying the
speaker’s opinion towards a particular target, has recently
attracted the attention of researchers in sentiment analysis
Applied to data from microblogging platforms such as Twitter
Monitoring sentiment in a specific political debate
Stance detection not only provides information for improving
the performance of a sentiment analysis system, but also helps
to better understand how people communicate
ideas to highlight their point of view towards a target entity.
6. StanceCat: Introduction & Motivation
Being able to detect stance in user-generated content can
provide useful insights to discover novel information about
social network structures (Lai et al. CLEF 2017)
Detecting stance in social media could become a helpful tool for
journalism, companies, and governments
Politics is an especially good application domain: focusing on
stance is interesting when the target entity is a controversial
issue, e.g., political reforms, or a polarizing person, e.g.,
candidates in political elections, and we observe the
interaction between polarized communities.
7. StanceCat: Introduction & Motivation
Semeval 2016 - Track II. Sentiment analysis Task 6
Detecting Stance in Tweets
http://alt.qcri.org/semeval2016/task6/
“Given a tweet text and a target entity (person, organization,
movement, policy, etc.), automatic natural language systems must
determine whether the tweeter is in favor of the target, against the
given target, or whether neither inference is likely”
Stance detection: automatically determining from text whether
the author is in favor / against / neutral-none w.r.t. a target
8. StanceCat: Stance vs Sentiment
Stance detection is of course related to sentiment analysis
BUT there are significant differences.
In a classical sentiment analysis task, systems have to
determine whether a piece of text is positive, negative, or
neutral.
In stance detection systems have to determine the
favorability towards a given target entity of interest,
where the target may not be explicitly mentioned in the
text.
9. StanceCat: Stance vs Sentiment
Example [source: training set of SemEval-2016 Task 6]
Support #independent #BernieSanders because he’s not a liar.
#POTUS #libcrib #democrats #tlot #republicans #WakeUpAmerica
#SemST.
• Target: Hillary Clinton [context: party presidential primaries
for the Democratic and Republican parties in the US]
• The tweeter expresses a positive opinion towards an adversary of
the target (Sanders)
• We can infer that the tweeter expresses a negative stance towards
the target, i.e. she/he is likely unfavorable towards Hillary Clinton
• Important: the tweet does not contain any explicit clue pointing to the target
• In many cases the stance must be inferred
10. StanceCat: Stance vs Sentiment
• For a deeper exploration of the relation between sentiment and
stance in the SemEval dataset see:
Saif M. Mohammad, Parinaz Sobhani, Svetlana Kiritchenko:
Stance and Sentiment in Tweets. ACM Trans. Internet Techn. 17(3):
26:1-26:23 (2017)
• An interactive visualization of the dataset is available at:
http://www.saifmohammad.com/WebPages/StanceDataset.htm
a useful tool to explore the stance-target combinations present in
the annotated dataset and the relations between stance and
sentiment.
11. StanceCat: Introduction & Motivation
Our focus and stance target: Catalan independence
• Corpus from Twitter, filtered by the hashtags #independencia and #27S
• Time span: end of September 2015 – December 2015
• 27S: September 27, 2015 – regional elections in Catalonia,
a de facto referendum on independence
• #independencia and #27S: two of the hashtags that became established
within the dialogical and social context growing around the topic,
and that were widely used in the debate
12. StanceCat: Introduction & Motivation
Multilingual perspective: different socio-political debates
• French: #mariagepourtous – debate on same-sex marriage
in France (Bosco et al. @LREC2016)
• Italian: #labuonascuola – debate on the reform of the education sector
in Italy (Stranisci et al. @LREC2016)
15. StanceCat: Task Description
StanceCat Task
• SubTask 1 – Stance Detection: deciding whether each message
is neutral, in favor of, or against the target ‘Catalan Independence’
• SubTask 2 – Gender Detection: identifying the gender of
the author of the message
Languages: Catalan and Spanish
16. StanceCat: Task Description
Stance detection: SemEval-2016 Task 6; author profiling: PAN@CLEF.
Novelty: The two tasks have never been performed together for Spanish
and Catalan as part of one single task.
Results will be of interest not only for sentiment analysis but also for
author profiling and for socio-political studies
20. StanceCat: Corpus
• Example:
Language: Catalan
Stance: FAVOR
Gender: FEMALE
Tweet: 15 diplomàtics internacional observen les plesbiscitàries, será que
interessen a tothom menys a Espanya #27
‘ 15 international diplomats observe the plebiscite, perhaps it is of
interest to everybody except to Spain #27’
21. StanceCat: Corpus
• Criteria:
– Tweet text: emoticons, @mentions and #hashtags are kept
– Links (webpages, photographs, videos…): excluded in TW-CaSe 0.1,
included in TW-CaSe 1.0
22. StanceCat: Corpus
• Annotation procedure:
1. Three trained annotators tagged the stance of 500 Catalan tweets and of
500 Spanish tweets in parallel
2. Inter-annotator Agreement Test (IAT)
3. Annotation of the whole corpus individually
Annotators: 3 trained annotators + 2 senior researchers
Meetings: once a week; problematic cases solved by common
consensus
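Step 2 above, the Inter-annotator Agreement Test, is not detailed on the slide; a standard pairwise agreement measure for this kind of check is Cohen's kappa. The sketch below uses toy labels, not the actual TW-CaSe annotations, and the function name is illustrative:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two annotators labeling the same items:
    observed agreement corrected for the agreement expected by
    chance from each annotator's label distribution."""
    assert len(a) == len(b)
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0

# Toy example: two annotators disagreeing on one of four tweets.
kappa = cohens_kappa(
    ["FAVOR", "FAVOR", "AGAINST", "NONE"],
    ["FAVOR", "NONE", "AGAINST", "NONE"],
)
```

In practice the test would be run per language and per annotator pair before moving to step 3.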
24. StanceCat: Corpus
• Disagreements: Communicative intentions are unclear
Language: Spanish
Stance: NONE (annotator A = AGAINST; annotators B and C = NONE)
Gender: MALE
Tweet: #27 voy a denunciar a todo aquel q me siga insultando usando ls
red. Yo no soy imbécil, ni mi bandera es n trapo
‘#27 I’m going to report anyone who keeps insulting me on the web. I’m
not stupid, nor is my flag a rag’
25. StanceCat: Corpus
• Disagreements: Communicative intentions are unclear
Language: Catalan
Stance: NONE (annotator A = AGAINST; B = FAVOR; C = NONE)
Gender: MALE
Tweet: La @cupnacional t la clau de Matrix
‘@cupnacional has the key to the Matrix’
26. StanceCat: Corpus
• Distribution of labels for stance, gender and language:

                Female                  Male
      favor  against   none   favor  against   none    Total   Dataset
Cat   1,456       57    646   1,192       74    894    4,319   training
        365       14    162     298       18    224    1,081   test
Spa     145      693  1,322     190      753  1,216    4,319   training
         36      173    331      48      188    305    1,081   test
27. StanceCat: Evaluation Metrics
• Macro-average of the F-score over FAVOR and
AGAINST to evaluate stance
(as in SemEval 2016)
• Accuracy to evaluate gender
(as in PAN@CLEF)
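The stance measure above can be sketched as follows; the function name and toy labels are illustrative, not the official scorer. Note that NONE still contributes false positives and false negatives but gets no F-score of its own:

```python
def macro_f_favor_against(gold, pred):
    """SemEval-2016-style stance score: the mean of F1(FAVOR)
    and F1(AGAINST), computed over all predictions."""
    def f1(label):
        tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    return (f1("FAVOR") + f1("AGAINST")) / 2

# Toy run: one correct FAVOR, one missed AGAINST, one wrong AGAINST.
score = macro_f_favor_against(
    ["FAVOR", "AGAINST", "NONE", "FAVOR"],
    ["FAVOR", "NONE", "NONE", "AGAINST"],
)
```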
28. StanceCat: Baselines
• Majority class: a baseline that always returns
the majority class of the training set.
• LDR (Low Dimensionality Representation):
– The key concept is the probability of occurrence
(weight) of each word in the training set in
each of the possible classes.
– The distribution of weights for a document
should be more similar to the distribution of
weights of its corresponding class.
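The LDR idea above can be sketched as follows. This is a simplification (the full representation also summarizes the weight distribution with further statistics); the corpus, labels, and function names are toy assumptions, not the TW-CaSe data:

```python
from collections import defaultdict

def train_ldr(docs, labels):
    """Learn, for each training-set word, its probability of
    occurrence (weight) in each of the possible classes."""
    counts = defaultdict(lambda: defaultdict(int))  # word -> class -> count
    for doc, label in zip(docs, labels):
        for word in doc.lower().split():
            counts[word][label] += 1
    classes = sorted(set(labels))
    weights = {
        word: {c: per_class.get(c, 0) / sum(per_class.values()) for c in classes}
        for word, per_class in counts.items()
    }
    return weights, classes

def represent(doc, weights, classes):
    """Low-dimensional document vector: the average class weight
    of the document's known words, one component per class."""
    words = [w for w in doc.lower().split() if w in weights]
    if not words:
        return [0.0] * len(classes)
    return [sum(weights[w][c] for w in words) / len(words) for c in classes]

weights, classes = train_ldr(
    ["independencia si", "no a la independencia"], ["FAVOR", "AGAINST"]
)
vec = represent("independencia si", weights, classes)
```

The resulting low-dimensional vector would then be fed to a standard classifier, so that a document's weight distribution resembles that of its class.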
30. StanceCat: Approaches
• Classification approaches: SVM, decision trees, random forest,
logistic regression, multinomial Naive Bayes, and neural networks
(multilayer perceptron / MLP, LSTM, Bi-LSTM, CNN, Kim’s CNN, fastText)
• Participants: ltl_uni_due, iTACOS, ARA1337, ELiRF-UPV,
LTRC_IIITH, atoppe, LuSer, deepCybErNet
• Features: word n-grams, character n-grams, POS, hashtags,
stylistic features (number of hashtags, number of words…),
stance- and gender-specific tokens, word embeddings,
n-gram embeddings, one-hot vectors
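One of the surface features listed above, character n-grams, can be extracted in a few lines; the helper name and example tweet are illustrative, not any team's actual code:

```python
from collections import Counter

def char_ngrams(text, n_values=(2, 3)):
    """Count lowercased character n-grams of the given sizes,
    a typical surface feature fed to classifiers such as SVMs."""
    text = text.lower()
    grams = Counter()
    for n in n_values:
        for i in range(len(text) - n + 1):
            grams[text[i:i + n]] += 1
    return grams

features = char_ngrams("#27S vota")
```

Each tweet's counts would then be mapped into a shared vocabulary-indexed vector (optionally tf-idf weighted) before training.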
35. StanceCat: Error Analysis
• More errors in the case of male authors.
• In Catalan, more errors from Against to
Favor; in Spanish, more errors from Favor
to Against.
• In Spanish, errors from Against to Favor
are minimal (2%).