=== Paper available here http://tiny.cc/ItalianSentimentAnalysis ===
English. This paper describes the UNIBA team participation in the SENTIPOLC task at EVALITA 2014. We propose a supervised approach relying on keyword, lexicon and micro-blogging features, as well as a representation of tweets in a word space. Our system ranked 1st in both the subjectivity and polarity detection subtasks. As a further contribution, we participated in the unconstrained run, investigating the use of co-training to automatically enrich the labelled training set.
Italian. This paper reports the results of the UNIBA team's participation in the SENTIPOLC task at EVALITA 2014. The supervised approach we propose combines keywords with a semantic representation of tweets in a geometric space, the use of features typical of micro-blogs, and dictionaries for defining the a priori polarity of the tweets' lexicon. We also experimented with co-training to enrich the dataset through the automatic annotation of new tweets.
UNIBA at EVALITA 2014-SENTIPOLC Task:
Predicting tweet sentiment polarity combining
micro-blogging, lexicon and semantic features
Pierpaolo Basile and Nicole Novielli
pierpaolo.basile@uniba.it, nicole.novielli@uniba.it
Department of Computer Science
University of Bari Aldo Moro (ITALY)
EVALITA 2014
Pisa, 11th December 2014
Basile and Novielli (UNIBA) UNIBA-SENTIPOLC EVALITA 2014 11 Dec. '14 1 / 21
Subtask A - Subjectivity Classification: classify subjectivity/objectivity of the
tweet's content
La Yamaha è buona solo ad alzar polemiche. E la Honda a crearne. → subj
Domani sera a Palazzo Chigi incontro informale tra Mario Monti e parti sociali: I leader
di Cgil, Cisl, Uil e Ug... http://bit.ly/rNWoB0 → obj
Subtask B - Polarity Classification: classify tweet's polarity: positive, negative,
neutral, and tweets expressing both positive and negative sentiment
#Maroni La politica del governo Monti è sbagliata → neg
Coma iniziare un buon sabato? Guardandosi South Park il film! → pos
Pilot Task, Irony Detection
governo #monti : più equità o più equitalia?
The System: Our Approach
Our Approach
Supervised approach relying on three kinds of features
Keywords and micro-blogging properties of tweets
Content representation in a distributional semantic model
Sentiment lexicon
Our contribution
Represent both the tweets and the polarity classes in the word space
Automatically develop a sentiment lexicon for Italian starting from
SentiWordNet
The System: Features
Building a Sentiment Lexicon for Italian
Automatic development of a sentiment lexicon starting from SentiWordNet
Lexicon-based features
features based on sentiment scores of tokens
sentiment variation in the tweet
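The lexicon-based features above can be sketched as follows. This is a minimal illustration, not the authors' code: the toy lexicon, the feature names, and the use of the standard deviation as the "sentiment variation" measure are all assumptions.

```python
# Illustrative sketch: derive tweet-level features from the prior-polarity
# scores of tokens, including a simple "sentiment variation" feature
# (here: population standard deviation of the scores, an assumption).
from statistics import mean, pstdev

# Hypothetical lexicon: token -> prior polarity score in [-1, 1]
LEXICON = {"buono": 0.8, "sbagliata": -0.7, "polemiche": -0.5, "crisi": -0.6}

def lexicon_features(tokens):
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    if not scores:
        return {"mean": 0.0, "max": 0.0, "min": 0.0, "variation": 0.0}
    return {
        "mean": mean(scores),        # overall polarity of the tweet lexicon
        "max": max(scores),          # most positive token
        "min": min(scores),          # most negative token
        "variation": pstdev(scores), # spread of sentiment within the tweet
    }
```

A tweet mixing positive and negative tokens gets a high "variation" value, which is one way mixed-sentiment tweets could be signalled to the classifier.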
The System: Features
Building the Word Space from Unlabeled Tweets
Homogeneous representation of both tweets and polarity classes in the word space
Distributional features
tweet vector t as semantic feature in training our classifiers
similarity between t and each prototype vector
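A minimal sketch of these distributional features, under the assumption that a tweet vector is built by summing its word vectors and that one similarity feature is computed per polarity-class prototype. The toy 3-d vectors and their values are invented for illustration.

```python
# Sketch: tweet vector as the sum of word vectors (assumption), plus cosine
# similarity to a prototype vector per polarity class.
import math

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv) if nu and nv else 0.0

def tweet_vector(tokens, word_space, dim=3):
    vec = [0.0] * dim
    for t in tokens:
        if t in word_space:
            vec = add(vec, word_space[t])
    return vec

# Toy word space and class prototypes (hypothetical values; the real space
# has 200 dimensions and is built from 10 million tweets)
WORD_SPACE = {"buono": [1.0, 0.2, 0.0], "sbagliata": [-0.9, 0.1, 0.3]}
PROTOTYPES = {"pos": [1.0, 0.0, 0.0], "neg": [-1.0, 0.0, 0.0]}

t = tweet_vector(["buono"], WORD_SPACE)
features = {c: cosine(t, p) for c, p in PROTOTYPES.items()}
```

Both the raw tweet vector and the per-class similarities can then be fed to the classifier as additional features.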
The System: Features
Downloading Tweets
Idea
Use four lexicons extracted from training data, one for each class (pos, neg, subj, obj)
Probability assigned to each token:
P(t|c_i) = (#t + 1) / (#tot_i + |V|)   (1)
Rank tokens in descending order according to the Kullback-Leibler divergence (KLD):
KLD(t) = P(t|c_s) log( P(t|c_s) / P(t|c_o) )   (2)
The top 50 terms in the rank for each class are used as seeds for downloading the
same number of tweets for each lexicon
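The seed-selection step can be sketched as follows. This is an illustration assuming add-one smoothing as in equation (1) and the KLD ranking of equation (2); the function names and toy data are hypothetical.

```python
# Sketch of KLD-based seed selection: smoothed token probabilities per class
# (equation (1)), then tokens ranked by KLD against the opposite class
# (equation (2)).
import math
from collections import Counter

def token_probs(tokens, vocab_size):
    counts = Counter(tokens)
    total = len(tokens)
    # P(t|c) = (#t + 1) / (#tot + |V|), equation (1)
    return {t: (counts[t] + 1) / (total + vocab_size) for t in counts}

def kld_rank(class_tokens, other_tokens, top_n=50):
    vocab = set(class_tokens) | set(other_tokens)
    p_s = token_probs(class_tokens, len(vocab))
    p_o = token_probs(other_tokens, len(vocab))
    # smoothed probability of a token unseen in the other class
    floor = 1 / (len(other_tokens) + len(vocab))
    scores = {
        t: p * math.log(p / p_o.get(t, floor))  # equation (2)
        for t, p in p_s.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Tokens that are frequent in one class and rare in the other receive the highest scores, which is exactly what makes them good query seeds.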
The System: Features
Keywords and Micro-blogging features
Keywords
exploit tokens occurring in the tweets (unigrams)
replace the user mentions, URLs and hashtags with three metatokens: USER,
URL and TAG
Micro-blogging
the use of upper case and character repetitions
positive and negative emoticons
informal expressions of laughter (i.e., sequences of "ah")
the presence of exclamation and interrogative marks
occurrences of adversative words, disjunctive words, conclusive words and
explicative words
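The preprocessing and micro-blogging features above can be sketched as follows. The regex patterns are assumptions for illustration, not the authors' exact rules, and the adversative/disjunctive/conclusive/explicative word lists are omitted.

```python
# Illustrative sketch: metatoken replacement (USER, URL, TAG) plus a few of
# the micro-blogging features (uppercase words, character repetitions,
# emoticons, laughter, punctuation). Patterns are assumed, not the authors'.
import re

def normalize(text):
    text = re.sub(r"https?://\S+", "URL", text)  # URLs first (may contain #)
    text = re.sub(r"@\w+", "USER", text)
    text = re.sub(r"#\w+", "TAG", text)
    return text

def microblog_features(text):
    return {
        "upper_words": len(re.findall(r"\b[A-Z]{2,}\b", text)),
        "char_repeat": len(re.findall(r"(\w)\1{2,}", text)),
        "pos_emoticons": len(re.findall(r"[:;]-?[)D]", text)),
        "neg_emoticons": len(re.findall(r":-?\(", text)),
        "laughter": len(re.findall(r"\b(?:ah){2,}\b", text, re.IGNORECASE)),
        "exclamations": text.count("!"),
        "questions": text.count("?"),
    }
```

Normalizing mentions, URLs and hashtags into metatokens keeps the unigram vocabulary small while preserving the fact that the tweet contained them.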
Learning
Learning step
Constrained run: only the provided training data can be used, but lexicons are
allowed; we extract the features from the training data and run the learning algorithm
Unconstrained run: additional training data can be included; we investigate a
co-training approach to automatically add new examples to the training set
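A hedged sketch of a generic co-training loop: the slide only says co-training was investigated, so the classifier interface, the confidence threshold, and the toy keyword classifier below are invented for illustration.

```python
# Generic co-training loop (illustrative): two classifiers trained on
# different feature views label unlabeled tweets; high-confidence
# predictions are added back to the shared training set.

def cotrain(clf_a, clf_b, labeled, unlabeled, rounds=3, threshold=0.9):
    """labeled: list of (example, label) pairs; unlabeled: list of examples."""
    pool = list(unlabeled)
    for _ in range(rounds):
        clf_a.fit(labeled)
        clf_b.fit(labeled)
        added, rest = [], []
        for x in pool:
            for clf in (clf_a, clf_b):
                label, confidence = clf.predict(x)
                if confidence >= threshold:
                    added.append((x, label))
                    break
            else:
                rest.append(x)  # neither view is confident: keep unlabeled
        if not added:
            break  # nothing confident left to add
        labeled, pool = labeled + added, rest
    return labeled

# Toy "classifier" for demonstration: labels any tweet containing a known
# positive token, with fixed confidence (purely hypothetical behaviour).
class KeywordView:
    def __init__(self, keywords):
        self.keywords = keywords
    def fit(self, labeled):
        pass  # a real classifier would be retrained here each round
    def predict(self, tweet):
        hit = any(k in tweet for k in self.keywords)
        return ("pos", 1.0) if hit else ("neg", 0.5)

enriched = cotrain(KeywordView({"buono"}), KeywordView({":)"}),
                   labeled=[("che buono!", "pos")],
                   unlabeled=["buono davvero", "giornata :)"])
```

The risk this sketch makes visible is the one discussed later in the deck: if the views mislabel confidently, noise accumulates in the training set.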
System Setup
System Setup: overview
Learning algorithm
SVM with the RBF kernel
C=4 selected after a 10-fold cross-validation on training data
total number of features: 12,117
Completely developed in Java
Weka library is adopted for learning
Tweets are tokenized using Twitter NLP and Part-of-Speech Tagging
Word space: download 10 million tweets using the Twitter Streaming API and
build the space using word2vec
continuous Bag-of-Words Model (CBOW)
200 vector dimensions
remove the terms with fewer than ten occurrences (about 200,000 terms
overall)
Evaluation
Systems are evaluated on their ability to
Task A: decide whether a given tweet is subjective or objective
Task B: decide the tweet polarity with respect to four classes: positive, negative,
neutral and mixed sentiment (both positive and negative)
Dataset
Training: 4,513 manually annotated tweets (495 tweets were not available for
download and were removed)
Test: 1,935 manually annotated tweets (1,748 available at the time of the
evaluation)
Metrics: systems are compared against the gold standard in terms of F measure
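The F measure used for comparison can be sketched as below. The task's exact per-class averaging is not detailed on this slide, so the macro average here is an assumption for illustration.

```python
# Minimal sketch of the F measure: per-class F1 from true positives, false
# positives and false negatives, then a macro average over classes
# (the averaging scheme is an assumption).
def f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

def macro_f(per_class_counts):
    """per_class_counts: list of (tp, fp, fn) tuples, one per class."""
    scores = [f1(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)
```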
Evaluation: Results
Results
Setting        Task    F       Rank  Imp.
baseline       Task A  0.4005  -     -
baseline       Task B  0.3718  -     -
constrained    Task A  0.7140  1     78%
constrained    Task B  0.6771  1     82%
unconstrained  Task A  0.6892  2     72%
unconstrained  Task B  0.6638  1     79%
(Imp. = improvement over the baseline)
Evaluation: Discussion
Results Discussion
The system obtains the best performance in all settings except the unconstrained run for Task A
Co-training approach seems to introduce noise
Deep error analysis
the co-training system slightly improves the performance in classifying
positive tweets
bias in our classifier: a specific lexicon about political
topics (governo, Monti, crisi)
the classifier could be improved by enriching our lexicon with jargon and
idiomatic expressions
ablation test: investigate the predictive power of the features in our
model
Evaluation: Ablation Test
Ablation Test
Decrease of F by removing each feature group, compared to the complete feature setting
Conclusions and Future Work
Conclusions
The combination of keyword/micro-blogging, semantic, and lexicon features results
in the best performance
Semantic features (distributional approach) have more predictive power
Future Work
Validate and generalize our findings: further data and languages
Fix some issues in the co-training approach
For Further Reading I
Barbosa, L., Feng, J.: Robust sentiment detection on twitter from biased and noisy
data.
In: Proc. of the 23rd Intl Conf. on Computational Linguistics: Posters, COLING
'10, pp. 36–44. Association for Computational Linguistics, Stroudsburg, PA, USA
(2010)
Basile, V., Bolioli, A., Nissim, M., Patti, V., Rosso, P.: Overview of the Evalita
2014 SENTIment POLarity Classification Task.
In: Proc. of EVALITA 2014. Pisa, Italy (2014)
Basile, V., Nissim, M.: Sentiment analysis on Italian tweets.
In: Proc. of WASSA 2013, pp. 100–107 (2013)
Davidov, D., Tsur, O., Rappoport, A.: Enhanced sentiment learning using twitter
hashtags and smileys.
In: Proc. of the 23rd Intl Conf. on Computational Linguistics: Posters, COLING
'10, pp. 241–249. Association for Computational Linguistics, Stroudsburg, PA, USA
(2010)
For Further Reading II
Esuli, A., Sebastiani, F.: Sentiwordnet: A publicly available lexical resource for
opinion mining.
In: Proc. of LREC, pp. 417–422 (2006)
Go, A., Bhayani, R., Huang, L.: Twitter sentiment classification using distant
supervision.
Processing pp. 1–6 (2009)
Jansen, B.J., Zhang, M., Sobel, K., Chowdury, A.: Twitter power: Tweets as
electronic word of mouth.
J. Am. Soc. Inf. Sci. Technol. 60(11), 2169–2188 (2009)
Kouloumpis, E., Wilson, T., Moore, J.D.: Twitter sentiment analysis: The good
the bad and the omg!
In: Proc. of ICWSM 2011, pp. 538–541 (2011)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word
representations in vector space.
In: Proc. of ICLR Workshop (2013)
For Further Reading III
O'Connor, B., Balasubramanyan, R., Routledge, B., Smith, N.: From tweets to
polls: Linking text sentiment to public opinion time series.
In: Intl AAAI Conf. on Weblogs and Social Media (ICWSM), vol. 11, pp. 122–129
(2010)
Pak, A., Paroubek, P.: Twitter as a corpus for sentiment analysis and opinion
mining.
In: Proc. of the Seventh Intl Conf. on Language Resources and Evaluation
(LREC'10) (2010)
Pang, B., Lee, L.: Opinion mining and sentiment analysis.
Found. Trends Inf. Retr. 2(1-2), 1–135 (2008)
Pianta, E., Bentivogli, L., Girardi, C.: Multiwordnet: developing an aligned
multilingual database.
In: Proc. 1st Intl Conf. on Global WordNet, pp. 293–302 (2002)
For Further Reading IV
Smolensky, P.: Tensor product variable binding and the representation of symbolic
structures in connectionist systems.
Artificial Intelligence 46(1-2), 159–216 (1990)
Vanzo, A., Croce, D., Basili, R.: A context-based model for sentiment analysis in
twitter.
In: Proc. of COLING 2014 (2014)
Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity: An
exploration of features for phrase-level sentiment analysis.
Comput. Linguist. 35(3), 399–433 (2009)
Zanchetta, E., Baroni, M.: Morph-it!: a free corpus-based morphological resource
for the Italian language.
In: Proc. of the Corpus Linguistics Conf. 2005 (2005)