SlideShare une entreprise Scribd logo
1  sur  21
Télécharger pour lire hors ligne
A Replication Study of the Top
Performing Systems in SemEval
Twitter Sentiment Analysis
Efstratios Sygkounas, Giuseppe Rizzo,
Raphaël Troncy
<raphael.troncy@eurecom.fr>
@rtroncy
Replications Study1
 Replicability
Repeating a previous result under the original conditions
(e.g. same system configuration and datasets)
 Reproducibility
Reproducing a previous result under different, but
comparable conditions
 Generalizability
Applying an existing, empirically validated technique to a
different task/domain than the original one
21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 2
1 Hasibi, F., Balog, K., Bratsberg, S.E. On the Reproducibility of the TAGME Entity Linking System.
38th European Conference on Information Retrieval (ECIR), 2016
SemEval 2013-20152
Task: Sentiment analysis in Twitter
2013
Task 2
2014
Task 9
2015
Task 10
Subtask A
Contextual
Polarity
disambiguation
Subtask B
Message
Polarity
Classification
Subtask C
Topic-Based
Message
Polarity
Classification
Subtask D
Detecting
Trends
Towards a
Topic
Subtask E
Determining
strength of
association of
Twitter terms
with positive
sentiment
2 Rosenthal, S., Nakov, P., Kiritchenko, S., Mohammad, S., Ritter, A., Stoyanovm, V. SemEval-2015 Task 10:
Sentiment Analysis in Twitter. 9th International Workshop on Semantic Evaluation (SemEval), 2015
http://alt.qcri.org/semeval2015
21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 3
SemEval Subtask B (started in 2013)
 Annotations performed by Amazon Mechanical
Turkers
 Tweets are classified in 3 classes
Positive Neutral Negative
21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 4
SemEval Subtask B
Tweet ID
Gold
Standard ID
Gold
Standard
Tweet
5228553018
76580353
T15111159 positive I've been watching Gilmore Girls
for the past 3 hours. Oops,
happy Thursday!
5230874482
64671233
T15111142 neutral My Friday consists of Netflix and
hot tea allllllllll day long.
5229601206
83429889
T15111318 negative Kobe Bryant smiling as he re-
enters the game with the Lakers
losing 91-63 in the 4th quarter.
Probably insanity settling in.
- 521/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
SemEval Subtask B
Systems scored according to the F1 measure
2015: ~40 systems competing
Webis
Team
SemEval
2015
Hagen, M., Potthast, M., Buchner, M., Stein, B.: Webis: An Ensemble for Twitter Sentiment
Detection. International Workshop on Semantic Evaluation (SemEval), 2015
- 621/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
- 721/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Ensemble Learning that combines different
classifiers with different settings
Webis System
Webis’s system is an ensemble of 4 classifiers
NRC-
CANADA
GU-MLT-LT
KLUE
TeamX
SemEval
2013
SemEval
2014
- 821/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Webis System
System Classifier Features
NRC-
Canada
Support Vector
Machine(SVM)
n-grams, alcaps, POS, polarity
dictionaries, punctuation marks,
emoticons, word lengthening, clusters
and negation
GU-MLT-LT
Linear regression normalized uni-grams, stems,
clustering and negation
KLUE
Maximum
Entropy
unigrams, bigrams, and an extended
unigram model that includes a simple
treatment of negation
TeamX
Logistic
Regression
(LIBLINEAR)
word n-grams, character n-grams,
clusters and word senses
- 921/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Webis System
System Language resources
NRC-
Canada
NRC Emotion, MPQA, Bing Liu’s Opinion Lexicon, NRC
Hashtag Sentiment and the Sentiment140
GU-MLT-LT Polarity Dictionary and SentiWordNet
KLUE
SentiStrength, extended version of AFINN-111, large-
vocabulary distributional semantic models (DSM) from
English Wikipedia and Google Web 1T 5-Grams
databases
TeamX
Formal: MPQA Subjectivity Lexicon, General Inquirer
and SentiWordNet
Informal: AFINN-111, Bing Liu’s Opinion Lexicon, NRC
Hashtag, Sentiment Lexicon and Sentiment140 Lexicon
- 1021/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Replicability
Download Webis’s already trained models and code
https://github.com/webis-de/ECIR-2015-and-SEMEVAL-
2015
Download SemEval’s datasets via the Twitter API
(some tweets not available anymore)
- 1121/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Replicability
 Versioning is an important aspect to be considered in
any replication study
 We replaced the Stanford NLP Core old libraries with
the newest ones
21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 12
Dataset
Claimed in
paper
Webis’s
models
Replicate Webis system on test 2013 68.49 69.62
Replicate Webis system on test 2014 70.86 66.65
Replicate Webis system on test 2015 64.84 66.17
Reproducibility
Dataset Claimed in
paper
Webis’s
models
Our
models
Replicate Webis system on test 2013 68.49 69.62 70.06
Replicate Webis system on test 2014 70.86 66.65 69.31
Replicate Webis system on test 2015 64.84 66.17 66.57
Replicate Webis system - TeamX on test 2013 N/A 69.04 70.34
Replicate Webis system - TeamX on test 2014 N/A 66.51 68.56
Replicate Webis system - TeamX on test 2015 N/A 65.58 66.19
 Our models have better performance in general
 SentiME without TeamX performs worst for 2014’s and
2015’s but not for 2013’s dataset
21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 13
Generalization
21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 14
 SentiME
Consisted by 4 classifiers
+ Stanford Sentiment System
We train our models using bagging in order to boost
the training of the ensemble
 We noticed a lot of commonalities in TeamX’s
and Stanford’s Sentiment System features, so
we decided to perform test with/without TeamX
in order to assess the classifier's contribution
Stanford Sentiment System
 Stanford Sentiment System is a recursive neural
tensor network parsed by the Stanford Tree Bank
 Stanford Sentiment System can capture the meaning
of compositional phrases which is hard to be achieved
by the normal bag of words approaches
● Classifies a sentence in 5
classes (very positive, positive,
neutral, negative and very
negative)
● We use the pre-trained models
Stanford team provides
- 1521/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Bagging
Due to the fact that bagging introduces some
randomness into the training process, and that
the size of the bootstrap samples are not fixed,
we decide to perform multiple experiments with
different sizes ranging from 33% to 175%
We observed that doing bagging with 150% of the
initial dataset size leads to the best performance
in terms of F1 score
- 1621/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
SentiME System
- 1721/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Generalization
1. Webis replicate system: this is the replicate of the
Webis system using re-trained models
2. SentiME system: the system we propose
3. Webis replicate system without TeamX
4. SentiME system without TeamX
We performed four different experiments to
evaluate the performance of SentiME compare
to our previous replicate of the Webis system
- 1821/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Generalization
System SemEval2014-
test
SemEval2014-
sarcasm
SemEval2015-
test
SemEval2015-
sarcasm
Webis Replicate system 69.31 60.00 66.57 54.19
SentiME system 68.27 62.57 67.39 60.92
Webis replicate system
without TeamX
68.56 62.04 66.19 56.86
SentiME system without
TeamX
69.27 62.04 66.38 58.92
Webis 70.86 49.33 64.84 53.59
• SentiME outperforms Webis Replicate system on all datasets except SemEval2014-test
• SentiME improves the F score by respectively 2,5% and 6,5% on SemEval2014-sarcasm
and SemEval2015-sarcasm datasets
• On the SemEval2014-sarcasm dataset there is a significant difference of performance
between the original Webis system (49.33%) and our replicate (60%).
- 1921/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Summary
 Stanford Sentiment System is heavily skew towards
negative classification
 We manage to improve the Webis system by 1% in the
general case by introducing a fifth sub-classifier (the
Stanford Sentiment System) and by boosting the
training with bagging 150%
 The SentiME system also outperforms the Webis
system by 6,5% on the particular and more difficult
sarcasm dataset (thanks to Stanford classifier)
- 2021/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
Some Lessons Learned
 Availability of source code AND models significantly
helps to perform reproducibility study
 Pre-trained models provided by Webis are not exactly
the same than the re-trained models we have created
from the data at disposal
You have to archive data … and software libraries !
 It is possible that Webis’s authors did not detail the full
set of features they have used
- 2121/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
https://github.com/MultimediaSemantics/sentime

Contenu connexe

En vedette

Brighton & Hove budget cuts 2015-16
Brighton & Hove budget cuts 2015-16Brighton & Hove budget cuts 2015-16
Brighton & Hove budget cuts 2015-16brightonpa
 
Coca Cola Consoldiated incidence pricing agreement with Coca Cola
Coca Cola Consoldiated incidence pricing agreement with Coca ColaCoca Cola Consoldiated incidence pricing agreement with Coca Cola
Coca Cola Consoldiated incidence pricing agreement with Coca ColaNeil Kimberley
 
Atención Primaria y la atención a las personas con enfermedad crónica
Atención Primaria y la atención a las personas con enfermedad crónicaAtención Primaria y la atención a las personas con enfermedad crónica
Atención Primaria y la atención a las personas con enfermedad crónicaRafa Cofiño
 
Employment support for long term incapacity benefit claimants
Employment support for long term incapacity benefit claimantsEmployment support for long term incapacity benefit claimants
Employment support for long term incapacity benefit claimantslocalinsight
 
15_11-18_PR_Traverse Announces Patent Issuance
15_11-18_PR_Traverse Announces Patent Issuance15_11-18_PR_Traverse Announces Patent Issuance
15_11-18_PR_Traverse Announces Patent IssuanceJoseph Scaduto
 
Sex cake and your business
Sex cake and your businessSex cake and your business
Sex cake and your businessGraham Brooks
 

En vedette (8)

Brighton & Hove budget cuts 2015-16
Brighton & Hove budget cuts 2015-16Brighton & Hove budget cuts 2015-16
Brighton & Hove budget cuts 2015-16
 
Coca Cola Consoldiated incidence pricing agreement with Coca Cola
Coca Cola Consoldiated incidence pricing agreement with Coca ColaCoca Cola Consoldiated incidence pricing agreement with Coca Cola
Coca Cola Consoldiated incidence pricing agreement with Coca Cola
 
Atención Primaria y la atención a las personas con enfermedad crónica
Atención Primaria y la atención a las personas con enfermedad crónicaAtención Primaria y la atención a las personas con enfermedad crónica
Atención Primaria y la atención a las personas con enfermedad crónica
 
Employment support for long term incapacity benefit claimants
Employment support for long term incapacity benefit claimantsEmployment support for long term incapacity benefit claimants
Employment support for long term incapacity benefit claimants
 
hidayath cv 2016
hidayath cv 2016hidayath cv 2016
hidayath cv 2016
 
15_11-18_PR_Traverse Announces Patent Issuance
15_11-18_PR_Traverse Announces Patent Issuance15_11-18_PR_Traverse Announces Patent Issuance
15_11-18_PR_Traverse Announces Patent Issuance
 
Missao Piaui Diario da Serra 2016
Missao Piaui Diario da Serra 2016Missao Piaui Diario da Serra 2016
Missao Piaui Diario da Serra 2016
 
Sex cake and your business
Sex cake and your businessSex cake and your business
Sex cake and your business
 

Similaire à Replication of Top Twitter Sentiment Systems

Modeling Software Systems in Experimental Robotics for Improved Reproducibility
Modeling Software Systems in Experimental Robotics for Improved ReproducibilityModeling Software Systems in Experimental Robotics for Improved Reproducibility
Modeling Software Systems in Experimental Robotics for Improved ReproducibilityFlorian Lier
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Carole Goble
 
Issues in AI product development and practices in audio applications
Issues in AI product development and practices in audio applicationsIssues in AI product development and practices in audio applications
Issues in AI product development and practices in audio applicationsTaesu Kim
 
Replication and Benchmarking in Software Analytics
Replication and Benchmarking in Software AnalyticsReplication and Benchmarking in Software Analytics
Replication and Benchmarking in Software AnalyticsUniversity of Zurich
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 
RiVal - A toolkit to foster reproducibility in Recommender System evaluation
RiVal - A toolkit to foster reproducibility in Recommender System evaluationRiVal - A toolkit to foster reproducibility in Recommender System evaluation
RiVal - A toolkit to foster reproducibility in Recommender System evaluationAlejandro Bellogin
 
From One Test To Test Framework With Rapise
From One Test To Test Framework With Rapise From One Test To Test Framework With Rapise
From One Test To Test Framework With Rapise Inflectra
 
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...IRJET Journal
 
Eclipse Meets Systems Biology
Eclipse Meets Systems BiologyEclipse Meets Systems Biology
Eclipse Meets Systems BiologyRichard Adams
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringTao Xie
 
The Nuxeo Way: leveraging open source to build a world-class ECM platform
The Nuxeo Way: leveraging open source to build a world-class ECM platformThe Nuxeo Way: leveraging open source to build a world-class ECM platform
The Nuxeo Way: leveraging open source to build a world-class ECM platformNuxeo
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 tsysglobalsolutions
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...Robert Grossman
 

Similaire à Replication of Top Twitter Sentiment Systems (20)

Modeling Software Systems in Experimental Robotics for Improved Reproducibility
Modeling Software Systems in Experimental Robotics for Improved ReproducibilityModeling Software Systems in Experimental Robotics for Improved Reproducibility
Modeling Software Systems in Experimental Robotics for Improved Reproducibility
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014
 
MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)
MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)
MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)
 
Issues in AI product development and practices in audio applications
Issues in AI product development and practices in audio applicationsIssues in AI product development and practices in audio applications
Issues in AI product development and practices in audio applications
 
Replication and Benchmarking in Software Analytics
Replication and Benchmarking in Software AnalyticsReplication and Benchmarking in Software Analytics
Replication and Benchmarking in Software Analytics
 
team10.ppt.pptx
team10.ppt.pptxteam10.ppt.pptx
team10.ppt.pptx
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
2014 toronto-torbug
2014 toronto-torbug2014 toronto-torbug
2014 toronto-torbug
 
RiVal - A toolkit to foster reproducibility in Recommender System evaluation
RiVal - A toolkit to foster reproducibility in Recommender System evaluationRiVal - A toolkit to foster reproducibility in Recommender System evaluation
RiVal - A toolkit to foster reproducibility in Recommender System evaluation
 
Meenakshi Pal_16
Meenakshi Pal_16Meenakshi Pal_16
Meenakshi Pal_16
 
PPT
PPTPPT
PPT
 
Manoj_Rajandrakumar_Resume
Manoj_Rajandrakumar_ResumeManoj_Rajandrakumar_Resume
Manoj_Rajandrakumar_Resume
 
From One Test To Test Framework With Rapise
From One Test To Test Framework With Rapise From One Test To Test Framework With Rapise
From One Test To Test Framework With Rapise
 
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...
 
Eclipse Meets Systems Biology
Eclipse Meets Systems BiologyEclipse Meets Systems Biology
Eclipse Meets Systems Biology
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software Engineering
 
The Nuxeo Way: leveraging open source to build a world-class ECM platform
The Nuxeo Way: leveraging open source to build a world-class ECM platformThe Nuxeo Way: leveraging open source to build a world-class ECM platform
The Nuxeo Way: leveraging open source to build a world-class ECM platform
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
 
TiMetmay10
TiMetmay10TiMetmay10
TiMetmay10
 

Plus de Raphael Troncy

K CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyK CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyRaphael Troncy
 
Semantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentSemantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentRaphael Troncy
 
HyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningHyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningRaphael Troncy
 
Location Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationLocation Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationRaphael Troncy
 
Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014Raphael Troncy
 
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Raphael Troncy
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...Raphael Troncy
 
Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Raphael Troncy
 
Describing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionDescribing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionRaphael Troncy
 
Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Raphael Troncy
 
Semantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webSemantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webRaphael Troncy
 
Live topic generation from event streams
Live topic generation from event streamsLive topic generation from event streams
Live topic generation from event streamsRaphael Troncy
 
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the CrowdMediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the CrowdRaphael Troncy
 
EventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance ContentEventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance ContentRaphael Troncy
 
Extracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksExtracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksRaphael Troncy
 
Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Raphael Troncy
 
MediaEval 2012 SED Opening
MediaEval 2012 SED OpeningMediaEval 2012 SED Opening
MediaEval 2012 SED OpeningRaphael Troncy
 
DeRiVE 2011 workshop opening
DeRiVE 2011 workshop openingDeRiVE 2011 workshop opening
DeRiVE 2011 workshop openingRaphael Troncy
 
MediaEval 2011 SED Opening
MediaEval 2011 SED OpeningMediaEval 2011 SED Opening
MediaEval 2011 SED OpeningRaphael Troncy
 
ShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting EventsShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting EventsRaphael Troncy
 

Plus de Raphael Troncy (20)

K CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyK CAP 2019 Opening Ceremony
K CAP 2019 Opening Ceremony
 
Semantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentSemantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things Environment
 
HyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningHyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learning
 
Location Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationLocation Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip Recommendation
 
Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014
 
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...
 
Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013
 
Describing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionDescribing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and Description
 
Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013
 
Semantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webSemantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social web
 
Live topic generation from event streams
Live topic generation from event streamsLive topic generation from event streams
Live topic generation from event streams
 
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the CrowdMediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
 
EventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance ContentEventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
 
Extracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksExtracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social Networks
 
Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...
 
MediaEval 2012 SED Opening
MediaEval 2012 SED OpeningMediaEval 2012 SED Opening
MediaEval 2012 SED Opening
 
DeRiVE 2011 workshop opening
DeRiVE 2011 workshop openingDeRiVE 2011 workshop opening
DeRiVE 2011 workshop opening
 
MediaEval 2011 SED Opening
MediaEval 2011 SED OpeningMediaEval 2011 SED Opening
MediaEval 2011 SED Opening
 
ShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting EventsShareIt: Mining SocialMedia Activities for Detecting Events
ShareIt: Mining SocialMedia Activities for Detecting Events
 

Dernier

定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Sonam Pathan
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxeditsforyah
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationMarko4394
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMartaLoveguard
 
Elevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New OrleansElevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New Orleanscorenetworkseo
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 

Dernier (20)

定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptx
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
 
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
 
NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentation
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptx
 
Elevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New OrleansElevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New Orleans
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
 

Replication of Top Twitter Sentiment Systems

  • 1. A Replication Study of the Top Performing Systems in SemEval Twitter Sentiment Analysis Efstratios Sygkounas, Giuseppe Rizzo, Raphaël Troncy <raphael.troncy@eurecom.fr> @rtroncy
  • 2. Replications Study1  Replicability Repeating a previous result under the original conditions (e.g. same system configuration and datasets)  Reproducibility Reproducing a previous result under different, but comparable conditions  Generalizability Applying an existing, empirically validated technique to a different task/domain than the original one 21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 2 1 Hasibi, F., Balog, K., Bratsberg, S.E. On the Reproducibility of the TAGME Entity Linking System. 38th European Conference on Information Retrieval (ECIR), 2016
  • 3. SemEval 2013-20152 Task: Sentiment analysis in Twitter 2013 Task 2 2014 Task 9 2015 Task 10 Subtask A Contextual Polarity disambiguation Subtask B Message Polarity Classification Subtask C Topic-Based Message Polarity Classification Subtask D Detecting Trends Towards a Topic Subtask E Determining strength of association of Twitter terms with positive sentiment 2 Rosenthal, S., Nakov, P., Kiritchenko, S., Mohammad, S., Ritter, A., Stoyanovm, V. SemEval-2015 Task 10: Sentiment Analysis in Twitter. 9th International Workshop on Semantic Evaluation (SemEval), 2015 http://alt.qcri.org/semeval2015 21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 3
  • 4. SemEval Subtask B (started in 2013)  Annotations performed by Amazon Mechanical Turkers  Tweets are classified in 3 classes Positive Neutral Negative 21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 4
  • 5. SemEval Subtask B Tweet ID Gold Standard ID Gold Standard Tweet 5228553018 76580353 T15111159 positive I've been watching Gilmore Girls for the past 3 hours. Oops, happy Thursday! 5230874482 64671233 T15111142 neutral My Friday consists of Netflix and hot tea allllllllll day long. 5229601206 83429889 T15111318 negative Kobe Bryant smiling as he re- enters the game with the Lakers losing 91-63 in the 4th quarter. Probably insanity settling in. - 521/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 6. SemEval Subtask B Systems scored according to the F1 measure 2015: ~40 systems competing Webis Team SemEval 2015 Hagen, M., Potthast, M., Buchner, M., Stein, B.: Webis: An Ensemble for Twitter Sentiment Detection. International Workshop on Semantic Evaluation (SemEval), 2015 - 621/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 7. - 721/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan Ensemble Learning that combines different classifiers with different settings
  • 8. Webis System Webis’s system is an ensemble of 4 classifiers NRC- CANADA GU-MLT-LT KLUE TeamX SemEval 2013 SemEval 2014 - 821/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 9. Webis System System Classifier Features NRC- Canada Support Vector Machine(SVM) n-grams, alcaps, POS, polarity dictionaries, punctuation marks, emoticons, word lengthening, clusters and negation GU-MLT-LT Linear regression normalized uni-grams, stems, clustering and negation KLUE Maximum Entropy unigrams, bigrams, and an extended unigram model that includes a simple treatment of negation TeamX Logistic Regression (LIBLINEAR) word n-grams, character n-grams, clusters and word senses - 921/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 10. Webis System System Language resources NRC- Canada NRC Emotion, MPQA, Bing Liu’s Opinion Lexicon, NRC Hashtag Sentiment and the Sentiment140 GU-MLT-LT Polarity Dictionary and SentiWordNet KLUE SentiStrength, extended version of AFINN-111, large- vocabulary distributional semantic models (DSM) from English Wikipedia and Google Web 1T 5-Grams databases TeamX Formal: MPQA Subjectivity Lexicon, General Inquirer and SentiWordNet Informal: AFINN-111, Bing Liu’s Opinion Lexicon, NRC Hashtag, Sentiment Lexicon and Sentiment140 Lexicon - 1021/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 11. Replicability Download Webis’s already trained models and code https://github.com/webis-de/ECIR-2015-and-SEMEVAL- 2015 Download SemEval’s datasets via the Twitter API (some tweets not available anymore) - 1121/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 12. Replicability  Versioning is an important aspect to be considered in any replication study  We replaced the Stanford NLP Core old libraries with the newest ones 21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 12 Dataset Claimed in paper Webis’s models Replicate Webis system on test 2013 68.49 69.62 Replicate Webis system on test 2014 70.86 66.65 Replicate Webis system on test 2015 64.84 66.17
  • 13. Reproducibility Dataset Claimed in paper Webis’s models Our models Replicate Webis system on test 2013 68.49 69.62 70.06 Replicate Webis system on test 2014 70.86 66.65 69.31 Replicate Webis system on test 2015 64.84 66.17 66.57 Replicate Webis system - TeamX on test 2013 N/A 69.04 70.34 Replicate Webis system - TeamX on test 2014 N/A 66.51 68.56 Replicate Webis system - TeamX on test 2015 N/A 65.58 66.19  Our models have better performance in general  SentiME without TeamX performs worst for 2014’s and 2015’s but not for 2013’s dataset 21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 13
  • 14. Generalization 21/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan - 14  SentiME Consisted by 4 classifiers + Stanford Sentiment System We train our models using bagging in order to boost the training of the ensemble  We noticed a lot of commonalities in TeamX’s and Stanford’s Sentiment System features, so we decided to perform test with/without TeamX in order to assess the classifier's contribution
  • 15. Stanford Sentiment System  Stanford Sentiment System is a recursive neural tensor network parsed by the Stanford Tree Bank  Stanford Sentiment System can capture the meaning of compositional phrases which is hard to be achieved by the normal bag of words approaches ● Classifies a sentence in 5 classes (very positive, positive, neutral, negative and very negative) ● We use the pre-trained models Stanford team provides - 1521/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 16. Bagging Due to the fact that bagging introduces some randomness into the training process, and that the size of the bootstrap samples are not fixed, we decide to perform multiple experiments with different sizes ranging from 33% to 175% We observed that doing bagging with 150% of the initial dataset size leads to the best performance in terms of F1 score - 1621/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 17. SentiME System - 1721/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 18. Generalization 1. Webis replicate system: this is the replicate of the Webis system using re-trained models 2. SentiME system: the system we propose 3. Webis replicate system without TeamX 4. SentiME system without TeamX We performed four different experiments to evaluate the performance of SentiME compare to our previous replicate of the Webis system - 1821/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 19. Generalization System SemEval2014- test SemEval2014- sarcasm SemEval2015- test SemEval2015- sarcasm Webis Replicate system 69.31 60.00 66.57 54.19 SentiME system 68.27 62.57 67.39 60.92 Webis replicate system without TeamX 68.56 62.04 66.19 56.86 SentiME system without TeamX 69.27 62.04 66.38 58.92 Webis 70.86 49.33 64.84 53.59 • SentiME outperforms Webis Replicate system on all datasets except SemEval2014-test • SentiME improves the F score by respectively 2,5% and 6,5% on SemEval2014-sarcasm and SemEval2015-sarcasm datasets • On the SemEval2014-sarcasm dataset there is a significant difference of performance between the original Webis system (49.33%) and our replicate (60%). - 1921/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 20. Summary  Stanford Sentiment System is heavily skew towards negative classification  We manage to improve the Webis system by 1% in the general case by introducing a fifth sub-classifier (the Stanford Sentiment System) and by boosting the training with bagging 150%  The SentiME system also outperforms the Webis system by 6,5% on the particular and more difficult sarcasm dataset (thanks to Stanford classifier) - 2021/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan
  • 21. Some Lessons Learned  Availability of source code AND models significantly helps to perform reproducibility study  Pre-trained models provided by Webis are not exactly the same than the re-trained models we have created from the data at disposal You have to archive data … and software libraries !  It is possible that Webis’s authors did not detail the full set of features they have used - 2121/10/2016 - 15th International Semantic Web Conference (ISWC), Kobe, Japan https://github.com/MultimediaSemantics/sentime