OPENEDITION LAB
TEXT MINING
PROJECTS
Patrice Bellot
Aix-Marseille Université - CNRS (LSIS UMR 7296 ; OpenEdition)
patrice.bellot@univ-amu.fr
LSIS - DIMAG team: http://www.lsis.org/spip.php?id_rubrique=291
OpenEdition Lab: http://lab.hypotheses.org
Hypotheses: 600+ blogs
Revues.org: 300+ journals
Calenda: 20 000+ events
OpenEdition Books: 1000+ books
A European Web platform for the Humanities and Social Sciences
A digital infrastructure for open access
A lab for experimenting with new Text Mining and new IR systems
OpenEdition - a Facility of Excellence
2012-2020: €7 million
Objectives:
15 000+ books
2000+ blogs
Freemium
Multilingual
OpenEdition Lab — Our Team
Directors : Patrice Bellot (Professor in Comp. Sc. / NLP / IR) - Marin Dacos (Head of OpenEdition)
Engineers : Elodie Faath - Arnaud Cordier
PhD Students : Hussam Hamdan, Chahinez Benkoussas, Anaïs Ollagnier
Post-docs : Young-Min Kim (2012-13), Shereen Albitar (2014)
http://lab.hypotheses.org
Our partners
• 220 learned societies and centers (France)
• 30 university presses (France, UK, Belgium, Switzerland, Canada, Mexico, Hungary/USA)
• CCSD - France - Lyon (HAL / DataCenter)
• CHNM - USA - Washington
• OAPEN - NL - The Hague
• UNED - Spain - Universidad Nacional de Educación a Distancia
• Fundação Calouste Gulbenkian - Portugal
• Max Weber Stiftung - Germany
• Google - USA (Google Grants for DH)
• DARIAH - Europe
And you?
OpenEdition Lab (Text Mining Projects for DL)
Aims to:
— Link papers / books / blogs automatically (reference analysis, named entities…)
— Detect hot topics, hot books, hot papers: content-oriented analysis (not only by using logs)
   - sentiment analysis
   - finding and analyzing book reviews
— Book searching with complex and long queries
— Reading recommendation
Project 1: BILBO
EN SVM CRF
Natural Language Processing / Text Mining / Information Retrieval / Machine Learning
Project 2: ECHO
Automatic detection of book reviews (LREC 2014)
[Diagram: linking (BILBO), Web search, sentiment analysis (NAACL-SemEval 2013), measuring the echo (logs, metrics…)]
Project 3: COOKER
[Diagram: BILBO and ECHO feed a content graph (then a hypergraph); automatic classification and metadata (topics, languages, authors…); recommendation (COOKER)]
Semantic Annotation of Bib. References
A: references in a specific section
B: references in notes
C: references in the body
BILBO: A software for annotating bibliographical references in Digital Humanities
Google Digital Humanities Research Awards (2011, 2012)
State of the art:
— CiteSeer system (Giles et al., 1998) in computer science: 80% precision for authors and 40% for pages. Conditional Random Fields (CRFs) (Peng et al., 2006; Lafferty et al., 2001) for scientific articles: 95% average precision (99% for authors, 95% for titles, 85% for editors).
— These systems run on the cover page (title) and/or on the reference section at the end of papers: not on the footnotes, not on the text body
— Not very robust in the real world (no stylesheets, or poorly respected ones)
References - a three-level architecture

[Architecture diagram: XHTML sources following the TEI guidelines (learning data) and plain text (new data) are tokenized and features are extracted; manually annotated learning data are passed to external machine learning modules (Conditional Random Fields with Mallet, SVMlight) to train three models - Level 1 (references in the bibliography), Level 2 (references in notes), Level 3 (implicit references); the automatic annotator calls a model on new data and produces estimated XML files; a Web service accepts plain-text input; comparison with other online tools on new reference data.]

• Revues.org online journals
  - 340 journals
  - various reference formats
  - 20 different languages (90% in French)
• Unstructured and scattered reference data
• Prototype development, Web service
  → source code will be distributed (GPL)
• Google Digital Humanities Research Awards ('10, '11)
• Part of the Equipex future investment award: DILOH ('12)
Kim & Bellot, 2012
Conditional Random Fields for IE
— A discriminative model that is specified over a graph that encodes the conditional dependencies
(relationships between observations)
— Can be employed for sequential labeling (linear chain CRF)
— Takes context into account
— The probability of a label sequence y given an observation sequence x is:

  p(y | x, λ) = (1 / Z(x)) · exp( Σ_j λ_j F_j(y, x) )

  with F_j the (rich) feature functions (transition and state functions) and Z(x) a normalization factor
— Parameters must be estimated using an iterative technique such as iterative scaling or gradient-based methods
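To make the linear-chain CRF concrete, here is a minimal, hedged sketch of reference-field labeling with the sklearn-crfsuite package; the slides mention Mallet and SVMlight as the actual machine learning modules, so the library choice, the toy features and the toy reference below are illustrative assumptions only.

# Minimal linear-chain CRF sketch for labeling reference tokens (surname, forename, date, ...).
# Illustration with sklearn-crfsuite, not the BILBO implementation.
import sklearn_crfsuite

def token_features(tokens, i):
    """Feature dict for token i: the state/transition functions use these observations."""
    tok = tokens[i]
    feats = {
        "lower": tok.lower(),
        "is_all_caps": tok.isupper(),
        "first_cap": tok[:1].isupper(),
        "is_number": tok.isdigit(),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
    }
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    return feats

# Toy training data: one annotated reference (token sequence + label sequence).
train_tokens = [["Kim", ",", "Y.-M.", ",", "2013", "."]]
train_labels = [["surname", "punc", "forename", "punc", "date", "punc"]]

X_train = [[token_features(seq, i) for i in range(len(seq))] for seq in train_tokens]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X_train, train_labels)

print(crf.predict(X_train))  # predicted label sequence for each reference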
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. Proceedings of the 18th International Conference on Machine Learning (ICML 2001).

[Slide excerpts from Lafferty, McCallum & Pereira (ICML 2001) and from Sutton & McCallum's introduction to CRFs:
- Figure: graphical structures of simple HMMs (left), MEMMs (center), and the chain-structured case of CRFs (right) for sequences; an open circle indicates that the variable is not generated by the model.
- Feature functions f_j(y_{i-1}, y_i, x, i) are either state functions s(y_i, x, i) or transition functions t(y_{i-1}, y_i, x, i), built from real-valued observation features b(x, i) (e.g. b(x, i) = 1 if the observation at position i is the word "September", 0 otherwise). Writing F_j(y, x) = Σ_{i=1..n} f_j(y_{i-1}, y_i, x, i), the probability of a label sequence y given an observation sequence x is p(y | x, λ) = (1/Z(x)) exp(Σ_j λ_j F_j(y, x)), where Z(x) is a normalization factor and λ_j are parameters estimated from training data.
- Parameter estimation uses improved iterative scaling (IIS) updates for the edge and vertex feature weights; the loss function is convex, as for general maximum entropy models.
- Figure: relationships between naive Bayes, logistic regression, HMMs, linear-chain CRFs, generative directed models and general CRFs (generative vs. conditional; single class vs. sequence vs. general graphs); naive Bayes and logistic regression define the same family of distributions, the difference being generative vs. discriminative training.
- Figures: graphical model of an HMM-like linear-chain CRF, and of a linear-chain CRF in which the transition score depends on the current observation.]
Table V: Verified input, local and global features. The selected ones in BILBO are written in black, and the non-selected ones are in gray (in the original table).

Input features
  Raw input token (I1): the tokenized word itself in the input string and the lowercased word
  Preceding or following tokens (I2): three preceding and three following tokens of the current token
  N-gram (I3): attachment of preceding or following N-gram tokens
  Prefix/suffix at character level (I4): 8 different prefixes/suffixes as in [Councill et al. 2008]

Local features
  Number (F1): ALLNUMBERS (all characters are numbers, e.g. 1984), NUMBERS (one or more characters are numbers, e.g. in-4), DASH (one or more dashes included in numbers, e.g. 665-680); (F1digit) 1DIGIT, 2DIGIT, ... (if number, number of digits in it, e.g. 5, 78, ...)
  Capitalization (F2): ALLCAPS (all characters are capital letters, e.g. RAYMOND), FIRSTCAP (first character is a capital letter, e.g. Paris), ALLSMALL (all characters are lower-cased, e.g. pouvoirs), NONIMPCAP (capital letters are mixed, e.g. dell'Ateneo)
  Regular form (F3): INITIAL (initialized expression, e.g. Ch.-R.), WEBLINK (regular expression for web pages, e.g. apcss.org)
  Emphasis (F4): ITALIC (italic characters, e.g. Regional)
  Location (F5): BIBL_START (position is in the first one-third of the reference), BIBL_IN (between one-third and two-thirds), BIBL_END (between two-thirds and the end)
  Lexicon (F6): POSSEDITOR (possible abbreviation of editor, e.g. ed.), POSSPAGE (possible abbreviation of page, e.g. pp.), POSSMONTH (possible month, e.g. September), POSSVOLUME (possible abbreviation of volume, e.g. vol.)
  External list (F7): SURNAMELIST (found in an external surname list, e.g. RAYMOND), FORENAMELIST (external forename list, e.g. Simone), PLACELIST (external place list, e.g. New York), JOURNALLIST (external journal list, e.g. African Affaire)
  Punctuation (F8): COMMA, POINT, LINK, PUNC, LEADINGQUOTES, ENDINGQUOTES, PAIREDBRACES - the punctuation mark itself (comma, point) or the punctuation type; defined especially for the case of non-separated punctuation (e.g. 46-55, 1993. S.; [en "The design". (1))

Global features
  Local feature existence (G1): [local feature name] - the corresponding local feature is found in the input string (F3, F4 and F6 features are finally selected)
  Feature distribution (G2): NOPUNC, 1PUNC, 2PUNC, MOREPUNC (there are no, 1, 2, or more PUNC features in the input string), NONUMBER (there is no number in the input string), STARTINITIAL (the input string starts with an initial expression), ENDQUOTECOMMA (an ending quote is followed by a comma), FIRSTCAPCOMMA (a token having the FIRSTCAP feature is followed by a comma)
Kim & Bellot, 2013
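As an illustration of how a few of the local features of Table V could be computed for one token, here is a hypothetical helper (not BILBO's implementation); only some F1, F2, F3 and F5 features are sketched, and the lexicon (F6) and external-list (F7) lookups are omitted.

import re

# Hypothetical sketch of a few Table V local features (F1, F2, F3, F5) for one token.
def local_features(token, position, n_tokens):
    feats = []
    # F1: number features
    if token.isdigit():
        feats.append("ALLNUMBERS")
    elif any(c.isdigit() for c in token):
        feats.append("NUMBERS")
    if "-" in token and any(c.isdigit() for c in token):
        feats.append("DASH")
    # F2: capitalization features
    if token.isupper():
        feats.append("ALLCAPS")
    elif token[:1].isupper():
        feats.append("FIRSTCAP")
    elif token.islower():
        feats.append("ALLSMALL")
    # F3: regular forms (initialized expressions, web links)
    if re.fullmatch(r"([A-Z][a-z]?\.-?)+", token):
        feats.append("INITIAL")
    if re.search(r"\w+\.(org|com|fr|net)", token):
        feats.append("WEBLINK")
    # F5: position in the reference string (thirds)
    third = position / max(n_tokens, 1)
    feats.append("BIBL_START" if third < 1/3 else "BIBL_IN" if third < 2/3 else "BIBL_END")
    return feats

print(local_features("RAYMOND", 0, 9))   # ['ALLCAPS', 'BIBL_START']
print(local_features("665-680", 7, 9))   # ['NUMBERS', 'DASH', 'BIBL_END']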
[Fig. 3: Basic tokenization effect; each point is the averaged value of 10 different cross-validated experiments.]

[Fig. 4: Cumulative local feature effect from F1 to F8, on (a) corpus level 1 and (b) the Cora dataset: micro-averaged F-measure as a function of the training set size (50% to 90%), adding feature categories F0 (base) to F8 (punctuation) cumulatively.]

We repeat cross validations by cumulatively adding the features of each category from F1 to F8. Too detailed features, such as those of category F1-sub, are excluded here.
Kim & Bellot, 2013
[Fig. 5: Cumulative external list feature effect (F6, F7a, F7b, F7c, F7), on (a) corpus level 1 and (b) the Cora dataset: micro-averaged F-measure as a function of the training set size (50% to 90%).]

Detailed analysis of the effect of external lists and lexicon: one of the interesting discoveries from the above analysis is that lexical features are not always effective for reference parsing. Lexicon features defined with strict, non-overlapping rules actually have no significant impact, whereas external lists such as surname, forename, place and journal lists do.
Kim & Bellot, 2013
Table VII: Micro-averaged precision and recall per field for C1 and Cora with the finally chosen strategy

(a) C1 - detailed labels
  Fields          #true  #annot.  #exist.  prec.(%)  recall(%)
  surname          1080    1164     1203     92.78     89.78
  forename         1128    1220     1244     92.46     90.68
  title(m)         3277    4132     3690     79.31     88.81
  title(a)         2782    3253     3069     85.52     90.65
  title(j)          440     564      681     78.01     64.61
  title(u)          511     660      652     77.42     78.37
  title(s)           18      24      118     75.00     15.25
  publisher        1021    1367     1171     74.69     87.19
  date              793     838      855     94.63     92.75
  biblscope(pp)     210     223      219     94.17     95.89
  biblscope(i)      152     191      189     79.58     80.42
  biblscope(v)       75      87      102     86.21     73.53
  extent             66      69       70     95.65     94.29
  place             433     524      539     82.63     80.33
  abbr              417     468      502     89.10     83.07
  nolabel           231     306      488     75.49     47.34
  edition            46     178      211     25.84     21.80
  orgname            74      87      118     85.06     62.71
  bookindicator      47      49       65     95.92     72.31
  OTHERS             95     177      395     53.67     24.05
  Average         12896   15581    15581     82.77     82.77

(b) Cora dataset
  Fields        #true  #annot.  #exist.  prec.(%)  recall(%)
  author         2797    2855     2830     97.97     98.83
  title          3508    3613     3560     97.10     98.54
  booktitle      1750    1882     1865     92.99     93.83
  journal         546     615      617     88.78     88.49
  date            636     641      642     99.22     99.07
  institution     268     299      306     89.63     87.58
  publisher       165     188      203     87.77     81.28
  location        247     279      289     88.53     85.47
  editor          232     261      295     88.89     78.64
  pages           422     429      438     98.37     96.35
  volume          306     327      320     93.58     95.63
  tech            130     155      178     83.87     73.03
  note             75     122      123     61.48     60.98
  Average       11082   11666    11666     95.00     95.00

Hypothesis 2 is confirmed with these observations. For our system BILBO, the input and local features written in black in Table V are finally selected. BILBO provides two different labeling levels.

Learning on FR data - Testing on US data (715 references)

Kim & Bellot, 2013
Test: http://bilbo.openeditionlab.org
Sources: http://github.com/OpenEdition/bilbo
EQUIPEX OpenEdition: BILBO
IR and Digital Libraries
Sentiment Analysis
Searching for book reviews
• Applying and testing classical supervised approaches for filtering reviews = a new kind of genre
classification.
• Developing a corpus of reviews of books from the OpenEdition.org platforms and from the Web.
• Collecting two kinds of reviews:

— Long reviews of scientific books written by expert reviewers in scientific journals

— Short reviews such as reader comments on social web sites
• Linking reviews to their corresponding books using BILBO
Review ≠ Abstract
Searching for book reviews
• A supervised classification approach
• Feature selection: decision trees, Z-score
• Features: localization of named entities,
We can see that many of these features relate to the class where they predominate.

Table 3: Distribution of the 30 highest normalized Z scores across the corpus.
   1 abandonne      30.14     16 winter            9.23
   2 seront         30.00     17 cleo              8.88
   3 biographie     21.84     18 visible           8.75
   4 entranent      21.20     19 fondamentale      8.67
   5 prise          21.20     20 david             8.54
   6 sacre          21.20     21 pratiques         8.52
   7 toute          20.70     22 signification     8.47
   8 quitte         19.55     23 01                8.38
   9 dimension      15.65     24 institutionnels   8.38
  10 les            14.43     25 1930              8.16
  11 commandement   11.01     26 attaques          8.14
  12 lie            10.61     27 courrier          8.08
  13 construisent   10.16     28 moyennes          7.99
  14 lieux          10.14     29 petite            7.85
  15 garde           9.75     30 adapted           7.84
In our training corpus, we have 106,911 words obtained from the Bag-of-Words approach. We selected all tokens (features) that appear more than 5 times in each class. The goal is therefore to design a method capable of selecting terms that clearly belong to one genre of documents.

This section of a document contains authors' names, locations, dates, etc.; however, in the Review class this section is quite often absent. Based on this analysis, we tagged all documents of each class using the Named Entity Recognition tool TagEN (Poibeau, 2003). We aim to explore the distribution of 3 named entities ("authors' names", "locations" and "dates") in the text after removing all XML-HTML tags. After that, we divided texts into 10 parts (the size of each part = total number of words / 10). The distribution ratio of each named entity in each part is used as a feature to build the new document representation, and we obtained a set of 30 features.

[Figures 3-5: distribution of the "Person", "Location" and "Date" named entities across the document parts.]

6 Experiments
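A rough sketch of the positional named-entity features described above, under the assumption that tokens already carry entity tags; the helper name, the toy tags and the equal-parts split are illustrative (the actual tagging relies on TagEN).

# Sketch of the positional named-entity features: split the text into 10 equal parts and,
# for each part, compute the share of "person", "location" and "date" entity tokens (30 features).
def ne_distribution_features(tokens, entity_tags, n_parts=10):
    """tokens: list of words; entity_tags: parallel list with 'person', 'location', 'date' or None."""
    part_size = max(len(tokens) // n_parts, 1)
    features = []
    for ent in ("person", "location", "date"):
        total = sum(1 for tag in entity_tags if tag == ent)
        for p in range(n_parts):
            part = entity_tags[p * part_size:(p + 1) * part_size]
            count = sum(1 for tag in part if tag == ent)
            features.append(count / total if total else 0.0)
    return features  # 3 entity types x 10 parts = 30 ratios

toy_tokens = ["In", "1930", "David", "wrote", "in", "Paris", "about", "New", "York", "."]
toy_tags = [None, "date", "person", None, None, "location", None, "location", "location", None]
print(len(ne_distribution_features(toy_tokens, toy_tags)))  # 30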
6.2 Support Vector Machines (SVM)
SVM designates a learning approach introduced by Vapnik in 1995 for solving two-class pattern recognition problems (Vapnik, 1995). The SVM method is based on the Structural Risk Minimization principle (Vapnik, 1995) from computational learning theory. In their basic form, SVMs learn a linear threshold function. Nevertheless, by a simple plug-in of an appropriate kernel function, they can be used to learn linear classifiers, radial basis function (RBF) networks, and three-layer sigmoid neural nets (Joachims, 1998). The key in such classifiers is to determine the optimal boundaries between the different classes and use them for the purposes of classification (Aggarwal and Zhai, 2012). Having the vectors from the different representations presented below, we used the Weka toolkit to learn the models, with linear and RBF kernels.

For the Naive Bayes model, |w| indicates the number of words included in the current document and w_j denotes a word appearing in the document. The predicted class is

  argmax_{h_i}  P(h_i) · Π_{j=1..|w|} P(w_j | h_i)        (5)

where P(w_j | h_i) = tf_{j,h_i} / n_{h_i}. We estimate the probabilities with Equation (5), relating the lexical frequency of the word w_j over the whole collection T_{h_i} (denoted tf_{j,h_i}) to the size of the corresponding corpus.
Table 4: Performance of the classification models using different indexing schemes on the test set. The best values for the Review class are noted in bold and those for the complementary (non-Review) class are underlined in the original table.

  #  Model                   Review                  non-Review
                          R      P      F-M       R      P      F-M
  1  NB                 65.5%  81.5%  72.6%    81.6%  65.7%  72.8%
     SVM (Linear)       99.6%  98.3%  98.9%    97.9%  99.5%  98.7%
     SVM (RBF)          89.8%  97.2%  93.4%    96.8%  88.5%  92.5%
       (C = 5.0, gamma = 0.00185)
  2  NB                 90.6%  64.2%  75.1%    37.4%  76.3%  50.2%
     SVM (Linear)       87.2%  81.3%  84.2%    75.3%  82.7%  78.8%
     SVM (RBF)          87.2%  86.5%  86.8%    83.1%  84.0%  83.6%
       (C = 32.0, gamma = 0.00781)
  3  NB                 80.0%  68.4%  73.7%    54.2%  68.7%  60.6%
     SVM (Linear)       77.0%  81.9%  79.4%    78.9%  73.5%  76.1%
     SVM (RBF)          81.2%  48.6%  79.9%    72.6%  75.8%  74.1%
       (C = 8.0, gamma = 0.03125)
Benkoussas & Bellot, LREC 2014
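For illustration, a minimal sketch of the review vs. non-review genre classification with bag-of-words features, Multinomial Naive Bayes and a linear SVM in scikit-learn; the paper itself used the Weka toolkit, and the toy texts and the C value reused from Table 4 are only for demonstration.

# Toy sketch of the supervised review / non-review genre classification with bag-of-words
# features and two of the models compared in Table 4. Not the experimental code of the paper.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

train_texts = [
    "Cet ouvrage propose une biographie fondamentale de la dimension institutionnelle",  # review
    "Colloque international, appel a communications avant le mois de novembre",          # not a review
]
train_labels = ["review", "not_review"]

vectorizer = CountVectorizer(lowercase=True)
X_train = vectorizer.fit_transform(train_texts)

nb = MultinomialNB().fit(X_train, train_labels)
svm = LinearSVC(C=5.0).fit(X_train, train_labels)

X_test = vectorizer.transform(["Ce compte rendu de lecture discute la biographie proposee"])
print(nb.predict(X_test), svm.predict(X_test))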
Sentiment Analysis in Twitter
Example tweets and sentences from the SemEval-2013 dataset (subjective phrases were highlighted in the original):
- Authorities are only too aware that Kashgar is 4,000 kilometres (2,500 miles) from Beijing but only a tenth of the distance from the Pakistani border, and are desperate to ensure instability or militancy does not leak over the frontiers.
- Taiwan-made products stood a good chance of becoming even more competitive thanks to wider access to overseas markets and lower costs for material imports, he said.
- "March appears to be a more reasonable estimate while earlier admission cannot be entirely ruled out," according to Chen, also Taiwan's chief WTO negotiator.
- friday evening plans were great, but saturday's plans didnt go as expected - i went dancing & it was an ok club, but terribly crowded :-(
- WHY THE HELL DO YOU GUYS ALL HAVE MRS. KENNEDY! SHES A FUCKING DOUCHE
- AT&T was okay but whenever they do something nice in the name of customer service it seems like a favor, while T-Mobile makes that a normal everyday thing
- obama should be impeached on TREASON charges. Our Nuclear arsenal was TOP Secret. Till HE told our enemies what we had. #Coward #Traitor
- My graduation speech: "I'd like to thanks Google, Wikipedia and my computer! :D #iThingteens

Table 5: List of example sentences with annotations that were provided to the annotators. All subjective phrases are italicized; positive phrases are in green, negative phrases are in red, and neutral phrases are in blue (in the original).

Table 6: Example of a sentence ("I would love to watch Vampire Diaries :) and some Heroes! Great combination") annotated for subjectivity by five Mechanical Turk workers; the words and phrases marked as subjective are intersected across workers, and the accuracy of each annotation is measured against this intersection.
[Excerpt from the SemEval-2013 "Sentiment Analysis in Twitter" task description:]

2 Task Description
We had two subtasks: an expression-level subtask and a message-level subtask. Participants could choose to participate in either or both subtasks. Below we provide short descriptions of the objectives of these two subtasks.

Subtask A: Contextual Polarity Disambiguation
Given a message containing a marked instance of a word or a phrase, determine whether that instance is positive, negative or neutral in that context. The boundaries for the marked instance were provided: this was a classification task, not an entity recognition task.

3 Dataset Creation
In the following sections we describe the collection and annotation of the Twitter and SMS datasets.

3.1 Data Collection
Twitter is the most common micro-blogging site on the Web, and we used it to gather tweets that express sentiment about popular topics. We first extracted named entities using a Twitter-tuned NER system (Ritter et al., 2011) from millions of tweets, which we collected over a one-year period spanning from January 2012 to January 2013; we used the public streaming Twitter API to download tweets.
Subtask B: Message Polarity Classification
Given a message, decide whether it is of positive, negative, or neutral sentiment. For messages conveying both a positive and a negative sentiment, whichever is the stronger one was to be chosen.

Each participating team was allowed to submit results for two different systems per subtask: one constrained, and one unconstrained. A constrained system could only use the provided data for training, but it could also use other resources such as lexicons obtained elsewhere. An unconstrained system could use any additional data as part of the training process; this could be done in a supervised, semi-supervised, or unsupervised fashion. Note that constrained/unconstrained refers to the data used to train a classifier: for example, if other data (excluding the test data) was used to develop a sentiment lexicon, and the lexicon was used to generate features, the system would still be constrained.

[Table: statistics for Subtask A (total phrase counts - positive, negative, neutral - and vocabulary size per dataset).]

Table 3: Statistics for Subtask B.
  Corpus               Positive   Negative   Objective/Neutral
  Twitter - Training      3,662      1,466               4,600
  Twitter - Dev             575        340                 739
  Twitter - Test          1,573        601               1,640
  SMS - Test                492        394               1,208
Aspect Based Sentiment Analysis
— Subtask 1: Aspect term extraction
Given a set of sentences with pre-identified entities (e.g., restaurants), identify the aspect terms
present in the sentence and return a list containing all the distinct aspect terms. An aspect term
names a particular aspect of the target entity.
For example, "I liked the service and the staff, but not the food”, “The food was nothing much, but I
loved the staff”. Multi-word aspect terms (e.g., “hard disk”) should be treated as single terms (e.g.,
in “The hard disk is very noisy” the only aspect term is “hard disk”).



— Subtask 2: Aspect term polarity
For a given set of aspect terms within a sentence, determine whether the polarity of each aspect
term is positive, negative, neutral or conflict (i.e., both positive and negative).
For example:
“I loved their fajitas” → {fajitas: positive}

“I hated their fajitas, but their salads were great” → {fajitas: negative, salads: positive}

“The fajitas are their first plate” → {fajitas: neutral}

“The fajitas were great to taste, but not to see” → {fajitas: conflict}
http://alt.qcri.org/semeval2014/task4/
Aspect Based Sentiment Analysis
— Subtask 3: Aspect category detection
Given a predefined set of aspect categories (e.g., price, food), identify the aspect categories
discussed in a given sentence. Aspect categories are typically coarser than the aspect terms of
Subtask 1, and they do not necessarily occur as terms in the given sentence.
For example, given the set of aspect categories {food, service, price, ambience, anecdotes/
miscellaneous}:
“The restaurant was too expensive” → {price}

“The restaurant was expensive, but the menu was great” → {price, food}



— Subtask 4: Aspect category polarity
Given a set of pre-identified aspect categories (e.g., {food, price}), determine the polarity
(positive, negative, neutral or conflict) of each aspect category.
For example:
“The restaurant was too expensive” → {price: negative}

“The restaurant was expensive, but the menu was great” → {price: negative, food: positive}
http://alt.qcri.org/semeval2014/task4/
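A toy illustration of the Subtask 2 input/output format above, assigning a polarity to each marked aspect term from a tiny hand-made lexicon and a word window; this only shows the expected output shape, not any participant's system.

# Toy illustration of aspect term polarity (Subtask 2): lexicon lookup in a word window.
POSITIVE = {"loved", "great", "liked"}
NEGATIVE = {"hated", "bad", "terrible"}

def aspect_term_polarity(sentence, aspect_terms, window=4):
    tokens = sentence.lower().replace(",", " ").split()
    result = {}
    for term in aspect_terms:
        idx = tokens.index(term.lower())
        context = tokens[max(0, idx - window):idx + window + 1]
        pos = sum(w in POSITIVE for w in context)
        neg = sum(w in NEGATIVE for w in context)
        if pos and neg:
            result[term] = "conflict"
        elif pos:
            result[term] = "positive"
        elif neg:
            result[term] = "negative"
        else:
            result[term] = "neutral"
    return result

print(aspect_term_polarity("I hated their fajitas, but their salads were great",
                           ["fajitas", "salads"]))
# {'fajitas': 'negative', 'salads': 'positive'}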
Experiments with DBpedia, WordNet and SentiWordNet as resources for sentiment analysis in micro-blogging
Hussam Hamdan, Frédéric Béchet, Patrice Bellot (Aix-Marseille University, Marseille, France)
hussam.hamdan@lsis.org - frederic.bechet@lif.univ-mrs.fr - patrice.bellot@lsis.org

Highlights
Twitter is a real-time, highly social microblogging service that allows us to post short messages. Sentiment analysis of Twitter is useful for many domains (marketing, finance, social studies, etc.). Many approaches have been proposed for this task; we applied several machine learning approaches in order to classify tweets using the SemEval 2013 dataset. Several resources were used for feature extraction: WordNet (similar adjectives and verb groups), DBpedia (hidden concepts), SentiWordNet (polarity and subjectivity), and other Twitter-specific features.

System architecture
[Diagram: preprocessing (Twitter dictionary, e.g. expanding emoticons such as ":)" into "very happy") → feature extraction (DBpedia concepts such as "Settlement", WordNet similar adjectives and verb groups, SentiWordNet polarity/subjectivity senti-features, Twitter-specific features) → classification model trained on the SemEval training set and tuned on the development set → positive / negative / objective. Example tweet: "Gas by my house hit …, I'm going to Chapel Hill on Sat. :)"]

Results
- Naive Bayes model: the average F-measure of the negative and positive classes is improved with respect to the unigram model.
- SVM model (linear kernel): the average F-measure of the negative and positive classes is improved with respect to the unigram model.
(P: precision, R: recall, F: F-measure)

Conclusion
- Using the similar adjectives from WordNet has a significant effect with Naive Bayes but little effect with SVM.
- Using the hidden concepts (DBpedia) is not very significant on this dataset; it is more significant for the objective class with SVM.
- Senti-features, Twitter-specific features and verb groups were useful with SVM.

SemEval 2013
Sentiment Analysis on Twitter : Using Z-Score
• Z-Score helps to discriminate words for Document Classification, Authorship Attribution (J. Savoy,
ACM TOIS 2013)
The Z_score of a term t_i in class C_j, noted Z_score(t_ij), is computed from its term relative frequency tfr_ij in class C_j, the mean mean_i (the term probability over the whole corpus multiplied by n_j, the number of terms in class C_j), and the standard deviation sd_i of term t_i over the underlying corpus:

  Z_score(t_ij) = (tfr_ij - mean_i) / sd_i                                        Eq. (1)

  Z_score(t_ij) = (tfr_ij - n_j · P(t_i)) / sqrt(n_j · P(t_i) · (1 - P(t_i)))     Eq. (2)

A term whose frequency in one class is salient in comparison to the other classes will have a salient Z_score. Z_score was exploited for sentiment analysis by (Zubaryeva and Savoy 2010): they chose a threshold (Z > 2) to select the terms whose Z_score exceeds it, then used a logistic regression to combine these scores. We use Z_scores as added features for classification because tweets are short: many tweets do not contain any word with a salient Z_score. Figures 1, 2 and 3 show the distribution of Z_score over each class: the majority of terms have a Z_score between -1.5 and 2.5 in each class, and the rest are either very frequent (>2.5) or very rare (<-1.5). A negative value indicates that the term is not frequent in this class in comparison with its frequencies in the other classes. Table 1 shows the first ten terms with the highest Z_score in each class. We tested different values for the threshold; the best results were obtained with a threshold of 3.

Table 1. The first ten terms having the highest Z_score in each class
  positive: Love 14.31, Good 14.01, Happy 12.30, Great 11.10, Excite 10.35, Best 9.24, Thank 9.21, Hope 8.24, Cant 8.10, Wait 8.05
  negative: Not 13.99, Fuck 12.97, Don't 10.97, Shit 8.99, Bad 8.40, Hate 8.29, Sad 8.28, Sorry 8.11, Cancel 7.53, stupid 6.83
  neutral: Httpbit 6.44, Httpfb 4.56, Httpbnd 3.78, Intern 3.58, Nov 3.45, Httpdlvr 3.40, Open 3.30, Live 3.28, Cloud 3.28, begin 3.17

- Sentiment lexicons: Bing Liu's Opinion Lexicon, created by (Hu and Liu 2004) and augmented in many later works. We extract the number of positive, negative and neutral words in tweets according to these lexicons. Bing Liu's lexicon only contains negative and positive annotations, whereas the Subjectivity lexicon contains negative, positive and neutral ones.

- Part Of Speech (POS): we annotate each word in the tweet with its POS tag, and then compute the number of adjectives, verbs, nouns, adverbs and connectors in each tweet.

4 Evaluation

4.1 Data collection
We used the data set provided in SemEval 2013 and 2014 for subtask B of sentiment analysis in Twitter (Rosenthal, Ritter et al. 2014; Wilson, Kozareva et al. 2013). The participants were provided with training tweets annotated as positive, negative or neutral. We downloaded these tweets using a given script. Among 9,646 tweets, we could only download 8,498 of them because of protected profiles and deleted tweets. Then, we used the development set containing 1,654 tweets for evaluating our methods. We combined the development set with the training set and built a new model which predicted the labels of the 2013 and 2014 test sets.

4.2 Experiments

Official results: the results of our system submitted for the SemEval evaluation gave 46.38% and 52.02% for the 2013 and 2014 test sets respectively. It should be mentioned that these results are not correct because of a software bug discovered after the submission deadline; the correct results are reported as non-official results. The official figures are the output of our classifier trained with all the features of Section 3, but, because of an index shifting error, the test set was represented by all the features except the terms.

Non-official results: we ran various experiments using the features presented in Section 3 with a Multinomial Naive Bayes model. Z_score features improve the performance by 6.5% and 10.9%, and pre-polarity features also improve the F-measure by 4% and 6%, but extending with POS tags decreases the F-measure. We also tested all feature combinations (Table 2): POS tags are not useful in any of the experiments, and the best result is obtained by combining Z_score and pre-polarity features. Z_score features improve the F-measure significantly and are better than pre-polarity features.

[Figures 1-3: Z_score distribution in the positive, neutral and negative classes.]

Table 2. Average F-measures for the positive and negative classes of the SemEval 2013 and 2014 test sets.
  Features            F-measure 2013   F-measure 2014
  Terms                    49.42            46.31
  Terms+Z                  55.90            57.28
  Terms+POS                43.45            41.14
  Terms+POL                53.53            52.73
  Terms+Z+POS              52.59            54.43
  Terms+Z+POL              58.34            59.38
  Terms+POS+POL            48.42            50.03
  Terms+Z+POS+POL          55.35            58.58

We repeated all previous experiments using a Twitter dictionary, where we extend the tweet with the expressions related to each emoticon or abbreviation it contains. The results in Table 3 show that using this dictionary improves the F-measure in all the experiments; the best results are again obtained by combining Z_scores and pre-polarity features.

Table 3. Average F-measures for the positive and negative classes of the SemEval 2013 and 2014 test sets after using a Twitter dictionary.
  Features            F-measure 2013   F-measure 2014
  Terms                    50.15            48.56
  Terms+Z                  57.17            58.37
  Terms+POS                44.07            42.64
  Terms+POL                54.72            54.53
  Terms+Z+POS              53.20            56.47
  Terms+Z+POL              59.66            61.07
  Terms+POS+POL            48.97            51.90
  Terms+Z+POS+POL          55.83            60.22

5 Conclusion
In this paper we tested the impact of using a Twitter dictionary, sentiment lexicons, Z_score features and POS tags for the sentiment classification of tweets. We extended the feature vector of tweets with all these features; we proposed a new type of feature, Z_score, and demonstrated that it can improve the performance. We think that Z_score can be used in different ways to improve sentiment analysis; we are going to test it on another type of corpus and use other methods to combine these features.
[Hamdan, Béchet & Bellot, SemEval 2014]
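A small sketch of the Z_score of Eq. (2), assuming the count-based binomial reading of the formula; the toy token lists and the helper are illustrative, not the authors' code.

import math
from collections import Counter

# Sketch of Z_score (Eq. 2): compare the observed frequency of term t in class C_j with its
# expected frequency under the corpus-wide probability P(t). Toy data only.
class_tokens = {
    "positive": ["love", "good", "happy", "good", "wait"],
    "negative": ["not", "bad", "hate", "not", "sad"],
    "neutral":  ["open", "live", "nov", "cloud", "begin"],
}

corpus_counts = Counter(tok for toks in class_tokens.values() for tok in toks)
corpus_size = sum(corpus_counts.values())

def z_score(term, class_name):
    n_j = len(class_tokens[class_name])                  # number of terms in class C_j
    tf_ij = class_tokens[class_name].count(term)         # frequency of the term in C_j
    p_t = corpus_counts[term] / corpus_size               # term probability over the corpus
    mean = n_j * p_t                                       # expected frequency in C_j
    sd = math.sqrt(n_j * p_t * (1.0 - p_t))                # binomial standard deviation
    return (tf_ij - mean) / sd if sd > 0 else 0.0

print(round(z_score("good", "positive"), 2))   # salient in the positive class
print(round(z_score("good", "negative"), 2))   # negative: rarer than expected in this class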
Best official 2013 results

  Run           Constrained   Unconstrained   Use Neut.?   Supervised?
  NRC-Canada       69.02            -             yes          yes
  GU-MLT-LT        65.27            -             yes          yes
  teragram         64.86         64.86 (1)        yes          yes
  BOUNCE           63.53            -             yes          yes
  KLUE             63.06            -             yes          yes
  AMI&ERIC         62.55         61.17 (3)        yes          yes/semi
  FBM              61.17            -             yes          yes
  AVAYA            60.84         64.06 (2)        yes          yes/semi
  SAIL             60.14         61.03 (4)        yes          yes
  UT-DB            59.87            -             yes          yes
  FBK-irst         59.76            -             yes          yes

  Run           Constrained   Unconstrained
  NRC-Canada       68.46            -
  GU-MLT-LT        62.15            -
  KLUE             62.03            -
  AVAYA            60.00         59.47 (1)
  teragram            -          59.10 (2)
  NTNU             57.97         54.55 (6)
  CodeX            56.70            -
  FBK-irst         54.87            -
  AMI&ERIC         53.63         52.62 (7)
  ECNUCS           53.21         54.77 (5)
  UT-DB            52.46            -
[Hamdan, Bellot & Béchet, SemEval 2014]
Subjectivity lexicon : MPQA
- The MPQA (Multi-Perspective Question Answering) Subjectivity Lexicon
http://mpqa.cs.pitt.edu
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.
http://wordnetweb.princeton.edu/perl/webwn http://www.cs.rochester.edu/research/cisd/wordnet
http://sentiwordnet.isti.cnr.it
Aspect Based Sentiment Analysis
— Dataset: 3K English sentences from restaurant reviews + 3K English sentences extracted from customer reviews of laptops, tagged by experienced human annotators
— We proposed:

1. Aspect term extraction: CRF model
2. Aspect term polarity detection: Multinomial Naive Bayes classifier with features such as Z-score, POS and prior polarity extracted from the Subjectivity Lexicon (Wilson, Wiebe et al. 2005) and Bing Liu's Opinion Lexicon
3. Category detection & category polarity detection: Z-score model

The Z_score of a term t_i in class C_j (t_ij) is computed from its term relative frequency tfr_ij in class C_j, the mean mean_i (the term probability over the whole corpus multiplied by n_j, the number of terms in class C_j), and the standard deviation sd_i of term t_i over the underlying corpus:

  Z_score(t_ij) = (tfr_ij - mean_i) / sd_i                                        Eq. (1)

  Z_score(t_ij) = (tfr_ij - n_j · P(t_i)) / sqrt(n_j · P(t_i) · (1 - P(t_i)))     Eq. (2)

Z_score was exploited for sentiment analysis by (Zubaryeva and Savoy 2010): they chose a threshold (Z > 2) to select the terms whose Z_score exceeds it, then used a logistic regression to combine these scores. We use Z_score as added features for the multinomial Naive Bayes classifier.

3.4 Subtask 4: Category Polarity Detection
We used Multinomial Naive Bayes as in step (2) of subtask 2, with the same features, except that we also add the name of the category as a feature. Thus, for each sentence having n categories we add n examples to the training set; the only difference between them is the category feature.

4 Experiments and Evaluations
We tested our system using the training and testing data provided by the SemEval 2014 ABSA task. Two data sets were provided: the first contains 3K sentences of restaurant reviews annotated with the aspect terms, their polarities, their categories, and the polarities of each category; the second contains 3K sentences of laptop reviews annotated only with the aspect terms and their polarities.

The evaluation process was done in two steps. The first step concerns subtasks 1 and 3. Our system is 24% and 21% above the baseline for aspect term extraction in restaurant and laptop reviews respectively, and 3% above for category detection in restaurant reviews.

Table 1. Results of subtasks 1 and 3 for restaurant reviews, subtask 1 for laptop reviews
  Data  Subtask              P      R      F
  Res   1       Baseline   0.52   0.42   0.47
                System     0.81   0.63   0.71
        3       Baseline   0.73   0.59   0.65
                System     0.77   0.60   0.68
  Lap   1       Baseline   0.44   0.29   0.35
                System     0.76   0.45   0.56

The second step involves the evaluation of subtasks 2 and 4: we were provided with (1) restaurant review sentences annotated with their aspect terms and categories, for which we had to determine the polarity of each aspect term and category; and (2) laptop review sentences annotated with aspect terms, for which we had to determine the aspect term polarity. Table 2 shows the results of our system and the baseline (A: accuracy, R: number of true retrieved examples, All: number of all true examples).

Table 2. Results of subtasks 2 and 4 for restaurant reviews, subtask 2 for laptop reviews
  Data  Subtask              R     All     A
  Res   2       Baseline   673    1134   0.64
                System     818    1134   0.72
        4       Baseline   673    1025   0.65
                System     739    1025   0.72
  Lap   2       Baseline   336     654   0.51
                System     424     654   0.64

Our system is 8% and 13% above the baseline for aspect term polarity detection in restaurant and laptop reviews respectively, and 7% above for category polarity detection in restaurant reviews.
IR and Digital Libraries
Social Book Search
INEX topics
INEX 2014 Social Book Search Track
— In 2014, the Social Book Search Track consists of two tasks:
• Suggestion task: a system-oriented batch retrieval/recommendation task
• Interactive task: a user-oriented interactive task where we want to gather user data on
searching for different search tasks and different search interfaces.
— 2.8 million book descriptions with metadata from Amazon and LibraryThing
— 14 million reviews (1.5 million books have no review)
— Amazon: formal metadata like booktitle, author, publisher, publication year, library classification
codes, Amazon categories and similar product information, as well as user-generated content in the
form of user ratings and reviews
— LibraryThing: user tags and user-provided metadata on awards, book characters and locations, and blurbs
https://inex.mmci.uni-saarland.de/tracks/books/
Table 1. Some facts about the Amazon collection.
  Number of pages (i.e. books)                      2,781,400
  Number of reviews                                15,785,133
  Number of pages that contain at least one review  1,915,336
3 Retrieval model
3.1 Sequential Dependence Model
Like the previous year, we used a language modeling approach to retrieval [4].
We use Metzler and Croft’s Markov Random Field (MRF) model [5] to integrate
multiword phrases in the query. Specifically, we use the Sequential Dependance
Run nDCG@10 P@10 MRR MAP
p4-inex2011SB.xml social.fb.10.50 0.3101 0.2071 0.4811 0.2283
p54-run4.all-topic-fields.reviews-split.combSUM 0.2991 0.1991 0.4731 0.1945
p4-inex2011SB.xml social 0.2913 0.1910 0.4661 0.2115
p4-inex2011SB.xml full.fb.10.50 0.2853 0.1858 0.4453 0.2051
p54-run2.all-topic-fields.all-doc-fields 0.2843 0.1910 0.4567 0.2035
p62.recommendation 0.2710 0.1900 0.4250 0.1770
p54-run3.title.reviews-split.combSUM 0.2643 0.1858 0.4195 0.1661
p62.sdm-reviews-combine 0.2618 0.1749 0.4361 0.1755
p62.baseline-sdm 0.2536 0.1697 0.3962 0.1815
p62.baseline-tags-browsenode 0.2534 0.1687 0.3877 0.1884
p4-inex2011SB.xml full 0.2523 0.1649 0.4062 0.1825
wiki-web-nyt-gw 0.2502 0.1673 0.4001 0.1857
p4-inex2011SB.xml amazon 0.2411 0.1536 0.3939 0.1722
p62.sdm-wiki 0.1953 0.1332 0.3017 0.1404
p62.sdm-wiki-anchors 0.1724 0.1199 0.2720 0.1253
p4-inex2011SB.xml lt 0.1592 0.1052 0.2695 0.1199
p18.UPF QE group BTT02 0.1531 0.0995 0.2478 0.1223
p18.UPF QE genregroup BTT02 0.1327 0.0934 0.2283 0.1001
p18.UPF QEGr BTT02 RM 0.1291 0.0872 0.2183 0.0973
p18.UPF base BTT02 0.1281 0.0863 0.2135 0.1018
p18.UPF QE genre BTT02 0.1214 0.0844 0.2089 0.0910
p18.UPF base BT02 0.1202 0.0796 0.2039 0.1048
p54-run1.title.all-doc-fields 0.1129 0.0801 0.1982 0.0868
Table 2. O cial results of the Best Books for Social Search task of the INEX 2011
Book track, using judgements derived from the LibraryThing discussion groups. Our
runs are identified by the p62 prefix and are in boldface.
Run nDCG@10 P@10 MRR MAP
p62.baseline-sdm 0.6092 0.5875 0.7794 0.3896
p4-inex2011SB.xml amazon 0.6055 0.5792 0.7940 0.3500
p62.baseline-tags-browsenode 0.6012 0.5708 0.7779 0.3996
p4-inex2011SB.xml full 0.6011 0.5708 0.7798 0.3818
p4-inex2011SB.xml full.fb.10.50 0.5929 0.5500 0.8075 0.3898
p62.sdm-reviews-combine 0.5654 0.5208 0.7584 0.2781
p4-inex2011SB.xml social 0.5464 0.5167 0.7031 0.3486
p4-inex2011SB.xml social.fb.10.50 0.5425 0.5042 0.7210 0.3261
p54-run2.all-topic-fields.all-doc-fields 0.5415 0.4625 0.8535 0.3223
Table 3. Top runs of the Best Books for Social Search task of the INEX 2011 Book
track, using judgements obtained by crowdsourcing (Amazon Mechanical Turk). Our
runs are identified by the p62 prefix and are in boldface.
Model (SDM), which is a special case of the MRF. In this model three features
are considered: single term features (standard unigram language model features,
fT ), exact phrase features (words appearing in sequence, fO) and unordered
window features (require words to be close together, but not necessarily in an
exact sequence order, fU ).
Documents are thus ranked according to the following scoring function:
scoreSDM (Q, D) = T
X
q2Q
fT (q, D)
+ O
|Q| 1
X
i=1
fO(qi, qi+1, D)
+ U
|Q| 1
X
i=1
fU (qi, qi+1, D)
where the features weights are set according to the author’s recommendation
( T = 0.85, O = 0.1, U = 0.05). fT , fO and fU are the log maximum likelihood
estimates of query terms in document D, computed over the target collection
with a Dirichlet smoothing.
3.2 External resources combination
As previously done last year, we exploited external resources in a Pseudo-Relevance
Feedback (PRF) fashion to expand the query with informative terms. Given a resource R,
we form a subset R_Q of informative documents for the initial query Q using
pseudo-relevance feedback. To this end we first rank the documents of R using the SDM
ranking function. An entropy measure H_{R_Q}(t) is then computed for each term t over
R_Q in order to weight the terms according to their relative informativeness:

H_{R_Q}(t) = - \sum_{w \in t} p(w|R_Q) \cdot \log p(w|R_Q)
These external weighted terms are finally used to expand the original query.
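A minimal sketch of this expansion step, assuming single-word terms and a relative-frequency estimate of p(w|R_Q); the function name and the top_k cut-off are illustrative and not part of the original system.

```python
import math
from collections import Counter

def entropy_expansion_terms(feedback_docs, top_k=20):
    """Weight candidate expansion terms over a feedback set R_Q.

    feedback_docs: list of token lists, i.e. the top documents retrieved
    from an external resource with the SDM ranking function.
    Returns the top_k (term, weight) pairs, to be appended to the query.
    """
    counts = Counter(t for doc in feedback_docs for t in doc)
    total = sum(counts.values())
    weights = {}
    for term, tf in counts.items():
        p = tf / total                     # p(w | R_Q), relative frequency estimate
        weights[term] = -p * math.log(p)   # entropy-style informativeness weight
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```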
Sequential Dependence Model (SDM) - Markov Random Field (Metzler & Croft, 2004)
We use our SDM baseline defined in section 3.1 and incorporate the above
recommendation estimate:

score_{recomm}(Q, D) = \lambda_D \, score_{SDM}(Q, D) + (1 - \lambda_D) \, t_D

where the \lambda_D parameter was set based on observations over the test topics
made available to participants for training purposes. Indeed, we observed on
these topics that t_D had no influence on the ranking of documents after the
hundredth result (average estimation). Hence we fix the smoothing parameter to:

\lambda_D = \frac{\arg\max_D score_{SDM}(Q, D) - score_{SDM}(Q, D)_{100}}{N_{Results}}

In practice, this approach is a re-ranking of the results of the SDM retrieval
model based on the popularity and the likability of the different books.
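Read literally, the combination above amounts to the following re-ranking sketch; it is an illustration under the stated formulas, with illustrative variable names, not the authors' code.

```python
def rerank_with_likability(sdm_results, t_scores):
    """Re-rank SDM results with the likability/popularity estimate t_D.

    sdm_results: list of (doc_id, sdm_score) pairs, sorted by decreasing score.
    t_scores: dict mapping doc_id to its t_D estimate (e.g. from Welch's t-test).
    """
    n_results = len(sdm_results)
    top_score = sdm_results[0][1]
    score_100 = sdm_results[min(99, n_results - 1)][1]   # score of the 100th result
    lambda_d = (top_score - score_100) / n_results        # smoothing parameter

    reranked = [(doc, lambda_d * s + (1.0 - lambda_d) * t_scores.get(doc, 0.0))
                for doc, s in sdm_results]
    return sorted(reranked, key=lambda kv: kv[1], reverse=True)
```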
4 Runs
together. Child node pages (or sub-articles) are weighted half that of their
parents in order to minimize a potential topic drift. We avoid loops in the graph
(i.e. a child node cannot be linked to one of its elders) because that brings
no additional information and could change the weights between linked articles.
Informative words are then extracted from the sub-articles and incorporated into
our retrieval model like any other external resource.
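A small sketch of the weighting scheme just described, assuming the resource is available as a parent-to-children mapping (an assumed data structure, not the original implementation).

```python
def weight_subarticles(root, children_of):
    """Assign weights to sub-articles: each child gets half its parent's weight.

    root: id of the article matching the query (weight 1.0).
    children_of: dict mapping an article id to the ids of its sub-articles.
    Nodes already weighted are skipped, so loops back to an elder are ignored.
    """
    weights = {root: 1.0}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for child in children_of.get(node, []):
            if child in weights:          # avoid loops in the graph
                continue
            weights[child] = weights[node] / 2.0
            frontier.append(child)
    return weights
```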
3.4 Social opinion for book search
The test collection used this year for the Book Track contains Amazon pages
of books. These pages are composed, among other things, of editorial information
(such as the number of pages or the blurb), user ratings and user reviews. However,
contrary to previous years, the actual content of the books is not available.
Hence, the task is to rank books according to the sparse informative content and
the opinion of readers expressed in the reviews, considering that the user ratings
are integers between 1 and 5.
Here, we wanted to model two social popularity assumptions: a product that
has a lot of reviews must be relevant (or at least popular), and a highly rated
product must be relevant. Then, a product having a large number of good reviews
really must be relevant. However, in the collection there is often only a small
number of ratings for a given book. The challenge was to determine whether each
user rating is significant or not. To do so, we first define X^D_R, a random set
of "bad" ratings (1, 2 or 3 out of 5 points) for book D. Then, we evaluate the
statistically significant differences between X^D_R and X^D_R ∪ X^D_U using
Welch's t-test, where X^D_U is the actual set of user ratings for book D. The
statistical test is computed by:
t_D = \frac{\overline{X^D_R \cup X^D_U} - \overline{X^D_U}}{s_{\overline{X^D_R \cup X^D_U} - \overline{X^D_U}}}

where

s_{\overline{X^D_R \cup X^D_U} - \overline{X^D_U}} = \sqrt{\frac{s^2_{RU}}{n_{RU}} + \frac{s^2_U}{n_U}}

where s^2 is the unbiased estimator of the variance of the two sets and n_X is the
number of ratings for set X.
The underlying assumption is that significant differences occur in two
different situations. First, when there is a small number of user ratings (X^D_U)
but they all are very good; for example, this is the case of good but little-known
books. Second, when there is a very large number of user ratings but they are
average. Hence this statistical test gives us a single estimate of both likability
and popularity.
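For illustration, the t_D estimate can be computed with SciPy's Welch's t-test; in the sketch below the "bad" rating set X_R is drawn at random, and its size n_random is an assumption not specified in the slides.

```python
import random
from scipy.stats import ttest_ind

def likability_estimate(user_ratings, n_random=30, seed=0):
    """Welch's t-test between random 'bad' ratings and bad + observed ratings.

    user_ratings: observed integer ratings (1..5) for one book.
    Returns the t statistic t_D used as a likability/popularity estimate.
    """
    rng = random.Random(seed)
    x_r = [rng.choice([1, 2, 3]) for _ in range(n_random)]   # random 'bad' ratings X_R
    x_ru = x_r + list(user_ratings)                          # X_R union X_U
    # equal_var=False selects Welch's t-test (unequal variances)
    t_stat, _p_value = ttest_ind(x_ru, list(user_ratings), equal_var=False)
    return t_stat
```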
Statistical test between the observed ratings and random ratings
Is a given rating significant?
ANR CAAS project
P. Bellot (AMU-CNRS, LSIS-OpenEdition)
Query Expansion with Concepts from DBPedia
40
P. Bellot (AMU-CNRS, LSIS-OpenEdition)
Terms only vs. Extended Features
— We modeled book likability on the following idea: the more reviews a book has,
the more interesting it is (it may not be a good or popular book, but it is a book with a high
impact)
— The InL2 information retrieval model alone (a DFR model, Divergence From Randomness) seems
to perform better than SDM (language modeling) with extended features
41
Benkoussas, Hamdan, Albitar, Ollagnier & Bellot, 2014

OpenEdition Lab projects in Text Mining

  • 1. OPENEDITION LAB TEXT MINING PROJECTS Patrice  Bellot
 Aix-­‐Marseille  Université  -­‐  CNRS  (LSIS  UMR  7296  ;  OpenEdition)   ! patrice.bellot@univ-­‐amu.fr LSIS  -­‐  DIMAG  team  http://www.lsis.org/spip.php?id_rubrique=291   OpenEdition  Lab  :  http://lab.hypotheses.org
  • 2. Hypotheses! 600+ blogs Revues.org! 300+ journals Calenda! 20 000+ events OpenEdition Books! 1000+ books A  European  Web   platform  for  Human   and  Social  Sciences A  digital  infrastructure   for  open  access A  lab  for  experimenting   new  Text  Mining  
 and  new  IR  systems
  • 3. Open Edition - a Facility of Excellence 3 2012-2020 7 millions € Objectives:! 15 000 + books! 2000 + blogs! Freemium! Multilingual
  • 4. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) OpenEdition Lab — Our Team Directors : Patrice Bellot (Professor in Comp. Sc. / NLP / IR) - Marin Dacos (Head of OpenEdition) Engineers : Elodie Faath - Arnaud Cordier PhD Students : Hussam Hamdan, Chahinez Benkoussas, Anaïs Ollagnier Post-docs : Young-Min Kim (2012-13), Shereen Albitar (2014) 4 http://lab.hypotheses.org
  • 5. • 220  learned  societes  and  centers  (France)   • 30  university  presses  (France,  UK,  Belgium,  Switzerland,  Canada,  Mexico,  Hungary/USA)   • CCSD  –  France  -­‐  Lyon  (HAL  /  DataCenter),   • CHNM  –  USA  –  Washington,   • OAPEN  –  NL  -­‐  The  Hague,   • UNED  –  Spain  -­‐  Universidad  Nacional  de  Educación  a  Distancia,   • Fundação  Calouste  Gulbenkian  –  Portugal,   • Max  Weber  Stinftung  –  Germany,   • Google  –  USA  (Google  Grants  for  DH),   • DARIAH  –  Europe. Our partners And  you?
  • 6. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) OpenEdition Lab (Text Mining Projects for DL) Aims to : — Link papers / books / blogs automatically (reference analysis, Named Entities…) — Detect hot topics, hot books, hot papers : content oriented analysis (not only by using logs)
 - sentiment analysis
 - review of books analysis (and finding) — Book searching with complex and long queries — Reading recommandation 6
  • 7. Project 1: BILBO EN SVM CRF Natural  Language  Processing  /  Text  Mining  /  Information  Retrieval  /  Machine  Learning
  • 8. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 8 Projet N°2 : ECHO! Détec'on!automa'que!de!compte0rendus!de!lecture! LREC,&2014! Mise!en!rela'on!! (BILBO)! Recherche!Web! Analyse!de! sen'ments! Mesure&de&l’écho! NAACL-SEMEVAL,&2013! logs,!métriques…!
  • 9. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 9 Projet N°3 : COOKER! BILBO! ECHO! Graphe!des!contenus! (puis!hypergraphe)! Recommanda)on! COOKER! Classifica?on! automa?que! et!métaCdonnées! (thèmes,!langues,! auteurs…)!
  • 10. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Semantic Annotation of Bib. References 10
  • 11. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 11 A  –  references  in  a  specific  section
  • 12. B  :  references  in  notes C  :  references  in  the  body
  • 13. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) BILBO: A software for Annotating Bibliographical Reference in Digital Humanities Google Digital Humanities Research Awards (2011, 2012) State of the art : 
 — CiteSeer system (Giles et al., 1998) in computer science, 80% of precision for author and 40% for pages. Conditional Random Fields (CRFs) (Peng et al., 2006, Lafferty et al., 2001) for scientific articles, 95% of average precision (99% for author, 95% for title, 85% for editor). — Run on the cover page (title) and/or on the Reference section at the end of papers : not in the footnotes, not in the text body — Not very robust (in the real world : no stylesheets - poorly respected) ! 13
  • 14. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 14 References – three levels Architecture Source XHTML (TEI guidelines) LEARNING AUTOMATIC ANNOTATION TXT Estimated XML files • Revues.org online journals - 340 journals - Various reference formats - 20 different languages (90% in French) • Unstructured and scattered reference data • Prototype development, Web service ! source code will be distributed (GPL) • Google Digital Humanities Research Awards (’10, ’11) Part of Equipex future investment award: DILOH (’12) Web Service Plain text input Future platform Level 1 Level 2 Learning data Tokenizer, Extractor New data Tokenizer, Extractor Manual annotation External machine learning modules Level 1 model Level 2 model Level 3 model Machine learning modules Mallet, SVMlight Conditional Random Fields Automatic annotator Call a model Level 1 Level 2 Bibliography Notes Implicit References Level 3 Comparison with other online tools New Data : Reference data of library of Kim  &  Bellot,  2012
  • 15. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Conditional Random Fields for IE — A discriminative model that is specified over a graph that encodes the conditional dependencies (relationships between observations) — Can be employed for sequential labeling (linear chain CRF) — Take context into account — The probability of a label sequence y given an observation
 sequence x is : 
 
 
 
 
 with F the (rich) feature functions (transition and state functions)
 
 
 
 Parameters must be estimated using an iterative technique such as iterative scaling or gradient- based methods 15 Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
 Proceedings of the 18th International Conference on Machine Learning 2001 (ICML 2001) Yi 1 Yi Yi+1 ? s s - ? s s - ? s s Xi 1 Xi Xi+1 Yi 1 Yi Yi+1 c 6 s - c 6 s - c 6 s Xi 1 Xi Xi+1 Yi 1 Yi Yi+1 c s c s c s Xi 1 Xi Xi+1 Figure 2. Graphical structures of simple HMMs (left), MEMMs (center), and the chain-structured case of CRFs (right) for sequences. An open circle indicates that the variable is not generated by the model. sequence. In addition, the features do not need to specify completely a state or observation, so one might expect that the model can be estimated from less training data. Another attractive property is the convexity of the loss function; in- deed, CRFs share all of the convexity properties of general maximum entropy models. For the remainder of the paper we assume that the depen- dencies of Y, conditioned on X, form a chain. To sim- plify some expressions, we add special start and stop states Y0 = start and Yn+1 = stop. Thus, we will be using the graphical structure shown in Figure 2. For a chain struc- ture, the conditional probability of a label sequence can be expressed concisely in matrix form, which will be useful in describing the parameter estimation and inference al- gorithms in Section 4. Suppose that p✓(Y | X) is a CRF given by (1). For each position i in the observation se- quence x, we define the |Y| ⇥ |Y| matrix random variable Mi(x) = [Mi(y0 , y | x)] by Mi(y0 , y | x) = exp (⇤i(y0 , y | x)) ⇤i(y0 , y | x) = P k k fk(ei, Y|ei = (y0 , y), x) + P k µk gk(vi, Y|vi = y, x) , of the training data. Both algorithms are based on the im- proved iterative scaling (IIS) algorithm of Della Pietra et al. (1997); the proof technique based on auxiliary functions can be extended to show convergence of the algorithms for CRFs. Iterative scaling algorithms update the weights as k k + k and µk µk + µk for appropriately chosen k and µk. In particular, the IIS update k for an edge feature fk is the solution of eE[fk] def = X x,y ep(x, y) n+1X i=1 fk(ei, y|ei , x) = X x,y ep(x) p(y | x) n+1X i=1 fk(ei, y|ei , x) e kT (x,y) . where T(x, y) is the total feature count T(x, y) def = X i,k fk(ei, y|ei , x) + X i,k gk(vi, y|vi , x) . The equations for vertex feature updates µk have similar tj(yi−1, yi, x, i) = b(x, i) if yi−1 = IN and yi = NNP 0 otherwise. In the remainder of this report, notation is simplified by writing s(yi, x, i) = s(yi−1, yi, x, i) and Fj(y, x) = n i=1 fj(yi−1, yi, x, i), where each fj(yi−1, yi, x, i) is either a state function s(yi−1, yi, x, i) or a transi- tion function t(yi−1, yi, x, i). This allows the probability of a label sequence y given an observation sequence x to be written as p(y|x, λ) = 1 Z(x) exp ( j λjFj(y, x)). (3) Z(x) is a normalization factor. 4 exp ( j λjtj(yi−1, yi, x, i) + k µksk(yi, x, i)), (2) where tj(yi−1, yi, x, i) is a transition feature function of the entire observation sequence and the labels at positions i and i−1 in the label sequence; sk(yi, x, i) is a state feature function of the label at position i and the observation sequence; and λj and µk are parameters to be estimated from training data. When defining feature functions, we construct a set of real-valued features b(x, i) of the observation to expresses some characteristic of the empirical dis- tribution of the training data that should also hold of the model distribution. An example of such a feature is b(x, i) = 1 if the observation at position i is the word “September” 0 otherwise. 
Each feature function takes on the value of one of these real-valued observation features b(x, i) if the current state (in the case of a state function) or previous and current states (in the case of a transition function) take on particular val- ues. All feature functions are therefore real-valued. For example, consider the following transition function: tj(yi−1, yi, x, i) = b(x, i) if yi−1 = IN and yi = NNP 0 otherwise. In the remainder of this report, notation is simplified by writing s(yi, x, i) = s(yi−1, yi, x, i) and Fj(y, x) = n i=1 fj(yi−1, yi, x, i), where each fj(yi−1, yi, x, i) is either a state function s(yi−1, yi, x, i) or a transi- tion function t(yi−1, yi, x, i). This allows the probability of a label sequence y given an observation sequence x to be written as p(y|x, λ) = 1 Z(x) exp ( j λjFj(y, x)). (3) Z(x) is a normalization factor. 4 1.2 Graphical Models 7 Logistic Regression HMMs Linear-chain CRFs Naive Bayes SEQUENCE SEQUENCE CONDITIONAL CONDITIONAL Generative directed models General CRFs CONDITIONAL General GRAPHS General GRAPHS Figure 1.2 Diagram of the relationship between naive Bayes, logistic regression, HMMs, linear-chain CRFs, generative models, and general CRFs. Furthermore, even when naive Bayes has good classification accuracy, its prob- ability estimates tend to be poor. To understand why, imagine training naive Bayes on a data set in which all the features are repeated, that is, x = (x1, x1, x2, x2, . . . , xK, xK). This will increase the confidence of the naive Bayes probability estimates, even though no new information has been added to the data. Assumptions like naive Bayes can be especially problematic when we generalize to sequence models, because inference essentially combines evidence from di↵erent parts of the model. If probability estimates at a local level are overconfident, it might be di cult to combine them sensibly. Actually, the di↵erence in performance between naive Bayes and logistic regression is due only to the fact that the first is generative and the second discriminative; the two classifiers are, for discrete input, identical in all other respects. Naive Bayes and logistic regression consider the same hypothesis space, in the sense that any logistic regression classifier can be converted into a naive Bayes classifier with the same decision boundary, and vice versa. Another way of saying this is that the naive Bayes model (1.5) defines the same family of distributions as the logistic regression model (1.7), if we interpret it generatively as p(y, x) = exp { P k kfk(y, x)} P ˜y,˜x exp { P k kfk(˜y, ˜x)} . (1.9) This means that if the naive Bayes model (1.5) is trained to maximize the con- ditional likelihood, we recover the same classifier as from logistic regression. Con- versely, if the logistic regression model is interpreted generatively, as in (1.9), and is 1.3 Linear-Chain Conditional Random Fields 9 . . . . . . y x Figure 1.3 Graphical model of an HMM-like linear-chain CRF. . . . . . . y x Figure 1.4 Graphical model of a linear-chain CRF in which the transition score depends on the current observation. 1.3 Linear-Chain Conditional Random Fields
  • 16. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 16 Table V: Verified input, local and global features. The selected ones in BILBO are written in black, and the non-selected ones are in gray. Input features Feature category Description Raw input token (I1) Tokenized word itself in the input string and the lowercased word Preceding or following tokens (I2) Three preceding and three following tokens of current token N-gram (I3) Attachment of preceding or following N-gram tokens Prefix/suffix in character level (I4) 8 different prefix/suffix as in [Councill et al. 2008] Local features Feature cate- gory Feature name Description Example Number ALLNUMBERS All characters are numbers 1984 (F1) NUMBERS One or more characters are numbers in-4 DASH One or more dashes are included in numbers 665-680 (F1digit) 1DIGIT, 2DIGIT ... If number, number of digits in it 5, 78, ... Capitalization ALLCAPS All characters are capital letters RAYMOND (F2) FIRSTCAP First character is capital letter Paris ALLSAMLL All characters are lower cased pouvoirs NONIMPCAP Capital letters are mixed dell’Ateneo Regular form INITIAL Initialized expression Ch.-R. (F3) WEBLINK Regular expression for web pages apcss.org Emphasis (F4) ITALIC Italic characters Regional Location BIBL START Position is in the first one-third of reference - (F5) BIBL IN Position is between the one-third and two-third - BIBL END Position is between the two-third and the end - Lexicon POSSEDITOR Possible for the abbreviation of editor ed. (F6) POSSPAGE Possible for the abbreviation of page pp. POSSMONTH Possible for month September POSSVOLUME Possible for the abbreviation of volume vol. External list SURNAMELIST Found in an external surname list RAYMOND (F7) FORENAMELIST Found in an external forename list Simone PLACELIST Found in an external place list New York JOURNALLIST Found in an external journal list African Affaire Punctuation (F8) COMMA, POINT, LINK, PUNC, LEAD- INGQUOTES, END- INGQUOTES, PAIREDBRACES Punctuation mark itself (comma, point) or punc- tuation type. These features are defined espe- cially for the case of non-separated punctuation. 46-55, 1993. S.; [en “The design”. (1) Global features Feature category Feature name Description Local feature existence [local feature name] Corresponding local feature is found in the input string (G1) (F3, F4, and F6 features are finally selected) Feature distribution (G2) NOPUNC, 1PUNC, 2PUNC, MORE- PUNC There are no, 1, 2, or more PUNC features in the in- put string NONUMBER There is no number in the input string STARTINITIAL The input string starts with an initial expression ENDQUOTECOMMA An ending quote is followed by a comma FIRSTCAPCOMMA A token having FIRSTCAP feature is followed by a comma Kim  &  Bellot,  2013
  • 17. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 17 Fig. 3: Basic tokenization effect. Each point is the averaged value of 10 different cross- validated experiments. ● ● ● ● ● Cumulative local feature effect Training sets Micro−averagedF−measure 50% 60% 70% 80% 90% 757779818385 ● F0(Base) F1(Num.) F2(Cap.) F3(Reg.) F4(Emp.) F5(Loc.) F6(Lex.) F7(Ext.) F8(Pun.) (a) Corpus level 1 ● ● ● ● ● Cumulative local feature effect Training sets Micro−averagedF−measure 50% 60% 70% 80% 90% 868890929496 ● F0(Base) F1(Num.) F2(Cap.) F3(Reg.) F4(Emp.) F5(Loc.) F6(Lex.) F7(Ext.) F8(Pun.) (b) Cora dataset Fig. 4: Cumulative local feature effect from F1 to F8 with C1 and Cora uation. We repeat cross validations by cumulatively adding features of each category from F1 to F8. Too detailed features such as that of category F1-sub are excluded here because by testing the detailed ones at the end, we want to eliminate them if they Kim  &  Bellot,  2013
  • 18. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 18 Reference Parsing in Digital Humanities 39:25 ● ● ● ● ● Cumulative List feature effect Training sets Micro−averagedF−measure 50% 60% 70% 80% 90% 8283848586 ● F6 F7a F7b F7c F7 (a) Corpus level 1 ● ● ● ● ● Cumulative List feature effect Training sets Micro−averagedF−measure 50% 60% 70% 80% 90% 9192939495 ● F6 F7a F7b F7c F7 (b) Cora dataset Fig. 5: Cumulative external list feature effect Detailed analysis of the effect of external lists and lexicon. One of interesting discoveries from the above analysis is lexical features are not always effective for reference pars- ing. Lexicon features defined with strict rules without overlapping have actually no significant impact, whereas external lists such as surname, forename, place, and jour- Kim  &  Bellot,  2013
  • 19. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 19 39:30 Y.-M. Kim and P. Bellot Table VII: Micro averaged precision and recall per field for C1 and Cora with finally chosen strategy (a) C1 - detailed labels Fields #true #annot. #exist. prec.(%) recall(%) surname 1080 1164 1203 92.78 89.78 forename 1128 1220 1244 92.46 90.68 title(m) 3277 4132 3690 79.31 88.81 title(a) 2782 3253 3069 85.52 90.65 title(j) 440 564 681 78.01 64.61 title(u) 511 660 652 77.42 78.37 title(s) 18 24 118 75.00 15.25 publisher 1021 1367 1171 74.69 87.19 date 793 838 855 94.63 92.75 biblscope(pp) 210 223 219 94.17 95.89 biblscope(i) 152 191 189 79.58 80.42 biblscope(v) 75 87 102 86.21 73.53 extent 66 69 70 95.65 94.29 place 433 524 539 82.63 80.33 abbr 417 468 502 89.10 83.07 nolabel 231 306 488 75.49 47.34 edition 46 178 211 25.84 21.80 orgname 74 87 118 85.06 62.71 bookindicator 47 49 65 95.92 72.31 OTHERS 95 177 395 53.67 24.05 Average 12896 15581 15581 82.77 82.77 (b) Cora dataset Fields #true #annot. #exist. prec.(%) recall(%) author 2797 2855 2830 97.97 98.83 title 3508 3613 3560 97.10 98.54 booktitle 1750 1882 1865 92.99 93.83 journal 546 615 617 88.78 88.49 date 636 641 642 99.22 99.07 institution 268 299 306 89.63 87.58 publisher 165 188 203 87.77 81.28 location 247 279 289 88.53 85.47 editor 232 261 295 88.89 78.64 pages 422 429 438 98.37 96.35 volume 306 327 320 93.58 95.63 tech 130 155 178 83.87 73.03 note 75 122 123 61.48 60.98 Average 11082 11666 11666 95.00 95.00 punctuation is attached, but the latter is significantly negative when reference fields are much detailed. Hypothesis 2 is confirmed with these observations. For our system BILBO, input and local features written in black in Table V are fi- nally selected. BILBO provides two different labeling levels, simple model using only Learning  on  FR  data   Testing  on  US  data (715  references) Kim  &  Bellot,  2013
  • 20. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 20 Test  :  http://bilbo.openeditionlab.org   Sources  :  http://github.com/OpenEdition/bilbo
  • 21. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) EQUIPEX OpenEdition: BILBO 21
  • 22. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) IR and Digital Libraries ! Sentiment Analysis 22
  • 23. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Searching for book reviews • Applying and testing classical supervised approaches for filtering reviews = a new kind of genre classification. • Developing a corpus of reviews of books from the OpenEdition.org platforms and from the Web. • Collecting two kinds of reviews:
 — Long reviews of scientific books written by expert reviewers in scientific journals
 — Short reviews such reader comments on social web sites • Linking reviews to their corresponding books using BILBO 23 Review   ≠   Abstract
  • 24. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Searching for book reviews • A supervised classification approach • Feature selection : decision trees, Z-score • Features : localisation of named entities, 24 was performed. We can see that a lot of this fea- tures relate to the classe where they predominate. Table 3: Distribution of the 30 highest normalized Z scores across the corpus. # Feature Z score # Feature Z score 1 abandonne 30.14 16 winter 9.23 2 seront 30.00 17 cleo 8.88 3 biographie 21.84 18 visible 8.75 4 entranent 21.20 19 fondamentale 8.67 5 prise 21.20 20 david 8.54 6 sacre 21.20 21 pratiques 8.52 7 toute 20.70 22 signification 8.47 8 quitte 19.55 23 01 8.38 9 dimension 15.65 24 institutionnels 8.38 10 les 14.43 25 1930 8.16 11 commandement 11.01 26 attaques 8.14 12 lie 10.61 27 courrier 8.08 13 construisent 10.16 28 moyennes 7.99 14 lieux 10.14 29 petite 7.85 15 garde 9.75 30 adapted 7.84 In our training corpus, we have 106 911 words obtained from the Bag-of-Words approach. We se- lected all tokens (features) that appear more than 5 times in each classes. The goal is therefore to design a method capable of selecting terms that clearly belong to one genre of documents. We ob- know, this section contains authors’ names, loca- tions, dates, etc... However, in the Review class this section is quite often absent. Based on this analysis, we tagged all documents of each class using the Named Entity Recognition tool TagEN (Poibeau, 2003). We aim to explore the distribu- tion of 3 named entities (”authors’ names”, ”loca- tions” and ”dates”) in the text after removing all XML-HTML tags. After that, we divided texts into 10 parts (the size of each part = total num- ber of words / 10). The distribution ratio of each named entity in each part is used as feature to build the new document representation and we obtained a set of 30 features. Figure 3: ”Person” named entity distribution 6 Experiments Figure 4: ”Location” named entity distribution Figure 5: ”Date” named entity distribution 6.2 Support Vector Machines (SVM) SVM designates a learning approach introduced by Vapnik in 1995 for solving two-class pattern recognition problem (Vapnik, 1995). The SVM method is based on the Structural Risk Mini- mization principle (Vapnik, 1995) from computa- tional learning theory. In their basic form, SVMs learn linear threshold function. Nevertheless, by a simple plug-in of an appropriate kernel func- tion, they can be used to learn linear classifiers, radial basic function (RBF) networks, and three- layer sigmoid neural nets (Joachims, 1998). The key in such classifiers is to determine the opti- mal boundaries between the different classes and use them for the purposes of classification (Ag- garwal and Zhai, 2012). Having the vectors form the different representations presented below. we used the Weka toolkit to learning model. This model with the use of the linear kernel and Radial |w| indicates the number of words included in the current document and wj is the number of words that appear in the document. arg max hi P(hi). |w| Y j=1 P(wj|hi) (5) where P(wj|hi) = tfj,hi nhi We estimate the probabilities with the Equation (5) and get the relation between the lexical fre- quency of the word wj in the whole size of the collection Thi (denoted tfj,hi ) and the size of the corresponding corpus. Table 4: Results showing the performances of the classification models using different indexing schemes on the test set. 
The best values for the Review class are noted in bold and those for Review class are are underlined Review Review # Model R P F-M R P F-M 1 NB 65.5% 81.5% 72.6% 81.6% 65.7% 72.8% SVM (Linear) 99.6% 98.3% 98.9% 97.9% 99.5% 98.7% SVM (RBF) 89.8% 97.2% 93.4% 96.8% 88.5% 92.5% * C = 5.0 * = 0.00185 2 NB 90.6% 64.2% 75.1% 37.4% 76.3% 50.2% SVM (Linear) 87.2% 81.3% 84.2% 75.3% 82.7% 78.8% SVM (RBF) 87.2% 86.5% 86.8% 83.1% 84.0% 83.6% * C = 32.0 * = 0.00781 3 NB 80.0% 68.4% 73.7% 54.2% 68.7% 60.6% SVM (Linear) 77.0% 81.9% 79.4% 78.9% 73.5% 76.1% SVM (RBF) 81.2% 48.6% 79.9% 72.6% 75.8% 74.1% * C = 8.0 * = 0.03125 Benkoussas  &  Bellot,  LREC  2014
  • 25. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 25
  • 26. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Sentiment Analysis in Twitter 26 Authorities are only too aware that Kashgar is 4,000 kilometres (2,500 miles) from Beijing but only a tenth of the distance from the Pakistani border, and are desperate to ensure instability or militancy does not leak over the frontiers. Taiwan-made products stood a good chance of becoming even more competitive thanks to wider access to overseas markets and lower costs for material imports, he said. ”March appears to be a more reasonable estimate while earlier admission cannot be entirely ruled out,” according to Chen, also Taiwan’s chief WTO negotiator. friday evening plans were great, but saturday’s plans didnt go as expected – i went dancing & it was an ok club, but terribly crowded :-( WHY THE HELL DO YOU GUYS ALL HAVE MRS. KENNEDY! SHES A FUCKING DOUCHE AT&T was okay but whenever they do something nice in the name of customer service it seems like a favor, while T-Mobile makes that a normal everyday thin obama should be impeached on TREASON charges. Our Nuclear arsenal was TOP Secret. Till HE told our enemies what we had. #Coward #Traitor My graduation speech: ”I’d like to thanks Google, Wikipedia and my computer! :D #iThingteens Table 5: List of example sentences with annotations that were provided to the annotators. All subjective phrases are italicized. Positive phrases are in green, negative phrases are in red, and neutral phrases are in blue. Worker 1 I would love to watch Vampire Diaries :) and some Heroes! Great combination 9/13 Worker 2 I would love to watch Vampire Diaries :) and some Heroes! Great combination 11/13 Worker 3 I would love to watch Vampire Diaries :) and some Heroes! Great combination 10/13 Worker 4 I would love to watch Vampire Diaries :) and some Heroes! Great combination 13/13 Worker 5 I would love to watch Vampire Diaries :) and some Heroes! Great combination 11/13 Intersection I would love to watch Vampire Diaries :) and some Heroes! Great combination Table 6: Example of a sentence annotated for subjectivity on Mechanical Turk. Words and phrases that were marked as subjective are italicized and highlighted in bold. The first five rows are annotations provided by Turkers, and the final row shows their intersection. The final column shows the accuracy for each annotation compared to the intersection. Note that ignoring Fneutral does not reduce the task to predicting positive vs. negative labels only (even though some participants have chosen to do so) since the gold standard still contains neutral For both subtasks, there were teams that only sub- mitted results for the Twitter test set. Some teams submitted both a constrained and an unconstrained version (e.g., AVAYA and teragram). As one would ation methodology. We then summarize the charac- teristics of the approaches taken by the participating systems and we discuss their scores. 2 Task Description We had two subtasks: an expression-level subtask and a message-level subtask. Participants could choose to participate in either or both subtasks. Be- low we provide short descriptions of the objectives of these two subtasks. Subtask A: Contextual Polarity Disambiguation Given a message containing a marked instance of a word or a phrase, determine whether that instance is positive, negative or neutral in that context. The boundaries for the marked in- stance were provided: this was a classification task, not an entity recognition task. 
2 http://www.daedalus.es/TASS/corpus.php this lexicon was used to automatically label addi- tional Tweet/SMS messages and then used with the original data to train the classifier, then such a sys- tem would be unconstrained. 3 Dataset Creation In the following sections we describe the collection and annotation of the Twitter and SMS datasets. 3.1 Data Collection Twitter is the most common micro-blogging site on the Web, and we used it to gather tweets that express sentiment about popular topics. We first extracted named entities using a Twitter-tuned NER system (Ritter et al., 2011) from millions of tweets, which we collected over a one-year period spanning from January 2012 to January 2013; we used the public streaming Twitter API to download tweets. 313 e RT “Until tonight I never realised how fucked up I was” - So, wat interview did you go to? How did it go? om each corpus that contain subjective phrases. al- pro- mpis pan- al., rom d on MS ions task to a yed oal, con- MS res- olar- We lua- Subtask B: Message Polarity Classification Given a message, decide whether it is of positive, negative, or neutral sentiment. For messages conveying both a positive and a negative sentiment, whichever is the stronger one was to be chosen. Each participating team was allowed to submit re- sults for two different systems per subtask: one con- strained, and one unconstrained. A constrained sys- tem could only use the provided data for training, but it could also use other resources such as lexi- cons obtained elsewhere. An unconstrained system could use any additional data as part of the training process; this could be done in a supervised, semi- supervised, or unsupervised fashion. Note that constrained/unconstrained refers to the data used to train a classifier. For example, if other data (excluding the test data) was used to develop a sentiment lexicon, and the lexicon was used to generate features, the system would still be con- strained. However, if other data (excluding the test vey an opinion. Given a sentence, identify whether it is objective, bjective word or phrase in the context of the sentence and mark elow. The number above each word indicates its position. The box so that you can confirm that you chose the correct range. ing one of the radio buttons: positive, negative, or neutral. If a indicating that ”There are no subjective words/phrases”. Please nning if this is your first time answering this hit. kers on Mechanical Turk followed by a screenshot. Total Phrase Count Vocabulary ers Positive Negative Neutral Size 0.0 5,895 3,131 471 20,012 0.0 648 430 57 4,426 1.2 2,734 1,541 160 11,736 5.6 1,071 1,104 159 3,562 tatistics for Subtask A. med tion this ered oned ffer- ds. to- us- We ent- one that Corpus Positive Negative Objective / Neutral Twitter - Training 3,662 1,466 4,600 Twitter - Dev 575 340 739 Twitter - Test 1,573 601 1,640 SMS - Test 492 394 1,208 Table 3: Statistics for Subtask B. We annotated the same Twitter messages with an- notations for subtask A and subtask B. However,
  • 27. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Aspect Based Sentiment Analysis — Subtask 1: Aspect term extraction Given a set of sentences with pre-identified entities (e.g., restaurants), identify the aspect terms present in the sentence and return a list containing all the distinct aspect terms. An aspect term names a particular aspect of the target entity. For example, "I liked the service and the staff, but not the food”, “The food was nothing much, but I loved the staff”. Multi-word aspect terms (e.g., “hard disk”) should be treated as single terms (e.g., in “The hard disk is very noisy” the only aspect term is “hard disk”).
 
 — Subtask 2: Aspect term polarity For a given set of aspect terms within a sentence, determine whether the polarity of each aspect term is positive, negative, neutral or conflict (i.e., both positive and negative). For example: “I loved their fajitas” → {fajitas: positive}
 “I hated their fajitas, but their salads were great” → {fajitas: negative, salads: positive}
 “The fajitas are their first plate” → {fajitas: neutral}
 “The fajitas were great to taste, but not to see” → {fajitas: conflict} 27 http://alt.qcri.org/semeval2014/task4/
  • 28. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) Aspect Based Sentiment Analysis — Subtask 3: Aspect category detection Given a predefined set of aspect categories (e.g., price, food), identify the aspect categories discussed in a given sentence. Aspect categories are typically coarser than the aspect terms of Subtask 1, and they do not necessarily occur as terms in the given sentence. For example, given the set of aspect categories {food, service, price, ambience, anecdotes/ miscellaneous}: “The restaurant was too expensive” → {price}
 “The restaurant was expensive, but the menu was great” → {price, food}
 
 — Subtask 4: Aspect category polarity Given a set of pre-identified aspect categories (e.g., {food, price}), determine the polarity (positive, negative, neutral or conflict) of each aspect category. For example: “The restaurant was too expensive” → {price: negative}
 “The restaurant was expensive, but the menu was great” → {price: negative, food: positive} 28 http://alt.qcri.org/semeval2014/task4/
  • 29. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 29 Hussam Hamdan1 Frédéric Léchet1 Patrice Lellot Hussam:hamdan_lsis:org1 Frederic:bechet_lif:univ-mrs:fr1 Patrice:bellot_lsis:org Vix-Marseille University1 Marseille France Twitter is a real-time1 highly social microblogging service that allows us to post short messages1The Sentiment Vnalysis of Twitter is useful for many domains )Marketing1Finance1 Social1 etc:::E1 Many approaches were proposed for this task1 we have applied several machine learning approaches in order to classify the Tweets using the dataset of SemEval DNj!: Many resources were used for feature extractionG WordNet )similar adjectives and verb groupsE1 RLpedia )the hidden conceptsE1 SentiWordNet )the polarity and subjectivityE1 and other Twitter specific features such as number of y1w1_1 etc: Highlights Results Naive Bayes Model Average F-measure of negative and positive classes has been improved by 45 wrt uni-gram model SVM Model Average F-measure of negative and positive classes has been improved by 1(55 wrt uni-gram model System Architecture Preprocessing Feature Extraction Classification Model Training Set: 6#56 Tweets Development SetG j48# Tweets Gas by my house hit §!:!5yyyy1 Ikm going to *hapel Hill on Sat: GE DBpedia WordNet Senti-Features xSentiWordNetC Twitter Specific Pos Neg Objective Conclusion Classification Twitter Dictionary Gas by my house hit §!:!5yyyy1 Ikm going to *hapel Hill on Sat: very happy Settlement connected1 blessed move1 displace1 sit sit_down w_1 wy1 ww polarity1 subjectivity wpos wneg Preprocessing Feature Extraction *lassification model - Using the similar adjectives from WordNet has a significant effect with Naive Layes but a little effect with SVM: - Using the hidden concepts is not so significant in this data set1 more significant for the objective class with SVM - Using Senti-features and Twitter specific features and verb groups were useful with SVM Experiments with DBpediaD WordNet and SentiWordNet as resources for Sentiment analysis in micro-blogging )Linear kernelE 2 PG Precision1 RG Recall1 FG F-measure 2 SemEval  2013
 • 30. P. Bellot (AMU-CNRS, LSIS-OpenEdition) Sentiment Analysis on Twitter : Using Z-Score • Z-Score helps to discriminate words for Document Classification, Authorship Attribution (J. Savoy, ACM TOIS 2013) 30
The Z_score of a term ti in a class Cj (noted tij) is computed from its term relative frequency tfrij in the class Cj, from the mean meani (the term probability over the whole corpus multiplied by nj, the number of terms in the class Cj) and from the standard deviation sdi of ti over the underlying corpus:
Z_score(tij) = (tfrij - meani) / sdi    Eq. (1)
Z_score(tij) = (tfrij - nj * P(ti)) / sqrt(nj * P(ti) * (1 - P(ti)))    Eq. (2)
A term whose frequency is salient in a class in comparison to the others has a salient Z_score. Z_score was exploited for sentiment analysis by (Zubaryeva and Savoy 2010): they chose a threshold (>2) to select the terms whose Z_score exceeds it, then combined these scores with a logistic regression. Here, Z_scores are used as additional features for classification because tweets are short, so many tweets contain no word with a salient Z_score.
Figures 1, 2 and 3 (Z_score distribution in the positive, neutral and negative classes) show that most terms have a Z_score between -1.5 and 2.5 in each class; the rest are either very frequent in the class (>2.5) or very rare (<-1.5), a negative value meaning that the term is less frequent in this class than in the other classes. Several threshold values were tested; the best results were obtained with a threshold of 3.
Table 1. The ten terms with the highest Z_score in each class:
positive: Love 14.31, Good 14.01, Happy 12.30, Great 11.10, Excite 10.35, Best 9.24, Thank 9.21, Hope 8.24, Cant 8.10, Wait 8.05
negative: Not 13.99, Fuck 12.97, Don't 10.97, Shit 8.99, Bad 8.40, Hate 8.29, Sad 8.28, Sorry 8.11, Cancel 7.53, stupid 6.83
neutral: Httpbit 6.44, Httpfb 4.56, Httpbnd 3.78, Intern 3.58, Nov 3.45, Httpdlvr 3.40, Open 3.30, Live 3.28, Cloud 3.28, begin 3.17
Other features: pre-polarity (POL) features count the positive, negative and neutral words of each tweet according to Bing Liu's Opinion Lexicon, created by (Hu and Liu 2004) and augmented in later work (positive and negative annotations only), and to the Subjectivity Lexicon (which also contains neutral annotations). Part-of-speech (POS) features annotate each word with its POS tag and count the adjectives, verbs, nouns, adverbs and connectors in each tweet.
4 Evaluation. 4.1 Data collection: the data sets of SemEval 2013 and 2014, subtask B of sentiment analysis in Twitter (Rosenthal, Ritter et al. 2014; Wilson, Kozareva et al. 2013). Among the 9,646 training tweets annotated as positive, negative or neutral, only 8,498 could be downloaded because of protected profiles and deleted tweets. The development set (1,654 tweets) was used to evaluate the methods, then combined with the training set to build the model applied to the 2013 and 2014 test sets.
4.2 Experiments. Official results: the submitted system obtained 46.38% and 52.02% on the 2013 and 2014 test sets respectively; these figures are not correct because of a software bug discovered after the submission deadline (an index-shifting error meant the test sets were represented by all the features except the terms), so the corrected figures are reported as non-official results. Non-official results (Multinomial Naive Bayes): starting from the term (unigram) features, adding Z_score features improves the f-measure by 6.5% and 10.9%, adding pre-polarity features improves it by 4% and 6%, while adding POS tags decreases it. All feature combinations were also tested (Table 2): POS tags are not useful in any experiment, the best result is obtained by combining Z_score and pre-polarity features, and Z_score features improve the f-measure more than pre-polarity features do.
Table 2. Average f-measures for the positive and negative classes of the SemEval 2013 and 2014 test sets:
Terms 49.42 / 46.31 ; Terms+Z 55.90 / 57.28 ; Terms+POS 43.45 / 41.14 ; Terms+POL 53.53 / 52.73 ; Terms+Z+POS 52.59 / 54.43 ; Terms+Z+POL 58.34 / 59.38 ; Terms+POS+POL 48.42 / 50.03 ; Terms+Z+POS+POL 55.35 / 58.58
All experiments were repeated after applying a Twitter dictionary that expands each tweet with the expressions corresponding to its emoticons and abbreviations. Table 3 shows that this dictionary improves the f-measure in every experiment; the best results are again obtained by combining Z_scores and pre-polarity features.
Table 3. Average f-measures for the positive and negative classes of the SemEval 2013 and 2014 test sets, after using the Twitter dictionary:
Terms 50.15 / 48.56 ; Terms+Z 57.17 / 58.37 ; Terms+POS 44.07 / 42.64 ; Terms+POL 54.72 / 54.53 ; Terms+Z+POS 53.20 / 56.47 ; Terms+Z+POL 59.66 / 61.07 ; Terms+POS+POL 48.97 / 51.90 ; Terms+Z+POS+POL 55.83 / 60.22
5 Conclusion. The paper tested the impact of the Twitter dictionary, sentiment lexicons, Z_score features and POS tags on the sentiment classification of tweets, extended the feature vector of tweets with all these features, proposed Z_score as a new type of feature and showed that it can improve performance. Future work: using Z_score in other ways, on other types of corpora and with other combination methods.
Best official SemEval 2013 results, for comparison (constrained / unconstrained average f-measures): NRC-Canada 69.02, GU-MLT-LT 65.27, teragram 64.86 / 64.86, BOUNCE 63.53, KLUE 63.06, AMI&ERIC 62.55 / 61.17, FBM 61.17, AVAYA 60.84 / 64.06, SAIL 60.14 / 61.03, UT-DB 59.87, FBK-irst 59.76; second results table: NRC-Canada 68.46, GU-MLT-LT 62.15, KLUE 62.03, AVAYA 60.00 / 59.47, teragram 59.10, NTNU 57.97 / 54.55, CodeX 56.70, FBK-irst 54.87, AMI&ERIC 53.63 / 52.62, ECNUCS 53.21 / 54.77, UT-DB 52.46.
[Hamdan, Béchet & Bellot, SemEval 2014]
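A minimal sketch of the Z_score computation of Eq. (2), written for illustration only (the variable names and the toy data are mine): each term's count in a class is compared to its expected count under the corpus-wide term probability.

```python
# Minimal sketch following Eq. (2) above: Z-score of each term in each class,
# computed from raw term counts per class (toy data, not the SemEval corpus).
import math
from collections import Counter
from typing import Dict, List

def zscores(class_tokens: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    per_class = {c: Counter(toks) for c, toks in class_tokens.items()}
    corpus = Counter()
    for counts in per_class.values():
        corpus.update(counts)
    total = sum(corpus.values())
    scores: Dict[str, Dict[str, float]] = {}
    for c, counts in per_class.items():
        n_j = sum(counts.values())            # number of terms in class C_j
        scores[c] = {}
        for term, tf in counts.items():       # tf: occurrences of the term in C_j
            p = corpus[term] / total          # P(t_i): term probability over the corpus
            mean = n_j * p
            sd = math.sqrt(n_j * p * (1 - p))
            scores[c][term] = (tf - mean) / sd if sd > 0 else 0.0
    return scores

print(zscores({"positive": "love love great sad".split(),
               "negative": "hate sad bad love".split()})["positive"]["love"])
```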
 • 31. P. Bellot (AMU-CNRS, LSIS-OpenEdition) Subjectivity lexicon : MPQA - The MPQA (Multi-Perspective Question Answering) Subjectivity Lexicon 31 http://mpqa.cs.pitt.edu Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.
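As an illustration, the MPQA subjectivity clues are usually distributed as a text file of key=value pairs; a small loader could look like the following (the file name and field names assume the standard distribution format, which is not spelled out on the slide):

```python
# Sketch of reading the MPQA subjectivity clues file, assuming the usual
# "key=value" line format of the distributed .tff file.
def load_mpqa(path: str) -> dict:
    lexicon = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = dict(kv.split("=", 1) for kv in line.split() if "=" in kv)
            if "word1" in fields:
                lexicon[fields["word1"]] = {
                    "type": fields.get("type"),             # strongsubj / weaksubj
                    "pos": fields.get("pos1"),
                    "prior_polarity": fields.get("priorpolarity"),
                }
    return lexicon

# lex = load_mpqa("subjclueslen1-HLTEMNLP05.tff")   # hypothetical file name
# print(lex["great"]["prior_polarity"])             # expected: "positive"
```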
 • 32. P. Bellot (AMU-CNRS, LSIS-OpenEdition) 32 http://wordnetweb.princeton.edu/perl/webwn http://www.cs.rochester.edu/research/cisd/wordnet (Screenshot: G. A. Miller, WordNet: A Lexical Database for English, Communications of the ACM, November 1995, Vol. 38, No. 11, p. 39)
  • 33. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 33 http://sentiwordnet.isti.cnr.it
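One convenient way to query SentiWordNet scores is through NLTK's corpus reader; this is an assumption about tooling on my part (the slide only points to sentiwordnet.isti.cnr.it), but the calls below are standard NLTK:

```python
# Querying SentiWordNet positivity/negativity/objectivity scores via NLTK.
import nltk
nltk.download("wordnet", quiet=True)
nltk.download("sentiwordnet", quiet=True)
from nltk.corpus import sentiwordnet as swn

for ss in swn.senti_synsets("great", "a"):
    # Each adjective sense of "great" carries three scores summing to 1.
    print(ss, ss.pos_score(), ss.neg_score(), ss.obj_score())
```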
 • 34. P. Bellot (AMU-CNRS, LSIS-OpenEdition) Aspect Based Sentiment Analysis — Dataset : 3K English sentences from restaurant reviews + 3K English sentences extracted from customer reviews of laptops, tagged by experienced human annotators — We proposed :
1. Aspect term extraction: a CRF model
2. Aspect term polarity detection: a Multinomial Naive Bayes classifier with features such as Z-score, POS tags and the prior polarity extracted from the Subjectivity Lexicon (Wilson, Wiebe et al. 2005) and Bing Liu's Opinion Lexicon (a sketch of this step follows below)
3. Category detection & category polarity detection: a Z-score model
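A rough sketch of step 2 of this list, under my own simplifying assumptions (hand-built lexicons and count features standing in for the full prior-polarity, Z-score and POS feature set; not the submitted system):

```python
# Aspect-term polarity with a Multinomial Naive Bayes classifier over simple
# count features; Z-score and POS counts would be appended the same way.
from sklearn.naive_bayes import MultinomialNB

POS_LEX = {"great", "good", "love", "loved"}   # placeholder for Bing Liu / MPQA entries
NEG_LEX = {"bad", "hate", "expensive"}

def features(sentence: str) -> list:
    toks = sentence.lower().split()
    return [sum(t in POS_LEX for t in toks),   # number of positive lexicon words
            sum(t in NEG_LEX for t in toks)]   # number of negative lexicon words

X = [features(s) for s in ["I loved their fajitas",
                           "The fajitas were bad",
                           "The fajitas are their first plate"]]
y = ["positive", "negative", "neutral"]

clf = MultinomialNB().fit(X, y)
print(clf.predict([features("their salads were great")]))
```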
Z_score features are used here as well: Z_score was exploited for sentiment analysis by (Zubaryeva and Savoy 2010), who chose a threshold (Z>2) to select the terms with a salient Z_score and combined the scores with a logistic regression; we instead use Z_score as additional features for the Multinomial Naive Bayes classifier (Eq. 1 and 2 above).
Subtask 4 (category polarity detection) uses the same Multinomial Naive Bayes classifier and features as subtask 2, plus the name of the category as an additional feature: for each sentence having n categories, n training examples are added that differ only in this category feature.
Experiments and evaluation: the system was tested on the training and testing data provided by the SemEval 2014 ABSA task. Two data sets were provided: the first contains 3K sentences of restaurant reviews annotated with the aspect terms, their polarities, their categories and the polarity of each category; the second contains 3K sentences of laptop reviews annotated only with the aspect terms and their polarities. The evaluation was done in two steps. The first step concerns subtasks 1 and 3: our system is 24% and 21% above the baseline for aspect term extraction in restaurant and laptop reviews respectively, and 3% above for category detection in restaurant reviews.
Table 1. Results of subtasks 1 and 3 for restaurant reviews and of subtask 1 for laptop reviews (P: precision, R: recall, F: f-measure):
Restaurant, subtask 1: Baseline P 0.52, R 0.42, F 0.47 / System P 0.81, R 0.63, F 0.71
Restaurant, subtask 3: Baseline P 0.73, R 0.59, F 0.65 / System P 0.77, R 0.60, F 0.68
Laptop, subtask 1: Baseline P 0.44, R 0.29, F 0.35 / System P 0.76, R 0.45, F 0.56
The second step involves subtasks 2 and 4. We were provided with (1) restaurant review sentences annotated with their aspect terms and categories, for which the polarity of each aspect term and category had to be determined, and (2) laptop review sentences annotated with aspect terms, for which the aspect term polarity had to be determined. Table 2 gives the results of our system and of the baseline (R: number of correctly retrieved examples, All: number of all true examples, A: accuracy).
Table 2. Results of subtasks 2 and 4 for restaurant reviews and of subtask 2 for laptop reviews:
Restaurant, subtask 2: Baseline R 673, All 1134, A 0.64 / System R 818, All 1134, A 0.72
Restaurant, subtask 4: Baseline R 673, All 1025, A 0.65 / System R 739, All 1025, A 0.72
Laptop, subtask 2: Baseline R 336, All 654, A 0.51 / System R 424, All 654, A 0.64
Our system is 8% and 13% above the baseline for aspect term polarity detection in restaurant and laptop reviews respectively, and 7% above for category polarity detection in restaurant reviews.
  • 35. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) IR and Digital Libraries ! Social Book Search 35
  • 36. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 36 INEX topics
 • 37. P. Bellot (AMU-CNRS, LSIS-OpenEdition) INEX 2014 Social Book Search Track — In 2014, the Social Book Search Track consists of two tasks: • Suggestion task: a system-oriented batch retrieval/recommendation task • Interactive task: a user-oriented interactive task where we want to gather user data on searching for different search tasks and different search interfaces. — 2.8 million book descriptions with metadata from Amazon and LibraryThing — 14 million reviews (1.5 million books have no review) — Amazon: formal metadata such as book title, author, publisher, publication year, library classification codes, Amazon categories and similar product information, as well as user-generated content in the form of user ratings and reviews — LibraryThing: user tags and user-provided metadata on awards, book characters, locations and blurbs 37 https://inex.mmci.uni-saarland.de/tracks/books/
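Purely as an illustration of the fields listed above, a book record from the collection could be held in a structure like this (field names are mine, not the collection's actual XML element names):

```python
# Illustrative container for the Amazon/LibraryThing metadata fields of a book.
from dataclasses import dataclass, field
from typing import List

@dataclass
class BookRecord:
    isbn: str
    title: str
    authors: List[str]
    publisher: str = ""
    year: int = 0
    classification_codes: List[str] = field(default_factory=list)  # library classification
    categories: List[str] = field(default_factory=list)            # Amazon categories
    similar_products: List[str] = field(default_factory=list)
    ratings: List[int] = field(default_factory=list)               # user ratings (1 to 5)
    reviews: List[str] = field(default_factory=list)               # Amazon user reviews
    tags: List[str] = field(default_factory=list)                  # LibraryThing tags

book = BookRecord(isbn="0000000000", title="Example", authors=["A. Author"])
print(book.title, len(book.reviews))
```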
  • 38. P.  Bellot  (AMU-­‐CNRS,  LSIS-­‐OpenEdition) 38
 • 39. P. Bellot (AMU-CNRS, LSIS-OpenEdition) 39
Table 1. Some facts about the Amazon collection: number of pages (i.e. books) 2,781,400 ; number of reviews 15,785,133 ; number of pages that contain at least one review 1,915,336.
3 Retrieval model. 3.1 Sequential Dependence Model. Like the previous year, we used a language modeling approach to retrieval [4] and Metzler and Croft's Markov Random Field (MRF) model [5] to integrate multiword phrases in the query. Specifically, we use the Sequential Dependence Model (SDM), a special case of the MRF, with three kinds of features: single term features (standard unigram language model features, fT), exact phrase features (words appearing in sequence, fO) and unordered window features (words required to be close together, but not necessarily in an exact sequence order, fU). Documents are ranked according to the following scoring function:
score_SDM(Q, D) = lambda_T * sum_{q in Q} fT(q, D) + lambda_O * sum_{i=1..|Q|-1} fO(q_i, q_{i+1}, D) + lambda_U * sum_{i=1..|Q|-1} fU(q_i, q_{i+1}, D)
where the feature weights are set according to the authors' recommendation (lambda_T = 0.85, lambda_O = 0.1, lambda_U = 0.05). fT, fO and fU are the log maximum likelihood estimates of the query terms in document D, computed over the target collection with Dirichlet smoothing.
Table 2. Official results of the Best Books for Social Search task of the INEX 2011 Book track, using judgements derived from the LibraryThing discussion groups (our runs are prefixed by p62). Columns: nDCG@10, P@10, MRR, MAP.
p4-inex2011SB.xml social.fb.10.50  0.3101  0.2071  0.4811  0.2283
p54-run4.all-topic-fields.reviews-split.combSUM  0.2991  0.1991  0.4731  0.1945
p4-inex2011SB.xml social  0.2913  0.1910  0.4661  0.2115
p4-inex2011SB.xml full.fb.10.50  0.2853  0.1858  0.4453  0.2051
p54-run2.all-topic-fields.all-doc-fields  0.2843  0.1910  0.4567  0.2035
p62.recommendation  0.2710  0.1900  0.4250  0.1770
p54-run3.title.reviews-split.combSUM  0.2643  0.1858  0.4195  0.1661
p62.sdm-reviews-combine  0.2618  0.1749  0.4361  0.1755
p62.baseline-sdm  0.2536  0.1697  0.3962  0.1815
p62.baseline-tags-browsenode  0.2534  0.1687  0.3877  0.1884
p4-inex2011SB.xml full  0.2523  0.1649  0.4062  0.1825
wiki-web-nyt-gw  0.2502  0.1673  0.4001  0.1857
p4-inex2011SB.xml amazon  0.2411  0.1536  0.3939  0.1722
p62.sdm-wiki  0.1953  0.1332  0.3017  0.1404
p62.sdm-wiki-anchors  0.1724  0.1199  0.2720  0.1253
p4-inex2011SB.xml lt  0.1592  0.1052  0.2695  0.1199
p18.UPF QE group BTT02  0.1531  0.0995  0.2478  0.1223
p18.UPF QE genregroup BTT02  0.1327  0.0934  0.2283  0.1001
p18.UPF QEGr BTT02 RM  0.1291  0.0872  0.2183  0.0973
p18.UPF base BTT02  0.1281  0.0863  0.2135  0.1018
p18.UPF QE genre BTT02  0.1214  0.0844  0.2089  0.0910
p18.UPF base BT02  0.1202  0.0796  0.2039  0.1048
p54-run1.title.all-doc-fields  0.1129  0.0801  0.1982  0.0868
Table 3. Top runs of the Best Books for Social Search task of the INEX 2011 Book track, using judgements obtained by crowdsourcing (Amazon Mechanical Turk); our runs are prefixed by p62. Columns: nDCG@10, P@10, MRR, MAP.
p62.baseline-sdm  0.6092  0.5875  0.7794  0.3896
p4-inex2011SB.xml amazon  0.6055  0.5792  0.7940  0.3500
p62.baseline-tags-browsenode  0.6012  0.5708  0.7779  0.3996
p4-inex2011SB.xml full  0.6011  0.5708  0.7798  0.3818
p4-inex2011SB.xml full.fb.10.50  0.5929  0.5500  0.8075  0.3898
p62.sdm-reviews-combine  0.5654  0.5208  0.7584  0.2781
p4-inex2011SB.xml social  0.5464  0.5167  0.7031  0.3486
p4-inex2011SB.xml social.fb.10.50  0.5425  0.5042  0.7210  0.3261
p54-run2.all-topic-fields.all-doc-fields  0.5415  0.4625  0.8535  0.3223
Sequential Dependence Model (SDM) - Markov Random Field (Metzler & Croft, 2005)
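A toy sketch of the SDM scoring function above, under my own simplifications (tiny in-memory index, crude Dirichlet-smoothed estimates, approximate collection statistics for phrases); the actual runs were produced with a full retrieval system, not this code:

```python
# Toy SDM scoring over an in-memory collection (illustration only).
import math
from collections import Counter

LAMBDA_T, LAMBDA_O, LAMBDA_U = 0.85, 0.10, 0.05   # weights recommended by Metzler & Croft
MU = 2500                                          # Dirichlet smoothing parameter
WINDOW = 8                                         # unordered window size

docs = {"d1": "the social book search track of inex".split(),
        "d2": "book reviews and social tags from librarything".split()}
collection = [t for toks in docs.values() for t in toks]
coll_tf, coll_len = Counter(collection), len(collection)

def dirichlet(count: int, doc_len: int, term_cf: float) -> float:
    # log of a Dirichlet-smoothed maximum likelihood estimate
    return math.log((count + MU * (term_cf + 0.5) / coll_len) / (doc_len + MU))

def count_ordered(toks, a, b):
    return sum(1 for i in range(len(toks) - 1) if toks[i] == a and toks[i + 1] == b)

def count_window(toks, a, b, w=WINDOW):
    pos_a = [i for i, t in enumerate(toks) if t == a]
    pos_b = [i for i, t in enumerate(toks) if t == b]
    return sum(1 for i in pos_a for j in pos_b if i != j and abs(i - j) < w)

def score_sdm(query: str, doc_id: str) -> float:
    q, toks = query.split(), docs[doc_id]
    s = LAMBDA_T * sum(dirichlet(toks.count(t), len(toks), coll_tf[t]) for t in q)
    for a, b in zip(q, q[1:]):
        # 1.0 is a crude stand-in for the phrase's collection frequency
        s += LAMBDA_O * dirichlet(count_ordered(toks, a, b), len(toks), 1.0)
        s += LAMBDA_U * dirichlet(count_window(toks, a, b), len(toks), 1.0)
    return s

print(sorted(docs, key=lambda d: score_sdm("social book search", d), reverse=True))
```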
3.2 External resources combination. As previously done last year, we exploited external resources in a Pseudo-Relevance Feedback (PRF) fashion to expand the query with informative terms. Given a resource R, we form a subset RQ of informative documents for the initial query Q using pseudo-relevance feedback: we first rank the documents of R with the SDM ranking function, then compute an entropy measure H_RQ(t) for each term t over RQ in order to weigh the terms according to their relative informativeness:
H_RQ(t) = - sum_{w in t} p(w|RQ) * log p(w|RQ)
These external weighted terms are finally used to expand the original query.
We use our SDM baseline defined in section 3.1 and incorporate the recommendation estimate described below:
score_recomm(Q, D) = lambda_D * score_SDM(Q, D) + (1 - lambda_D) * t_D
where the lambda_D parameter was set based on observations over the test topics made available to participants for training: t_D had no influence on the ranking of documents after the hundredth result (average estimation), so the smoothing parameter is fixed from the SDM scores themselves such that t_D only affects the documents ranked above the hundredth result. In practice, this approach re-ranks the results of the SDM retrieval model according to the popularity and the likability of the different books.
Wikipedia sub-articles are also exploited: children node pages (or sub-articles) are weighted half that of their parents in order to minimize a potential topic drift, and loops in the graph are avoided (a child node cannot be linked to one of its elders), because they bring no additional information and could change the weights between linked articles. Informative words are then extracted from the sub-articles and incorporated into our retrieval model like another external resource.
3.4 Social opinion for book search. The test collection used this year for the Book Track contains Amazon pages of books. These pages are composed, among others, of editorial information (such as the number of pages or the blurb), user ratings and user reviews. However, contrary to previous years, the actual content of the books is not available. Hence, the task is to rank books according to this sparse informative content and to the opinion of readers expressed in the reviews, the user ratings being integers between 1 and 5. We wanted to model two social popularity assumptions: a product that has a lot of reviews must be relevant (or at least popular), and a highly rated product must be relevant; therefore a product having a large number of good reviews really must be relevant. However, the collection often contains only a small number of ratings for a given book, and the challenge is to determine whether each user rating is significant. To do so, we first define X_R^D, a random set of "bad" ratings (1, 2 or 3 out of 5) for book D. Then we evaluate the statistically significant differences between X_R^D and X_R^D ∪ X_U^D using Welch's t-test, where X_U^D is the actual set of user ratings for book D:
t_D = (mean(X_R^D ∪ X_U^D) - mean(X_U^D)) / s
with s = sqrt(s_RU^2 / n_RU + s_U^2 / n_U)
where s^2 is the unbiased estimator of the variance of the two sets and n_X is the number of ratings in set X. The underlying assumption is that significant differences occur in two situations. First, when there is a small number of user ratings but they are all very good (for example, good but little-known books).
Second, when there is a very large number of user ratings but they are average. Hence this statistical test gives us a single estimate of both likability and popularity. (Slide note: a statistical test between the observed ratings and random ratings. Is a rating significant? ANR CAAS project.)
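A hedged sketch of the t_D estimate described above (the sampling of random "bad" ratings and the sample size are my own choices for illustration):

```python
# Welch's t-test estimate t_D: compare random "bad" ratings (1 to 3) against
# the union of those ratings and the observed user ratings of a book.
import math
import random

def t_d(user_ratings: list, n_random: int = 30, seed: int = 0) -> float:
    random.seed(seed)
    x_r = [random.choice((1, 2, 3)) for _ in range(n_random)]   # X_R^D: random bad ratings
    x_ru = x_r + user_ratings                                   # X_R^D union X_U^D
    def mean(xs): return sum(xs) / len(xs)
    def var(xs):                                                # unbiased variance estimator
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    s = math.sqrt(var(x_ru) / len(x_ru) + var(user_ratings) / len(user_ratings))
    return (mean(x_ru) - mean(user_ratings)) / s if s else 0.0

# A few very good ratings vs. many average ratings: both yield a salient |t_D|,
# which is the single popularity/likability estimate used for re-ranking.
print(t_d([5, 5, 5]), t_d([3, 4, 3, 4, 3] * 20))
```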
 • 40. P. Bellot (AMU-CNRS, LSIS-OpenEdition) Query Expansion : with Concepts from DBpedia 40
 • 41. P. Bellot (AMU-CNRS, LSIS-OpenEdition) Terms only vs. Extended Features — We modeled book likeliness based on the following idea: the more reviews a book has, the more interesting it is (it may not be a good or popular book, but it is a book with a high impact) — The InL2 information retrieval model alone (a DFR-based model, Divergence From Randomness) seems to perform better than SDM (language modeling) with extended features 41 Benkoussas, Hamdan, Albitar, Ollagnier & Bellot, 2014
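A minimal sketch of the likeliness idea stated above, assuming a simple linear combination of a retrieval score with a damped review count (the combination function and the weight alpha are mine, for illustration only):

```python
# Re-rank books by mixing a retrieval score with a review-count "likeliness" signal.
import math

def rerank(results: dict, review_counts: dict, alpha: float = 0.8) -> list:
    # results: {book_id: retrieval score}, review_counts: {book_id: number of reviews}
    def combined(book_id: str) -> float:
        likeliness = math.log1p(review_counts.get(book_id, 0))   # damped review count
        return alpha * results[book_id] + (1 - alpha) * likeliness
    return sorted(results, key=combined, reverse=True)

scores = {"book_a": 2.1, "book_b": 2.0, "book_c": 1.2}
reviews = {"book_a": 3, "book_b": 250, "book_c": 40}
print(rerank(scores, reviews))
```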