Opinion-Based Entity Ranking
Ganesan & Zhai 2012, Information Retrieval, Vol. 15, No. 2
Kavita Ganesan (www.kavita-ganesan.com)
University of Illinois at Urbana-Champaign
   Currently: there is no easy or direct way of finding
    entities (e.g. products, people, businesses)
    based on online opinions

   You need to read opinions about different
    entities to find the ones that fulfill your personal
    criteria
     e.g. finding mp3 players with ‘good sound quality’

   This is a time-consuming process that impairs
    user productivity!
   Use existing opinions to rank entities based on
    a set of unstructured user preferences

   Example of user preferences:
     Finding a hotel: “clean rooms, heated pools”
     Finding a restaurant: “authentic food, good ambience”
   Most obvious way: use results of existing
    opinion mining methods
     Find sentiment ratings on various aspects
      ▪ For example, for an mp3 player: find ratings for screen, sound,
        battery life aspects
      ▪ Then, rank entities based on these discovered aspect ratings
     Problem: this is not practical!
      ▪ Costly – it is costly to mine large amounts of textual content
      ▪ Prior knowledge – you need to know the set of queryable
        aspects in advance, so you may have to define aspects for
        each domain either manually or through text mining
      ▪ Supervision – most existing methods rely on some form of
        supervision, such as the presence of overall user ratings; such
        information may not always be available
   Leverage Existing Text Retrieval Models
   Why?
     Retrieval models can scale up to large amounts of
      textual content
     The models themselves can be tweaked or
      redefined
     This does not require costly information extraction
      or text mining
Leveraging robust text retrieval models

[Figure: each entity’s reviews are concatenated and indexed as one document. A retrieval model (BM25, LM, PL2) performs keyword matching between the user preferences (the query) and the textual reviews, producing a ranked list of entities.]
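The setup above (concatenate each entity's reviews into one "document", then rank with a standard retrieval model) can be sketched as follows. This is a toy in-memory BM25 scorer for illustration, not the paper's actual system; tokenization is plain whitespace splitting.

```python
import math
from collections import Counter

def bm25_rank(entity_reviews, query, k1=1.2, b=0.75):
    """Rank entities by the BM25 score of the query against each
    entity's concatenated reviews (toy whitespace tokenization)."""
    docs = {e: text.lower().split() for e, text in entity_reviews.items()}
    n = len(docs)
    avgdl = sum(len(d) for d in docs.values()) / n
    terms = query.lower().split()
    # document frequency of each query term over the entity "documents"
    df = {t: sum(1 for d in docs.values() if t in d) for t in terms}
    scores = {}
    for e, d in docs.items():
        tf = Counter(d)
        score = 0.0
        for t in terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores[e] = score
    return sorted(scores, key=scores.get, reverse=True)
```

In a real deployment the reviews would sit in an inverted index (Lucene, Terrier, etc.) rather than in memory, but the entity-as-document framing is the same.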
   Based on this basic setup, the ranking problem seems
    similar to a regular document retrieval problem
   However, there are important differences:
1. The query is meant to express a user's preferences in keywords
    The query is expected to be longer than regular keyword queries
    The query may contain sub-queries expressing preferences for different
     aspects
    It may actually be beneficial to model these semantic aspects

2. Ranking should capture how well an entity satisfies a user's
   preferences
    Not the relevance of a document to a query (as in regular retrieval)
    Matching opinion/sentiment words is important in
     this case
   Investigate use of text retrieval models for the
    task of Opinion-Based Entity Ranking

   Explore some extensions over IR models

   Propose evaluation method for the ranking task

   User Study
     To determine if results make sense to users
     Validate effectiveness of evaluation method
   In standard text retrieval we cannot distinguish
    the multiple preferences in a query.
    For example: “clean rooms, cheap, good service”
     Would be treated as a long keyword query even
      though there are 3 preferences in the query
     The problem: an entity may then score highly
      by matching just one aspect extremely well

   To improve this:
     We try to score each preference separately and then
      combine the results
Aspect Queries

[Figure: the query “clean rooms, cheap, good service” is split into three aspect queries – “clean rooms”, “cheap”, “good service”. Each aspect query is scored separately by the retrieval model, and the three result sets are then combined into the final results.]
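This aspect-query scoring can be sketched as below. Assumptions for illustration: preferences are comma-separated, the per-aspect scores are combined by simple averaging (other combination schemes are possible), and the toy overlap scorer stands in for a real retrieval model such as BM25, LM, or PL2.

```python
def overlap_scorer(entities, query):
    """Toy scorer: count distinct query terms appearing in the
    entity's review text. Stands in for BM25/LM/PL2."""
    q = set(query.lower().split())
    return {e: len(q & set(text.lower().split()))
            for e, text in entities.items()}

def qam_score(score_fn, entities, query):
    """Query Aspect Modeling sketch: split the preference query on
    commas into aspect queries, score each aspect query separately,
    then average the per-aspect scores per entity."""
    aspects = [a.strip() for a in query.split(",") if a.strip()]
    combined = {e: 0.0 for e in entities}
    for aspect in aspects:
        for e, s in score_fn(entities, aspect).items():
            combined[e] += s / len(aspects)
    return combined
```

Averaging rewards entities that satisfy every preference moderately over entities that satisfy a single preference extremely well.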
   In standard retrieval models the matching of
    an opinion word & a standard topic word is
    not distinguished

   However, with Opinion-Based Entity Ranking:
     It is important to match opinion words in the
      query, but opinion words tend to have more
      variation than topic words
     Solution: Expand a query with similar opinion
      words to help emphasize the matching of opinions
[Figure: the query “fantastic battery life” only matches reviews that literally say “fantastic”, even though “good battery life”, “great battery life”, and “excellent battery life” carry a similar meaning. Adding synonyms of “fantastic” yields the expanded query “fantastic, good, great, excellent… battery life”, which also matches those review documents.]
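The expansion step can be sketched as below. The synonym map here is a hypothetical hand-built example; the actual resource the paper draws similar opinion words from may differ.

```python
# Hypothetical hand-built synonym map, for illustration only.
OPINION_SYNONYMS = {
    "fantastic": ["good", "great", "excellent"],
    "cheap": ["inexpensive", "affordable"],
}

def expand_query(query):
    """Append known synonyms of each opinion word in the query,
    emphasizing the matching of opinions during retrieval."""
    terms = query.lower().split()
    expanded = list(terms)
    for t in terms:
        for syn in OPINION_SYNONYMS.get(t, []):
            if syn not in expanded:
                expanded.append(syn)
    return " ".join(expanded)
```

Topic words ("battery life") pass through unchanged; only opinion words are expanded.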
   Document Collection

   Gold Standard: Relevance Judgement

   User Queries

   Evaluation Measure
   Document Collection:
     Reviews of Hotels – Tripadvisor
     Reviews of Cars – Edmunds



[Figure: each review consists of free-text comments plus numerical aspect ratings; the numerical ratings are used to build the gold standard.]
   Gold Standard:
     Needed to assess the performance of the ranking task


   For each entity & for each aspect (in dataset):
     Average numerical ratings across reviews. This will
      give the judgment score for each aspect
     Assumption:
      Since the numerical ratings were given by users,
      this would be a good approximation to actual
      human judgment
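The gold-standard construction described above (average the numerical ratings per entity and aspect across reviews) might look like this; the input format is an assumption for illustration.

```python
from collections import defaultdict

def gold_standard(reviews):
    """Judgment scores: for each entity and each aspect, average the
    numerical ratings across that entity's reviews.
    `reviews` is a list of (entity, {aspect: rating}) pairs."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for entity, ratings in reviews:
        for aspect, r in ratings.items():
            sums[entity][aspect] += r
            counts[entity][aspect] += 1
    return {e: {a: sums[e][a] / counts[e][a] for a in sums[e]}
            for e in sums}
```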
   Gold Standard:
    Ex. User looking for cars with “good performance”
     Ideally, the system should return cars with
      ▪ High numerical ratings on performance aspect
      ▪ Otherwise, we can say that the system is not doing well in
        ranking
[Figure: example result list – the returned cars should have high ratings on the performance aspect.]
   User Queries
     Semi-synthetic queries
     Not able to obtain natural sample of queries

     Ask users to specify preferences on different aspects
      of car & hotel based on aspects available in dataset
      ▪ Seed queries
      ▪ Ex. Fuel: “good gas mileage”, “great mpg”

     Randomly combine seed queries from different
      aspects → forms synthetic queries
      ▪ Ex. Query 1: “great mpg, reliable car”
      ▪ Ex. Query 2: “comfortable, good performance”
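This semi-synthetic query generation can be sketched as follows; the seed phrases and the number of aspects per query are illustrative assumptions.

```python
import random

def make_queries(seeds, n_aspects=2, n_queries=5, rng=None):
    """Randomly combine user-provided seed phrases from different
    aspects into comma-separated synthetic queries.
    `seeds` maps an aspect name to a list of seed phrases."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    queries = []
    for _ in range(n_queries):
        aspects = rng.sample(sorted(seeds), n_aspects)
        queries.append(", ".join(rng.choice(seeds[a]) for a in aspects))
    return queries
```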
   Evaluation Measure: nDCG
     This measure is ideal because it supports multiple
      levels of relevance
     The numerical ratings used as judgment scores span a
      range of values, and nDCG handles such graded
      judgments directly
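For reference, a minimal nDCG computation over graded gains (here, the judgment scores of the entities in the system's ranked order), normalized by the ideal ordering:

```python
import math

def ndcg(ranked_gains, k=None):
    """nDCG sketch: `ranked_gains` are graded judgment scores in
    system rank order; normalize DCG by the ideal (sorted) DCG."""
    k = k or len(ranked_gains)
    def dcg(gains):
        # log2(i + 2) because ranks are 1-based and log2(1) = 0
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0
```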
   Users were asked to manually determine the relevance
    of system-generated rankings to a set of queries

Two reasons for user study:
 Validate that results made sense to real users
     On average, users thought that the entities retrieved by the
      system were a reasonable match to the queries

   Validate effectiveness of gold standard rankings
     Gold standard ranking has relatively strong agreement
      with user rankings. This means the gold standard based on
      numerical ratings is a good approximation to human
      judgment
[Figure: bar charts of percentage improvement in ranking for Hotels (y-axis to 8.0%) and Cars (y-axis to 2.5%) across PL2, LM, and BM25. Bars show the improvement using QAM and using QAM + OpinExp; both were most effective on BM25 (p23).]
   Lightweight approach to ranking entities based
    on opinions
     Use existing text retrieval models

   Explored some enhancements over retrieval
    models
     Namely opinion expansion & query aspect modeling
     Both showed some improvement in ranking

   Proposed evaluation method using user ratings
     User study shows that the evaluation method is sound
     This method can be used for future evaluation tasks


Editor's notes

  1. So this long keyword query will be split into 3 separate queries, each called an aspect query. These aspect queries are scored separately and the results are then combined.
  2. -
  3. For each entity, average the numerical ratings of each aspect. Assumption: this would be a good approximation to human judgment.
  4. Otherwise, this tells you that the system is not really doing well in ranking.
  5. We could not obtain natural queries, so we used semi-synthetic queries: we randomly combined seed queries to form a set of queries.
  6. Finally, we conducted a user study where users were asked to manually determine the relevance of the system-generated results to each query. This validates that the results made sense to real users, and also validates the effectiveness of the gold standard rankings. We found relatively strong agreement, which means this evaluation method can be safely used for similar ranking tasks.