IIR 2016, VENEZIA, ITALY
COMPUTING NEIGHBOURHOODS WITH LANGUAGE
MODELS IN A COLLABORATIVE FILTERING SCENARIO
Daniel Valcarce, Javier Parapar, Álvaro Barreiro
@dvalcarce @jparapar @AlvaroBarreiroG
Information Retrieval Lab
@IRLab_UDC
University of A Coruña
Spain
Outline
1. Introduction to Recommender Systems
2. Neighbourhood-based Methods
3. Computing Neighbourhoods
4. Language Models for Neighbourhoods
5. Experiments
6. Conclusions and Future Directions
INTRODUCTION TO RECOMMENDER SYSTEMS
Recommender Systems
Recommender systems provide personalised suggestions for
items that may be of interest to the users.
Top-N Recommendation: create a ranking of the N most
relevant items for each user.
Different approaches:
Content-based: exploit item description to recommend
items similar to those the target user liked in the past.
Collaborative filtering: rely on the user feedback such as
ratings or clicks to generate recommendations.
Hybrid: combination of content-based and collaborative
filtering approaches.
Collaborative Filtering
Collaborative Filtering (CF) exploits feedback from users:
Explicit: ratings or reviews.
Implicit: clicks or purchases.
Two main families of CF methods:
Model-based: learn a model from the data and use it for
recommendation.
Neighbourhood-based (or memory-based): compute
recommendations directly from part of the ratings.
NEIGHBOURHOOD-BASED METHODS
Neighbourhood-based Methods
Two perspectives:
User-based: recommend items liked by users who share
interests with you.
Item-based: recommend items similar to those you liked.
Similarity between items is computed using common users
among items (not the content!).
Weighted Sum Recommender (WSR)
Very simple but effective approach (Valcarce et al., ECIR 2016).
WSR computes a weighted sum of the ratings in the
neighbourhood. Weights are calculated using cosine similarity.
Item-based version (WSR-IB):
\[ \hat{r}_{u,i} = \sum_{j \in J_i} \mathrm{cosine}(i, j) \, r_{u,j} \tag{1} \]
User-based version (WSR-UB):
\[ \hat{r}_{u,i} = \sum_{v \in V_u} \mathrm{cosine}(u, v) \, r_{v,i} \tag{2} \]
The computation of neighbourhoods is crucial!
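A minimal sketch of the user-based weighted sum in equation (2), assuming a small dense NumPy rating matrix where 0 means "unrated". The function names, the toy matrix and the choice of k are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cosine_similarity_matrix(R):
    """Pairwise cosine similarity between the rows of R (users x items)."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # users without ratings end up with zero similarity
    unit = R / norms
    return unit @ unit.T

def wsr_ub_scores(R, k):
    """Score every item for every user with a user-based weighted sum."""
    sim = cosine_similarity_matrix(R)
    np.fill_diagonal(sim, 0.0)  # a user is not their own neighbour
    scores = np.zeros_like(R, dtype=float)
    for u in range(R.shape[0]):
        neighbours = np.argsort(-sim[u])[:k]            # V_u: the k most similar users
        scores[u] = sim[u, neighbours] @ R[neighbours]  # sum over v of cosine(u, v) * r_{v,i}
    return scores

# Toy rating matrix (hypothetical data): rows are users, columns are items.
R = np.array([
    [5, 4, 0, 0],
    [4, 5, 3, 0],
    [0, 0, 4, 5],
], dtype=float)

scores = wsr_ub_scores(R, k=1)
```

For a top-N list, a real system would typically drop the items each user has already rated before sorting the scores. The item-based version in equation (1) follows the same pattern on the transposed matrix.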
COMPUTING NEIGHBOURHOODS
Computing Neighbourhoods with k-NN algorithm
The effectiveness of neighbourhood-based methods relies
largely on how neighbours are computed.
The most common approach is to compute the k nearest
neighbours (k-NN algorithm) using a pairwise similarity.
The most common similarities are Pearson’s correlation
coefficient or cosine similarity.
Cosine provides important improvements over Pearson’s
correlation coefficient (Cremonesi et al., RecSys 2010).
Let’s study cosine similarity from the perspective of
Information Retrieval.
Cosine Similarity and the Vector Space Model
Recommendation   Information Retrieval
--------------   ---------------------
Target user      Query
Rest of users    Documents
Items            Terms
Under this scheme, using cosine similarity to find
neighbours is equivalent to searching in the Vector Space Model.
If we swap users and items, we can derive an analogous
item-based approach.
We can use sophisticated search techniques for finding
neighbours!
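The analogy can be sketched with a classic search structure: an inverted index that maps each item (term) to the users (documents) who rated it. The target user's profile acts as the query, and only users sharing at least one item are ever scored. The user names, ratings and function names below are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def build_inverted_index(profiles):
    """Map each item (term) to the (user, rating) pairs of users who rated it."""
    index = defaultdict(list)
    for user, ratings in profiles.items():
        for item, rating in ratings.items():
            index[item].append((user, rating))
    return index

def norm(ratings):
    """Euclidean norm of a rating profile."""
    return sqrt(sum(r * r for r in ratings.values()))

def cosine_neighbours(target, profiles, index, k):
    """Return the k users most similar to `target` by cosine similarity."""
    dot = defaultdict(float)
    # Term-at-a-time traversal: accumulate dot products through the posting lists.
    for item, r_target in profiles[target].items():
        for user, r_user in index[item]:
            if user != target:
                dot[user] += r_target * r_user
    target_norm = norm(profiles[target])
    sims = {u: d / (target_norm * norm(profiles[u])) for u, d in dot.items()}
    return sorted(sims.items(), key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical rating profiles: user -> {item: rating}.
profiles = {
    "ana": {"i1": 5, "i2": 4},
    "ben": {"i1": 4, "i2": 5, "i3": 3},
    "eva": {"i3": 4, "i4": 5},
}
index = build_inverted_index(profiles)
neighbours = cosine_neighbours("ana", profiles, index, k=2)
```

Here "eva" shares no item with "ana", so she is never touched during scoring and does not appear in the result.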
LANGUAGE MODELS FOR NEIGHBOURHOODS
Language Models
Statistical language models are a state-of-the-art framework for
document retrieval.
Documents are ranked according to their posterior probability
given the query:
\[ p(d|q) = \frac{p(q|d) \, p(d)}{p(q)} \overset{rank}{=} p(q|d) \, p(d) \]
The query likelihood, p(q|d), is based on a unigram model:
\[ p(q|d) = \prod_{t \in q} p(t|d)^{c(t,d)} \]
The document prior, p(d), is usually considered uniform.
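A worked step that may help here: with a uniform prior, p(d) is the same for every document and drops out of the ranking. Taking logarithms of the unigram product then gives a sum that preserves the ranking and avoids numerical underflow when many small probabilities are multiplied. This assumes p(t|d) > 0 for every term involved.

\[ \log p(q|d) = \sum_{t \in q} c(t,d) \, \log p(t|d) \]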
Language Models for Finding Neighbourhoods (I)
Information Retrieval:
\[ p(d|q) \overset{rank}{=} p(d) \prod_{t \in q} p(t|d)^{c(t,d)} \]
User-based collaborative filtering:
\[ p(v|u) \overset{rank}{=} p(v) \prod_{i \in I_u} p(i|v)^{r_{v,i}} \]
Item-based collaborative filtering:
\[ p(j|i) \overset{rank}{=} p(j) \prod_{u \in U_i} p(u|j)^{r_{u,j}} \]
Language Models for Finding Neighbourhoods (II)
User-based collaborative filtering:
\[ p(v|u) \overset{rank}{=} p(v) \prod_{i \in I_u} p(i|v)^{r_{v,i}} \]
We assume a multinomial distribution over the count of ratings.
The maximum likelihood estimate (MLE) is:
\[ p_{mle}(i|v) = \frac{r_{v,i}}{\sum_{j \in I_v} r_{v,j}} \]
However, it suffers from sparsity: items that v has not rated get zero probability. We need smoothing!
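A small illustration of the sparsity problem, assuming a hypothetical neighbour profile stored as a dictionary. Any item outside the profile receives probability zero under the MLE, so its logarithm is undefined.

```python
def p_mle(item, profile):
    """Maximum likelihood estimate: r_{v,i} divided by the sum of v's ratings."""
    return profile.get(item, 0.0) / sum(profile.values())

# Hypothetical ratings of a candidate neighbour v.
v = {"i1": 4, "i2": 5, "i3": 3}

rated = p_mle("i1", v)    # 4 / 12
unrated = p_mle("i4", v)  # 0.0, because v never rated i4
```

With typical rating data, most items are unrated by any given user, so most estimates are exactly zero.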
Smoothing Methods for Language Models
Absolute Discounting (AD)
\[ p_{\delta}(i|u) = \frac{\max(r_{u,i} - \delta, 0) + \delta \, |I_u| \, p(i|C)}{\sum_{j \in I_u} r_{u,j}} \]
Jelinek-Mercer (JM)
\[ p_{\lambda}(i|u) = (1 - \lambda) \frac{r_{u,i}}{\sum_{j \in I_u} r_{u,j}} + \lambda \, p(i|C) \]
Dirichlet Priors (DP)
\[ p_{\mu}(i|u) = \frac{r_{u,i} + \mu \, p(i|C)}{\mu + \sum_{j \in I_u} r_{u,j}} \]
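The three estimators can be sketched directly from the formulas. The slides do not define the collection model p(i|C); the sketch assumes it is the item's share of all rating mass in the collection. Profiles, function names and parameter values are hypothetical.

```python
def collection_model(profiles):
    """Assumed p(i|C): share of the total rating mass that belongs to each item."""
    mass = {}
    for ratings in profiles.values():
        for item, r in ratings.items():
            mass[item] = mass.get(item, 0.0) + r
    total = sum(mass.values())
    return {item: m / total for item, m in mass.items()}

def absolute_discounting(item, profile, p_c, delta):
    """Subtract delta from each rating and redistribute the mass via p(i|C)."""
    r = profile.get(item, 0.0)
    return (max(r - delta, 0.0) + delta * len(profile) * p_c[item]) / sum(profile.values())

def jelinek_mercer(item, profile, p_c, lam):
    """Linear interpolation between the MLE and the collection model."""
    r = profile.get(item, 0.0)
    return (1.0 - lam) * r / sum(profile.values()) + lam * p_c[item]

def dirichlet_priors(item, profile, p_c, mu):
    """Add mu pseudo-ratings distributed according to the collection model."""
    r = profile.get(item, 0.0)
    return (r + mu * p_c[item]) / (mu + sum(profile.values()))

# Hypothetical rating profiles: user -> {item: rating}.
profiles = {
    "ana": {"i1": 5, "i2": 4},
    "ben": {"i1": 4, "i2": 5, "i3": 3},
    "eva": {"i3": 4, "i4": 5},
}
p_c = collection_model(profiles)
smoothed = dirichlet_priors("i4", profiles["ana"], p_c, mu=100.0)
```

Unlike the MLE, every estimator gives the unrated item "i4" a positive probability for "ana". Absolute discounting remains a proper distribution only while delta does not exceed the user's smallest rating.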
EXPERIMENTS
Experimental settings
Baselines:
Pearson’s correlation coefficient
RM1Sim: user-based similarity (Bellogín et al., RecSys ’13)
Cosine similarity
Our similarities are Language Models using:
Absolute Discounting smoothing
Jelinek-Mercer smoothing
Dirichlet Priors smoothing
Parameter Sensitivity of WSR-UB on MovieLens 100k
[Figure: nDCG@10 (y-axis) of WSR-UB on MovieLens 100k against the smoothing parameter (x-axes: µ from 0 to 10k; λ, δ from 0 to 1). Series: Pearson, Cosine, RM1Sim (λ), LM-Absolute Discounting (δ), LM-Jelinek-Mercer (λ), LM-Dirichlet Priors (µ).]
Parameter Sensitivity of WSR-IB on R3-Yahoo!
[Figure: nDCG@10 (y-axis) of WSR-IB on R3-Yahoo! against the smoothing parameter (x-axes: µ from 10^0 to 10^6; λ, δ from 0 to 1). Series: Pearson, Cosine, LM-Absolute Discounting (δ), LM-Jelinek-Mercer (λ), LM-Dirichlet Priors (µ).]
Precision (nDCG@10)
Algorithm     ML 100k      ML 1M         R3-Yahoo!    LibraryThing
NNCosNgbr     0.1427       0.1042        0.0138       0.0550
PureSVD       0.3595^a     0.3499^ac     0.0198^a     0.2245^a
Cosine-WSR    0.3899^ab    0.3430^a      0.0274^ab    0.2476^ab
LM-DP-WSR     0.4017^abc   0.3585^abc    0.0271^ab    0.2464^ab
LM-JM-WSR     0.4013^abc   0.3622^abcd   0.0276^ab    0.2537^abcd
Table: Values of precision in terms of normalised discounted
cumulative gain at 10. Statistical significance is superscripted
(Wilcoxon two-sided p < 0.01). Pink = best algorithm. Blue = not
significantly different to the best.
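For reference, a minimal sketch of nDCG at a cutoff. The slides do not state which gain function was used; the sketch assumes linear gains and a log2 rank discount. The item identifiers and relevance values are hypothetical.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), rank starting at 1."""
    return sum(g / math.log2(position + 2) for position, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_items, relevance, k=10):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [relevance.get(item, 0.0) for item in ranked_items]
    ideal = sorted(relevance.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0

# Hypothetical held-out ratings of one user.
relevance = {"a": 3, "b": 2, "c": 1}
perfect = ndcg_at_k(["a", "b", "c"], relevance)
reversed_order = ndcg_at_k(["c", "b", "a"], relevance)
```

The reported figure would then be the average of this value over all test users.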
Diversity (Gini@10)
Algorithm     ML 100k   ML 1M    R3-Yahoo!   LibraryThing
Cosine-WSR    0.0549    0.0400   0.0902      0.1025
LM-DP-WSR     0.0659    0.0435   0.1557      0.1356
LM-JM-WSR     0.0627    0.0435   0.1034      0.1245
Table: Values of the complement of the Gini index at 10.
Pink = best algorithm.
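A sketch of one common way to compute the complement of the Gini index over recommendation lists. The exact formulation used for the table is not given in the slides, so this is an assumption: items are sorted by how often they are recommended, and the index is computed on their shares of all recommendations. Names and data are hypothetical.

```python
def gini_complement(recommendations, catalogue):
    """1 - Gini over how often each catalogue item appears in the top-N lists."""
    counts = {item: 0 for item in catalogue}
    for top_n in recommendations.values():
        for item in top_n:
            counts[item] += 1
    total = sum(counts.values())
    shares = sorted(c / total for c in counts.values())  # ascending order
    n = len(shares)
    gini = sum((2 * j - n - 1) * p for j, p in enumerate(shares, start=1)) / (n - 1)
    return 1.0 - gini

catalogue = ["a", "b", "c", "d"]
# Every item recommended equally often: maximal diversity.
even = gini_complement({"u1": ["a", "b"], "u2": ["c", "d"]}, catalogue)
# A single item recommended to everyone: minimal diversity.
skewed = gini_complement({"u1": ["a"], "u2": ["a"]}, catalogue)
```

The sketch assumes at least two catalogue items, at least one recommendation, and that every recommended item belongs to the catalogue. Higher values mean recommendations are spread more evenly across items.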
Novelty (MSI@10)
Algorithm     ML 100k   ML 1M     R3-Yahoo!   LibraryThing
Cosine-WSR    11.0579   12.4816   21.1968     41.1462
LM-DP-WSR     11.5219   12.8040   25.9647     46.4197
LM-JM-WSR     11.3921   12.8417   21.7935     43.5986
Table: Values of novelty in terms of Mean Self Information at 10.
Pink = best algorithm.
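A sketch of Mean Self Information under an assumed definition, since the slides do not spell it out: the self-information of an item is -log2 of the fraction of users who rated it, summed over each user's top-N list and averaged across users. All names and data are hypothetical.

```python
import math

def mean_self_information(recommendations, item_users, n_users):
    """Average over users of the summed self-information of their recommended items."""
    total = 0.0
    for top_n in recommendations.values():
        total += sum(-math.log2(len(item_users[item]) / n_users) for item in top_n)
    return total / len(recommendations)

# Hypothetical training data: which users rated each item, out of 4 users.
item_users = {"a": {"u1", "u2"}, "b": {"u3"}}
recommendations = {"u3": ["a"], "u4": ["b"]}
msi = mean_self_information(recommendations, item_users, n_users=4)
```

Item "a" was rated by half of the users (1 bit) and item "b" by a quarter (2 bits), so rarely rated items contribute more novelty.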
CONCLUSIONS AND FUTURE DIRECTIONS
Conclusions
Statistical language models are a powerful tool for computing
neighbourhoods in a collaborative filtering scenario. Combined
with WSR, language models:
Provide highly accurate recommendations.
Improve novelty and diversity figures compared to cosine.
Have low computational complexity.
Future work
Explore other probability distributions:
Multivariate Bernoulli.
Multivariate Poisson.
Evaluate the use of inverted indexes to compute
neighbourhoods:
Efficiency.
Scalability.
THANK YOU!
@DVALCARCE
http://www.dc.fi.udc.es/~dvalcarce

Computing Neighbourhoods with Language Models in a Collaborative Filtering Scenario [IIR '16 Slides]

  • 1. IIR 2016, VENEZIA, ITALY COMPUTING NEIGHBOURHOODS WITH LANGUAGE MODELS IN A COLLABORATIVE FILTERING SCENARIO Daniel Valcarce, Javier Parapar, Álvaro Barreiro @dvalcarce @jparapar @AlvaroBarreiroG Information Retrieval Lab @IRLab_UDC University of A Coruña Spain
  • 2. Outline 1. Introduction to Recommender Systems 2. Neighbourhood-based Methods 3. Computing Neighbourhoods 4. Language Models for Neighbourhoods 5. Experiments 6. Conclusions and Future Directions 1/26
  • 4. Recommender Systems Recommender systems provide personalised suggestions for items that may be of interest to the users. Top-N Recommendation: create a ranking of the N most relevant items for each user. Different approaches: Content-based: exploit item description to recommend items similar to those the target user liked in the past. Collaborative filtering: rely on the user feedback such as ratings or clicks to generate recommendations. Hybrid: combination of content-based and collaborative filtering approaches. 3/26
  • 5. Recommender Systems Recommender systems provide personalised suggestions for items that may be of interest to the users. Top-N Recommendation: create a ranking of the N most relevant items for each user. Different approaches: Content-based: exploit item description to recommend items similar to those the target user liked in the past. Collaborative filtering: rely on the user feedback such as ratings or clicks to generate recommendations. Hybrid: combination of content-based and collaborative filtering approaches. 3/26
  • 6. Collaborative Filtering Collaborative Filtering (CF) exploit feedback from users: Explicit: ratings or reviews. Implicit: clicks or purchases. Two main families of CF methods: Model-based: learn a model from the data and use it for recommendation. Neighbourhood-based (or memory-based): compute recommendations using directly part of the ratings. 4/26
  • 7. Collaborative Filtering Collaborative Filtering (CF) exploit feedback from users: Explicit: ratings or reviews. Implicit: clicks or purchases. Two main families of CF methods: Model-based: learn a model from the data and use it for recommendation. Neighbourhood-based (or memory-based): compute recommendations using directly part of the ratings. 4/26
  • 9. Neighbourhood-based Methods Two perspectives: User-based: recommend items that users with common interests with you liked. Item-based: recommend items similar to those you liked. Similarity between items is computed using common users among items (not the content!). 6/26
  • 10. Weighted Sum Recommender (WSR) Very simple but effective approach (Valcarce et al., ECIR 2016). WSR computes a weighted sum of the ratings in the neighbourhood. Weights are calculated using cosine similarity. Item-based version (WSR-IB): ˆru,i j∈Ji cosine i, j ru,j (1) User-based version (WSR-UB): ˆru,i v∈Vu cosine (u, v) rv,i (2) 7/26
  • 11. Weighted Sum Recommender (WSR) Very simple but effective approach (Valcarce et al., ECIR 2016). WSR computes a weighted sum of the ratings in the neighbourhood. Weights are calculated using cosine similarity. Item-based version (WSR-IB): ˆru,i j∈Ji cosine i, j ru,j (1) User-based version (WSR-UB): ˆru,i v∈Vu cosine (u, v) rv,i (2) The computation of neighbourhoods is crucial! 7/26
  • 13. Computing Neighbourhoods with k-NN algorithm The effectiveness of neighbourhood-based methods relies largely on how neighbours are computed. The most common approach is to compute the k nearest neighbours (k-NN algorithm) using a pairwise similarity. The most common similarities are Pearson’s correlation coefficient or cosine similarity. Cosine provides important improvements over Pearson’s correlation coefficient (Cremonesi et al., RecSys 2010). 9/26
  • 14. Computing Neighbourhoods with k-NN algorithm The effectiveness of neighbourhood-based methods relies largely on how neighbours are computed. The most common approach is to compute the k nearest neighbours (k-NN algorithm) using a pairwise similarity. The most common similarities are Pearson’s correlation coefficient or cosine similarity. Cosine provides important improvements over Pearson’s correlation coefficient (Cremonesi et al., RecSys 2010). Let’s study cosine similarity from the perspective of Information Retrieval. 9/26
  • 18. Cosine Similarity and the Vector Space Model. Correspondence between Recommendation and Information Retrieval: target user ↔ query; rest of users ↔ documents; items ↔ terms. Under this scheme, using cosine similarity for finding neighbours is equivalent to searching in the Vector Space Model. If we swap users and items, we can derive an analogous item-based approach. We can use sophisticated search techniques for finding neighbours!
  • 19. LANGUAGE MODELS FOR NEIGHBOURHOODS
  • 22. Language Models. Statistical language models are a state-of-the-art framework for document retrieval. Documents are ranked according to their posterior probability given the query:
        $p(d|q) = \dfrac{p(q|d)\, p(d)}{p(q)} \stackrel{\text{rank}}{=} p(q|d)\, p(d)$
      The query likelihood, $p(q|d)$, is based on a unigram model:
        $p(q|d) = \prod_{t \in q} p(t|d)^{c(t,d)}$
      The document prior, $p(d)$, is usually considered uniform.
  • 23. Language Models for Finding Neighbourhoods (I).
      Information Retrieval:
        $p(d|q) \stackrel{\text{rank}}{=} p(d) \prod_{t \in q} p(t|d)^{c(t,d)}$
      User-based collaborative filtering:
        $p(v|u) \stackrel{\text{rank}}{=} p(v) \prod_{i \in I_u} p(i|v)^{r_{v,i}}$
      Item-based collaborative filtering:
        $p(j|i) \stackrel{\text{rank}}{=} p(j) \prod_{u \in U_i} p(u|j)^{r_{u,j}}$
  • 25. Language Models for Finding Neighbourhoods (II).
      User-based collaborative filtering:
        $p(v|u) \stackrel{\text{rank}}{=} p(v) \prod_{i \in I_u} p(i|v)^{r_{v,i}}$
      We assume a multinomial distribution over the count of ratings. The maximum likelihood estimate (MLE) is:
        $p_{mle}(i|v) = \dfrac{r_{v,i}}{\sum_{j \in I_v} r_{v,j}}$
      However, it suffers from sparsity. We need smoothing!
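      A tiny sketch of the maximum likelihood estimate shows where sparsity bites: any item the user has not rated receives probability zero, so a product of probabilities that includes such an item collapses to zero. The function name `p_mle` and the toy profile are hypothetical.

```python
def p_mle(profile, item):
    """Maximum likelihood estimate p(i|v): rating of the item over the sum of ratings."""
    total = sum(profile.values())
    return profile.get(item, 0.0) / total

# Hypothetical profile of user v: item -> rating.
v = {"a": 4, "b": 1}

seen = p_mle(v, "a")    # rated item: non-zero probability
unseen = p_mle(v, "c")  # unrated item: probability exactly zero
```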
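  • 26. Smoothing Methods for Language Models.
      Absolute Discounting (AD):
        $p_\delta(i|u) = \dfrac{\max(r_{u,i} - \delta, 0) + \delta\, |I_u|\, p(i|C)}{\sum_{j \in I_u} r_{u,j}}$
      Jelinek-Mercer (JM):
        $p_\lambda(i|u) = (1 - \lambda)\, \dfrac{r_{u,i}}{\sum_{j \in I_u} r_{u,j}} + \lambda\, p(i|C)$
      Dirichlet Priors (DP):
        $p_\mu(i|u) = \dfrac{r_{u,i} + \mu\, p(i|C)}{\mu + \sum_{j \in I_u} r_{u,j}}$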
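      The three smoothing methods translate almost directly into code. The sketch below makes several assumptions that are not stated on the slides: the collection model $p(i|C)$ is taken to be the share of all rating mass given to item $i$, the prior over candidate neighbours is uniform, and the scoring function uses the standard query-likelihood convention in which the exponent is the count on the query side (here the target user's rating), which differs from the exponent as transcribed on the slides. All function names and the toy data are hypothetical.

```python
import math

def collection_model(ratings):
    """p(i|C): share of all rating mass given to item i (assumed definition)."""
    totals = {}
    for profile in ratings.values():
        for item, r in profile.items():
            totals[item] = totals.get(item, 0.0) + r
    mass = sum(totals.values())
    return {item: t / mass for item, t in totals.items()}

def absolute_discounting(profile, item, p_c, delta=0.1):
    total = sum(profile.values())
    discounted = max(profile.get(item, 0.0) - delta, 0.0)
    return (discounted + delta * len(profile) * p_c[item]) / total

def jelinek_mercer(profile, item, p_c, lam=0.5):
    total = sum(profile.values())
    return (1 - lam) * profile.get(item, 0.0) / total + lam * p_c[item]

def dirichlet_priors(profile, item, p_c, mu=100.0):
    total = sum(profile.values())
    return (profile.get(item, 0.0) + mu * p_c[item]) / (mu + total)

def log_likelihood(target, candidate, p_c, smooth):
    """Log-likelihood of the target profile under the candidate's smoothed model.

    ASSUMPTION: exponent = target user's rating (query-side count), and a
    uniform prior over candidates, so the prior is dropped from the ranking.
    """
    return sum(
        r * math.log(smooth(candidate, item, p_c)) for item, r in target.items()
    )

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "u": {"a": 5, "b": 4},
    "v": {"a": 4, "b": 5, "c": 2},
    "w": {"c": 5, "d": 4},
}
p_c = collection_model(ratings)
scores = {
    v: log_likelihood(ratings["u"], ratings[v], p_c, dirichlet_priors)
    for v in ratings
    if v != "u"
}
best = max(scores, key=scores.get)  # most likely neighbour of "u"
```

      Working in log space avoids numerical underflow when profiles are long, and smoothing guarantees that every logarithm is defined even for items the candidate never rated.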
  • 28. Experimental settings.
      Baselines: Pearson’s correlation coefficient; RM1Sim, a user-based similarity (Bellogín et al., RecSys ’13); cosine similarity.
      Our similarities are Language Models using: Absolute Discounting smoothing; Jelinek-Mercer smoothing; Dirichlet Priors smoothing.
  • 29. Parameter Sensitivity of WSR-UB on MovieLens 100k. [Figure: nDCG@10 as a function of the smoothing parameter (λ and δ from 0 to 1; µ from 0 to 10k) for Pearson, Cosine, RM1Sim (λ), LM-Absolute Discounting (δ), LM-Jelinek-Mercer (λ) and LM-Dirichlet Priors (µ).]
  • 30. Parameter Sensitivity of WSR-IB on R3-Yahoo!. [Figure: nDCG@10 as a function of the smoothing parameter (λ and δ from 0 to 1; µ from 10^0 to 10^6) for Pearson, Cosine, LM-Absolute Discounting (δ), LM-Jelinek-Mercer (λ) and LM-Dirichlet Priors (µ).]
  • 31. Precision (nDCG@10)
      Algorithm     ML 100k      ML 1M         R3-Yahoo!    LibraryThing
      NNCosNgbr     0.1427       0.1042        0.0138       0.0550
      PureSVD       0.3595^a     0.3499^ac     0.0198^a     0.2245^a
      Cosine-WSR    0.3899^ab    0.3430^a      0.0274^ab    0.2476^ab
      LM-DP-WSR     0.4017^abc   0.3585^abc    0.0271^ab    0.2464^ab
      LM-JM-WSR     0.4013^abc   0.3622^abcd   0.0276^ab    0.2537^abcd
      Table: Values of precision in terms of normalised discounted cumulative gain at 10. Statistical significance is superscripted (Wilcoxon two-sided p < 0.01). In the original slides, pink marks the best algorithm and blue those not significantly different from the best.
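      For readers unfamiliar with the metric, one common formulation of nDCG at a cut-off can be sketched as follows. The slides do not say which gain and discount functions were used, so the linear gain and base-2 logarithmic discount below are assumptions, as are the names `dcg_at_k`, `ndcg_at_k` and the toy relevance judgements.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain with a log2 discount (assumed variant)."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))

def ndcg_at_k(ranking, relevance, k=10):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [relevance.get(item, 0.0) for item in ranking]
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical held-out ratings of one user: item -> rating.
relevance = {"a": 5, "b": 3}
perfect = ndcg_at_k(["a", "b", "c"], relevance)
swapped = ndcg_at_k(["c", "b", "a"], relevance)
```

      A ranking that places the relevant items first in decreasing order of rating scores 1, and any other ordering scores lower.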
  • 32. Diversity (Gini@10)
      Algorithm     ML 100k   ML 1M    R3-Yahoo!   LibraryThing
      Cosine-WSR    0.0549    0.0400   0.0902      0.1025
      LM-DP-WSR     0.0659    0.0435   0.1557      0.1356
      LM-JM-WSR     0.0627    0.0435   0.1034      0.1245
      Table: Values of the complement of the Gini index at 10. In the original slides, pink marks the best algorithm.
  • 33. Novelty (MSI@10)
      Algorithm     ML 100k   ML 1M     R3-Yahoo!   LibraryThing
      Cosine-WSR    11.0579   12.4816   21.1968     41.1462
      LM-DP-WSR     11.5219   12.8040   25.9647     46.4197
      LM-JM-WSR     11.3921   12.8417   21.7935     43.5986
      Table: Values of novelty in terms of Mean Self Information at 10. In the original slides, pink marks the best algorithm.
  • 35. Conclusions. Statistical language models are a powerful tool for computing neighbourhoods in a collaborative filtering scenario. Combined with WSR, language models: provide highly accurate recommendations; improve novelty and diversity figures compared to cosine; have low computational complexity.
  • 36. Future work. Explore other probability distributions: Multivariate Bernoulli; Multivariate Poisson. Evaluate the use of inverted indexes to compute neighbourhoods: efficiency; scalability.