Models for Information Retrieval and Recommendation
Online information services personalize the user experience by applying recommendation systems to identify the information that is most relevant to the user. The question of how to estimate relevance has been the core concept in the field of information retrieval for many years. Not so surprisingly, then, it turns out that the methods used in online recommendation systems are closely related to the models developed in the information retrieval area. In this lecture, I present a unified approach to information retrieval and collaborative filtering, and demonstrate how this lets us turn a standard information retrieval system into a state-of-the-art recommendation system.


  1. 1. Recommendation and Information Retrieval: Two sides of the same coin? Prof.dr.ir. Arjen P. de Vries arjen@acm.org CWI, TU Delft, Spinque
  2. 2. Outline • Recommendation Systems – Collaborative Filtering (CF) • Probabilistic approaches – Language modelling for Information Retrieval – Language modelling for log-based CF – Brief: adaptations for rating-based CF • Vector Space Model (“Back to the Future”) – User and item spaces, orthonormal bases and “the spectral theorem”
  3. 3. Recommendation • Informally: – Search for information “without a query” • Three types: – Content-based recommendation – Collaborative filtering (CF) • Memory-based • Model-based – Hybrid approaches
  4. 4. Recommendation • Informally: – Search for information “without a query” • Three types: – Content-based recommendation – Collaborative filtering • Memory-based • Model-based – Hybrid approaches Today’s focus!
  5. 5. Popularity-based Content-based Online News
  6. 6. Music Collaborative Filtering
  7. 7. Collaborative Filtering • Collaborative filtering (originally introduced by Patti Maes as “social information filtering”) 1. Compare user judgments 2. Recommend differences between similar users • Leading principle: People’s tastes are not randomly distributed – A.k.a. “You are what you buy”
  8. 8. Rating Matrix
  9. 9. Users
  10. 10. Items
  11. 11. Rating
  12. 12. User Profile
  13. 13. Item Profile
  14. 14. Unknown Rating
  15. 15. Collaborative Filtering If user Boris watched Love Actually, how would he rate it?
  16. 16. Collaborative Filtering • Standard item-based formulation (Adomavicius & Tuzhilin 2005): $\mathrm{rat}(u, i) = \frac{\sum_{j \in I_u} \mathrm{sim}(i, j)\, \mathrm{rat}(u, j)}{\sum_{j \in I_u} \mathrm{sim}(i, j)}$
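As a concrete illustration, the similarity-weighted average above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `predict_rating` helper and the toy similarity table are invented for illustration, not taken from the slides.

```python
def predict_rating(user_ratings, sim, target):
    """Item-based CF: predict rat(u, i) as the similarity-weighted
    average of the user's ratings over the items j in I_u."""
    num = sum(sim(target, j) * r for j, r in user_ratings.items())
    den = sum(sim(target, j) for j in user_ratings)
    return num / den if den else 0.0

# Toy example: the user rated two items; item_a resembles the target.
ratings = {"item_a": 4.0, "item_b": 2.0}
similarity = {("target", "item_a"): 0.9, ("target", "item_b"): 0.1}
pred = predict_rating(ratings, lambda i, j: similarity[(i, j)], "target")
# (0.9*4.0 + 0.1*2.0) / (0.9 + 0.1) = 3.8
```

Items the user never rated simply contribute nothing; the denominator renormalizes over whatever similarity mass the rated items provide.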
  17. 17. Collaborative Filtering • Benefits over content-based approach – Overcomes problems with finding suitable features to represent e.g. art, music – Serendipity – Implicit mechanism for qualitative aspects like style • Problems: large groups, broad domains
  18. 18. Prediction vs. Ranking • Original formulations focused on modelling the users’ item ratings: rating prediction – Evaluation of algorithms (e.g., Netflix prize) by Mean Absolute Error (MAE) or Root Mean Square Error (RMSE) between predicted and actual ratings
  19. 19. Recency-based
  20. 20. Prediction vs. Ranking • Original formulations focused on modelling the users’ item ratings: rating prediction – Evaluation of algorithms (e.g., Netflix prize) by Mean Absolute Error (MAE) or Root Mean Square Error (RMSE) between predicted and actual ratings • For the end user, the ranking of recommended items is the essential problem: relevance ranking – Evaluation by precision at fixed rank (P@N)
  21. 21. Relevance Ranking • Core problem of Information Retrieval!
  22. 22. Generative Model • A statistical model for generating data – Probability distribution over samples in a given ‘language’ M P ( | M ) = P ( | M ) P ( | M, ) P ( | M, ) P ( | M, ) © Victor Lavrenko, Aug. 2002
  23. 23. Unigram models etc. • Unigram Models • N-gram Models (here, N=2) = P ( ) P ( | ) P ( | ) P ( | ) P ( ) P ( ) P ( ) P ( ) P ( ) P ( ) P ( | ) P ( | ) P ( | ) © Victor Lavrenko, Aug. 2002
  24. 24. Fundamental Problem • Usually we don’t know the model M – But have a sample representative of that model • First estimate a model from a sample • Then compute the observation probability P ( | M ( ) ) M © Victor Lavrenko, Aug. 2002
  25. 25. • Unigram Language Models (LM) – Urn metaphor Language Models… • P( ) ~ P ( ) P ( ) P ( ) P ( ) = 4/9 * 2/9 * 4/9 * 3/9 © Victor Lavrenko, Aug. 2002
  26. 26. … for Information Retrieval • Rank models (documents) by probability of generating the query: Q: P( | ) = 4/9 * 2/9 * 4/9 * 3/9 = 96/9⁴ P( | ) = 3/9 * 3/9 * 3/9 * 3/9 = 81/9⁴ P( | ) = 2/9 * 3/9 * 2/9 * 4/9 = 48/9⁴ P( | ) = 2/9 * 5/9 * 2/9 * 2/9 = 40/9⁴
  27. 27. Zero-frequency Problem • Suppose some event not in our example – Model will assign zero probability to that event – And to any set of events involving the unseen event • Happens frequently in natural language text, and it is incorrect to infer zero probabilities – Especially when dealing with incomplete samples ?
  28. 28. Smoothing • Idea: Shift part of probability mass to unseen events • Interpolate document-based model with a background model (of “general English”) – Reflects expected frequency of events – Plays role of IDF: λ · (document model) + (1−λ) · (background model)
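A minimal query-likelihood scorer with this linear interpolation (Jelinek-Mercer smoothing) might look as follows. The `lm_score` helper, the mixing convention with `lam` on the document side, and the toy corpus are all illustrative assumptions.

```python
import math

def lm_score(query, doc, collection, lam=0.5):
    """log P(q|d) under a smoothed unigram language model:
    P(t|d) = lam * P_ml(t|d) + (1 - lam) * P(t|collection)."""
    score = 0.0
    for t in query:
        p_doc = doc.count(t) / len(doc)       # maximum-likelihood estimate
        p_bg = collection.count(t) / len(collection)  # background model
        score += math.log(lam * p_doc + (1 - lam) * p_bg)
    return score

docs = [["rain", "rain", "sun"], ["sun", "sun", "moon"]]
collection = [t for d in docs for t in d]
scores = [lm_score(["rain"], d, collection) for d in docs]
# The background model keeps the score of docs[1] finite even though it
# never mentions "rain"; docs[0], which does, still ranks higher.
```

Without the background term, the second document would receive probability zero for the query, which is exactly the zero-frequency problem of the previous slide.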
  29. 29. Relevance Ranking • Core problem of Information Retrieval! – Question arising naturally: Are CF and IR, from a modelling perspective, really two different problems then? Jun Wang, Arjen P. de Vries, Marcel JT Reinders, A User-Item Relevance Model for Log-Based Collaborative Filtering, ECIR 2006
  30. 30. User-Item Relevance Models • Idea: CF by a probabilistic retrieval model
  31. 31. User-Item Relevance Models • Idea: CF by a probabilistic retrieval model • Treat user profile as query and answer the following question: – “ What is the probability that this item is relevant to this user, given his or her profile” • Hereto, apply the language modelling approach to IR as a formal model to compute the user-item relevance
  32. 32. Implicit or explicit relevance? • Rating-based CF: – Users explicitly rate “items” (we use “items” to represent contents: movie, music, etc.) • Log-based CF: – User profiles are gathered by logging the interactions: music play-lists, web surf logs, etc.
  33. 33. User-Item Relevance Models • Existing User-based/Item-based approaches – Heuristic implementations of “word-of-mouth” – Unclear how to best deal with the sparse data! • User-Item Relevance Models – Give probabilistic justification – Integrate smoothing to tackle the problem of sparsity
  34. 34. User-Item Relevance Models [Diagram: relevance between the target user and the target item, approached from two sides: an item representation (other items that the target user liked) and a user representation (other users who liked the target item)]
  35. 35. User-Item Relevance Models • Introduce the following random variables: Users $U \in \{u_1, \ldots, u_K\}$, Items $I \in \{i_1, \ldots, i_M\}$, Relevance $R \in \{r, \bar{r}\}$ ($r$ = “relevant”, $\bar{r}$ = “not relevant”) • Rank items by their log odds of relevance: $\mathrm{RSV}_U(I) = \log \frac{P(R = r \mid U, I)}{P(R = \bar{r} \mid U, I)}$
  36. 36. Item Representation [Diagram: the target user $u_k$ is represented by the query items $\{i_b\}$, the other items that the user liked; the question is the relevance of the target item $i_m$]
  37. 37. User-Item Relevance Models • Item representation – Use items that I liked to represent target user – Assume the item “ratings” are independent – Linear interpolation smoothing to address sparsity: $\mathrm{RSV}_{u_k}(i_m) = \log \frac{P(r \mid i_m, u_k)}{P(\bar{r} \mid i_m, u_k)} = \log \frac{P(u_k \mid r, i_m)\, P(r \mid i_m)}{P(u_k \mid \bar{r}, i_m)\, P(\bar{r} \mid i_m)} = \ldots = \sum_{\forall i_b \in L_{u_k}: c(i_b, i_m) > 0} \log\left(1 + \frac{(1-\lambda)\, P_{ml}(i_b \mid i_m, r)}{\lambda\, P(i_b \mid r)}\right) + \log P(i_m \mid r)$, where $P(i_b \mid i_m, r) = (1-\lambda)\, P_{ml}(i_b \mid i_m, r) + \lambda\, P(i_b \mid r)$ and $\lambda \in [0,1]$ is a parameter to adjust the strength of smoothing
  38. 38. User-Item Relevance Models • Probabilistic justification of Item-based CF – The RSV of a target item is the combination of its popularity and its co-occurrence with items (query items) that the target user liked: $\mathrm{RSV}_{u_k}(i_m) = \sum_{\forall i_b \in L_{u_k}: c(i_b, i_m) > 0} \log\left(1 + \frac{(1-\lambda)\, P_{ml}(i_b \mid i_m, r)}{\lambda\, P(i_b \mid r)}\right) + \log P(i_m \mid r)$ (first term: co-occurrence; second term: popularity)
  39. 39. User-Item Relevance Models • Probabilistic justification of Item-based CF – The RSV of a target item is the combination of its popularity and its co-occurrence with items (query items) that the target user liked • Item co-occurrence should be emphasized if more users express interest in both target & query item • Item co-occurrence should be suppressed when the popularity of the query item is high: $\mathrm{RSV}_{u_k}(i_m) = \sum_{\forall i_b \in L_{u_k}: c(i_b, i_m) > 0} \log\left(1 + \frac{(1-\lambda)\, P_{ml}(i_b \mid i_m, r)}{\lambda\, P(i_b \mid r)}\right) + \log P(i_m \mid r)$ (numerator: co-occurrence between target item and query item; denominator: popularity of the query item)
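Under stated assumptions, the item-representation ranking formula can be sketched directly from counts. The toy play-lists, the maximum-likelihood estimates from co-occurrence counts, and the `rsv` helper are all invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Toy log-based profiles: each user's set of "liked" items.
playlists = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c"},
    "u4": {"a", "d"},
}

# Popularity counts for P(i|r) and pairwise co-occurrence counts c(i_b, i_m).
pop = Counter(i for items in playlists.values() for i in items)
total = sum(pop.values())
cooc = Counter()
for items in playlists.values():
    for x, y in combinations(sorted(items), 2):
        cooc[(x, y)] += 1
        cooc[(y, x)] += 1

def rsv(profile, target, lam=0.5):
    """RSV of `target` for a user profile L_u: popularity term plus one
    co-occurrence term per profile item with c(i_b, i_m) > 0."""
    score = math.log(pop[target] / total)  # popularity: log P(i_m|r)
    for b in profile:
        c = cooc[(b, target)]
        if c > 0:
            # ML estimate P_ml(i_b | i_m, r) from co-occurrence counts
            p_ml = c / sum(cooc[(x, target)] for x in pop if x != target)
            score += math.log(1 + (1 - lam) * p_ml / (lam * pop[b] / total))
    return score

ranked = sorted(["b", "c", "d"], key=lambda i: rsv({"a"}, i), reverse=True)
```

Item "b" co-occurs most often with the profile item "a" and is also popular, so it comes out on top of the ranking.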
  40. 40. User Representation [Diagram: the target item $i_m$ is represented by the other users $\{u_b\}$ who liked it; the question is its relevance to the target user $u_k$]
  41. 41. User-Item Relevance Models • User representation – Represent target item by users who like it – Assume the user profiles are independent – Linear interpolation smoothing to address sparsity: $\mathrm{RSV}_{u_k}(i_m) = \log \frac{P(r \mid i_m, u_k)}{P(\bar{r} \mid i_m, u_k)} = \log \frac{P(i_m \mid r, u_k)\, P(r \mid u_k)}{P(i_m \mid \bar{r}, u_k)\, P(\bar{r} \mid u_k)} = \ldots = \sum_{\forall u_b \in L_{i_m}} \log\left(1 + \frac{(1-\lambda)\, P_{ml}(u_b \mid u_k, r)}{\lambda\, P(u_b \mid r)}\right)$, where $P(u_b \mid u_k, r) = (1-\lambda)\, P_{ml}(u_b \mid u_k, r) + \lambda\, P(u_b \mid r)$ and $\lambda \in [0,1]$ is a parameter to adjust the strength of smoothing
  42. 42. User-Item Relevance Models • Probabilistic justification of User-based CF – The RSV of a target item towards a target user is calculated by the target user’s co-occurrence with other users who liked the target item • User co-occurrence is emphasized if more items liked by the target user are also liked by the other user • User co-occurrence should be suppressed when this user liked many items: $\mathrm{RSV}_{u_k}(i_m) = \sum_{\forall u_b \in L_{i_m}} \log\left(1 + \frac{(1-\lambda)\, P_{ml}(u_b \mid u_k, r)}{\lambda\, P(u_b \mid r)}\right)$ (numerator: co-occurrence between the target user and the other users; denominator: popularity of the other users)
  43. 43. Empirical Results • Data Set: – Music play-lists from audioscrobbler.com – 428 users and 516 items – 80% users as training set and 20% users as test set. – Half of items in test set as ground truth, others as user profiles • Measurement – Recommendation Precision: (num of corrected items)/(num. of recommended) – Averaged over 5 runs – Compared with the suggestion lib developed in GroupLens
  44. 44. P@N vs. lambda
  45. 45. Effectiveness (P@N)
  46. 46. So far… • User-Item relevance models – Give a probabilistic justification for CF – Deal with the problem of sparsity – Provide state-of-art performance
  47. 47. Rating Prediction? • Previous log-based CF method neither predicts nor uses rating information – Ranks items solely by usage frequency – Appropriate for, e.g., music recommendation in a service like Spotify or personalised TV
  48. 48. [Diagram: the user-item rating matrix with users $u_1, \ldots, u_A$ as rows and items $i_1, \ldots, i_B$ as columns; entries $x_{a,b}$ are ratings, and $x_{a,b} = ?$ for user $u_a$ and item $i_b$ is the unknown rating to predict]
  49. 49. [Diagram: SIR (similar item ratings): user $u_a$'s row of the matrix, with items sorted by similarity to the target item $i_b$; the unknown rating $x_{a,b}$ is predicted from $u_a$'s ratings of the most similar items]
  50. 50. [Diagram: SUR (similar user ratings): item $i_b$'s column of the matrix, with users sorted by similarity to $u_a$; the unknown rating $x_{a,b}$ is predicted from similar users' ratings of $i_b$]
  51. 51. Sparseness • Whether you choose SIR or SUR, in many cases the neighborhood extends to include “not-so-similar” users and/or items • Idea: Take into consideration the similar item ratings made by similar users as an extra source for prediction Jun Wang, Arjen P. de Vries, Marcel JT Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, SIGIR 2006
  52. 52. [Diagram: SIR, SUR and SUIR (similar item ratings made by similar users) shown as three regions of the rating matrix around the unknown rating $x_{a,b}$, combined for rating prediction]
  53. 53. Similarity Fusion [Diagram: binary indicator variables $I_1$ and $I_2$ select the source of an observed rating $x_{k,m}$ for predicting $x_{a,b}$: $I_2 = 1$ selects $x_{a,b} \in SUIR$; $I_2 = 0$ defers to $I_1$, which selects $x_{a,b} \in SIR$ ($I_1 = 1$) or $x_{a,b} \in SUR$ ($I_1 = 0$)]
  54. 54. Sketch of Derivation $P(x_{k,m} \mid SUR, SIR, SUIR) = \sum_{I_2} P(x_{k,m}, I_2 \mid SUR, SIR, SUIR)\, P(I_2) = P(x_{k,m}, I_2{=}1 \mid SUR, SIR, SUIR)\, P(I_2{=}1) + P(x_{k,m}, I_2{=}0 \mid SUR, SIR, SUIR)\, (1 - P(I_2{=}1)) = P(x_{k,m} \mid SUIR)\, \delta + P(x_{k,m} \mid SUR, SIR)\, (1-\delta) = P(x_{k,m} \mid SUIR)\, \delta + \big( P(x_{k,m} \mid SUR)\, \lambda + P(x_{k,m} \mid SIR)\, (1-\lambda) \big) (1-\delta)$ See SIGIR 2006 paper for more details
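The last line of this derivation is a convex combination of the three predictors, which is simple to state in code. This is a sketch with made-up prediction values; the `fuse` helper and its defaults are not from the slides.

```python
def fuse(pred_sir, pred_sur, pred_suir, lam=0.5, delta=0.3):
    """Similarity fusion of item-based (SIR), user-based (SUR) and
    similar-user/similar-item (SUIR) predictions, following
    P(x) = delta*P(x|SUIR) + (1-delta)*(lam*P(x|SUR) + (1-lam)*P(x|SIR))."""
    return delta * pred_suir + (1 - delta) * (
        lam * pred_sur + (1 - lam) * pred_sir
    )

# Toy predictions from the three neighborhoods.
fused = fuse(pred_sir=4.0, pred_sur=3.0, pred_suir=2.5, lam=0.5, delta=0.2)
# 0.2*2.5 + 0.8*(0.5*3.0 + 0.5*4.0) = 3.3
```

The parameters δ and λ control how much trust is placed in the SUIR region versus the user-based/item-based mixture.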
  55. 55. User-Item Relevance Models [Diagram: at the theoretical level, user-item relevance models connect the Information Retrieval field (relevance feedback, query expansion) and the Machine Learning field (latent predictor spaces, latent semantic analysis, manifold learning, etc.); item representation and user representation act as individual predictors, joined by combination rules such as similarity fusion]
  56. 56. Remarks • SIGIR 2006 paper estimates probabilities directly from the given similarity distance between users and items • TOIS 2008 paper below applies Parzen window kernel density estimation to the rating data itself, to give a full probabilistic derivation – Shows how the “kernel trick” lets us generalize the distance measure, such that a cosine (projection) kernel (length-normalized dot product) can be chosen while keeping Gaussian kernel Parzen windows Jun Wang, Arjen P. de Vries, and Marcel J. T. Reinders. Unified relevance models for rating prediction in collaborative filtering. ACM TOIS 26 (3), June 2008
  57. 57. Relevance Feedback • Relevance Models for query expansion in IR – Language model estimated from known relevant or from top-k documents (Pseudo-RFB) – Expand query with terms generated by the LM • Application to recommendation – User profile used to identify neighbourhood; a Relevance Model estimated from that neighbourhood used to expand the profile – Deploy probabilistic clustering method PPC to construct the neighbourhood – Very good empirical results on P@N Javier Parapar, Alejandro Bellogín, Pablo Castells, Álvaro Barreiro. Relevance-Based Language Modelling for Recommender Systems. Information Processing & Management 49 (4), pp. 966-980
  58. 58. CF =~ IR? Follow-up question: Can we go beyond “model level” equivalences observed so far, and actually cast the CF problem such that we can use the full IR machinery? Alejandro Bellogín, Jun Wang, and Pablo Castells. Text Retrieval Methods for Item Ranking in Collaborative Filtering. ECIR 2011
  59. 59. IR System [Diagram: a query is processed and run by the text retrieval engine against an inverted index built from term occurrences (the term-document matrix), producing ranked output]
  60. 60. CF RecSys?! [Diagram: the user profile (as query) is processed and run by the same text retrieval engine against an inverted index built from user profiles (the user-item matrix), using item similarity, producing recommended output]
  61. 61. Collaborative Filtering • Standard item-based formulation: $\mathrm{rat}(u, i) = \frac{\sum_{j \in I_u} \mathrm{sim}(i, j)\, \mathrm{rat}(u, j)}{\sum_{j \in I_u} \mathrm{sim}(i, j)}$ • More general: $\mathrm{rat}(u, i) = \sum_{j \in g(u)} f(u, i, j) = \sum_{j \in g(u)} f_1(u, j)\, f_2(i, j)$
  62. 62. Text Retrieval • In (Metzler & Zaragoza, 2009): $s(q, d) = \sum_{t \in g(q)} s(q, d, t)$ – In particular, the factored form: $s(q, d, t) = w_1(q, t)\, w_2(d, t)$
  63. 63. Text Retrieval • Examples – TF: $w_1(q, t) = \mathrm{qf}(t)$, $w_2(d, t) = \mathrm{tf}(t, d)$ – TF-IDF: $w_1(q, t) = \mathrm{qf}(t)$, $w_2(d, t) = \mathrm{tf}(t, d)\, \log\frac{N}{\mathrm{df}(t)}$ – BM25: $w_1(q, t) = \frac{(k_3 + 1)\, \mathrm{qf}(t)}{k_3 + \mathrm{qf}(t)}$, $w_2(d, t) = \log\frac{N - \mathrm{df}(t) + 0.5}{\mathrm{df}(t) + 0.5} \cdot \frac{(k_1 + 1)\, \mathrm{tf}(t, d)}{k_1 \big( (1 - b) + b\, \mathrm{dl}(d) / \overline{\mathrm{dl}} \big) + \mathrm{tf}(t, d)}$
  64. 64. IR =~ CF? • In item-based Collaborative Filtering, map $t \approx j$, $d \approx i$, $q \approx u$, so that $\mathrm{tf}(t, d) = \mathrm{sim}(i, j)$ and $\mathrm{qf}(t) = \mathrm{rat}(u, j)$ • Apply different models – With different normalizations and norms: sqd, L1 and L2 – Normalization variants: neither side normalized (s00), document norm only (/|D|, s01), query norm only (/|Q|, s10), both (s11)
  65. 65. IR =~ CF! • TF L1 s01 is equivalent to item-based CF: $s(q, d) = \sum_{t \in g(q)} w_1(q, t)\, w_2(d, t) = \frac{\sum_{t \in g(q)} \mathrm{qf}(t)\, \mathrm{tf}(t, d)}{\sum_{t \in g(q)} \mathrm{tf}(t, d)}$, which with $\mathrm{tf}(t, d) = \mathrm{sim}(i, j)$ and $\mathrm{qf}(t) = \mathrm{rat}(u, j)$ equals $\mathrm{rat}(u, i) = \frac{\sum_{j \in I_u} \mathrm{sim}(i, j)\, \mathrm{rat}(u, j)}{\sum_{j \in I_u} \mathrm{sim}(i, j)}$
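The claimed equivalence is easy to verify numerically. The toy similarity and rating values below are made up; the point is only that the two formulations compute the same quantity term by term.

```python
# Mapping: query terms t <-> profile items j, with qf(t) = rat(u, j);
# document d <-> target item i, with tf(t, d) = sim(i, j).
sims = {"j1": 0.8, "j2": 0.2}   # sim(i, j) for the target item i
rats = {"j1": 5.0, "j2": 1.0}   # rat(u, j) over the user's profile I_u

# Item-based CF prediction.
cf = sum(sims[j] * rats[j] for j in rats) / sum(sims[j] for j in rats)

# TF weighting, s01 (L1-normalize the document side only):
# s(q, d) = sum_t qf(t) * tf(t, d) / sum_t tf(t, d).
ir = sum(rats[j] * sims[j] for j in rats) / sum(sims[j] for j in rats)
# Both evaluate to (0.8*5.0 + 0.2*1.0) / (0.8 + 0.2) = 4.2
```

Every other cell of the normalization table (s00, s10, s11) gives a different, IR-inspired variant of the same scoring family.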
  66. 66. Empirical Results • Movielens 1M – Movielens100k: comparable results [Bar chart: nDCG (scale 0.00–0.40) per weighting/normalization configuration: TF L1 s01, TF-IDF L1 s01, TF-IDF L2 s11, BM25 L2 s11, TF L1 s10, BM25 L1 s01, TF-IDF L1 s10, BM25 L1 s00, TF-IDF L2 s10, TF-IDF L1 s00, TF L2 s10, BM25 L2 s10, TF L1 s00, BM25 L1 s11, BM25 L1 s10, BM25 L2 s01, TF L2 s11, TF-IDF L2 s01, TF L2 s01]
  67. 67. Vector Space Model • Challenge: – No shared “words” to relate documents to queries • Solution: – First project users and items in a common space • Two extreme settings: – Project users into a space with dimensionality of the number of items – Project items into a space with dimensionality of the number of users A. Bellogín, J. Wang, P. Castells. Bridging Memory-Based Collaborative Filtering and Text Retrieval. Information Retrieval Journal
  68. 68. Item Space • User • Item • Rank • Predict rating:
  69. 69. User space • User • Item • Rank • Predict rating:
  70. 70. Linear Algebra • Users and items in shared orthonormal space: • Consider covariance matrix • Spectral theorem now states that an orthonormal basis of eigenvectors exists
  71. 71. Linear Algebra • Use this basis to represent items and users: • The dot product then has a remarkable form (of the IR models discussed):
  72. 72. Subspaces… • Number of items (n) vs. number of users (m): – If n < m, a linear dependency must exist between users in terms of the item space components – In this case, it has been known empirically that item-based algorithms tend to perform better • Dimension of sub-space key for the performance of the algorithm? • ~ better estimation (more data per item) in the probabilistic versions
  73. 73. Subspaces… • Matrix Factorization methods are captured by assuming a lower-dimensionality space to project items and users into (usually considered “model-based” rather than “memory-based”) ~ Latent Semantic Indexing (a VSM method replicated as pLSA and variants)
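A small numpy sketch of this connection: a truncated SVD of the rating matrix plays the role of LSI, projecting users and items into a shared k-dimensional latent space. The toy matrix and the choice k=2 are arbitrary illustration, not from the slides.

```python
import numpy as np

# Toy user-item rating matrix (rows: users, columns: items).
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 5.0, 1.0],
              [0.0, 1.0, 5.0]])

# Truncated SVD: keep the k largest singular values, so users (rows of
# U_k * S_k) and items (rows of V_k) live in one k-dimensional space.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Largest entrywise reconstruction error of the rank-k approximation.
err = np.abs(R - R_k).max()
```

Predicted ratings are then dot products in the latent space, exactly the "memory-based meets model-based" view of the slide above.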
  74. 74. Ratings into Inverted File • Note: distribution of item occurrences not Zipfian like text, so existing implementations (including choice of compression etc.) may be sub-optimal for CF runtime performance
  75. 75. Weighting schemes
  76. 76. Empirical results 1M
  77. 77. Empirical results 10M
  78. 78. Rating prediction
  79. 79. Concluding Remarks • The probabilistic models are elegant (often deploying impressive maths), but what do they really add in understanding IR & CF – i.e., beyond the (often claimed to be “ad-hoc”) approaches of the VSM?
  80. 80. Concluding Remarks • Clearly, the models in CF & IR are closely related • Should these then really be studied in two different (albeit overlapping) communities, RecSys vs. SIGIR?
  81. 81. Meanwhile at TREC…
  82. 82. Contextual Suggestions • Given a user profile and a context, make suggestions – AKA Context-aware Recommendation, zero-query Information Retrieval, …
  83. 83. “Entertain me” • Recommend “things to do”, where – User profile consists of opinions about attractions – Context consists of a specific geo-location
  84. 84. TREC-CS (1/3) • Given a user profile – 70 – 100 POIs represented by a title, description and URL (situated in Chicago / Santa Fe) – Rated on a scale 0 – 4 125, Adler Planetarium & Astronomy Museum, ''Interactive exhibits & high-tech sky shows entertain stargazers -- lakefront views are a bonus.'', http://www.adlerplanetarium.org/ 131,Lincoln Park Zoo,"Lincoln Park Zoo is a free 35-acre zoo located in Lincoln Park in Chicago, Illinois. The zoo was founded in 1868, making it one of the oldest zoos in the U.S. It is also one of a few free admission zoos in the United States.", http://www.lpzoo.org/ 700, 125, 4, 4 700, 131, 0, 1
  85. 85. TREC CS (2/3) • … and a context – Corresponding to a metropolitan area in the USA, e.g., 109, Kalamazoo, MI
  86. 86. TREC CS (3/3) • Suggest Web pages / snippets – From the Open Web, or from ClueWeb 700, 109 ,1,"About KIA History Kalamazoo Institute of Arts KIA History","The Kalamazoo Institute of Arts is a nonprofit art museum and school. Since , the institute has offered art classes and free admission programming, including exhibitions, lectures, events, activities and a permanent collection. The KIAs mission is to cultivate the creation and appreciation of the visual arts for the communities",clueweb12-1811wb-14-09165
  87. 87. Common approach
  88. 88. References • Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE TKDE 17(6), 734-749 (2005) • Alejandro Bellogín, Jun Wang, and Pablo Castells. Text Retrieval Methods for Item Ranking in Collaborative Filtering. ECIR 2011. • Metzler, D., Zaragoza, H.: Semi-parametric and non-parametric term weighting for information retrieval. ECIR 2009. • Javier Parapar, Alejandro Bellogín, Pablo Castells, Álvaro Barreiro. Relevance-Based Language Modelling for Recommender Systems. Information Processing & Management 49 (4), pp. 966-980 • A. Bellogín, J. Wang, P. Castells. Bridging Memory-Based Collaborative Filtering and Text Retrieval. Information Retrieval (to appear) • Jun Wang, Arjen P. de Vries, Marcel JT Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, SIGIR 2006 • Jun Wang, Arjen P. de Vries, Marcel JT Reinders, A User-Item Relevance Model for Log-Based Collaborative Filtering, ECIR 2006 • Jun Wang, Arjen P. de Vries, and Marcel J. T. Reinders. Unified relevance models for rating prediction in collaborative filtering. ACM TOIS 26 (3), June 2008. • Jun Wang, Stephen Robertson, Arjen P. de Vries, and Marcel J.T. Reinders. Probabilistic relevance ranking for collaborative filtering. Information Retrieval 11 (6):477-497, 2008
  89. 89. Thanks • Alejandro Bellogín • Jun Wang • Thijs Westerveld • Victor Lavrenko
