This document provides an overview of recommender systems for e-commerce. It discusses various recommender approaches including collaborative filtering algorithms like nearest neighbor methods, item-based collaborative filtering, and matrix factorization. It also covers content-based recommendation, classification techniques, addressing challenges like data sparsity and scalability, and hybrid recommendation approaches.
2. Content
• Recommendation Problem
• Recommender Approaches
• Recommender Algorithms
• Collaborative Filtering – CF
• Nearest Neighbor Methods – kNN
• Item Based CF
• Clustering
• Association Rule Based CF
• Classification
• Data Sparsity Challenges
• Scalability Challenges
• Performance Implications
• CF - Matrix Factorization
• Content Based Recommender – CBR
• Machine Learning and Statistical Techniques
• Learning to Rank
• Context Aware Recommendation Systems
• Social and Trust based Recommendations
• Hybrid Approaches
• Summary
• References
3. Recommender Systems
• Estimate a utility function to predict how a user will like an item
• Systems to recommend items and services of likely interest to the user
based on their preferences
• Compare user’s profile to some reference characteristics to predict
whether the user would be interested in an unseen item
• Helps users deal with the information overload
• Books, movies, CDs, web pages, news
• On-line stores like Amazon
• Substantial sales improvements
4. Information Sources Used
• Browsing and searching data
• Purchase data
• Feedback explicitly provided by the users
• Textual comments
• Expert recommendations
• Demographic data
5. Recommendation Problem
• Given a set of users U and a set of items S to be recommended
• Let p be a utility function that measures the usefulness of item s
(∈ S) to user u (∈ U)
– p: U × S → R, where R is a totally ordered set (non-negative integers or real
numbers in a range)
• Objective
– Learn p based on the past data
– Use p to predict the utility value of each item s (∈ S) to each user u (∈ U)
6. Recommender Tasks
• Rating prediction - predict a rating that a user is likely to give
to an item not seen before
• Item prediction - Top N - predict a ranked list of items that a
user is likely to buy
7. Good Recommender
• Does not recommend items the user already knows or would
have found anyway
• Expands the user's taste into neighbouring areas - Serendipity
= Unsought finding
• Diverse - represents all the possible interests of one user
• Relevant to the user - personalized
8. User Profiles
• Most recommender systems use a profile of interests of a user
• History of user is used as training data for a machine learning algorithm
• History is used to create a user model
• Contents
– A model of user’s preferences - a function for any item that predicts the
likelihood that the user is interested in that item
– User’s interaction history - items viewed by a user, items purchased by a user,
search queries
9. User Profiles
• Manual recommendation - user customization
– Provide a check box interface to allow users to construct their own profiles of
interests
– A simple database matching process to find items that meet the specified
criteria and recommend these to users
• Limitations
– Requires effort from users
– Cannot cope with changes in the user's interests
– Does not provide a way to order the recommended items
10. User Model
• Creating a model of the user’s preference from his history is a form of
classification learning
• The training data (user history) could be captured through
– explicit feedback (user rates items)
– implicit observation of the user’s interactions (for example, a user who bought an item
and later returned it probably does not like the item)
11. User Model
• Implicit methods can collect a large amount of data, but the data may contain noise,
while data collected through explicit methods is more reliable but the amount
collected may be limited
• A number of classification learning algorithms can be considered
– Main goal - learn a function that models the user’s interests
– Applying function on a new item can give the probability that a user will like
this item or a numeric value indicating the degree of interest in this item
12. Recommender Approaches
• Collaborative Filtering - recommends items that people with similar tastes and
preferences liked in the past
• Content Based - recommends items similar to ones that the user preferred in the
past
• Personalized learning to rank - treats recommendation as a ranking problem
• Social recommendations - trust based
• Hybrid - combination of the above
13. Evolution of Recommender Systems
• Item Hierarchy – You bought a camera, you will also need film
• Attribute Based – You like action movies starring Clint Eastwood, you will also like the movie The Good, the Bad and the Ugly
• Collaborative Filtering, user-user similarity – People like you who bought milk also bought bread
• Collaborative Filtering, item-item similarity – You like Spiderman so you will like Superman
• Social + Interest Graph based – Your friend likes a Ford car, you will also like a Ford car
• Model based – training SVM, SVD, LDA for implicit features
16. CF
• Recommends items by computing the similarity between the preferences of a user and
those of other like-minded people
• Assumption - personal tastes are correlated
• These systems ignore content of the items being recommended
• They are able to suggest new items to user whose preferences are similar to
others
• Maintains a database of ratings by many users of items
• Predicts utility of items for a user based on the items previously rated by other
like-minded users
17. CF Methods
• Memory Based
– Memory-based algorithms operate on the entire user-item rating matrix and generate
recommendations by identifying the neighbourhood of the target user to whom the
recommendations will be made, based on the agreement of users’ past ratings
– Easy to understand, easy to implement and work well in many real-world situations
– The most serious problem is the sparsity of the user-item rating matrix
– Another problem of memory-based CF is efficiency. It is not computationally feasible for
ad-hoc recommender systems with millions of users and items
– Clustering, matrix factorization, machine learning on the graph
• Model Based
– Model-based techniques use the rating data to train a model and then the model will be used
to derive the recommendations
18. CF - Approaches
• K-nearest neighbor (kNN)
– User based methods
• Neighborhood formation phase
• Recommendation phase
– Item based methods
• Recommendation phase
• Association rules based prediction
• Matrix factorization
19. CF - Process
• Weight all users with respect to similarity with the active user
• Select a subset of the users (neighbors) to use as predictors
• Normalize ratings and compute a prediction from a weighted
combination of the selected neighbors’ ratings
• Present items with highest predicted ratings as recommendations
20. Nearest Neighbor Methods - kNN
• Memory based - stores all the training data in memory
• To classify a new item, compare it to all stored items using a similarity
function and determine the nearest neighbour or k nearest neighbours
• Class or numeric score of the previously unseen item can then be derived
from the class of the nearest neighbour
• Utilizes entire user-item database to generate predictions directly
21. Nearest Neighbor Methods - kNN
• No model building
• Similarity function depends on type of data
– Structured data: Euclidean distance metric
– Unstructured data (free text)
Cosine similarity function
22. Neighbor Selection
• For a given active user a, select correlated users to serve as source of
predictions
• Standard approach is to use the n most similar users u, based on
similarity weights w_{a,u}
• Alternate approach is to include all users whose similarity weight is above
a given threshold
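23. User Based CF - Neighborhood Formation
• Let the record (or profile) of the target user be u (represented
as a vector) and the record of another user be v (v ∈ T)
• Calculate similarity between the target user u and a neighbor
v using Pearson’s correlation coefficient
\[
\mathrm{sim}(u, v) = \frac{\sum_{i \in C} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in C} (r_{u,i} - \bar{r}_u)^2}\ \sqrt{\sum_{i \in C} (r_{v,i} - \bar{r}_v)^2}}
\]
• C is the set of items rated by both u and v; \bar{r}_u and \bar{r}_v are the mean ratings of u and v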
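As a rough illustration of neighborhood formation, the Pearson correlation between two users could be computed as in the sketch below. The dict-based rating layout and the name `pearson_sim` are assumptions made for this example; the user means are taken over each user's full set of ratings, which is one common convention among several.

```python
from math import sqrt

def pearson_sim(ratings_u, ratings_v):
    """Pearson correlation between two users over their co-rated items."""
    common = set(ratings_u) & set(ratings_v)  # items rated by both users
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as no similarity
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

# Hypothetical ratings on a 1-5 scale
alice = {"A": 5, "B": 3, "C": 4}
bob = {"A": 4, "B": 2, "C": 3}
carol = {"A": 1, "B": 5, "C": 2}
```

Here `pearson_sim(alice, bob)` is close to 1 because Bob's ratings follow Alice's pattern one point lower, while `pearson_sim(alice, carol)` is negative.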
24. User Based CF - Recommendation Phase
• Compute the rating prediction of item i for target user u
• V is the set of k similar users and r_{v,i} is the rating that user v
gave to item i
\[
p(u, i) = \bar{r}_u + \frac{\sum_{v \in V} \mathrm{sim}(u, v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in V} |\mathrm{sim}(u, v)|}
\]
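The recommendation phase of user-based CF can be sketched as a mean-centered, similarity-weighted average over the neighbors. The function name and the tuple layout `(similarity, neighbor_rating, neighbor_mean)` are assumptions for this example, and falling back to the user's own mean when no neighbor is available is a design choice of the sketch.

```python
def predict_user_based(mean_u, neighbors):
    """Predict a rating for the target user.

    neighbors: list of (sim_uv, r_vi, mean_v) tuples, one per similar user v
    who rated item i.
    """
    num = sum(sim * (r_vi - mean_v) for sim, r_vi, mean_v in neighbors)
    den = sum(abs(sim) for sim, _, _ in neighbors)
    if den == 0:
        return mean_u  # assumption: no usable neighbors, use the user's mean
    return mean_u + num / den

# Hypothetical neighbors: one rated the item above their mean, one below
neighbors = [(0.9, 5.0, 4.0), (0.5, 2.0, 3.0)]
pred = predict_user_based(3.5, neighbors)
```

The more similar neighbor pulls the prediction above the target user's mean of 3.5.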
25. Personalized vs. Non-Personalized
• CF recommendations are personalized since the prediction is
based on the ratings expressed by similar users
– Those neighbours are different for each target user
• A non-personalized collaborative-based recommendation
can be generated by averaging the recommendations of ALL
the users
26. User Based CF - Issues
• Lack of scalability - requires the real-time comparison of the target user to all user records in
order to generate predictions
• Difficult to make predictions based on nearest neighbour algorithms - accuracy of
recommendation may be poor
• Sparsity – when evaluating large item sets, users' purchases cover under 5% of the items
• Poor relationship among like minded but sparse-rating users
• Solution
– usage of latent models to capture similarity between users & items in a reduced dimensional space
– A variation of this approach that remedies this problem is called item-based CF
28. Item Based CF
• Rather than matching the user to similar customers, build a similar-items
table by finding items that customers tend to purchase together
• Amazon.com used this method
• Scales independently of the catalog size or the total number of customers
• Acceptable performance by creating the expensive similar-item table
offline
29. Item Based CF
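• The item-based approach works by comparing items based
on their pattern of ratings across users. The similarity of
items i and j is computed as follows
\[
\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2}\ \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}
\]
• U is the set of users who rated both i and j; \bar{r}_u is the mean rating of user u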
30. Item Based CF - Recommendation phase
• After computing the similarity between items we select a set
of k most similar items to the target item and generate a
predicted value of user u’s rating
\[
p(u, i) = \frac{\sum_{j \in J} r_{u,j}\ \mathrm{sim}(i, j)}{\sum_{j \in J} \mathrm{sim}(i, j)}
\]
• where J is the set of k similar items
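A minimal sketch of item-based CF, covering both the item-item similarity and the prediction step, might look as follows. The nested-dict rating layout and the function names are assumptions for illustration, and keeping only positively similar items is a simplification chosen here so that the denominator stays positive.

```python
from math import sqrt

def item_sim(ratings, i, j):
    """Similarity of items i and j over users who rated both, using
    ratings centered on each user's mean."""
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            di, dj = user_ratings[i] - mean_u, user_ratings[j] - mean_u
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (sqrt(den_i) * sqrt(den_j))

def predict_item_based(ratings, u, i, k=2):
    """Similarity-weighted average of u's ratings on the k items most similar to i."""
    sims = [(item_sim(ratings, i, j), j) for j in ratings[u] if j != i]
    top = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    den = sum(s for s, _ in top)
    if den == 0:
        return None  # assumption: no prediction without a similar rated item
    return sum(s * ratings[u][j] for s, j in top) / den

# Hypothetical ratings: u4 has not rated item "A"
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
    "u4": {"B": 5, "C": 1},
}
pred = predict_item_based(ratings, "u4", "A")
```

In this toy data item A is rated like item B and unlike item C, so the prediction for u4 follows u4's rating of B. In a production setting the similarity table would typically be computed offline and only the weighted sum evaluated online.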
31. Model Based CF Algorithms
• Models are learned from the underlying data rather than heuristics
• Models of user ratings (or purchases)
– Clustering (classification)
– Association rules
– Matrix Factorization
– Restricted Boltzmann Machines
– Other models
• Bayesian network (probabilistic)
• Probabilistic Latent Semantic Analysis ...
32. Clustering
• One more technique to recommend based on past purchases is to cluster
customers for collaborative filtering
• Each cluster assigned typical preferences, based on preferences of customers
belonging to that cluster
• Customers within each cluster receive recommendations computed at the cluster
level
33. Clustering
• Pros
– Clustering techniques can be used to work on aggregated data
– Can also be applied as a first step for shrinking the selection of relevant neighbours in a
collaborative filtering algorithm and improve performance
– Can be used to capture latent similarities between users or items
• Cons
– Recommendations (per cluster) may be less relevant than collaborative filtering (per
individual)
34. Item Based CF - Issues
• Bottleneck is similarity computation
– Time complexity, highly time consuming with millions of users and items in
the database
• Isolate the neighbourhood generation and prediction steps
• off-line model – similarity computation done earlier & stored in memory
• On-line component – prediction generation process
36. Association Rule Based CF
• Association rules used for
recommendation
• Each transaction for association rule
mining is the set of items bought by
a particular user
• Find item association rules: buy_X,
buy_Y -> buy_Z
• Rank items based on measures such
as confidence
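To make the confidence measure concrete, the sketch below computes support and confidence for one rule over a handful of transactions. The basket contents and function names are hypothetical; a real system would mine rules with an algorithm such as Apriori rather than score a single hand-picked rule.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    sup_a = support(transactions, antecedent)
    if sup_a == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / sup_a

# Each transaction is the set of items bought by one user (hypothetical data)
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "jam"},
]
conf = confidence(transactions, {"milk", "bread"}, {"butter"})
```

Three users bought milk and bread and two of them also bought butter, so the rule buy_milk, buy_bread -> buy_butter has confidence 2/3.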
37. Association Rules
• Pros
– Very little storage space required
– Quick to implement
– Fast to execute
– Not individualized
– Very successful in broad applications for large populations, such as shelf layout in retail stores
• Cons
– Not suitable if preferences change rapidly
– It is tempting not to apply restrictive confidence rules
– May lead to nonsensical recommendations
38. Classification
• Classifiers are computational models trained using positive and negative
examples
• They may take as inputs
– Vector of item features (action / adventure, Tom Hanks)
– Preferences of customers (like comedy / action)
– Relations among items
• Logistic Regression, Bayesian Networks, Support Vector Machines, Decision Trees...
• Used in CF and CB Recommenders
39. Classification
• Pros
– Can be combined with other methods to improve accuracy of
recommendations
– Versatile
• Cons
– May overfit (mitigated by regularization)
– Needs relevant training data
40. CF - Benefits and Challenges
• Benefits
– Very powerful and efficient
– Highly relevant recommendations
• The bigger the database, the more the past behaviors, the better the recommendations
• Challenges
– Cold Start: Needs to have enough other users in the system to find a match
– Sparsity: Despite many users, the user/ratings matrix can be sparse, making it hard to find users that have rated
the same items
– First Rater: Cannot recommend an item that has not been previously rated
• New items
• Esoteric items
– Popularity Bias: Cannot recommend items to someone with unique tastes
• Tends to recommend popular items
41. Data Sparsity Challenges
• Netflix Prize rating data in a User/Movie matrix
– 500,000 x 17,000 = 8,500 M positions
– Out of which only 100M have data
• Typically - large product sets, user ratings for a small percentage of them
– Example Amazon: millions of books, while a user may have bought hundreds of
books – the probability that two users who have each bought 100 books have a
common book (in a catalogue of 1 million books) is 0.01 (with 50 books each and a
catalogue of 10 million it is 0.0002)
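The order of magnitude of these overlap probabilities can be checked with a back-of-the-envelope model. The sketch below assumes that both users pick books uniformly and independently from the catalogue, which is a simplification (real purchases are skewed towards popular titles); the function name is hypothetical.

```python
def prob_common_item(catalogue_size, basket_size):
    """Approximate chance that two equally sized baskets share at least one item,
    assuming uniform, independent choices."""
    # Chance that one particular item of the first basket is absent from the second
    p_miss_one = 1 - basket_size / catalogue_size
    # All items of the first basket must miss for the baskets to be disjoint
    return 1 - p_miss_one ** basket_size

p_small = prob_common_item(1_000_000, 100)   # roughly 0.01
p_large = prob_common_item(10_000_000, 50)   # a few times 1e-4
```

Under these assumptions the first case comes out just below 0.01 and the second near 0.00025, the same order as the figures quoted on the slide.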
42. Data Sparsity Challenges
• Standard CF must have a number of users comparable to one
tenth of the size of the product catalogue
• Methods of dimensionality reduction
– Matrix Factorization
– Projection (PCA ...)
– Clustering
43. Scalability Challenges
• Nearest neighbour algorithms require computations that grow with both the
number of customers and products
• With millions of customers and products a web-based recommender can suffer
serious scalability problems
• The worst case complexity is O(mn) (m customers and n products)
• But in practice the complexity is O(m + n) since for each customer only a small
number of products are considered
• Some clustering techniques like K-means can help
44. Performance Implications
• Item-based similarity is static - Enables pre-computing of
item-item similarity
– prediction process involves only a table lookup for the similarity
values & computation of the weighted sum
• User-based CF – similarity between users is dynamic, pre-
computing user neighbourhood can lead to poor predictions
45. Matrix Factorization
• Provides superior performance in recommendation quality and scalability
• Approximates ratings matrix to a product of low rank matrices
• Decompose a matrix M into the product of several factor matrices, where
n can be any number - usually 2 or 3
\[ M = F_1 F_2 \cdots F_n \]
46. CF - Matrix Factorization
• Matrix factorization is a latent factor model
• Latent variables (also called features, aspects, or factors) are introduced to account for the
underlying reasons of a user purchasing or using a product
– Connections between the latent variables and observed variables (user, product, rating, etc.) are
estimated during the training
– Recommendations made by computing possible interactions with each product through the latent
variables
• Netflix Prize contest for movie recommendation used a Singular Value Decomposition
(SVD) based algorithm
• The prize winning method employed an adapted version of SVD
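To illustrate the latent factor idea, here is a minimal sketch of a two-matrix factorization trained by stochastic gradient descent on the observed ratings only. It is a plain regularized factorization, not the Netflix Prize method; the hyperparameters, the toy ratings and the function names are all assumptions for this example.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's latent factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def factorize(ratings, n_users, n_items, k=2, epochs=3000, lr=0.01, reg=0.02, seed=0):
    """Fit user factors P and item factors Q to (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical observed ratings; the pairs (1, 2), (2, 1) and (3, 0) are unobserved
ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1),
           (1, 0, 4), (1, 1, 5),
           (2, 0, 1), (2, 2, 5),
           (3, 1, 2), (3, 2, 4)]
P, Q = factorize(ratings, n_users=4, n_items=3)
estimate = predict(P, Q, 1, 2)  # estimate for an unobserved user-item pair
```

Once P and Q are learned, any missing cell of the rating matrix can be estimated with `predict`, which is what makes the approach usable on sparse data.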
48. Extended Matrix Factorization (EMF)
• Matrix factorization can be extended in several ways, depending on the purpose
• Adding biases to matrix factorization
– User bias – a user who rates strictly tends to give lower ratings
– Item bias – a popular movie tends to receive higher ratings
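One common way to write the bias-extended model is given below, as a sketch in the notation of the matrix factorization literature rather than as this deck's own formula. The symbols are assumptions introduced for illustration: \mu is the global mean rating, b_u and b_i are the user and item biases, and p_u and q_i are the latent factor vectors of user u and item i.

```latex
\hat{r}_{u,i} = \mu + b_u + b_i + q_i^{\top} p_u
```

In this form a strict user would have a negative b_u and a popular movie a positive b_i, so the factor term only has to explain what remains after these offsets.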
49. EMF
• Adding other influential factors
– Temporal effects – very old ratings have less impact on rating predictions
• Algorithm – timeSVD++
– Content Profile
• Users or items share same or similar content like gender, user group, movie genre that can
contribute to rating predictions
– Contexts – User preference can change from context to context
– Social ties from Facebook, Twitter etc.
• Tensor Factorization
50. EMF - Tensor Factorization
• There could be more than 2 dimensions in ratings space – multi-dimensional
rating space
• TF can be considered as a multi-dimensional matrix factorization
• Allows as many context variables as needed to be integrated
51. Algorithms Evaluation
• ItemKNN – item based collaborative filtering
• RegSVD – SVD with regularization
• BiasedMF – Biased matrix factorization
• SVD++ - an extension of matrix factorization that also incorporates implicit feedback
52. Content Based Recommender - CBR
• Recommend an item by predicting its utility to a user
• Recommendation based on similarity to items liked by user in the past
• Recommendation based on information on content of items rather than
opinions of other users
• Each user is assumed to act independently
53. Content Based Recommender - CBR
• Recommender does not depend on having other users in the system
• User needs to provide information on her personal interests to use the
system
• The top-k best matched or most similar items are recommended to the
user
• The simplest approach is to compute the similarity of the user profile with
each item
55. CBR
• What is the « content » of an item?
– It can be explicit « attributes » or « characteristics » of the item. For example for a film:
• Action / adventure
• Features Bruce Willis
• Year 1995
• It can also be « textual content » (title, description, table of content, etc.)
– Several techniques exist to compute the distance between two textual documents
• Can be extracted from the signal itself – Audio, Video
56. CBR
• CBR systems are common for text based data
• Text documents recommended based on a comparison between their
content (words appearing) and user model (a set of preferred words)
• The user model can also be a classifier based on whatever technique
(Neural Networks, Naïve Bayes...)
57. CBR Process – TF-IDF
• A textual document is scanned and parsed
• Word occurrences are counted (may be stemmed)
• Several words or « tokens » are not taken into account. That includes
« stop words » (the, a, for), and words that do not appear often enough in
documents
• Each document is transformed into a normalized TF-IDF vector of size N (Term
Frequency / Inverse Document Frequency).
• The distance between any pair of vectors is computed
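The TF-IDF process can be sketched in a few lines of plain Python. This example makes several assumptions: whitespace tokenization, a tiny hypothetical stop-word list, no stemming, the basic `tf * log(N / df)` weighting, and cosine similarity as the comparison between vectors.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "for", "of", "and"}  # hypothetical, deliberately tiny

def tfidf_vectors(docs):
    """Turn each document into a unit-length TF-IDF vector (dict: word -> weight)."""
    tokenized = [[w for w in d.lower().split() if w not in STOP_WORDS] for d in docs]
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))  # document frequency
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {w: (c / len(toks)) * math.log(n / df[w]) for w, c in tf.items()}
        norm = math.sqrt(sum(x * x for x in vec.values()))
        vectors.append({w: x / norm for w, x in vec.items()} if norm else vec)
    return vectors

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(x * b.get(w, 0.0) for w, x in a.items())

docs = [
    "the action movie for fans of action",
    "a romantic comedy movie",
    "action thriller movie",
]
vecs = tfidf_vectors(docs)
```

The word "movie" occurs in every document and therefore gets weight zero, so the first document is similar to the third (shared "action") but not to the second.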
59. Machine Learning and Statistical Techniques
• Bayesian classifiers and machine learning techniques like clustering,
decision trees and artificial neural networks
– These methods use models learned from the underlying data rather than
heuristics.
• For example, based on a set of web pages that were rated as relevant or
irrelevant by the user, the naive Bayesian classifier can be used to classify
unrated Web pages
60. CBR - Advantages
• No need for data on other users
– No cold-start or sparsity problems
• Able to recommend to users with unique tastes
• Able to recommend new and unpopular items
– No first-rater problem
• Can provide explanations of recommended items by listing content-
features that caused an item to be recommended
61. CBR - Disadvantages
• Requires content that can be encoded as meaningful features
• Difficult to implement serendipity
• Easy to over fit (for a user with few data points we may “pigeon hole” her)
• User tastes must be represented as a learnable function of these content features
• Even for texts, IR techniques cannot consider multimedia information, aesthetic
qualities, download time…
– A positively rated page may not be related to the presence of certain keywords
• Unable to exploit quality judgments of other users
– Unless these are somehow included in the content features
62. Learning to Rank
• Recommendation is a ranking problem
• Users pay attention to few items at the top of the list
• Machine learning task - Rank the most relevant items as high as possible
in the recommendation list
• Does not try to predict a rating, but the order of preference
• Can be treated as a standard supervised classification problem
63. Learning to Rank
• Optimize ranking algorithms to give the highest scores to titles that a
member is most likely to play and enjoy
• Many other features can be added
• Goal - Find a personalized ranking function that is better than item
popularity to better satisfy users with varying tastes
• Machine learning problem goal is to construct a ranking model from
training data
64. Learning to Rank
• Training data can be a partial order or binary judgments (relevant/not
relevant)
• Resulting order of the items typically induced from a numerical score
• Learning to rank is a key element for personalization - treat the problem
as a standard supervised classification problem
65. Popularity and Predicted Rating
• Linear Model
• Score(u, v) = w1 p(v) + w2 r(u, v) + b
– u=user, v=video item, p=popularity and r=predicted rating
• Select positive and negative examples from historical data and let a
machine learning algorithm learn the weights that optimize the goal
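A small sketch of how the linear model could be used to order candidates is shown below. The weights are fixed by hand here purely for illustration, whereas the slide's point is that they would be learned from positive and negative examples; the candidate data and names are hypothetical.

```python
def score(popularity, predicted_rating, w1, w2, b=0.0):
    """Linear ranking score: w1 * p(v) + w2 * r(u, v) + b."""
    return w1 * popularity + w2 * predicted_rating + b

# video id -> (popularity p(v), predicted rating r(u, v)) for one user
candidates = {"v1": (0.9, 3.0), "v2": (0.2, 4.8), "v3": (0.5, 4.0)}
w1, w2 = 1.0, 1.0  # assumed weights, not learned

ranked = sorted(candidates, key=lambda v: score(*candidates[v], w1, w2), reverse=True)
```

With these weights the highly rated but less popular `v2` ranks first; raising `w1` relative to `w2` would move the popular `v1` up the list.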
68. Learning to Rank Approaches
• Pair wise approach to ranking
– Loss function is defined on pair-wise preferences
– The goal is to minimize the number of inversions in the resulting ranking
– Ranking problem is then transformed into the binary classification problem
– RankSVM, RankBoost, RankNet, FRank…
• List-wise approach
– Directly optimize the ranking of the whole list by using a list wise approach
– uses similarity between the ranking list and the ground truth as a loss function
69. Learning to Rank Approaches
• Point wise approach
– Ranking function minimizes loss function defined on individual relevance judgment
– Ranking score based on regression or classification
– Ordinal regression, Logistic regression, SVM, GBDT, …
• Need to use rank-specific information retrieval metrics to measure the
performance of the model
– Mean average precision
– Mean reciprocal rank
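The two metrics named above can be sketched per user as follows; averaging the per-user values over all users gives mean reciprocal rank and mean average precision. The function names and the toy ranking are assumptions for this example.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none is found."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def average_precision(ranked, relevant):
    """Mean of precision@k taken at each position k that holds a relevant item."""
    hits, total = 0, 0.0
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

ranked = ["a", "b", "c", "d"]   # recommendation list for one user
relevant = {"b", "d"}           # items the user actually liked

rr = reciprocal_rank(ranked, relevant)     # first hit at position 2
ap = average_precision(ranked, relevant)   # (1/2 + 2/4) / 2
```

Both metrics reward placing relevant items near the top, which matches the observation that users pay attention to only a few items at the head of the list.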
71. Context Types
• Physical context - time, position, user activity, weather, light, temperature
• Social context - presence and role of other people around the user
• Interaction media context - device used to access the system and the type of
media that are browsed and personalized - text, music, images, movies
• Modal context - state of mind of the user, the user’s goals, mood, experience, and
cognitive capabilities
• Traditional RS - Users × Items → Ratings
• Contextual RS - Users × Items × Contexts → Ratings
72. Context Aware Recommendation Systems
• Known as CARSs
• Pattern - user preferences change from context to context
• Necessary to adapt users’ preferences to dynamic situations
• Context is an important factor in recommendations – factor like weather
– Pre-filtering techniques
• Context information is used to select relevant portion of data
– Post filtering techniques
• Context information is used to re-rank or filter final rankings
– Contextual modelling
• Context information is used directly as part of learning preferences model
• Variations and combinations of the above methods
73. Pre-Filtering Challenges
• Context over-specification - using an exact context may be too narrow
– Watching a movie with a friend in a movie theatre on Saturday
• Certain aspects of the overly specific context may not be significant
(Saturday vs. weekend)
• Sparsity problem - overly specified context may not have enough training
examples for accurate prediction
74. Pre-Filtering Challenges
• Pre-filter generalization - different approaches
– Roll up to higher level concepts in context hierarchies – Saturday -> weekend
or movie theatre -> any location
• Use latent factors models or dimensionality reduction approaches
– Matrix factorization, LDA
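Contextual pre-filtering with roll-up can be sketched as below: ratings are first restricted to the exact context, and if too few remain the context is generalized one level up the hierarchy. The hierarchy `ROLL_UP`, the record layout and the `min_examples` threshold are all hypothetical choices for this example.

```python
# Hypothetical context hierarchy: exact day -> higher-level concept
ROLL_UP = {"Saturday": "weekend", "Sunday": "weekend"}

def prefilter(ratings, context, min_examples=2):
    """Select the ratings to train on for the given context."""
    exact = [r for r in ratings if r["day"] == context]
    if len(exact) >= min_examples:
        return exact
    parent = ROLL_UP.get(context)
    if parent is None:
        return exact  # nothing to roll up to
    # Too sparse: generalize to every context sharing the same parent concept
    return [r for r in ratings if ROLL_UP.get(r["day"]) == parent]

ratings = [
    {"user": "u1", "item": "m1", "rating": 5, "day": "Saturday"},
    {"user": "u2", "item": "m1", "rating": 4, "day": "Sunday"},
    {"user": "u3", "item": "m2", "rating": 3, "day": "Sunday"},
    {"user": "u1", "item": "m2", "rating": 2, "day": "Monday"},
]
saturday_data = prefilter(ratings, "Saturday")
```

Only one rating was given on a Saturday, so the filter falls back to all weekend ratings; the selected subset would then be passed to any ordinary two-dimensional recommender.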
75. Post Filtering
• Contextual Post-Filtering - heuristic in nature
– Basic Idea - treat the context as an additional constraint
– Many different approaches
• Filtering based on social/collaborative context representation
– mine social features - annotations, tags, tweets, reviews associated with the
item and users in a given context C
– Promote items with frequently occurring social features from C
76. Post Filtering
• Filtering based on context similarity
– Can be represented as a set of features commonly associated with the
specified context
– Adjust the recommendation list by favouring those items that have
more of the relevant features
– Similarity-based approach (but the space of features may be different
than the one describing the items)
77. Social and Trust based Recommendations
• A social recommender system recommends items that are popular in the
social proximity of the user
• A person being close in social network does not mean their judgement
can be trusted
– This idea of trust is central in social-based systems
– It can be a general per-user value that takes into account social proximity but
can also be topic-specific
78. Trust Definition
• Trust is very complex
– Involves personal background, history of interaction, context, similarity, reputation
• Sociological definitions
– Trust requires a belief and a commitment
– Tom believes Rob will provide reliable information thus Tom is willing to act on that
information
– Similar to a bet
• In the context of recommender systems, trust is generally used to describe
similarity in opinion
– Ignores authority, correctness on facts
79. Methods
• Rating prediction from a user to an item
– Using user’s web of trust
– People in web of trust are seen as trustable
– Average of all the rating scores given by trustable people, weighted by their trust
value
• Use trust as a way to give more weight to some users
• Trust for collaborative filtering
– Use trust in place of (or combined with) similarity
• Trust for sorting and filtering
– Prioritize information from trusted sources
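The trust-weighted average described in the first method can be sketched as follows. The input format, a list of `(trust_value, rating)` pairs for the people in the user's web of trust who rated the item, is an assumption for this example, as is returning `None` when no trusted rating exists.

```python
def trust_weighted_rating(trusted_ratings):
    """Average of ratings from the web of trust, weighted by trust value."""
    total_trust = sum(trust for trust, _ in trusted_ratings)
    if total_trust == 0:
        return None  # assumption: no prediction without trusted raters
    return sum(trust * rating for trust, rating in trusted_ratings) / total_trust

# Hypothetical web of trust: a fully trusted friend and a half-trusted one
prediction = trust_weighted_rating([(1.0, 4.0), (0.5, 2.0)])
```

The result lies closer to the rating of the more trusted person; replacing trust values with similarity weights would give back a plain collaborative filtering average.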
80. Using Social Data
• Social connections can be used in combination with other approaches
• Friendships can be fed into collaborative filtering methods in different
ways
– replace or modify user-user similarity by using social network information
• Algorithms
– Advogato (Levien)
– Appleseed (Ziegler and Lausen)
– MoleTrust (Massa and Avesani)
– TidalTrust (Golbeck)
81. Demographic Methods
• Aim to categorize the user based on personal attributes and make
recommendation based on demographic classes
• Demographic groups can come from marketing research – hence experts
decided how to model the users
• Demographic techniques form people-to-people correlations
• Attributes can be induced by classifying a user from other user
descriptions (e.g. the home page) – this requires some users whose
class (e.g. male/female) is known
• Prediction can use whatever learning mechanism we like (nearest
neighbour, naïve Bayes classifier...)
82. Hybrid Approaches
• Content-based and collaborative methods have complementary
strengths and weaknesses
• Combine methods to obtain the best of both
– Apply both methods and combine recommendations
– Use collaborative data as content
– Use content-based predictor as another collaborator
– Use content-based predictor to complete collaborative data
84. Hybrid Approaches
• Netflix is a good example of a hybrid system
• Recommends
– by comparing the watching and searching habits of similar users -
collaborative filtering
– offering movies that share characteristics with films that a user has
rated highly - content-based filtering
85. Hybrid Approaches - Weighted Method
• Scores of several recommenders are combined
to produce a single recommendation
• Equal weights can initially be assigned to the content and
collaborative recommenders, then gradually adjusted
as rating predictions are confirmed
86. Hybrid Approaches - Weighted Method
• Assumption: relative performance of the different techniques is uniform.
Not true in general
– Example - CF performs worse for items with few ratings
• Example
– a CB and a CF recommender equally weighted at first. Weights are adjusted as
predictions are confirmed or not.
– RS with consensus scheme - each recommendation of a specific item counts
as a vote for the item
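A weighted hybrid with adaptive weights could be sketched as below. The update rule, which shifts a fixed step of weight towards whichever recommender was closer to a confirmed rating, is an assumption made for illustration; the slides do not specify how the weights are adjusted.

```python
def weighted_hybrid(cb_score, cf_score, w_cb=0.5):
    """Combine content-based and collaborative scores; weights sum to 1."""
    return w_cb * cb_score + (1.0 - w_cb) * cf_score

def adjust_cb_weight(w_cb, cb_error, cf_error, step=0.1):
    """Move weight towards the recommender with the smaller error on a
    confirmed rating (hypothetical rule), keeping the weight in [0, 1]."""
    if cb_error < cf_error:
        w_cb += step
    elif cf_error < cb_error:
        w_cb -= step
    return min(max(w_cb, 0.0), 1.0)

combined = weighted_hybrid(4.0, 2.0)             # equal weights at first
# The user then confirms a rating: CB was off by 0.2, CF by 1.0
new_w_cb = adjust_cb_weight(0.5, cb_error=0.2, cf_error=1.0)
```

After the update the content-based recommender carries more weight, which reflects the caveat that the two techniques do not perform uniformly across items.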
87. Hybrid Approaches
• Switching Method - use some criterion to switch between
recommendation techniques
– The main problem is to identify a good switching criterion
– A system using CB-CF, when CB cannot predict with sufficient confidence
switch to CF
• Mixed Method – used when a large number of recommendations can be made simultaneously
– Recommendations from several techniques are presented at the same time
88. Hybrid Approaches - Cascade
• Involves a staged process
• One recommendation technique is employed to produce a ranking of
candidates and a second technique refines the recommendation from the
candidate set
• At each iteration, a first recommendation technique produces a coarse
ranking & a second technique refines the recommendation
89. Hybrid Approaches - Cascade
• Avoids employing the second, lower-priority technique on items already
well-differentiated by the first
• Requires a meaningful ordering of the techniques
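One simple way to realize a cascade is to sort by the first technique's coarse score and let the second technique only break ties, as in the sketch below. The score dictionaries and the function name are hypothetical, and real cascades may instead re-rank only a top slice of the candidate set.

```python
def cascade(candidates, coarse_score, fine_score):
    """Rank by the higher-priority coarse score; the second technique's
    fine score only decides between items the first one left tied."""
    return sorted(candidates,
                  key=lambda c: (coarse_score[c], fine_score[c]),
                  reverse=True)

candidates = ["a", "b", "c"]
coarse_score = {"a": 2, "b": 2, "c": 1}        # first technique: coarse buckets
fine_score = {"a": 0.3, "b": 0.9, "c": 1.0}    # second technique: refinement

ranking = cascade(candidates, coarse_score, fine_score)
```

Item `c` has the best fine score but stays last, because the first technique already separated it from `a` and `b`; the second technique only settles the tie between those two.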
90. Hybrid Approaches - Feature Combination
• Features from different recommendation data sources are combined
in a single recommendation algorithm.
– Allows the system to consider collaborative data without relying on it exclusively, so
it reduces the sensitivity of the system to the number of users who have rated
an item.
– Conversely, it lets the system have information about the inherent similarity
of items that are otherwise opaque to a collaborative system.
91. Hybrid Approaches - Feature Augmentation
• Used to improve the performance of a core system.
• One technique is used to produce a rating of an item and that information is then
incorporated into the processing of the next recommendation technique
• Difference between cascade and feature augmentation
– in feature augmentation, the feature used by the second recommender is the
output of the first one
– in cascading, the second recommender does not use the output of the first one, but the results
of the two recommenders are combined in a prioritized manner
92. Hybrid Approaches - Meta-level
• Two recommendation techniques can be merged by using the model
generated by one as the input for another
• Difference between the meta-level and augmentation
– in augmentation output of first recommender is used as input for second one
– in meta-level the entire model is used as an input for the second one
93. Summary
• For many applications such as Recommender Systems (but also Search,
Advertising, and even Networks) understanding data and users is vital
• Algorithms can only be as good as the data they use as input
– But the inverse is also true: you need a good algorithm to leverage your data
• Importance of User/Data Mining is going to be a growing trend in many areas in
the coming years
• Recommender Systems (RS) is an important application of User Mining
94. Summary
• RS have the potential to become as important as Search is now
• RS are fairly new but already grounded on well proven technology
– Collaborative Filtering
– Machine Learning
– Content Analysis
– Social Network Analysis
– …
• However, there are still many open questions and a lot of interesting research to do!