Recommender Systems
for
E-Commerce
Girish Khanzode
Content
• Recommendation Problem
• Recommender Approaches
• Recommender Algorithms
• Collaborative Filtering – CF
• Nearest Neighbor Methods – kNN
• Item Based CF
• Clustering
• Association Rule Based CF
• Classification
• Data Sparsity Challenges
• Scalability Challenges
• Performance Implications
• CF - Matrix Factorization
• Content Based Recommender – CBR
• Machine Learning and Statistical Techniques
• Learning to Rank
• Context Aware Recommendation Systems
• Social and Trust Based Recommendations
• Hybrid Approaches
• Summary
• References
Recommender Systems
• Estimate a utility function to predict how a user will like an item
• Systems to recommend items and services of likely interest to the user
based on their preferences
• Compare user’s profile to some reference characteristics to predict
whether the user would be interested in an unseen item
• Helps users deal with the information overload
• Books, movies, CDs, web pages, news
• On-line stores like Amazon
• Substantial sales improvements
Information Sources Used
• Browsing and searching data
• Purchase data
• Feedback explicitly provided by the users
• Textual comments
• Expert recommendations
• Demographic data
Recommendation Problem
• Let U be the set of users and S the set of items to be recommended
• Let p be a utility function that measures the usefulness of item s ∈ S to user u ∈ U
– p: U × S → R, where R is a totally ordered set (non-negative integers or real numbers in a range)
• Objective
– Learn p from past data
– Use p to predict the utility value of each item s ∈ S for each user u ∈ U
Recommender Tasks
• Rating prediction - predict a rating that a user is likely to give
to an item not seen before
• Item prediction -Top N - predict a ranked list of items that a
user is likely to buy
Good Recommender
• Does not recommend items the user already knows or would
have found anyway
• Expands the user's taste into neighbouring areas - Serendipity
= Unsought finding
• Diverse - represents all the possible interests of one user
• Relevant to the user - personalized
User Profiles
• Most recommender systems use a profile of interests of a user
• History of user is used as training data for a machine learning algorithm
• History is used to create a user model
• Contents
– A model of user’s preferences - a function for any item that predicts the
likelihood that the user is interested in that item
– User’s interaction history - items viewed by a user, items purchased by a user,
search queries
User Profiles
• Manual recommendation - user customization
– Provide a check box interface to allow users to construct their own profiles of
interests
– A simple database matching process finds items that meet the specified
criteria and recommends them to users
• Limitations
– Requires effort from users
– Cannot cope with changes in the user's interests
– Does not provide a way to determine an order among the recommended items
User Model
• Creating a model of the user's preferences from their history is a form of
classification learning
• The training data (user history) can be captured through
– explicit feedback (the user rates items)
– implicit observation of the user's interactions (e.g. a user who bought an item and
later returned it is a sign that the user does not like the item)
User Model
• The implicit method can collect large amounts of data but may contain noise,
while data collected through the explicit method is accurate but the amount
collected may be limited
• A number of classification learning algorithms can be considered
– Main goal - learn a function that models the user's interests
– Applying the function to a new item gives the probability that the user will like
this item, or a numeric value indicating the degree of interest in this item
Recommender Approaches
• Collaborative Filtering - recommends items that people with similar tastes and preferences liked in the past
• Content Based - recommends items similar to ones that the user preferred in the past
• Personalized Learning to Rank - treats recommendation as a ranking problem
• Social Recommendations - trust based
• Hybrid - combination of the above
Evolution of Recommender Systems
• Item hierarchy - "You bought a camera, you will also need film"
• Attribute based - "You like action movies starring Clint Eastwood, you will also like The Good, the Bad and the Ugly"
• Collaborative filtering, user-user similarity - "People like you who bought milk also bought bread"
• Collaborative filtering, item-item similarity - "You like Spiderman so you will like Superman"
• Social + interest graph based - "Your friend likes Ford cars, you will also like Ford cars"
• Model based - training SVM, SVD, LDA for implicit features
Recommender Algorithms
Collaborative Filtering - CF
CF
• Recommends items by computing the similarity between the preferences of a user and
other like-minded people
• Assumption - personal tastes are correlated
• These systems ignore the content of the items being recommended
• They are able to suggest new items to users whose preferences are similar to
others
• Maintains a database of ratings of items by many users
• Predicts the utility of items for a user based on the items previously rated by other
like-minded users
CF Methods
• Memory Based
– Memory-based algorithms operate on the entire user-item rating matrix and generate
recommendations by identifying the neighbourhood of the target user to whom the
recommendations will be made, based on the agreement of the user's past ratings
– Easy to understand, easy to implement and work well in many real-world situations
– The most serious problem is the sparsity of the user-item rating matrix
– Another problem of memory-based CF is efficiency - it is not computationally feasible for
ad-hoc recommender systems with millions of users and items
– Remedies: clustering, matrix factorization, machine learning on the graph
• Model Based
– Model-based techniques use the rating data to train a model, and the model is then used
to derive the recommendations
CF - Approaches
• K-nearest neighbor (kNN)
– User based methods
• Neighborhood formation phase
• Recommendation phase
– Item based methods
• Recommendation phase
• Association rules based prediction
• Matrix factorization
CF - Process
• Weight all users with respect to similarity with the active user
• Select a subset of the users (neighbors) to use as predictors
• Normalize ratings and compute a prediction from a weighted
combination of the selected neighbors’ ratings
• Present items with highest predicted ratings as recommendations
Nearest Neighbor Methods - kNN
• Memory based - stores all the training data in memory
• To classify a new item, compare it to all stored items using a similarity
function and determine the nearest neighbour or k nearest neighbours
• Class or numeric score of the previously unseen item can then be derived
from the class of the nearest neighbour
• Utilizes entire user-item database to generate predictions directly
Nearest Neighbor Methods - kNN
• No model building
• Similarity function depends on the type of data
– Structured data - Euclidean distance metric
– Unstructured data (free text) - cosine similarity function
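As an illustration (not part of the original slides), a minimal sketch of the cosine similarity function used for unstructured data; the term-count vectors are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical term-count vectors for two documents
doc1 = [2, 0, 1, 3]
doc2 = [1, 1, 0, 2]
print(cosine_similarity(doc1, doc2))  # ~0.87
```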
Neighbor Selection
• For a given active user a, select correlated users to serve as source of
predictions
• Standard approach is to use the n most similar users u, based on the
similarity weights w_{a,u}
• Alternate approach is to include all users whose similarity weight is above
a given threshold
User Based CF - Neighborhood Formation
• Let the record (or profile) of the target user be u (represented
as a vector) and the record of another user be v (v ∈ T, the set of training users)
• Calculate the similarity between the target user u and a neighbour
v using Pearson's correlation coefficient, over the set C of items rated by both users

sim(u, v) = \frac{\sum_{i \in C} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in C} (r_{u,i} - \bar{r}_u)^2} \sqrt{\sum_{i \in C} (r_{v,i} - \bar{r}_v)^2}}
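A minimal sketch (assumed, not from the slides) of the Pearson correlation above, with ratings stored as per-user dicts so that only the co-rated items C enter the sums; for simplicity the user means are taken over the co-rated items.

```python
import math

def pearson_sim(ratings_u, ratings_v):
    """Pearson correlation over the items C rated by both users."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common)) * \
          math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0

# Hypothetical ratings: item id -> rating
u = {"i1": 5, "i2": 3, "i3": 4}
v = {"i1": 4, "i2": 2, "i3": 5, "i4": 1}
print(pearson_sim(u, v))  # positive value: similar taste on the co-rated items
```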
User Based CF - Recommendation Phase
• Compute the rating prediction of item i for target user u,
where V is the set of the k most similar users and r_{v,i} is the rating that user v
gave to item i

p(u, i) = \bar{r}_u + \frac{\sum_{v \in V} sim(u, v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in V} sim(u, v)}
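A sketch of the prediction step, under the assumption that the k neighbours and their similarities have already been computed (for example with the Pearson function sketched earlier); the neighbourhood data is hypothetical.

```python
def predict_user_based(target_mean, neighbours):
    """p(u, i) = r_u_mean + sum(sim * (r_vi - r_v_mean)) / sum(sim).

    `neighbours` is a list of (sim, r_vi, r_v_mean) tuples for the k most
    similar users who rated item i. abs() in the denominator is a common
    guard against negative similarities; with all-positive weights it
    matches the formula above.
    """
    num = sum(sim * (r_vi - r_v_mean) for sim, r_vi, r_v_mean in neighbours)
    den = sum(abs(sim) for sim, _, _ in neighbours)
    return target_mean if den == 0 else target_mean + num / den

# Hypothetical neighbourhood for one (user, item) pair
print(predict_user_based(3.5, [(0.9, 4, 3.0), (0.4, 2, 2.5)]))  # ~4.04
```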
Personalized vs. Non-Personalized
• CF recommendations are personalized since the prediction is
based on the ratings expressed by similar users
– Those neighbours are different for each target user
• A non-personalized collaborative-based recommendation
can be generated by averaging the recommendations of ALL
the users
User Based CF - Issues
• Lack of scalability - requires real-time comparison of the target user to all user records in
order to generate predictions
• Difficult to make predictions with nearest neighbour algorithms - accuracy of
recommendations may be poor
• Sparsity - with large item sets, user purchases cover under 5% of the items
• Poor relationships among like-minded but sparse-rating users
• Solutions
– Use latent models to capture the similarity between users and items in a reduced dimensional space
– A variation of this approach that remedies the problem is called item-based CF
Item Based CF
Item Based CF
• Rather than matching the user to similar customers, build a similar-items
table by finding items that customers tend to purchase together
• Amazon.com used this method
• Scales independently of the catalog size or the total number of customers
• Acceptable performance is obtained by creating the expensive similar-items table
offline
Item Based CF
• The item-based approach works by comparing items based
on their pattern of ratings across users. The similarity of
items i and j is computed as follows, where U is the set of users who rated both items

sim(i, j) = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2} \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}
Item Based CF - Recommendation Phase
• After computing the similarity between items, select the set J of the k items
most similar to the target item and generate a predicted value of user u's rating

p(u, i) = \frac{\sum_{j \in J} r_{u,j}\, sim(i, j)}{\sum_{j \in J} sim(i, j)}

• where J is the set of the k similar items rated by user u
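A minimal sketch of the item-based prediction step, assuming the item-item similarities were already computed offline (the expensive step mentioned earlier); similarities and ratings here are hypothetical.

```python
def predict_item_based(user_ratings, item_sims):
    """p(u, i) = sum(r_uj * sim(i, j)) / sum(sim(i, j)) over the similar
    items j that user u has rated.

    user_ratings: dict item -> rating for the target user
    item_sims:    dict item j -> sim(i, j) for the k neighbours of item i
    """
    rated = [j for j in item_sims if j in user_ratings]
    den = sum(item_sims[j] for j in rated)
    if den == 0:
        return None  # no overlap, cannot predict
    return sum(user_ratings[j] * item_sims[j] for j in rated) / den

# Hypothetical: neighbours of the target item and the user's ratings of them
print(predict_item_based({"i1": 5, "i2": 3}, {"i1": 0.8, "i2": 0.3, "i9": 0.6}))  # ~4.45
```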
Model Based CF Algorithms
• Models are learned from the underlying data rather than heuristics
• Models of user ratings (or purchases)
– Clustering (classification)
– Association rules
– Matrix Factorization
– Restricted Boltzmann Machines
– Other models
• Bayesian network (probabilistic)
• Probabilistic Latent Semantic Analysis ...
Clustering
• Another technique for recommending based on past purchases is to cluster
customers for collaborative filtering
• Each cluster is assigned typical preferences, based on the preferences of customers
belonging to that cluster
• Customers within each cluster receive recommendations computed at the cluster
level
Clustering
• Pros
– Clustering techniques can be used to work on aggregated data
– Can also be applied as a first step for shrinking the selection of relevant neighbours in a
collaborative filtering algorithm and improve performance
– Can be used to capture latent similarities between users or items
• Cons
– Recommendations (per cluster) may be less relevant than collaborative filtering (per
individual)
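A sketch of the cluster-level recommendation described above, under stated assumptions: users are represented by dense rating vectors (unrated items imputed as 0) and each cluster recommends its highest average-rated items; scikit-learn's k-means is assumed to be available.

```python
import numpy as np
from sklearn.cluster import KMeans  # assumed available

# Hypothetical user x item rating matrix (0 = unrated)
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [0, 1, 5, 4],
              [1, 0, 4, 5]], dtype=float)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(R)

# Each cluster recommends the items with the highest mean rating inside it
for c in np.unique(labels):
    cluster_mean = R[labels == c].mean(axis=0)
    top_items = np.argsort(-cluster_mean)[:2]
    print(f"cluster {c}: recommend items {top_items.tolist()}")
```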
Item Based CF - Issues
• Bottleneck is the similarity computation
– Time complexity - highly time consuming with millions of users and items in
the database
• Isolate the neighbourhood generation and prediction steps
• Off-line component - similarity computation done earlier and stored in memory
• On-line component - prediction generation process
Two Step Process
Association Rule Based CF
• Association rules used for
recommendation
• Each transaction for association rule
mining is the set of items bought by
a particular user
• Find item association rules: buy_X,
buy_Y -> buy_Z
• Rank items based on measures such
as confidence
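A toy sketch (not the full Apriori algorithm) of the confidence-based ranking described above: given past baskets, candidate items are scored by the confidence of the rule basket -> item; the transactions are hypothetical.

```python
def rule_confidence(baskets, antecedent, item):
    """confidence(antecedent -> item) = support(antecedent + item) / support(antecedent)."""
    antecedent = set(antecedent)
    with_ant = [b for b in baskets if antecedent <= set(b)]
    if not with_ant:
        return 0.0
    return sum(item in b for b in with_ant) / len(with_ant)

# Hypothetical transactions
baskets = [{"milk", "bread", "eggs"}, {"milk", "bread"}, {"milk", "beer"}, {"bread", "butter"}]
user_basket = {"milk"}
candidates = {"bread", "beer", "butter"}
ranked = sorted(candidates, key=lambda z: rule_confidence(baskets, user_basket, z), reverse=True)
print(ranked)  # items ranked by confidence(milk -> item)
```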
Association Rules
• Pros
– Requires very little storage space
– Quick to implement
– Fast to execute
– Not individualized
– Very successful in broad applications for large populations, such as shelf layout in retail stores
• Cons
– Not suitable if knowledge of preferences changes rapidly
– It is tempting not to apply restrictive confidence rules, which may lead to nonsensical
recommendations
Classification
• Classifiers are computational models trained using positive and negative
examples
• They may take as inputs
– Vector of item features (action / adventure, Tom Hanks)
– Preferences of customers (likes comedy / action)
– Relations among items
• Logistic Regression, Bayesian Networks, Support Vector Machines, Decision Trees...
• Used in CF and CB recommenders
Classification
• Pros
– Can be combined with other methods to improve accuracy of
recommendations
– Versatile
• Cons
– May overfit (needs regularization)
– Needs relevant training data
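A sketch, assuming scikit-learn is available, of a per-user classifier of the kind described above: logistic regression over hand-made item feature vectors predicting like/dislike; the features and labels are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # assumed available

# Hypothetical item features: [is_action, is_comedy, stars_tom_hanks]
X = np.array([[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1], [1, 1, 0]])
y = np.array([1, 1, 0, 0, 1])  # 1 = the user liked the item

clf = LogisticRegression().fit(X, y)

new_item = np.array([[0, 1, 1]])          # a comedy starring Tom Hanks
print(clf.predict_proba(new_item)[0, 1])  # probability the user will like it
```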
CF - Benefits and Challenges
• Benefits
– Very powerful and efficient
– Highly relevant recommendations
• The bigger the database and the more past behaviour, the better the recommendations
• Challenges
– Cold Start: Needs to have enough other users in the system to find a match
– Sparsity: Despite many users, user/ratings matrix can be sparse, making it hard to find users that have rated
items
– First Rater: Cannot recommend an item that has not been previously rated
• New items
• Esoteric items
– Popularity Bias: Cannot recommend items to someone with unique tastes
• Tends to recommend popular items
Data Sparsity Challenges
• Netflix Prize rating data in a User/Movie matrix
– 500,000 x 17,000 = 8,500 M positions
– Out of which only 100M have data
• Typically - large product sets, user ratings for a small percentage of them
– Example Amazon: millions of books and a user may have bought hundreds of
books – the probability that two users that have bought 100 books have a
common book (in a catalogue of 1 million books) is 0.01 (with 50 and 10
millions is 0.0002)
Data Sparsity Challenges
• Standard CF must have a number of users comparable to one
tenth of the size of the product catalogue
• Methods of dimensionality reduction
– Matrix Factorization
– Projection (PCA ...)
– Clustering
Scalability Challenges
• Nearest neighbour algorithms require computations that grow with both the
number of customers and products
• With millions of customers and products a web-based recommender can suffer
serious scalability problems
• The worst case complexity is O(mn) (m customers and n products)
• But in practice the complexity is O(m + n) since for each customer only a small
number of products are considered
• Some clustering techniques like K-means can help
Performance Implications
• Item-based similarity is static - Enables pre-computing of
item-item similarity
– prediction process involves only a table lookup for the similarity
values & computation of the weighted sum
• User-based CF – similarity between users is dynamic, pre-
computing user neighbourhood can lead to poor predictions
Matrix Factorization
• Provides superior performance in recommendation quality and scalability
• Approximates ratings matrix to a product of low rank matrices
• Decompose a matrix M into the product of several factor matrices, where
n can be any number - usually 2 or 3

M \approx F_1 F_2 \cdots F_n
CF - Matrix Factorization
• Matrix factorization is a latent factor model
• Latent variables (also called features, aspects, or factors) are introduced to account for the
underlying reasons of a user purchasing or using a product
– Connections between the latent variables and observed variables (user, product, rating, etc.) are
estimated during the training
– Recommendations made by computing possible interactions with each product through the latent
variables
• Netflix Prize contest for movie recommendation used a Singular Value Decomposition
(SVD) based algorithm
• The prize winning method employed an adapted version of SVD
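A minimal SGD matrix-factorization sketch (a simplified stand-in for the SVD-style methods used in the Netflix Prize, not the prize-winning algorithm): learn user and item factor matrices P and Q so that P·Qᵀ approximates the observed ratings; the data and hyperparameters are hypothetical.

```python
import numpy as np

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.1, epochs=200, seed=0):
    """ratings: list of (user, item, rating); returns factors P (users x k), Q (items x k)."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]                 # prediction error on this rating
            P[u] += lr * (err * Q[i] - reg * P[u])  # gradient step with L2 regularization
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 4), (2, 2, 5)]
P, Q = factorize(ratings, n_users=3, n_items=3)
print(P[0] @ Q[2])  # predicted rating of user 0 for the unseen item 2
```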
Matrix Factorization Trends
• Dimension reduction techniques - PCA and SVD
• SVD based matrix factorization
• Basic matrix factorization
• Extended matrix factorization
Extended Matrix Factorization (EMF)
• According to purpose, it can be categorized into the following contexts
• Adding biases to matrix factorization
– User bias - if a user rates strictly, their ratings are always lower
– Item bias - a popular movie always has higher ratings
EMF
• Adding other influential factors
– Temporal effects – very old ratings have less impact on rating predictions
• Algorithm - Time SVD++
– Content Profile
• Users or items share same or similar content like gender, user group, movie genre that can
contribute to rating predictions
– Contexts – User preference can change from context to context
– Social ties from Facebook, Twitter, etc.
• Tensor Factorization
EMF -Tensor Factorization
• There could be more than 2 dimensions in ratings space – multi-dimensional
rating space
• TF can be considered as a multi-dimensional matrix factorization
• Allows as many context variables as needed to be integrated
Algorithms Evaluation
• ItemKNN – item based collaborative filtering
• RegSVD – SVD with regularization
• BiasedMF – Biased matrix factorization
• SVD++ - an extension of matrix factorization that also models implicit feedback
Content Based Recommender - CBR
• Recommend an item by predicting its utility to a user
• Recommendation based on similarity to items liked by user in the past
• Recommendation based on information on content of items rather than
opinions of other users
• Each user is assumed to act independently
Content Based Recommender - CBR
• Recommender does not depend on having other users in the system
• User needs to provide information on her personal interests to use the
system
• The top-k best matched or most similar items are recommended to the
user
• The simplest approach is to compute the similarity of the user profile with
each item
CBR
CBR
• What is the "content" of an item?
– It can be explicit "attributes" or "characteristics" of the item. For example, for a film:
• Action / adventure
• Featuring Bruce Willis
• Year 1995
• It can also be "textual content" (title, description, table of contents, etc.)
– Several techniques exist to compute the distance between two textual documents
• It can be extracted from the signal itself - audio, video
CBR
• CBR systems are common for text based data
• Text documents recommended based on a comparison between their
content (words appearing) and user model (a set of preferred words)
• The user model can also be a classifier based on any technique
(Neural Networks, Naïve Bayes...)
CBR Process – TF-IDF
• A textual document is scanned and parsed
• Word occurrences are counted (words may be stemmed)
• Several words or "tokens" are not taken into account - these include
"stop words" (the, a, for) and words that do not appear often enough in
documents
• Each document is transformed into a normed TF-IDF vector of size N (Term
Frequency / Inverse Document Frequency)
• The distance between any pair of vectors is computed
CBR Process – TF-IDF
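A sketch of the TF-IDF pipeline above using scikit-learn (assumed available); the item descriptions and the user profile text are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer  # assumed available
from sklearn.metrics.pairwise import cosine_similarity

items = ["action thriller with explosions",
         "romantic comedy in paris",
         "space action adventure epic"]
user_profile = "likes action adventure movies"  # built from previously liked items

vec = TfidfVectorizer(stop_words="english")     # stop words removed, TF-IDF weighting
item_vectors = vec.fit_transform(items)
profile_vector = vec.transform([user_profile])

scores = cosine_similarity(profile_vector, item_vectors)[0]
ranking = scores.argsort()[::-1]
print(ranking)  # item indices, most similar to the profile first
```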
Machine Learning and Statistical Techniques
• Bayesian classifiers and machine learning techniques like clustering,
decision trees and artificial neural networks
– These methods use models learned from the underlying data rather than
heuristics
• For example, based on a set of web pages that were rated as relevant or
irrelevant by the user, a naive Bayesian classifier can be used to classify
unrated web pages
CBR - Advantages
• No need for data on other users
– No cold-start or sparsity problems
• Able to recommend to users with unique tastes
• Able to recommend new and unpopular items
– No first-rater problem
• Can provide explanations of recommended items by listing content-
features that caused an item to be recommended
CBR - Disadvantages
• Requires content that can be encoded as meaningful features
• Difficult to implement serendipity
• Easy to overfit (for a user with few data points we may "pigeonhole" her)
• User tastes must be represented as a learnable function of these content features
• Even for texts, IR techniques cannot consider multimedia information, aesthetic
qualities, download time...
– A positively rated page may not be related to the presence of certain keywords
• Unable to exploit quality judgments of other users
– Unless these are somehow included in the content features
Learning to Rank
• Recommendation is a ranking problem
• Users pay attention to few items at the top of the list
• Machine learning task - Rank the most relevant items as high as possible
in the recommendation list
• Does not try to predict a rating, but the order of preference
• Can be treated as a standard supervised classification problem
Learning to Rank
• Optimize ranking algorithms to give the highest scores to titles that a
member is most likely to play and enjoy
• Many other features can be added
• Goal - Find a personalized ranking function that is better than item
popularity to better satisfy users with varying tastes
• Machine learning problem goal is to construct a ranking model from
training data
Learning to Rank
• Training data can be a partial order or binary judgments (relevant/not
relevant)
• Resulting order of the items typically induced from a numerical score
• Learning to rank is a key element for personalization - treat the problem
as a standard supervised classification problem
Popularity and Predicted Rating
• Linear Model
• Score(u, v) = w_1 p(v) + w_2 r(u, v) + b
– u = user, v = video item, p = popularity and r = predicted rating
• Select positive and negative examples from historical data and let a
machine learning algorithm learn the weights that optimize the goal
Popularity and Predicted Rating
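A sketch of the linear ranking model above: learn the weights w_1, w_2 from historical positive/negative examples (here with scikit-learn logistic regression, an assumption) and score candidates by w_1·popularity + w_2·predicted rating; all data is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # assumed available

# Hypothetical training examples: [popularity p(v), predicted rating r(u, v)] -> played or not
X = np.array([[0.9, 2.0], [0.2, 4.5], [0.8, 4.0], [0.1, 2.5], [0.6, 3.5]])
y = np.array([0, 1, 1, 0, 1])

model = LogisticRegression().fit(X, y)
w1, w2 = model.coef_[0]
b = model.intercept_[0]

def score(popularity, predicted_rating):
    """Score(u, v) = w1 * p(v) + w2 * r(u, v) + b"""
    return w1 * popularity + w2 * predicted_rating + b

candidates = {"video_a": (0.9, 3.0), "video_b": (0.3, 4.8)}
print(sorted(candidates, key=lambda v: score(*candidates[v]), reverse=True))
```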
Netflix – Ranking Improvements
Learning to Rank Approaches
• Pairwise approach to ranking
– Loss function is defined on pairwise preferences
– The goal is to minimize the number of inversions in the resulting ranking
– The ranking problem is then transformed into a binary classification problem
– RankSVM, RankBoost, RankNet, FRank...
• List-wise approach
– Directly optimizes the ranking of the whole list
– Uses the similarity between the ranking list and the ground truth as a loss function
Learning to Rank Approaches
• Pointwise approach
– Ranking function minimizes a loss function defined on individual relevance judgments
– Ranking score based on regression or classification
– Ordinal regression, logistic regression, SVM, GBDT, ...
• Need to use rank-specific information retrieval metrics to measure the
performance of the model
– Mean average precision
– Mean reciprocal rank
Diversity in Recommendation – Re-Ranking
(Recommender output → re-ranked results)
ContextTypes
• Physical context - time, position, user activity, weather, light, temperature
• Social context - presence and role of other people around the user
• Interaction media context - device used to access the system and the type of
media that are browsed and personalized - text, music, images, movies
• Modal context - state of mind of the user, the user’s goals, mood, experience, and
cognitive capabilities
• Traditional RS - Users × Items → Ratings
• Contextual RS - Users × Items × Contexts → Ratings
Context Aware Recommendation Systems
• Known as CARSs
• Pattern - user preferences change from context to context
• Necessary to adapt users’ preferences to dynamic situations
• Context is an important factor in recommendations – factor like weather
– Pre-filtering techniques
• Context information is used to select relevant portion of data
– Post filtering techniques
• Context information is used to re-rank or filter final rankings
– Contextual modelling
• Context information is used directly as part of learning preferences model
• Variations and combinations of the above methods
Pre-Filtering Challenges
• Context over-specification - using an exact context may be too narrow
– Watching a movie with a friend in a movie theatre on Saturday
• Certain aspects of the overly specific context may not be significant
(Saturday vs. weekend)
• Sparsity problem - overly specified context may not have enough training
examples for accurate prediction
Pre-Filtering Challenges
• Different approaches to pre-filter generalization
– Roll up to higher level concepts in context hierarchies - Saturday -> weekend,
or movie theatre -> any location
• Use latent factor models or dimensionality reduction approaches
– Matrix factorization, LDA
Post Filtering
• Contextual Post-Filtering - heuristic in nature
– Basic Idea - treat the context as an additional constraint
– Many different approaches
• Filtering based on social/collaborative context representation
– mine social features - annotations, tags, tweets, reviews associated with the
item and users in a given context C
– Promote items with frequently occurring social features from C
Post Filtering
• Filtering based on context similarity
– Can be represented as a set of features commonly associated with the
specified context
– Adjust the recommendation list by favouring those items that have
more of the relevant features
– Similarity-based approach (but the space of features may be different
than the one describing the items)
Social and Trust Based Recommendations
• A social recommender system recommends items that are popular in the
social proximity of the user
• A person being close in social network does not mean their judgement
can be trusted
– This idea of trust is central in social-based systems
– It can be a general per-user value that takes into account social proximity but
can also be topic-specific
Trust Definition
• Trust is very complex
– Involves personal background, history of interaction, context, similarity, reputation
• Sociological definitions
– Trust requires a belief and a commitment
– Tom believes Rob will provide reliable information, thus Tom is willing to act on that
information
– Similar to a bet
• In the context of recommender systems, trust is generally used to describe
similarity in opinion
– Ignores authority, correctness on facts
Methods
• Rating prediction from a user to an item
– Using user’s web of trust
– People in web of trust are seen as trustable
– Average of all the rating scores given by trustable people, weighted by their trust
value
• Use trust as a way to give more weight to some users
• Trust for collaborative filtering
– Use trust in place of (or combined with) similarity
• Trust for sorting and filtering
– Prioritize information from trusted sources
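A sketch of the web-of-trust prediction described above: average the ratings given by trusted users, weighted by their trust values; the trust values and ratings are hypothetical.

```python
def trust_weighted_rating(trust, ratings, item):
    """Weighted average of the ratings for `item` from users in the web of trust.

    trust:   dict user -> trust value in [0, 1]
    ratings: dict user -> {item: rating}
    """
    pairs = [(t, ratings[u][item]) for u, t in trust.items()
             if u in ratings and item in ratings[u]]
    total = sum(t for t, _ in pairs)
    return None if total == 0 else sum(t * r for t, r in pairs) / total

trust = {"alice": 0.9, "bob": 0.4}
ratings = {"alice": {"movie_x": 5}, "bob": {"movie_x": 2}}
print(trust_weighted_rating(trust, ratings, "movie_x"))  # ~4.08, closer to Alice's rating
```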
Using Social Data
• Social connections can be used in combination with other approaches
• Friendships can be fed into collaborative filtering methods in different
ways
– replace or modify user-user similarity by using social network information
• Algorithms
– Advogato (Levien)
– Appleseed (Ziegler and Lausen)
– MoleTrust (Massa and Avesani)
– TidalTrust (Golbeck)
Demographic Methods
• Aim to categorize the user based on personal attributes and make
recommendation based on demographic classes
• Demographic groups can come from marketing research – hence experts
decided how to model the users
• Demographic techniques form people-to-people correlations
• Attributes can be induced by classifying a user using other user
descriptions (the home page) - you need some users for which you know
the class (male / female)
• Prediction can use whatever learning mechanism we like (nearest
neighbour, naïve classifier...)
Hybrid Approaches
• Content-based and collaborative methods have complementary
strengths and weaknesses
• Combine methods to obtain the best of both
– Apply both methods and combine recommendations
– Use collaborative data as content
– Use content-based predictor as another collaborator
– Use content-based predictor to complete collaborative data
Hybrid Approaches
Hybrid Approaches
• Netflix is a good example of a hybrid system
• It recommends
– by comparing the watching and searching habits of similar users -
collaborative filtering
– by offering movies that share characteristics with films that a user has
rated highly - content-based filtering
Hybrid Approaches -Weighted Method
• The scores of several recommenders are combined together
to produce a single recommendation
• Equal weight can be assigned to the content and collaborative
recommenders at first, and the weights are gradually adjusted
as predictions of ratings are confirmed
Hybrid Approaches -Weighted Method
• Assumption: the relative performance of the different techniques is uniform -
not true in general
– Example - CF performs worse for items with few ratings
• Example
– A CB and a CF recommender are equally weighted at first; weights are adjusted as
predictions are confirmed or not
– RS with a consensus scheme - each recommendation of a specific item counts
as a vote for the item
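A sketch of the weighted hybrid under stated assumptions: start with equal weights on a content-based and a collaborative score and shift weight toward whichever predictor has been more accurate so far (the update rule here is an illustrative choice, not a prescribed one).

```python
class WeightedHybrid:
    """Combine a CB and a CF score with weights adjusted as predictions are confirmed."""

    def __init__(self):
        self.w_cb, self.w_cf = 0.5, 0.5  # equal weight at first

    def score(self, cb_score, cf_score):
        return self.w_cb * cb_score + self.w_cf * cf_score

    def feedback(self, cb_error, cf_error, lr=0.05):
        """Shift weight toward the recommender with the smaller error, keeping the sum at 1."""
        if cb_error < cf_error:
            self.w_cb = min(1.0, self.w_cb + lr)
        elif cf_error < cb_error:
            self.w_cb = max(0.0, self.w_cb - lr)
        self.w_cf = 1.0 - self.w_cb

hybrid = WeightedHybrid()
print(hybrid.score(cb_score=0.7, cf_score=0.4))  # 0.55 with equal weights
hybrid.feedback(cb_error=0.1, cf_error=0.6)      # CB was closer, so its weight grows
print(hybrid.w_cb, hybrid.w_cf)                  # 0.55 / 0.45
```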
Hybrid Approaches
• Switching Method - use some criterion to switch between
recommendation techniques
– The main problem is to identify a good switching criterion
– A system using CB-CF switches to CF when CB cannot predict with sufficient
confidence
• Mixed Method - used when a large number of recommendations can be presented
– Recommendations from several techniques are presented at the same time
Hybrid Approaches - Cascade
• Involves a staged process
• One recommendation technique is employed to produce a ranking of
candidates and a second technique refines the recommendation from the
candidate set
• At each iteration, the first recommendation technique produces a coarse
ranking and the second technique refines the recommendation
Hybrid Approaches - Cascade
• Avoids employing the second, lower-priority technique on items already
well-differentiated by the first
• Requires a meaningful ordering of the techniques
Hybrid Approaches - Feature Combination
• Features from different recommendation data sources are used together
in a single recommendation algorithm
– Allows the system to consider collaborative data without relying on it exclusively, so
it reduces the sensitivity of the system to the number of users who have rated
an item
– Conversely, it lets the system use information about the inherent similarity
of items that is otherwise opaque to a collaborative system
Hybrid Approaches - Feature Augmentation
• Used to improve the performance of a core system
• One technique is used to produce a rating of an item, and that information is then
incorporated into the processing of the next recommendation technique
• Difference between cascade and augmentation
– In feature augmentation, the feature used by the second recommender is the
output of the first one
– In cascading, the second recommender does not use the output of the first one; the results
of the two recommenders are combined in a prioritized manner
Hybrid Approaches - Meta-level
• Two recommendation techniques can be merged by using the model
generated by one as the input for another
• Difference between meta-level and augmentation
– In augmentation, the output of the first recommender is used as input for the second one
– In meta-level, the entire model is used as the input for the second one
Summary
• For many applications such as Recommender Systems (but also Search,
Advertising, and even Networks) understanding data and users is vital
• Algorithms can only be as good as the data they use as input
– But the inverse is also true: you need a good algorithm to leverage your data
• Importance of User/Data Mining is going to be a growing trend in many areas in
the coming years
• Recommender Systems (RS) is an important application of User Mining
Summary
• RS have the potential to become as important as Search is now
• RS are fairly new but already grounded on well proven technology
– Collaborative Filtering
– Machine Learning
– Content Analysis
– Social Network Analysis
– …
• However, there are still many open questions and a lot of interesting research to do!
References
1. Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic. CLiMF: learning to maximize reciprocal rank with collaborative less-is-more filtering. In
Proc. of the sixth Recsys, 2012.
2. E-Commerce Recommendation Applications
http://citeseer.ist.psu.edu/cache/papers/cs/14532/http:zSzzSzwww.cs.umn.eduzSzResearchzSzGroupLenszSzECRA.pdf/schafer01ecommerce.pdf
3. “Item-based Collaborative Filtering Recommendation Algorithms”, B. Sarwar et al. 2001. Proceedings of World Wide Web Conference
4. ”Lessons from the Netflix Prize Challenge.”. R. M. Bell and Y. Koren. SIGKDD Explor. Newsl., 9(2):75–79, December 2007.
5. “Beyond algorithms: An HCI perspective on recommender systems”. K. Swearingen and R. Sinha. In ACM SIGIR 2001 Workshop on Recommender Systems
6. “Fast context-aware recommendations with factorization machines”. S. Rendle, Z. Gantner, C. Freudenthaler, and L. Schmidt-Thieme. In Proc. of the 34th ACM SIGIR, 2011.
7. “Restricted Boltzmann machines for collaborative filtering”. R. Salakhutdinov, A. Mnih, and G. E. Hinton. In Proc. of ICML ’07, 2007.
8. “Learning to rank: From pairwise approach to listwise approach”. Z. Cao and T. Liu. In Proceedings of the 24th ICML, 2007.
9. “Recommender Systems in E-Commerce”. J. Ben Schafer et al. ACM Conference on Electronic Commerce, 1999.
10. “Introduction to Data Mining”, P. Tan et al. Addison Wesley. 2005
11. Amazon.com Recommendations: Item-to-Item Collaborative Filtering http://www.win.tue.nl/~laroyo/2L340/resources/Amazon-Recommendations.pdf
12. Item-based Collaborative Filtering Recommendation Algorithms http://www.grouplens.org/papers/pdf/www10_sarwar.pdf
13. S. Rendle, Z. Gantner, C. Freudenthaler, and L. Schmidt-Thieme. Fast context-aware recommendations with factorization machines. In Proc. Of the 34th ACM
SIGIR, 2011
Thank You
Check Out My LinkedIn Profile at
https://in.linkedin.com/in/girishkhanzode
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

Recommender Systems

– Main goal - learn a function that models the user’s interests
– Applying the function to a new item gives the probability that the user will like it, or a numeric value indicating the degree of interest in the item
Recommender Approaches
• Collaborative Filtering – recommends items that people with similar tastes and preferences liked in the past
• Content Based – recommends items similar to ones that the user preferred in the past
• Personalized Learning to Rank – treats recommendation as a ranking problem
• Social Recommendations – trust based
• Hybrid – a combination of the above
Evolution of Recommender Systems
• Item hierarchy – you bought a camera, you will also need film
• Attribute based – you like action movies starring Clint Eastwood, so you will also like The Good, the Bad and the Ugly
• Collaborative filtering, user-user similarity – people like you who bought milk also bought bread
• Collaborative filtering, item-item similarity – you like Spiderman so you will like Superman
• Social + interest graph based – your friend likes Ford cars, so you will also like Ford cars
• Model based – training SVM, SVD, LDA for implicit features
CF
• Recommends items by computing the similarity between the preferences of a user and those of other like-minded people
• Assumption - personal tastes are correlated
• These systems ignore the content of the items being recommended
• They are able to suggest new items to a user whose preferences are similar to those of others
• Maintains a database of ratings of items by many users
• Predicts the utility of items for a user based on the items previously rated by other like-minded users
CF Methods
• Memory Based
– Operate on the entire user-item rating matrix and generate recommendations by identifying the neighbourhood of the target user, based on the agreement of the user’s past ratings
– Easy to understand, easy to implement and work well in many real-world situations
– The most serious problem is the sparsity of the user-item rating matrix
– Another problem of memory-based CF is efficiency - it is not computationally feasible for ad-hoc recommender systems with millions of users and items
– Clustering, matrix factorization, machine learning on the graph
• Model Based
– Use the rating data to train a model, which is then used to derive the recommendations
CF - Approaches
• K-nearest neighbor (kNN)
– User based methods
• Neighborhood formation phase
• Recommendation phase
– Item based methods
• Recommendation phase
• Association rules based prediction
• Matrix factorization
CF - Process
• Weight all users with respect to similarity with the active user
• Select a subset of the users (neighbors) to use as predictors
• Normalize ratings and compute a prediction from a weighted combination of the selected neighbors’ ratings
• Present the items with the highest predicted ratings as recommendations
Nearest Neighbor Methods - kNN
• Memory based - stores all the training data in memory
• To classify a new item, compare it to all stored items using a similarity function and determine the nearest neighbour or the k nearest neighbours
• The class or numeric score of the previously unseen item can then be derived from the class of the nearest neighbour
• Utilizes the entire user-item database to generate predictions directly
Nearest Neighbor Methods - kNN
• No model building
• The similarity function depends on the type of data
– Structured data - Euclidean distance metric
– Unstructured data (free text) - cosine similarity function
Neighbor Selection
• For a given active user a, select correlated users to serve as the source of predictions
• The standard approach is to use the n most similar users u, based on the similarity weights w_{a,u}
• An alternative approach is to include all users whose similarity weight is above a given threshold
User Based CF - Neighborhood Formation
• Let the record (or profile) of the target user be u (represented as a vector) and the record of another user be v (v ∈ T)
• Calculate the similarity between the target user u and a neighbor v using Pearson’s correlation coefficient:

  sim(u, v) = Σ_{i∈C} (r_{u,i} − r̄_u)(r_{v,i} − r̄_v) / ( √Σ_{i∈C} (r_{u,i} − r̄_u)² · √Σ_{i∈C} (r_{v,i} − r̄_v)² )

  where C is the set of items rated by both u and v, r_{u,i} is u’s rating of item i, and r̄_u is u’s average rating
User Based CF - Recommendation Phase
• Compute the rating prediction of item i for the target user u, where V is the set of k similar users and r_{v,i} is the rating of item i given by user v:

  p(u, i) = r̄_u + Σ_{v∈V} sim(u, v) · (r_{v,i} − r̄_v) / Σ_{v∈V} sim(u, v)
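The following is a minimal NumPy sketch of the two steps above - Pearson similarity over co-rated items, then a deviation-from-mean weighted prediction. The toy matrix R, the 0-means-unrated convention and the helper names pearson_sim and predict are illustrative, not part of the original slides.

```python
import numpy as np

# Toy user-item rating matrix (rows = users, columns = items, 0 = not rated)
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def pearson_sim(u, v):
    """Pearson correlation computed over the items co-rated by users u and v."""
    co = (R[u] > 0) & (R[v] > 0)
    if co.sum() < 2:
        return 0.0
    du = R[u, co] - R[u, co].mean()
    dv = R[v, co] - R[v, co].mean()
    denom = np.sqrt((du ** 2).sum() * (dv ** 2).sum())
    return float((du * dv).sum() / denom) if denom else 0.0

def predict(u, i, k=2):
    """Deviation-from-mean prediction using the k most similar users who rated item i."""
    mean_u = R[u][R[u] > 0].mean()
    neighbours = sorted(
        ((pearson_sim(u, v), v) for v in range(len(R)) if v != u and R[v, i] > 0),
        reverse=True,
    )[:k]
    num = sum(s * (R[v, i] - R[v][R[v] > 0].mean()) for s, v in neighbours)
    den = sum(abs(s) for s, _ in neighbours)  # absolute weights used for normalization
    return mean_u + num / den if den else mean_u

print(round(predict(0, 2), 2))  # predicted rating of item 2 (unrated) for user 0
```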
Personalized vs. Non-Personalized
• CF recommendations are personalized since the prediction is based on the ratings expressed by similar users
– Those neighbours are different for each target user
• A non-personalized collaborative-based recommendation can be generated by averaging the recommendations of ALL the users
User Based CF - Issues
• Lack of scalability - requires real-time comparison of the target user to all user records in order to generate predictions
• Difficult to make predictions based on nearest neighbour algorithms - the accuracy of recommendations may be poor
• Sparsity - when evaluating large item sets, user purchases cover under 5% of the items
• Poor relationships among like-minded but sparse-rating users
• Solution - use latent models to capture the similarity between users and items in a reduced dimensional space
• A variation of this approach that remedies this problem is called item-based CF
Item Based CF
• Rather than matching the user to similar customers, build a similar-items table by finding items that customers tend to purchase together
• Amazon.com used this method
• Scales independently of the catalog size or the total number of customers
• Acceptable performance is achieved by creating the expensive similar-items table offline
Item Based CF
• The item-based approach works by comparing items based on their pattern of ratings across users. The similarity of items i and j is computed as follows:

  sim(i, j) = Σ_{u∈U} (r_{u,i} − r̄_u)(r_{u,j} − r̄_u) / ( √Σ_{u∈U} (r_{u,i} − r̄_u)² · √Σ_{u∈U} (r_{u,j} − r̄_u)² )

  where U is the set of users who rated both i and j, and r̄_u is the average rating of user u
Item Based CF - Recommendation Phase
• After computing the similarity between items, select the set J of the k items most similar to the target item and generate a predicted value of user u’s rating:

  p(u, i) = Σ_{j∈J} sim(i, j) · r_{u,j} / Σ_{j∈J} sim(i, j)
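A matching sketch of the item-based variant, again on a toy matrix - item_sim implements the adjusted-cosine similarity above and predict the weighted average over the k most similar items the user has rated; names and data are illustrative.

```python
import numpy as np

# Toy user-item rating matrix (rows = users, columns = items, 0 = not rated)
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
user_means = np.array([r[r > 0].mean() for r in R])

def item_sim(i, j):
    """Adjusted-cosine similarity over users who rated both items i and j."""
    co = (R[:, i] > 0) & (R[:, j] > 0)
    if not co.any():
        return 0.0
    di = R[co, i] - user_means[co]
    dj = R[co, j] - user_means[co]
    denom = np.sqrt((di ** 2).sum() * (dj ** 2).sum())
    return float((di * dj).sum() / denom) if denom else 0.0

def predict(u, i, k=2):
    """Weighted average of the user's ratings on the k items most similar to i."""
    rated = [j for j in range(R.shape[1]) if j != i and R[u, j] > 0]
    neighbours = sorted(((item_sim(i, j), j) for j in rated), reverse=True)[:k]
    num = sum(s * R[u, j] for s, j in neighbours)
    den = sum(abs(s) for s, _ in neighbours)
    return num / den if den else user_means[u]

print(round(predict(0, 2), 2))  # predicted rating of item 2 for user 0
```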
Model Based CF Algorithms
• Models are learned from the underlying data rather than heuristics
• Models of user ratings (or purchases)
– Clustering (classification)
– Association rules
– Matrix factorization
– Restricted Boltzmann Machines
– Other models
• Bayesian network (probabilistic)
• Probabilistic Latent Semantic Analysis ...
Clustering
• Another technique to recommend based on past purchases is to cluster customers for collaborative filtering
• Each cluster is assigned typical preferences, based on the preferences of the customers belonging to that cluster
• Customers within each cluster receive recommendations computed at the cluster level
Clustering
• Pros
– Clustering techniques can be used to work on aggregated data
– Can also be applied as a first step for shrinking the selection of relevant neighbours in a collaborative filtering algorithm and improve performance
– Can be used to capture latent similarities between users or items
• Cons
– Recommendations (per cluster) may be less relevant than collaborative filtering (per individual)
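As a sketch of cluster-level recommendation, the snippet below clusters users on their rating vectors with k-means and ranks a user’s unrated items by the average rating inside that user’s cluster; the toy matrix and the scikit-learn usage are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy user-item rating matrix; users are clustered on their rating vectors
R = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [0, 1, 5, 4, 5],
    [1, 0, 4, 5, 4],
], dtype=float)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(R)

def cluster_recommend(u, n=2):
    """Recommend the user's unrated items, ranked by the cluster-level average rating."""
    cluster_profile = R[labels == labels[u]].mean(axis=0)
    unseen = np.where(R[u] == 0)[0]
    return unseen[np.argsort(-cluster_profile[unseen])][:n]

print(cluster_recommend(0))  # items for user 0, scored at the cluster level
```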
Item Based CF - Issues
• The bottleneck is similarity computation
– Time complexity - highly time consuming with millions of users and items in the database
• Isolate the neighbourhood generation and prediction steps
– Off-line component - similarity computation done earlier and stored in memory
– On-line component - prediction generation process
Association Rule Based CF
• Association rules are used for recommendation
• Each transaction for association rule mining is the set of items bought by a particular user
• Find item association rules - buy_X, buy_Y -> buy_Z
• Rank items based on measures such as confidence
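A minimal sketch of confidence-based ranking from transactions, assuming each transaction is simply the set of items bought by one user; the toy baskets and the confidence helper are invented, and a production system would mine frequent itemsets first (e.g. with Apriori) rather than scan all transactions per rule.

```python
from itertools import chain

# Toy transactions: each is the set of items bought by one user (hypothetical data)
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "beer"},
    {"bread", "butter"},
]

def confidence(antecedent, consequent):
    """confidence(A -> c) = support(A ∪ {c}) / support(A), counted over transactions."""
    a = sum(antecedent <= t for t in transactions)
    ac = sum((antecedent | {consequent}) <= t for t in transactions)
    return ac / a if a else 0.0

# Rank candidate items for a user whose basket matches the antecedent
basket = {"milk", "bread"}
candidates = set(chain.from_iterable(transactions)) - basket
ranked = sorted(candidates, key=lambda c: confidence(basket, c), reverse=True)
print(ranked)  # items ordered by confidence of buy_milk, buy_bread -> buy_c
```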
Association Rules
• Pros
– Requires very little storage space
– Quick to implement
– Fast to execute
– Not individualized
– Very successful in broad applications for large populations, such as shelf layout in retail stores
• Cons
– Not suitable if knowledge of preferences changes rapidly
– It is tempting not to apply restrictive confidence rules
– May lead to nonsensical recommendations
Classification
• Classifiers are computational models trained using positive and negative examples
• They may take as inputs
– A vector of item features (action / adventure, Tom Hanks)
– Preferences of customers (likes comedy / action)
– Relations among items
• Logistic Regression, Bayesian Networks, Support Vector Machines, Decision Trees...
• Used in CF and CB recommenders
Classification
• Pros
– Can be combined with other methods to improve the accuracy of recommendations
– Versatile
• Cons
– May overfit (needs regularization)
– Needs relevant training data
CF - Benefits and Challenges
• Benefits
– Very powerful and efficient
– Highly relevant recommendations
• The bigger the database and the more past behaviour, the better the recommendations
• Challenges
– Cold start - needs enough other users in the system to find a match
– Sparsity - despite many users, the user/ratings matrix can be sparse, making it hard to find users who have rated the same items
– First rater - cannot recommend an item that has not been previously rated
• New items
• Esoteric items
– Popularity bias - cannot recommend items to someone with unique tastes
• Tends to recommend popular items
Data Sparsity Challenges
• The Netflix Prize rating data in a user/movie matrix
– 500,000 x 17,000 = 8,500 M positions
– Out of which only 100 M have data
• Typically - large product sets, with user ratings for only a small percentage of them
• Example - Amazon - millions of books, and a user may have bought hundreds of books
– The probability that two users who have bought 100 books each have a book in common (in a catalogue of 1 million books) is 0.01 (with 50 books each in a catalogue of 10 million, it is 0.0002)
Data Sparsity Challenges
• Standard CF must have a number of users comparable to one tenth of the size of the product catalogue
• Methods of dimensionality reduction
– Matrix factorization
– Projection (PCA ...)
– Clustering
Scalability Challenges
• Nearest neighbour algorithms require computation that grows with both the number of customers and the number of products
• With millions of customers and products, a web-based recommender can suffer serious scalability problems
• The worst case complexity is O(mn) (m customers and n products)
• In practice the complexity is closer to O(m + n), since for each customer only a small number of products is considered
• Some clustering techniques like k-means can help
Performance Implications
• Item-item similarity is static - enables pre-computing of the item-item similarity table
– The prediction process then involves only a table lookup for the similarity values and computation of the weighted sum
• User-based CF - similarity between users is dynamic; pre-computing user neighbourhoods can lead to poor predictions
Matrix Factorization
• Provides superior performance in recommendation quality and scalability
• Approximates the ratings matrix by a product of low-rank matrices
• Decomposes a matrix M into the product of several factor matrices, where n can be any number - usually 2 or 3:

  M ≈ F_1 F_2 ... F_n
CF - Matrix Factorization
• Matrix factorization is a latent factor model
• Latent variables (also called features, aspects, or factors) are introduced to account for the underlying reasons for a user purchasing or using a product
– Connections between the latent variables and observed variables (user, product, rating, etc.) are estimated during training
– Recommendations are made by computing the possible interactions with each product through the latent variables
• The Netflix Prize contest for movie recommendation used a Singular Value Decomposition (SVD) based algorithm
• The prize-winning method employed an adapted version of SVD
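A minimal sketch of learning the latent factors by stochastic gradient descent on the observed ratings (not the specific SVD variant used in the Netflix Prize); the toy matrix, rank k, learning rate and regularization values are illustrative assumptions.

```python
import numpy as np

# Toy ratings matrix (0 = unknown); we learn P (users x k) and Q (items x k) so that R ≈ P Q^T
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    observed = [(u, i) for u in range(n_users) for i in range(n_items) if R[u, i] > 0]
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            # Gradient step on the squared error with L2 regularization on both factor vectors
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

P, Q = factorize(R)
print(np.round(P @ Q.T, 2))  # filled-in approximation of R, including the missing cells
```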
Matrix Factorization Trends
• Dimension reduction techniques - PCA and SVD
• SVD based matrix factorization
• Basic matrix factorization
• Extended matrix factorization
Extended Matrix Factorization (EMF)
• According to purpose, it can be categorized into the following contexts
• Adding biases to matrix factorization
– User bias - if a user is strict when rating, their ratings are always lower
– Item bias - a popular movie always has higher ratings
EMF
• Adding other influential factors
– Temporal effects - very old ratings have less impact on rating predictions
• Algorithm - TimeSVD++
– Content profile - users or items share the same or similar content, like gender, user group or movie genre, that can contribute to rating predictions
– Contexts - user preference can change from context to context
– Social ties from Facebook, Twitter etc.
• Tensor factorization
EMF - Tensor Factorization
• There can be more than 2 dimensions in the ratings space - a multi-dimensional rating space
• TF can be considered a multi-dimensional matrix factorization
• Allows as many context variables as needed to be integrated
Algorithms Evaluation
• ItemKNN - item based collaborative filtering
• RegSVD - SVD with regularization
• BiasedMF - biased matrix factorization
• SVD++ - a more involved extension of matrix factorization
Content Based Recommender - CBR
• Recommends an item by predicting its utility for a user
• Recommendation is based on similarity to items liked by the user in the past
• Recommendation is based on information about the content of items rather than the opinions of other users
• Each user is assumed to act independently
Content Based Recommender - CBR
• The recommender does not depend on having other users in the system
• The user needs to provide information on her personal interests to use the system
• The top-k best matched or most similar items are recommended to the user
• The simplest approach is to compute the similarity of the user profile with each item
CBR
• What is the « content » of an item?
– It can be explicit « attributes » or « characteristics » of the item. For example, for a film:
• Action / adventure
• Features Bruce Willis
• Year 1995
– It can also be « textual content » (title, description, table of contents, etc.)
• Several techniques exist to compute the distance between two textual documents
– It can be extracted from the signal itself - audio, video
CBR
• CBR systems are common for text based data
• Text documents are recommended based on a comparison between their content (words appearing) and the user model (a set of preferred words)
• The user model can also be a classifier based on any technique (neural networks, naïve Bayes...)
CBR Process - TF-IDF
• A textual document is scanned and parsed
• Word occurrences are counted (words may be stemmed)
• Several words or « tokens » are not taken into account - these include « stop words » (the, a, for) and words that do not appear often enough in documents
• Each document is transformed into a normed TF-IDF vector of size N (term frequency / inverse document frequency)
• The distance between every pair of vectors is computed
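A brief sketch of this pipeline using scikit-learn’s TfidfVectorizer: item descriptions and the user profile are mapped into the same TF-IDF space and ranked by cosine similarity. The example texts and the profile string are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical item descriptions and a user profile built from the text of liked items
items = [
    "action adventure film starring bruce willis",
    "romantic comedy set in paris",
    "sci-fi action thriller with explosions",
]
user_profile = "likes fast paced action and adventure films"

vectorizer = TfidfVectorizer(stop_words="english")
item_vectors = vectorizer.fit_transform(items)         # items as TF-IDF vectors
profile_vector = vectorizer.transform([user_profile])  # profile mapped into the same space

scores = cosine_similarity(profile_vector, item_vectors).ravel()
ranking = scores.argsort()[::-1]
print([(items[i], round(float(scores[i]), 2)) for i in ranking])  # most similar items first
```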
Machine Learning and Statistical Techniques
• Bayesian classifiers and machine learning techniques like clustering, decision trees and artificial neural networks
– These methods use models learned from the underlying data rather than heuristics
• For example, based on a set of web pages rated as relevant or irrelevant by the user, a naive Bayes classifier can be used to classify unrated web pages
CBR - Advantages
• No need for data on other users
– No cold-start or sparsity problems
• Able to recommend to users with unique tastes
• Able to recommend new and unpopular items
– No first-rater problem
• Can provide explanations of recommended items by listing the content features that caused an item to be recommended
CBR - Disadvantages
• Requires content that can be encoded as meaningful features
• Difficult to achieve serendipity
• Easy to overfit (for a user with few data points we may “pigeonhole” her)
• User tastes must be representable as a learnable function of these content features
• Even for texts, IR techniques cannot consider multimedia information, aesthetic qualities, download time...
– A positively rated page may not be related to the presence of certain keywords
• Unable to exploit quality judgments of other users
– Unless these are somehow included in the content features
Learning to Rank
• Recommendation is a ranking problem
• Users pay attention to only a few items at the top of the list
• Machine learning task - rank the most relevant items as high as possible in the recommendation list
• Does not try to predict a rating, but the order of preference
• Can be treated as a standard supervised classification problem
Learning to Rank
• Optimize ranking algorithms to give the highest scores to titles that a member is most likely to play and enjoy
• Many other features can be added
• Goal - find a personalized ranking function that is better than item popularity, to better satisfy users with varying tastes
• The machine learning goal is to construct a ranking model from training data
Learning to Rank
• Training data can be a partial order or binary judgments (relevant / not relevant)
• The resulting order of the items is typically induced from a numerical score
• Learning to rank is a key element for personalization - treat the problem as a standard supervised classification problem
Popularity and Predicted Rating
• Linear model

  Score(u, v) = w1 · p(v) + w2 · r(u, v) + b

– u = user, v = video item, p = popularity, r = predicted rating
• Select positive and negative examples from historical data and let a machine learning algorithm learn the weights that optimize the goal
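A minimal sketch of learning the weights w1, w2 and b from labelled historical examples with logistic regression; the feature values and labels are invented, and any linear learner could stand in here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical historical examples: each row is (popularity p(v), predicted rating r(u, v));
# the label says whether the user actually played and enjoyed the item.
X = np.array([
    [0.9, 4.5], [0.8, 3.9], [0.7, 4.8], [0.2, 4.6],   # positive examples
    [0.9, 2.0], [0.1, 2.5], [0.3, 1.8], [0.6, 2.2],   # negative examples
])
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

model = LogisticRegression().fit(X, y)
w1, w2 = model.coef_[0]
b = model.intercept_[0]
print(f"Score(u, v) = {w1:.2f} * p(v) + {w2:.2f} * r(u, v) + {b:.2f}")

# Rank unseen candidates by the learned linear score
candidates = np.array([[0.5, 4.2], [0.95, 2.1]])
scores = candidates @ model.coef_[0] + b
print(candidates[np.argsort(-scores)])  # highest-scoring candidate first
```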
Netflix - Ranking Improvements (figure)
Learning to Rank Approaches
• Pairwise approach
– The loss function is defined on pairwise preferences
– The goal is to minimize the number of inversions in the resulting ranking
– The ranking problem is transformed into a binary classification problem
– RankSVM, RankBoost, RankNet, FRank...
• Listwise approach
– Directly optimizes the ranking of the whole list
– Uses the similarity between the ranking list and the ground truth as a loss function
Learning to Rank Approaches
• Pointwise approach
– The ranking function minimizes a loss function defined on individual relevance judgments
– Ranking score based on regression or classification
– Ordinal regression, logistic regression, SVM, GBDT, ...
• Need to use rank-specific information retrieval metrics to measure the performance of the model
– Mean average precision
– Mean reciprocal rank
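The pairwise approach can be sketched as follows: preference pairs are turned into a binary classification problem on feature differences (the RankSVM idea), and the learned weight vector then scores unseen items. The features, relevance grades and the use of LinearSVC are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical item feature vectors and graded relevance for one user
X = np.array([[0.9, 0.1], [0.4, 0.8], [0.2, 0.3], [0.7, 0.6]])
rel = np.array([3, 2, 0, 1])

# Pairwise transform: for every pair (i, j) with rel[i] > rel[j],
# the difference x_i - x_j is a positive example and x_j - x_i a negative one
pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if rel[i] > rel[j]:
            pairs.append(X[i] - X[j]); labels.append(1)
            pairs.append(X[j] - X[i]); labels.append(0)

clf = LinearSVC(C=1.0).fit(np.array(pairs), np.array(labels))

# The learned weight vector induces a ranking score for items
scores = X @ clf.coef_.ravel()
print(np.argsort(-scores))  # item indices ordered by predicted preference
```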
Diversity in Recommendation - Re-Ranking
• Recommender → re-rank results
Context Types
• Physical context - time, position, user activity, weather, light, temperature
• Social context - presence and role of other people around the user
• Interaction media context - the device used to access the system and the type of media that is browsed and personalized - text, music, images, movies
• Modal context - state of mind of the user, the user’s goals, mood, experience, and cognitive capabilities
• Traditional RS - Users × Items → Ratings
• Contextual RS - Users × Items × Contexts → Ratings
Context Aware Recommendation Systems
• Known as CARS
• Pattern - user preferences change from context to context
• Necessary to adapt users’ preferences to dynamic situations
• Context is an important factor in recommendations - for example, weather
• Pre-filtering techniques
– Context information is used to select the relevant portion of the data
• Post-filtering techniques
– Context information is used to re-rank or filter the final rankings
• Contextual modelling
– Context information is used directly as part of learning the preference model
• Variations and combinations of the above methods
Pre-Filtering Challenges
• Context over-specification - using an exact context may be too narrow
– Watching a movie with a friend in a movie theatre on Saturday
• Certain aspects of the overly specific context may not be significant (Saturday vs. weekend)
• Sparsity problem - an overly specified context may not have enough training examples for accurate prediction
Pre-Filtering Challenges
• Pre-filter generalization - different approaches
– Roll up to higher level concepts in context hierarchies - Saturday -> weekend, or movie theatre -> any location
– Use latent factor models or dimensionality reduction approaches - matrix factorization, LDA
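A minimal sketch of contextual pre-filtering with generalization: ratings tagged with a context are reduced to those matching the rolled-up target context before any of the CF methods above is trained. The rating tuples, the context hierarchy and the train_cf placeholder are hypothetical.

```python
# Ratings stored as (user, item, rating, context) tuples (hypothetical data)
ratings = [
    ("u1", "movie_a", 5, "saturday"),
    ("u1", "movie_b", 2, "weekday"),
    ("u2", "movie_a", 4, "sunday"),
    ("u2", "movie_c", 5, "weekday"),
]

# Context hierarchy used for generalization (Saturday -> weekend)
generalize = {"saturday": "weekend", "sunday": "weekend"}

def prefilter(ratings, target_context):
    """Keep only the ratings whose generalized context matches the target context."""
    ctx = generalize.get(target_context, target_context)
    return [(u, i, r) for (u, i, r, c) in ratings if generalize.get(c, c) == ctx]

weekend_ratings = prefilter(ratings, "saturday")
print(weekend_ratings)              # only weekend data is passed to the downstream CF model
# model = train_cf(weekend_ratings) # hypothetical: any of the CF methods sketched above
```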
Post Filtering
• Contextual post-filtering - heuristic in nature
– Basic idea - treat the context as an additional constraint
– Many different approaches
• Filtering based on a social/collaborative context representation
– Mine social features - annotations, tags, tweets, reviews associated with the item and users in a given context C
– Promote items with frequently occurring social features from C
Post Filtering
• Filtering based on context similarity
– A context can be represented as a set of features commonly associated with it
– Adjust the recommendation list by favouring items that have more of the relevant features
– Similarity-based approach (but the space of features may differ from the one describing the items)
Social and Trust based Recommendations
• A social recommender system recommends items that are popular in the social proximity of the user
• A person being close in the social network does not mean their judgement can be trusted
– The idea of trust is central in social-based systems
– It can be a general per-user value that takes into account social proximity, but it can also be topic-specific
Trust Definition
• Trust is very complex
– Involves personal background, history of interaction, context, similarity, reputation
• Sociological definitions
– Trust requires a belief and a commitment
– Tom believes Rob will provide reliable information, thus Tom is willing to act on that information
– Similar to a bet
• In the context of recommender systems, trust is generally used to describe similarity in opinion
– Ignores authority, correctness on facts
Methods
• Rating prediction from a user to an item
– Using the user’s web of trust
– People in the web of trust are seen as trustable
– Average of all the rating scores given by trustable people, weighted by their trust value
• Use trust as a way to give more weight to some users
• Trust for collaborative filtering
– Use trust in place of (or combined with) similarity
• Trust for sorting and filtering
– Prioritize information from trusted sources
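A minimal sketch of the first method - a trust-weighted average of the ratings given by the people in the user’s web of trust; the trust values and ratings are invented for illustration.

```python
# Hypothetical web of trust: each user maps trusted users to a trust value in [0, 1]
web_of_trust = {"alice": {"bob": 0.9, "carol": 0.4}}
ratings = {"bob": {"item_x": 4.0}, "carol": {"item_x": 2.0}, "dave": {"item_x": 5.0}}

def predict(user, item):
    """Average of ratings from trusted users, weighted by their trust values."""
    num = den = 0.0
    for trusted, trust in web_of_trust.get(user, {}).items():
        if item in ratings.get(trusted, {}):
            num += trust * ratings[trusted][item]
            den += trust
    return num / den if den else None

print(predict("alice", "item_x"))  # (0.9*4.0 + 0.4*2.0) / 1.3 ≈ 3.38; dave is ignored
```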
Using Social Data
• Social connections can be used in combination with other approaches
• Friendships can be fed into collaborative filtering methods in different ways
– Replace or modify user-user similarity by using social network information
• Algorithms
– Advogato (Levien)
– Appleseed (Ziegler and Lausen)
– MoleTrust (Massa and Avesani)
– TidalTrust (Golbeck)
Demographic Methods
• Aim to categorize the user based on personal attributes and make recommendations based on demographic classes
• Demographic groups can come from marketing research - hence experts decide how to model the users
• Demographic techniques form people-to-people correlations
• Attributes can be induced by classifying a user using other user descriptions (e.g. the home page)
– You need some users for which you know the class (male/female)
• Prediction can use whatever learning mechanism we like (nearest neighbour, naïve classifier...)
Hybrid Approaches
• Content-based and collaborative methods have complementary strengths and weaknesses
• Combine methods to obtain the best of both
– Apply both methods and combine recommendations
– Use collaborative data as content
– Use a content-based predictor as another collaborator
– Use a content-based predictor to complete collaborative data
Hybrid Approaches
• Netflix is a good example of a hybrid system
• Recommends
– By comparing the watching and searching habits of similar users - collaborative filtering
– By offering movies that share characteristics with films that a user has rated highly - content-based filtering
Hybrid Approaches - Weighted Method
• The scores of several recommenders are combined to produce a single recommendation
• Equal weight can be assigned to both the content and collaborative recommenders, gradually adjusting the weights as predictions of ratings are confirmed
Hybrid Approaches - Weighted Method
• Assumption - the relative performance of the different techniques is uniform; not true in general
– Example - CF performs worse for items with few ratings
• Example - a CB and a CF recommender are equally weighted at first; the weights are adjusted as predictions are confirmed or not
– RS with a consensus scheme - each recommendation of a specific item counts as a vote for the item
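A minimal sketch of a weighted hybrid with adaptive weights: the component scores are combined linearly, and the weights drift toward the component with the lower observed error on confirmed ratings. The scorer placeholders and the update rule are illustrative, not a prescribed scheme.

```python
# Placeholder scorers standing in for any of the CB and CF methods sketched earlier
def cb_score(user, item): return 0.7    # hypothetical content-based score in [0, 1]
def cf_score(user, item): return 0.4    # hypothetical collaborative score in [0, 1]

weights = {"cb": 0.5, "cf": 0.5}        # start with equal weights

def hybrid_score(user, item):
    """Weighted linear combination of the component scores."""
    return weights["cb"] * cb_score(user, item) + weights["cf"] * cf_score(user, item)

def update_weights(errors, lr=0.1):
    """Shift weight toward the component with the lower observed prediction error."""
    total = sum(errors.values()) or 1.0
    for name in weights:
        weights[name] = (1 - lr) * weights[name] + lr * (1 - errors[name] / total)
    norm = sum(weights.values())
    for name in weights:
        weights[name] /= norm

print(hybrid_score("u1", "item_x"))
update_weights({"cb": 0.2, "cf": 0.8})  # CB was more accurate on confirmed ratings
print(weights)                          # weights now favour the content-based component
```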
Hybrid Approaches
• Switching method - use some criterion to switch between recommendation techniques
– The main problem is to identify a good switching criterion
– A system using CB-CF switches to CF when CB cannot predict with sufficient confidence
• Mixed method - used when a large number of recommendations exist
– Recommendations from several techniques are presented at the same time
Hybrid Approaches - Cascade
• Involves a staged process
• One recommendation technique is employed to produce a ranking of candidates, and a second technique refines the recommendation from the candidate set
• At each iteration, the first technique produces a coarse ranking and the second technique refines the recommendation
Hybrid Approaches - Cascade
• Avoids employing the second, lower-priority technique on items already well-differentiated by the first
• Requires a meaningful ordering of the techniques
Hybrid Approaches - Feature Combination
• Features from different recommendation data sources are used together in a single recommendation algorithm
– Allows the system to consider collaborative data without relying on it exclusively, reducing the sensitivity of the system to the number of users who have rated an item
– Conversely, it lets the system have information about the inherent similarity of items that are otherwise opaque to a collaborative system
Hybrid Approaches - Feature Augmentation
• Used to improve the performance of a core system
• One technique is used to produce a rating of an item, and that information is then incorporated into the processing of the next recommendation technique
• Difference between cascade and augmentation
– In feature augmentation, the feature used by the second recommender is the output of the first one
– In cascading, the second recommender does not use the output of the first one; the results of the two recommenders are combined in a prioritized manner
Hybrid Approaches - Meta-level
• Two recommendation techniques can be merged by using the model generated by one as the input for the other
• Difference between meta-level and augmentation
– In augmentation, the output of the first recommender is used as input for the second one
– In meta-level, the entire model is used as the input for the second one
Summary
• For many applications such as recommender systems (but also search, advertising, and even networks), understanding data and users is vital
• Algorithms can only be as good as the data they use as input
– But the inverse is also true - you need a good algorithm to leverage your data
• The importance of user/data mining is going to be a growing trend in many areas in the coming years
• Recommender systems (RS) are an important application of user mining
Summary
• RS have the potential to become as important as search is now
• RS are fairly new but already grounded in well proven technology
– Collaborative filtering
– Machine learning
– Content analysis
– Social network analysis
– ...
• However, there are still many open questions and a lot of interesting research to do
References
1. Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic. "CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering". In Proceedings of the 6th ACM Conference on Recommender Systems (RecSys), 2012.
2. J. B. Schafer et al. "E-Commerce Recommendation Applications". http://citeseer.ist.psu.edu/cache/papers/cs/14532/http:zSzzSzwww.cs.umn.eduzSzResearchzSzGroupLenszSzECRA.pdf/schafer01ecommerce.pdf
3. B. Sarwar et al. "Item-based Collaborative Filtering Recommendation Algorithms". In Proceedings of the World Wide Web Conference (WWW), 2001. http://www.grouplens.org/papers/pdf/www10_sarwar.pdf
4. R. M. Bell and Y. Koren. "Lessons from the Netflix Prize Challenge". SIGKDD Explorations Newsletter, 9(2):75–79, December 2007.
5. K. Swearingen and R. Sinha. "Beyond Algorithms: An HCI Perspective on Recommender Systems". In ACM SIGIR 2001 Workshop on Recommender Systems.
6. S. Rendle, Z. Gantner, C. Freudenthaler, and L. Schmidt-Thieme. "Fast Context-aware Recommendations with Factorization Machines". In Proceedings of the 34th ACM SIGIR, 2011.
7. R. Salakhutdinov, A. Mnih, and G. E. Hinton. "Restricted Boltzmann Machines for Collaborative Filtering". In Proceedings of ICML, 2007.
8. Z. Cao and T. Liu. "Learning to Rank: From Pairwise Approach to Listwise Approach". In Proceedings of the 24th ICML, 2007.
9. J. Ben Schafer et al. "Recommender Systems in E-Commerce". ACM Conference on Electronic Commerce, 1999.
10. P. Tan et al. "Introduction to Data Mining". Addison Wesley, 2005.
11. "Amazon.com Recommendations: Item-to-Item Collaborative Filtering". http://www.win.tue.nl/~laroyo/2L340/resources/Amazon-Recommendations.pdf
Thank You
Check out my LinkedIn profile at https://in.linkedin.com/in/girishkhanzode