2. Recommender Systems
• Software tools and techniques that suggest items likely
to be of use to a user
• Recommender systems analyze patterns of user interest in
items or products to provide personalized recommendations
of items that will suit a user’s taste
Item - What the system recommends to the user
(CD, news, books, movies...)
User preferences - ratings for products
User actions - user browsing history
3. RS Techniques
• Collaborative-Filtering system
– recommends to the active user the items that
other users with similar tastes liked in the past
• Content-based system
– recommends items that are similar to the ones that
the user liked in the past
• Hybrid-Collaborative Filtering
• Tagging: recommends items using tags
assigned by different users
4. Collaborative Filtering
• Tries to predict the opinion a user will have of
different items and to recommend the “best” items
to each user, based on the user’s previous likings
and the opinions of other like-minded users.
5. Collaborative Filtering
• The task of a CF algorithm is to estimate how much the
active user will like an item, in two forms:
Prediction – a numerical value expressing how much the
active user is predicted to like a given item
Recommendation – a list of the N items that the active user
is predicted to like the most
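The two output forms are closely related: once a predicted value exists for each candidate item, a top-N recommendation can be obtained by ranking. The sketch below illustrates this under assumed inputs; the item names, the predicted values, and the helper `top_n` are hypothetical and not part of the original slides.

```python
from typing import Callable, Dict, Iterable, List


def top_n(predict: Callable[[str], float], candidates: Iterable[str], n: int) -> List[str]:
    """Return the n candidate items with the highest predicted value."""
    return sorted(candidates, key=predict, reverse=True)[:n]


# Hypothetical predictions for one active user (illustrative values only).
predicted: Dict[str, float] = {"item_a": 4.5, "item_b": 2.0, "item_c": 3.8}

# "Prediction" is a single number per item; "Recommendation" is the ranked list.
recommendation = top_n(predicted.__getitem__, predicted.keys(), n=2)
print(recommendation)  # ['item_a', 'item_c']
```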
6. K Nearest Neighbour Algorithm
• A distance measure is needed to determine the
“closeness” of instances
• Classify an instance by finding its nearest neighbors
and picking the most popular class among the
neighbors
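A minimal sketch of the two steps above, assuming Euclidean distance as the closeness measure and a simple majority vote. The slides do not name a specific distance, so that choice, the function names, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance measure used to judge closeness (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query: Sequence[float],
                 training: List[Tuple[Sequence[float], str]],
                 k: int = 3) -> str:
    """Pick the most popular class among the k nearest training instances."""
    neighbours = sorted(training, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training set: (features, class label). Values are illustrative only.
training = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
    ((4.0, 4.2), "B"), ((3.8, 4.0), "B"), ((4.1, 3.9), "B"),
]

print(knn_classify((1.1, 0.9), training, k=3))  # A
```

Note that every call scans the whole training set, which is the time cost listed among the drawbacks of KNN.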
7. Rating Prediction
         Mega Mind  Toy Story  Despicable Me  Lion King  Kung Fu Panda
Zeynep   4          5          3              2          4
Funda    3          3          2              3          5
Pınar    3          3          4              2          3
Gülten   4          4          5              4          5
Yağız    4          5          ?              4          5
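The missing rating on this slide (the ? in Yağız's row) can be estimated with a nearest-neighbour approach. The sketch below is one possible way to do it, not necessarily the method used in the project: it assumes the mean absolute difference over co-rated movies as the distance, assumes the column order Mega Mind, Toy Story, Despicable Me, Lion King, Kung Fu Panda, and averages the ratings of the k closest users. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Column order is an assumption read off the slide header.
MOVIES = ["Mega Mind", "Toy Story", "Despicable Me", "Lion King", "Kung Fu Panda"]

RATINGS: Dict[str, List[Optional[int]]] = {
    "Zeynep": [4, 5, 3, 2, 4],
    "Funda":  [3, 3, 2, 3, 5],
    "Pınar":  [3, 3, 4, 2, 3],
    "Gülten": [4, 4, 5, 4, 5],
    "Yağız":  [4, 5, None, 4, 5],  # None marks the rating to predict
}


def mean_abs_diff(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> float:
    """Average |a - b| over movies both users rated (assumed distance)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


def predict(user: str, movie: str, k: int = 1) -> float:
    """Average the movie's ratings from the k users closest to `user`."""
    idx = MOVIES.index(movie)
    target = RATINGS[user]
    candidates = [
        (mean_abs_diff(target, row), name)
        for name, row in RATINGS.items()
        if name != user and row[idx] is not None
    ]
    nearest = sorted(candidates)[:k]
    return sum(RATINGS[name][idx] for _, name in nearest) / len(nearest)


print(predict("Yağız", "Despicable Me", k=1))  # 5.0
print(predict("Yağız", "Despicable Me", k=2))  # 4.0
```

Under these assumptions Gülten is the most similar user (distance 0.25), so k=1 predicts her rating of 5; with k=2, Zeynep (distance 0.75) is added and the prediction becomes the average of 5 and 3.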
8. Application
• MovieLens Database (1M)
3883 movies
6040 users
1000209 ratings
• Technologies
ASP.Net 4.0
MS SQL Server 2008
9. Rating Prediction Database Diagram
• Movies: MovieID, Title, Genre
• Ratings: ID, UserID, MovieID, Rating, Timestamp
• Users: UserID, Gender, Age, Occupation, ZipCode
• Age: Id, Description
• Occupation: Id, Description
• Predictions: ID, UserID, MostSimilarUserID, Difference,
TimeElapsed, MovieID, PredictedRating, ActualRating
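As an illustration of how the Ratings table might be filled, the sketch below parses one line of the MovieLens 1M ratings file, whose fields are separated by `::` in the order UserID, MovieID, Rating, Timestamp. The `Rating` class and `parse_rating_line` are hypothetical names; the slides do not show the actual import code, and the surrogate ID column is left out here on the assumption that the database generates it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """Mirrors the Ratings table columns, except the database-generated ID."""
    user_id: int
    movie_id: int
    rating: int
    timestamp: int


def parse_rating_line(line: str) -> Rating:
    """Parse a 'UserID::MovieID::Rating::Timestamp' line into a Rating."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return Rating(int(user_id), int(movie_id), int(rating), int(timestamp))


# Sample line in the MovieLens 1M ratings format.
sample = "1::1193::5::978300760\n"
print(parse_rating_line(sample))
```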
12. KNN Algorithm
Pro
• Simple to implement and use
• Comprehensible – easy to explain prediction
• Robust to noisy data by averaging k-nearest neighbors
Con
• Cold-start Problem
• Storage: all training examples are saved in memory
• Time: to classify x, you need to loop over all training
examples (x’,y’) to compute distance between x and x’.
13. Conclusion
Recommendation and personalization are important
approaches to combating information overload.
Machine Learning is an important part of systems for
these tasks.
Collaborative Filtering has its own problems, such as
the cold-start problem.
Better results could likely be achieved by also using
content, tags and more optimized similarity
functions.