Usman Sharif

RECOMMENDATION SYSTEMS
Why recommendation systems?

• Provide a better experience to your users.
• Understand the behavior and patterns of users.
• Create an opportunity to re-engage inactive users.
• Boost sales.
• Can be more effective than a search feature.
How some companies are using
Recommendation Systems - Amazon
How some companies are using
Recommendation Systems - Gmail
A simple recommendation system

• Consider the following scenario:
   • A library has books and members.
   • Members can borrow books.
   • The library wants to build a recommender system
     to recommend books to its members.
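Scoring Matrices
User-book matrix (X = the user has read the book):
         Book 1   Book 2   Book 3   Book 4
User 1   X                 X
User 2   X
User 3            X                 X
User 4   X                 X        X
User 5   X        X

Book-book matrix (each cell counts the users who have read both books):
         Book 1   Book 2   Book 3   Book 4
Book 1   4        1        2        1
Book 2   1        2        0        1
Book 3   2        0        2        1
Book 4   1        1        1        2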
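
A minimal Python sketch of how the second matrix can be derived from the first, assuming each cell counts the users who read both books. The names `reads` and `cooccurrence` are illustrative and not from the slides.

```python
# Reading history from the user-book matrix: user -> set of books read.
reads = {
    "User 1": {1, 3},
    "User 2": {1},
    "User 3": {2, 4},
    "User 4": {1, 3, 4},
    "User 5": {1, 2},
}
books = [1, 2, 3, 4]


def cooccurrence(reads, books):
    """Return {(a, b): number of users who read both book a and book b}."""
    counts = {(a, b): 0 for a in books for b in books}
    for read in reads.values():
        for a in read:
            for b in read:
                counts[(a, b)] += 1
    return counts


scores = cooccurrence(reads, books)
matrix = [[scores[(a, b)] for b in books] for a in books]
for a, row in zip(books, matrix):
    print(f"Book {a}:", row)
```

The diagonal holds the number of readers of each book, which is why it is the largest value in every row.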
Using the scoring matrices

• If a user has read Book 1, recommend Books 3, 2, 4.
• If a user has read Book 2, recommend Books 1, 4, 3.
• If a user has read Book 3, recommend Books 1, 4, 2.
• If a user has read Book 4, recommend Books 1, 2, 3.
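
The orderings above can be reproduced with a short Python sketch: take the row of the book-book matrix for the book just read, drop the book itself, and sort the rest by score. Breaking ties by lower book number is an assumption made here; the slides do not state a tie-breaking rule.

```python
books = [1, 2, 3, 4]
# Book-book matrix from the Scoring Matrices slide.
scores = [
    [4, 1, 2, 1],
    [1, 2, 0, 1],
    [2, 0, 2, 1],
    [1, 1, 1, 2],
]


def recommend(book):
    """Rank all other books by their score against `book`, highest first."""
    row = scores[books.index(book)]
    others = [(b, s) for b, s in zip(books, row) if b != book]
    # list.sort is stable, so equal scores keep ascending book order.
    others.sort(key=lambda pair: -pair[1])
    return [b for b, _ in others]


for book in books:
    print(f"Read Book {book} -> recommend {recommend(book)}")
```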
Advantages

• Very simple to understand and implement.
• Works well when recommendations are based on a
  single user activity (for example, the one book a
  user has just read).
Disadvantages

• Does not work for a new user with no history.
• In a real-world scenario with thousands of books
  and thousands of members, most cells will be
  zero (a sparse matrix).
• Considers only one item at a time.
Another Try
• Our Books records might look like this:
BookId  Title                    Genre          Writer              Language
1       The Great Gatsby         Classic        F Scott Fitzgerald  English
2       Nine Stories             Short Stories  J D Salinger        English
3       The Sun Also Rises       Classic        Ernest Hemingway    English
4       The Hunger Games         Action         Suzanne Collins     English
5       The Ambler Warning       Thriller       Robert Ludlum       English
6       The Catcher in the Rye   Classic        J D Salinger        English
7       To Kill a Mockingbird    Classic        Harper Lee          English
Create an Item Similarity Matrix
            Book 1     Book 2      Book 3     Book 4      Book 5     Book 6      Book 7
Book 1      3          1           2          1           1          2           2
Book 2      1          3           1          1           1          2           1
Book 3      2          1           3          1           1          2           2
Book 4      1          1           1          3           1          1           1
Book 5      1          1           1          1           3          1           1
Book 6      2          2           2          1           1          3           2
Book 7      2          1           2          1           1          2           3
• This is always a square (n x n) matrix.
• Each cell holds the number of attributes the two books share (Genre, Writer, Language); unique attributes such as Title are excluded.
• In general, any similarity measure can be used here.
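
A minimal Python sketch of this matrix, assuming similarity is the count of matching values among Genre, Writer and Language. The function name `similarity` is illustrative.

```python
# BookId -> (Genre, Writer, Language); Title is unique per book, so it is left out.
books = {
    1: ("Classic", "F Scott Fitzgerald", "English"),
    2: ("Short Stories", "J D Salinger", "English"),
    3: ("Classic", "Ernest Hemingway", "English"),
    4: ("Action", "Suzanne Collins", "English"),
    5: ("Thriller", "Robert Ludlum", "English"),
    6: ("Classic", "J D Salinger", "English"),
    7: ("Classic", "Harper Lee", "English"),
}


def similarity(a, b):
    """Count the attributes on which books a and b agree."""
    return sum(x == y for x, y in zip(books[a], books[b]))


ids = sorted(books)
matrix = [[similarity(a, b) for b in ids] for a in ids]
for a, row in zip(ids, matrix):
    print(f"Book {a}:", row)
```

Swapping `similarity` for another measure (for example a weighted count) changes the cell values but not the shape of the matrix.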
To Recommend

• Look at what a user has previously read.
• Use the values from the similarity matrix to
  recommend books based on how similar they are
  to a book the user has already read.
Advantages

• Recommendations can be pre-computed, even for
  a very large item base.
• Fast lookups can be built to serve
  recommendations.
• For example, if a user is viewing the page of
  Book 3, you may want to recommend
  Books 1, 6 and 7.
• Works for new and non-registered users.
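
One way to pre-compute such a lookup, sketched in Python. The `top_k` parameter and the tie-breaking by lower book number are assumptions for illustration; the similarity values are copied from the item similarity matrix.

```python
# Item similarity matrix (row i, column j = similarity of Book i+1 and Book j+1).
similarity = [
    [3, 1, 2, 1, 1, 2, 2],
    [1, 3, 1, 1, 1, 2, 1],
    [2, 1, 3, 1, 1, 2, 2],
    [1, 1, 1, 3, 1, 1, 1],
    [1, 1, 1, 1, 3, 1, 1],
    [2, 2, 2, 1, 1, 3, 2],
    [2, 1, 2, 1, 1, 2, 3],
]


def build_lookup(similarity, top_k=3):
    """Pre-compute, for every book, the top_k most similar other books."""
    lookup = {}
    n = len(similarity)
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        # Stable sort: ties keep ascending book order.
        candidates.sort(key=lambda j: -similarity[i][j])
        lookup[i + 1] = [j + 1 for j in candidates[:top_k]]
    return lookup


lookup = build_lookup(similarity)
print(lookup[3])  # books to show on the page of Book 3
```

At request time a recommendation is then a single dictionary lookup, which needs no user history.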
Disadvantage

• Does not consider the user’s history.
• Instead looks at a collective trend.
Another Approach - The Users

• Our Users records might look like this:
 UserId     Gender    Age        Location
 1          Male      34         Pakistan
 2          Female    28         Pakistan
 3          Male      38         India
 4          Male      32         India
 5          Female    21         Pakistan
 6          Female    24         Pakistan
The User Borrowing
  UserId   BookId
  1        3
  1        7
  2        2
  3        1
  3        5
  3        7
  4        6
  4        7
  5        2
  6        4
  6        6
  6        7
Transforming User Borrowing
         User 1   User 2   User 3   User 4   User 5   User 6
Book 1                     X
Book 2            X                          X
Book 3   X
Book 4                                                X
Book 5                     X
Book 6                              X                 X
Book 7   X                 X        X                 X

• Problem: too many zero-valued (empty) cells.
• Any solutions?
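
A small Python sketch of this transformation, pivoting the (UserId, BookId) records into a book-by-user matrix and counting the empty cells. Using 1/0 in place of X/blank is a choice made here for counting.

```python
# (UserId, BookId) pairs from the User Borrowing slide.
borrowings = [
    (1, 3), (1, 7), (2, 2), (3, 1), (3, 5), (3, 7),
    (4, 6), (4, 7), (5, 2), (6, 4), (6, 6), (6, 7),
]
users = range(1, 7)
books = range(1, 8)

borrowed = set(borrowings)
# Rows are books, columns are users; 1 marks a borrowing.
matrix = [[1 if (u, b) in borrowed else 0 for u in users] for b in books]

zeros = sum(row.count(0) for row in matrix)
total = len(matrix) * len(matrix[0])
print(f"{zeros}/{total} cells are zero")
```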
Transform the Users Records

• Treat Age as a discrete column with
  ranges like {0-10, 11-20, 21-30, 31-40, …} so
  that we can create partitions like this:
  PartitionId   Gender   AgeGroup   Location
  1             Male     31-40      Pakistan
  2             Female   21-30      Pakistan
  3             Male     31-40      India
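
The partitioning can be sketched in Python as follows. Numbering partitions in order of first appearance is an assumption that happens to reproduce the PartitionIds above; `age_group` is an illustrative helper.

```python
# (UserId, Gender, Age, Location) from the Users slide.
users = [
    (1, "Male", 34, "Pakistan"),
    (2, "Female", 28, "Pakistan"),
    (3, "Male", 38, "India"),
    (4, "Male", 32, "India"),
    (5, "Female", 21, "Pakistan"),
    (6, "Female", 24, "Pakistan"),
]


def age_group(age):
    """Map an age to one of the ranges 0-10, 11-20, 21-30, 31-40, ..."""
    if age <= 10:
        return "0-10"
    low = (age - 1) // 10 * 10 + 1
    return f"{low}-{low + 9}"


partition_ids = {}   # (gender, age group, location) -> PartitionId
user_partition = {}  # UserId -> PartitionId
for user_id, gender, age, location in users:
    key = (gender, age_group(age), location)
    if key not in partition_ids:
        partition_ids[key] = len(partition_ids) + 1
    user_partition[user_id] = partition_ids[key]

print(user_partition)
```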
Recreate User Borrowing using Partition Information
• Fewer zero-valued cells (11/21 compared to
  30/42 previously).
• Far fewer columns than before.
• The notation has changed from ‘X’ to a
  count.

         Partition 1   Partition 2   Partition 3
Book 1   0             0             1
Book 2   0             2             0
Book 3   1             0             0
Book 4   0             1             0
Book 5   0             0             1
Book 6   0             1             1
Book 7   1             1             2
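
A Python sketch of the aggregation, assuming the user-to-partition mapping implied by the Users and partition tables (User 1 in Partition 1; Users 2, 5, 6 in Partition 2; Users 3, 4 in Partition 3).

```python
# (UserId, BookId) pairs from the User Borrowing slide.
borrowings = [
    (1, 3), (1, 7), (2, 2), (3, 1), (3, 5), (3, 7),
    (4, 6), (4, 7), (5, 2), (6, 4), (6, 6), (6, 7),
]
# UserId -> PartitionId, derived from Gender, AgeGroup and Location.
user_partition = {1: 1, 2: 2, 3: 3, 4: 3, 5: 2, 6: 2}
partitions = (1, 2, 3)

# counts[book][partition] = number of borrowings by users in that partition.
counts = {book: {p: 0 for p in partitions} for book in range(1, 8)}
for user, book in borrowings:
    counts[book][user_partition[user]] += 1

zeros = sum(1 for book in counts for p in partitions if counts[book][p] == 0)
total = len(counts) * len(partitions)
print(f"{zeros}/{total} cells are zero")
```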
To Recommend

• See which partition a user belongs to.
• Look at the column of that partition and sort
  the books in descending order of their
  frequency count.
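
The two steps above, sketched in Python. Dropping books with a zero count and breaking ties by lower book number are assumptions made for this illustration.

```python
# BookId -> counts for [Partition 1, Partition 2, Partition 3].
counts = {
    1: [0, 0, 1],
    2: [0, 2, 0],
    3: [1, 0, 0],
    4: [0, 1, 0],
    5: [0, 0, 1],
    6: [0, 1, 1],
    7: [1, 1, 2],
}


def recommend_for_partition(partition):
    """Books borrowed within the partition, most frequently borrowed first."""
    col = partition - 1
    # sorted() is stable, so equal counts keep ascending book order.
    ranked = sorted(counts, key=lambda book: -counts[book][col])
    return [book for book in ranked if counts[book][col] > 0]


for p in (1, 2, 3):
    print(f"Partition {p}: {recommend_for_partition(p)}")
```

In practice, books the user has already borrowed would also be filtered out before showing the list.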
Advantages

• Continues to improve over time.
• More partitions can be added over time.
• Instead of a single collective score, the
  technique partitions the user base into
  ‘similar’ users.
• The technique can easily be extended to the
  item side: rather than having books as
  rows, we can have book clusters.
Disadvantages

• Needs some seed data to start.
• Requires some transformations.
• Can become very complex as the number of
  users/items grows.
Evaluating Performance (Metrics)
• Almost any Information Retrieval metric can
  be used.
• Three interesting ones:
   • Accuracy
   • Coverage
   • Normalized Distance Based Performance Measure
     (NDPM)
Accuracy
• Takes into account the order in which recommendations are
  shown to users and how users responded to them.
• For rank position = 1:
   • Acc(1) = (# of positive responses with rank ≤ 1) /
     (total recommendations with rank ≤ 1)
   • Therefore, Acc(1) = 1 / 3 = 33.33%
• Similarly, Acc(2) = 2 / 6 = 33.33%

UserId     BookId    Rank       Response
1          3         1          Yes
1          2         2          No
2          7         1          No
2          5         2          Yes
3          3         1          No
3          7         2          No
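
A minimal Python sketch of Acc(k) as defined above, run on the same response log. The tuple layout and the name `accuracy_at` are illustrative.

```python
# (UserId, BookId, Rank, positive response?) from the table above.
log = [
    (1, 3, 1, True),
    (1, 2, 2, False),
    (2, 7, 1, False),
    (2, 5, 2, True),
    (3, 3, 1, False),
    (3, 7, 2, False),
]


def accuracy_at(log, k):
    """Share of recommendations with rank <= k that got a positive response."""
    shown = [row for row in log if row[2] <= k]
    if not shown:
        return 0.0
    return sum(1 for row in shown if row[3]) / len(shown)


print(f"Acc(1) = {accuracy_at(log, 1):.2%}")
print(f"Acc(2) = {accuracy_at(log, 2):.2%}")
```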
Coverage
• Shows the share of items that appear in the
  recommendations across all users.
• For rank position = 1:
   • Cov(1) = (unique items in recommendations with rank ≤ 1) /
     (total items)
   • Therefore, Cov(1) = 2 / 7 = 28.57%
• Similarly, Cov(2) = 4 / 7 = 57.14%

UserId     BookId   Rank      Response
1          3        1         Yes
1          2        2         No
2          7        1         No
2          5        2         Yes
3          3        1         No
3          7        2         No
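
Cov(k) can be sketched the same way in Python, assuming a catalogue of 7 books as in the Books table. The name `coverage_at` is illustrative.

```python
# (UserId, BookId, Rank) from the table above; responses are not needed here.
log = [
    (1, 3, 1),
    (1, 2, 2),
    (2, 7, 1),
    (2, 5, 2),
    (3, 3, 1),
    (3, 7, 2),
]
TOTAL_ITEMS = 7  # number of books in the catalogue


def coverage_at(log, k, total_items=TOTAL_ITEMS):
    """Share of the catalogue that appears in recommendations with rank <= k."""
    unique_items = {book for _user, book, rank in log if rank <= k}
    return len(unique_items) / total_items


print(f"Cov(1) = {coverage_at(log, 1):.2%}")
print(f"Cov(2) = {coverage_at(log, 2):.2%}")
```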
Normalized Distance Based Performance Measure (NDPM)
• Assesses the quality of a recommendation system, taking into account the
  order in which items are shown.
• NDPM = (C- + 0.5 x C+) / Cu, computed per user and then averaged over users.
• C- is the number of recommended item pairs, taken in rank order, where the user responded (No, Yes).
• C+ is the number of recommended item pairs, taken in rank order, where the user responded (Yes, No).
• Cu is the number of all item pairs where the user’s two responses differ.
• In our example, with the user in parentheses:
   • C-(1) = 2, C+(1) = 2 and Cu(1) = 4 => NDPM(1) = (2 + 0.5 x 2) / 4 = 75%
   • C-(2) = 0, C+(2) = 1 and Cu(2) = 1 => NDPM(2) = (0 + 0.5 x 1) / 1 = 50%
   • NDPM = (0.75 + 0.5) / 2 = 62.5%

UserId     BookId    Rank      Response
1          3         1         Yes
1          2         2         No
1          7         3         No
1          5         4         Yes
2          3         1         Yes
2          7         2         No
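
A Python sketch of the formula exactly as stated on this slide; definitions of NDPM elsewhere in the literature may weight the terms differently. Skipping users whose responses are all the same (Cu = 0) is an assumption made here.

```python
# (UserId, BookId, Rank, positive response?) from the table above.
log = [
    (1, 3, 1, True), (1, 2, 2, False), (1, 7, 3, False), (1, 5, 4, True),
    (2, 3, 1, True), (2, 7, 2, False),
]


def ndpm_for_user(responses):
    """NDPM for one user, given responses ordered by rank."""
    c_minus = c_plus = 0
    for i in range(len(responses)):
        for j in range(i + 1, len(responses)):
            first, second = responses[i], responses[j]
            if first and not second:    # (Yes, No) pair
                c_plus += 1
            elif second and not first:  # (No, Yes) pair
                c_minus += 1
    c_u = c_plus + c_minus              # pairs with differing responses
    if c_u == 0:
        return None                     # undefined for this user
    return (c_minus + 0.5 * c_plus) / c_u


def ndpm(log):
    """Return per-user NDPM values and their average."""
    by_user = {}
    for user, _book, _rank, positive in sorted(log, key=lambda r: (r[0], r[2])):
        by_user.setdefault(user, []).append(positive)
    per_user = {user: ndpm_for_user(resp) for user, resp in by_user.items()}
    defined = [value for value in per_user.values() if value is not None]
    overall = sum(defined) / len(defined) if defined else None
    return per_user, overall


per_user, overall = ndpm(log)
print(per_user, overall)
```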
How to improve results

• Maintain a list of recommendations each user
  has already seen, and don’t show them
  again for some time.
• Give users a way to tell you what they are
  looking for.
• Alternatively, infer this from user searches.
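
The first point can be sketched in Python as a simple cooldown filter. The 30-day window, the `last_shown` mapping and the function name are hypothetical choices for illustration; the slides do not specify them.

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(days=30)  # hypothetical "some time" window


def filter_recently_seen(candidates, last_shown, now, cooldown=COOLDOWN):
    """Drop candidates that were shown to this user within the cooldown window.

    candidates: ranked list of BookIds
    last_shown: BookId -> datetime the book was last recommended to the user
    """
    return [
        book for book in candidates
        if book not in last_shown or now - last_shown[book] >= cooldown
    ]


now = datetime(2024, 1, 31)
last_shown = {3: datetime(2024, 1, 20), 7: datetime(2023, 12, 1)}
print(filter_recently_seen([3, 7, 1], last_shown, now))
```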
Some standard algorithms
• Item Hierarchy
   • You bought a printer, so you will also need ink.
• Attribute-based recommendations
   • You like reading classics written by Salinger, so you might like “The Catcher in
     the Rye”.
• Collaborative Filtering – User-User Similarity
   • People like you who read “The Hunger Games” also read “The Ambler
     Warning”.
• Collaborative Filtering – Item-Item Similarity
   • You like “The Catcher in the Rye”, so you will like “Nine Stories”.
• Social + Interest Graph Based
   • Your friends like “The Great Gatsby”, so you will like “The Great Gatsby”
     too.
• Model Based
   • Train models such as SVM, LDA or SVD to learn implicit features.
Some Tools

• Apache Mahout (Java)
• Crab (Python)
• Easyrec (RESTful API)
Questions?
Thank you!

            www.usman-sharif.com
                  @sharif_usman
