CC 2.0 by Horia Varlan | http://flic.kr/p/7vjmof
September 1, 2012



•  What are Product Recommenders
   •  Introducing Recommenders
   •  A Simple Example
   •  Recommender Evaluation
•  How do they work?
   •  Machine learning tool – Apache Mahout



Namics Conference 2012

Agenda



•  Spin-off of MeMo News AG, the leading provider for Social Media
   Monitoring & Analytics in Switzerland
•  Big Data expert, focused on Hadoop, HBase and Solr
•  Objective: Transforming data into insights



Intro

About Sentric
CC 2.0 by Dennis Wong | http://flic.kr/p/6C3RuV	
  



•  Each day we form opinions about things we like, don’t like, and don’t
   even care about.
•  People tend to like things …
     •  that similar people like
     •  that are similar to other things they like
•  These patterns can be used to predict
   such likes and dislikes.


Introducing Recommenders

The Patterns



user-based – Look to what people with similar tastes seem to like

Example:




Introducing Recommenders

Strategies for Discovering New Things



item-based – Figure out what items are like the ones you already like (again by looking to
others’ apparent preferences)



Example:




Introducing Recommenders

Strategies for Discovering New Things



content-based – Suggest items based on a particular attribute (again by looking to others’ apparent
preferences)



Example:




Introducing Recommenders

Strategies for Discovering New Things


Collaborative Filtering – Producing recommendations based on, and only based on, knowledge of
users’ relationships to items.

[Diagram: Recommenders – User-based, Item-based, Content-based]



Recommendation is all about predicting
patterns of taste, and using them to
discover new and desirable things you
didn’t already know about.

Introducing Recommenders

The Definition of Recommendation
CC 2.0 by Will Scullin | http://flic.kr/p/6K9jb8	
  



•  Let’s start with a simple example

Create Input Data → Create a Recommender → Analyse the Output




A Simple user-based Example

The Workflow


•  Recommendations are based on input data
•  Data takes the form of preferences – associations from users to items

Example (user ID, item ID, preference value):

1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
5,106,4.0

For instance, the row 1,102,3.0 means: User 1 has a preference 3.0 for item 102.

These values might be ratings on a scale of 1 to 5, where 1 indicates items the user can’t
stand, and 5 indicates favorites.
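
The user,item,preference rows above can be loaded into a nested map before any recommender logic runs. The following is a minimal plain-Java sketch; the class name `PreferenceData` and its methods are hypothetical helpers for illustration, not part of Mahout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PreferenceData {
    // The example rows from the slide: userID,itemID,preference
    static final String CSV = String.join("\n",
        "1,101,5.0", "1,102,3.0", "1,103,2.5",
        "2,101,2.0", "2,102,2.5", "2,103,5.0", "2,104,2.0",
        "3,101,2.5", "3,104,4.0", "3,105,4.5", "3,107,5.0",
        "4,101,5.0", "4,103,3.0", "4,104,4.5", "4,106,4.0",
        "5,101,4.0", "5,102,3.0", "5,103,2.0", "5,104,4.0",
        "5,105,3.5", "5,106,4.0");

    // Parses the rows into userID -> (itemID -> preference).
    static Map<Long, Map<Long, Double>> parse(String csv) {
        Map<Long, Map<Long, Double>> prefs = new LinkedHashMap<>();
        for (String line : csv.split("\n")) {
            String[] fields = line.trim().split(",");
            if (fields.length != 3) {
                continue; // skip blank or malformed rows
            }
            long user = Long.parseLong(fields[0]);
            long item = Long.parseLong(fields[1]);
            double value = Double.parseDouble(fields[2]);
            prefs.computeIfAbsent(user, u -> new LinkedHashMap<>()).put(item, value);
        }
        return prefs;
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> prefs = parse(CSV);
        System.out.println("users: " + prefs.keySet());
        System.out.println("user 1 -> item 102: " + prefs.get(1L).get(102L));
    }
}
```

A map of maps keeps the lookup "which preference does user u have for item i" cheap, which is the access pattern the later steps rely on.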
                                                  	
  
                                                  	
  


A Simple user-based Example

Input Data

•  Trend visualization for positive user preferences (in petrol)
•  All other preferences are recognized as negative – the user doesn’t seem to like the
   item that much (red, dotted)

[Figure: users 1 to 5 connected to the items 101 to 107 they expressed preferences for, shown next to the same input data as above]


A Simple user-based Example

Trend Visualization


Users 1 and 5 seem to have similar tastes. Both like 101, like 102 a little less, and like 103 less still.

Users 1 and 4 seem to have similar tastes. Both seem to like 101 and 103 identically.

Users 1 and 2 have tastes that seem to run counter to each other.

[Figure: users 1 to 5 connected to items 101 to 107]


A Simple user-based Example

Trend Visualization



So what product might be recommended to user 1?

[Figure: users 1 to 5 connected to items 101 to 107]

Obviously not 101, 102 or 103. User 1 already knows about these.


A Simple user-based Example

Analyzing the Output


The output could be: [item:104, value:4.257081]



The recommender engine did so because it
estimated user 1’s preference for 104 to be
about 4.3, and that was the highest among all
the items eligible for recommendation.
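
To see where an estimate of about 4.3 can come from, here is a plain-Java sketch of one common user-based scheme: Pearson correlation over co-rated items, the 2 most similar users as the neighborhood, and a similarity-weighted average of their preferences. It is an illustration under those assumptions, not Mahout's implementation; the class `UserBasedSketch` and its methods are hypothetical. On the example data it estimates roughly 4.257 for user 1 and item 104.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class UserBasedSketch {
    // userID -> (itemID -> preference), copied from the example input data
    static final Map<Long, Map<Long, Double>> PREFS = new HashMap<>();

    static void put(long user, long item, double value) {
        PREFS.computeIfAbsent(user, k -> new HashMap<>()).put(item, value);
    }

    static {
        put(1, 101, 5.0); put(1, 102, 3.0); put(1, 103, 2.5);
        put(2, 101, 2.0); put(2, 102, 2.5); put(2, 103, 5.0); put(2, 104, 2.0);
        put(3, 101, 2.5); put(3, 104, 4.0); put(3, 105, 4.5); put(3, 107, 5.0);
        put(4, 101, 5.0); put(4, 103, 3.0); put(4, 104, 4.5); put(4, 106, 4.0);
        put(5, 101, 4.0); put(5, 102, 3.0); put(5, 103, 2.0); put(5, 104, 4.0);
        put(5, 105, 3.5); put(5, 106, 4.0);
    }

    // Pearson correlation over the items both users have rated (NaN if undefined).
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        List<Long> common = new ArrayList<>(a.keySet());
        common.retainAll(b.keySet());
        int n = common.size();
        if (n < 2) {
            return Double.NaN;
        }
        double meanA = 0, meanB = 0;
        for (Long item : common) { meanA += a.get(item); meanB += b.get(item); }
        meanA /= n;
        meanB /= n;
        double sumAB = 0, sumAA = 0, sumBB = 0;
        for (Long item : common) {
            double da = a.get(item) - meanA;
            double db = b.get(item) - meanB;
            sumAB += da * db; sumAA += da * da; sumBB += db * db;
        }
        return sumAB / Math.sqrt(sumAA * sumBB);
    }

    // Similarity-weighted average of the n nearest neighbours' preferences for the item.
    static double estimate(long user, long item, int n) {
        Map<Long, Double> sims = new HashMap<>();
        for (Long other : PREFS.keySet()) {
            if (other == user) continue;
            double s = pearson(PREFS.get(user), PREFS.get(other));
            if (!Double.isNaN(s)) sims.put(other, s);
        }
        List<Long> neighbours = new ArrayList<>(sims.keySet());
        neighbours.sort(Comparator.comparingDouble((Long u) -> sims.get(u)).reversed());
        neighbours = neighbours.subList(0, Math.min(n, neighbours.size()));
        double weighted = 0, total = 0;
        for (Long u : neighbours) {
            Double p = PREFS.get(u).get(item);
            if (p == null) continue; // this neighbour has not rated the item
            weighted += sims.get(u) * p;
            total += sims.get(u);
        }
        return total == 0 ? Double.NaN : weighted / total;
    }

    // Returns the not-yet-rated item with the highest estimate, or -1 if none.
    static long recommend(long user, int n) {
        TreeSet<Long> items = new TreeSet<>();
        for (Map<Long, Double> m : PREFS.values()) items.addAll(m.keySet());
        long best = -1;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (Long item : items) {
            if (PREFS.get(user).containsKey(item)) continue;
            double est = estimate(user, item, n);
            if (!Double.isNaN(est) && est > bestValue) { best = item; bestValue = est; }
        }
        return best;
    }

    public static void main(String[] args) {
        long item = recommend(1, 2);
        System.out.println("item:" + item + ", value:" + estimate(1, item, 2));
    }
}
```

With these assumptions, the neighborhood of user 1 consists of users 4 and 5, and only items 104, 105 and 106 receive an estimate; 104 scores highest.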

Questions:
•  Is this the best recommendation for user 1?
•  What exactly is a good recommendation?


A Simple user-based Example

Analyzing the Output
CC 2.0 by larsaaboe | http://flic.kr/p/7nJpV8	
  



Goal: Evaluate how closely the estimated preferences match the actual preferences.

How?
•  Prepare: a reasonable data set
•  Split: 70% for training, 30% for test
•  Run: produce estimated preferences with the training data
•  Analyse: compare the estimates with the test data → calculate a score

Experiment with other recommenders
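
The split step of this workflow can be sketched in plain Java as a seeded shuffle followed by a cut at 70%. The class `TrainTestSplit`, the sample `ROWS` and the seed are assumptions made for illustration; a real evaluator may split differently (for example per user).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class TrainTestSplit {
    // Ten of the example preference rows, used only to demonstrate the split.
    static final List<String> ROWS = List.of(
        "1,101,5.0", "1,102,3.0", "1,103,2.5", "2,101,2.0", "2,102,2.5",
        "2,103,5.0", "2,104,2.0", "3,101,2.5", "3,104,4.0", "3,105,4.5");

    // Shuffles a copy of the rows and returns {training, test}.
    static List<List<String>> split(List<String> rows, double trainingFraction, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed)); // fixed seed gives a repeatable split
        int cut = (int) Math.round(trainingFraction * shuffled.size());
        List<String> training = new ArrayList<>(shuffled.subList(0, cut));
        List<String> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));
        return List.of(training, test);
    }

    public static void main(String[] args) {
        List<List<String>> parts = split(ROWS, 0.7, 42L);
        System.out.println("training rows: " + parts.get(0).size());
        System.out.println("test rows:     " + parts.get(1).size());
    }
}
```

The recommender is then built from the training rows only, and its estimates are compared with the held-back test rows.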



A Simple user-based Example

Evaluating a Recommender



Example evaluation output for a particular recommender engine

                    Item 1    Item 2    Item 3
  Actual            3.0       5.0       4.0
  Estimate          3.5       2.0       5.0
  Difference        0.5       3.0       1.0

  Average distance  = (0.5 + 3.0 + 1.0) / 3 = 1.5
  Root-mean-square  = √((0.5² + 3.0² + 1.0²) / 3) = 1.8484

Note: A score of 0.0 would mean perfect estimation
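
Both scores from the table can be recomputed with a few lines of plain Java. This is a sketch for checking the arithmetic; the class and method names are hypothetical.

```java
class EvaluationScores {
    // Mean of the absolute differences between actual and estimated preferences.
    static double averageAbsoluteDifference(double[] actual, double[] estimate) {
        double sum = 0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - estimate[i]);
        }
        return sum / actual.length;
    }

    // Square root of the mean of the squared differences.
    static double rootMeanSquare(double[] actual, double[] estimate) {
        double sum = 0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - estimate[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    public static void main(String[] args) {
        double[] actual = {3.0, 5.0, 4.0};
        double[] estimate = {3.5, 2.0, 5.0};
        System.out.println(averageAbsoluteDifference(actual, estimate)); // 1.5
        System.out.printf("%.4f%n", rootMeanSquare(actual, estimate));
    }
}
```

The root-mean-square score penalizes the single large miss on item 2 more heavily than the average distance does, which is why it comes out higher.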



A Simple user-based Example

Evaluating a Recommender
CC 2.0 by amtrak_russ | http://flic.kr/p/6fAPej	
  



•  Mahout …
      •  Open-source machine learning library from Apache (Java)
      •  Can be used for large data collections – it’s scalable, built upon Apache Hadoop
      •  Implements algorithms such as Classification, Recommenders, Clustering
      •  Incubates a number of techniques and algorithms
•  ML is hype! But …

In a Nutshell

Apache Mahout



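A Simple Recommender

import java.io.File;
import java.util.List;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

class RecommenderExample {
  public static void main(String[] args) throws Exception {
    // Load the userID,itemID,preference rows
    DataModel model = new FileDataModel(new File("example.csv"));
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    // Neighborhood: the 2 users most similar to the target user
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
    Recommender recommender =
        new GenericUserBasedRecommender(model, neighborhood, similarity);
    // Ask for 1 recommendation for user 1
    List<RecommendedItem> recommendations = recommender.recommend(1, 1);
    for (RecommendedItem recommendation : recommendations) {
      System.out.println(recommendation);
    }
  }
}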
	
  




A Simple user-based Example

Create a Recommender


[Diagram: Application, <<interface>> Recommender, <<interface>> UserSimilarity, <<interface>> UserNeighborhood, <<interface>> DataModel]




A user-based Recommender

Component Interaction


NearestNUserNeighborhood
A neighborhood around user 1 is chosen to consist of the three most similar users: 5, 4, and 2

ThresholdUserNeighborhood
Defining a neighborhood of most-similar users with a similarity threshold

[Figure: two panels, one per neighborhood type, each showing users 1 to 5 as points]
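
The difference between the two neighborhood types can be sketched in plain Java as "take the top N" versus "take everyone above a cut-off". The class `NeighborhoodSketch` is hypothetical, and the similarity values are illustrative: they are Pearson correlations to user 1 worked out by hand from the example data (user 3 is left out because it shares only one item with user 1).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NeighborhoodSketch {
    // Illustrative similarities of other users to user 1 (assumed, see lead-in).
    static Map<Long, Double> exampleSimilarities() {
        Map<Long, Double> sims = new HashMap<>();
        sims.put(2L, -0.7643);
        sims.put(4L, 1.0);
        sims.put(5L, 0.9449);
        return sims;
    }

    // Fixed-size neighborhood: the n users with the highest similarity.
    static List<Long> nearestN(Map<Long, Double> sims, int n) {
        return sims.entrySet().stream()
            .sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
            .limit(n)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    // Threshold neighborhood: every user whose similarity is at least minSimilarity.
    static List<Long> threshold(Map<Long, Double> sims, double minSimilarity) {
        return sims.entrySet().stream()
            .filter(e -> e.getValue() >= minSimilarity)
            .sorted(Map.Entry.<Long, Double>comparingByValue().reversed())
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Long, Double> sims = exampleSimilarities();
        System.out.println("nearest 3:      " + nearestN(sims, 3));
        System.out.println("threshold 0.7:  " + threshold(sims, 0.7));
    }
}
```

A fixed N always yields a neighborhood of the same size, even if some members are dissimilar (user 2 here); a threshold keeps only sufficiently similar users but may return very few.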


Algorithms

UserNeighborhood



Implementations of this interface define a notion of similarity between two users.
Implementations should return values in the range -1.0 to 1.0, with 1.0 representing perfect
similarity.

<<interface>> UserSimilarity

Implementations:
•  EuclideanDistanceSimilarity
•  PearsonCorrelationSimilarity
•  UncenteredCosineSimilarity
•  LogLikelihoodSimilarity
•  TanimotoCoefficientSimilarity
•  ...



Algorithms

User Similarity


Similarity between data objects can be represented in a variety of ways:

•     Distance between data objects, derived from the distances between each attribute
      of the data objects (e.g. Euclidean distance)
•     Measuring how the attributes of both data objects change with respect to the
      variation of the mean value for the attributes (Pearson correlation coefficient)
•     Using the word frequencies for each document, the normalized dot product of the
      frequencies can be used as a measure of similarity (cosine similarity)
•     And a few more ...
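
Two of the measures listed above, Euclidean distance and cosine similarity, can be sketched in plain Java on the preferences users 1 and 5 share (items 101, 102, 103). The class `SimilaritySketch` is hypothetical, and mapping a distance d to a similarity via 1 / (1 + d) is one common choice assumed here, not necessarily what a given library does.

```java
class SimilaritySketch {
    // Square root of the summed squared differences per attribute.
    static double euclideanDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Assumed mapping from distance to similarity: 1.0 for identical objects, towards 0 as distance grows.
    static double distanceToSimilarity(double distance) {
        return 1.0 / (1.0 + distance);
    }

    // Normalized dot product of the two vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] user1 = {5.0, 3.0, 2.5}; // preferences for items 101, 102, 103
        double[] user5 = {4.0, 3.0, 2.0};
        double d = euclideanDistance(user1, user5);
        System.out.printf("distance:   %.4f%n", d);
        System.out.printf("similarity: %.4f%n", distanceToSimilarity(d));
        System.out.printf("cosine:     %.4f%n", cosine(user1, user5));
    }
}
```

Cosine similarity looks only at the direction of the two preference vectors, so it stays close to 1.0 for these two users even though user 1 rates everything slightly higher.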


Algorithms

User Similarity



Similarity between two data objects:

[Plot: users 1 to 5 as points; x-axis: preference for item 101 (0 to 5), y-axis: preference for item 102 (0 to 5)]




Mathematically & Plot

Euclidean Distance



Similarity between two data objects:

[Plot: items 101 to 104 as points; x-axis: User 1's preference (0 to 5), y-axis: User 5's preference (0 to 5)]




Mathematically & Plot

Pearson Correlation




                         Questions?
     Jean-Pierre König, jean-pierre.koenig@sentric.ch




Namics Conference 2012

Thank you!


•  References
     The content of this presentation is based on:
     •  Chapters 1, 2 and 4 of the following book:
        Owen, Anil, Dunning, Friedman. Mahout in Action. Shelter Island, NY: Manning
        Publications Co., 2012.
     •  Chapter “Discussion of Similarity Metrics” of the following publication:
        Shanley Philip. Data Mining Portfolio.
•  Links
   http://bitly.com/bundles/jpkoenig/1

A Simple user-based Example

Literature & Links

IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

What are product recommendations, and how do they work?

  • 1. CC 2.0 by Horia Varlan | http://flic.kr/p/7vjmof
  • 2. Agenda (Namics Conference 2012, September 1, 2012). What are product recommenders? Introducing recommenders; a simple example; recommender evaluation. How do they work? Machine learning tool: Apache Mahout.
  • 3. Intro: About Sentric. Spin-off of MeMo News AG, the leading provider of social media monitoring and analytics in Switzerland. Big data expert, focused on Hadoop, HBase and Solr. Objective: transforming data into insights.
  • 4. CC 2.0 by Dennis Wong | http://flic.kr/p/6C3RuV  
  • 5. Introducing Recommenders: The Patterns. Each day we form opinions about things we like, don’t like, and don’t even care about. People tend to like things that similar people like, and things that are similar to other things they like. These patterns can be used to predict such likes and dislikes.
  • 6. Introducing Recommenders: Strategies for Discovering New Things. User-based: look at what people with similar tastes seem to like. Example: [figure]
  • 7. Introducing Recommenders: Strategies for Discovering New Things. Item-based: figure out what items are like the ones you already like (again by looking at others’ apparent preferences). Example: [figure]
  • 8. Introducing Recommenders: Strategies for Discovering New Things. Content-based: suggest items based on a particular attribute (again by looking at others’ apparent preferences). Example: [figure]
  • 9. Introducing Recommenders: The Definition of Recommendation. Recommendation is all about predicting patterns of taste, and using them to discover new and desirable things you didn’t already know about. Collaborative filtering: producing recommendations based on, and only based on, knowledge of users’ relationships to items. [Diagram: recommenders divide into collaborative filtering (user-based, item-based) and content-based.]
  • 10. CC 2.0 by Will Scullin | http://flic.kr/p/6K9jb8  
  • 11. A Simple User-based Example: The Workflow. Let’s start with a simple example: (1) create input data, (2) create a recommender, (3) analyse the output.
  • 12. A Simple User-based Example: Input Data. Recommendations are based on input data. The data takes the form of preferences: associations from users to items, one per line as userID,itemID,value. Example: the line 1,102,3.0 means that user 1 has a preference of 3.0 for item 102. These values might be ratings on a scale of 1 to 5, where 1 indicates items the user can’t stand, and 5 indicates favorites.
        1,101,5.0
        1,102,3.0
        1,103,2.5
        2,101,2.0
        2,102,2.5
        2,103,5.0
        2,104,2.0
        3,101,2.5
        3,104,4.0
        3,105,4.5
        3,107,5.0
        4,101,5.0
        4,103,3.0
        4,104,4.5
        4,106,4.0
        5,101,4.0
        5,102,3.0
        5,103,2.0
        5,104,4.0
        5,105,3.5
        5,106,4.0
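As a rough illustration of the input format on this slide, the following sketch reads such lines into simple preference objects. The classes `Preference` and `PreferenceParser` are hypothetical helpers invented for this example; they are not part of Mahout, which ships its own file-based data model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type: one "userID,itemID,value" line of the input data.
class Preference {
    final long userId;
    final long itemId;
    final double value;

    Preference(long userId, long itemId, double value) {
        this.userId = userId;
        this.itemId = itemId;
        this.value = value;
    }
}

// Hypothetical parser, assuming well-formed lines with exactly three fields.
class PreferenceParser {
    // Parses newline-separated "userID,itemID,value" lines, skipping blank lines.
    static List<Preference> parse(String csv) {
        List<Preference> result = new ArrayList<>();
        for (String line : csv.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] fields = trimmed.split(",");
            result.add(new Preference(
                    Long.parseLong(fields[0].trim()),
                    Long.parseLong(fields[1].trim()),
                    Double.parseDouble(fields[2].trim())));
        }
        return result;
    }

    public static void main(String[] args) {
        String csv = "1,101,5.0\n1,102,3.0\n1,103,2.5\n";
        for (Preference p : parse(csv)) {
            System.out.println("user " + p.userId + " has preference "
                    + p.value + " for item " + p.itemId);
        }
    }
}
```

The sketch does no error handling; a malformed line would raise an exception rather than be skipped.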
  • 13. A Simple User-based Example: Trend Visualization. Trend visualization of the same input data: positive user preferences are drawn in petrol. All other preferences are treated as negative, meaning the user doesn’t seem to like the item that much (red, dotted). [Figure: users 1 to 5 linked to items 101 to 107.]
  • 14. A Simple User-based Example: Trend Visualization. Users 1 and 5 seem to have similar tastes: both like 101, like 102 a little less, and like 103 less still. Users 1 and 4 seem to have similar tastes: both seem to like 101 and 103 identically. Users 1 and 2 have tastes that seem to run counter to each other. [Figure: users 1 to 5 linked to items 101 to 107.]
  • 15. A Simple User-based Example: Analyzing the Output. So what product might be recommended to user 1? Obviously not 101, 102 or 103: user 1 already knows about these. [Figure: users 1 to 5 linked to items 101 to 107.]
  • 16. A Simple User-based Example: Analyzing the Output. The output could be: [item:104, value:4.257081]. The recommender engine recommended item 104 because it estimated user 1’s preference for 104 to be about 4.3, and that was the highest among all the items eligible for recommendation. Questions: Is this the best recommendation for user 1? What exactly is a good recommendation?
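The estimate of about 4.3 can be reproduced by hand under a few assumptions that the slide itself does not state: that the estimate is a similarity-weighted average of the neighbours' preferences for item 104, that the neighbourhood of user 1 consists of users 5 and 4, and that their Pearson similarities to user 1 over the co-rated items of the sample data are roughly 0.94 and 1.0. The class and method names below are hypothetical; this is a sketch of the arithmetic, not Mahout's code.

```java
// Hypothetical sketch: similarity-weighted average of neighbours' preferences.
class PreferenceEstimate {
    // similarities[i] is the similarity of neighbour i to the target user,
    // neighbourPrefs[i] is that neighbour's preference for the item in question.
    // Assumes the similarities do not sum to zero.
    static double estimate(double[] similarities, double[] neighbourPrefs) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weightedSum += similarities[i] * neighbourPrefs[i];
            totalWeight += similarities[i];
        }
        return weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // Assumed Pearson similarities to user 1: user 5 (2.5 / sqrt(7)), user 4 (1.0).
        double[] similarities = {2.5 / Math.sqrt(7.0), 1.0};
        // Preferences of users 5 and 4 for item 104, taken from the input data.
        double[] prefsFor104 = {4.0, 4.5};
        System.out.println(estimate(similarities, prefsFor104));
    }
}
```

Under these assumptions the result is roughly 4.2571, in line with the value 4.257081 shown on the slide. The same calculation gives 4.0 for item 106 and 3.5 for item 105, which would explain why 104 ranks first.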
  • 17. CC 2.0 by larsaaboe | http://flic.kr/p/7nJpV8  
  • 18. A Simple User-based Example: Evaluating a Recommender. Goal: evaluate how closely the estimated preferences match the actual preferences. How? (1) Prepare the data set. (2) Split it: 70% for training, 30% for test. (3) Run the recommender with the training data to produce estimated preferences. (4) Compare the estimates with the test data and calculate a score. (5) Analyse whether the score is reasonable. Experiment with other recommenders.
  • 19. A Simple User-based Example: Evaluating a Recommender. Example evaluation output for a particular recommender engine:
                     Item 1   Item 2   Item 3
        Actual        3.0      5.0      4.0
        Estimate      3.5      2.0      5.0
        Difference    0.5      3.0      1.0
    Average distance = (0.5 + 3.0 + 1.0) / 3 = 1.5. Root-mean-square = √((0.5² + 3.0² + 1.0²) / 3) = 1.8484. Note: a score of 0.0 would mean perfect estimation.
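The two scores on this slide can be checked with a few lines of plain Java. The class `EvaluationScores` and its method names are made up for this sketch; only the numbers come from the slide.

```java
// Hypothetical sketch of the two evaluation scores shown on the slide.
class EvaluationScores {
    // Mean of the absolute differences between actual and estimated preferences.
    static double averageAbsoluteDifference(double[] actual, double[] estimate) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - estimate[i]);
        }
        return sum / actual.length;
    }

    // Square root of the mean of the squared differences.
    static double rootMeanSquare(double[] actual, double[] estimate) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - estimate[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / actual.length);
    }

    public static void main(String[] args) {
        double[] actual = {3.0, 5.0, 4.0};
        double[] estimate = {3.5, 2.0, 5.0};
        System.out.println("Average distance: " + averageAbsoluteDifference(actual, estimate));
        System.out.println("Root-mean-square: " + rootMeanSquare(actual, estimate));
    }
}
```

The root-mean-square score penalises the single large miss on item 2 more heavily than the average distance does, which is why it comes out higher (about 1.85 versus 1.5).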
  • 20. CC 2.0 by amtrak_russ | http://flic.kr/p/6fAPej  
  • 21. In a Nutshell: Apache Mahout. Mahout is an open-source machine learning library from Apache, written in Java. It can be used for large data collections: it is scalable and built upon Apache Hadoop. It implements algorithms such as classification, recommenders and clustering. It incubates a number of techniques and algorithms. ML is hyped! But …
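  • 22. A Simple User-based Example: Create a Recommender. A simple recommender:
        import java.io.File;
        import java.util.List;
        import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
        import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
        import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
        import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
        import org.apache.mahout.cf.taste.model.DataModel;
        import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
        import org.apache.mahout.cf.taste.recommender.RecommendedItem;
        import org.apache.mahout.cf.taste.recommender.Recommender;
        import org.apache.mahout.cf.taste.similarity.UserSimilarity;

        class RecommenderExample {
          public static void main(String[] args) throws Exception {
            // Load the preference data (userID,itemID,value per line)
            DataModel model = new FileDataModel(new File("example.csv"));
            UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
            // Neighborhood of the 2 users most similar to the target user
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
            Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
            // Ask for 1 recommendation for user 1
            List<RecommendedItem> recommendations = recommender.recommend(1, 1);
            for (RecommendedItem recommendation : recommendations) {
              System.out.println(recommendation);
            }
          }
        }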
  • 23. A User-based Recommender: Component Interaction. [Diagram: the application uses a Recommender, which works with a DataModel, a UserSimilarity and a UserNeighborhood; all four are interfaces.]
  • 24. Algorithms: UserNeighborhood. NearestNUserNeighborhood: a neighborhood around user 1 is chosen to consist of the three most similar users: 5, 4, and 2. ThresholdUserNeighborhood: defining a neighborhood of most-similar users with a similarity threshold. [Figure: users 1 to 5 plotted twice, once per neighborhood type.]
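The difference between the two neighborhood types can be sketched in plain Java. Everything here is illustrative: `NeighborhoodSketch` is a hypothetical class, and the similarity values of users 2 to 5 to user 1 are invented so that the fixed-size neighborhood comes out as 5, 4 and 2, as on the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of fixed-size versus threshold-based neighborhoods.
class NeighborhoodSketch {
    // Fixed-size neighborhood: the n users with the highest similarity.
    static List<Long> nearestN(Map<Long, Double> similarities, int n) {
        List<Map.Entry<Long, Double>> entries = new ArrayList<>(similarities.entrySet());
        // Sort by similarity, highest first.
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<Long> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    // Threshold neighborhood: every user whose similarity is at least the threshold,
    // in the iteration order of the map.
    static List<Long> aboveThreshold(Map<Long, Double> similarities, double threshold) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, Double> entry : similarities.entrySet()) {
            if (entry.getValue() >= threshold) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented similarities of users 2..5 to user 1 (illustration only).
        Map<Long, Double> similarities = new LinkedHashMap<>();
        similarities.put(2L, 0.3);
        similarities.put(3L, 0.1);
        similarities.put(4L, 0.7);
        similarities.put(5L, 0.9);
        System.out.println("Nearest 3: " + nearestN(similarities, 3));
        System.out.println("Threshold 0.5: " + aboveThreshold(similarities, 0.5));
    }
}
```

A fixed size always yields the same number of neighbors, even if some are only weakly similar; a threshold keeps only sufficiently similar users, so the neighborhood size varies from user to user.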
  • 25. Algorithms: User Similarity. Implementations of the UserSimilarity interface define a notion of similarity between two users. Implementations should return values in the range -1.0 to 1.0, with 1.0 representing perfect similarity. Implementations include EuclideanDistanceSimilarity, PearsonCorrelationSimilarity, UncenteredCosineSimilarity, LogLikelihoodSimilarity, TanimotoCoefficientSimilarity, ...
  • 26. Algorithms: User Similarity. Similarity between data objects can be represented in a variety of ways: (1) the distance between data objects is the sum of the distances of each attribute of the data objects (e.g. Euclidean distance); (2) measuring how the attributes of both data objects change with respect to the variation of the mean value for the attributes (Pearson correlation coefficient); (3) using the word frequencies for each document, the normalized dot product of the frequencies can be used as a measure of similarity (cosine similarity); and a few more.
  • 27. Euclidean Distance: Mathematically & Plot. Similarity between two data objects. [Plot: users 1 to 5 positioned by their preferences for item 101 (x-axis, 0 to 5) and item 102 (y-axis, 0 to 5).]
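Since the formula on this slide did not survive as text, here is a minimal sketch of a Euclidean-distance-based similarity for users 1 and 5 over the items both have rated (101, 102, 103). The mapping 1 / (1 + d) from distance to similarity is an assumption chosen for illustration, and `EuclideanSketch` is a hypothetical class; Mahout's own implementation may differ in detail.

```java
// Hypothetical sketch: Euclidean distance between two users' preference vectors.
class EuclideanSketch {
    // Both arrays hold preferences for the same items, in the same order.
    static double distance(double[] a, double[] b) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Assumed mapping to a similarity in (0, 1]: identical users score 1.0,
    // and the score shrinks as the distance grows.
    static double similarity(double[] a, double[] b) {
        return 1.0 / (1.0 + distance(a, b));
    }

    public static void main(String[] args) {
        // Preferences for items 101, 102, 103 from the input data.
        double[] user1 = {5.0, 3.0, 2.5};
        double[] user5 = {4.0, 3.0, 2.0};
        System.out.println("Distance:   " + distance(user1, user5));
        System.out.println("Similarity: " + similarity(user1, user5));
    }
}
```

For these two users the distance is about 1.118 and the resulting similarity about 0.472.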
  • 28. Pearson Correlation: Mathematically & Plot. Similarity between two data objects. [Plot: items 101 to 104 positioned by user 1’s preference (x-axis, 0 to 5) and user 5’s preference (y-axis, 0 to 5).]
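The Pearson correlation mentioned here can be computed directly from the sample data. The sketch below uses the standard textbook formula over the items two users have both rated; `PearsonSketch` is a hypothetical class and not Mahout's implementation.

```java
// Hypothetical sketch: Pearson correlation of two users' preferences
// for the same items (arrays aligned by item).
class PearsonSketch {
    // Returns NaN when either user gives every item the same rating.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double covariance = 0.0;
        double varianceX = 0.0;
        double varianceY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Preferences for items 101, 102, 103 from the input data.
        double[] user1 = {5.0, 3.0, 2.5};
        double[] user5 = {4.0, 3.0, 2.0};
        double[] user2 = {2.0, 2.5, 5.0};
        System.out.println("User 1 vs user 5: " + pearson(user1, user5));
        System.out.println("User 1 vs user 2: " + pearson(user1, user2));
    }
}
```

Users 1 and 5 correlate at about 0.94, matching the observation that they have similar tastes, while users 1 and 2 come out at about -0.76, matching the observation that their tastes run counter to each other.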
  • 29. Thank you! Questions? Jean-Pierre König, jean-pierre.koenig@sentric.ch. Namics Conference 2012.
  • 30. Literature & Links. References: the content of this presentation is based on chapters 1, 2 and 4 of Owen, Anil, Dunning, Friedman. Mahout in Action. Shelter Island, NY: Manning Publications Co., 2012; and on the chapter “Discussion of Similarity Metrics” of Shanley, Philip. Data Mining Portfolio. Links: http://bitly.com/bundles/jpkoenig/1