Mahout in Action
          Part 1


    Yasmine M. Gaber
      28 February 2013
Agenda

    Meet Apache Mahout

    Part 1: Recommendation

    Part 2: Clustering

    Part 3: Classification
Meet Apache Mahout

  It is an open source machine learning library
from Apache

    It is scalable

    It is a Java library

 It can be used with Hadoop to deal with large
scale data.
Famous Engines

  Recommender engines:

 Amazon.com

 Netflix

 Dating sites like Líbímseti

 Social networking sites like Facebook

  Clustering engines:

 Google News

 Search engines like Clusty

  Classification engines:

 Spam email filters

 Google’s Picasa

 Optical character recognition software

 Apple’s Genius feature in iTunes
Recommendations
Recommender Input

    A preference consists of a user ID, an item ID,
    and the user’s preference value for the item

    The input is a .csv file
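
To make the input format concrete, here is a minimal sketch that parses preference lines into an in-memory map. It assumes one preference per line in the form userID,itemID,value. The class name PreferenceCsvSketch and the sample rows are made up for illustration, and the code does not use Mahout's own file loading.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout: parses "userID,itemID,value" lines.
final class PreferenceCsvSketch {

    // Returns userID -> (itemID -> preference value).
    static Map<Long, Map<Long, Float>> parse(List<String> lines) {
        Map<Long, Map<Long, Float>> prefs = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split(",");
            long userId = Long.parseLong(fields[0].trim());
            long itemId = Long.parseLong(fields[1].trim());
            float value = Float.parseFloat(fields[2].trim());
            prefs.computeIfAbsent(userId, k -> new LinkedHashMap<>()).put(itemId, value);
        }
        return prefs;
    }

    public static void main(String[] args) {
        // Made-up sample rows: user 1 rated items 101 and 102, user 2 rated item 101.
        List<String> sample = Arrays.asList("1,101,5.0", "1,102,3.0", "2,101,2.0");
        System.out.println(parse(sample));
    }
}
```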
Create Recommender
Recommender Evaluation

    Average absolute difference vs. root-mean-square difference
    between estimated and actual preferences
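
The two scores can be compared on a small example. The sketch below is plain Java with made-up numbers and a hypothetical class name. It does not use Mahout's evaluator classes. It only shows that the root-mean-square score penalizes a single large error more heavily than the average absolute difference does.

```java
// Hypothetical helper for illustration; not Mahout's evaluator implementation.
final class EvaluationSketch {

    // Mean of |estimated - actual| over all test preferences.
    static double averageAbsoluteDifference(double[] estimated, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(estimated[i] - actual[i]);
        }
        return sum / actual.length;
    }

    // Square root of the mean of (estimated - actual)^2.
    static double rootMeanSquare(double[] estimated, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = estimated[i] - actual[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    public static void main(String[] args) {
        // Made-up estimates: one is off by 2.0, one by 0.5, one is exact.
        double[] estimated = {3.5, 2.0, 5.0};
        double[] actual = {3.0, 4.0, 5.0};
        System.out.println("average difference: " + averageAbsoluteDifference(estimated, actual));
        System.out.println("root-mean-square:   " + rootMeanSquare(estimated, actual));
    }
}
```

In both cases a lower score means the estimates are closer to the held-out preferences.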
Mahout RecommenderEvaluator
Precision and Recall
RecommenderIRStatsEvaluator
Representing Recommender Data

    Preference object
    −   new GenericPreference(123, 456, 3.0f)

    Preference Array
Representing Recommender Data

    Preference Array





    FastByIDMap and FastIDSet
In-memory DataModels

    GenericDataModel


    File-based data


    Refreshable components


    Database-based data
Coping without preference values
User-based Recommender

    The algorithm

for every item i that u has no preference for yet
  for every other user v that has a preference for i
    compute a similarity s between u and v
    incorporate v's preference for i, weighted by s, into a running average
return the top items, ranked by weighted average
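
The loop above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not Mahout's GenericUserBasedRecommender: the class name, the sample data and the toySimilarity metric are made up, users with non-positive similarity are ignored, and the loops run over users first and items second, which yields the same weighted averages.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of user-based recommendation; not Mahout code.
final class UserBasedSketch {

    // Toy metric (assumption): 1 / (1 + sum of absolute differences on shared items).
    static double toySimilarity(Map<Long, Double> a, Map<Long, Double> b) {
        double sum = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                sum += Math.abs(e.getValue() - other);
            }
        }
        return 1.0 / (1.0 + sum);
    }

    // prefs maps userID -> (itemID -> value); returns up to howMany item IDs for user u.
    static List<Long> recommend(long u, Map<Long, Map<Long, Double>> prefs,
            ToDoubleBiFunction<Map<Long, Double>, Map<Long, Double>> similarity, int howMany) {
        Map<Long, Double> own = prefs.get(u);
        Map<Long, Double> weightedSum = new HashMap<>();
        Map<Long, Double> weightTotal = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> other : prefs.entrySet()) {
            if (other.getKey() == u) {
                continue; // only other users contribute
            }
            double s = similarity.applyAsDouble(own, other.getValue());
            if (Double.isNaN(s) || s <= 0.0) {
                continue; // assumption: skip undefined or non-positive similarity
            }
            for (Map.Entry<Long, Double> pref : other.getValue().entrySet()) {
                if (own.containsKey(pref.getKey())) {
                    continue; // u already has a preference for this item
                }
                weightedSum.merge(pref.getKey(), s * pref.getValue(), Double::sum);
                weightTotal.merge(pref.getKey(), s, Double::sum);
            }
        }
        // Rank candidate items by weighted average, highest first.
        List<Long> ranked = new ArrayList<>(weightedSum.keySet());
        ranked.sort((a, b) -> Double.compare(
                weightedSum.get(b) / weightTotal.get(b),
                weightedSum.get(a) / weightTotal.get(a)));
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }

    // Made-up data: user 2 agrees with user 1 on shared items, user 3 disagrees.
    static Map<Long, Map<Long, Double>> sample() {
        Map<Long, Map<Long, Double>> prefs = new HashMap<>();
        prefs.put(1L, Map.of(101L, 5.0, 102L, 3.0));
        prefs.put(2L, Map.of(101L, 5.0, 102L, 3.0, 103L, 4.0, 104L, 1.0));
        prefs.put(3L, Map.of(101L, 1.0, 102L, 5.0, 103L, 1.0, 104L, 5.0));
        return prefs;
    }

    public static void main(String[] args) {
        System.out.println(recommend(1L, sample(), UserBasedSketch::toySimilarity, 1));
    }
}
```

Because user 2 is far more similar to user 1 than user 3 is, user 2's high preference for item 103 dominates the weighted average.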
Recommender Components

    Data model, implemented via DataModel


    User-user similarity metric, implemented via
    UserSimilarity


    User neighborhood definition, implemented via
    UserNeighborhood


    Recommender engine, implemented via a
    Recommender (here, GenericUserBasedRecommender)
User Neighborhoods

    Fixed-size neighborhoods





    Threshold-based neighborhood
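
The difference between the two neighborhood definitions can be sketched in a few lines of plain Java. The class and method names below are hypothetical and the similarity values are made up. Mahout ships its own implementations (NearestNUserNeighborhood and ThresholdUserNeighborhood), which this sketch does not use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two neighborhood definitions; not Mahout code.
final class NeighborhoodSketch {

    // Fixed-size: keep the n users most similar to the target user.
    static List<Long> nearestN(Map<Long, Double> similarityToTarget, int n) {
        List<Long> users = new ArrayList<>(similarityToTarget.keySet());
        users.sort((a, b) -> Double.compare(similarityToTarget.get(b), similarityToTarget.get(a)));
        return new ArrayList<>(users.subList(0, Math.min(n, users.size())));
    }

    // Threshold-based: keep every user whose similarity is at least the threshold.
    static List<Long> aboveThreshold(Map<Long, Double> similarityToTarget, double threshold) {
        List<Long> users = new ArrayList<>();
        for (Map.Entry<Long, Double> e : similarityToTarget.entrySet()) {
            if (e.getValue() >= threshold) {
                users.add(e.getKey());
            }
        }
        Collections.sort(users); // sort by user ID so the example output is stable
        return users;
    }

    public static void main(String[] args) {
        // Made-up similarities of users 2..5 to some target user.
        Map<Long, Double> sims = Map.of(2L, 0.9, 3L, 0.4, 4L, 0.75, 5L, -0.2);
        System.out.println("nearest 2:        " + nearestN(sims, 2));
        System.out.println("threshold >= 0.3: " + aboveThreshold(sims, 0.3));
    }
}
```

A fixed-size neighborhood always returns the same number of users, even if some are only weakly similar, while a threshold-based neighborhood grows or shrinks with the data.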
Similarity metrics

    Pearson correlation–based similarity
    −   It is a number between –1 and 1 that measures
        the tendency of two series of numbers, paired up
        one-to-one, to move together
    −   Problems:

            It doesn’t take into account the number of items on
            which two users’ preferences overlap, which is probably
            a weakness in the context of recommender engines.

            If two users overlap on only one item, no correlation can
            be computed because of how the computation is
            defined.
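
A minimal plain-Java sketch of the Pearson correlation makes both problems visible. The class name is hypothetical and this is the textbook formula, not Mahout's PearsonCorrelationSimilarity. The inputs are the two users' preference values on the items they both rated, paired up one-to-one.

```java
// Hypothetical sketch of the textbook Pearson correlation; not Mahout code.
final class PearsonSketch {

    // Correlation of two equal-length series, or NaN when it is undefined.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sumXY = 0.0;
        double sumXX = 0.0;
        double sumYY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sumXY += dx * dy;
            sumXX += dx * dx;
            sumYY += dy * dy;
        }
        double denominator = Math.sqrt(sumXX * sumYY);
        // A single pair (or a constant series) has zero variance: no correlation exists.
        return denominator == 0.0 ? Double.NaN : sumXY / denominator;
    }

    public static void main(String[] args) {
        // Two overlapping items and five overlapping items can score the same.
        System.out.println(pearson(new double[] {1, 2}, new double[] {3, 4}));
        System.out.println(pearson(new double[] {1, 2, 3, 4, 5}, new double[] {2, 4, 6, 8, 10}));
        // Only one overlapping item: the result is undefined.
        System.out.println(pearson(new double[] {4}, new double[] {5}));
    }
}
```

The first two calls illustrate that the score ignores how many items overlap, and the third shows why a single overlapping item yields no usable value.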
Similarity metrics

    Euclidean distance similarity
    −   1 / (1 + Euclidean distance)

    Cosine measure similarity
    −   A value between –1 and 1

    Tanimoto coefficient similarity
    −   The ratio of the size of the intersection to the size of
        the union of two users’ preferred items
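
The three metrics above can be written out directly. The following plain-Java sketch uses a hypothetical class name and the textbook definitions, and may differ in detail from Mahout's own similarity classes. The first two take preference values on co-rated items, while the Tanimoto coefficient only needs the sets of preferred item IDs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of three textbook similarity metrics; not Mahout code.
final class SimilaritySketch {

    // 1 / (1 + d), where d is the Euclidean distance between the two series.
    static double euclideanSimilarity(double[] x, double[] y) {
        double sumSquares = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sumSquares += diff * diff;
        }
        return 1.0 / (1.0 + Math.sqrt(sumSquares));
    }

    // Cosine of the angle between the two vectors; NaN if either vector is all zeros.
    static double cosine(double[] x, double[] y) {
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        double denominator = Math.sqrt(normX) * Math.sqrt(normY);
        return denominator == 0.0 ? Double.NaN : dot / denominator;
    }

    // |intersection| / |union| of the two sets of preferred item IDs.
    static double tanimoto(Set<Long> a, Set<Long> b) {
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return unionSize == 0 ? Double.NaN : (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        System.out.println(euclideanSimilarity(new double[] {1, 2}, new double[] {4, 6}));
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
        System.out.println(tanimoto(Set.of(1L, 2L, 3L), Set.of(2L, 3L, 4L)));
    }
}
```

Tanimoto is the natural choice when the data has no preference values, because it looks only at which items each user has expressed a preference for.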
Item-based recommendation

    The algorithm

for every item i that u has no preference for yet
  for every item j that u has a preference for
    compute a similarity s between i and j
    add u's preference for j, weighted by s, to a running average
return the top items, ranked by weighted average
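
A plain-Java sketch of this loop follows. It is an illustration only, not Mahout's GenericItemBasedRecommender: the class name, the hard-coded item-item similarity table and the sample preferences are all made up, and item pairs with undefined or non-positive similarity are skipped as an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of item-based recommendation; not Mahout code.
final class ItemBasedSketch {

    // Made-up item-item similarities: candidate item -> (rated item -> similarity).
    private static final Map<Long, Map<Long, Double>> TABLE = Map.of(
            103L, Map.of(101L, 0.9, 102L, 0.1),
            104L, Map.of(101L, 0.2, 102L, 0.8));

    // Looks up the made-up table; NaN when no similarity is known.
    static double sampleSimilarity(long i, long j) {
        return TABLE.getOrDefault(i, Map.<Long, Double>of()).getOrDefault(j, Double.NaN);
    }

    // userPrefs maps itemID -> value for one user; returns up to howMany item IDs.
    static List<Long> recommend(Map<Long, Double> userPrefs, Set<Long> allItems,
            ToDoubleBiFunction<Long, Long> itemSimilarity, int howMany) {
        Map<Long, Double> estimates = new HashMap<>();
        for (Long i : allItems) {
            if (userPrefs.containsKey(i)) {
                continue; // the user already has a preference for i
            }
            double weightedSum = 0.0;
            double weightTotal = 0.0;
            for (Map.Entry<Long, Double> pref : userPrefs.entrySet()) {
                double s = itemSimilarity.applyAsDouble(i, pref.getKey());
                if (Double.isNaN(s) || s <= 0.0) {
                    continue; // assumption: skip undefined or non-positive similarity
                }
                weightedSum += s * pref.getValue();
                weightTotal += s;
            }
            if (weightTotal > 0.0) {
                estimates.put(i, weightedSum / weightTotal);
            }
        }
        // Rank candidate items by estimated preference, highest first.
        List<Long> ranked = new ArrayList<>(estimates.keySet());
        ranked.sort((a, b) -> Double.compare(estimates.get(b), estimates.get(a)));
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }

    public static void main(String[] args) {
        Map<Long, Double> userPrefs = Map.of(101L, 5.0, 102L, 2.0);
        Set<Long> allItems = Set.of(101L, 102L, 103L, 104L);
        System.out.println(recommend(userPrefs, allItems, ItemBasedSketch::sampleSimilarity, 2));
    }
}
```

Unlike the user-based loop, the similarities here are between items, so they can in principle be computed once and reused for every user.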
GenericItemBasedRecommender
Slope-one recommender

    The algorithm

for every item i the user u expresses no preference for
  for every item j that user u expresses a preference for
    find the average preference difference between j and i
    add this diff to u's preference value for j
    add this to a running average
return the top items, ranked by these averages
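
The slope-one estimate can be sketched in plain Java as below. The class name and data are made up, the running average is unweighted (an assumption; implementations may weight each difference by how many users support it), and the final ranking step is left out so the estimates themselves are visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of an unweighted slope-one estimate; not Mahout code.
final class SlopeOneSketch {

    // Average of (preference for i - preference for j) over users who rated both.
    static double averageDiff(Map<Long, Map<Long, Double>> prefs, long i, long j) {
        double sum = 0.0;
        int count = 0;
        for (Map<Long, Double> row : prefs.values()) {
            Double prefI = row.get(i);
            Double prefJ = row.get(j);
            if (prefI != null && prefJ != null) {
                sum += prefI - prefJ;
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    // Returns itemID -> estimated preference for every item user u has not rated.
    static Map<Long, Double> estimate(long u, Map<Long, Map<Long, Double>> prefs) {
        Map<Long, Double> own = prefs.get(u);
        Set<Long> allItems = new TreeSet<>();
        for (Map<Long, Double> row : prefs.values()) {
            allItems.addAll(row.keySet());
        }
        Map<Long, Double> estimates = new TreeMap<>();
        for (Long i : allItems) {
            if (own.containsKey(i)) {
                continue; // u already has a preference for i
            }
            double total = 0.0;
            int count = 0;
            for (Map.Entry<Long, Double> pref : own.entrySet()) {
                double diff = averageDiff(prefs, i, pref.getKey());
                if (Double.isNaN(diff)) {
                    continue; // nobody rated both items
                }
                total += pref.getValue() + diff; // u's value for j, shifted by the diff
                count++;
            }
            if (count > 0) {
                estimates.put(i, total / count);
            }
        }
        return estimates;
    }

    // Made-up data: users 2 and 3 rate item 102 on average 0.5 higher than item 101.
    static Map<Long, Map<Long, Double>> sample() {
        Map<Long, Map<Long, Double>> prefs = new HashMap<>();
        prefs.put(1L, Map.of(101L, 4.0));
        prefs.put(2L, Map.of(101L, 3.0, 102L, 4.0));
        prefs.put(3L, Map.of(101L, 5.0, 102L, 5.0));
        return prefs;
    }

    public static void main(String[] args) {
        System.out.println(estimate(1L, sample()));
    }
}
```

No similarity metric is involved: the estimate relies only on average differences between item pairs, which can be precomputed.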
Taking Recommender to Production
User-based recommenders
Thank You



               Contact at:
Email: Yasmine.Gaber@espace.com.eg
Twitter: Twitter.com/yasmine_mohamed
