Developing a Movie Recommendation Engine with Spark

Published in: Technology

  1. Developing a Movie Recommendation Engine with Spark (www.edureka.co/apache-spark-scala-training)
  2. What are we going to learn today? At the end of the session, you will know:
       • What a recommendation engine is
       • Which major companies use recommendation engines
       • The different approaches to building a recommendation engine
       • How to build a recommendation engine using Spark and its machine learning library (MLlib)
  3. Transition: Search to Recommendation. "We are leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you are looking for something. Discovery is when something wonderful that you didn't know existed finds you." (CNN Money, "The race to create a smart Google")
  4. Recommendations make life easier. Recommendations help users find information, products, and services that they might not have thought of.
  5. Recommendation Approaches
       • Collaborative filtering: the user is recommended items that people with similar tastes and preferences liked in the past.
       • Content-based: the user is recommended items similar to the ones that user preferred in the past.
       • Hybrid methods: recommendations combine the collaborative filtering and content-based approaches.
  6. Let's take a small quiz.
  7. Recommendation Engine at Last.fm. Recommended tracks by Last.fm. Which approach does Last.fm use to recommend music?
  8. Recommendation Engine at IMDB. Movie recommendations by IMDB. Which approach does IMDB use to recommend movies?
  9. Recommendation Engine at Amazon. Recommended books by Amazon. Which approach does Amazon use to recommend items?
  10. Recommendation Engine at YouTube. Recommended videos by YouTube. Which approach does YouTube use to recommend videos?
  11. Recommendation Engine at LinkedIn. Job recommendations by LinkedIn. Which approach does LinkedIn use to recommend jobs?
  12. Implementing a Recommendation Engine. To implement a recommendation engine, we need the following:
       • Data source: stores the historical data, e.g. MySQL, MongoDB, HBase
       • Spark: low-latency computing
       • MLlib: a library of machine learning algorithms
  13. High-Level Architecture of the Recommendation Engine. Diagram components: Data Source, Hadoop, Spark Application, MLlib.
  14. Step 1: Data Source
  15. Step 2: Hadoop to the rescue. One problem with having different types of data sources is that the raw data is not well structured, so we need something that can store data from all of these sources in a single place. Hadoop is the best fit for this problem.
  16. Step 3: Spark. Once all the data is in place, we can use Spark to run in-memory computations on it. Apache Spark is an in-memory cluster computing system that provides real-time data processing capability. Note that it is possible to build a recommendation engine without Spark, using only Hadoop. However, Hadoop MapReduce reads from and writes to disk between processing steps rather than keeping data in memory, which takes extra time, so a recommendation engine built using only Hadoop will not be real time.
  17. Step 4: MLlib. Diagram: Spark with MLlib, SparkSQL, and Spark Streaming. Rather than writing the entire recommendation engine from scratch, we can use MLlib, Spark's widely used machine learning library, which provides the algorithms needed to build a recommendation engine.
  18. High-Level Architecture of the Recommendation Engine. Diagram components: Data Source, Hadoop, Spark Application, MLlib.
  19. Let's see a code example: code to build a recommendation engine.
  20. Questions
  21. References
       • http://recommender-systems.org/content-based-filtering/
       • http://archive.fortune.com/magazines/fortune/fortune_archive/2006/11/27/8394347/index.htm
       • http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
  22. Course URL
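
Slide 5 describes collaborative filtering as recommending items that people with similar tastes liked in the past. The following is a minimal sketch of that idea in plain Python, without Spark, so it can be run on a laptop. The users, movie titles, ratings, and the function names `cosine_similarity` and `recommend` are all hypothetical and chosen only for illustration; they do not come from the slides.

```python
from math import sqrt

# Hypothetical toy data: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"Alien": 5.0, "Blade Runner": 4.0, "Casablanca": 1.0},
    "bob":   {"Alien": 4.0, "Blade Runner": 5.0, "Dune": 5.0},
    "carol": {"Casablanca": 5.0, "Amelie": 4.0, "Alien": 1.0},
}

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating vectors.

    Movies a user has not rated count as 0, so only movies rated
    by both users contribute to the dot product.
    """
    dot = sum(a[m] * b[m] for m in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=2):
    """Rank movies the user has not rated yet.

    Each unseen movie is scored by the sum of other users' ratings,
    weighted by how similar each of those users is to `user`.
    """
    seen = ratings[user]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(seen, other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in seen:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

if __name__ == "__main__":
    print(recommend("alice", ratings))
```

In this toy data, bob's ratings resemble alice's far more than carol's do, so a movie only bob has rated is ranked above one only carol has rated. A content-based recommender would instead compare the movies themselves (for example by genre), which this sketch does not attempt.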

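
Slide 17 points to MLlib for the machine learning algorithms. MLlib's collaborative filtering is built on alternating least squares (ALS) matrix factorization, and the sketch below illustrates that idea in plain Python with a single latent factor per user and per movie. It is not MLlib code and not the code shown in the session: the toy ratings, the function names `als_rank1`, `loss`, and `predict`, and the parameter values are assumptions made for illustration.

```python
# Hypothetical toy data: (user_id, movie_id, rating) triples.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 2, 5.0),
    (2, 0, 1.0), (2, 2, 2.0),
]

def loss(ratings, x, y, reg):
    """Regularised squared error of the rank-1 model r_ui ~ x[u] * y[i]."""
    err = sum((r - x[u] * y[i]) ** 2 for (u, i, r) in ratings)
    return err + reg * (sum(v * v for v in x) + sum(v * v for v in y))

def als_rank1(ratings, n_users, n_items, reg=0.1, iterations=20):
    """Alternating least squares with one latent factor (rank 1).

    Each half-step fixes one side and solves the other side's
    one-dimensional ridge regression in closed form.
    """
    x = [1.0] * n_users  # user factors
    y = [1.0] * n_items  # movie factors
    for _ in range(iterations):
        # Movie factors fixed: update every user factor.
        for u in range(n_users):
            rows = [(i, r) for (uu, i, r) in ratings if uu == u]
            num = sum(r * y[i] for (i, r) in rows)
            den = sum(y[i] ** 2 for (i, r) in rows) + reg
            x[u] = num / den
        # User factors fixed: update every movie factor.
        for i in range(n_items):
            cols = [(u, r) for (u, ii, r) in ratings if ii == i]
            num = sum(r * x[u] for (u, r) in cols)
            den = sum(x[u] ** 2 for (u, r) in cols) + reg
            y[i] = num / den
    return x, y

def predict(x, y, user, item):
    """Predicted rating for a (user, movie) pair."""
    return x[user] * y[item]

if __name__ == "__main__":
    x, y = als_rank1(ratings, n_users=3, n_items=3)
    # User 0 has not rated movie 2; the model fills in an estimate.
    print(predict(x, y, 0, 2))
```

In Spark, the corresponding entry point is MLlib's own ALS implementation (for example `ALS.train` in `pyspark.mllib.recommendation`), which uses more than one latent factor and distributes these updates across the cluster.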