By: Puneet Gupta
M.Tech (Future Studies and Planning)
• A mahout is a person who drives an elephant as its master.
• The name reflects the project's close association with Apache Hadoop, which uses an elephant as its logo.
• Apache Mahout started as a sub-project of Apache Lucene in 2008. In 2010, Mahout became a top-level Apache project.
What is Apache Mahout?
• Apache Mahout is an open source project
• Mahout is a Java library that implements machine learning techniques:
– Recommendation
– Clustering
– Classification
What can we do?
• Mahout currently supports three main use cases:
– Recommendation: takes users' behavior and, from that, tries to find items users might like.
– Clustering: takes, for example, text documents and groups them into sets of topically related documents.
– Classification: learns from existing categorized documents what documents of a specific category look like, and can then assign unlabeled documents to the (hopefully) correct category.
Why Mahout?
• Mahout is not the only machine learning framework
– Weka
– R
• Why do we prefer Mahout?
– Apache License
– Good community
– Good documentation
– Scalable: based on Hadoop (not mandatory!)
Why do we need a scalable framework?
Algorithms
• Recommendation
– User-based Collaborative Filtering
– Item-based Collaborative Filtering
– Slope One Recommenders
– Singular Value Decomposition
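To make the first item on this list concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not Mahout's API; the toy `RATINGS` data, the function names, and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not taken from the slides.
RATINGS = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 5.0, "B": 3.0, "C": 4.0, "D": 5.0},
    "carol": {"A": 1.0, "B": 5.0, "D": 1.0, "E": 4.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank items the user has not rated by the similarity-weighted
    average of the ratings given by other users."""
    seen = ratings[user]
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(seen, other_ratings)
        if sim <= 0.0:
            continue  # ignore users with no usable overlap
        for item, rating in other_ratings.items():
            if item in seen:
                continue
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Calling `recommend("alice", RATINGS)` returns the unseen items `E` and `D` with their predicted ratings. Item-based filtering follows the same idea with the roles of users and items swapped.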
Algorithms
• Clustering
– Canopy
– K-Means
– Fuzzy K-Means
– Latent Dirichlet Allocation (LDA)
– MinHash Clustering
– Hierarchical Clustering
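As an illustration of K-Means from the list above, the following is a small single-machine sketch in plain Python. It does not use Mahout; the function names and the deterministic seeding (the first `k` points become the initial centroids) are assumptions chosen to keep the example short and reproducible.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def k_means(points, k, max_iterations=100):
    """Lloyd's algorithm. Assumption: the first k points seed the centroids
    (real implementations usually seed randomly or with a smarter scheme)."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = mean(members)
    return centroids, assignments
```

For example, `k_means([(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)], 2)` separates the three points near the origin from the three points near (9, 9).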
Algorithms
• Classification
– Logistic Regression
– Bayes
– Random Forests
– Hidden Markov Models
– Support Vector Machines
– Neural Networks
– Restricted Boltzmann Machines
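The "Bayes" entry can be sketched as a tiny multinomial naive Bayes text classifier with add-one smoothing. This is plain Python rather than Mahout's implementation; the function names, the whitespace tokenization, and the training sentences in the usage note are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def train(labeled_docs):
    """labeled_docs is a list of (label, text) pairs.
    Returns the counts the classifier needs at prediction time."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for label, text in labeled_docs:
        words = text.lower().split()
        doc_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior for the text."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        score = log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            count = word_counts[label][word]
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Trained on a few labeled sentences, for instance two tagged `sports` and two tagged `tech`, `classify` assigns an unlabeled sentence to the category whose vocabulary it most resembles.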
Mahout vs. R
• Mahout is a Java library, whereas R is an expression language with a very simple syntax.
• Mahout is used for big data, while R is used for prototype data.
Apache Mahout
