SlideShare a Scribd company logo
1 of 23
Download to read offline
Non-Negative Matrix
Factorization
A quick tutorial
Matrices (also Matrixes)
In mathematics, a matrix (plural matrices) is a
rectangular array of numbers arranged in rows
and columns. The individual items in a matrix
are called its elements or entries.
An example of a matrix with 2 rows and 3
columns is:
Source: Wikipedia
Size of Matrices
The size of a matrix is defined by the number of
rows and columns that it contains. A matrix with
m rows and n columns is called an m × n matrix
or m-by-n matrix, while m and n are called its
dimensions.
Size?
Basic Matrix Operations
We’ll discuss some basic matrix operations next, and we’ll
practice by generating random matrices* using a simple
python script that can be found here:
http://bit.ly/mgen-py
*Please applaud instructor level of difficulty and excuse whiteboard arithmetic mistakes.
Adding (and Subtracting) Matrices
For matrices of the same dimension, add together the
corresponding elements to result in another matrix of the
same dimension.
Scalar Multiplication
To multiply a scalar denoted as i, or any real
number against a matrix, A - simply multiply i to
each element of the matrix.
Transposing Matrices
Reflect a matrix along its diagonal by swapping
the matrix’s rows and columns. Such that a m x
n matrix becomes an n x m matrix.
Matrix Multiplication
A matrix product between a matrix A with size n x m and B
with size m x p will produce an n x p matrix in which the m
columns of A are multiplied against the m rows of B as
follows:
The Dot Product of the
row in A and column in B
The Dot Product
The dot product of two vectors (e.g. 1 x m or n
x 1 matrices) of the same size is the sum of the
products of each element at every direction.
Matrix Sparsity (or Density)
The percentage of non-zero elements to the
number of elements in total (and 1 - the ratio of
non-zero elements to the number of elements).
[ 11 22 0 0 0 0 0 ]
[ 0 33 44 0 0 0 0 ]
[ 0 0 55 66 77 0 0 ]
[ 0 0 0 0 0 88 0 ]
[ 0 0 0 0 0 0 99 ]
Matrix Factorization (or Decomposition)
Given a n x m matrix, R - find two smaller
matrices, P and Q with k-dimensional features -
e.g. that P is size n x k and Q is size m x k -
such that their product approximates R.
(Adjusting k allows for more accuracy when
reconstructing the original matrix.)
Note: Sparse Matrix Factorization is used when the matrix is populated primarily by zeros
Recommendations
In order to compute recommendations, we will
construct a users x movies matrix such that
every element of the matrix is the user’s rating,
if any:
Star Wars Bridget Jones The Hobbit ...
Bob 5 2 0
Joe 3 4 2
Jane 0 0 3
... ... ... ...
As you can see - this is a pretty sparse matrix!
Factored Matrices
● P is the features matrix. It has a row for each
feature and a column for each column in the
original matrix (movie).
● Q is the weights matrix. It has a column for
each feature, and a row for each row in the
original matrix (user).
● Therefore when you multiply the dot product,
you’re finding the sum of the latent features
by the weights for each element in the
original matrix.
Latent Features
● The features and weights described before
measure “latent” features - e.g. hidden, non-
human describable features and their
weights, but could relate to things like genre.
● You need less features than items,
otherwise the best answer is simply every
item (no similarity).
● The magic is where there are zero values -
the product will fill them in, predicting their
value.
Non-negative
Called non-negative matrix factorization
because it returns features and weights with no
negative values. Therefore all features must be
positive or zero values.
Clustering
Non-Negative Matrix Factorization is closely
related to both supervised and unsupervised
methodologies (supervised because R can be
seen as a training set) - but in particular NNMF
is closely related to other clustering
(unsupervised) algorithms.
Gradient Descent
● The technique we will use to
factor is called gradient
descent, which attempts to
minimize error.
● We can calculate error from
our product using the
squared error (actual -
predicted)2
● Once we know the error, we
can calculate the gradient in
order to figure out what
direction to go to minimize
the error. We keep going
until we have no more error.
The Algorithm
→ Initialize P and Q with random small numbers
→ for step until max_steps:
for row, col in R:
if R[row][col] > 0:
compute error of element
compute gradient from error
update P and Q with new entry
compute total error
if error < some threshold:
break
return P, Q.T
Needed Computations
Compute* predicted element for each user-
movie pair (dot product of row and column in P
and Q):
Compute the squared error for each user-movie
pair (in order to compute the gradient):
*computations for non-zero values only
Needed Computations
Find gradient (slope of error curve) by taking
the differential of the error for each element:
Update Rule
Update each element in P and Q by using a
learning rate (called α) - this determines how
far to travel along the gradient. α is usually
small, because if we choose a step size that is
too large, we could miss the minimum.
Convergence and Regularization
We converge once the sum of the errors has
reached some threshold, usually very small.
Extending this algorithm will introduce
regularization to avoid overfitting by adding a
beta parameter. This forces the algorithm to
control the magnitudes of the feature vectors.
Predicted Recommendations
Our predicted matrix for our movies rating, will
end up looking something like what follows:
Where there were zeros before, we now have
predictions!
Star Wars Bridget Jones The Hobbit ...
Bob 4.98148768 2.02748447 3.29852779
Joe 3.0157327 3.968359 2.01139212
Jane 4.50410968 2.93580899 2.98826278
... ... ... ...
As you can see - the predictions are very close to the actual values, but we now
suspect that Jane will really like Star Wars!

More Related Content

What's hot

Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXBenjamin Bengfort
 
Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...
Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...
Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...Edureka!
 
Linear discriminant analysis
Linear discriminant analysisLinear discriminant analysis
Linear discriminant analysisBangalore
 
Support vector machine
Support vector machineSupport vector machine
Support vector machineMusa Hawamdah
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Kyunghoon Kim
 
Image Caption Generation using Convolutional Neural Network and LSTM
Image Caption Generation using Convolutional Neural Network and LSTMImage Caption Generation using Convolutional Neural Network and LSTM
Image Caption Generation using Convolutional Neural Network and LSTMOmkar Reddy
 
Singular Value Decompostion (SVD)
Singular Value Decompostion (SVD)Singular Value Decompostion (SVD)
Singular Value Decompostion (SVD)Isaac Yowetu
 
Generative adversarial network and its applications to speech signal and natu...
Generative adversarial network and its applications to speech signal and natu...Generative adversarial network and its applications to speech signal and natu...
Generative adversarial network and its applications to speech signal and natu...宏毅 李
 
Support Vector Machine without tears
Support Vector Machine without tearsSupport Vector Machine without tears
Support Vector Machine without tearsAnkit Sharma
 
Wrapper feature selection method
Wrapper feature selection methodWrapper feature selection method
Wrapper feature selection methodAmir Razmjou
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clusteringSOYEON KIM
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function홍배 김
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERKnoldus Inc.
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Universitat Politècnica de Catalunya
 
Model selection and cross validation techniques
Model selection and cross validation techniquesModel selection and cross validation techniques
Model selection and cross validation techniquesVenkata Reddy Konasani
 
support vector regression
support vector regressionsupport vector regression
support vector regressionAkhilesh Joshi
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer PerceptronsESCOM
 

What's hot (20)

Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...
Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...
Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Baye...
 
Linear discriminant analysis
Linear discriminant analysisLinear discriminant analysis
Linear discriminant analysis
 
Linear discriminant analysis
Linear discriminant analysisLinear discriminant analysis
Linear discriminant analysis
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용
 
Image Caption Generation using Convolutional Neural Network and LSTM
Image Caption Generation using Convolutional Neural Network and LSTMImage Caption Generation using Convolutional Neural Network and LSTM
Image Caption Generation using Convolutional Neural Network and LSTM
 
Singular Value Decompostion (SVD)
Singular Value Decompostion (SVD)Singular Value Decompostion (SVD)
Singular Value Decompostion (SVD)
 
Generative adversarial network and its applications to speech signal and natu...
Generative adversarial network and its applications to speech signal and natu...Generative adversarial network and its applications to speech signal and natu...
Generative adversarial network and its applications to speech signal and natu...
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Support Vector Machine without tears
Support Vector Machine without tearsSupport Vector Machine without tears
Support Vector Machine without tears
 
Wrapper feature selection method
Wrapper feature selection methodWrapper feature selection method
Wrapper feature selection method
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clustering
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
 
Knn 160904075605-converted
Knn 160904075605-convertedKnn 160904075605-converted
Knn 160904075605-converted
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...
 
Model selection and cross validation techniques
Model selection and cross validation techniquesModel selection and cross validation techniques
Model selection and cross validation techniques
 
support vector regression
support vector regressionsupport vector regression
support vector regression
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer Perceptrons
 

Viewers also liked

Simple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutSimple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutData Science London
 
Matrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsMatrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsLei Guo
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsYONG ZHENG
 
Latent factor models for Collaborative Filtering
Latent factor models for Collaborative FilteringLatent factor models for Collaborative Filtering
Latent factor models for Collaborative Filteringsscdotopen
 
Collaborative Filtering with Spark
Collaborative Filtering with SparkCollaborative Filtering with Spark
Collaborative Filtering with SparkChris Johnson
 
آموزش محاسبات عددی - بخش دوم
آموزش محاسبات عددی - بخش دومآموزش محاسبات عددی - بخش دوم
آموزش محاسبات عددی - بخش دومfaradars
 
Matrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsMatrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsAladejubelo Oluwashina
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorizationrubyyc
 
Neighbor methods vs matrix factorization - case studies of real-life recommen...
Neighbor methods vs matrix factorization - case studies of real-life recommen...Neighbor methods vs matrix factorization - case studies of real-life recommen...
Neighbor methods vs matrix factorization - case studies of real-life recommen...Domonkos Tikk
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFMLiangjie Hong
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization MachinesPavel Kalaidin
 
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2njit-ronbrown
 
Introduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative FilteringIntroduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative FilteringDKALab
 
Recommendation system
Recommendation system Recommendation system
Recommendation system Vikrant Arya
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsT212
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemMilind Gokhale
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineNYC Predictive Analytics
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 

Viewers also liked (20)

Simple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutSimple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in Mahout
 
Matrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsMatrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender Systems
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
 
Latent factor models for Collaborative Filtering
Latent factor models for Collaborative FilteringLatent factor models for Collaborative Filtering
Latent factor models for Collaborative Filtering
 
Collaborative Filtering with Spark
Collaborative Filtering with SparkCollaborative Filtering with Spark
Collaborative Filtering with Spark
 
آموزش محاسبات عددی - بخش دوم
آموزش محاسبات عددی - بخش دومآموزش محاسبات عددی - بخش دوم
آموزش محاسبات عددی - بخش دوم
 
Matrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsMatrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender Systems
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorization
 
Neighbor methods vs matrix factorization - case studies of real-life recommen...
Neighbor methods vs matrix factorization - case studies of real-life recommen...Neighbor methods vs matrix factorization - case studies of real-life recommen...
Neighbor methods vs matrix factorization - case studies of real-life recommen...
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization Machines
 
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Introduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative FilteringIntroduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative Filtering
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 

Similar to Beginners Guide to Non-Negative Matrix Factorization

Kulum alin-11 jan2014
Kulum alin-11 jan2014Kulum alin-11 jan2014
Kulum alin-11 jan2014rolly purnomo
 
Machine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepMachine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepSanjanaSaxena17
 
Implement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratchImplement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratchEshanAgarwal4
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and ldaSuresh Pokharel
 
Numerical Methods
Numerical MethodsNumerical Methods
Numerical MethodsESUG
 
Feature selection using PCA.pptx
Feature selection using PCA.pptxFeature selection using PCA.pptx
Feature selection using PCA.pptxbeherasushree212
 
MNIST and machine learning - presentation
MNIST and machine learning - presentationMNIST and machine learning - presentation
MNIST and machine learning - presentationSteve Dias da Cruz
 
Linear Algebra Presentation including basic of linear Algebra
Linear Algebra Presentation including basic of linear AlgebraLinear Algebra Presentation including basic of linear Algebra
Linear Algebra Presentation including basic of linear AlgebraMUHAMMADUSMAN93058
 
Machine Learning.pdf
Machine Learning.pdfMachine Learning.pdf
Machine Learning.pdfBeyaNasr1
 
Markov Cluster Algorithm & real world application
Markov Cluster Algorithm & real world applicationMarkov Cluster Algorithm & real world application
Markov Cluster Algorithm & real world applicationAndjela Todorovic
 
Arrays with Numpy, Computer Graphics
Arrays with Numpy, Computer GraphicsArrays with Numpy, Computer Graphics
Arrays with Numpy, Computer GraphicsPrabu U
 
Fundamentals of Machine Learning.pptx
Fundamentals of Machine Learning.pptxFundamentals of Machine Learning.pptx
Fundamentals of Machine Learning.pptxWiamFADEL
 
Matrix and its applications by mohammad imran
Matrix and its applications by mohammad imranMatrix and its applications by mohammad imran
Matrix and its applications by mohammad imranMohammad Imran
 
Project management
Project managementProject management
Project managementAvay Minni
 
5.2 Least Squares Linear Regression.pptx
5.2  Least Squares Linear Regression.pptx5.2  Least Squares Linear Regression.pptx
5.2 Least Squares Linear Regression.pptxMaiEllahham1
 

Similar to Beginners Guide to Non-Negative Matrix Factorization (20)

Kulum alin-11 jan2014
Kulum alin-11 jan2014Kulum alin-11 jan2014
Kulum alin-11 jan2014
 
Machine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by stepMachine Learning Notes for beginners ,Step by step
Machine Learning Notes for beginners ,Step by step
 
Implement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratchImplement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratch
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and lda
 
Numerical Methods
Numerical MethodsNumerical Methods
Numerical Methods
 
Feature selection using PCA.pptx
Feature selection using PCA.pptxFeature selection using PCA.pptx
Feature selection using PCA.pptx
 
MNIST and machine learning - presentation
MNIST and machine learning - presentationMNIST and machine learning - presentation
MNIST and machine learning - presentation
 
Linear Algebra Presentation including basic of linear Algebra
Linear Algebra Presentation including basic of linear AlgebraLinear Algebra Presentation including basic of linear Algebra
Linear Algebra Presentation including basic of linear Algebra
 
Mat lab.pptx
Mat lab.pptxMat lab.pptx
Mat lab.pptx
 
Machine Learning.pdf
Machine Learning.pdfMachine Learning.pdf
Machine Learning.pdf
 
Matlab for marketing people
Matlab for marketing peopleMatlab for marketing people
Matlab for marketing people
 
Markov Cluster Algorithm & real world application
Markov Cluster Algorithm & real world applicationMarkov Cluster Algorithm & real world application
Markov Cluster Algorithm & real world application
 
working with python
working with pythonworking with python
working with python
 
Arrays with Numpy, Computer Graphics
Arrays with Numpy, Computer GraphicsArrays with Numpy, Computer Graphics
Arrays with Numpy, Computer Graphics
 
Fundamentals of Machine Learning.pptx
Fundamentals of Machine Learning.pptxFundamentals of Machine Learning.pptx
Fundamentals of Machine Learning.pptx
 
Matrix and its applications by mohammad imran
Matrix and its applications by mohammad imranMatrix and its applications by mohammad imran
Matrix and its applications by mohammad imran
 
Ada notes
Ada notesAda notes
Ada notes
 
Brute force method
Brute force methodBrute force method
Brute force method
 
Project management
Project managementProject management
Project management
 
5.2 Least Squares Linear Regression.pptx
5.2  Least Squares Linear Regression.pptx5.2  Least Squares Linear Regression.pptx
5.2 Least Squares Linear Regression.pptx
 

More from Benjamin Bengfort

Visual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningBenjamin Bengfort
 
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Benjamin Bengfort
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Benjamin Bengfort
 
Visualizing the Model Selection Process
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection ProcessBenjamin Bengfort
 
A Primer on Entity Resolution
A Primer on Entity ResolutionA Primer on Entity Resolution
A Primer on Entity ResolutionBenjamin Bengfort
 
An Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportAn Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportBenjamin Bengfort
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataBenjamin Bengfort
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
 
Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)Benjamin Bengfort
 
An Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseBenjamin Bengfort
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with PythonBenjamin Bengfort
 
Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Benjamin Bengfort
 
Building Data Apps with Python
Building Data Apps with PythonBuilding Data Apps with Python
Building Data Apps with PythonBenjamin Bengfort
 

More from Benjamin Bengfort (18)

Getting Started with TRISA
Getting Started with TRISAGetting Started with TRISA
Getting Started with TRISA
 
Visual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learning
 
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
Visualizing Model Selection with Scikit-Yellowbrick: An Introduction to Devel...
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)
 
Visualizing the Model Selection Process
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection Process
 
Data Product Architectures
Data Product ArchitecturesData Product Architectures
Data Product Architectures
 
A Primer on Entity Resolution
A Primer on Entity ResolutionA Primer on Entity Resolution
A Primer on Entity Resolution
 
An Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation ReportAn Interactive Visual Analytics Dashboard for the Employment Situation Report
An Interactive Visual Analytics Dashboard for the Employment Situation Report
 
Graph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational DataGraph Based Machine Learning on Relational Data
Graph Based Machine Learning on Relational Data
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)Evolutionary Design of Swarms (SSCI 2014)
Evolutionary Design of Swarms (SSCI 2014)
 
An Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed DatabaseAn Overview of Spanner: Google's Globally Distributed Database
An Overview of Spanner: Google's Globally Distributed Database
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
 
Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)
 
Annotation with Redfox
Annotation with RedfoxAnnotation with Redfox
Annotation with Redfox
 
Rasta processing of speech
Rasta processing of speechRasta processing of speech
Rasta processing of speech
 
Building Data Apps with Python
Building Data Apps with PythonBuilding Data Apps with Python
Building Data Apps with Python
 

Recently uploaded

Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 

Recently uploaded (20)

Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Beginners Guide to Non-Negative Matrix Factorization

  • 2. Matrices (also Matrixes) In mathematics, a matrix (plural matrices) is a rectangular array of numbers arranged in rows and columns. The individual items in a matrix are called its elements or entries. An example of a matrix with 2 rows and 3 columns is: Source: Wikipedia
  • 3. Size of Matrices The size of a matrix is defined by the number of rows and columns that it contains. A matrix with m rows and n columns is called an m × n matrix or m-by-n matrix, while m and n are called its dimensions. Size?
  • 4. Basic Matrix Operations We’ll discuss some basic matrix operations next, and we’ll practice by generating random matrices* using a simple python script that can be found here: http://bit.ly/mgen-py *Please applaud instructor level of difficulty and excuse whiteboard arithmetic mistakes.
  • 5. Adding (and Subtracting) Matrices For matrices of the same dimension, add together the corresponding elements to result in another matrix of the same dimension.
  • 6. Scalar Multiplication To multiply a scalar denoted as i, or any real number against a matrix, A - simply multiply i to each element of the matrix.
  • 7. Transposing Matrices Reflect a matrix along its diagonal by swapping the matrix’s rows and columns. Such that a m x n matrix becomes an n x m matrix.
  • 8. Matrix Multiplication A matrix product between a matrix A with size n x m and B with size m x p will produce an n x p matrix in which the m columns of A are multiplied against the m rows of B as follows: The Dot Product of the row in A and column in B
  • 9. The Dot Product The dot product of two vectors (e.g. 1 x m or n x 1 matrices) of the same size is the sum of the products of each element at every direction.
  • 10. Matrix Sparsity (or Density) The percentage of non-zero elements to the number of elements in total (and 1 - the ratio of non-zero elements to the number of elements). [ 11 22 0 0 0 0 0 ] [ 0 33 44 0 0 0 0 ] [ 0 0 55 66 77 0 0 ] [ 0 0 0 0 0 88 0 ] [ 0 0 0 0 0 0 99 ]
  • 11. Matrix Factorization (or Decomposition) Given a n x m matrix, R - find two smaller matrices, P and Q with k-dimensional features - e.g. that P is size n x k and Q is size m x k - such that their product approximates R. (Adjusting k allows for more accuracy when reconstructing the original matrix.) Note: Sparse Matrix Factorization is used when the matrix is populated primarily by zeros
  • 12. Recommendations In order to compute recommendations, we will construct a users x movies matrix such that every element of the matrix is the user’s rating, if any: Star Wars Bridget Jones The Hobbit ... Bob 5 2 0 Joe 3 4 2 Jane 0 0 3 ... ... ... ... As you can see - this is a pretty sparse matrix!
  • 13. Factored Matrices ● P is the features matrix. It has a row for each feature and a column for each column in the original matrix (movie). ● Q is the weights matrix. It has a column for each feature, and a row for each row in the original matrix (user). ● Therefore when you multiply the dot product, you’re finding the sum of the latent features by the weights for each element in the original matrix.
  • 14. Latent Features ● The features and weights described before measure “latent” features - e.g. hidden, non- human describable features and their weights, but could relate to things like genre. ● You need less features than items, otherwise the best answer is simply every item (no similarity). ● The magic is where there are zero values - the product will fill them in, predicting their value.
  • 15. Non-negative Called non-negative matrix factorization because it returns features and weights with no negative values. Therefore all features must be positive or zero values.
  • 16. Clustering Non-Negative Matrix Factorization is closely related to both supervised and unsupervised methodologies (supervised because R can be seen as a training set) - but in particular NNMF is closely related to other clustering (unsupervised) algorithms.
  • 17. Gradient Descent ● The technique we will use to factor is called gradient descent, which attempts to minimize error. ● We can calculate error from our product using the squared error (actual - predicted)2 ● Once we know the error, we can calculate the gradient in order to figure out what direction to go to minimize the error. We keep going until we have no more error.
  • 18. The Algorithm → Initialize P and Q with random small numbers → for step until max_steps: for row, col in R: if R[row][col] > 0: compute error of element compute gradient from error update P and Q with new entry compute total error if error < some threshold: break return P, Q.T
  • 19. Needed Computations Compute* predicted element for each user- movie pair (dot product of row and column in P and Q): Compute the squared error for each user-movie pair (in order to compute the gradient): *computations for non-zero values only
  • 20. Needed Computations Find gradient (slope of error curve) by taking the differential of the error for each element:
  • 21. Update Rule Update each element in P and Q by using a learning rate (called α) - this determines how far to travel along the gradient. α is usually small, because if we choose a step size that is too large, we could miss the minimum.
  • 22. Convergence and Regularization We converge once the sum of the errors has reached some threshold, usually very small. Extending this algorithm will introduce regularization to avoid overfitting by adding a beta parameter. This forces the algorithm to control the magnitudes of the feature vectors.
  • 23. Predicted Recommendations Our predicted matrix for our movies rating, will end up looking something like what follows: Where there were zeros before, we now have predictions! Star Wars Bridget Jones The Hobbit ... Bob 4.98148768 2.02748447 3.29852779 Joe 3.0157327 3.968359 2.01139212 Jane 4.50410968 2.93580899 2.98826278 ... ... ... ... As you can see - the predictions are very close to the actual values, but we now suspect that Jane will really like Star Wars!