COLLABORATIVE
FILTERING USING
ORTHOGONAL
NONNEGATIVE MATRIX
Presenter: Meng-Lun Wu
Authors: Gang Chen, Fei Wang and Changshui Zhang
Source: Information Processing and Management
(2009), pp. 368-379
OUTLINE
 Introduction
 Related Work
 Orthogonal nonnegative matrix tri-factorization
 Framework
 Experiments
 Conclusion
INTRODUCTION
 Collaborative filtering can predict a test user’s
rating for new items based on similar users.
 Collaborative filtering can be categorized into…
 Memory-based (similarity)
 user-based and item-based
 Model-based
 Establish a model using training examples.
INTRODUCTION (CONT.)
 This paper applies orthogonal nonnegative matrix
tri-factorization (ONMTF) to combine the two
kinds of collaborative filtering.
 ONMTF is applied to simultaneously co-cluster
the rows and columns, and attain individual
predictions for an unknown test rating.
 The approach has advantages in the following respects:
 Sparsity problem
 Scalability problem
 Fusing prediction results
RELATED WORK
 Researchers have proposed hybrid
approaches that combine the memory-based
and model-based approaches.
 Xue et al. (2005) address sparsity and scalability
by clustering users and using the clusters to smooth ratings.
 However, Xue et al. consider only the user-based approach;
this paper extends the idea to integrate model-based,
user-based and item-based approaches.
RELATED WORK (CONT.)
 Matrix decomposition can be used to solve the
co-clustering problem.
 Ding et al. (2005) proposed co-clustering based on
nonnegative matrix factorization (NMF).
 In 2006, they proposed ONMTF.
 Long et al. (2005) proposed co-clustering by block
value decomposition.
ORTHOGONAL NONNEGATIVE
MATRIX TRI-FACTORIZATION
 NMF was first brought into the machine learning
and data mining fields by Lee et al. (2001).
 Ding et al. (2006) proved the equivalence
between NMF and K-means, and extended NMF
to ONMTF.
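 The idea is to approximate the original matrix X
by the product of three factor matrices, $U S V^{T}$. The
optimization problem is
\[
\min_{U \ge 0,\; S \ge 0,\; V \ge 0} \; \lVert X - U S V^{T} \rVert^{2}
\quad \text{s.t.} \quad U^{T} U = I, \;\; V^{T} V = I,
\]
where $X \in \mathbb{R}_{+}^{p \times n}$, $U \in \mathbb{R}_{+}^{p \times k}$, $S \in \mathbb{R}_{+}^{k \times l}$ and $V \in \mathbb{R}_{+}^{n \times l}$.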
ORTHOGONAL NONNEGATIVE
MATRIX TRI-FACTORIZATION
(CONT.)
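 The optimization problem can be solved using the
following multiplicative update rules (Ding et al., 2006):
\[
V_{jk} \leftarrow V_{jk} \sqrt{\frac{(X^{T} U S)_{jk}}{(V V^{T} X^{T} U S)_{jk}}}
\]
\[
U_{ik} \leftarrow U_{ik} \sqrt{\frac{(X V S^{T})_{ik}}{(U U^{T} X V S^{T})_{ik}}}
\]
\[
S_{ik} \leftarrow S_{ik} \sqrt{\frac{(U^{T} X V)_{ik}}{(U^{T} U S V^{T} V)_{ik}}}
\]
 After co-clustering, we obtain the user-cluster centroids
$S V^{T}$ and the item-cluster centroids $U S$.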
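The update rules above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it uses the square-root form of the multiplicative updates from Ding et al. (2006), and the function name `onmtf`, the random initialization, the fixed iteration count and the small constant `eps` (added to avoid division by zero) are all assumptions made for the example.

```python
import numpy as np

def onmtf(X, k, l, n_iter=100, eps=1e-9, seed=0):
    """Approximate a nonnegative p x n matrix X by U @ S @ V.T.

    U is p x k, S is k x l, V is n x l. Names and defaults are
    illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    p, n = X.shape
    # Assumption: random strictly positive start (no initialization is given).
    U = rng.random((p, k)) + eps
    S = rng.random((k, l)) + eps
    V = rng.random((n, l)) + eps
    for _ in range(n_iter):
        # V <- V * sqrt( (X^T U S) / (V V^T X^T U S) )
        A = X.T @ U @ S
        V *= np.sqrt(A / (V @ (V.T @ A) + eps))
        # U <- U * sqrt( (X V S^T) / (U U^T X V S^T) )
        B = X @ V @ S.T
        U *= np.sqrt(B / (U @ (U.T @ B) + eps))
        # S <- S * sqrt( (U^T X V) / (U^T U S V^T V) )
        C = U.T @ X @ V
        S *= np.sqrt(C / (U.T @ U @ S @ V.T @ V + eps))
    return U, S, V
```

With the factors in hand, `S @ V.T` gives one user-cluster centroid per row (k x n) and `U @ S` gives one item-cluster centroid per column (p x l). In practice a convergence test on the reconstruction error would likely replace the fixed iteration count.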
FRAMEWORK
 Notations
 $X = [u_1, \dots, u_p]^{T}$, $u_j = (x_{j1}, \dots, x_{jn})^{T}$, $j \in \{1, \dots, p\}$
 $X = [i_1, \dots, i_n]$, $i_m = (x_{1m}, \dots, x_{pm})^{T}$, $m \in \{1, \dots, n\}$
MEMORY-BASED APPROACHES
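 User’s neighbor selection
 Compute the similarities between a user and all the
user-cluster centroids $S V^{T}$.
 Select the top K user clusters as the user set $u_h$.
 The item’s neighbor selection is similar.
 The cosine similarity between the $j_1$th user and
the $j_2$th user is
\[
\mathrm{sim}(u_{j_1}, u_{j_2}) = \frac{\sum_{m=1}^{n} (x_{j_1 m})(x_{j_2 m})}{\sqrt{\sum_{m=1}^{n} (x_{j_1 m})^{2}} \; \sqrt{\sum_{m=1}^{n} (x_{j_2 m})^{2}}}
\]
 Given a user-item pair $\langle u_j, i_m \rangle$, the user-based prediction is
\[
\tilde{x}^{u}_{jm} = \bar{u}_j + \frac{\sum_{u_h} \mathrm{sim}(u_h, u_j)\,(x_{hm} - \bar{u}_h)}{\sum_{u_h} \mathrm{sim}(u_h, u_j)}
\]
where $u_h \in$ {the K users most similar to $u_j$} and $\bar{u}_j$ is the mean rating of user $u_j$.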
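The cosine similarity and the user-based prediction above can be illustrated with a short NumPy sketch. The function names are hypothetical, and two conventions are assumptions made for the example rather than details from the slides: a rating of 0 marks "not rated", and neighbors who have not rated the target item are skipped.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity of two rating vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def user_mean(X, j):
    """Mean of the ratings user j has given (assumption: 0 means unrated)."""
    rated = X[j][X[j] > 0]
    return float(rated.mean()) if rated.size else 0.0

def predict_user_based(X, j, m, neighbors):
    """Mean of user j plus the similarity-weighted average of the
    neighbors' deviations from their own means on item m."""
    num = den = 0.0
    for h in neighbors:
        if X[h, m] == 0:          # assumption: skip neighbors without a rating
            continue
        s = cosine_sim(X[j], X[h])
        num += s * (X[h, m] - user_mean(X, h))
        den += s
    base = user_mean(X, j)
    return base + num / den if den > 0 else base
```

For example, with `X = np.array([[4., 0., 2.], [4., 5., 2.]])`, calling `predict_user_based(X, 0, 1, [1])` adds the single neighbor's deviation from its mean on item 1 to the mean rating of user 0.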
MEMORY-BASED APPROACHES
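 The adjusted-cosine similarity between the $m_1$th
and $m_2$th items (T is the set of users who rated both
$m_1$ and $m_2$) is
\[
\mathrm{sim}(i_{m_1}, i_{m_2}) = \frac{\sum_{t \in T} (x_{t m_1} - \bar{u}_t)(x_{t m_2} - \bar{u}_t)}{\sqrt{\sum_{t \in T} (x_{t m_1} - \bar{u}_t)^{2}} \; \sqrt{\sum_{t \in T} (x_{t m_2} - \bar{u}_t)^{2}}}
\]
 Given a user-item pair $\langle u_j, i_m \rangle$, the item-based prediction is
\[
\tilde{x}^{i}_{jm} = \frac{\sum_{i_h} \mathrm{sim}(i_h, i_m)\, x_{jh}}{\sum_{i_h} \mathrm{sim}(i_h, i_m)}
\]
where $i_h \in$ {the K items most similar to $i_m$}.
 The final prediction is a linear combination of
the three different types of predictions:
\[
\tilde{x}_{jm} = \lambda\, \tilde{x}^{n}_{jm} + \delta (1 - \lambda)\, \tilde{x}^{u}_{jm} + (1 - \delta)(1 - \lambda)\, \tilde{x}^{i}_{jm}
\]
where $\tilde{x}^{u}_{jm}$ and $\tilde{x}^{i}_{jm}$ are the user-based and item-based predictions, $\tilde{x}^{n}_{jm}$ is the third (model-based) prediction, and $\lambda$ and $\delta$ are the combination coefficients.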
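The item-based side and the final linear combination can be sketched in the same way. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, a rating of 0 is taken to mean "not rated", user means are computed over rated items only, and `x_model` stands for the third prediction that is weighted by λ.

```python
import numpy as np

def adjusted_cosine(X, m1, m2):
    """Adjusted-cosine similarity of items m1 and m2 over the users T
    who rated both (assumption: 0 means unrated)."""
    rated = X > 0
    counts = rated.sum(axis=1)
    means = X.sum(axis=1) / np.maximum(counts, 1)   # per-user mean of rated items
    T = rated[:, m1] & rated[:, m2]
    if not T.any():
        return 0.0
    d1 = X[T, m1] - means[T]
    d2 = X[T, m2] - means[T]
    denom = np.sqrt((d1 ** 2).sum()) * np.sqrt((d2 ** 2).sum())
    return float(d1 @ d2 / denom) if denom > 0 else 0.0

def predict_item_based(X, j, m, neighbors):
    """Similarity-weighted average of user j's ratings on the neighbor items."""
    num = den = 0.0
    for h in neighbors:
        if X[j, h] == 0:          # assumption: skip items user j has not rated
            continue
        s = adjusted_cosine(X, h, m)
        num += s * X[j, h]
        den += s
    return num / den if den > 0 else 0.0

def fuse(x_model, x_user, x_item, lam, delta):
    """Linear combination of the three predictions; the weights sum to 1."""
    return lam * x_model + delta * (1 - lam) * x_user + (1 - delta) * (1 - lam) * x_item
```

Setting `lam=0` drops the λ-weighted prediction, so `delta` alone balances the user-based and item-based predictions.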
ALGORITHM
1. The user-item matrix X is factorized as $U S V^{T}$ by using
ONMTF.
2. Calculate the similarities between the test user/item and the
user/item-cluster centroids.
3. Sort the similarities and select the most similar C
user/item clusters as the test user/item neighbor
candidate set.
4. Identify the K nearest neighbors of the test user/item by
searching the user/item candidate set.
5. Predict the unknown ratings by using the user-based and
item-based approaches.
6. Linearly combine the three different predictions.
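Steps 2 to 4 can be sketched for the user side as follows (the item side would be symmetric, using the columns of `U @ S`). The function name is hypothetical, and assigning each training user to the cluster with the largest entry in its row of U is an assumption made for the example; the slides do not say how cluster membership is read off the factors.

```python
import numpy as np

def select_user_neighbors(X, U, S, V, x_test, C, K):
    """Return indices of the K training users most similar to x_test,
    searching only the C most similar user clusters."""
    def cos_rows(M, v):
        # Cosine similarity between every row of M and the vector v.
        norms = np.linalg.norm(M, axis=1) * np.linalg.norm(v)
        return np.divide(M @ v, norms, out=np.zeros(M.shape[0]), where=norms > 0)

    centroids = S @ V.T                               # k x n, one centroid per row
    top_clusters = np.argsort(-cos_rows(centroids, x_test))[:C]
    labels = U.argmax(axis=1)                         # assumption: hard assignment
    candidates = np.flatnonzero(np.isin(labels, top_clusters))
    sims = cos_rows(X[candidates], x_test)
    return candidates[np.argsort(-sims)[:K]]
```

Restricting the search to the candidate set is what keeps the neighbor search cheap: only users in the C selected clusters are compared with the test user, rather than all p training users.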
EXPERIMENTS
 Dataset
 MovieLens: 500 users and 1000 items (1-5 scales)
 Training set: the first 100, 200 and 300 users, called
ML_100, ML_200 and ML_300.
 Testing set: the last 200 users
 We randomly selected 5, 10 and 20 items rated by
test users, called Given5, Given10 and Given20.
 Evaluation metric
 Mean absolute error (MAE) is used as the evaluation metric:
\[
\mathrm{MAE} = \frac{\sum_{j,m} \lvert x_{jm} - \tilde{x}_{jm} \rvert}{N}
\]
where N is the number of tested ratings.
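As a small illustration of the metric, the sketch below computes MAE over paired lists of true and predicted ratings. The function name and the flat-list input format are assumptions made for the example.

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted ratings."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.abs(actual - predicted).mean())
```

For example, true ratings `[4, 3, 5]` and predictions `[3.5, 3, 4]` give absolute errors 0.5, 0 and 1, so the MAE is 0.5. Lower values mean better predictions.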
DIFFERENT NUMBERS OF CLUSTERS
 The ML_300 dataset is used for training, and
10 different values of k or l (2, 5, 10, 20, …, 80) are tried.
PERCENTAGE OF NEIGHBORS
 The percentage of pre-selected neighbors reaches
around 30%.
SIZE OF NEIGHBORS
COMBINATION COEFFICIENTS
 With λ fixed at 0, the optimal value of δ is approximately
between 0.5 and 0.7.
COMBINATION COEFFICIENTS
(CONT.)
 With δ fixed at 0.6, the optimal value of λ is approximately
between 0.2 and 0.4.
PERFORMANCE COMPARISON
 Wang et al., 2006, similarity fusion (SF2)
 Xue et al., 2005, cluster-based Pearson correlation coefficient (SCBPCC)
 Rennie and Srebro, 2005, maximum margin matrix factorization (MMMF)
 Ungar and Foster, 1999, cluster-based collaborative filtering (CBCF)
 Hofmann and Puzicha, 1999, aspect model (AM)
 Pennock et al., 2000, personality diagnosis (PD)
 Breese et al., 1998, user-based Pearson correlation coefficient (PCC)
CONCLUSIONS
 This paper presented a novel fusion framework
for collaborative filtering.
 The model-based and memory-based approaches are
naturally assembled via ONMTF.
 Empirical studies verified that the framework
effectively improves the prediction accuracy.
 Future work is to investigate new co-clustering
techniques and develop better fusion models.

 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

Collaborative filtering using orthogonal nonnegative matrix

  • 1. COLLABORATIVE FILTERING USING ORTHOGONAL NONNEGATIVE MATRIX Presenter: Meng-Lun Wu. Authors: Gang Chen, Fei Wang and Changshui Zhang. Source: Information Processing and Management (2009), pp. 368-379.
  • 2. OUTLINE Introduction; Related Work; Orthogonal nonnegative matrix tri-factorization; Framework; Experiments; Conclusion.
  • 3. INTRODUCTION Collaborative filtering predicts a test user’s rating for new items based on the ratings of similar users. Approaches fall into two categories: memory-based (similarity-based) approaches, which are either user-based or item-based, and model-based approaches, which establish a model from training examples.
  • 4. INTRODUCTION (CONT.) This paper applies orthogonal nonnegative matrix tri-factorization (ONMTF) to combine the two kinds of collaborative filtering. ONMTF is applied to simultaneously co-cluster the rows and columns, and to attain individual predictions for an unknown test rating. The approach has the following advantages: it alleviates the sparsity problem, it addresses the scalability problem, and it fuses the prediction results.
  • 5. RELATED WORK Researchers have proposed hybrid approaches that combine the memory-based and model-based approaches. Xue et al. (2005) address sparsity and scalability by clustering users and using the clusters to smooth ratings. However, Xue et al. consider only the user-based approach; this paper extends the idea to integrate model-based, user-based and item-based approaches.
  • 6. RELATED WORK (CONT.) Matrix decomposition can be used to solve the co-clustering problem. Ding et al. (2005) proposed co-clustering based on nonnegative matrix factorization (NMF). In 2006, they proposed ONMTF. Long et al. (2005) performed co-clustering by block value decomposition.
  • 7. ORTHOGONAL NONNEGATIVE MATRIX TRI-FACTORIZATION NMF was first brought into the machine learning and data mining fields by Lee et al. (2001). Ding et al. (2006) proved the equivalence between NMF and K-means, and extended NMF to ONMTF. The idea is to approximate the original matrix $X$ by the product $U S V^T$, and the optimization problem is
    $$\min_{U \ge 0,\; S \ge 0,\; V \ge 0} \left\| X - U S V^T \right\|^2 \quad \text{s.t.} \quad U^T U = I,\; V^T V = I,$$
    where $X \in \mathbb{R}_+^{p \times n}$, $U \in \mathbb{R}_+^{p \times k}$, $S \in \mathbb{R}_+^{k \times l}$ and $V \in \mathbb{R}_+^{n \times l}$.
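  • 8. ORTHOGONAL NONNEGATIVE MATRIX TRI-FACTORIZATION (CONT.) The optimization problem can be solved using the following multiplicative update rules:
    $$V_{jk} \leftarrow V_{jk} \sqrt{\frac{(X^T U S)_{jk}}{(V V^T X^T U S)_{jk}}}$$
    $$U_{ik} \leftarrow U_{ik} \sqrt{\frac{(X V S^T)_{ik}}{(U U^T X V S^T)_{ik}}}$$
    $$S_{ik} \leftarrow S_{ik} \sqrt{\frac{(U^T X V)_{ik}}{(U^T U S V^T V)_{ik}}}$$
    After co-clustering, the user-cluster centroids are given by $S V^T$ and the item-cluster centroids by $U S$.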
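The update rules on this slide can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: it assumes the square-root form of the multiplicative updates from Ding et al. (2006), a dense nonnegative matrix `X`, random initialization, and a fixed iteration count. The function name `onmtf`, the parameters `n_iter`, `eps` and `seed`, and the small `eps` added to the denominators are all illustrative assumptions.

```python
import numpy as np

def onmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative X (p x n) as U S V^T.

    U is p x k (row clusters), S is k x l, V is n x l (column clusters).
    Assumption: X is dense; handling of missing ratings is not modeled here.
    """
    rng = np.random.default_rng(seed)
    p, n = X.shape
    # Strictly positive random start so the multiplicative updates can move.
    U = rng.random((p, k)) + eps
    S = rng.random((k, l)) + eps
    V = rng.random((n, l)) + eps
    for _ in range(n_iter):
        # V <- V * sqrt( (X^T U S) / (V V^T X^T U S) )
        XtUS = X.T @ U @ S
        V *= np.sqrt(XtUS / (V @ (V.T @ XtUS) + eps))
        # U <- U * sqrt( (X V S^T) / (U U^T X V S^T) )
        XVSt = X @ V @ S.T
        U *= np.sqrt(XVSt / (U @ (U.T @ XVSt) + eps))
        # S <- S * sqrt( (U^T X V) / (U^T U S V^T V) )
        S *= np.sqrt((U.T @ X @ V) / (U.T @ U @ S @ V.T @ V + eps))
    return U, S, V
```

With the factors in hand, `S @ V.T` gives one row per user cluster (a centroid in item space) and `U @ S` gives one column per item cluster (a centroid in user space), matching the centroids named on the slide.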
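  • 9. FRAMEWORK Notations: the rating matrix can be written by rows (users) as $X = [u_1, \ldots, u_p]^T$ with $u_j = (x_{j1}, \ldots, x_{jn})^T$, $j \in \{1, \ldots, p\}$, or by columns (items) as $X = [i_1, \ldots, i_n]$ with $i_m = (x_{1m}, \ldots, x_{pm})^T$, $m \in \{1, \ldots, n\}$.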
  • 10. MEMORY-BASED APPROACHES User’s neighbor selection: compute the similarities between a user and all the user-cluster centroids $S V^T$, then select the top K user clusters as the user set $u_h$. The item’s neighbor selection is similar. The cosine similarity between the $j_1$th user and the $j_2$th user is
    $$sim(u_{j_1}, u_{j_2}) = \frac{\sum_{m=1}^{n} x_{j_1 m}\, x_{j_2 m}}{\sqrt{\sum_{m=1}^{n} x_{j_1 m}^2}\; \sqrt{\sum_{m=1}^{n} x_{j_2 m}^2}}$$
    Given a user-item pair $\langle u_j, i_m \rangle$, the user-based prediction is
    $$\tilde{x}^{u}_{jm} = \bar{u}_j + \frac{\sum_{u_h} sim(u_h, u_j)\,(x_{hm} - \bar{u}_h)}{\sum_{u_h} sim(u_h, u_j)}$$
    where $u_h$ ranges over the K most similar users of $u_j$ and $\bar{u}_j$ is the mean rating of user $u_j$.
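The user-based prediction on this slide can be illustrated with plain Python. This is a sketch under stated assumptions: a rating of 0 marks an unrated item, user means are taken over rated items only, and neighbors who have not rated the target item are skipped. None of these conventions are stated on the slide, and the helper names `cosine_sim`, `mean_rating` and `predict_user_based` are hypothetical.

```python
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity between two rating vectors of equal length."""
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def mean_rating(u):
    """Mean over rated items (assumption: 0 means 'not rated')."""
    rated = [x for x in u if x > 0]
    return sum(rated) / len(rated) if rated else 0.0

def predict_user_based(X, j, m, neighbors):
    """Predict user j's rating of item m from the given neighbor users.

    X is a list of per-user rating lists; neighbors holds the indices h
    of the K most similar users of user j.
    """
    num = den = 0.0
    for h in neighbors:
        if X[h][m] == 0:  # assumption: skip neighbors without a rating for m
            continue
        s = cosine_sim(X[h], X[j])
        num += s * (X[h][m] - mean_rating(X[h]))
        den += s
    base = mean_rating(X[j])
    # Fall back to the user's own mean when no neighbor contributes.
    return base + num / den if den else base
```

For example, with `X = [[5, 3, 0], [4, 2, 4], [5, 3, 5]]`, calling `predict_user_based(X, 0, 2, [1, 2])` adds the similarity-weighted deviation of the two neighbors to user 0's mean rating of 4.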
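  • 11. MEMORY-BASED APPROACHES The adjusted-cosine similarity between the $m_1$th and $m_2$th items, where $T$ is the set of users who rated both $m_1$ and $m_2$, is
    $$sim(i_{m_1}, i_{m_2}) = \frac{\sum_{t \in T} (x_{t m_1} - \bar{u}_t)(x_{t m_2} - \bar{u}_t)}{\sqrt{\sum_{t \in T} (x_{t m_1} - \bar{u}_t)^2}\; \sqrt{\sum_{t \in T} (x_{t m_2} - \bar{u}_t)^2}}$$
    Given a user-item pair $\langle u_j, i_m \rangle$, the item-based prediction is
    $$\tilde{x}^{i}_{jm} = \frac{\sum_{i_h} sim(i_m, i_h)\, x_{jh}}{\sum_{i_h} sim(i_m, i_h)}$$
    where $i_h$ ranges over the K most similar items of $i_m$. The final prediction is a linear combination of the three types of predictions:
    $$\tilde{x}_{jm} = \lambda\, \tilde{x}^{n}_{jm} + \delta (1 - \lambda)\, \tilde{x}^{u}_{jm} + (1 - \delta)(1 - \lambda)\, \tilde{x}^{i}_{jm}$$
    where $\tilde{x}^{n}_{jm}$ is the model-based (ONMTF) prediction, $\tilde{x}^{u}_{jm}$ the user-based prediction and $\tilde{x}^{i}_{jm}$ the item-based prediction.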
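The item-based side and the final fusion can be sketched as follows. This is an illustration only: it assumes a rating of 0 marks an unrated item, that per-user mean ratings are supplied by the caller, that the coefficient λ weights the model-based prediction while δ splits the remainder between the user-based and item-based predictions, and that the user's mean is returned when the similarity weights sum to zero. The names `adjusted_cosine`, `predict_item_based` and `fuse` are hypothetical.

```python
from math import sqrt

def adjusted_cosine(X, m1, m2, user_means):
    """Adjusted-cosine similarity between items m1 and m2.

    T is the set of users who rated both items (assumption: 0 = unrated).
    """
    T = [t for t in range(len(X)) if X[t][m1] > 0 and X[t][m2] > 0]
    num = sum((X[t][m1] - user_means[t]) * (X[t][m2] - user_means[t]) for t in T)
    d1 = sqrt(sum((X[t][m1] - user_means[t]) ** 2 for t in T))
    d2 = sqrt(sum((X[t][m2] - user_means[t]) ** 2 for t in T))
    return num / (d1 * d2) if d1 and d2 else 0.0

def predict_item_based(X, j, m, neighbors, user_means):
    """Predict user j's rating of item m from the given neighbor items."""
    num = den = 0.0
    for h in neighbors:
        if X[j][h] == 0:  # user j has not rated neighbor item h
            continue
        s = adjusted_cosine(X, m, h, user_means)
        num += s * X[j][h]
        den += s
    # Assumption: fall back to the user's mean when the weights cancel out.
    return num / den if den else user_means[j]

def fuse(x_model, x_user, x_item, lam, delta):
    """Linear combination of model-, user- and item-based predictions."""
    return (lam * x_model
            + delta * (1 - lam) * x_user
            + (1 - delta) * (1 - lam) * x_item)
```

Setting `lam=0` drops the model-based term entirely, so `fuse(3.0, 4.0, 5.0, lam=0.0, delta=0.6)` weights only the user-based and item-based predictions, which is the setting used when δ is tuned on its own.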
  • 12. ALGORITHM 1. The user-item matrix $X$ is factorized as $U S V^T$ using ONMTF. 2. Calculate the similarities between the test user/item and the user/item-cluster centroids. 3. Sort the similarities and select the C most similar user/item clusters as the neighbor candidate set of the test user/item. 4. Identify the K most similar neighbors of the test user/item by searching the user/item candidate set. 5. Predict the unknown ratings using the user-based and item-based approaches. 6. Linearly combine the three different predictions.
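Steps 2 to 4 of the algorithm (cluster pre-selection followed by a neighbor search inside the candidate set) can be sketched for the user side as below. This is a sketch under assumptions: each training user is given a single cluster label (for instance the index of the largest entry in that user's row of U, which the slides do not specify), similarities are plain cosine similarities, and the names `select_neighbors`, `labels` and `centroids` are hypothetical.

```python
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity between two rating vectors of equal length."""
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def select_neighbors(test_user, train_users, labels, centroids, C, K):
    """Return indices of the K most similar training users.

    Only users belonging to the C clusters whose centroids are most
    similar to the test user are searched.
    """
    # Steps 2-3: rank clusters by centroid similarity and keep the top C.
    ranked = sorted(range(len(centroids)),
                    key=lambda c: cosine_sim(test_user, centroids[c]),
                    reverse=True)
    top_clusters = set(ranked[:C])
    candidates = [h for h, c in enumerate(labels) if c in top_clusters]
    # Step 4: search only the candidate set for the K nearest users.
    candidates.sort(key=lambda h: cosine_sim(test_user, train_users[h]),
                    reverse=True)
    return candidates[:K]
```

Restricting the search to the candidate set is what keeps the neighbor search cheap: the test user is compared against a handful of centroids and then only against the members of the selected clusters, rather than against every training user.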
  • 13. EXPERIMENTS Dataset: MovieLens, 500 users and 1000 items (ratings on a 1-5 scale). Training set: the first 100, 200 and 300 users, called ML_100, ML_200 and ML_300. Testing set: the last 200 users. We randomly selected 5, 10 and 20 items rated by the test users, called Given5, Given10 and Given20. Evaluation metric: mean absolute error (MAE),
    $$MAE = \frac{\sum_{j,m} \left| x_{jm} - \tilde{x}_{jm} \right|}{N}$$
    where $N$ is the number of tested ratings.
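The evaluation metric is simple enough to state directly in code. The sketch below assumes the tested ratings and their predictions are given as two equally long flat lists; the function name `mean_absolute_error` is illustrative.

```python
def mean_absolute_error(actual, predicted):
    """MAE over N tested ratings: sum of |x - x_pred| divided by N."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For true ratings `[4, 3, 5]` and predictions `[3.5, 3, 4]` the absolute errors are 0.5, 0 and 1, so the MAE is 1.5 / 3 = 0.5. Lower values mean more accurate predictions.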
  • 14. DIFFERENT CLUSTER The ML_300 dataset is used for training, and 10 different values of k or l (2, 5, 10, 20, …, 80) are tried.
  • 15. PERCENTAGE OF NEIGHBORS The percentage of pre-selected neighbors reaches around 30%.
  • 17. COMBINATION COEFFICIENTS With λ fixed at 0, the optimal value of δ is approximately between 0.5 and 0.7.
  • 18. COMBINATION COEFFICIENTS (CONT.) With δ fixed at 0.6, the optimal value of λ is approximately between 0.2 and 0.4.
  • 19. PERFORMANCE COMPARISON Compared methods: Wang et al. (2006), similarity fusion (SF2); Xue et al. (2005), cluster-based Pearson correlation coefficient (SCBPCC); Rennie and Srebro (2005), maximum margin matrix factorization (MMMF); Ungar and Foster (1999), cluster-based collaborative filtering (CBCF); Hofmann and Puzicha (1999), aspect model (AM); Pennock et al. (2000), personality diagnosis (PD); Breese et al. (1998), user-based Pearson correlation coefficient (PCC).
  • 20. CONCLUSIONS This paper presented a novel fusion framework for collaborative filtering. The model-based and memory-based approaches are naturally combined via ONMTF. Empirical studies verified that the framework effectively improves prediction accuracy. Future work is to investigate new co-clustering techniques and develop better fusion models.