The document proposes a Riemannian gossip algorithm for decentralized matrix completion. Each agent holds its own data matrix and completes it while reaching consensus with the other agents on the common factor matrix U. The optimization problem is formulated on the Grassmann manifold as a weighted combination of completion and consensus terms. A parallel variant of the gossip algorithm is also developed; it keeps the same convergence guarantees and is (N − 1)/2 times faster for N agents. Numerical tests on synthetic and Netflix data are reported; on Netflix, the online gossip algorithm reaches test RMSEs between 0.877 and 0.900, against 0.873 for the batch RTRMC benchmark.
Riemannian gossip algorithms for decentralized matrix completion
1. Riemannian gossip algorithms for decentralized matrix completion
Hiroyuki Kasai†, Bamdev Mishra‡, and Atul Saroop‡
†The University of Electro-Communications, Japan
‡Amazon Development Centre India, India
IEICE meeting 2016
2. Motivation
The matrix completion problem
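$X^\star \approx U W^T$
$X^\star$ is an m × n matrix (m movies, n users) with known entries (*) and unknown entries (?), for example:
? ? * ?
* * ? *
? * * ?
* ? * ?
Low-rank prior: U is m × r and W is n × r, which gives (n + m − r)r degrees of freedom, with r ≪ (m, n).
U and W are the factor matrices.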
[Netflix Challenge, 2006]
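To make the low-rank prior concrete, the count (n + m − r)r can be compared with the number of entries of the full matrix. The snippet below is only an illustrative sketch; the sizes and the rank r = 5 are hypothetical values chosen for the example, not figures from the slides.

```python
def low_rank_degrees_of_freedom(m: int, n: int, r: int) -> int:
    """Number of free parameters of an m x n matrix of rank r: (n + m - r) * r."""
    return (n + m - r) * r

# Hypothetical sizes, chosen only for illustration.
m, n, r = 10_000, 100_000, 5
full_entries = m * n
dof = low_rank_degrees_of_freedom(m, n, r)
ratio = dof / full_entries  # fraction of the full matrix size that must be learned
```

With r ≪ (m, n), the ratio is tiny, which is why a small set of known entries can be enough to estimate the factors.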
Kasai, Mishra, and Saroop Riemannian gossip 20 September 2016 2 / 25
3. Motivation
Our interest is to look at the decentralized scenario
$[X_1^\star, X_2^\star] \approx U [W_1^T, W_2^T]$
The matrix (m movies) is split column-wise between two agents, which hold n1 users and n2 users, respectively.
Agent i has access to its own data matrix $X_i$.
The matrix U is common across all the agents.
4. Motivation
Contributions
We develop a nonlinear gossip algorithm with minimal
communication between agents.
The optimization formulation is based on a weighted combination of
matrix completion and consensus terms.
We develop a parallel variant of the proposed gossip algorithm.
5. Motivation
Paper and code are available online at www.bamdevmishra.com.
6. Motivation
Outline
Problem formulation on the Riemannian Grassmann manifold.
Proposed gossip algorithms.
Numerical comparisons on synthetic and Netflix data.
7. Problem formulation
Outline
Problem formulation on the Riemannian Grassmann manifold.
Proposed gossip algorithms.
Numerical comparisons on synthetic and Netflix data.
8. Problem formulation
Batch problem formulation
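$$\min_{U \in \mathrm{St}(r,m)} \; \min_{W \in \mathbb{R}^{n \times r}} \; \| P_\Omega(U W^T) - P_\Omega(X) \|_F^2.$$
$W \in \mathbb{R}^{n \times r}$ and $U \in \mathrm{St}(r, m)$, the set of m × r matrices with orthonormal columns.
$P_\Omega$ is the sampling operator, a convenient way to denote the known entries.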
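As a rough illustration of the sampling operator and the batch cost, the following NumPy sketch represents Ω by a boolean mask. The function names, the mask representation, and the toy sizes are assumptions made for this example, not part of the original slides.

```python
import numpy as np

def sampling_operator(M, mask):
    """P_Omega: keep the entries where mask is True, set the others to zero."""
    return np.where(mask, M, 0.0)

def completion_cost(U, W, X, mask):
    """|| P_Omega(U W^T) - P_Omega(X) ||_F^2 for a boolean mask of known entries."""
    residual = sampling_operator(U @ W.T, mask) - sampling_operator(X, mask)
    return float(np.sum(residual ** 2))

# Toy data (hypothetical sizes): an exactly rank-r matrix with about half the entries known.
rng = np.random.default_rng(0)
m, n, r = 8, 12, 2
U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # orthonormal columns: a point on St(r, m)
W = rng.standard_normal((n, r))
X = U @ W.T
mask = rng.random((m, n)) < 0.5

cost_at_truth = completion_cost(U, W, X, mask)       # zero at the true factors
cost_perturbed = completion_cost(U, W + 1.0, X, mask)
```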
9. Problem formulation
Eliminate W
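$$\min_{U \in \mathrm{St}(r,m)} \; \min_{W \in \mathbb{R}^{n \times r}} \; \| P_\Omega(U W^T) - P_\Omega(X) \|_F^2 \;\equiv\; \min_{U \in \mathrm{St}(r,m)} f(U, W_U),$$
a Grassmann optimization problem.
Solve the inner problem over W (shown in blue on the original slide) in closed form to obtain $W_U$.
The final optimization problem is on the Grassmann manifold, i.e., the variable is the 'column space' of U.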
[Boumal and Absil, LAA, 2015]
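One way to see why the inner problem has a closed form: for a fixed U, each row of W (one user) solves an independent least-squares problem over that user's known entries. The sketch below assumes a boolean mask for Ω and uses a generic least-squares solver; it is an illustration of the idea, not the authors' implementation.

```python
import numpy as np

def solve_W_given_U(U, X, mask):
    """Closed-form inner solution W_U: one small least-squares problem per column of X."""
    n, r = X.shape[1], U.shape[1]
    W = np.zeros((n, r))
    for j in range(n):
        rows = mask[:, j]                      # known entries of column j
        if rows.any():
            solution = np.linalg.lstsq(U[rows], X[rows, j], rcond=None)[0]
            W[j] = solution
    return W

# Toy check (hypothetical sizes): with exact low-rank data, W_U recovers the true W.
rng = np.random.default_rng(0)
m, n, r = 8, 10, 2
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
W_true = rng.standard_normal((n, r))
X = U @ W_true.T
i_idx, j_idx = np.indices((m, n))
mask = (i_idx + j_idx) % 4 != 0                # deterministic pattern: 6 known entries per column
W_U = solve_W_given_U(U, X, mask)
```

Because the columns decouple, this step is also what lets each agent compute its own factor independently in the decentralized setting.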
10. Problem formulation
Decentralized problem formulation
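$X = [X_1, X_2, \ldots, X_N]$.
$$\min_{U \in \mathrm{St}(r,m),\; W_i \in \mathbb{R}^{n_i \times r}} \; \frac{1}{2} \sum_i \| P_{\Omega_i}(U W_i^T) - P_{\Omega_i}(X_i) \|_F^2 \;=\; \min_{U \in \mathrm{St}(r,m)} \; \frac{1}{2} \sum_i \| P_{\Omega_i}(U W_{iU}^T) - P_{\Omega_i}(X_i) \|_F^2,$$
where $W_{iU}$ is computed by agent i independently.
Although the problem is distributed, we still need to learn a common U.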
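The column-wise split can be checked numerically: with a shared U, the batch cost equals the sum of the per-agent costs. The sketch below uses hypothetical block sizes and a boolean mask for each Ω_i; it only illustrates the decomposition.

```python
import numpy as np

def masked_half_sq_error(U, W, X, mask):
    """(1/2) || P_Omega(U W^T) - P_Omega(X) ||_F^2 for a boolean mask."""
    residual = np.where(mask, U @ W.T - X, 0.0)
    return 0.5 * float(np.sum(residual ** 2))

rng = np.random.default_rng(0)
m, r = 6, 2
n_per_agent = [4, 5, 3]                       # hypothetical n_1, n_2, n_3
n = sum(n_per_agent)
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
W = rng.standard_normal((n, r))
X = rng.standard_normal((m, n))
mask = rng.random((m, n)) < 0.6

# X = [X_1, X_2, X_3]: split the columns (users) between the agents.
cuts = np.cumsum(n_per_agent)[:-1]
X_blocks = np.split(X, cuts, axis=1)
mask_blocks = np.split(mask, cuts, axis=1)
W_blocks = np.split(W, cuts, axis=0)

batch_cost = masked_half_sq_error(U, W, X, mask)
agent_costs = [masked_half_sq_error(U, W_i, X_i, M_i)
               for W_i, X_i, M_i in zip(W_blocks, X_blocks, mask_blocks)]
```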
11. Problem formulation
We add a consensus term to our optimization formulation
Key idea: introduce multiple copies of U among the N agents, but allow them to reach consensus.
$$\min_{U_1, \ldots, U_N \in \mathrm{St}(r,m)} \; \underbrace{\frac{1}{2} \sum_i \| P_{\Omega_i}(U_i W_{iU_i}^T) - P_{\Omega_i}(X_i) \|_F^2}_{\text{completion task handled by agent } i} \; + \; \underbrace{\frac{\rho}{2} \left( d(U_1, U_2)^2 + d(U_2, U_3)^2 + \ldots + d(U_{N-1}, U_N)^2 \right)}_{\text{consensus among agents}}.$$
d is the Riemannian distance on the Grassmann manifold.
The weight ρ trades off completion against consensus; a larger ρ puts more weight on consensus.
Minimizing only the consensus term ⇒ $U_1 = U_2 = \ldots = U_{N-1} = U_N$.
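For intuition about the consensus term, the Riemannian distance on the Grassmann manifold can be computed from the principal angles between two column spaces. The sketch below assumes the standard geodesic distance (the 2-norm of the principal angles) and orthonormal inputs; the function names are illustrative.

```python
import numpy as np

def grassmann_distance(U1, U2):
    """Geodesic distance between span(U1) and span(U2): 2-norm of the principal angles."""
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.linalg.norm(angles))

def consensus_term(Us, rho):
    """(rho / 2) * sum of squared distances between consecutive agents' subspaces."""
    return 0.5 * rho * sum(grassmann_distance(Us[i], Us[i + 1]) ** 2
                           for i in range(len(Us) - 1))

E = np.eye(4)
U_a = E[:, :2]                                   # span{e1, e2}
c, s = np.cos(0.3), np.sin(0.3)
U_b = U_a @ np.array([[c, -s], [s, c]])          # same subspace, different basis
U_c = E[:, 2:]                                   # orthogonal subspace span{e3, e4}

d_same = grassmann_distance(U_a, U_b)
d_orth = grassmann_distance(U_a, U_c)
penalty = consensus_term([U_a, U_b, U_c], rho=2.0)
```

The distance depends only on the column spaces, so two agents holding different bases of the same subspace are already in consensus.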
12. Proposed Riemannian gossip algorithms
Outline
Problem formulation on the Riemannian Grassmann manifold.
Proposed gossip algorithms.
Numerical comparisons on synthetic and Netflix data.
13. Proposed Riemannian gossip algorithms
Riemannian online gossip on Grassmann
1 Agents i and i + 1 are neighbors for all i ≤ N − 1. (ordering of agents)
2 At each time slot, say t, we pick an agent i ≤ N − 1 randomly with uniform probability. (SGD updates)
Equivalently, we also pick agent i + 1 (the neighbor of agent i).
Agents i and i + 1 update $U_i$ and $U_{i+1}$, respectively, by taking a gradient descent step with stepsize $\gamma_t$ on the Grassmann manifold.
The stepsizes satisfy $\sum_t \gamma_t^2 < \infty$ and $\sum_t \gamma_t = +\infty$.
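The update of a chosen pair can be sketched as follows. This is a minimal illustration under several assumptions that are not stated on the slides: the Euclidean gradients of the completion terms are passed in as arrays, the consensus gradient uses the Grassmann logarithm map, a QR-based retraction replaces the exponential map, and the stepsize is γ_t = γ_0/(1 + t), which is one schedule satisfying the two summability conditions. All function names are hypothetical.

```python
import numpy as np

def grassmann_distance(U, Y):
    """Geodesic distance between span(U) and span(Y): 2-norm of the principal angles."""
    s = np.linalg.svd(U.T @ Y, compute_uv=False)
    return float(np.linalg.norm(np.arccos(np.clip(s, -1.0, 1.0))))

def grassmann_log(U, Y):
    """Tangent vector at span(U) pointing to span(Y); assumes U^T Y is invertible."""
    UtY = U.T @ Y
    M = (Y - U @ UtY) @ np.linalg.inv(UtY)
    A, s, Bt = np.linalg.svd(M, full_matrices=False)
    return (A * np.arctan(s)) @ Bt

def retract(U, xi):
    """QR-based retraction: map U + xi back to a matrix with orthonormal columns."""
    Q, _ = np.linalg.qr(U + xi)
    return Q

def stepsize(t, gamma0=0.1):
    """gamma_t = gamma0 / (1 + t): the squares are summable, the sum itself diverges."""
    return gamma0 / (1.0 + t)

def pick_pair(rng, num_agents):
    """Pick agent i in {1, ..., N-1} uniformly; its neighbor i + 1 is picked with it."""
    i = int(rng.integers(1, num_agents))
    return i, i + 1

def gossip_step(U_i, U_j, egrad_i, egrad_j, rho, gamma):
    """One simultaneous gradient step for neighbors i and j = i + 1.

    egrad_i, egrad_j are Euclidean gradients of each agent's completion term.
    The gradient of (rho/2) d(U_i, U_j)^2 with respect to U_i is -rho * Log_{U_i}(U_j).
    """
    def rgrad(U, egrad, other):
        tangent = egrad - U @ (U.T @ egrad)      # project onto the tangent space at U
        return tangent - rho * grassmann_log(U, other)
    new_i = retract(U_i, -gamma * rgrad(U_i, egrad_i, U_j))
    new_j = retract(U_j, -gamma * rgrad(U_j, egrad_j, U_i))
    return new_i, new_j

# Toy check with the completion gradients set to zero (consensus only).
a, b = 0.5, 1.0                                  # principal angles between the two subspaces
U1 = np.eye(4)[:, :2]
U2 = np.array([[np.cos(a), 0.0], [0.0, np.cos(b)], [np.sin(a), 0.0], [0.0, np.sin(b)]])
zero = np.zeros((4, 2))
d_before = grassmann_distance(U1, U2)
V1, V2 = gossip_step(U1, U2, zero, zero, rho=1.0, gamma=stepsize(0))
d_after = grassmann_distance(V1, V2)
```

In the toy check, both agents move toward each other along the geodesic joining their subspaces, so the distance between them shrinks after one step.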
14. Proposed Riemannian gossip algorithms
A graphical illustration
[Figure: Agent 1, Agent 2, Agent 3, ..., Agent N-1, Agent N in a chain, driven by a universal clock.]
Each pair is chosen with probability 1/(N − 1).
15. Proposed Riemannian gossip algorithms
Convergence of Riemannian online gossip
Asymptotic convergence follows from the standard SGD analysis on manifolds.
The proposed algorithm is readily implementable, e.g., with the toolbox Manopt.
[Bonnabel, IEEE TAC, 2013; Absil, Mahoney, and Sepulchre,
Princeton Press, 2008; Boumal et al., JMLR, 2014]
16. Proposed Riemannian gossip algorithms
Parallelizing Riemannian gossip with particular
sampling
[Figure: Agent 1, Agent 2, Agent 3, Agent 4, Agent 5 in a chain, driven by a universal clock, with neighboring pairs marked as "solid" or "dotted".]
"Solid" pairs are chosen at the same time.
"Dotted" pairs are chosen at the same time.
The "dotted" and "solid" groups are each chosen with probability 1/2.
Convergence guarantees remain the same.
(N − 1)/2 times faster.
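The particular sampling can be sketched by splitting the neighboring pairs into two groups of disjoint pairs, so that all pairs in a group can update simultaneously. Which group the slide draws as "solid" and which as "dotted" is an assumption here, as are the function names.

```python
import random

def pair_groups(num_agents):
    """Split the neighbor pairs (i, i + 1) into two groups of mutually disjoint pairs."""
    pairs = [(i, i + 1) for i in range(1, num_agents)]
    solid = pairs[0::2]    # (1, 2), (3, 4), ...  (assumed to be the "solid" group)
    dotted = pairs[1::2]   # (2, 3), (4, 5), ...  (assumed to be the "dotted" group)
    return solid, dotted

def pick_group(num_agents, rng):
    """Choose the solid or the dotted group, each with probability 1/2."""
    solid, dotted = pair_groups(num_agents)
    return solid if rng.random() < 0.5 else dotted

solid, dotted = pair_groups(5)
chosen = pick_group(5, random.Random(0))
```

No agent appears twice within a group, so the pair updates inside a group do not conflict and can run in parallel.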
17. Numerical comparisons
Outline
Problem formulation on the Riemannian Grassmann manifold.
Proposed gossip algorithms.
Numerical comparisons on synthetic and Netflix data.
18. Numerical comparisons
Effect of ρ
10 000 × 100 000 matrix with N = 6.
[Figure: mean square error on the training set (log scale, 10^-10 to 10^0) against every 10th update (0 to 40). Curves: Agent 1, Agent 2, and Distance 1-2, each for ρ = 10^3 and ρ = 10^10.]
A very large ρ only achieves consensus among agents.
A tuned ρ achieves both completion and consensus.
19. Numerical comparisons
Performance of online and parallel variants
10 000 × 100 000 matrix with N = 6 and ρ = 10^3.
[Figure: mean square error on the training set (log scale, 10^-10 to 10^0) against every 10th update (0 to 40). Curves: agent 1, agent 2, and distance 1-2, each for the online (Onl.) and parallel (Para.) variants.]
There is no loss of performance in parallelizing the updates.
20. Numerical comparisons
Comparison with D-LMaFit
500 × 12 000 matrix with N = 6.
[Figure: mean square error on the training set (log scale, 10^-12 to 10^2) against every 10th update (0 to 40). Curves: agent 1, agent 2, and distance 1-2, each for online gossip and D-LMaFit.]
The D-LMaFit code is not scalable to large data.
21. Numerical comparisons
Netflix data: different number of agents
Rank = 10
N = {2, 5, 10, 15, 20} agents
10 random 80/20 train/test splits (80 million / 20 million ratings)
Online gossip:
N          2      5      10     15     20
Test RMSE  0.877  0.885  0.891  0.894  0.900
Batch gradient descent algorithm RTRMC benchmark: 0.873.
[Boumal and Absil, LAA, 2015]
23. Numerical comparisons
Netflix data: test RMSE with updates
[Figure: test RMSE (0.85 to 1.1) against every 10th update (0 to 80). Curves: Agent 1, Agent 2, Agent 3, and RTRMC-1.]
24. Summary and future work
Summary and future work
We proposed a Riemannian gossip approach to the decentralized matrix completion problem.
We minimize a weighted sum of completion and consensus terms on the Grassmann manifold.
Numerical comparisons show good performance of the proposed algorithms, e.g., on the Netflix dataset.
As future work, we intend to explore asynchronous updating of agents on the Grassmann manifold.