Algorithmic Anti-Differentiation
A case study with min-cuts, spectral, and flow

David F. Gleich · Purdue University
Michael W. Mahoney · Berkeley ICSI
Code: www.cs.purdue.edu/homes/dgleich/codes/l1pagerank
Algorithmic Anti-differentiation
Understanding how and why heuristic procedures
•  Early stopping
•  Truncating small entries
•  etc.
are actually algorithms for implicit objectives.
ICML
David Gleich · Purdue
The ideal world
Given Problem P.
Derive solution characterization C.
Show algorithm A finds a solution where C holds.
Profit?

Given "min-cut."
Derive "max-flow is equivalent to min-cut."
Show push-relabel solves max-flow.
Profit!
(The ideal world)′
Given Problem P.
Derive approximate solution characterization C′.
Show algorithm A′ finds a solution where C′ holds.
Profit?

Given "sparsest-cut."
Derive a Rayleigh-quotient approximation.
Show the power method finds a good Rayleigh quotient.
Profit? (In academia!)
The real world
Given Task P.
Hack around until you find something useful.
Write paper presenting "novel heuristic" H for P.
Profit!

Given "find-communities."
Hack around … (hidden) …
Write paper on "three steps of the power method finds communities."
Profit!
(The ideal world)′′
Understand why H works!
Guess and check until you find a problem P′ that heuristic H solves.
Derive a characterization of heuristic H.

Given "find-communities."
Hack around …
Write paper on "three steps of the power method finds communities."
Profit!
The real world: Algorithmic Anti-differentiation
Given heuristic H, is there a problem P′ such that H is an algorithm for P′?
If your algorithm is related to optimization, this asks: given a procedure X, what objective does it optimize?
In the smooth, unconstrained case, this is just "anti-differentiation!"
Algorithmic Anti-differentiation in the literature
Mahoney & Orecchia (2011): three steps of the power method and p-norm regularization.
Dhillon et al. (2007): spectral clustering, trace minimization & kernel k-means.
Saunders (1995): LSQR & CRAIG iterative methods for Ax = b.
… many more …
Outline
1.  A new derivation of the PageRank vector for an undirected graph based on Laplacians, cuts, or flows.
2.  An understanding of the implicit regularization of the PageRank "push" method.
3.  The impact of this on a few applications.
The PageRank problem
The PageRank random surfer:
1.  With probability $\beta$, follow a random-walk step.
2.  With probability $(1-\beta)$, jump randomly according to the distribution v.
Goal: find the stationary distribution x.

With symmetric adjacency matrix A, diagonal degree matrix D, and jump vector v, the solution satisfies
$(I - \beta A D^{-1})\, x = (1 - \beta)\, v.$
Equivalently, via the combinatorial Laplacian $L = D - A$:
$[\alpha D + L]\, z = \alpha v$, where $\beta = 1/(1 + \alpha)$ and $x = Dz$.
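The two formulations above can be checked against each other numerically. A minimal sketch with numpy; the 4-node graph and the seeded jump vector are illustrative choices, not from the slides:

```python
import numpy as np

# Illustrative 4-node undirected graph (symmetric adjacency matrix A).
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
d = A.sum(axis=1)               # degrees
D = np.diag(d)
L = D - A                       # combinatorial Laplacian
beta = 0.85
v = np.array([1., 0., 0., 0.])  # jump vector, seeded at node 0
n = A.shape[0]

# PageRank linear system: (I - beta A D^{-1}) x = (1 - beta) v.
# A / d scales column j of A by 1/d_j, i.e. A D^{-1}.
x = np.linalg.solve(np.eye(n) - beta * A / d, (1 - beta) * v)

# Equivalent Laplacian form: [alpha D + L] z = alpha v,
# with beta = 1/(1 + alpha) and x = D z.
alpha = (1 - beta) / beta
z = np.linalg.solve(alpha * D + L, alpha * v)

print(np.allclose(x, D @ z))    # the two formulations agree
```

Multiplying the Laplacian system by $\beta$ and using $\beta(1+\alpha) = 1$ recovers $(D - \beta A)z = (1-\beta)v$, which is the PageRank system applied to $x = Dz$.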
The Push Algorithm for PageRank
Proposed (in closest form) in Andersen, Chung, Lang (also by McSherry, Jeh & Widom) for personalized PageRank.
Strongly related to Gauss–Seidel and coordinate descent.
Derived to quickly approximate PageRank with sparsity.

The Push Method (parameters $\tau, \rho$):
1.  $x^{(1)} = 0$, $r^{(1)} = (1-\beta)e_i$, $k = 1$
2.  while any $r_j > \tau d_j$ ($d_j$ is the degree of node $j$):
3.    $x^{(k+1)} = x^{(k)} + (r_j - \tau d_j \rho)\, e_j$
4.    $r_i^{(k+1)} = \tau d_j \rho$ if $i = j$;  $r_i^{(k)} + \beta (r_j - \tau d_j \rho)/d_j$ if $i \sim j$;  $r_i^{(k)}$ otherwise
5.    $k \leftarrow k + 1$
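The loop above transcribes directly into code. A sketch in the slide's notation; the graph, β = 0.85, and τ = 1e-4 are illustrative, and the β factor on the residual spread follows the standard ACL update:

```python
import numpy as np

def push_pagerank(A, beta, seed, tau, rho=1.0):
    """ACL-style push: maintain solution x and residual r, repeatedly
    pushing residual mass above the threshold tau * degree into x."""
    n = A.shape[0]
    d = A.sum(axis=1)
    x, r = np.zeros(n), np.zeros(n)
    r[seed] = 1 - beta
    while True:
        above = np.flatnonzero(r > tau * d)   # step 2: any r_j > tau d_j?
        if above.size == 0:
            break
        j = above[0]
        m = r[j] - tau * d[j] * rho           # mass moved by this push
        x[j] += m                             # step 3
        r[j] = tau * d[j] * rho               # step 4, i = j
        r[A[j] > 0] += beta * m / d[j]        # step 4, neighbors i ~ j
    return x

A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
x = push_pagerank(A, beta=0.85, seed=0, tau=1e-4)
```

At termination every residual satisfies $r_j \le \tau d_j$, which bounds the distance to the exact PageRank solution; x touches only nodes the residual actually reached, which is why the method stays local.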
The push method stays local
Why do we care about push?
1.  Used for empirical studies of "communities" and an ingredient in an empirically successful community finder (Whang et al., CIKM 2013).
2.  Used for "fast PageRank" approximation.
3.  It produces sparse approximations to PageRank!

Example: Newman's netscience graph, 379 vertices, 1828 nonzeros; v has a single one, and the solution is "zero" on most of the nodes.
The s-t min-cut problem
With unweighted incidence matrix B and diagonal cost matrix C:
minimize $\|Bx\|_{C,1} = \sum_{ij \in E} C_{i,j} |x_i - x_j|$
subject to $x_s = 1$, $x_t = 0$, $x \ge 0$.
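Because the objective is a weighted 1-norm, this is a linear program after introducing one auxiliary variable per edge for $|x_i - x_j|$. A sketch with scipy; the 4-node graph and capacities are made up for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Tiny example: nodes 0..3, s = 0, t = 3, undirected capacitated edges.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
cost  = [2., 1., 1., 1., 2.]           # C_{i,j}: cut capacities
n, m = 4, len(edges)
s, t = 0, 3

# Variables: [x_0..x_{n-1}, y_0..y_{m-1}]; minimize sum_e C_e * y_e,
# where y_e >= |x_i - x_j| is enforced by two inequalities per edge.
c = np.concatenate([np.zeros(n), np.array(cost)])
A_ub, b_ub = [], []
for e, (i, j) in enumerate(edges):
    row = np.zeros(n + m)
    row[i], row[j], row[n + e] = 1, -1, -1
    A_ub.append(row.copy()); b_ub.append(0.)   #  x_i - x_j - y_e <= 0
    row[i], row[j] = -1, 1
    A_ub.append(row); b_ub.append(0.)          # -x_i + x_j - y_e <= 0
A_eq = np.zeros((2, n + m))
A_eq[0, s], A_eq[1, t] = 1, 1
b_eq = [1., 0.]                                # x_s = 1, x_t = 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (n + m))    # x >= 0, y >= 0
print(res.fun)                                 # min-cut value
```

For this graph the LP optimum is 3, matching the max-flow value, as max-flow/min-cut duality guarantees.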
The localized cut graph
Related to a construction used in "FlowImprove," Andersen & Lang (2007), and Orecchia & Zhu (2014).

$A_S = \begin{bmatrix} 0 & \alpha d_S^T & 0 \\ \alpha d_S & A & \alpha d_{\bar S} \\ 0 & \alpha d_{\bar S}^T & 0 \end{bmatrix}$

Connect s to vertices in S with weight α · degree.
Connect t to vertices in $\bar S$ with weight α · degree.
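A sketch of the $A_S$ construction; the node ordering [s, 1, …, n, t] and the helper name are my choices:

```python
import numpy as np

def localized_cut_graph(A, S, alpha):
    """Build A_S: attach a source s to the set S and a sink t to the
    complement, each with weight alpha * degree; original graph in the middle."""
    n = A.shape[0]
    d = A.sum(axis=1)
    in_S = np.zeros(n, dtype=bool)
    in_S[list(S)] = True
    AS = np.zeros((n + 2, n + 2))
    AS[1:-1, 1:-1] = A
    AS[0, 1:-1] = alpha * d * in_S      # s -- S edges, weight alpha * degree
    AS[1:-1, 0] = AS[0, 1:-1]
    AS[-1, 1:-1] = alpha * d * ~in_S    # t -- S-bar edges, weight alpha * degree
    AS[1:-1, -1] = AS[-1, 1:-1]
    return AS

A = np.array([[0., 1.], [1., 0.]])      # illustrative 2-node graph
AS = localized_cut_graph(A, S=[0], alpha=0.5)
```

The parameter α trades off the cut inside the graph against the attachment to s and t, which is what ties this construction to PageRank on the next slides.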
The localized cut graph & PageRank
Solve the s-t min-cut:
minimize $\|B_S x\|_{C(\alpha),1}$
subject to $x_s = 1$, $x_t = 0$, $x \ge 0$.
The localized cut graph & PageRank
Solve the "spectral" s-t min-cut:
minimize $\|B_S x\|_{C(\alpha),2}$
subject to $x_s = 1$, $x_t = 0$, $x \ge 0$.

The PageRank vector z that solves $(\alpha D + L)z = \alpha v$ with $v = d_S/\mathrm{vol}(S)$ is a renormalized solution of the electrical cut computation:
minimize $\|B_S x\|_{C(\alpha),2}$ subject to $x_s = 1$, $x_t = 0$.
Specifically, if x is the solution, then
$x = \begin{bmatrix} 1 \\ \mathrm{vol}(S)\, z \\ 0 \end{bmatrix}.$
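This equivalence can be verified numerically: the harmonic extension on the localized cut graph (the "electrical" 2-norm cut) matches the PageRank solution after rescaling by vol(S). A sketch; the graph, S, and α are illustrative:

```python
import numpy as np

A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
d = A.sum(axis=1)
D, L = np.diag(d), np.diag(d) - A
n = A.shape[0]
S, alpha = [0, 1], 0.2
in_S = np.zeros(n, dtype=bool); in_S[S] = True
vol_S = d[in_S].sum()

# PageRank: (alpha D + L) z = alpha v with v = d_S / vol(S).
v = np.where(in_S, d, 0.0) / vol_S
z = np.linalg.solve(alpha * D + L, alpha * v)

# Electrical cut on the localized cut graph with x_s = 1, x_t = 0:
# each interior node i has degree (1 + alpha) d_i in A_S, and the source
# contributes alpha * d_i for i in S, giving ((1+alpha)D - A) x_int = alpha d_S.
x_int = np.linalg.solve((1 + alpha) * D - A, alpha * np.where(in_S, d, 0.0))

print(np.allclose(x_int, vol_S * z))   # x = [1; vol(S) z; 0]
```

The check is exact because $(\alpha D + L) = (1+\alpha)D - A$, so the interior harmonic equations are the PageRank equations scaled by vol(S).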
Back to the push method
Let x be the output from the push method with $0 < \beta < 1$, $v = d_S/\mathrm{vol}(S)$, $\rho = 1$, and $\tau > 0$.
Set $\alpha = (1-\beta)/\beta$, $\kappa = \tau\,\mathrm{vol}(S)/\beta$, and let $z_G$ solve:
minimize $\tfrac{1}{2}\|B_S z\|^2_{C(\alpha),2} + \kappa \|Dz\|_1$
subject to $z_s = 1$, $z_t = 0$, $z \ge 0$,
where $z = \begin{bmatrix} 1 \\ z_G \\ 0 \end{bmatrix}$. Then $x = D z_G / \mathrm{vol}(S)$.

Proof: write out the KKT conditions and show that the push method solves them. Slackness was "tricky."
The 1-norm term is regularization for sparsity; the degree weighting explains the need for normalization.
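Since the objective is an ℓ1-regularized convex quadratic, it can also be minimized directly with projected proximal gradient (ISTA), which makes the implicit sparsity explicit. This is a sketch for intuition, not the paper's method; the graph, S, β, and τ are illustrative:

```python
import numpy as np

A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
d = A.sum(axis=1); D = np.diag(d); L = D - A
S = [0]; in_S = np.zeros(4, dtype=bool); in_S[S] = True
vol_S = d[in_S].sum()
beta, tau = 0.85, 1e-2
alpha = (1 - beta) / beta
kappa = tau * vol_S / beta
dS = np.where(in_S, d, 0.0)

# In the interior variables w (= z_G), the objective is
#   1/2 w^T ((1+alpha)D - A) w - alpha dS^T w + kappa d^T w,  w >= 0,
# using ||D w||_1 = d^T w for w >= 0.
Q = (1 + alpha) * D - A
eta = 1.0 / ((2 + alpha) * d.max())    # step size below 1/lambda_max(Q)
w = np.zeros(4)
for _ in range(20000):
    w = np.maximum(0.0, w - eta * (Q @ w - alpha * dS) - eta * kappa * d)

# Fixed-point check: w is (numerically) a minimizer.
w_next = np.maximum(0.0, w - eta * (Q @ w - alpha * dS) - eta * kappa * d)
print(np.abs(w - w_next).max() < 1e-10)
```

By the theorem, push with these parameters would return $D z_G / \mathrm{vol}(S)$ for this minimizer; here we only verify the fixed-point condition of the regularized problem.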
A simple example
The vectors x_pr, z, and x(α, S) are the PageRank vectors from Theorem 1, where x(α, S) solves Prob. (4) and the others are from the problems at the end of Section 2. The vector x_cut solves the cut Prob. (2), and z_G solves Prob. (6).

Deg.  x_pr     z        x(α, S)  x_cut  z_G
2     0.0788   0.0394   0.8276   1      0.2758
4     0.1475   0.0369   0.7742   1      0.2437
7     0.2362   0.0337   0.7086   1      0.2138
4     0.1435   0.0359   0.7533   1      0.2325
4     0.1297   0.0324   0.6812   1      0.1977
7     0.1186   0.0169   0.3557   0      0
3     0.0385   0.0128   0.2693   0      0
2     0.0167   0.0083   0.1749   0      0
4     0.0487   0.0122   0.2554   0      0
3     0.0419   0.0140   0.2933   0      0

Prob. (6) (an ℓ1-regularized ℓ2 regression problem) has 24 non-zeros. The true "min-cut" set is large in both the 2-norm PageRank problem and the regularized problem. Thus, we identify the underlying graph feature correctly, but the implicitly regularized ACL procedure does so with many fewer non-zeros than the vanilla PageRank procedure.
[Figure 2: examples of the different cut vectors on a portion of the netscience graph (panels with 16, 15, 284, and 24 nonzeros). In the left subfigure, the set S is highlighted with its vertices enlarged. The other subfigures show the solution vectors from the various cut problems (from left to right, Probs. (2), (4), and (6), solved with min-cut, PageRank, and ACL) for this set S. Each vector determines the color and size of a vertex; high values are large and dark. White vertices with outlines are numerically non-zero (which is why most of the vertices in the fourth figure are outlined, in contrast to the third figure). The true min-cut set is large in all vectors, but the implicitly regularized problem achieves this with many fewer non-zeros than the vanilla PageRank problem.]

Push's sparsity helps it identify the "right" graph feature with fewer non-zeros.
Panels: the set S · the min-cut solution · the push solution · the PageRank solution.

References
Andersen, Reid and Lang, Kevin. An algorithm for improving graph partitions. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 651–660, 2008.
Andersen, Reid, Chung, Fan, and Lang, Kevin. Local graph partitioning using PageRank vectors. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science.
It's easy to make this apply broadly
It is easy to cook up interesting diffusion-like problems and adapt them to this framework. In particular, Zhou et al. (2004) gave a semi-supervised learning diffusion we are currently studying, based on the augmented graph

$\begin{bmatrix} 0 & e_S^T & 0 \\ e_S & \theta A & e_{\bar S} \\ 0 & e_{\bar S}^T & 0 \end{bmatrix}.$

minimize $\tfrac{1}{2}\|B_S \hat{x}\|_2^2 + \kappa \|\hat{x}\|_1$
subject to $\hat{x}_s = 1$, $\hat{x}_t = 0$, $\hat{x} \ge 0$

minimize $\tfrac{1}{2} x^T (I + \theta L) x - x^T e_S + \kappa \|x\|_1$
subject to $x \ge 0$
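The two problems above differ only by a constant: expanding $\tfrac{1}{2}\|B_S \hat{x}\|_2^2$ with $\hat{x} = [1; x; 0]$ on the augmented graph yields $\tfrac{1}{2}x^T(I+\theta L)x - x^T e_S$ plus $|S|/2$, and the 1-norm picks up an extra $\kappa$ from the fixed entry $\hat{x}_s = 1$. A quick numerical identity check (random $x \ge 0$; graph, θ, and κ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
n = A.shape[0]
L = np.diag(A.sum(axis=1)) - A
theta, kappa = 0.5, 0.1
S = [0, 1]
eS = np.zeros(n); eS[S] = 1.0
x = rng.random(n)                        # any x >= 0

# Left side: 1/2 ||B_S xhat||_2^2 + kappa ||xhat||_1 with xhat = [1; x; 0],
# edge weights (1 for s/t attachments, theta internally) folded into the sum.
cut = (0.5 * (eS * (1 - x) ** 2).sum()            # s -- S edges
       + 0.5 * theta * (x @ L @ x)                # internal edges
       + 0.5 * ((1 - eS) * x ** 2).sum())         # t -- S-bar edges
lhs = cut + kappa * (1 + np.abs(x).sum())         # xhat_s = 1 contributes kappa

# Right side: 1/2 x^T (I + theta L) x - x^T eS + kappa ||x||_1, plus constants.
rhs = (0.5 * x @ (np.eye(n) + theta * L) @ x - x @ eS
       + kappa * np.abs(x).sum() + 0.5 * len(S) + kappa)

print(np.isclose(lhs, rhs))
```

Since the two objectives differ by a constant over the same feasible set, they have the same minimizers.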
Recap & Conclusions
Key point: we don't solve the 1-norm regularized problem with a 1-norm solver, but with the efficient push method. Run push, and you get 1-norm regularization with early stopping.

Open issues:
•  Better treatment of directed graphs?
•  Algorithms for ρ < 1? (ρ is set to ½ in most "uses"; this needs new analysis.)
•  Improvements to semi-supervised learning on graphs (coming soon).
1.  "Defined" algorithmic anti-differentiation to understand why heuristics work.
2.  Found equivalences with PageRank and cut/flow.
3.  Push & 1-norm regularization.

Supported by NSF CAREER 1149756-CCF · www.cs.purdue.edu/homes/dgleich
PageRank → s-t min-cut
That equivalence works if s is degree-weighted. What if s is the uniform vector?

$A(s) = \begin{bmatrix} 0 & \alpha s^T & 0 \\ \alpha s & A & \alpha(d - s) \\ 0 & \alpha(d - s)^T & 0 \end{bmatrix}.$
MMDS 2014

Contenu connexe

Tendances

Lesson 26: Integration by Substitution (handout)
Lesson 26: Integration by Substitution (handout)Lesson 26: Integration by Substitution (handout)
Lesson 26: Integration by Substitution (handout)
Matthew Leingang
 

Tendances (20)

Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chains
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysis
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structures
 
Engineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network AnalysisEngineering Data Science Objectives for Social Network Analysis
Engineering Data Science Objectives for Social Network Analysis
 
Personalized PageRank based community detection
Personalized PageRank based community detectionPersonalized PageRank based community detection
Personalized PageRank based community detection
 
Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networks
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based Learning
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...
 
QMC Error SAMSI Tutorial Aug 2017
QMC Error SAMSI Tutorial Aug 2017QMC Error SAMSI Tutorial Aug 2017
QMC Error SAMSI Tutorial Aug 2017
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networks
 
Tensor Train decomposition in machine learning
Tensor Train decomposition in machine learningTensor Train decomposition in machine learning
Tensor Train decomposition in machine learning
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
NUMERICAL METHODS -Iterative methods(indirect method)
NUMERICAL METHODS -Iterative methods(indirect method)NUMERICAL METHODS -Iterative methods(indirect method)
NUMERICAL METHODS -Iterative methods(indirect method)
 
Reduction of the small gain condition
Reduction of the small gain conditionReduction of the small gain condition
Reduction of the small gain condition
 
system of algebraic equation by Iteration method
system of algebraic equation by Iteration methodsystem of algebraic equation by Iteration method
system of algebraic equation by Iteration method
 
Animashree Anandkumar, Electrical Engineering and CS Dept, UC Irvine at MLcon...
Animashree Anandkumar, Electrical Engineering and CS Dept, UC Irvine at MLcon...Animashree Anandkumar, Electrical Engineering and CS Dept, UC Irvine at MLcon...
Animashree Anandkumar, Electrical Engineering and CS Dept, UC Irvine at MLcon...
 
Tensorizing Neural Network
Tensorizing Neural NetworkTensorizing Neural Network
Tensorizing Neural Network
 
Lesson 26: Integration by Substitution (handout)
Lesson 26: Integration by Substitution (handout)Lesson 26: Integration by Substitution (handout)
Lesson 26: Integration by Substitution (handout)
 
Hierarchical matrix techniques for maximum likelihood covariance estimation
Hierarchical matrix techniques for maximum likelihood covariance estimationHierarchical matrix techniques for maximum likelihood covariance estimation
Hierarchical matrix techniques for maximum likelihood covariance estimation
 

En vedette

Iterative methods for network alignment
Iterative methods for network alignmentIterative methods for network alignment
Iterative methods for network alignment
David Gleich
 
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportation
David Gleich
 
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
David Gleich
 

En vedette (19)

A multithreaded method for network alignment
A multithreaded method for network alignmentA multithreaded method for network alignment
A multithreaded method for network alignment
 
The power and Arnoldi methods in an algebra of circulants
The power and Arnoldi methods in an algebra of circulantsThe power and Arnoldi methods in an algebra of circulants
The power and Arnoldi methods in an algebra of circulants
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applications
 
A history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspectiveA history of PageRank from the numerical computing perspective
A history of PageRank from the numerical computing perspective
 
Iterative methods for network alignment
Iterative methods for network alignmentIterative methods for network alignment
Iterative methods for network alignment
 
Tall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduceTall and Skinny QRs in MapReduce
Tall and Skinny QRs in MapReduce
 
Direct tall-and-skinny QR factorizations in MapReduce architectures
Direct tall-and-skinny QR factorizations in MapReduce architecturesDirect tall-and-skinny QR factorizations in MapReduce architectures
Direct tall-and-skinny QR factorizations in MapReduce architectures
 
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
What you can do with a tall-and-skinny QR factorization in Hadoop: Principal ...
 
Tall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architecturesTall-and-skinny QR factorizations in MapReduce architectures
Tall-and-skinny QR factorizations in MapReduce architectures
 
How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...How does Google Google: A journey into the wondrous mathematics behind your f...
How does Google Google: A journey into the wondrous mathematics behind your f...
 
A dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportationA dynamical system for PageRank with time-dependent teleportation
A dynamical system for PageRank with time-dependent teleportation
 
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
Vertex neighborhoods, low conductance cuts, and good seeds for local communit...
 
MapReduce for scientific simulation analysis
MapReduce for scientific simulation analysisMapReduce for scientific simulation analysis
MapReduce for scientific simulation analysis
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQL
 
Sparse matrix computations in MapReduce
Sparse matrix computations in MapReduceSparse matrix computations in MapReduce
Sparse matrix computations in MapReduce
 
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph AlgorithmsMassive MapReduce Matrix Computations & Multicore Graph Algorithms
Massive MapReduce Matrix Computations & Multicore Graph Algorithms
 
Overlapping clusters for distributed computation
Overlapping clusters for distributed computationOverlapping clusters for distributed computation
Overlapping clusters for distributed computation
 
Graph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimcGraph libraries in Matlab: MatlabBGL and gaimc
Graph libraries in Matlab: MatlabBGL and gaimc
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and more
 

Similaire à Anti-differentiating approximation algorithms: A case study with min-cuts, spectral, and flow

slides_nuclear_norm_regularization_david_mateos
slides_nuclear_norm_regularization_david_mateosslides_nuclear_norm_regularization_david_mateos
slides_nuclear_norm_regularization_david_mateos
David Mateos
 
Algorithm review
Algorithm reviewAlgorithm review
Algorithm review
chidabdu
 
A non-stiff numerical method for 3D interfacial flow of inviscid fluids.
A non-stiff numerical method for 3D interfacial flow of inviscid fluids.A non-stiff numerical method for 3D interfacial flow of inviscid fluids.
A non-stiff numerical method for 3D interfacial flow of inviscid fluids.
Alex (Oleksiy) Varfolomiyev
 

Similaire à Anti-differentiating approximation algorithms: A case study with min-cuts, spectral, and flow (20)

Stochastic Gradient Descent with Exponential Convergence Rates of Expected Cl...
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Cl...Stochastic Gradient Descent with Exponential Convergence Rates of Expected Cl...
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Cl...
 
Regression.pptx
Regression.pptxRegression.pptx
Regression.pptx
 
Regression.pptx
Regression.pptxRegression.pptx
Regression.pptx
 
Automatic bayesian cubature
Automatic bayesian cubatureAutomatic bayesian cubature
Automatic bayesian cubature
 
OI.ppt
OI.pptOI.ppt
OI.ppt
 
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
 
Presentation.pdf
Presentation.pdfPresentation.pdf
Presentation.pdf
 
slides_nuclear_norm_regularization_david_mateos
slides_nuclear_norm_regularization_david_mateosslides_nuclear_norm_regularization_david_mateos
slides_nuclear_norm_regularization_david_mateos
 
DAA - UNIT 4 - Engineering.pptx
DAA - UNIT 4 - Engineering.pptxDAA - UNIT 4 - Engineering.pptx
DAA - UNIT 4 - Engineering.pptx
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3
 
MLHEP 2015: Introductory Lecture #4
MLHEP 2015: Introductory Lecture #4MLHEP 2015: Introductory Lecture #4
MLHEP 2015: Introductory Lecture #4
 
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsMethods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
 
MVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priorsMVPA with SpaceNet: sparse structured priors
MVPA with SpaceNet: sparse structured priors
 
Algorithm review
Algorithm reviewAlgorithm review
Algorithm review
 
A Regularized Simplex Method
A Regularized Simplex MethodA Regularized Simplex Method
A Regularized Simplex Method
 
On Continuous Approximate Solution of Ordinary Differential Equations
On Continuous Approximate Solution of Ordinary Differential EquationsOn Continuous Approximate Solution of Ordinary Differential Equations
On Continuous Approximate Solution of Ordinary Differential Equations
 
Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...
 
A Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment ProblemA Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
 
A non-stiff numerical method for 3D interfacial flow of inviscid fluids.
A non-stiff numerical method for 3D interfacial flow of inviscid fluids.A non-stiff numerical method for 3D interfacial flow of inviscid fluids.
A non-stiff numerical method for 3D interfacial flow of inviscid fluids.
 
Nonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares MethodNonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares Method
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 

Anti-differentiating approximation algorithms: A case study with min-cuts, spectral, and flow

  • 1. Algorithmic ! Anti-Differentiation! A case study with ! min-cuts, spectral, and flow ! ! David F. Gleich · Purdue University! Michael W. Mahoney · Berkeley ICSI! Code "www.cs.purdue.edu/homes/dgleich/codes/l1pagerank! 1
  • 2. Algorithmic Anti-differentiation! Understanding how and why heuristic procedures •  Early stopping •  Truncating small entries •  etc are actually algorithms for implicit objectives. 2 ICML David Gleich · Purdue
  • 3. The ideal world Given Problem P Derive solution characterization C Show algorithm A " finds a solution where C holds Profit?! Given “min-cut” Derive “max-flow is equivalent to min-cut” Show push-relabel solves max-flow " Profit!! ICML David Gleich · Purdue 3
  • 4. (The ideal world)’ Given Problem P Derive solution approx. characterization C Show algorithm A’ " finds a solution where C’ holds Profit?! Given “sparsest-cut” Derive Rayleigh- quotient approximation Show power method finds good Rayleigh quotient Profit? ! ICML David Gleich · Purdue 4 (In academia!)!
  • 5. The real world Given Task P Hack around until you find something useful Write paper presenting “novel heuristic” H for P and Profit!! Given “find-communities” Hack around ! … hidden ..! Write paper on “three steps of power method finds communities” Profit!! ICML David Gleich · Purdue 5
  • 6. (The ideal world)’’ Understand why H works! Show heuristic H solves P’ Guess and check! until you find something H solves Derive characterization of heuristic H Given “find-communities” Hack around ! ! Write paper on “three steps of power method finds communities” Profit!! ICML David Gleich · Purdue 6
  • 7. If your algorithm is related to optimization, this is: Given a procedure X, " what objective does it optimize? The real world Algorithmic Anti-differentiation! Given heuristic H, is there a problem P’ such that H is an algorithm for P’ ? In the smooth, unconstrained case, this is just “anti- differentiation!” ICML David Gleich · Purdue 7
  • 8. Algorithmic Anti-differentiation in the literature. Mahoney & Orecchia (2011): three steps of the power method and p-norm regularization. Dhillon et al. (2007): spectral clustering, trace minimization & kernel k-means. Saunders (1995): LSQR & CRAIG iterative methods for Ax = b. … many more …
  • 9. Outline. 1. A new derivation of the PageRank vector for an undirected graph based on Laplacians, cuts, or flows. 2. An understanding of the implicit regularization of the PageRank "push" method. 3. The impact of this on a few applications.
  • 10. The PageRank problem. The PageRank random surfer: 1. with probability β, follow a random-walk step; 2. with probability (1 − β), jump randomly according to the distribution v. Goal: find the stationary distribution x. With symmetric adjacency matrix A, diagonal degree matrix D, and jump vector v, the solution satisfies

    (I − βAD⁻¹)x = (1 − β)v.

  This is equivalent to

    [αD + L]z = αv,  where β = 1/(1 + α) and x = Dz,

  and L = D − A is the combinatorial Laplacian.
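The equivalence between the two formulations can be checked numerically. This is a sketch, not from the slides: the 4-cycle graph and the value of β are made-up illustration choices.

```python
import numpy as np

# Toy undirected graph: a 4-cycle (assumed example).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
D = np.diag(d)
L = D - A                       # combinatorial Laplacian
beta = 0.85
v = np.ones(4) / 4              # uniform jump vector

# Standard form: (I - beta * A * D^{-1}) x = (1 - beta) v
x = np.linalg.solve(np.eye(4) - beta * A @ np.linalg.inv(D),
                    (1 - beta) * v)

# Laplacian form: [alpha * D + L] z = alpha * v, with x = D z
alpha = 1 / beta - 1            # so that beta = 1 / (1 + alpha)
z = np.linalg.solve(alpha * D + L, alpha * v)

print(np.allclose(x, D @ z))    # True: the two formulations agree
```

The check works because αD + L = (1/β)D − A, so multiplying the Laplacian system by β and substituting z = D⁻¹x recovers the standard form.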
  • 11. The Push Algorithm for PageRank, with parameters τ, ρ. Proposed (in closest form) in Andersen, Chung, Lang (also by McSherry, Jeh & Widom) for personalized PageRank. Strongly related to Gauss-Seidel and coordinate descent. Derived to quickly approximate PageRank with sparsity.

  The push method:
  1. x⁽¹⁾ = 0, r⁽¹⁾ = (1 − β)eᵢ, k = 1
  2. while any rⱼ > τdⱼ (dⱼ is the degree of node j):
  3.   x⁽ᵏ⁺¹⁾ = x⁽ᵏ⁾ + (rⱼ − τdⱼρ)eⱼ
  4.   rᵢ⁽ᵏ⁺¹⁾ = τdⱼρ if i = j;  rᵢ⁽ᵏ⁾ + β(rⱼ − τdⱼρ)/dⱼ if i ∼ j;  rᵢ⁽ᵏ⁾ otherwise
  5.   k ← k + 1
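The pseudocode above can be sketched directly in Python. This is a minimal dense implementation for illustration (the real point of push is that a sparse implementation never touches most of the graph); the function name and the queue-free "pick the first violating node" strategy are my choices, not from the slides.

```python
import numpy as np

def push_pagerank(A, beta, seed, tau, rho=1.0):
    """ACL-style push for seeded PageRank (a sketch of the slide's pseudocode).

    A: symmetric 0/1 adjacency matrix; seed: index of the seed node.
    Maintains the invariant r = (1-beta)*e_seed - (I - beta*A*D^{-1}) x,
    so the final x approximates the seeded PageRank vector.
    """
    n = A.shape[0]
    d = A.sum(axis=1)
    x = np.zeros(n)
    r = np.zeros(n)
    r[seed] = 1 - beta
    while True:
        above = np.where(r > tau * d)[0]   # nodes violating the tolerance
        if len(above) == 0:
            break
        j = above[0]
        delta = r[j] - tau * d[j] * rho
        x[j] += delta                      # move mass from residual to solution
        r[j] = tau * d[j] * rho            # leave tau*d_j*rho behind (line 4, i = j)
        for i in np.nonzero(A[j])[0]:      # spread beta*delta to neighbors
            r[i] += beta * delta / d[j]
    return x, r
```

With ρ = 1 and a tiny τ, the output converges to the exact seeded PageRank solution; with larger τ it stops early and stays sparse, which is the behavior the later slides characterize.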
  • 12. The push method stays local.
  • 13. Why do we care about push? 1. Used for empirical studies of "communities" and as an ingredient in an empirically successful community finder (Whang et al., CIKM 2013). 2. Used for "fast PageRank" approximation. 3. It produces sparse approximations to PageRank! Example: Newman's netscience graph, 379 vertices, 1828 non-zeros; the solution is "zero" on most of the nodes, and v has a single non-zero entry (the seed).
  • 14. The s-t min-cut problem. With unweighted incidence matrix B and diagonal cost matrix C:

    minimize ‖Bx‖_{C,1} = Σ_{ij∈E} C_{ij} |xᵢ − xⱼ|
    subject to x_s = 1, x_t = 0, x ≥ 0.
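The objective ‖Bx‖_{C,1} is easy to evaluate concretely: for a 0/1 indicator vector of a set containing s, it is exactly the total weight of the cut edges. A small check, with a made-up weighted graph:

```python
import numpy as np

# Toy weighted graph on 4 nodes (assumed example); edges as (i, j, cost).
edges = [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 2.0), (0, 3, 1.0)]
n = 4
B = np.zeros((len(edges), n))       # signed edge-vertex incidence matrix
C = np.diag([c for _, _, c in edges])
for row, (i, j, _) in enumerate(edges):
    B[row, i], B[row, j] = 1, -1

def cut_objective(x):
    # ||B x||_{C,1} = sum over edges of C_ij * |x_i - x_j|
    return np.abs(C @ B @ x).sum()

# Indicator of the set {0, 1}: only edges (1,2) and (0,3) are cut.
x = np.array([1.0, 1.0, 0.0, 0.0])
print(cut_objective(x))             # 2.0 = cost(1,2) + cost(0,3)
```

Relaxing x from 0/1 indicators to [0, 1] values is what lets the same objective be treated with linear programming or, in the 2-norm variant on the next slides, with a Laplacian solve.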
  • 15. The localized cut graph. Related to a construction used in "FlowImprove," Andersen & Lang (2007), and in Orecchia & Zhu (2014). Connect s to vertices in S with weight α · degree; connect t to vertices in S̄ with weight α · degree:

    A_S = [ 0      αd_Sᵀ    0
            αd_S   A        αd_S̄
            0      αd_S̄ᵀ    0 ]
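The block matrix A_S can be assembled mechanically. A sketch (the function name and node ordering, s first and t last, are my conventions):

```python
import numpy as np

def localized_cut_graph(A, S, alpha):
    """Build the (n+2)x(n+2) adjacency matrix A_S of the localized cut graph:
    node 0 is s, nodes 1..n are the original graph, node n+1 is t.
    s connects to S with weight alpha * degree; t connects to the complement."""
    n = A.shape[0]
    d = A.sum(axis=1)
    dS = np.where(np.isin(np.arange(n), S), d, 0.0)   # degrees restricted to S
    dSbar = d - dS                                    # degrees on the complement
    AS = np.zeros((n + 2, n + 2))
    AS[0, 1:n+1] = alpha * dS        # s -> S edges
    AS[1:n+1, 0] = alpha * dS
    AS[1:n+1, 1:n+1] = A             # original graph in the middle block
    AS[1:n+1, n+1] = alpha * dSbar   # complement -> t edges
    AS[n+1, 1:n+1] = alpha * dSbar
    return AS
```

By construction A_S is symmetric, s touches only vertices in S, and t touches only vertices in the complement, matching the block structure on the slide.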
  • 16. The localized cut graph & PageRank. Solve the s-t min-cut:

    minimize ‖B_S x‖_{C(α),1}
    subject to x_s = 1, x_t = 0, x ≥ 0.
  • 17. The localized cut graph & PageRank. Solve the "spectral" s-t min-cut:

    minimize ‖B_S x‖_{C(α),2}
    subject to x_s = 1, x_t = 0, x ≥ 0.

  The PageRank vector z that solves (αD + L)z = αv with v = d_S/vol(S) is a renormalized solution of the electrical cut computation: minimize ‖B_S x‖_{C(α),2} subject to x_s = 1, x_t = 0. Specifically, if x is the solution, then x = [1; vol(S)z; 0].
  • 18. Back to the push method. Let x be the output from the push method with 0 < β < 1, v = d_S/vol(S), ρ = 1, and τ > 0. Set α = (1 − β)/β, κ = τ·vol(S)/β, and let z_G solve:

    minimize ½‖B_S z‖²_{C(α),2} + κ‖Dz‖₁
    subject to z_s = 1, z_t = 0, z ≥ 0,

  where z = [1; z_G; 0]. Then x = Dz_G/vol(S). Proof: write out the KKT conditions and show that the push method solves them; the slackness conditions were "tricky." The ℓ1 term is regularization for sparsity; the factor D reflects the need for normalization.
  • 19. A simple example. The vectors x_pr, z, and x(α, S) are the PageRank vectors from Theorem 1, where x(α, S) solves Prob. (4) and the others are from the problems at the end of Section 2. The vector x_cut solves the cut Prob. (2), and z_G solves Prob. (6).

    Deg.   x_pr     z        x(α,S)   x_cut   z_G
    2      0.0788   0.0394   0.8276   1       0.2758
    4      0.1475   0.0369   0.7742   1       0.2437
    7      0.2362   0.0337   0.7086   1       0.2138
    4      0.1435   0.0359   0.7533   1       0.2325
    4      0.1297   0.0324   0.6812   1       0.1977
    7      0.1186   0.0169   0.3557   0       0
    3      0.0385   0.0128   0.2693   0       0
    2      0.0167   0.0083   0.1749   0       0
    4      0.0487   0.0122   0.2554   0       0
    3      0.0419   0.0140   0.2933   0       0

  The solution of Prob. (6) (an ℓ1-regularized ℓ2 regression problem) has 24 non-zeros. The true "min-cut" set is large in both the 2-norm PageRank problem and the regularized problem. Thus, we identify the underlying graph feature correctly, but the implicitly regularized ACL procedure does so with many fewer non-zeros than the vanilla PageRank procedure.
  • 20. Figure 2. Examples of the different cut vectors on a portion of the netscience graph. In the left subfigure, we show the set S highlighted with its vertices enlarged. In the other subfigures, we show the solution vectors from the various cut problems (from left to right, Probs. (2), (4), and (6), solved with min-cut, PageRank, and ACL) for this set S; the panels have 16, 15, 284, and 24 non-zeros, respectively. Each vector determines the color and size of a vertex, where high values are large and dark. White vertices with outlines are numerically non-zero (which is why most of the vertices in the fourth figure are outlined, in contrast to the third figure). The true min-cut set is large in all vectors, but the implicitly regularized problem achieves this with many fewer non-zeros than the vanilla PageRank problem. Push's sparsity helps it identify the "right" graph feature with fewer non-zeros: the set S, the min-cut solution, the push solution, the PageRank solution.
  • 21. It's easy to make this apply broadly. It is easy to cook up interesting diffusion-like problems and adapt them to this framework. In particular, Zhou et al. (2004) gave a semi-supervised learning diffusion we are currently studying, with localized cut graph

    [ 0     e_Sᵀ    0
      e_S   θA      e_S̄
      0     e_S̄ᵀ    0 ],

  which yields

    minimize ½‖B_S x̂‖²₂ + κ‖x̂‖₁
    subject to x̂_s = 1, x̂_t = 0, x̂ ≥ 0,

  or equivalently

    minimize ½xᵀ(I + θL)x − xᵀe_S + κ‖x‖₁
    subject to x ≥ 0.
  • 22. Recap & Conclusions. 1. "Defined" algorithmic anti-differentiation to understand why heuristics work. 2. Found equivalences with PageRank and cut/flow. 3. Push & 1-norm regularization. Key point: we don't solve the 1-norm regularized problem with a 1-norm solver, but with the efficient push method. Run push, and you get 1-norm regularization with early stopping. Open issues: better treatment of directed graphs? An algorithm for ρ < 1? (ρ is set to ½ in most "uses" and needs new analysis; coming soon.) Improvements to semi-supervised learning on graphs. Supported by NSF CAREER 1149756-CCF. www.cs.purdue.edu/homes/dgleich
  • 23. PageRank → s-t min-cut. That equivalence works if s is degree-weighted. What if s is the uniform vector?

    A(s) = [ 0          αsᵀ        0
             αs         A          α(d − s)
             0          α(d − s)ᵀ  0 ].

  MMDS 2014 · David Gleich · Purdue