March 2018 1 of 22
Projective Splitting with Forward Steps
and Greedy Activation
Jonathan Eckstein
Rutgers University, New Jersey, USA
Joint work with
Patrick Johnstone (my postdoc)
Rutgers University, New Jersey, USA
Based on earlier work with
Patrick Combettes, Benar F. Svaiter
Funded in part by US National Science
Foundation Grant CCF-1617617
Convex/Monotone Problem Setting
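• $\mathcal{H}_0, \mathcal{H}_1, \ldots, \mathcal{H}_n$ are real Hilbert spaces
• $G_i : \mathcal{H}_0 \to \mathcal{H}_i$ is a bounded (continuous) linear operator $\forall i \in 1..n$
• $T_i : \mathcal{H}_i \rightrightarrows \mathcal{H}_i$ is a maximal monotone operator $\forall i \in 1..n$
Find $x \in \mathcal{H}_0$ such that
$$0 \in \sum_{i=1}^{n} G_i^* T_i(G_i x)$$
Generalization of
$$\min_{x \in \mathcal{H}_0} \left\{ \sum_{i=1}^{n} f_i(G_i x) \right\}$$
where $f_i : \mathcal{H}_i \to \mathbb{R} \cup \{+\infty\}$ are closed proper convex $\forall i \in 1..n$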
For this Short Talk, a Simplification
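• $\mathcal{H}_0 = \mathcal{H}_1 = \cdots = \mathcal{H}_n = \mathcal{H}$
• $G_i = \mathrm{Id}$ $\forall i \in 1..n$
Find $x \in \mathcal{H}$ such that
$$0 \in \sum_{i=1}^{n} T_i(x)$$
which generalizes
$$\min_{x \in \mathcal{H}} \left\{ \sum_{i=1}^{n} f_i(x) \right\}$$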
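As a small worked instance (an illustration that is not in the slides): take $\mathcal{H} = \mathbb{R}$ and the hypothetical quadratics $f_i(x) = \tfrac{a_i}{2}(x - c_i)^2$ with $a_i > 0$. Each $T_i = \nabla f_i$ is then maximal monotone, and the inclusion reduces to one linear equation:

```latex
T_i(x) = a_i (x - c_i), \qquad
0 = \sum_{i=1}^{n} a_i (x - c_i)
\;\Longleftrightarrow\;
x^\star = \frac{\sum_{i=1}^{n} a_i c_i}{\sum_{i=1}^{n} a_i}
```

The same $x^\star$ minimizes $\sum_i f_i$, which is the sense in which the inclusion generalizes the minimization problem.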
The Kuhn-Tucker Set and Fejér Projection Algorithms
$$\mathcal{S} = \Big\{ (z, w_1, \ldots, w_{n-1}) \;\Big|\; w_i \in T_i(z)\ \ \forall i \in 1..n-1,\ \ -\sum_{i=1}^{n-1} w_i \in T_n(z) \Big\}$$
• $z$ solves the inclusion $\Leftrightarrow \exists\, w_1, \ldots, w_{n-1} \in \mathcal{H} : (z, w_1, \ldots, w_{n-1}) \in \mathcal{S}$
• $\mathcal{S}$ is a closed convex set
• We will use a separating hyperplane projection algorithm to try to (weakly) converge to a point in $\mathcal{S}$
• Fejér monotone: non-increasing distance to all points in $\mathcal{S}$
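[Figure: the hyperplane $H_k = \{p \mid \varphi_k(p) = 0\}$, where $\varphi_k$ is affine, $\varphi_k(p) \le 0\ \forall p \in \mathcal{S}$, and $\varphi_k(p^k) > 0$, separates the current iterate $p^k$ from $\mathcal{S}$; the next iterate $p^{k+1}$ is also shown.]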
A Family of Separating Hyperplanes
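Given $(x_i, y_i) \in \operatorname{graph} T_i$ $\forall i \in 1..n$, define
$$\varphi(z, w_1, \ldots, w_{n-1}) = \sum_{i=1}^{n-1} \langle z - x_i,\, y_i - w_i \rangle + \Big\langle z - x_n,\, y_n + \sum_{i=1}^{n-1} w_i \Big\rangle$$
• $\varphi$ is an affine function on $\mathcal{H}^n$ (the quadratic terms cancel)
• $T_1, \ldots, T_n$ monotone $\Rightarrow \varphi(z, w_1, \ldots, w_{n-1}) \le 0\ \ \forall (z, w_1, \ldots, w_{n-1}) \in \mathcal{S}$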
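The two properties above can be checked numerically. The following is a minimal sketch under stated assumptions: $\mathcal{H} = \mathbb{R}$, and the operators `T(i, x) = a[i] * (x - c[i])` with the data `a`, `c`, `xs` are hypothetical choices made only for illustration.

```python
def phi(z, w, xs, ys):
    """Separator phi(z, w_1..w_{n-1}) built from points (x_i, y_i) in graph T_i.

    `w` holds the n-1 free dual variables; the n-th dual is implicitly -sum(w).
    """
    n = len(xs)
    total = sum((z - xs[i]) * (ys[i] - w[i]) for i in range(n - 1))
    total += (z - xs[n - 1]) * (ys[n - 1] + sum(w))
    return total


# Hypothetical monotone operators on R: T_i(x) = a_i * (x - c_i) with a_i >= 0.
a = [1.0, 2.0, 3.0]
c = [0.0, 1.0, 2.0]


def T(i, x):
    return a[i] * (x - c[i])


# A Kuhn-Tucker point: z* solves sum_i T_i(z) = 0 and w_i* = T_i(z*) for i < n.
z_star = sum(ai * ci for ai, ci in zip(a, c)) / sum(a)
w_star = [T(i, z_star) for i in range(len(a) - 1)]

# Arbitrary graph points (x_i, T_i(x_i)) pick one member of the family.
xs = [-1.0, 0.5, 4.0]
ys = [T(i, xs[i]) for i in range(len(a))]

value_at_solution = phi(z_star, w_star, xs, ys)  # non-positive by monotonicity
```

Any other choice of `xs` gives a different affine function, but its value at the Kuhn-Tucker point stays non-positive.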
Constructing a Separating Hyperplane
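Given $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k) \notin \mathcal{S}$, can we find $(x_i^k, y_i^k) \in \operatorname{graph} T_i$ such that
$$\varphi_k(z^k, w_1^k, \ldots, w_{n-1}^k) = \sum_{i=1}^{n-1} \langle z^k - x_i^k,\, y_i^k - w_i^k \rangle + \Big\langle z^k - x_n^k,\, y_n^k + \sum_{i=1}^{n-1} w_i^k \Big\rangle > 0\,?$$
Sufficient to solve the following for each $T_i$:
• Given maximal monotone $T : \mathcal{H} \rightrightarrows \mathcal{H}$ and $(z, w) \in (\mathcal{H} \times \mathcal{H}) \setminus \operatorname{graph} T$, find $(x, y) \in \operatorname{graph} T$ such that $\langle z - x,\, y - w \rangle > 0$ or, equivalently, $\langle x - z,\, y - w \rangle < 0$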
Using a Proximal (Backward) Step ($i \in 1..n-1$)
• Take any $\rho > 0$. Then the proximal step finds the unique $(x_i^k, y_i^k) \in \operatorname{graph} T_i$ such that $x_i^k + \rho y_i^k = z^k + \rho w_i^k$
• So
$$\rho (y_i^k - w_i^k) = z^k - x_i^k \;\Rightarrow\; \langle z^k - x_i^k,\, y_i^k - w_i^k \rangle = \tfrac{1}{\rho} \| z^k - x_i^k \|^2 \ge 0$$
[Figure: the graph of $T_i$ in the $(x, y)$ plane; the line $x + \rho y = z^k + \rho w_i^k$ through $(z^k, w_i^k)$ meets the graph at $(x_i^k, y_i^k)$.]
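For an affine operator the proximal step has a closed form, which makes the identity above easy to verify. This is a sketch assuming $\mathcal{H} = \mathbb{R}$ and a hypothetical operator $T(x) = a(x - c)$ with $a \ge 0$; the function name and the numbers are illustrative only.

```python
def prox_step_affine(a, c, z, w, rho):
    """Backward step for the hypothetical operator T(x) = a*(x - c) on R.

    Returns the unique (x, y) in graph T with x + rho*y = z + rho*w,
    obtained by solving x + rho*a*(x - c) = z + rho*w for x.
    """
    x = (z + rho * w + rho * a * c) / (1.0 + rho * a)
    y = a * (x - c)
    return x, y


z, w, rho = 3.0, -1.0, 0.5
x, y = prox_step_affine(a=2.0, c=1.0, z=z, w=w, rho=rho)

lhs = (z - x) * (y - w)      # <z - x, y - w>
rhs = (z - x) ** 2 / rho     # (1/rho) * ||z - x||^2
```

Here `lhs` and `rhs` agree, and both are strictly positive because $(z, w)$ is not in the graph of $T$.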
Using a Proximal (Backward) Step, Continued
• Defining $w_n^k = -\sum_{i=1}^{n-1} w_i^k$, the same thing works for $i = n$
• Adding up, $\varphi_k(z^k, w_1^k, \ldots, w_{n-1}^k) = \sum_{i=1}^{n} \langle z^k - x_i^k,\, y_i^k - w_i^k \rangle \ge 0$
• And if $\varphi_k(z^k, w_1^k, \ldots, w_{n-1}^k) = 0$, then $z^k = x_1^k = \cdots = x_n^k$ and $w_i^k = y_i^k\ \forall i$, meaning that $(z^k, w_1^k, \ldots, w_{n-1}^k) \in \mathcal{S}$ since $(x_i^k, y_i^k) \in \operatorname{graph} T_i\ \forall i$
• So we strictly separate any $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k) \in \mathcal{H}^n \setminus \mathcal{S}$ from $\mathcal{S}$
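[Figure: the hyperplane $H_k = \{p \mid \varphi_k(p) = 0\}$, with $\varphi_k(p) \le 0\ \forall p \in \mathcal{S}$ and $\varphi_k(p^k) > 0$, strictly separating $p^k$ from $\mathcal{S}$.]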
Algorithm Close to a Special Case of Eckstein and Svaiter 2009
Starting with an arbitrary $(z^0, w_1^0, \ldots, w_{n-1}^0) \in \mathcal{H}^n$:
For $k = 0, 1, 2, \ldots$
1. For $i = 1, \ldots, n$, compute
$$(x_i^k, y_i^k) = \operatorname{Prox}_{T_i}^{\rho_i^k}\big(z^k + \rho_i^k w_i^k\big)$$
(Decomposition Step) (parameters $\rho_i^k$ can vary with $i$ and $k$)
2. Define
$$\varphi_k(z, w_1, \ldots, w_{n-1}) = \sum_{i=1}^{n-1} \langle z - x_i^k,\, y_i^k - w_i \rangle + \Big\langle z - x_n^k,\, y_n^k + \sum_{i=1}^{n-1} w_i \Big\rangle$$
3. Compute $p^{k+1} = (z^{k+1}, w_1^{k+1}, \ldots, w_{n-1}^{k+1})$ by projecting $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k)$ onto the halfspace $\varphi_k(z, w_1, \ldots, w_{n-1}) \le 0$ (possibly with some overrelaxation) (Coordination Step)
Eckstein and Svaiter 2009 showed that the cuts $\varphi_k(z, w_1, \ldots, w_{n-1}) \le 0$ obtained this way (and generalizations) are sufficiently deep for $\{z^k\}$ to converge (weakly) to a solution. For fixed $0 < \rho_{\min} \le \rho_{\max}$, any choices of $\rho_i^k \in [\rho_{\min}, \rho_{\max}]$ are permitted.
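The three steps can be sketched end to end in a few lines. This is a toy illustration, not the authors' implementation: it assumes $\mathcal{H} = \mathbb{R}$, hypothetical affine operators $T_i(x) = a_i(x - c_i)$ (so the proximal step has a closed form), a single fixed $\rho$, the unscaled norm, and no overrelaxation.

```python
def projective_splitting(a, c, z, w, rho=1.0, iters=200):
    """Three-step loop for hypothetical T_i(x) = a[i]*(x - c[i]) on H = R.

    `w` holds w_1..w_{n-1}; the last dual variable is w_n = -sum(w).
    """
    n = len(a)
    for _ in range(iters):
        # 1. Decomposition step: one (closed-form) proximal step per operator.
        w_all = w + [-sum(w)]
        xs = [(z + rho * w_all[i] + rho * a[i] * c[i]) / (1.0 + rho * a[i])
              for i in range(n)]
        ys = [a[i] * (xs[i] - c[i]) for i in range(n)]
        # 2. Value of the affine separator phi_k at the current iterate.
        phi = sum((z - xs[i]) * (ys[i] - w_all[i]) for i in range(n))
        # Gradient of phi_k: d/dz = sum_i y_i, d/dw_i = x_i - x_n.
        g_z = sum(ys)
        g_w = [xs[i] - xs[n - 1] for i in range(n - 1)]
        norm_sq = g_z ** 2 + sum(g ** 2 for g in g_w)
        if phi <= 0.0 or norm_sq == 0.0:
            break  # iterate already lies in the halfspace {phi_k <= 0}
        # 3. Coordination step: project onto {p : phi_k(p) <= 0}.
        step = phi / norm_sq
        z = z - step * g_z
        w = [w[i] - step * g_w[i] for i in range(n - 1)]
    return z, w


a, c = [1.0, 2.0, 3.0], [0.0, 1.0, 2.0]
z_final, w_final = projective_splitting(a, c, z=10.0, w=[0.0, 0.0])
```

For these data the inclusion $0 = \sum_i a_i(z - c_i)$ has the solution $z = 4/3$ with duals $w_1 = 4/3$, $w_2 = 2/3$, which the iterates should approach.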
More on This Class of Algorithm
• (Overrelaxed) projection:
$$p^{k+1} = p^k - \beta_k \, \frac{\max\{0,\, \varphi_k(p^k)\}}{\|\nabla \varphi_k\|^2} \, \nabla \varphi_k$$
• Helpful to use a scaled norm to adjust primal/dual weighting
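To make the projection formula concrete: for the affine $\varphi_k$ defined in step 2 of the algorithm and, as a simplifying assumption, the unscaled product norm on $\mathcal{H}^n$, the gradient can be read off from the coefficients of $z$ and of each $w_i$:

```latex
\begin{aligned}
\nabla_z \varphi_k &= \sum_{i=1}^{n} y_i^k, \qquad
\nabla_{w_i} \varphi_k = x_i^k - x_n^k \quad (i = 1, \ldots, n-1), \\
\|\nabla \varphi_k\|^2 &= \Big\| \sum_{i=1}^{n} y_i^k \Big\|^2
  + \sum_{i=1}^{n-1} \big\| x_i^k - x_n^k \big\|^2
\end{aligned}
```

Under a scaled norm these expressions pick up the corresponding primal and dual weights.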
Further developments:
• Alotaibi, Combettes & Shahzad 2013: include a linear mapping $G$ and solve $0 \in T_1(z) + G^* T_2(Gz)$ with proximal steps on $T_1$ and $T_2$ (not $G^* \circ T_2 \circ G$)
• Combettes and Eckstein 2016: block-iterative and asynchronous versions with $n \ge 2$ operators
o Block iterative: at each iteration, process only a subset of blocks $i$, keep remaining $(x_i^k, y_i^k) \in \operatorname{graph} T_i$ unchanged
o Asynchronous: proximal operations can use (boundedly) outdated information, allowing asynchronous parallel operation
A Recently Solved Challenge
Within the context of this kind of projective splitting algorithm:
• Suppose $T_i$ is Lipschitz continuous with constant $L_i$
• Do we really have to perform a proximal step on such an operator? Can’t we use forward steps instead?
o There are a variety of splitting algorithms that use forward steps $x - \rho T_i(x)$ on Lipschitz operators...
o ...with the stepsize $\rho$ typically bounded by something proportional to $1/L_i$
Answer (from Patrick Johnstone):
• For a Lipschitz operator, you can substitute two forward steps
for a proximal step
Using Two Forward Steps
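Take $x_i^k = z^k - \rho \big(T_i z^k - w_i^k\big)$, then $y_i^k = T_i x_i^k$:
$$\begin{aligned}
\langle z^k - x_i^k,\, y_i^k - w_i^k \rangle
&= \langle z^k - x_i^k,\, T_i z^k - w_i^k \rangle - \langle z^k - x_i^k,\, T_i z^k - y_i^k \rangle \\
&= \tfrac{1}{\rho} \langle z^k - x_i^k,\, z^k - x_i^k \rangle - \langle z^k - x_i^k,\, T_i z^k - T_i x_i^k \rangle \\
&\ge \tfrac{1}{\rho} \| z^k - x_i^k \|^2 - L_i \| z^k - x_i^k \|^2
 = \big( \tfrac{1}{\rho} - L_i \big) \| z^k - x_i^k \|^2
\end{aligned}$$
[Figure: the graph of $T_i$ with the points $(z^k, w_i^k)$, $(z^k, T_i z^k)$, $(x_i^k, w_i^k)$ and $(x_i^k, y_i^k) = (x_i^k, T_i x_i^k)$, and a label $1/\rho$.]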
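The bound in the last line of the derivation can be checked on a nonlinear example. This sketch assumes $\mathcal{H} = \mathbb{R}$ and a hypothetical operator $T(x) = 2x + \sin x$, which is monotone and Lipschitz with constant $L = 3$ because its derivative lies in $[1, 3]$; the function name and the numbers are illustrative.

```python
import math


def two_forward_steps(T, z, w, rho):
    """Replace the proximal step by two evaluations of a Lipschitz T."""
    x = z - rho * (T(z) - w)  # first forward step, uses T(z)
    y = T(x)                  # second evaluation puts (x, y) on graph T
    return x, y


def T(x):
    # Hypothetical monotone operator with Lipschitz constant L = 3.
    return 2.0 * x + math.sin(x)


L = 3.0
z, w, rho = 1.0, -0.5, 0.2    # rho < 1/L, so the step is valid

x, y = two_forward_steps(T, z, w, rho)
inner = (z - x) * (y - w)                 # <z - x, y - w>
bound = (1.0 / rho - L) * (z - x) ** 2    # (1/rho - L) * ||z - x||^2
```

With these numbers `inner` exceeds `bound`, and `bound` is positive because `rho < 1 / L`.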
Using Two Forward Steps, Continued
• So if $1/\rho > L_i \Leftrightarrow \rho < 1/L_i$, we get a valid step
• And it turns out that all the convergence theory continues to go through, including block iterations and asynchronicity
Variations
• If $L_i$ is unknown, it is instead possible to pick some $\Delta > 0$ and backtrack on $\rho$ until
$$\langle z^k - x_i^k,\, y_i^k - w_i^k \rangle \ge \Delta \, \| z^k - x_i^k \|^2$$
This will eventually occur for small enough $\rho$ if $T_i$ is Lipschitz: $t + 1$ operator evaluations, where $t$ is the number of backtrack steps
• If $T_i$ is affine, can just solve for $\rho$ in 2 total evaluations, or…
• ...similarly, solve for $\rho$ maximizing $\langle z^k - x_i^k,\, y_i^k - w_i^k \rangle$
• The convergence theory still holds with all these techniques
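The backtracking variation can be sketched as follows. The halving factor `shrink`, the initial stepsize `rho0`, the cap `max_tries`, and the test operator $T(x) = 2x + \sin x$ on $\mathcal{H} = \mathbb{R}$ are all assumptions made for illustration; the slides fix only the acceptance test with $\Delta$.

```python
import math


def backtracking_forward_step(T, z, w, rho0=1.0, delta=0.1,
                              shrink=0.5, max_tries=50):
    """Shrink rho until <z - x, y - w> >= delta * ||z - x||^2 holds."""
    Tz = T(z)                     # evaluated once, reused for every trial rho
    rho = rho0
    for tries in range(max_tries):
        x = z - rho * (Tz - w)    # trial forward step
        y = T(x)                  # one new operator evaluation per trial
        if (z - x) * (y - w) >= delta * (z - x) ** 2:
            return x, y, rho, tries
        rho *= shrink             # backtrack
    raise RuntimeError("no acceptable stepsize found")


def T(x):
    # Hypothetical Lipschitz monotone operator; its constant is not used below.
    return 2.0 * x + math.sin(x)


z, w = 1.0, -0.5
x, y, rho, tries = backtracking_forward_step(T, z, w)
```

For this example the trial stepsizes 1.0 and 0.5 are rejected and 0.25 is accepted, so `tries` ends at 2.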
“Greedy” Activation Heuristic
• If we don’t overrelax, iteration $k - 1$ typically leaves us with
$$\varphi_{k-1}(z^k, w_1^k, \ldots, w_{n-1}^k) = \sum_{i=1}^{n} \langle z^k - x_i^{k-1},\, y_i^{k-1} - w_i^k \rangle = 0$$
• If we find an $i$ for which $\langle z^k - x_i^{k-1},\, y_i^{k-1} - w_i^k \rangle$ is negative, we can increase it to at least 0 and immediately cut off the current iterate
o Works with either a proximal step or our two-forward-step technique
• Heuristic: give priority to processing $i$ for which $\langle z^k - x_i^{k-1},\, y_i^{k-1} - w_i^k \rangle$ is the most negative
• This does not really maximize the distance to the separator, but seems to be a useful proxy
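The selection rule itself is a one-line argmin. A minimal sketch for $\mathcal{H} = \mathbb{R}$, where `xs` and `ys` stand for the stored points $(x_i^{k-1}, y_i^{k-1})$ and `w_all` for the current duals including $w_n^k$; the helper name `greedy_block` and the numbers are hypothetical.

```python
def greedy_block(z, w_all, xs, ys):
    """Return the block whose term <z - x_i, y_i - w_i> is most negative."""
    terms = [(z - x) * (y - w) for x, y, w in zip(xs, ys, w_all)]
    best = min(range(len(terms)), key=terms.__getitem__)
    return best, terms


z = 1.0
w_all = [0.5, -0.2, -0.3]   # current duals; they sum to zero
xs = [0.0, 2.0, 1.5]        # stored x_i from earlier iterations
ys = [1.0, 0.3, -1.0]       # stored y_i from earlier iterations

best, terms = greedy_block(z, w_all, xs, ys)
```

Only block 1 has a negative term here, so it is the one the heuristic would process next.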
Some Very Preliminary Computational Tests: LASSO
LASSO problems:
$$\min_{x \in \mathbb{R}^d} \left\{ \tfrac{1}{2} \| Qx - b \|^2 + \lambda \| x \|_1 \right\}$$
Partition $Q$ into $r$ blocks of rows, set $n = r + 1$:
$$\min_{x \in \mathbb{R}^d} \left\{ \sum_{i=1}^{r} \tfrac{1}{2} \| Q_i x - b_i \|^2 + \lambda \| x \|_1 \right\}$$
So we can set
$$T_i(x) = Q_i^{\mathsf{T}} (Q_i x - b_i) \ \ \forall i \in 1..n-1, \qquad T_n = \partial \big( \lambda \| \cdot \|_1 \big)$$
• At each iteration, process blocks $\{i, n\}$, where $i \in 1..n-1$ is selected randomly or greedily; forward steps use the $\Delta$ technique
• Did some primal-dual scaling (simple norm change)
• Also simulate random asynchronicity delays
• Measure the number of “Q-equivalent” matrix multiplies
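The operators in this splitting are simple to write down. The sketch below uses plain Python lists; the helper names and the tiny data are hypothetical, and the backward step for $T_n$ relies on the standard fact that the proximal map of $t\|\cdot\|_1$ is soft-thresholding.

```python
import math


def matvec(Q, x):
    return [sum(q * v for q, v in zip(row, x)) for row in Q]


def T_block(Qi, bi, x):
    """T_i(x) = Q_i^T (Q_i x - b_i), the gradient of 0.5*||Q_i x - b_i||^2."""
    r = [qx - b for qx, b in zip(matvec(Qi, x), bi)]
    return [sum(Qi[k][j] * r[k] for k in range(len(Qi)))
            for j in range(len(x))]


def prox_l1(v, t):
    """Soft-thresholding: the proximal map of t*||.||_1."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def backward_step_l1(z, w, lam, rho):
    """(x, y) in graph of T_n = d(lam*||.||_1) with x + rho*y = z + rho*w."""
    v = [zi + rho * wi for zi, wi in zip(z, w)]
    x = prox_l1(v, rho * lam)
    y = [(vi - xi) / rho for vi, xi in zip(v, x)]
    return x, y


# Hypothetical data: 3 rows split into r = 2 row blocks.
Q1, b1 = [[1.0, 0.0], [0.0, 2.0]], [1.0, 2.0]
Q2, b2 = [[1.0, 1.0]], [3.0]
point = [1.0, 1.0]

g1 = T_block(Q1, b1, point)   # forward-step direction for block 1
g2 = T_block(Q2, b2, point)   # forward-step direction for block 2
x_n, y_n = backward_step_l1([1.5, -0.2], [0.0, 0.0], lam=0.5, rho=1.0)
```

Summing `g1` and `g2` gives the gradient of the full least-squares term at `point`, which is why splitting $Q$ by rows leaves the problem unchanged.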
Preliminary Test Results: Blog Data
Legend: ( , )r D , where D is max delay, G = greedy & no delay
Preliminary Test Results: Crime Data
Preliminary Test Results: Randomly Generated Data
Observations
• Projective splitting seems to have some promise as a way to
build efficient parallel algorithms for large-scale problems
• Breaking up the loss function term into multiple blocks seems
to speed up the projective splitting methods – an unusual
property for decomposition algorithms
• Greedy block activation looks useful
• It seems helpful to use forward steps for affine operators
More Coming Soon
• Convergence rate analyses (see also Machado 2017)
Big Open Question
• What is a “killer app” for projective splitting?
Some More Open Questions
Adaptive stepsizes: how might we fully exploit all the allowed
parameter variability?
• Projective splitting has existed for nearly a decade, but we
still don’t know how to use all the extra parameter variability
it allows
• From iteration to iteration
Related question:
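• Given maximal monotone $T : \mathcal{H} \rightrightarrows \mathcal{H}$ and $(z, w) \in (\mathcal{H} \times \mathcal{H}) \setminus \operatorname{graph} T$, find $(x, y) \in \operatorname{graph} T$ such that $\langle x - z,\, y - w \rangle$ is minimized (or at least a “large” negative number)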
• Better yet, minimize $\dfrac{\langle x - z,\, y - w \rangle}{\| x \|^2 + \gamma \| y \|^2}$
References
• Patrick R. Johnstone and Jonathan Eckstein. “Projective
Splitting with Forward Steps: Asynchronous and Block-Iterative
Operator Splitting”. Optimization Online and ArXiv, released
March 2018.

How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 

QMC: Operator Splitting Workshop, Projective Splitting with Forward Steps and Greedy Activation - Jonathan Eckstein, Mar 22, 2018

  • 1. March 2018 1 of 22 Projective Splitting with Forward Steps and Greedy Activation Jonathan Eckstein Rutgers University, New Jersey, USA Joint work with Patrick Johnstone (my postdoc) Rutgers University, New Jersey, USA Based on earlier work with Patrick Combettes, Benar F. Svaiter Funded in part by US National Science Foundation Grant CCF-1617617
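  • 2. Convex/Monotone Problem Setting
      • $\mathcal{H}_0, \mathcal{H}_1, \ldots, \mathcal{H}_n$ are real Hilbert spaces
      • $G_i : \mathcal{H}_0 \to \mathcal{H}_i$ is a bounded/continuous linear operator $\forall i \in 1..n$
      • $T_i : \mathcal{H}_i \rightrightarrows \mathcal{H}_i$ is a maximal monotone operator $\forall i \in 1..n$
      Find $x \in \mathcal{H}_0 : 0 \in \sum_{i=1}^{n} G_i^* T_i(G_i x)$
      Generalization of $\min_{x \in \mathcal{H}_0} \left\{ \sum_{i=1}^{n} f_i(G_i x) \right\}$
      where $f_i : \mathcal{H}_i \to \mathbb{R} \cup \{+\infty\}$ are closed proper convex $\forall i \in 1..n$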
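  • 3. For this Short Talk, a Simplification
      • $\mathcal{H}_0 = \mathcal{H}_1 = \cdots = \mathcal{H}_n = \mathcal{H}$
      • $G_i = \mathrm{Id}$ $\forall i \in 1..n$
      Find $x \in \mathcal{H} : 0 \in \sum_{i=1}^{n} T_i(x)$
      which generalizes $\min_{x \in \mathcal{H}} \left\{ \sum_{i=1}^{n} f_i(x) \right\}$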
  • 4. The Kuhn-Tucker Set and Fejér Projection Algorithms
      $\mathcal{S} = \left\{ (z, w_1, \ldots, w_{n-1}) \in \mathcal{H}^n \;\middle|\; w_i \in T_i(z)\ \forall i \in 1..n-1,\ -\sum_{i=1}^{n-1} w_i \in T_n(z) \right\}$
      • $z$ solves the inclusion $\Leftrightarrow \exists\, w_1, \ldots, w_{n-1} \in \mathcal{H} : (z, w_1, \ldots, w_{n-1}) \in \mathcal{S}$
      • $\mathcal{S}$ is a closed convex set
      • We will use a separating hyperplane projection algorithm to try to (weakly) converge to a point in $\mathcal{S}$
      • Fejér monotone: non-increasing distance to all points in $\mathcal{S}$
      [Figure: $\varphi_k$ is affine, with $H_k = \{ p : \varphi_k(p) = 0 \}$, $\varphi_k(p) \le 0\ \forall p \in \mathcal{S}$, and $\varphi_k(p^k) > 0$; the points $p^k$ and $p^{k+1}$ are marked.]
  • 5. A Family of Separating Hyperplanes
      Given $(x_i, y_i) \in \operatorname{graph} T_i$ $\forall i \in 1..n$, define
      $\varphi(z, w_1, \ldots, w_{n-1}) = \sum_{i=1}^{n-1} \langle z - x_i,\ y_i - w_i \rangle + \left\langle z - x_n,\ y_n + \sum_{i=1}^{n-1} w_i \right\rangle$
      • $\varphi$ is an affine function on $\mathcal{H}^n$ (the quadratic terms cancel)
      • $T_1, \ldots, T_n$ monotone $\Rightarrow \varphi(z, w_1, \ldots, w_{n-1}) \le 0\ \ \forall (z, w_1, \ldots, w_{n-1}) \in \mathcal{S}$
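A small numerical check can make the two properties of $\varphi$ concrete. The sketch below works on the real line with $n = 2$ and assumes two hypothetical operators, $T_1(x) = x - 1$ and $T_2(x) = x - 3$, which are not from the talk; for them the Kuhn-Tucker point is $z = 2$, $w_1 = 1$. The graph points and the function name `phi` are likewise illustrative choices.

```python
def phi(z, w1, pts):
    """Affine separator for n = 2 on the real line; note that w_2 = -w_1."""
    (x1, y1), (x2, y2) = pts
    return (z - x1) * (y1 - w1) + (z - x2) * (y2 + w1)


# Hypothetical monotone operators (assumptions for illustration only).
T1 = lambda x: x - 1.0
T2 = lambda x: x - 3.0

# Arbitrary points on the two graphs.
pts = [(0.0, T1(0.0)), (5.0, T2(5.0))]

# Monotonicity implies phi <= 0 at the Kuhn-Tucker point (z, w1) = (2, 1).
value_at_solution = phi(2.0, 1.0, pts)

# Affine check: the value at a midpoint equals the average of the endpoint values.
p, q = (0.0, 0.0), (4.0, -2.0)
mid_value = phi(2.0, -1.0, pts)
avg_value = 0.5 * (phi(*p, pts) + phi(*q, pts))
```

Here the cross terms $z\,w_1$ cancel between the two inner products, which is the cancellation of quadratic terms noted on the slide.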
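  • 6. Constructing a Separating Hyperplane
      Given $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k) \notin \mathcal{S}$, can we find $(x_i^k, y_i^k) \in \operatorname{graph} T_i$ such that
      $\varphi_k(z^k, w_1^k, \ldots, w_{n-1}^k) = \sum_{i=1}^{n-1} \langle z^k - x_i^k,\ y_i^k - w_i^k \rangle + \left\langle z^k - x_n^k,\ y_n^k + \sum_{i=1}^{n-1} w_i^k \right\rangle > 0$ ?
      Sufficient to solve the following for each $T_i$:
      • Given maximal monotone $T : \mathcal{H} \rightrightarrows \mathcal{H}$ and $(z, w) \in (\mathcal{H} \times \mathcal{H}) \setminus \operatorname{graph} T$, find $(x, y) \in \operatorname{graph} T$ such that $\langle z - x,\ y - w \rangle > 0$, or equivalently $\langle x - z,\ y - w \rangle < 0$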
  • 7. Using a Proximal (Backward) Step ($i \in 1..n-1$)
      • Take any $\rho > 0$. Then the proximal step finds the unique $(x_i^k, y_i^k) \in \operatorname{graph} T_i$ such that $x_i^k + \rho y_i^k = z^k + \rho w_i^k$
      • So $y_i^k - w_i^k = \frac{1}{\rho}(z^k - x_i^k) \ \Rightarrow\ \langle z^k - x_i^k,\ y_i^k - w_i^k \rangle = \frac{1}{\rho} \| z^k - x_i^k \|^2 \ge 0$
      [Figure: the graph of $T_i$, the line $x + \rho y = z^k + \rho w_i^k$, and the points $(x_i^k, y_i^k)$ and $(z^k, w_i^k)$.]
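To see the proximal step in action, here is a minimal sketch on the real line. It assumes the hypothetical operator $T = \partial(\lambda |\cdot|)$, whose resolvent is soft-thresholding; the operator choice, the numbers, and the function names are illustrative and not taken from the talk.

```python
def soft_threshold(v, tau):
    """Resolvent of tau * d|.| on the real line."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def prox_step(z, w, rho, lam):
    """Return the graph point (x, y) of lam * d|.| with x + rho*y = z + rho*w."""
    t = z + rho * w
    x = soft_threshold(t, rho * lam)
    y = (t - x) / rho  # by construction y lies in lam * d|x|
    return x, y


z, w, rho, lam = 2.0, -0.5, 0.5, 1.0
x, y = prox_step(z, w, rho, lam)

# The slide's identity: <z - x, y - w> = (1/rho) * |z - x|^2 >= 0.
gap = (z - x) * (y - w)
```

With these numbers the step gives $x = 1.25$ and $y = 1$, and `gap` equals $(z - x)^2 / \rho = 1.125$, so the point $(z, w)$ is strictly separated.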
  • 8. Using a Proximal (Backward) Step, Continued
      • Defining $w_n^k = -\sum_{i=1}^{n-1} w_i^k$, the same thing works for $i = n$
      • Adding up, $\varphi_k(z^k, w_1^k, \ldots, w_{n-1}^k) = \sum_{i=1}^{n} \langle z^k - x_i^k,\ y_i^k - w_i^k \rangle \ge 0$
      • And if $\varphi_k(z^k, w_1^k, \ldots, w_{n-1}^k) = 0$, then $z^k = x_1^k = \cdots = x_n^k$ and $w_i^k = y_i^k\ \forall i$, meaning that $(z^k, w_1^k, \ldots, w_{n-1}^k) \in \mathcal{S}$ since $(x_i^k, y_i^k) \in \operatorname{graph} T_i\ \forall i$
      • So we strictly separate any $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k) \in \mathcal{H}^n \setminus \mathcal{S}$ from $\mathcal{S}$
      [Figure: the hyperplane $H_k = \{ p : \varphi_k(p) = 0 \}$, with $\varphi_k(p) \le 0\ \forall p \in \mathcal{S}$ and $\varphi_k(p^k) > 0$.]
  • 9. Algorithm Close to a Special Case of E and Svaiter 2009
      Starting with an arbitrary $(z^0, w_1^0, \ldots, w_{n-1}^0) \in \mathcal{H}^n$:
      For $k = 0, 1, 2, \ldots$
      1. For $i = 1, \ldots, n$, compute $(x_i^k, y_i^k) = \operatorname{Prox}_{T_i}^{\rho_i^k}(z^k + \rho_i^k w_i^k)$ (Decomposition Step) (parameters $\rho_i^k$ can vary with $i$ and $k$)
      2. Define $\varphi_k(z, w_1, \ldots, w_{n-1}) = \sum_{i=1}^{n-1} \langle z - x_i^k,\ y_i^k - w_i \rangle + \left\langle z - x_n^k,\ y_n^k + \sum_{i=1}^{n-1} w_i \right\rangle$
      3. Compute $p^{k+1} = (z^{k+1}, w_1^{k+1}, \ldots, w_{n-1}^{k+1})$ by projecting $p^k = (z^k, w_1^k, \ldots, w_{n-1}^k)$ onto the halfspace $\varphi_k(z, w_1, \ldots, w_{n-1}) \le 0$ (possibly with some overrelaxation) (Coordination Step)
      E and Svaiter 2009 showed that the cuts $\varphi_k(z, w_1, \ldots, w_{n-1}) \le 0$ obtained this way (and generalizations) are sufficiently deep for $\{z^k\}$ to converge (weakly) to a solution. For fixed $0 < \rho_{\min} \le \rho_{\max}$, any choices of $\rho_i^k \in [\rho_{\min}, \rho_{\max}]$ are permitted.
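The three steps above can be sketched on the real line with $n = 2$. This is only an illustration under assumptions that are not in the talk: the operators $T_1(x) = x - a$ and $T_2(x) = x - b$, the fixed stepsize, and every function name are hypothetical, and the projection uses no overrelaxation. For these operators the inclusion $0 \in T_1(z) + T_2(z)$ has the solution $z = (a + b)/2$ with $w_1 = (b - a)/2$.

```python
def prox_affine(t, rho, shift):
    """Graph point (x, y) of T(x) = x - shift with x + rho*y = t."""
    x = (t + rho * shift) / (1.0 + rho)
    return x, x - shift


def projective_splitting(a, b, z=0.0, w1=0.0, rho=1.0, iters=40):
    """Projective splitting for 0 in T1(z) + T2(z), T1 = x - a, T2 = x - b."""
    for _ in range(iters):
        # Decomposition step: one proximal step per operator (w_2 = -w_1).
        x1, y1 = prox_affine(z + rho * w1, rho, a)
        x2, y2 = prox_affine(z - rho * w1, rho, b)

        # Value and gradient of the affine separator at the current iterate.
        phi = (z - x1) * (y1 - w1) + (z - x2) * (y2 + w1)
        grad_z, grad_w = y1 + y2, x1 - x2
        norm_sq = grad_z * grad_z + grad_w * grad_w
        if norm_sq == 0.0 or phi <= 0.0:
            break  # the current iterate is not cut off: stop

        # Coordination step: project (z, w1) onto the halfspace phi <= 0.
        step = phi / norm_sq
        z, w1 = z - step * grad_z, w1 - step * grad_w
    return z, w1


z_final, w1_final = projective_splitting(1.0, 3.0)
```

Expanding the separator shows that its gradient in $(z, w_1)$ is $(y_1 + y_2,\ x_1 - x_2)$, so the projection needs only quantities already produced by the decomposition step.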
  • 10. More on This Class of Algorithm
      • (Overrelaxed) projection: $p^{k+1} = p^k - \beta_k \dfrac{\max\{0,\ \varphi_k(p^k)\}}{\|\nabla \varphi_k\|^2} \nabla \varphi_k$
      • Helpful to use a scaled norm to adjust primal/dual weighting
      Further developments:
      • Alotaibi, Combettes & Shahzad 2013: including a linear mapping $G$ and solve $0 \in T_1(z) + G^* T_2(Gz)$ with proximal steps on $T_1$ and $T_2$ (not $G^* \circ T_2 \circ G$)
      • Combettes and Eckstein 2016: block iterative and asynchronous versions with $n \ge 2$ operators
          o Block iterative: at each iteration, process only a subset of blocks $i$, keep remaining $(x_i^k, y_i^k) \in \operatorname{graph} T_i$ unchanged
          o Asynchronous: proximal operations can use (boundedly) outdated information, allowing asynchronous parallel operation
  • 11. A Recently Solved Challenge
      Within the context of this kind of projective splitting algorithm:
      • Suppose $T_i$ is Lipschitz continuous with constant $L_i$
      • Do we really have to perform a proximal step on such an operator? Can't we use forward steps instead?
          o There are a variety of splitting algorithms that use forward steps $x - \rho T_i(x)$ on Lipschitz operators...
          o ...with the stepsize $\rho$ typically bounded by something proportional to $1/L_i$
      Answer (from Patrick Johnstone):
      • For a Lipschitz operator, you can substitute two forward steps for a proximal step
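  • 12. Using Two Forward Steps
      Take $x_i^k = z^k - \rho\,(T_i z^k - w_i^k)$, then $y_i^k = T_i x_i^k$. Then
      $\langle z^k - x_i^k,\ y_i^k - w_i^k \rangle = \langle z^k - x_i^k,\ T_i z^k - w_i^k \rangle - \langle z^k - x_i^k,\ T_i z^k - y_i^k \rangle$
      $\qquad = \frac{1}{\rho} \| z^k - x_i^k \|^2 - \langle z^k - x_i^k,\ T_i z^k - T_i x_i^k \rangle$
      $\qquad \ge \frac{1}{\rho} \| z^k - x_i^k \|^2 - L_i \| z^k - x_i^k \|^2$
      $\qquad = \left( \frac{1}{\rho} - L_i \right) \| z^k - x_i^k \|^2$
      [Figure: the graph of $T_i$ with the points $(z^k, w_i^k)$, $(z^k, T_i z^k)$, $(x_i^k, w_i^k)$, and $(x_i^k, y_i^k) = (x_i^k, T_i x_i^k)$; one segment is labeled $1/\rho$.]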
  • 13. Using Two Forward Steps, Continued
      • So if $1/\rho > L_i \Leftrightarrow \rho < 1/L_i$, we get a valid step
      • And it turns out that all the convergence theory continues to go through, including block iterations and asynchronicity
      Variations
      • If $L_i$ is unknown, it is instead possible to pick some $\Delta > 0$ and backtrack on $\rho$ until $\langle z^k - x_i^k,\ y_i^k - w_i^k \rangle \ge \Delta \| z^k - x_i^k \|^2$. This will eventually occur for small enough $\rho$ if $T_i$ is Lipschitz: $t + 1$ operator evaluations, where $t$ is the number of backtrack steps
      • If $T_i$ is affine, can just solve for $\rho$ in 2 total evaluations, or…
      • ...similarly, solve for $\rho$ maximizing $\langle z^k - x_i^k,\ y_i^k - w_i^k \rangle$
      • The convergence theory still holds with all these techniques
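The backtracking variation can be sketched as follows on the real line. This is a minimal illustration, assuming a hypothetical halving rule for $\rho$ and a hypothetical test operator $T(x) = 4x$ (Lipschitz constant 4); neither the shrink factor nor the function name comes from the talk.

```python
def two_forward_steps(T, z, w, rho, delta=0.1, shrink=0.5, max_backtracks=50):
    """Backtrack on rho until <z - x, y - w> >= delta * |z - x|^2 (scalar case)."""
    Tz = T(z)  # first forward evaluation, reused for every trial stepsize
    for backtracks in range(max_backtracks + 1):
        x = z - rho * (Tz - w)  # forward step from z
        y = T(x)                # second forward evaluation gives a graph point
        if (z - x) * (y - w) >= delta * (z - x) ** 2:
            return x, y, rho, backtracks
        rho *= shrink  # assumed rule: halve the stepsize and try again
    raise RuntimeError("no acceptable stepsize found")


# Hypothetical Lipschitz operator with L = 4 (illustration only).
T = lambda x: 4.0 * x
x, y, rho_used, backtracks = two_forward_steps(T, z=1.0, w=0.0, rho=1.0)
```

For this linear operator the bound on the previous slide holds with equality, so a trial stepsize is accepted exactly when $1/\rho - 4 \ge \Delta$; starting from $\rho = 1$ the first accepted value is $\rho = 0.125$.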
  • 14. "Greedy" Activation Heuristic
      • If we don't overrelax, iteration $k - 1$ typically leaves us with $\varphi_{k-1}(z^k, w_1^k, \ldots, w_{n-1}^k) = \sum_{i=1}^{n} \langle z^k - x_i^{k-1},\ y_i^{k-1} - w_i^k \rangle = 0$
      • If we find an $i$ for which $\langle z^k - x_i^{k-1},\ y_i^{k-1} - w_i^k \rangle$ is negative, we can increase it to at least 0 and immediately cut off the current iterate
          o Works with either a proximal step or our two-forward-step technique
      • Heuristic: give priority to processing $i$ for which $\langle z^k - x_i^{k-1},\ y_i^{k-1} - w_i^k \rangle$ is the most negative
      • This does not really maximize the distance to the separator, but seems to be a useful proxy
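The greedy selection rule amounts to an argmin over per-block inner products. The sketch below assumes vectors stored as plain Python lists and a list `w` that already includes $w_n = -\sum_{i<n} w_i$; the data and the function names are made up for illustration.

```python
def inner(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def greedy_block(z, w, x_old, y_old):
    """Pick the block whose term <z - x_i, y_i - w_i> is the most negative."""
    scores = []
    for w_i, x_i, y_i in zip(w, x_old, y_old):
        dz = [a - b for a, b in zip(z, x_i)]
        dy = [a - b for a, b in zip(y_i, w_i)]
        scores.append(inner(dz, dy))
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Illustrative data with n = 3 blocks in R^2; the dual vectors sum to zero.
z = [1.0, 0.0]
w = [[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
x_old = [[-1.0, 0.0], [0.0, 0.0], [2.0, 1.0]]
y_old = [[2.0, 0.0], [0.0, 0.0], [0.0, 2.0]]

best, scores = greedy_block(z, w, x_old, y_old)
```

In this example the scores sum to zero, mirroring the situation described on the slide, and the block with the most negative term is selected for reprocessing.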
  • 15. Some Very Preliminary Computational Tests: LASSO
      LASSO problems: $\min_{x \in \mathbb{R}^d} \left\{ \frac{1}{2} \| Qx - b \|^2 + \lambda \| x \|_1 \right\}$
      Partition $Q$ into $r$ blocks of rows, set $n = r + 1$:
      $\min_{x \in \mathbb{R}^d} \left\{ \sum_{i=1}^{r} \frac{1}{2} \| Q_i x - b_i \|^2 + \lambda \| x \|_1 \right\}$
      So we can set $T_i(x) = Q_i^{\top}(Q_i x - b_i)\ \ \forall i \in 1..n-1$ and $T_n = \partial\left( \lambda \| \cdot \|_1 \right)$
      • At each iteration, process blocks $\{i, n\}$, where $i \in 1..n-1$ is selected randomly or greedily; forward steps use the $\Delta$ technique
      • Did some primal-dual scaling (simple norm change)
      • Also simulate random asynchronicity delays
      • Measure the number of "Q-equivalent" matrix multiplies
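The row-block operators can be sketched in plain Python. The small matrix, the right-hand side, and all helper names below are made-up assumptions for illustration; the check at the end confirms that the block operators $T_i(x) = Q_i^{\top}(Q_i x - b_i)$ add up to the gradient of the full least-squares term.

```python
def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def make_block_operator(Q_i, b_i):
    """Return T_i(x) = Q_i^T (Q_i x - b_i) for one block of rows."""
    def T_i(x):
        residual = [q - b for q, b in zip(matvec(Q_i, x), b_i)]
        return rmatvec(Q_i, residual)
    return T_i


# Made-up data: 4 rows, 2 columns, split into r = 2 row blocks.
Q = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0], [1.0, 1.0]]
b = [1.0, 0.0, 2.0, 1.0]
r = 2
rows = len(Q) // r
blocks = [
    make_block_operator(Q[k * rows:(k + 1) * rows], b[k * rows:(k + 1) * rows])
    for k in range(r)
]

x = [1.0, -1.0]
block_sum = [sum(vals) for vals in zip(*(T(x) for T in blocks))]
full_gradient = make_block_operator(Q, b)(x)
```

Each block operator is affine and Lipschitz, which is what allows forward steps on blocks $1..n-1$ while the $\ell_1$ term is handled by a proximal step.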
  • 16. Preliminary Test Results: Blog Data
      Legend: $(r, D)$, where $D$ is max delay, G = greedy & no delay
  • 17. Preliminary Test Results: Crime Data
  • 18. Preliminary Test Results: Randomly Generated Data
  • 19. Observations
      • Projective splitting seems to have some promise as a way to build efficient parallel algorithms for large-scale problems
      • Breaking up the loss function term into multiple blocks seems to speed up the projective splitting methods, an unusual property for decomposition algorithms
      • Greedy block activation looks useful
      • It seems helpful to use forward steps for affine operators
      More Coming Soon
      • Convergence rate analyses (see also Machado 2017)
  • 20. Big Open Question
      • What is a "killer app" for projective splitting?
  • 21. Some More Open Questions
      Adaptive stepsizes: how might we fully exploit all the allowed parameter variability?
      • Projective splitting has existed for nearly a decade, but we still don't know how to use all the extra parameter variability it allows
      • From iteration to iteration
      Related question:
      • Given maximal monotone $T : \mathcal{H} \rightrightarrows \mathcal{H}$ and $(z, w) \in (\mathcal{H} \times \mathcal{H}) \setminus \operatorname{graph} T$, find $(x, y) \in \operatorname{graph} T$ such that $\langle x - z,\ y - w \rangle$ is minimized (or at least a "large" negative number)
      • Better yet, minimize $\langle x - z,\ y - w \rangle$ relative to a $\gamma$-weighted sum of $\| x \|^2$ and $\| y \|^2$
  • 22. References
      • Patrick R. Johnstone and Jonathan Eckstein. "Projective Splitting with Forward Steps: Asynchronous and Block-Iterative Operator Splitting". Optimization Online and arXiv, released March 2018.