A Note On the Forward-Douglas‒Rachford Splitting Algorithm and its Application to Convex Optimization

For solving the monotone inclusion problem

    find x ∈ zer{A + B} ,

where B has sufficient regularity, the forward-backward splitting algorithm is essentially the repeated application of the operator

    T_FB := J_A(Id − B) ,

where J_A := (Id + A)⁻¹ is the resolvent of A.
In T_FB, the resolvent of A and the application of B are computed separately, hence the term splitting. This is useful in many applications, in particular for finding a minimum of a sum of convex functions, where A plays the role of the subdifferential of a simple nonsmooth functional and B plays the role of the gradient of a smooth functional.
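As an illustration, here is a minimal sketch of the iteration of T_FB on a toy problem of our choosing: minimizing 0.5‖x − b‖² + λ‖x‖₁. The step size gamma, the function names and the data are assumptions made for the example; the formula above absorbs the step size into A and B.

```python
def soft_threshold(v, t):
    """Resolvent of t * subdifferential of |.| (soft-thresholding of a scalar)."""
    return max(v - t, 0.0) + min(v + t, 0.0)

def forward_backward(b, lam, gamma=0.5, n_iter=100):
    """Iterate x <- J_{gamma A}(x - gamma B x), with
    A = subdifferential of lam * ||.||_1 and B = gradient of 0.5 * ||. - b||^2.
    gamma is an assumed step size, taken in (0, 2) since B is 1-Lipschitz here."""
    x = [0.0] * len(b)
    for _ in range(n_iter):
        # forward (explicit) step on the smooth part
        y = [xi - gamma * (xi - bi) for xi, bi in zip(x, b)]
        # backward (implicit) step on the nonsmooth part
        x = [soft_threshold(yi, gamma * lam) for yi in y]
    return x

b = [3.0, -0.5, 1.2, -4.0]
x_fb = forward_backward(b, lam=1.0)
# closed-form minimizer of this toy problem, for comparison
x_ref = [soft_threshold(bi, 1.0) for bi in b]
```

On this toy problem the minimizer is known in closed form (soft-thresholding of b), which makes the iteration easy to check.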
Now, for solving the monotone inclusion problem

    find x ∈ zer{A + C} ,

when C does not have the required regularity of B above (in convex optimization, when it is also a subdifferential of a nonsmooth functional), one can call on the Douglas‒Rachford splitting algorithm, which is essentially the repeated application of the operator

    T_DR := 1/2(R_A R_C + Id) = J_A(2J_C − Id) + (Id − J_C) ,

where we conveniently denote R_A := 2J_A − Id. These algorithms have been known at least since the work of Lions and Mercier (1979).
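For concreteness, the second form of T_DR can be sketched on a toy problem with two nonsmooth terms, chosen by us for the example: minimizing ‖x − b‖₁ over the box [lo, hi]ⁿ. Here A is taken as the normal cone of the box (its resolvent is a clipping) and C as the subdifferential of ‖· − b‖₁; the step size gamma and all names are assumptions of the sketch.

```python
def soft_threshold(v, t):
    """Soft-thresholding of a scalar v at level t."""
    return max(v - t, 0.0) + min(v + t, 0.0)

def clip(v, lo, hi):
    """Projection of a scalar onto [lo, hi]."""
    return min(max(v, lo), hi)

def douglas_rachford(b, lo, hi, gamma=1.0, n_iter=200):
    """Iterate z <- J_A(2 J_C z - z) + z - J_C z and return J_C z.
    J_C: resolvent of gamma * subdifferential of ||. - b||_1;
    J_A: projection onto the box [lo, hi]^n."""
    def j_c(z):
        return [bi + soft_threshold(zi - bi, gamma) for zi, bi in zip(z, b)]
    z = [0.0] * len(b)
    for _ in range(n_iter):
        x_c = j_c(z)
        x_a = [clip(2.0 * xc - zi, lo, hi) for xc, zi in zip(x_c, z)]
        z = [zi + xa - xc for zi, xa, xc in zip(z, x_a, x_c)]
    # the solution is read off through J_C, not from z itself
    return j_c(z)

x_dr = douglas_rachford([5.0, -3.0, 1.0], lo=0.0, hi=2.0)
```

Note that the iterate z is only an auxiliary variable; the candidate solution is J_C z, which for this toy problem should be b clipped to the box.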

The extension of the Douglas‒Rachford algorithm to a splitting of an arbitrary number of operators has been pretty straightforward. However, tackling

    find x ∈ zer{ ∑_i A_i + B} ,

with both full splitting of the operators and exploiting the regularity of B was not possible until our generalized forward-backward algorithm, which we published in 2013. It consists in duplicating the search space as many times as there are nonsmooth operators in the splitting and, in this augmented space, repeatedly applying the following operator

    T_GFB := 1/2(R_A R_S + Id)(Id − BP_S) = J_A(2P_S − Id − BP_S) + (Id − P_S) ,

where J_A denotes here the parallel application of the resolvents of the A_i's, P_S is the orthogonal projector onto the first diagonal S of the augmented space (it sets all auxiliary variables equal to their average), and R_S := 2P_S − Id.
This proved rather useful, since many modern signal processing or machine learning tools are formulated as the minimization of a structured sum of a smooth functional and several additional nonsmooth ones.
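A minimal sketch of the second form of T_GFB, on a toy problem of our choosing with two nonsmooth terms: minimizing 0.5‖x − b‖² + λ‖x‖₁ over the box [lo, hi]ⁿ. The step size gamma, the uniform weights 1/2 on the two auxiliary variables (which is why the resolvents below use 2·gamma), and all names are assumptions of the example rather than part of the formula above.

```python
def soft_threshold(v, t):
    """Soft-thresholding of a scalar v at level t."""
    return max(v - t, 0.0) + min(v + t, 0.0)

def clip(v, lo, hi):
    """Projection of a scalar onto [lo, hi]."""
    return min(max(v, lo), hi)

def generalized_forward_backward(b, lam, lo, hi, gamma=1.0, n_iter=300):
    """One auxiliary variable per nonsmooth term (z1 for the l1 norm,
    z2 for the box constraint), coupled through their average x = P_S z."""
    d = len(b)
    z1, z2 = [0.0] * d, [0.0] * d
    for _ in range(n_iter):
        x = [(u + v) / 2.0 for u, v in zip(z1, z2)]      # P_S z
        # 2 P_S z - gamma * B P_S z, with B the gradient of 0.5 * ||. - b||^2
        fwd = [2.0 * xi - gamma * (xi - bi) for xi, bi in zip(x, b)]
        # parallel resolvents, each followed by the (Id - P_S) correction
        z1 = [zi + soft_threshold(fi - zi, 2.0 * gamma * lam) - xi
              for zi, fi, xi in zip(z1, fwd, x)]
        z2 = [zi + clip(fi - zi, lo, hi) - xi
              for zi, fi, xi in zip(z2, fwd, x)]
    return [(u + v) / 2.0 for u, v in zip(z1, z2)]

x_gfb = generalized_forward_backward([3.0, -0.5, 1.2, -4.0],
                                     lam=1.0, lo=0.0, hi=1.5)
```

Since the toy problem is separable, its minimizer is the soft-thresholding of b clipped to the box, which the averaged iterate should approach.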

Not long after, Briceño-Arias (2015) realized that one can replace the set S above by any closed vector subspace V without changing anything in the above operator T_GFB (in the first form given; the second becomes J_A(2P_V − Id − P_V BP_V) + (Id − P_V) in all generality, note the additional projection onto V). Although he does not even attempt to show instances where this might be useful, this apparently was enough to get a publication in Optimization. He did find a name much fancier than ours though: forward-Douglas‒Rachford. Indeed, the operator T_GFB really looks like the fusion of T_FB and T_DR.
However, the analogy is not perfectly suitable yet, because one cannot tackle an additional arbitrary monotone operator C just as the Douglas‒Rachford operator T_DR does. But along came Davis and Yin (2015), who realized that the orthogonal projector P_S is nothing but the resolvent J_{N_S} of the normal cone N_S. Replacing the normal cone N_S by an arbitrary C in T_GFB yields the following operator

    T_FDR := J_A(2J_C − Id − BJ_C) + (Id − J_C) .
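To make the operator concrete, here is a minimal sketch of its iteration on a toy problem of our choosing: minimizing 0.5‖x − b‖² + λ‖x‖₁ over the box [lo, hi]ⁿ. We take A as the subdifferential of λ‖·‖₁, B as the gradient of the quadratic term, and C as the normal cone of the box, so that J_C is a plain projection. The step size gamma and all names are assumptions of the example.

```python
def soft_threshold(v, t):
    """Soft-thresholding of a scalar v at level t."""
    return max(v - t, 0.0) + min(v + t, 0.0)

def clip(v, lo, hi):
    """Projection of a scalar onto [lo, hi]."""
    return min(max(v, lo), hi)

def forward_douglas_rachford(b, lam, lo, hi, gamma=1.0, n_iter=200):
    """Iterate z <- J_A(2 J_C z - z - gamma B J_C z) + z - J_C z
    and return J_C z. gamma is an assumed step size in (0, 2),
    since B is 1-Lipschitz for this toy problem."""
    z = [0.0] * len(b)
    for _ in range(n_iter):
        x_c = [clip(zi, lo, hi) for zi in z]                     # J_C z
        # 2 J_C z - z - gamma * B J_C z
        arg = [2.0 * xc - zi - gamma * (xc - bi)
               for xc, zi, bi in zip(x_c, z, b)]
        x_a = [soft_threshold(ai, gamma * lam) for ai in arg]    # J_A(...)
        z = [zi + xa - xc for zi, xa, xc in zip(z, x_a, x_c)]
    return [clip(zi, lo, hi) for zi in z]

x_fdr = forward_douglas_rachford([3.0, -0.5, 1.2, -4.0],
                                 lam=1.0, lo=0.0, hi=1.5)
```

Compared with a generalized forward-backward treatment of the same toy problem, a single auxiliary variable z suffices here, since the constraint is handled through J_C instead of an additional copy of the variable.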

Although this generalization is straightforward, they must be given credit for the convergence analysis of this iteration, which is more delicate. They call it a “three operator splitting scheme”, which is regrettable because it should be called exactly forward-Douglas‒Rachford. We would also have liked “generalized generalized forward-backward”, but unfortunately the authors had never heard of us. In fact, reading their paper, which makes no mention of our work, one gets the impression that their operator comes out of nowhere.

They could also have tried to illustrate numerically, or at least discuss, instances where such an extension is practically useful. We may have found applications where this is the case. Well, sort of. Not really, actually, but it does remain true that the forward-Douglas‒Rachford is more elegant than the plain generalized forward-backward on these instances.
Everything is detailed in our note, where we notably specify the case with an arbitrary number of operators A_i and the use of preconditioners (published in Optimization Letters; reference available in BibTeX format).
A C++ implementation of the resulting method, interfaced with GNU Octave or Matlab, for typical signal processing and learning tasks can be found at the dedicated GitLab repository.
