# A Note On the Forward-Douglas‒Rachford Splitting Algorithm and its Application to Convex Optimization

For solving the monotone inclusion problem

find *x* ∈ zer(*A* + *B*) ,

where *B* has sufficient regularity, the **forward-backward** splitting algorithm is essentially the repeated application of the operator

*T*_{FB} := *J*_{A}(Id − *B*) ,

where *J*_{A} := (Id + *A*)^{−1} is the resolvent of *A*.

In *T*_{FB}, the resolvent of *A* and the application of *B* are performed separately, hence the term *splitting*.
This is useful in many applications, in particular for finding a minimum of a sum of convex functionals: *A* plays the role of the subdifferential of a simple nonsmooth functional, and *B* plays the role of the gradient of a smooth one.
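In convex optimization, iterating *T*_{FB} is the classical proximal-gradient method. Here is a minimal sketch on an illustrative lasso-type problem; the data, step size, and function names are our own assumptions for the example, not taken from any referenced implementation:

```python
import numpy as np

# Forward-backward sketch for min_x 1/2||M x - y||^2 + lam*||x||_1,
# where B = gradient of the smooth quadratic and J_A = prox of lam*||.||_1
# (soft-thresholding). All problem data below are illustrative.

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def forward_backward(M, y, lam, gamma, n_iter=500):
    x = np.zeros(M.shape[1])
    for _ in range(n_iter):
        grad = M.T @ (M @ x - y)                           # forward (explicit) step on B
        x = soft_threshold(x - gamma * grad, gamma * lam)  # backward (resolvent) step on A
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((20, 5))
x_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
y = M @ x_true
gamma = 1.0 / np.linalg.norm(M, 2) ** 2   # step size 1/L, with L the Lipschitz constant of B
x_hat = forward_backward(M, y, lam=0.1, gamma=gamma)
```

With this step size the composite objective decreases monotonically, and the iterates converge to a minimizer.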

Now, for solving the monotone inclusion problem

find *x* ∈ zer(*A* + *C*) ,

when *C* does not have the required regularity of *B* above (in convex optimization, when it is also a subdifferential of a nonsmooth functional), one can call on the **Douglas‒Rachford** splitting algorithm, which is essentially the repeated application of the operator

*T*_{DR} := 1/2(*R*_{A}*R*_{C} + Id) = *J*_{A}(2*J*_{C} − Id) + (Id − *J*_{C}) ,

where we conveniently denote *R*_{A} := 2*J*_{A} − Id, and similarly for *R*_{C}.
These algorithms have been known at least since the work of Lions and Mercier (1979).
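As a concrete instance: when *A* and *C* are normal cones of convex sets, both resolvents are projections, and iterating *T*_{DR} finds a point in the intersection of the two sets. A minimal sketch, where the sets (a box and a half-space) and the starting point are illustrative assumptions:

```python
import numpy as np

# Douglas-Rachford sketch: find a point in the intersection of two convex sets,
# i.e. solve 0 in N_A(x) + N_C(x), where both resolvents are projections.
# The sets and the starting point below are illustrative.

def proj_box(x, lo=-1.0, hi=1.0):        # J_A: projection onto a box
    return np.clip(x, lo, hi)

def proj_halfspace(x, a, b):             # J_C: projection onto {x : a.x <= b}
    s = a @ x - b
    return x if s <= 0 else x - (s / (a @ a)) * a

a, b = np.array([1.0, 1.0]), 1.0
z = np.array([3.0, 3.0])
for _ in range(100):
    p = proj_halfspace(z, a, b)          # J_C z
    q = proj_box(2 * p - z)              # J_A (2 J_C - Id) z
    z = z + q - p                        # T_DR z = J_A(2 J_C - Id) z + (Id - J_C) z
x = proj_halfspace(z, a, b)              # the "shadow" sequence J_C z converges to a solution
```

Note that the fixed point *z* itself need not lie in the intersection; the solution is read off as *J*_{C}*z*.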

The extension of the Douglas‒Rachford algorithm to a splitting of an arbitrary number of operators is pretty straightforward. However, tackling

find *x* ∈ zer(∑_{i} *A*_{i} + *B*) ,

with both full splitting of the operators *and* exploitation of the regularity of *B* was not possible until our generalized forward-backward, which we published in 2013. It consists in duplicating the search space as many times as there are nonsmooth operators in the splitting and, in this augmented space, repeatedly applying the following operator

*T*_{GFB} := 1/2(*R*_{A}*R*_{S} + Id)(Id − *B**P*_{S}) = *J*_{A}(2*P*_{S} − Id − *B**P*_{S}) + (Id − *P*_{S}) ,

where *J*_{A} here denotes the parallel application of the resolvents of the *A*_{i}'s, *P*_{S} is the orthogonal projector onto the first diagonal of the augmented space (setting all auxiliary variables equal to their average), and *R*_{S} := 2*P*_{S} − Id.
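A minimal sketch of this iteration in convex optimization, with equal weights and unit relaxation, on an illustrative problem with two nonsmooth terms; the data, step size, and iteration count are our own assumptions:

```python
import numpy as np

# Generalized forward-backward sketch (equal weights w_i = 1/n, relaxation 1)
# on the illustrative problem min_x 1/2||x - d||^2 + lam*||x||_1 + i_{x >= 0}:
# two nonsmooth operators A_1 = d(lam*||.||_1), A_2 = N_{x >= 0}, and the
# smooth gradient B(x) = x - d. All data below are illustrative.

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

d = np.array([2.0, -1.0, 0.05])
lam, gamma = 0.1, 1.0
resolvents = [lambda v, t: soft_threshold(v, t * lam),   # J of lam*||.||_1
              lambda v, t: np.maximum(v, 0.0)]           # J of N_{x >= 0} (projection)
n = len(resolvents)
z = [d.copy() for _ in range(n)]       # auxiliary variables in the augmented space
for _ in range(500):
    x = sum(z) / n                     # P_S: project onto the diagonal (average)
    grad = x - d                       # forward step on B at the averaged point
    for i, J in enumerate(resolvents):
        # z_i <- z_i + J_{gamma*A_i/w_i}(2x - z_i - gamma*B(x)) - x
        z[i] = z[i] + J(2 * x - z[i] - gamma * grad, n * gamma) - x
x = sum(z) / n                         # converges to the minimizer
```

Note the step rescaling by *n* inside each resolvent, which comes from the weights *w*_{i} = 1/*n* of the diagonal.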

This proved rather useful, since many modern signal processing and machine learning tools are formulated as the minimization of a structured sum of a smooth functional with several additional nonsmooth ones.

Not long after, Briceño-Arias (2015) realizes that one can replace the set *S* above by any closed vector space *V* without changing anything in the operator *T*_{GFB} (in the first form given; the second form becomes *J*_{A}(2*P*_{V} − Id − *P*_{V}*B**P*_{V}) + (Id − *P*_{V}) in all generality, note the additional projection onto *V*).
Although he does not even attempt to show instances where this might be useful, this apparently is enough to get a publication in *Optimization*. He does find a name much fancier than ours, though: **forward-Douglas‒Rachford**.
Indeed, the operator *T*_{GFB} really looks like the fusion of *T*_{FB} and *T*_{DR}.

However, the analogy is not perfect yet, because one cannot tackle an additional arbitrary monotone operator *C* the way the Douglas‒Rachford operator *T*_{DR} does. But along come Davis and Yin (2015), realizing that an orthogonal projector is nothing but the resolvent of a normal cone: *P*_{S} = *J*_{N_{S}}. Replacing the normal cone *N*_{S} by the arbitrary *C* in *T*_{GFB} yields the following operator

*T*_{FDR} := *J*_{A}(2*J*_{C} − Id − *B**J*_{C}) + (Id − *J*_{C}) .
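A minimal sketch of this three-operator iteration in convex optimization, on an illustrative problem where *J*_{C} is a projection, *J*_{A} is a soft-thresholding, and *B* is the gradient of a quadratic; the data, step size, and iteration count are our own assumptions:

```python
import numpy as np

# Forward-Douglas-Rachford (Davis-Yin) sketch on the illustrative problem
# min_x 1/2||x - d||^2 + lam*||x||_1 + i_{x >= 0}, with
# J_C = projection onto {x >= 0}, J_A = soft-thresholding, B(x) = x - d.
# All data below are illustrative.

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

d = np.array([2.0, -1.0, 0.05])
lam, gamma = 0.1, 1.0          # gamma < 2*beta, with B being 1-cocoercive here
z = d.copy()
for _ in range(100):
    u = np.maximum(z, 0.0)                                        # J_C z
    v = soft_threshold(2 * u - z - gamma * (u - d), gamma * lam)  # J_A(2 J_C - Id - B J_C) z
    z = z + v - u                                                 # T_FDR z
x = np.maximum(z, 0.0)         # the solution is read off as J_C z
```

Compared with the generalized forward-backward on the same problem, no duplication of the search space is needed: a single auxiliary variable *z* suffices, which is where the extra elegance lies.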

Although this generalization is straightforward, they must be given credit: the convergence analysis of this iteration is more delicate.
They call it a “three operator splitting scheme”, which is regrettable, because it should really be called forward-Douglas‒Rachford. We would also have liked “generalized generalized forward-backward”, but unfortunately the authors never heard of us. In fact, reading their paper, which makes no mention of our work, their operator seems to come out of nowhere.

They could also have tried to illustrate numerically, or at least discuss, instances where such an extension is practically useful. We may have found applications where this is the case. Well, sort of. Not really, actually, but it does remain true that the forward-Douglas‒Rachford is more elegant than the plain generalized forward-backward on these instances.

Everything is detailed in our note, where we notably treat the case of an arbitrary number of operators *A*_{i} and the use of preconditioners (published in *Optimization Letters*; reference available in BibTeX format).

A C++ implementation of the resulting method, interfaced with GNU Octave or Matlab and applied to typical signal processing and learning tasks, can be found at the dedicated GitHub repository.