The conjugate gradient algorithm is used to solve the quadratic minimization problem:

 $\min_{x}\left(\frac{1}{2}x^{T}Qx-b^{T}x\right)$

or equivalently to solve the linear system $Qx-b=0$, where $Q$ is a given $n$ by $n$ symmetric positive definite matrix and $b$ is a given $n$-vector. (The two problems are equivalent because $Qx-b$ is the gradient of the quadratic, which vanishes at the minimizer.)

Starting from an arbitrary initial guess $x_{0}$ (often $x_{0}=0$ is used), the algorithm finds the solution in at most $n$ iterations in exact arithmetic. We will use the following notation:

• $i$ — iteration number;

• $x_{i}$ — solution approximation;

• $d_{i}$ — search direction;

• $r_{i}$ — residual, which is defined as $b-Qx_{i}$.

Algorithm

1.

Initialization. Let $x_{0}=0$ (or other starting point). Let $r_{0}=d_{0}=-Qx_{0}+b$. (The initial search direction is set to minus the gradient of the quadratic function being minimized, evaluated at the starting point).

2.

For $i=0$ to $n-1$ compute

 $\alpha=\frac{r_{i}^{T}d_{i}}{d_{i}^{T}Qd_{i}}$
 $x_{i+1}=x_{i}+\alpha d_{i}$
 $r_{i+1}=r_{i}-\alpha Qd_{i}$

If $\|r_{i+1}\|<\varepsilon$ or $i=n-1$, stop: the solution has been found. Otherwise, continue:

 $\beta=\frac{r_{i+1}^{T}Qd_{i}}{d_{i}^{T}Qd_{i}}$
 $d_{i+1}=-r_{i+1}+\beta d_{i}$

(In each iteration the solution estimate is set to the previous estimate plus a multiple of the previous search direction. The next search direction is then set to the gradient plus a multiple of the previous search direction).
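The two steps above translate directly into code. The following NumPy sketch follows the formulas exactly; the function name, argument defaults, and tolerance are illustrative choices, not part of the original formulation:

```python
import numpy as np

def conjugate_gradient(Q, b, x0=None, eps=1e-10):
    """Solve Qx = b by the conjugate gradient iteration described above."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    r = b - Q @ x            # residual r_0 = b - Q x_0
    d = r.copy()             # initial search direction d_0 = r_0
    for i in range(n):
        Qd = Q @ d
        alpha = (r @ d) / (d @ Qd)
        x = x + alpha * d                # step along the search direction
        r = r - alpha * Qd               # update the residual
        if np.linalg.norm(r) < eps or i == n - 1:
            break                        # residual small enough: done
        beta = (r @ Qd) / (d @ Qd)
        d = -r + beta * d                # next direction, as in step 2
    return x
```

In exact arithmetic the loop runs at most $n$ times; in floating point the tolerance test usually stops it earlier for well-conditioned $Q$.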

Discussion

The conjugate gradient method was developed in 1952 by Hestenes and Stiefel as an improvement to the steepest descent method. Whereas steepest descent approaches the solution asymptotically, the conjugate gradient method will find the solution in n iterations (assuming no roundoff error).

Why the name? The search directions $d_{i}$ are conjugate in the sense that $d_{j}^{T}Qd_{i}=0$ for $j\neq i$. In addition, these directions are computed from (but are not equal to) the gradient.

The conjugate gradient method has been generalized to the case where the function being minimized is only approximately quadratic. In that case the explicit formula for $\alpha$ given above is replaced by a line-search procedure: various values of $\alpha$ are tried, and the value that gives the smallest value of the objective is chosen. Well-known generalized conjugate gradient methods include the Fletcher-Reeves method and the Polak-Ribière method.
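As a sketch of this line-search variant, the following Python code implements a Fletcher-Reeves style nonlinear conjugate gradient. The backtracking line search and the restart safeguard are illustrative simplifications, not part of the original methods (practical codes use a Wolfe-condition line search):

```python
import numpy as np

def fletcher_reeves(f, grad, x0, iters=200, tol=1e-8):
    """Nonlinear conjugate gradient with the Fletcher-Reeves beta and a
    crude backtracking line search standing in for the explicit alpha."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                # start along the negative gradient
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        # backtracking line search: halve alpha until the objective decreases
        alpha = 1.0
        while f(x + alpha * d) > f(x) and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * d
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves formula
        d = -g_new + beta * d
        if g_new @ d >= 0:                # safeguard: restart with steepest descent
            d = -g_new
        g = g_new
    return x
```

On an exactly quadratic objective with an exact line search, this reduces to the linear conjugate gradient method above.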

Example

Solve

 $\left(\begin{array}[]{rr}1&2\\ 2&1\end{array}\right)x=\left(\begin{array}[]{rr}-3\\ 0\end{array}\right)$

We have

 $x_{0}=\left(\begin{array}[]{rr}0\\ 0\end{array}\right)$

then

 $x_{1}=\left(\begin{array}[]{rr}-3\\ 0\end{array}\right)$

and finally

 $x_{2}=\left(\begin{array}[]{rr}1\\ -2\end{array}\right)$

which is the solution. (Incidentally, this $Q$ has eigenvalues $3$ and $-1$, so it is symmetric but not positive definite; the iteration still terminates in $n=2$ steps here, although positive definiteness is what rules out breakdown, i.e. $d^{T}Qd=0$, in general.)
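The arithmetic can be verified with a short NumPy script that mirrors the two iterations step by step; the intermediate values in the comments follow the hand computation:

```python
import numpy as np

Q = np.array([[1., 2.], [2., 1.]])
b = np.array([-3., 0.])

x = np.zeros(2)
r = b - Q @ x                        # r_0 = b = (-3, 0)
d = r.copy()                         # d_0 = r_0

# first iteration
alpha = (r @ d) / (d @ Q @ d)        # 9 / 9 = 1
x = x + alpha * d                    # x_1 = (-3, 0)
r = r - alpha * (Q @ d)              # r_1 = (0, 6)
beta = (r @ (Q @ d)) / (d @ Q @ d)   # -36 / 9 = -4
d = -r + beta * d                    # d_1 = (12, -6)

# second iteration
alpha = (r @ d) / (d @ Q @ d)        # -36 / -108 = 1/3
x = x + alpha * d                    # x_2 = (1, -2), the solution
print(x)                             # [ 1. -2.]
```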

References

Luenberger: Introduction to Linear and Nonlinear Programming, Addison-Wesley, 1973

Jonathan Richard Shewchuk: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, August 1994. http://www-2.cs.cmu.edu/~jrs/jrspapers.html [A detailed derivation of the method from first principles].

Press, et al.: Numerical Recipes in C, Cambridge University Press, 1995 [Chapter 10.6 contains an implementation of the generalized conjugate gradient method of Polak and Ribiere].

Title: conjugate gradient algorithm (canonical name ConjugateGradientAlgorithm). Author: aplant. Last modified: 2013-03-22. Type: Algorithm. MSC: 15A06, 90C20. Synonym: method of conjugate gradients.