|
The conjugate gradient algorithm is used to solve the quadratic minimization problem: $$ \min\left( \frac{1}{2} x^T Q x - b^T x\right) $$ or equivalently to solve the linear system $Q x - b = 0$ , where $Q$ is a given $n$ by $n$ symmetric positive definite matrix and $b$ is a given $n$ vector.
The algorithm requires n iterations, starting from an arbitrary initial guess $x_0$ (often $x_0 = 0$ is used). We will use the following notation:
Algorithm
- Initialization. Let $x_0=0$ (or other starting point). Let $r_0 = d_0 = -Q x_0 + b$ . (The initial search direction is set to minus the gradient of the quadratic function being minimized, evaluated at the starting point).
- For $i = 0$ to $n-1$ compute $$ \alpha = \frac{r_i^T d_i}{d_i^T Q d_i} $$ $$ x_{i+1} = x_i + \alpha d_i $$ $$ r_{i+1} = r_i - \alpha Q d_i $$ If $\|r_i\| < \varepsilon$ or $i = n-1$ Stop, the solution has been found. Otherwise, continue: $$ \beta = \frac{r_{i+1}^T Q d_i}{d_i^T Q d_i} $$ $$ d_{i+1} = -r_{i+1}+\beta d_i $$ (In each iteration the solution estimate is set to the previous estimate plus a multiple of the previous search direction. The next search direction
is then set to the gradient plus a multiple of the previous search direction).
Discussion
The conjugate gradient method was developed in 1952 by Hestenes and Stiefel as an improvement to the steepest descent method. Whereas steepest descent approaches the solution asymptotically, the conjugate gradient method will find the solution in n iterations (assuming no roundoff error).
Why the name? The search directions $d_i$ are conjugate in the sense that $d_j^T Q d_i = 0$ for $j<i$ . In addition these directions are computed from (but are not equal to) the gradient.
The conjugate gradient method has been generalized to the case where the function being minimized is only approximately quadratic. In that case the explicit formula for $\alpha$ given above is replaced by a line-search procedure; this is a trial and error method in which various values of $\alpha$ are tried, and the value that leads to the smallest value of the objective is chosen. Well known generalized c.g. methods include the Fletcher-Reeves method and the Polak-Ribiere method.
Example
Solve
We have
then
and finally
which is the solution.
References
Luenberger: Introduction to Linear and Nonlinear Programming, Addison-Wesley, 1973
Jonathan Richard Shewchuk: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, August 1994. .html">http://www-2.cs.cmu.edu/ jrs/jrspapers.html [A detailed derivation of the method from first principles].
Press, et al.: Numerical Recipes in C, Cambridge University Press, 1995 [Chapter 10.6 contains an implementation of the generalized conjugate gradient method of Polak and Ribiere].
|