# normal equations

We consider the problem $Ax\approx b$, where $A$ is an $m\times n$ matrix with $m\geq n$ and $\operatorname{rank}(A)=n$, $b$ is an $m\times 1$ vector, and $x$ is the $n\times 1$ vector to be determined.

The sign $\approx$ stands for the least squares approximation, i.e., a minimization of the norm of the residual $r=Ax-b$,

 $||Ax-b||_{2}=||r||_{2}=\left[\sum_{i=1}^{m}r_{i}^{2}\right]^{1/2}$

or the square

 $F(x)=\frac{1}{2}||Ax-b||_{2}^{2}=\frac{1}{2}(Ax-b)^{T}(Ax-b)=\frac{1}{2}\left(x^{T}A^{T}Ax-2x^{T}A^{T}b+b^{T}b\right)$

i.e. a differentiable function of $x$. The necessary condition for a minimum is:

 $\nabla F(x)=0\quad\text{or}\quad\frac{\partial F}{\partial x_{i}}=0\quad(i=1,\ldots,n)$
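Differentiating the quadratic form for $F(x)$ term by term (using $\nabla(x^{T}Mx)=2Mx$ for symmetric $M$ and $\nabla(x^{T}c)=c$) gives the gradient explicitly:

 $\nabla F(x)=\frac{1}{2}\left(2A^{T}Ax-2A^{T}b\right)=A^{T}Ax-A^{T}b$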

These equations are called the normal equations, which in our case become:

 $A^{T}Ax=A^{T}b$

The solution $x=(A^{T}A)^{-1}A^{T}b$ is usually computed with the following algorithm: first, (the lower triangular portion of) the symmetric matrix $A^{T}A$ is formed, then its Cholesky decomposition $A^{T}A=LL^{T}$. Thereafter one solves $Ly=A^{T}b$ for $y$, and finally $x$ is computed from $L^{T}x=y$.
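A minimal NumPy sketch of this algorithm, assuming a full-rank $A$ (the function name is ours, not from the text):

```python
import numpy as np

def normal_equations_solve(A, b):
    """Solve the least squares problem Ax ~ b via the normal equations:
    form A^T A, take its Cholesky factor L, then solve the two
    triangular systems L y = A^T b and L^T x = y."""
    AtA = A.T @ A                      # symmetric n x n matrix
    Atb = A.T @ b
    L = np.linalg.cholesky(AtA)        # AtA = L L^T, L lower triangular
    y = np.linalg.solve(L, Atb)        # forward substitution: L y = A^T b
    x = np.linalg.solve(L.T, y)        # back substitution:    L^T x = y
    return x

# Small check against NumPy's QR-based least squares solver
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
x = normal_equations_solve(A, b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref))
```

In a production setting one would exploit the triangular structure of $L$ (e.g. `scipy.linalg.solve_triangular`) instead of the general-purpose `np.linalg.solve`.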

Unfortunately, $A^{T}A$ is often ill-conditioned and strongly influenced by roundoff errors (see [Golub89]). Methods that avoid forming $A^{T}A$ and solve $Ax\approx b$ directly include QR decomposition and singular value decomposition.
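The ill-conditioning is easy to demonstrate: since the singular values of $A^{T}A$ are the squares of those of $A$, the 2-norm condition number satisfies $\kappa_{2}(A^{T}A)=\kappa_{2}(A)^{2}$. A short sketch (the diagonal scaling is just one way to manufacture a matrix with a known condition number):

```python
import numpy as np

# Build A with orthonormal columns scaled to prescribed singular values
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((8, 4)))   # orthonormal columns
A = Q * np.array([1.0, 1e-1, 1e-2, 1e-3])          # singular values 1 .. 1e-3

print(np.linalg.cond(A))          # about 1e3
print(np.linalg.cond(A.T @ A))    # about 1e6 -- the square
```

With double precision ($\varepsilon\approx 10^{-16}$), a $\kappa_{2}(A)$ around $10^{8}$ already makes $A^{T}A$ numerically singular, while QR-based solvers such as `np.linalg.lstsq` still handle $A$ itself.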

References

• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)

Title: normal equations · Canonical name: NormalEquations · Date of creation: 2013-03-22 12:04:28 · Last modified: 2013-03-22 12:04:28 · Author: akrowne (2) · Entry type: Definition · Classification: msc 65-00, msc 15-00