normal equations
We consider the problem $Ax\approx b$, where $A$ is an $m\times n$ matrix with $m\ge n$ and $\operatorname{rank}(A)=n$, $b$ is an $m\times 1$ vector, and $x$ is the $n\times 1$ vector to be determined.
The sign $\approx$ stands for the least squares approximation, i.e. a minimization of the norm of the residual $r=Ax-b$,
$$\|Ax-b\|_{2}=\|r\|_{2}={\left[\sum _{i=1}^{m}{r}_{i}^{2}\right]}^{1/2}$$ 
or the square
$$F(x)=\frac{1}{2}\|Ax-b\|_{2}^{2}=\frac{1}{2}{(Ax-b)}^{T}(Ax-b)=\frac{1}{2}({x}^{T}{A}^{T}Ax-2{x}^{T}{A}^{T}b+{b}^{T}b)$$ 
i.e. a differentiable function of $x$. The necessary condition for a minimum is:
$$\nabla F(x)=0\quad \text{or}\quad \frac{\partial F}{\partial {x}_{i}}=0\quad (i=1,\mathrm{\dots},n)$$ 
These equations are called the normal equations. Since $\nabla F(x)={A}^{T}Ax-{A}^{T}b$, they become in our case:
$${A}^{T}Ax={A}^{T}b$$ 
The solution $x={({A}^{T}A)}^{-1}{A}^{T}b$ is usually computed with the following algorithm: first, (the lower triangular portion of) the symmetric matrix ${A}^{T}A$ is computed, then its Cholesky decomposition $L{L}^{T}$. Thereafter one solves $Ly={A}^{T}b$ for $y$, and finally $x$ is computed from ${L}^{T}x=y$.
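The algorithm above can be sketched in a few lines of NumPy; this is a minimal illustration, not a production implementation (function and variable names are our own):

```python
import numpy as np

def normal_equations_solve(A, b):
    """Solve min ||Ax - b||_2 via the normal equations A^T A x = A^T b.

    Assumes A is m x n with m >= n and rank(A) = n, so that A^T A is
    symmetric positive definite and admits a Cholesky factorization.
    """
    AtA = A.T @ A                  # n x n symmetric positive definite matrix
    Atb = A.T @ b
    L = np.linalg.cholesky(AtA)    # AtA = L L^T, L lower triangular
    y = np.linalg.solve(L, Atb)    # forward substitution: L y = A^T b
    x = np.linalg.solve(L.T, y)    # back substitution:    L^T x = y
    return x

# Overdetermined example: fit intercept and slope to three points
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = normal_equations_solve(A, b)
```

Here `np.linalg.solve` is used for both triangular systems for brevity; a dedicated triangular solver (e.g. `scipy.linalg.solve_triangular`) would exploit the structure of $L$.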
Unfortunately, ${A}^{T}A$ is often ill-conditioned and strongly influenced by round-off errors (see [Golub89]). Other methods, which do not compute ${A}^{T}A$ but solve $Ax\approx b$ directly, are QR decomposition and singular value decomposition.
References

•
Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Title: normal equations
Canonical name: NormalEquations
Date of creation: 2013-03-22 12:04:28
Last modified on: 2013-03-22 12:04:28
Owner: akrowne (2)
Last modified by: akrowne (2)
Numerical id: 7
Author: akrowne (2)
Entry type: Definition
Classification: msc 65-00
Classification: msc 15-00