# least squares

The general problem to be solved by the least squares method is this: given some direct measurements $y$ of random variables, and knowing a set of equations $f$ which have to be satisfied by these measurements (possibly involving unknown parameters $x$), find the set of $x$ which comes closest to satisfying

$f(x,y)=0$

where “closest” is defined by a $\Delta y$ such that

$f(x,y+\Delta y)=0\text{ and }\Delta y^{2}\text{ is minimized,}$

with

$\Delta y^{2}=\Delta y^{T}\Delta y=\|\Delta y\|_{2}^{2}=\sum_{i}\Delta y_{i}^{2}.$

The assumption has been made here that the elements of $y$ are statistically uncorrelated and have equal variance. For this case, the above solution results in the most efficient estimators for $x$ and $\Delta y$. If the elements of $y$ are correlated, their variances and correlations are described by a covariance matrix $C$, and the above minimum condition becomes

$\Delta y^{T}C^{-1}\Delta y\text{ is minimized.}$
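The weighted condition can be made concrete with a small numerical sketch. The Python code below evaluates $\Delta y^{T}C^{-1}\Delta y$ by solving $Cz=\Delta y$ rather than inverting $C$. The function names and the numbers used for $\Delta y$ and $C$ are illustrative assumptions, not part of the original entry. With $C$ equal to the identity, the quantity reduces to the plain sum of squares.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * z[k] for k in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def weighted_sum_of_squares(delta_y, covariance):
    """Return delta_y^T C^{-1} delta_y without forming C^{-1} explicitly."""
    z = solve_linear_system(covariance, delta_y)
    return sum(d * zi for d, zi in zip(delta_y, z))


# Hypothetical corrections and covariance matrices, chosen only for illustration.
delta_y = [1.0, -2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[2.0, 1.0], [1.0, 2.0]]

unweighted = weighted_sum_of_squares(delta_y, identity)    # plain sum of squares
weighted = weighted_sum_of_squares(delta_y, correlated)    # correlated case
print(unweighted, weighted)
```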

Least squares solutions can be more or less simple, depending on the constraint equations $f$. If there is exactly one equation for each measurement, and the functions $f$ are linear in the elements of $y$ and $x$, the solution is discussed under linear regression. For other linear models, see Linear Least Squares. Least squares methods applied to few parameters can lend themselves to very efficient algorithms (e.g. in real-time image processing), as they reduce to simple matrix operations.
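As a minimal sketch of the simplest linear case, consider fitting a straight line, so that each measurement satisfies $y_{i}+\Delta y_{i}=a+b\,t_{i}$ with unknown parameters $x=(a,b)$. The model, the abscissae $t_{i}$, the function name `fit_line`, and the data are assumptions made for illustration. Under the equal-variance, uncorrelated assumption the minimum is found from two sums and a 2-by-2 system, which is the kind of simple operation referred to above.

```python
def fit_line(t, y):
    """Least squares fit of y = a + b*t for uncorrelated, equal-variance y."""
    n = len(t)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_tt = sum(ti * ti for ti in t)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    # Determinant of the 2x2 normal-equation matrix.
    det = n * sum_tt - sum_t * sum_t
    if det == 0.0:
        raise ValueError("need at least two distinct t values")
    slope = (n * sum_ty - sum_t * sum_y) / det
    intercept = (sum_y - slope * sum_t) / n
    return intercept, slope


# Hypothetical measurements, chosen only for illustration.
t = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]

intercept, slope = fit_line(t, y)
# Corrections that move each measurement onto the fitted line.
delta_y = [intercept + slope * ti - yi for ti, yi in zip(t, y)]
print(intercept, slope, sum(d * d for d in delta_y))
```

At the minimum the corrections are orthogonal to both columns of the design matrix, so `sum(delta_y)` and the sum of `t[i] * delta_y[i]` both vanish up to rounding.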

If the constraint equations are non-linear, one typically solves iteratively by linearization: each step starts from approximate values of $x$ and $\Delta y$, and the equations are linearized by forming the matrix of derivatives $df/dx$ (the Jacobian matrix), and possibly also $df/dy$, at the latest point of approximation.

Note that as the iterative improvements $\delta x,\delta y$ tend towards zero (if the process converges), $\Delta y$ converges towards a final value which enters the minimum equation above.
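The iteration can be sketched for a case with one equation per measurement. The example below assumes the model $y_{i}+\Delta y_{i}=a\,e^{b t_{i}}$ with $x=(a,b)$ and uses a plain Gauss-Newton step, which is one common way to carry out the linearization; the model, data, starting values, and function names are all illustrative assumptions. No step-size control is included, so convergence depends on a reasonable starting point.

```python
import math


def model(params, t):
    """Assumed model g(x, t) = a * exp(b * t)."""
    a, b = params
    return a * math.exp(b * t)


def jacobian_row(params, t):
    """Derivatives of the model with respect to a and b at one point t."""
    a, b = params
    e = math.exp(b * t)
    return (e, a * t * e)


def gauss_newton(t, y, start, max_iter=50, tol=1e-12):
    """Iterate linearized least squares steps until the update is negligible."""
    a, b = start
    for _ in range(max_iter):
        residuals = [yi - model((a, b), ti) for ti, yi in zip(t, y)]
        rows = [jacobian_row((a, b), ti) for ti in t]
        # Normal equations (J^T J) step = J^T r for the two parameters.
        s11 = sum(r[0] * r[0] for r in rows)
        s12 = sum(r[0] * r[1] for r in rows)
        s22 = sum(r[1] * r[1] for r in rows)
        g1 = sum(r[0] * res for r, res in zip(rows, residuals))
        g2 = sum(r[1] * res for r, res in zip(rows, residuals))
        det = s11 * s22 - s12 * s12
        if det == 0.0:
            raise ValueError("linearized system is singular")
        da = (s22 * g1 - s12 * g2) / det
        db = (s11 * g2 - s12 * g1) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    # Final corrections: y + delta_y satisfies the model at the fitted x.
    delta_y = [model((a, b), ti) - yi for ti, yi in zip(t, y)]
    return (a, b), delta_y


# Hypothetical measurements scattered around 2 * exp(0.5 * t).
t_data = [0.0, 1.0, 2.0, 3.0]
y_data = [2.05, 3.25, 5.50, 8.90]

params, delta_y = gauss_newton(t_data, y_data, start=(1.5, 0.4))
print(params, sum(d * d for d in delta_y))
```

When the loop stops, the updates `da` and `db` are negligible and `delta_y` holds the final corrections whose sum of squares is the minimized quantity.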

Algorithms avoiding the explicit calculation of $df/dx$ and $df/dy$ have also been investigated, e.g. [1]; for a discussion, see [2]. Where convergence (or control over convergence) is problematic, use of a general package for minimization may be indicated.
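One simple way to avoid coding analytic derivatives is to approximate the Jacobian by finite differences. The sketch below is not the Dud algorithm of [1]; it is a generic central-difference approximation, shown only to illustrate the idea, and the function name, step size, and test model are assumptions.

```python
import math


def finite_difference_jacobian(func, x, step=1e-6):
    """Approximate d func_i / d x_j by central differences."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        # Scale the step with the size of the parameter.
        h = step * max(1.0, abs(x[j]))
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        f_plus = func(forward)
        f_minus = func(backward)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac


# Hypothetical model a * exp(b * t) evaluated at three points.
t_values = [0.0, 1.0, 2.0]


def model_values(params):
    a, b = params
    return [a * math.exp(b * t) for t in t_values]


approx = finite_difference_jacobian(model_values, [2.0, 0.5])
# Analytic derivatives of the same model, for comparison.
exact = [[math.exp(0.5 * t), 2.0 * t * math.exp(0.5 * t)] for t in t_values]
print(approx)
print(exact)
```

Each approximate Jacobian costs two model evaluations per parameter, which is the price paid for not supplying derivatives.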

# References

- [1] M.L. Ralston and R.I. Jennrich, Dud, a Derivative-free Algorithm for Non-linear Least Squares, Technometrics 20-1 (1978) 7.
- [2] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, Second edition, Cambridge University Press, 1995.

Note: This entry is based on content from The Data Analysis Briefbook.

## Mathematics Subject Classification

15-00
