|
The study of best approximations in inner product spaces has a very elegant treatment with profound consequences. Much of the theory of Hilbert spaces depends on it, and several approximation problems are better understood using these techniques and results.
For example, least squares fitting, linear regression and the approximation of functions by polynomials, among many other problems, can be seen as particular cases of the general study of best approximation in inner product spaces.
Some of these problems are discussed later in this entry.
Our fundamental result on the existence and uniqueness of best approximations is the following (we postpone its proof to an attached entry):
Theorem - Let $X$ be an inner product space and $A \subseteq X$ a complete, convex and non-empty subset. Then for every $x \in X$ there exists a unique best approximation of $x$ in $A$, i.e. there exists a unique element $a_0 \in A$ such that
$$\|x - a_0\| = \inf_{a \in A} \|x - a\| .$$
The following result gives a very geometric interpretation of the best approximation when $A$ is a subspace of $X$ . We also postpone its proof to an attached entry.
Theorem - Let $X$ be an inner product space, $A \subseteq X$ a subspace and $x \in X$ . The following statements are equivalent:
- $a_0 \in A$ is the best approximation of $x$ in $A$ .
- $a_0 \in A$ and $x-a_0 \perp A$ .
Thus, the best approximation of $x$ in a subspace $A$ is just the orthogonal projection of $x$ onto $A$.
When $A$ is a complete subspace of $X$, the best approximation can be "calculated" explicitly. Recall that, in this case, $A$ is itself a Hilbert space (since it is complete) and therefore it has an orthonormal basis.
Again, we postpone the proof of the next result to an attached entry.
Theorem - Let $X$ be an inner product space and $A \subseteq X$ a complete subspace. Let $(e_i)_{i \in J}$ be an orthonormal basis for $A$. Then for every $x \in X$ the best approximation $a_0 \in A$ of $x$ in $A$ is given by
$$a_0 = \sum_{i \in J} \langle x, e_i \rangle \, e_i .$$
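As a small illustration of the orthonormal expansion $a_0 = \sum_i \langle x, e_i \rangle e_i$, here is a minimal sketch in $\mathbb{R}^3$ with the standard dot product, assuming NumPy is available. The function name `project_orthonormal` and the sample vectors are hypothetical, chosen only for this example.

```python
import numpy as np

def project_orthonormal(x, basis):
    """Best approximation of x in span(basis).

    Assumption: `basis` is a list of orthonormal vectors for the
    standard dot product; this is not checked here.
    """
    # a_0 = sum_i <x, e_i> e_i
    return sum(np.dot(x, e) * e for e in basis)

# Hypothetical example: A is the xy-plane in R^3.
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
x = np.array([2.0, -1.0, 5.0])

a0 = project_orthonormal(x, [e1, e2])
residual = x - a0  # should be orthogonal to both e1 and e2
```

In this example the residual `x - a0` has zero dot product with `e1` and `e2`, in line with the characterization $x - a_0 \perp A$.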
One can also write the best approximation in terms of any other basis (not necessarily an orthonormal one). For simplicity we present here how that can be done when $A$ is a finite dimensional subspace of $X$ .
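Theorem - Let $X$ be an inner product space and $A \subseteq X$ a finite dimensional subspace. Let $v_1, \dots, v_n$ be a basis for $A$. Then for every $x \in X$ the best approximation $a_0 \in A$ of $x$ in $A$ is given by
$$a_0 = \sum_{i=1}^n a_0^i \, v_i ,$$
where the coefficients $a_0^i$ are the solutions of the system of equations
$$\begin{bmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_n, v_1 \rangle \\ \vdots & \ddots & \vdots \\ \langle v_1, v_n \rangle & \cdots & \langle v_n, v_n \rangle \end{bmatrix} \begin{bmatrix} a_0^1 \\ \vdots \\ a_0^n \end{bmatrix} = \begin{bmatrix} \langle x, v_1 \rangle \\ \vdots \\ \langle x, v_n \rangle \end{bmatrix} .$$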
$Remark - $ The above matrix is a symmetric positive definite matrix, which implies that the system has a unique solution as expected.
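The procedure for a general basis can be sketched numerically as follows, assuming $X = \mathbb{R}^3$ with the standard dot product and that NumPy is available. The function name `best_approximation` and the sample vectors are hypothetical. The code forms the matrix with entries $\langle v_j, v_i \rangle$ and the right-hand side with entries $\langle x, v_i \rangle$, then solves for the coefficients.

```python
import numpy as np

def best_approximation(x, basis):
    """Coefficients and best approximation of x in span(basis).

    Assumption: `basis` is a list of linearly independent real vectors
    and the inner product is the standard dot product.
    """
    V = np.column_stack(basis)           # columns are v_1, ..., v_n
    gram = V.T @ V                       # gram[i, j] = <v_j, v_i>
    rhs = V.T @ x                        # rhs[i] = <x, v_i>
    coeffs = np.linalg.solve(gram, rhs)  # unique solution: gram is positive definite
    return coeffs, V @ coeffs

# Hypothetical example: a non-orthogonal basis of a plane in R^3.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
x = np.array([1.0, 2.0, 3.0])

coeffs, a0 = best_approximation(x, [v1, v2])
```

For these vectors the system reads $2a + b = 3$, $a + 2b = 4$, so the coefficients are $2/3$ and $5/3$, and one can check that $x - a_0$ is orthogonal to both $v_1$ and $v_2$.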
There are several applications of the above results. We explore two of them in the following.
Suppose we want to find a polynomial of degree $\leq n$ that approximates in the best possible way a given function $f$ . We are in fact trying to find a point in the subspace of polynomials of degree $\leq n$ that is closest to $f$ , i.e. we are trying to find the best approximation of $f$ in that subspace.
For example, let $f \in L^2([0,1])$ . Consider the basis $v_k(t)= t^k ,\quad 0\leq k \leq n, \;$ of the subspace of polynomials of degree $\leq n$ .
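The best approximation of $f$ by these polynomials is the function $a_0(t) = a_0^0 + a_0^1 t + \dots + a_0^n t^n$, where the coefficients $a_0^0, \dots, a_0^n$ are the solutions of the system
$$\sum_{j=0}^n \left( \int_0^1 t^{i+j} \, dt \right) a_0^j = \int_0^1 f(t) \, t^i \, dt , \qquad 0 \leq i \leq n ,$$
that is, since $\int_0^1 t^{i+j} \, dt = \frac{1}{i+j+1}$,
$$\begin{bmatrix} 1 & \frac{1}{2} & \cdots & \frac{1}{n+1} \\ \frac{1}{2} & \frac{1}{3} & \cdots & \frac{1}{n+2} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{n+1} & \frac{1}{n+2} & \cdots & \frac{1}{2n+1} \end{bmatrix} \begin{bmatrix} a_0^0 \\ a_0^1 \\ \vdots \\ a_0^n \end{bmatrix} = \begin{bmatrix} \int_0^1 f(t) \, dt \\ \int_0^1 f(t) \, t \, dt \\ \vdots \\ \int_0^1 f(t) \, t^n \, dt \end{bmatrix} .$$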
$Remark -$ Instead of polynomials we could approximate $f$ by any other type of functions using the same procedure.
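A minimal sketch of the polynomial case, assuming a real-valued $f$ on $[0,1]$ and NumPy. The matrix of inner products $\langle t^j, t^i \rangle = \int_0^1 t^{i+j}\,dt = 1/(i+j+1)$ is written in closed form, while the integrals $\int_0^1 f(t)\,t^i\,dt$ are approximated by Gauss-Legendre quadrature, which is a choice made for this example and not part of the theory above. The function name `l2_poly_fit` and the parameter `quad_order` are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def l2_poly_fit(f, n, quad_order=50):
    """Coefficients (constant term first) of the best L^2([0,1])
    approximation of f by polynomials of degree <= n.

    Assumption: f is real-valued, accepts NumPy arrays, and is smooth
    enough for the quadrature to be accurate.
    """
    # Gauss-Legendre nodes and weights on [-1, 1], mapped to [0, 1].
    nodes, weights = leggauss(quad_order)
    t = 0.5 * (nodes + 1.0)
    w = 0.5 * weights

    idx = np.arange(n + 1)
    # gram[i, j] = <t^j, t^i> = 1 / (i + j + 1)
    gram = 1.0 / (idx[:, None] + idx[None, :] + 1.0)
    # rhs[i] ~ integral over [0, 1] of f(t) * t^i
    rhs = np.array([np.sum(w * f(t) * t**i) for i in idx])
    return np.linalg.solve(gram, rhs)

# Hypothetical example: best affine approximation of f(t) = t^2.
coeffs = l2_poly_fit(lambda t: t**2, n=1)
```

For $f(t) = t^2$ and $n = 1$ the system can be solved by hand and gives $a_0(t) = t - \tfrac{1}{6}$. Note that the matrix here is the Hilbert matrix, which is badly conditioned for large $n$, so in practice an orthogonal polynomial basis is usually preferred over the monomials.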
Suppose we want to find the line that best fits some given points $(t_1, y_1), \dots, (t_n, y_n)$ , i.e. the affine function $a_0(t) = \alpha t + \beta$ that minimizes $\displaystyle \sum_{k = 1}^n |a_0(t_k) - y_k|^2$ .
We are then led to consider the inner product
$$\langle h, g \rangle = \sum_{k=1}^n h(t_k) \, g(t_k)$$
in the space of functions $h:\{t_1, \dots, t_n\} \longrightarrow \mathbb{R}$.
With this setting we are then looking for the best approximation of the function $f$ defined by $f(t_k)=y_k$ in the subspace of affine functions.
A basis for the subspace of affine functions is given by the functions $v_1(t) = 1$ and $v_2(t) = t$.
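The best approximation of $f$ in this subspace is the function $a_0(t) = \beta + \alpha t$, where the coefficients $\beta, \alpha$ are the solutions of the system
$$\begin{bmatrix} n & \displaystyle \sum_{k=1}^n t_k \\ \displaystyle \sum_{k=1}^n t_k & \displaystyle \sum_{k=1}^n t_k^2 \end{bmatrix} \begin{bmatrix} \beta \\ \alpha \end{bmatrix} = \begin{bmatrix} \displaystyle \sum_{k=1}^n y_k \\ \displaystyle \sum_{k=1}^n t_k y_k \end{bmatrix} .$$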
Thus, the function $a_0(t) = \beta + \alpha t$ obtained by the above procedure provides the line that best fits the data $(t_1, y_1), \dots, (t_n, y_n)$ .
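The line fit can be sketched in a few lines, assuming NumPy and data with at least two distinct values of $t_k$. With the basis $v_1(t) = 1$, $v_2(t) = t$ and the inner product given by summing products of values over the points $t_k$, the $2 \times 2$ system has entries $n$, $\sum t_k$, $\sum t_k^2$ on the left and $\sum y_k$, $\sum t_k y_k$ on the right. The function name `fit_line` and the sample data are hypothetical.

```python
import numpy as np

def fit_line(t, y):
    """Return (alpha, beta) for the least squares line a_0(t) = alpha*t + beta.

    Assumption: t contains at least two distinct values, so the
    2x2 system has a unique solution.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    # Inner products of the basis functions v_1(t) = 1 and v_2(t) = t.
    gram = np.array([[n,       t.sum()],
                     [t.sum(), (t**2).sum()]])
    # Inner products of the data function with v_1 and v_2.
    rhs = np.array([y.sum(), (t * y).sum()])
    beta, alpha = np.linalg.solve(gram, rhs)
    return alpha, beta

# Hypothetical data lying exactly on the line y = 2t + 1.
alpha, beta = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

Since the sample points lie exactly on $y = 2t + 1$, the fit recovers $\alpha = 2$ and $\beta = 1$. The same coefficients could be obtained from a library routine such as `np.polyfit(t, y, 1)`.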
|