tests for local extrema in Lagrange multiplier method
(Result)
Let $U$ be open in $\real^n$, and let $f\colon U \to \real$ and $g\colon U \to \real^m$ be twice continuously differentiable functions. Assume that $p \in U$ is a stationary point for $f$ on $M = g^{-1}(\{ 0 \})$, and that $\D g$ has full rank everywhere${}^{1}$ on $M$. Then we know that $p$ is a solution to the Lagrange multiplier system
\begin{equation}\label{lagrange} \D f(p) = \lambda \cdot \D g(p)\,, \end{equation}
for a Lagrange multiplier vector $\lambda \in \real^m$.
Our aim is to develop an analogue of the second derivative test for the stationary point $p$.
The most straightforward way to proceed is to consider a coordinate chart $\alpha\colon V \to M$ for the manifold $M$, with $V$ open in $\real^{n-m}$ and $\alpha(0) = p$, and consider the Hessian of the function $f \circ \alpha\colon V \to \real$ at $0 = \alpha^{-1}(p)$. This Hessian is in fact just the Hessian form of $f \colon M \to \real$ expressed in the coordinates of the chart $\alpha$. But the whole point of using Lagrange multipliers is to avoid calculating coordinate charts directly, so we find an equivalent expression for $\D^2 (f\circ \alpha)(0)$ in terms of $\D^2 f(p)$ and $\D^2 g(p)$, without mentioning derivatives of $\alpha$.
To do this, we differentiate $f \circ \alpha$ twice using the chain rule and product rule${}^{2}$. To reduce clutter, from now on we use the prime notation for derivatives rather than $\D$.
\begin{equation}\label{second-derivative}
\begin{split}
(f \circ \alpha)''(0) &= ((f' \circ \alpha) \cdot \alpha')'(0) \\
&= \bigl ( (f'' \circ \alpha) \cdot \alpha' \cdot \alpha' + (f' \circ \alpha) \cdot \alpha'' \bigr) (0) \\
&= f''(\alpha(0)) \cdot \alpha'(0) \cdot \alpha'(0) + f'(\alpha(0)) \cdot \alpha''(0) \\
&= f''(p) \cdot \alpha'(0) \cdot \alpha'(0) + f'(p) \cdot \alpha''(0)\,.
\end{split}
\end{equation}
If we interpret $(f \circ \alpha)''(0)$ as a bilinear mapping of vectors $u,v \in \real^{n-m}$, then formula (\ref{second-derivative}) really means
\begin{equation}\label{bilinear}
(f \circ \alpha)''(0) \cdot (u, v) = f''(p) \cdot \bigl(\alpha'(0)\cdot u, \alpha'(0) \cdot v\bigr) + f'(p) \cdot \bigl(\alpha''(0) \cdot (u, v)\bigr)\,.
\end{equation}
To obtain the quadratic form, we set $v = u$; also we abbreviate the vector $\alpha'(0) \cdot u$ by $h$, which belongs to the tangent space $\mathrm{T}_p M$ of $M$ at $p$. So,
\begin{equation}\label{quadratic}
(f \circ \alpha)''(0) \cdot u^2 = f''(p) \cdot h^2 + f'(p) \cdot \bigl(\alpha''(0) \cdot u^2 \bigr)\,.
\end{equation}
Naïvely, we might think that $(f \circ \alpha)''(0)$ is simply $f''(p)$ restricted to the tangent space $\mathrm{T}_p M$. This happens to be the first term in (\ref{quadratic}), but there is also an additional contribution from the second term involving $\alpha''(0)$; intuitively, $\alpha''(0)$ reflects the curvature of the surface (manifold) $M$, ``changing the geometry'' of the graph of $f$.
But the second term of (\ref{quadratic}) still involves $\alpha$. To eliminate it, we differentiate the equation $g \circ \alpha = 0$ twice.
\begin{equation}\label{g-second-derivative}
0 = (g \circ \alpha)''(0) = g''(p) \cdot \alpha'(0) \cdot \alpha'(0) + g'(p) \cdot \alpha''(0)\,.
\end{equation}
(It is derived the same way as (\ref{second-derivative}) but with $f$ replaced by $g$.) Now we can substitute (\ref{lagrange}) and (\ref{g-second-derivative}) in (\ref{second-derivative}) to eliminate the term $f'(p) \cdot \alpha''(0)$:
\begin{equation}\label{ff-second-derivative}
\begin{split}
(f \circ \alpha)''(0) &= f''(p) \cdot \alpha'(0) \cdot \alpha'(0) + \lambda \cdot g'(p) \cdot \alpha''(0) \\
&= f''(p) \cdot \alpha'(0) \cdot \alpha'(0) - \lambda \cdot g''(p) \cdot \alpha'(0) \cdot \alpha'(0)\,,
\end{split}
\end{equation}
or expressed as a quadratic form,
\begin{equation}\label{ff-quadratic}
(f \circ \alpha)''(0) \cdot u^2 = f''(p) \cdot h^2 - \lambda \cdot g''(p) \cdot h^2\,.
\end{equation}
Thus, to understand the nature of the stationary point $p$, we can study the modified Hessian:
\begin{equation}\label{lagrange-hessian}
f''(p) - \lambda \cdot g''(p)\,, \quad \text{ restricted to $\mathrm{T}_p M$. }
\end{equation}
For example, if this bilinear form is positive definite, then $p$ is a local minimum of $f$ on $M$; if it is negative definite, then $p$ is a local maximum; and if it is indefinite, then $p$ is neither. All the tests that apply to the usual Hessian in $\real^n$ apply to the modified Hessian (\ref{lagrange-hessian}).
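As an illustration (a worked example that is not part of the derivation above; the particular $f$, $g$ and chart are chosen here only for this purpose), take $n = 2$, $m = 1$, $f(x,y) = x + y$ and $g(x,y) = x^2 + y^2 - 2$, so that $M$ is the circle of radius $\sqrt{2}$. At $p = (1,1)$ we have $f'(p) = (1,1)$ and $g'(p) = (2,2)$, so the Lagrange multiplier is $\lambda = 1/2$. Since $f'' = 0$ and $g'' = 2I$, the modified Hessian (\ref{lagrange-hessian}) is
\[
f''(p) - \lambda \cdot g''(p) = 0 - \tfrac{1}{2} \cdot 2I = -I\,,
\]
which is negative definite on $\mathrm{T}_p M$, so $p$ is a local maximum of $f$ on $M$. Note that $f''(p)$ restricted to $\mathrm{T}_p M$ vanishes identically, so the whole answer comes from the curvature term. As a check against an explicit chart, let $\alpha(t) = \sqrt{2}\,\bigl(\cos(t + \pi/4), \sin(t + \pi/4)\bigr)$, so that $\alpha(0) = p$ and $\alpha'(0) = (-1, 1)$. Then $(f \circ \alpha)(t) = 2 \cos t$, hence
\[
(f \circ \alpha)''(0) \cdot u^2 = -2u^2 = -I \cdot h^2\,, \qquad h = \alpha'(0) \cdot u = (-u, u)\,,
\]
in agreement with (\ref{ff-quadratic}).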
In coordinates of $\real^n$, the modified Hessian (\ref{lagrange-hessian}) takes the form
\begin{equation}\label{lagrange-hessian-coord}
\sum_{i=1}^n \sum_{j=1}^n \left( \left.\frac{\partial^2 f}{\partial x^i \partial x^j}\right|_p - \sum_{k=1}^m \lambda_k \, \left.\frac{\partial^2 g^k}{\partial x^i \partial x^j}\right|_p \right) h^i h^j\,.
\end{equation}
We emphasize that the vector $h$ need only range over the tangent space $\mathrm{T}_p M$ when studying the stationary point $p$ of $f$ restricted to $M$.
In matrix form, (\ref{lagrange-hessian-coord}) can be written using the matrix
\begin{equation}\label{lagrange-hessian-matrix}
B= \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^i \partial x^j} - \sum\limits_{k=1}^m \lambda_k \, \dfrac{\partial^2 g^k}{\partial x^i \partial x^j} \end{bmatrix}_{ij}\,,
\end{equation}
with all partial derivatives evaluated at $p$. But again, the test vector $h$ need only lie in $\mathrm{T}_p M$, so if we want to apply positive/negative definiteness tests for matrices, they should instead be applied to the projected or reduced Hessian:
\begin{equation} Z^\mathrm{t} B Z \end{equation}
where the columns of the $n \times (n-m)$ matrix $Z$ form a basis for $\mathrm{T}_p M = \ker g'(p) \subset \real^n$.
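The reduced Hessian can also be computed numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `tangent_basis` and `reduced_hessian` and the test problem ($f(x,y,z) = xyz$ on the plane $x + y + z = 3$ at $p = (1,1,1)$, with $\lambda = 1$) are illustrative choices, not taken from the text. The basis $Z$ of $\ker g'(p)$ is obtained here from a singular value decomposition, which is one convenient choice among several.

```python
import numpy as np


def tangent_basis(Dg, tol=1e-12):
    """Return Z whose columns are an orthonormal basis of ker Dg (via SVD)."""
    Dg = np.atleast_2d(np.asarray(Dg, dtype=float))
    _, s, Vt = np.linalg.svd(Dg)        # full SVD: Vt is n x n
    rank = int(np.sum(s > tol))         # equals m when Dg has full rank
    return Vt[rank:].T                  # shape n x (n - m)


def reduced_hessian(f_hess, g_hess, lam, Dg):
    """Return Z^t B Z with B = f''(p) - sum_k lam[k] * g_k''(p)."""
    B = np.asarray(f_hess, dtype=float) - sum(
        l * np.asarray(G, dtype=float) for l, G in zip(lam, g_hess)
    )
    Z = tangent_basis(Dg)
    return Z.T @ B @ Z


# Illustrative problem (an assumption, not from the text):
# f(x, y, z) = x*y*z on the plane g(x, y, z) = x + y + z - 3 = 0, at p = (1, 1, 1).
f_hess = np.array([[0.0, 1.0, 1.0],
                   [1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0]])    # f''(p)
g_hess = [np.zeros((3, 3))]             # g''(p) = 0 because g is affine
Dg = np.array([[1.0, 1.0, 1.0]])        # g'(p)
lam = [1.0]                             # f'(p) = (1, 1, 1) = lam * g'(p)

R = reduced_hessian(f_hess, g_hess, lam, Dg)
full_eigs = np.linalg.eigvalsh(f_hess)  # here B coincides with f''(p)
reduced_eigs = np.linalg.eigvalsh(R)
print(full_eigs, reduced_eigs)
```

In this example $B$ itself is indefinite on $\real^3$ (its eigenvalues are $-1, -1, 2$), whereas $Z^\mathrm{t} B Z$ has both eigenvalues equal to $-1$, so the reduced Hessian is negative definite and $p$ is a local maximum of $f$ on the plane. This shows why the definiteness test must be applied to $Z^\mathrm{t} B Z$ rather than to $B$.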
Footnotes
- 1. Actually, only $\D g(p)$ needs to have full rank, and the arguments presented here continue to hold in that case, although $M$ would not necessarily be a manifold then.
- 2. Note that the ``product'' operation involved (second equality of (\ref{second-derivative})) is the operation of composition of two linear mappings. Think hard about this if you are not sure; it took me several tries to get this formula right, since multi-variable iterated derivatives have a complicated structure.
"tests for local extrema in Lagrange multiplier method" is owned by stevecheng.
This is version 3 of tests for local extrema in Lagrange multiplier method, born on 2005-08-20, modified 2005-08-20.
Object id is 7336, canonical name is TestsForLocalExtremaForLagrangeMultiplierMethod.
Classification (AMS MSC):
- 49K35 (Calculus of variations and optimal control; optimization :: Necessary conditions and sufficient conditions for optimality :: Minimax problems)
- 49-00 (Calculus of variations and optimal control; optimization :: General reference works)
- 26B12 (Real functions :: Functions of several variables :: Calculus of vector functions)