# tests for local extrema in Lagrange multiplier method

Let $U$ be open in ${\mathbb{R}}^{n}$,
and $f:U\to \mathbb{R}$, $g:U\to {\mathbb{R}}^{m}$ be twice continuously differentiable functions.
Assume that $p\in U$ is a stationary point
for $f$ on $M={g}^{-1}(\{0\})$,
and that $\mathrm{D}g$ has full rank everywhere on $M$. (Actually, only $\mathrm{D}g(p)$ needs to have full rank, and the arguments presented here continue to hold in that case, although $M$ would not necessarily be a manifold then.)
Then we know that $p$ is a solution
to the Lagrange multiplier system

$$\mathrm{D}f(p)=\lambda \cdot \mathrm{D}g(p), \tag{1}$$

for a Lagrange multiplier vector $\lambda =({\lambda}_{1},\mathrm{\dots},{\lambda}_{m})$.
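As a numerical illustration of system (1), the following is a minimal sketch that recovers $\lambda$ from $\mathrm{D}f(p)$ and $\mathrm{D}g(p)$ by least squares using NumPy. The helper name `lagrange_multipliers` and the example $f(x,y)=xy$, $g(x,y)={x}^{2}+{y}^{2}-1$ are assumptions chosen for illustration; they do not come from the entry itself.

```python
import numpy as np

def lagrange_multipliers(grad_f, jac_g):
    """Solve Df(p) = lambda . Dg(p) for lambda in the least-squares sense.

    grad_f: shape (n,), the gradient of f at p.
    jac_g:  shape (m, n), the Jacobian of g at p (assumed to have full rank m).
    Returns (lambda, residual_norm); the residual is near zero only when
    p really is a stationary point of f on M.
    """
    grad_f = np.asarray(grad_f, dtype=float)
    jac_g = np.asarray(jac_g, dtype=float).reshape(-1, grad_f.size)
    # Transposing (1) gives the overdetermined system Dg(p)^T lambda^T = Df(p)^T.
    lam, *_ = np.linalg.lstsq(jac_g.T, grad_f, rcond=None)
    residual = np.linalg.norm(jac_g.T @ lam - grad_f)
    return lam, residual

# Hypothetical example: f(x, y) = x*y, g(x, y) = x^2 + y^2 - 1.
p = np.array([1.0, 1.0]) / np.sqrt(2.0)
grad_f = np.array([p[1], p[0]])            # Df(p) = (y, x)
jac_g = np.array([[2 * p[0], 2 * p[1]]])   # Dg(p) = (2x, 2y)
lam, residual = lagrange_multipliers(grad_f, jac_g)
print(lam, residual)
```

For this example the multiplier comes out as $\lambda =1/2$ with a residual at rounding-error level, which confirms that $p$ is stationary.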

Our aim is to develop an analogue of the second derivative test for the stationary point $p$.

The most straightforward way to proceed is to consider a coordinate chart $\alpha :V\to M$
for the manifold $M$, with $V$ open in ${\mathbb{R}}^{n-m}$, and consider the Hessian of the function $f\circ \alpha :V\to \mathbb{R}$
at $0={\alpha}^{-1}(p)$. This Hessian is in fact just the Hessian form of $f:M\to \mathbb{R}$
expressed in the coordinates of the chart $\alpha $. But the whole point of using Lagrange multipliers
is to avoid calculating coordinate charts directly, so we find an equivalent expression
for ${\mathrm{D}}^{2}(f\circ \alpha )(0)$ in terms of ${\mathrm{D}}^{2}f(p)$ without mentioning derivatives of $\alpha $.

To do this, we differentiate $f\circ \alpha $ twice using the chain rule and product rule.
(Note that the “product” operation involved in the second equality of (2)
is the operation of *composition of two linear mappings*.
Think hard about this if you are not sure; it took me several tries to get this formula right,
since multi-variable iterated derivatives have a complicated structure.)
To reduce clutter, from now on we use the prime notation for derivatives rather than $\mathrm{D}$.

$$\begin{aligned}
{(f\circ \alpha )}^{\prime \prime}(0)& ={\left(({f}^{\prime}\circ \alpha )\cdot {\alpha}^{\prime}\right)}^{\prime}(0)\\
& =\left(({f}^{\prime \prime}\circ \alpha )\cdot {\alpha}^{\prime}\cdot {\alpha}^{\prime}+({f}^{\prime}\circ \alpha )\cdot {\alpha}^{\prime \prime}\right)(0)\\
& ={f}^{\prime \prime}(\alpha (0))\cdot {\alpha}^{\prime}(0)\cdot {\alpha}^{\prime}(0)+{f}^{\prime}(\alpha (0))\cdot {\alpha}^{\prime \prime}(0)\\
& ={f}^{\prime \prime}(p)\cdot {\alpha}^{\prime}(0)\cdot {\alpha}^{\prime}(0)+{f}^{\prime}(p)\cdot {\alpha}^{\prime \prime}(0).
\end{aligned} \tag{2}$$

If we interpret ${(f\circ \alpha )}^{\prime \prime}(0)$ as a bilinear mapping of vectors $u,v\in {\mathbb{R}}^{n-m}$, then formula (2) really means

$${(f\circ \alpha )}^{\prime \prime}(0)\cdot (u,v)={f}^{\prime \prime}(p)\cdot ({\alpha}^{\prime}(0)\cdot u,{\alpha}^{\prime}(0)\cdot v)+{f}^{\prime}(p)\cdot \left({\alpha}^{\prime \prime}(0)\cdot (u,v)\right). \tag{3}$$

To obtain the quadratic form, we set $v=u$; also we abbreviate the vector ${\alpha}^{\prime}(0)\cdot u$ by $h$,
which belongs to the tangent space ${\mathrm{T}}_{p}M$ of $M$ at $p$. So,

$${(f\circ \alpha )}^{\prime \prime}(0)\cdot {u}^{2}={f}^{\prime \prime}(p)\cdot {h}^{2}+{f}^{\prime}(p)\cdot \left({\alpha}^{\prime \prime}(0)\cdot {u}^{2}\right). \tag{4}$$

Naïvely, we might think that ${(f\circ \alpha )}^{\prime \prime}(0)$ is simply ${f}^{\prime \prime}(p)$ restricted to the tangent space ${\mathrm{T}}_{p}M$. This happens to be the first term in (4), but there is also an additional contribution by the second term involving ${\alpha}^{\prime \prime}(0)$; intuitively, ${\alpha}^{\prime \prime}(0)$ is the curvature of the surface (manifold) $M$, “changing the geometry” of the graph of $f$.

But the second term of (4) still involves $\alpha $. To eliminate it, we differentiate the equation $g\circ \alpha =0$ twice.

$$0={(g\circ \alpha )}^{\prime \prime}(0)={g}^{\prime \prime}(p)\cdot {\alpha}^{\prime}(0)\cdot {\alpha}^{\prime}(0)+{g}^{\prime}(p)\cdot {\alpha}^{\prime \prime}(0). \tag{5}$$

(It is derived the same way as (2) but with $f$ replaced by $g$.) Now we can substitute (5) and (1) in (2) to eliminate the term ${f}^{\prime}(p)\cdot {\alpha}^{\prime \prime}(0)$:

$$\begin{aligned}
{(f\circ \alpha )}^{\prime \prime}(0)& ={f}^{\prime \prime}(p)\cdot {\alpha}^{\prime}(0)\cdot {\alpha}^{\prime}(0)+\lambda \cdot {g}^{\prime}(p)\cdot {\alpha}^{\prime \prime}(0)\\
& ={f}^{\prime \prime}(p)\cdot {\alpha}^{\prime}(0)\cdot {\alpha}^{\prime}(0)-\lambda \cdot {g}^{\prime \prime}(p)\cdot {\alpha}^{\prime}(0)\cdot {\alpha}^{\prime}(0),
\end{aligned} \tag{6}$$

or expressed as a quadratic form,

$${(f\circ \alpha )}^{\prime \prime}(0)\cdot {u}^{2}={f}^{\prime \prime}(p)\cdot {h}^{2}-\lambda \cdot {g}^{\prime \prime}(p)\cdot {h}^{2}. \tag{7}$$
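Formula (7) can be sanity-checked numerically in a case where a chart is available in closed form. The sketch below is only an illustration under assumed data: $f(x,y)=xy$ on the unit circle $g(x,y)={x}^{2}+{y}^{2}-1=0$, the point $p=(\mathrm{cos}\frac{\pi}{4},\mathrm{sin}\frac{\pi}{4})$ with multiplier $\lambda =1/2$, and the hypothetical chart $\alpha (t)=(\mathrm{cos}(\frac{\pi}{4}+t),\mathrm{sin}(\frac{\pi}{4}+t))$. It compares a finite-difference second derivative of $f\circ \alpha $ with the right-hand side of (7) for $u=1$.

```python
import numpy as np

# Assumed example: f(x, y) = x*y on the unit circle g(x, y) = x^2 + y^2 - 1 = 0.
theta0 = np.pi / 4     # p = (cos theta0, sin theta0)
lam = 0.5              # multiplier at p, since Df(p) = (y, x) = lam * (2x, 2y)

def alpha(t):
    """Chart of the circle with alpha(0) = p (chosen for illustration)."""
    return np.array([np.cos(theta0 + t), np.sin(theta0 + t)])

def f(x):
    return x[0] * x[1]

# Left side of (7): second derivative of f o alpha at 0 by central differences.
eps = 1e-4
lhs = (f(alpha(eps)) - 2.0 * f(alpha(0.0)) + f(alpha(-eps))) / eps**2

# Right side of (7): (f''(p) - lam * g''(p)) applied to h = alpha'(0) * u, u = 1.
hess_f = np.array([[0.0, 1.0], [1.0, 0.0]])
hess_g = np.array([[2.0, 0.0], [0.0, 2.0]])
h = np.array([-np.sin(theta0), np.cos(theta0)])
rhs = h @ (hess_f - lam * hess_g) @ h

# The first term of (4) alone, i.e. f''(p) restricted to the tangent space.
naive = h @ hess_f @ h

print(lhs, rhs, naive)
```

Both sides of (7) come out close to $-2$, while the naive restriction of ${f}^{\prime \prime}(p)$ to the tangent space gives $-1$; the difference is exactly the curvature term $-\lambda \cdot {g}^{\prime \prime}(p)\cdot {h}^{2}$.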

Thus, to understand the nature of the stationary point $p$, we can study the modified Hessian:

$${f}^{\prime \prime}(p)-\lambda \cdot {g}^{\prime \prime}(p),\quad \text{restricted to } {\mathrm{T}}_{p}M. \tag{8}$$

For example, if this bilinear form is positive definite on ${\mathrm{T}}_{p}M$, then $p$ is a strict local minimum of $f$ on $M$;
if it is negative definite, then $p$ is a strict local maximum; and if it is indefinite, then $p$ is not a local extremum.
All the tests that apply to the usual Hessian in ${\mathbb{R}}^{n}$ apply to the modified Hessian (8).
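To make the test concrete, here is a small worked example; the choice of $f$, $g$ and $p$ is an assumption made purely for illustration. Take $f(x,y)=xy$ and $g(x,y)={x}^{2}+{y}^{2}-1$, so that $M$ is the unit circle. At $p=(1/\sqrt{2},1/\sqrt{2})$ we have ${f}^{\prime}(p)=(1/\sqrt{2},1/\sqrt{2})$ and ${g}^{\prime}(p)=(\sqrt{2},\sqrt{2})$, so (1) holds with $\lambda =1/2$. The modified Hessian is

$${f}^{\prime \prime}(p)-\lambda \cdot {g}^{\prime \prime}(p)=\begin{pmatrix}0& 1\\ 1& 0\end{pmatrix}-\frac{1}{2}\begin{pmatrix}2& 0\\ 0& 2\end{pmatrix}=\begin{pmatrix}-1& 1\\ 1& -1\end{pmatrix}.$$

The tangent space is ${\mathrm{T}}_{p}M=\mathrm{ker}{g}^{\prime}(p)=\{t(1,-1):t\in \mathbb{R}\}$, and for $h=t(1,-1)$ we get

$$\left({f}^{\prime \prime}(p)-\lambda \cdot {g}^{\prime \prime}(p)\right)\cdot {h}^{2}=-4{t}^{2}<0\quad \text{for } t\ne 0.$$

So the form is negative definite on ${\mathrm{T}}_{p}M$ and $p$ is a local maximum of $f$ on the circle, with $f(p)=1/2$. Note that on all of ${\mathbb{R}}^{2}$ the same matrix is only negative semidefinite (it annihilates the vector $(1,1)$), which shows why the restriction to the tangent space matters.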

In coordinates of ${\mathbb{R}}^{n}$, the modified Hessian (8) takes the form

$$\sum _{i=1}^{n}\sum _{j=1}^{n}\left({\left.\frac{{\partial}^{2}f}{\partial {x}^{i}\partial {x}^{j}}\right|}_{p}-\sum _{k=1}^{m}{\lambda}_{k}{\left.\frac{{\partial}^{2}{g}^{k}}{\partial {x}^{i}\partial {x}^{j}}\right|}_{p}\right){h}^{i}{h}^{j}. \tag{9}$$

We emphasize that the vector $h$ can be restricted to lie in the tangent space ${\mathrm{T}}_{p}M$, when studying the stationary point $p$ of $f$ restricted to $M$.

In matrix form, (9) can be written as ${h}^{\mathrm{t}}Bh$, where

$$B={\left[\frac{{\partial}^{2}f}{\partial {x}^{i}\partial {x}^{j}}-\sum _{k=1}^{m}{\lambda}_{k}\frac{{\partial}^{2}{g}^{k}}{\partial {x}^{i}\partial {x}^{j}}\right]}_{ij}. \tag{10}$$

But again, the test vector $h$ need only lie in ${\mathrm{T}}_{p}M$,
so if we want to apply positive/negative definiteness tests for matrices,
they should instead be applied to the *projected* or *reduced* Hessian:

$${Z}^{\mathrm{t}}BZ \tag{11}$$

where the columns of the $n\times (n-m)$ matrix $Z$ form a *basis*
for ${\mathrm{T}}_{p}M=\mathrm{ker}{g}^{\prime}(p)\subset {\mathbb{R}}^{n}$.
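The reduced Hessian test can be sketched in a few lines of NumPy. This is a minimal illustration, not a definitive implementation: the helper names `tangent_basis` and `classify`, the tolerance `tol`, and the example ($f(x,y,z)=x+y+z$ on the sphere ${x}^{2}+{y}^{2}+{z}^{2}=3$ at $p=(1,1,1)$ with $\lambda =1/2$) are all assumptions introduced here. The basis $Z$ of $\mathrm{ker}{g}^{\prime}(p)$ is taken from a singular value decomposition, which is one common choice among several.

```python
import numpy as np

def tangent_basis(jac_g):
    """Orthonormal basis Z (n x (n-m)) of ker Dg(p), assuming Dg(p) has full rank m."""
    jac_g = np.atleast_2d(np.asarray(jac_g, dtype=float))
    m = jac_g.shape[0]
    # In the full SVD of a rank-m matrix, rows m, m+1, ... of Vh span its null space.
    _, _, vh = np.linalg.svd(jac_g)
    return vh[m:].T

def classify(hess_f, hess_g_list, lam, jac_g, tol=1e-10):
    """Classify p via the eigenvalues of Z^T B Z, where B = f''(p) - sum_k lam_k g_k''(p)."""
    B = np.asarray(hess_f, dtype=float)
    for lam_k, hess_gk in zip(lam, hess_g_list):
        B = B - lam_k * np.asarray(hess_gk, dtype=float)
    Z = tangent_basis(jac_g)
    eig = np.linalg.eigvalsh(Z.T @ B @ Z)
    if np.all(eig > tol):
        return "local minimum", eig
    if np.all(eig < -tol):
        return "local maximum", eig
    if eig.min() < -tol and eig.max() > tol:
        return "not an extremum", eig
    return "inconclusive", eig   # semidefinite case: the test says nothing

# Assumed example: f(x, y, z) = x + y + z, g(x, y, z) = x^2 + y^2 + z^2 - 3, p = (1, 1, 1).
hess_f = np.zeros((3, 3))                 # f is linear, so f'' = 0
hess_g = 2.0 * np.eye(3)                  # g'' = 2 I
jac_g = np.array([[2.0, 2.0, 2.0]])       # Dg(p) = 2 p
kind, eig = classify(hess_f, [hess_g], [0.5], jac_g)
print(kind, eig)
```

Here $B=-I$, so ${Z}^{\mathrm{t}}BZ=-I$ on the two-dimensional tangent plane and the point is reported as a local maximum. Any other basis of the kernel would give a different matrix ${Z}^{\mathrm{t}}BZ$ but the same definiteness.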

| Field | Value |
|---|---|
| Title | tests for local extrema in Lagrange multiplier method |
| Canonical name | TestsForLocalExtremaInLagrangeMultiplierMethod |
| Date of creation | 2013-03-22 15:28:52 |
| Last modified on | 2013-03-22 15:28:52 |
| Owner | stevecheng (10074) |
| Last modified by | stevecheng (10074) |
| Numerical id | 6 |
| Author | stevecheng (10074) |
| Entry type | Result |
| Classification | msc 26B12 |
| Classification | msc 49-00 |
| Classification | msc 49K35 |
| Related topic | HessianForm |
| Related topic | RelationsBetweenHessianMatrixAndLocalExtrema |
| Defines | projected Hessian |
| Defines | reduced Hessian |