# tests for local extrema in Lagrange multiplier method

Let $U$ be open in $\mathbb{R}^{n}$, and let $f\colon U\to\mathbb{R}$ and $g\colon U\to\mathbb{R}^{m}$ be twice continuously differentiable functions. Assume that $p\in U$ is a stationary point for $f$ on $M=g^{-1}(\{0\})$, and that $\operatorname{D}g$ has full rank everywhere on $M$. (Actually, only $\operatorname{D}g(p)$ needs to have full rank, and the arguments presented here continue to hold in that case, although $M$ would not necessarily be a manifold then.) Then we know that $p$ is a solution to the Lagrange multiplier system

 $\operatorname{D}f(p)=\lambda\cdot\operatorname{D}g(p)\,,$ (1)

for a Lagrange multiplier vector $\lambda=(\lambda_{1},\ldots,\lambda_{m})$.

Our aim is to develop an analogue of the second derivative test for the stationary point $p$.

The most straightforward way to proceed is to consider a coordinate chart $\alpha\colon V\to M$ for the manifold $M$, with $V$ open in $\mathbb{R}^{n-m}$ and $\alpha(0)=p$, and to consider the Hessian of the function $f\circ\alpha\colon V\to\mathbb{R}$ at $0=\alpha^{-1}(p)$. This Hessian is in fact just the Hessian form of $f\colon M\to\mathbb{R}$ expressed in the coordinates of the chart $\alpha$. But the whole point of using Lagrange multipliers is to avoid calculating coordinate charts directly, so we find an equivalent expression for $\operatorname{D}^{2}(f\circ\alpha)(0)$ in terms of $\operatorname{D}^{2}f(p)$ and $\operatorname{D}^{2}g(p)$ without mentioning derivatives of $\alpha$.

To do this, we differentiate $f\circ\alpha$ twice using the chain rule and product rule. (Note that the “product” operation involved in the second equality of (2) is the composition of two linear mappings. Think hard about this if you are not sure; it took me several tries to get this formula right, since multi-variable iterated derivatives have a complicated structure.) To reduce clutter, from now on we use the prime notation for derivatives rather than $\operatorname{D}$.

 $\begin{split}(f\circ\alpha)^{\prime\prime}(0)&=((f^{\prime}\circ\alpha)\cdot\alpha^{\prime})^{\prime}(0)\\ &=\bigl((f^{\prime\prime}\circ\alpha)\cdot\alpha^{\prime}\cdot\alpha^{\prime}+(f^{\prime}\circ\alpha)\cdot\alpha^{\prime\prime}\bigr)(0)\\ &=f^{\prime\prime}(\alpha(0))\cdot\alpha^{\prime}(0)\cdot\alpha^{\prime}(0)+f^{\prime}(\alpha(0))\cdot\alpha^{\prime\prime}(0)\\ &=f^{\prime\prime}(p)\cdot\alpha^{\prime}(0)\cdot\alpha^{\prime}(0)+f^{\prime}(p)\cdot\alpha^{\prime\prime}(0)\,.\end{split}$ (2)

If we interpret $(f\circ\alpha)^{\prime\prime}(0)$ as a bilinear mapping of vectors $u,v\in\mathbb{R}^{n-m}$, then formula (2) really means

 $(f\circ\alpha)^{\prime\prime}(0)\cdot(u,v)=f^{\prime\prime}(p)\cdot\bigl(\alpha^{\prime}(0)\cdot u,\alpha^{\prime}(0)\cdot v\bigr)+f^{\prime}(p)\cdot\bigl(\alpha^{\prime\prime}(0)\cdot(u,v)\bigr)\,.$ (3)

To obtain the quadratic form, we set $v=u$; also we abbreviate the vector $\alpha^{\prime}(0)\cdot u$ by $h$, which belongs to the tangent space $\mathrm{T}_{p}M$ of $M$ at $p$. So,

 $(f\circ\alpha)^{\prime\prime}(0)\cdot u^{2}=f^{\prime\prime}(p)\cdot h^{2}+f^{\prime}(p)\cdot\bigl(\alpha^{\prime\prime}(0)\cdot u^{2}\bigr)\,.$ (4)

Naïvely, we might think that $(f\circ\alpha)^{\prime\prime}(0)$ is simply $f^{\prime\prime}(p)$ restricted to the tangent space $\mathrm{T}_{p}M$. This happens to be the first term in (4), but there is also an additional contribution by the second term involving $\alpha^{\prime\prime}(0)$; intuitively, $\alpha^{\prime\prime}(0)$ is the curvature of the surface (manifold) $M$, “changing the geometry” of the graph of $f$.

But the second term of (4) still involves $\alpha$. To eliminate it, we differentiate the equation $g\circ\alpha=0$ twice.

 $0=(g\circ\alpha)^{\prime\prime}(0)=g^{\prime\prime}(p)\cdot\alpha^{\prime}(0)\cdot\alpha^{\prime}(0)+g^{\prime}(p)\cdot\alpha^{\prime\prime}(0)\,.$ (5)

(It is derived the same way as (2) but with $f$ replaced by $g$.) Now we can substitute (5) and (1) in (2) to eliminate the term $f^{\prime}(p)\cdot\alpha^{\prime\prime}(0)$:

 $\begin{split}(f\circ\alpha)^{\prime\prime}(0)&=f^{\prime\prime}(p)\cdot\alpha^{\prime}(0)\cdot\alpha^{\prime}(0)+\lambda\cdot g^{\prime}(p)\cdot\alpha^{\prime\prime}(0)\\ &=f^{\prime\prime}(p)\cdot\alpha^{\prime}(0)\cdot\alpha^{\prime}(0)-\lambda\cdot g^{\prime\prime}(p)\cdot\alpha^{\prime}(0)\cdot\alpha^{\prime}(0)\,,\end{split}$ (6)

or expressed as a quadratic form,

 $(f\circ\alpha)^{\prime\prime}(0)\cdot u^{2}=f^{\prime\prime}(p)\cdot h^{2}-\lambda\cdot g^{\prime\prime}(p)\cdot h^{2}\,.$ (7)
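
Formula (7) can be sanity-checked numerically. The following is a minimal sketch under assumptions that are not part of the text above: the example $f(x,y)=xy$ on the unit circle $g(x,y)=x^{2}+y^{2}-1=0$, the chart $\alpha(t)=(\cos(\theta_{0}+t),\sin(\theta_{0}+t))$ with $\theta_{0}=\pi/4$, and a central finite difference for the left-hand side. All names in the code are illustrative.

```python
import numpy as np

# Hypothetical example: f(x, y) = x*y on the unit circle x^2 + y^2 - 1 = 0.
def f(x):
    return x[0] * x[1]

theta0 = np.pi / 4  # chosen so that p = alpha(0) = (1/sqrt(2), 1/sqrt(2))

def alpha(t):
    """Chart of the circle around p, with alpha(0) = p."""
    return np.array([np.cos(theta0 + t), np.sin(theta0 + t)])

p = alpha(0.0)
grad_f = np.array([p[1], p[0]])   # f'(p)
grad_g = 2.0 * p                  # g'(p)

# Multiplier from f'(p) = lam * g'(p); the assert checks p is stationary.
lam = (grad_f @ grad_g) / (grad_g @ grad_g)
assert np.allclose(grad_f, lam * grad_g)

hess_f = np.array([[0.0, 1.0], [1.0, 0.0]])  # f''(p)
hess_g = 2.0 * np.eye(2)                     # g''(p)

# Tangent vector h = alpha'(0) * u with u = 1.
h = np.array([-np.sin(theta0), np.cos(theta0)])

rhs = h @ (hess_f - lam * hess_g) @ h   # right-hand side of (7)
naive = h @ hess_f @ h                  # f''(p) restricted to the tangent space only

# Left-hand side of (7): central second difference of f(alpha(t)) at t = 0.
t = 1e-4
lhs = (f(alpha(t)) - 2.0 * f(alpha(0.0)) + f(alpha(-t))) / t**2

print("finite difference:", lhs)
print("modified Hessian :", rhs)
print("naive restriction:", naive)
```

In this example the modified Hessian gives $-2$, matching the finite difference up to discretization error, while the naive restriction of $f^{\prime\prime}(p)$ alone gives $-1$; the difference is the curvature contribution discussed after (4).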

Thus, to understand the nature of the stationary point $p$, we can study the modified Hessian:

 $f^{\prime\prime}(p)-\lambda\cdot g^{\prime\prime}(p)\,,\quad\text{restricted to }\mathrm{T}_{p}M.$ (8)

For example, if this bilinear form is positive definite, then $p$ is a strict local minimum of $f$ on $M$; if it is negative definite, then $p$ is a strict local maximum; and if it is indefinite, then $p$ is neither. All the tests that apply to the usual Hessian in $\mathbb{R}^{n}$ apply to the modified Hessian (8).
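
As a quick illustration (an example added here, not part of the derivation above), take $n=2$, $m=1$, $f(x,y)=x+y$ and $g(x,y)=x^{2}+y^{2}-1$, so that $M$ is the unit circle. The system (1) reads $(1,1)=\lambda\,(2x,2y)$, which together with $g=0$ gives the stationary points $p_{\pm}=\pm(1/\sqrt{2},1/\sqrt{2})$ with multipliers $\lambda_{\pm}=\pm 1/\sqrt{2}$. Since $f^{\prime\prime}=0$ and $g^{\prime\prime}=2I$, the modified Hessian is

 $f^{\prime\prime}(p_{\pm})-\lambda_{\pm}\,g^{\prime\prime}(p_{\pm})=\mp\sqrt{2}\,I\,.$

It is negative definite at $p_{+}$ and positive definite at $p_{-}$, already on all of $\mathbb{R}^{2}$ and hence on the tangent lines, so $p_{+}$ is a local maximum and $p_{-}$ a local minimum of $f$ on the circle. This agrees with the direct computation $f(\cos\theta,\sin\theta)=\cos\theta+\sin\theta$.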

In coordinates of $\mathbb{R}^{n}$, the modified Hessian (8) takes the form

 $\sum_{i=1}^{n}\sum_{j=1}^{n}\left(\left.\frac{\partial^{2}f}{\partial x^{i}\partial x^{j}}\right|_{p}-\sum_{k=1}^{m}\lambda_{k}\,\left.\frac{\partial^{2}g^{k}}{\partial x^{i}\partial x^{j}}\right|_{p}\right)h^{i}h^{j}\,.$ (9)

We emphasize that the vector $h$ can be restricted to lie in the tangent space $\mathrm{T}_{p}M$, when studying the stationary point $p$ of $f$ restricted to $M$.

In matrix form, (9) can be written as $h^{\mathrm{t}}Bh$, where, with all partial derivatives evaluated at $p$,

 $B=\begin{bmatrix}\dfrac{\partial^{2}f}{\partial x^{i}\partial x^{j}}-\sum\limits_{k=1}^{m}\lambda_{k}\,\dfrac{\partial^{2}g^{k}}{\partial x^{i}\partial x^{j}}\end{bmatrix}_{ij}\,.$ (10)

But again, the test vector $h$ need only lie in $\mathrm{T}_{p}M$, so if we want to apply positive/negative definiteness tests for matrices, they should be applied not to $B$ itself but to the projected or reduced Hessian:

 $Z^{\mathrm{t}}BZ$ (11)

where the columns of the $n\times(n-m)$ matrix $Z$ form a basis for $\mathrm{T}_{p}M=\ker g^{\prime}(p)\subset\mathbb{R}^{n}$.
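
The reduced Hessian (11) is straightforward to compute numerically. The sketch below is one possible approach, not a prescribed method: it takes the columns of $Z$ from the singular value decomposition of $g^{\prime}(p)$ and classifies $p$ by the eigenvalues of $Z^{\mathrm{t}}BZ$. The function names `reduced_hessian` and `classify`, the tolerances, and the test problem $f(x,y,z)=x^{2}+2y^{2}+3z^{2}$ on the unit sphere are all assumptions made for illustration.

```python
import numpy as np

def reduced_hessian(hess_f, hess_g_list, jac_g, lam, tol=1e-12):
    """Return Z^t B Z, where B = f''(p) - sum_k lam_k g_k''(p)
    and the columns of Z form an orthonormal basis of ker g'(p)."""
    B = hess_f - sum(l * H for l, H in zip(lam, hess_g_list))
    # Right singular vectors beyond the numerical rank span the null space.
    _, s, vt = np.linalg.svd(jac_g)
    rank = int(np.sum(s > tol))
    Z = vt[rank:].T  # shape (n, n - m) when g'(p) has full rank
    return Z.T @ B @ Z

def classify(H, tol=1e-9):
    """Classify a stationary point from the eigenvalues of the reduced Hessian."""
    eig = np.linalg.eigvalsh(H)
    if np.all(eig > tol):
        return "local minimum"
    if np.all(eig < -tol):
        return "local maximum"
    if eig.min() < -tol and eig.max() > tol:
        return "saddle"
    return "inconclusive"

# Illustrative problem: f = x^2 + 2y^2 + 3z^2 on the sphere x^2 + y^2 + z^2 - 1 = 0.
hess_f = np.diag([2.0, 4.0, 6.0])
hess_g = [2.0 * np.eye(3)]

# Stationary points on the coordinate axes, with multipliers solving (1).
points = {(1.0, 0.0, 0.0): 1.0, (0.0, 1.0, 0.0): 2.0, (0.0, 0.0, 1.0): 3.0}

results = {}
for p, lam in points.items():
    jac_g = 2.0 * np.array([p])  # g'(p) as a 1 x 3 matrix
    H = reduced_hessian(hess_f, hess_g, jac_g, [lam])
    results[p] = classify(H)
    print(p, results[p])
```

At $(1,0,0)$ the full matrix $B=\operatorname{diag}(0,2,4)$ is only positive semidefinite, yet its restriction to the tangent space is positive definite, which is why the test is applied to $Z^{\mathrm{t}}BZ$ rather than to $B$.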

| Field | Value |
|---|---|
| Title | tests for local extrema in Lagrange multiplier method |
| Canonical name | TestsForLocalExtremaInLagrangeMultiplierMethod |
| Date of creation | 2013-03-22 15:28:52 |
| Last modified on | 2013-03-22 15:28:52 |
| Owner | stevecheng (10074) |
| Last modified by | stevecheng (10074) |
| Numerical id | 6 |
| Author | stevecheng (10074) |
| Entry type | Result |
| Classification | msc 26B12, msc 49-00, msc 49K35 |
| Related topics | HessianForm, RelationsBetweenHessianMatrixAndLocalExtrema |
| Defines | projected Hessian, reduced Hessian |