|
|
|
|
proof of arithmetic-geometric means inequality using Lagrange multipliers
|
(Example)
|
|
|
As an interesting example of the Lagrange multiplier method, we employ it to prove the arithmetic-geometric means inequality:
with equality if and only if all the $x_i$ are equal.
To begin with, define $f\colon \real_{\geq 0}^n \to \real_{\geq 0}$ by $f(x) = (x_1 \dotsm x_n)^{1/n}$ .
Fix $a > 0$ (the arithmetic mean), and consider the set $M = \{ x \in \real^n : x_i \geq 0 { for all $i$, } \sum_i x_i = na \}$ . This is clearly a closed and bounded set in $\real^n$ , and hence is compact. Therefore $f$ on $M$ has both a maximum and minimum.
$M$ is almost a manifold (a differentiable surface), except that it has a boundary consisting of points with $x_i = 0$ for some $i$ . Clearly $f$ attains the minimum zero on this boundary, and on the ``interior'' of $M$ , that is $M' = \{ x \in \real^n : x_i > 0 { for all $i$, } \sum_i x_i = na \}$ , $f$ is positive. Now $M'$ is a manifold, and the maximum of $f$ on $M$ is evidently a local maximum on $M'$ . Hence the Lagrange multiplier method may be applied.
We have the constraint 1 $0 = g(x) = \sum_i x_i - na$ under which we are to maximize $f$ .
We compute:
If we take the product,
, of the extreme left- and right- hand sides of this equation, we obtain $$ \frac{1}{n^n} = \frac{f(x)^n}{n^n x_1 \dotsm x_n} = \lambda^n $$ from which it follows that $\lambda = 1/n$ ($\lambda = -1/n$ is obviously inadmissible). Then substituting back, we get $f(x) = x_i$ , which implies that the $x_i$ are all equal to $a$ , The value of $f$ at this point is $a$ .
Since we have obtained a unique solution to the Lagrange multiplier system, this unique solution must be the unique maximum point of $f$ on $M$ . So $f(x) \leq a$ amongst all numbers $x_i$ whose arithmetic mean is $a$ , with equality occurring if and only if all the $x_i$ are equal to $a$ .
The case $a = 0$ , which was not included in the above analysis, is trivial.
The question of whether the point obtained from solving Lagrange system is actually a maximum for $f$ was taken care of by our preliminary compactness argument. Another popular way to determine whether a given stationary point is a local maximum or minimum is by studying the Hessian of the function at that point. For the sake of illustrating the relevant techniques, we will prove again that $x_i = a$ is a local maximum for $f$
, this time by showing that the Hessian is negative definite.
Actually, it turns out to be not much harder to show that $f$ is weakly concave everywhere on $\real_{\geq 0}^n$ , not just on $M$ . A plausibility argument for this fact is that the graph of $t \mapsto \sqrt[n]{t}$ is concave, and since $f$ also involves an $n$ th root, it should be concave too. We will prove concavity of $f$ by showing that the Hessian of $f$ is negative semi-definite.
Since $M$ is a convex2 set, the restriction of $f$ to $M$ will also be weakly concave; then we can conclude that the critical point $x_i = a$ on $M$ must be a global minimum on $M$ .
We begin by calculating second-order derivatives:
(The symbol $\delta_{ij}$ is the Kronecker delta.) Thus the Hessian matrix is:
Now if we had actual numbers to substitute for the $x_i$ , then we can go to a computer and ask if the matrix is negative definite or not. But we want to prove global concavity, so we have to employ some clever algebra, and work directly from the definition.
Let
be a test vector. Negative semi-definiteness means $v \, \D^2 f(x) \, v^{\mathrm{T}} \leq 0$ for all $v$ . So let us write out:
We would be done if we can prove \begin{equation}\label{cauchy} \left( \sum_i w_i \right)^2 \leq n \sum_i w_i^2\,, \quad w_i = \frac{v_i}{x_i}\,. \end{equation}But look, this is just the Cauchy-Schwarz inequality, concerning the dot product of the vectors
and
! So $\D^2 f(x)$ is negative semi-definite everywhere, i.e. $f$ is weakly concave.
Moreover, the case of equality in the Cauchy-Schwarz inequality ( ) only occurs when
is parallel to
, i.e. $v_i = \mu x_i$ for some scalar $\mu$ . Notice that
is the normal vector to the tangent space of $M$ , so
unless $\mu = 0$ . This means, equality never holds in ( ) if $v \neq 0$ is restricted to the tangent space of $M$ . In turn, this means that $f$ is strictly concave on the convex set $M$ , and so $x_i = a$ is a strict global
minimum of $f$ on $M$ .
Observe that the maximum point $x_i = a$ is pleasantly symmetric. We can show that the solution has to be symmetric, by exploiting the symmetry of the function $f$ and the set $M$ .
We have already calculated that $f$ is strictly concave on $M$ ; this was independent of the Lagrange multiplier method. Since $f$ is strictly concave on a closed convex set, it has a unique maximum there; call it $x$ .
Pick any indices $i$ and $j$ , ranging from $1$ to $n$ , and form the linear transformation $T$ , which switches the $i$ th and $j$ th coordinates of its argument.
By definition, $f(x) \geq f(\xi)$ for all $\xi \in M$ . But $f \circ T = f$ , so this means $f(Tx) \geq f(T\xi)$ for all $\xi \in M$ . But $M$ is mapped to itself by the transformation $T$ , so this is the same as saying that $f(Tx) \geq f(\xi)$ for all $\xi \in M$ . But the global maximum is unique, so $Tx = x$ , that is, $x_i = x_j$ .
Note that the argument in this last proof is extremely general -- it takes the form: if $x$ is the unique maximum of $f \colon M \to \real$ , and $T$ is any symmetry of $M$ such that $f \circ T = f$ , then $Tx = x$ . This kind of symmetry argument was used to great effect by Jakob Steiner in his famous attempted proof of the isoperimetric inequality: among all closed curves of a given arc length, the circle maximizes the area enclosed. However, Steiner did not realize that the assumption that there is a unique maximizing curve has to be proved. Actually, that is the hardest part of the proof -- if the assumption is known, then one may easily calculate using the Lagrange multiplier method that the maximizing curve must be a circle!
Footnotes
- 1
- Although $g$ does not constrain that $x_i > 0$ , this does not really matter -- for those who are worried about such things, a rigorous way to escape this annoyance is to simply define $f(x) = 0$ to be zero whenever $x_i \leq 0$ for some $i$ . Clearly the new $f$ cannot have a maximum at such points, so we may simply ignore the points with $x_i \leq 0$ when solving the system. And for the same reason, the fact that $f$ fails to be differentiable at the boundary points is immaterial.
- 2
- If $M$ is not convex, then the arguments that follow will not work. In general, a prescription to determine whether a critical point is a local minimum or maximum can be found in tests for local extrema in Lagrange multiplier method.
|
"proof of arithmetic-geometric means inequality using Lagrange multipliers" is owned by stevecheng.
|
|
(view preamble | get metadata)
Cross-references: calculate, curve, area, circle, arc length, closed curves, isoperimetric inequality, proof, global maximum, transformation, coordinates, linear transformation, indices, independent, strictly concave, symmetry, symmetric, global minimum, strict, strictly, tangent space, normal vector, scalar, parallel, dot product, Cauchy-Schwarz inequality, vector, algebra, matrix, computer, Hessian matrix, Kronecker delta, derivatives, second-order, restriction, tests for local extrema in Lagrange multiplier method, local minimum, critical point, convex, negative, root, graph, concave, negative definite, function, Hessian, stationary point, argument, compactness, analysis, numbers, solution, implies, equation, sides, product, local maximum, positive, points, boundary, surface, differentiable, manifold, compact, bounded set, closed, arithmetic mean, fix, equality, arithmetic-geometric means inequality, Lagrange multiplier method
This is version 13 of proof of arithmetic-geometric means inequality using Lagrange multipliers, born on 2005-07-27, modified 2007-08-08.
Object id is 7278, canonical name is ProofOfArithmeticGeometricMeansInequalityUsingLagrangeMultipliers.
Accessed 5450 times total.
Classification:
| AMS MSC: | 26B12 (Real functions :: Functions of several variables :: Calculus of vector functions) | | | 49-00 (Calculus of variations and optimal control; optimization :: General reference works ) |
|
|
|
|
|
|
Pending Errata and Addenda
|
|
|
|
|
|
|
|
|
|
|