Lagrange multipliers on manifolds
We discuss in this article the theoretical aspects of the Lagrange multiplier method.
To enhance understanding, proofs and intuitive explanations of the Lagrange multiplier method will be given from several different viewpoints, both elementary and advanced.
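Let $N$ be an $n$-dimensional differentiable manifold (without boundary), and let $f \colon N \to \mathbb{R}$ and $g_i \colon N \to \mathbb{R}$, for $i = 1, \ldots, k$, be continuously differentiable. Set $M = \bigcap_{i=1}^{k} g_i^{-1}(\{0\})$.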
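Theorem 1 Suppose $dg_i$ are linearly independent at each point of $M$. If $p \in M$ is a local minimum or maximum point of $f$ restricted to $M$, then there exist Lagrange multipliers $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$, depending on $p$, such that
$$df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p).$$
Here $d$ denotes the exterior derivative.
Of course, as in one-dimensional calculus, the condition $df(p) = \sum_i \lambda_i \, dg_i(p)$ by itself does not guarantee that $p$ is a minimum or maximum point, even locally.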
The version of Lagrange multipliers typically used in calculus is the special case $N = \mathbb{R}^n$ in Theorem 1. In this case, the conclusion of the theorem can also be written in terms of gradients instead of differential forms:
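Theorem 2 Suppose $\nabla g_i$ are linearly independent at each point of $M$. If $p \in M$ is a local minimum or maximum point of $f$ restricted to $M$, then there exist Lagrange multipliers $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$, depending on $p$, such that
$$\nabla f(p) = \lambda_1 \nabla g_1(p) + \cdots + \lambda_k \nabla g_k(p).$$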
This formulation and the first one are equivalent since the 1-form $df$ can be identified with the gradient $\nabla f$, via the formula $\nabla f(p) \cdot v = df(p)(v)$ for all $v \in \mathbb{R}^n$.
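As a small numerical illustration of Theorem 2, the following sketch checks the gradient condition for a problem that is not taken from the text above: maximizing the hypothetical objective $f(x, y) = x + y$ on the unit circle $g(x, y) = x^2 + y^2 - 1 = 0$. The helper name `multiplier_and_residual` is an assumption made for this example. It computes the best-fitting multiplier at a point and reports how far $\nabla f$ is from $\lambda \nabla g$ there.

```python
import math

def grad_f(x, y):
    # Gradient of the illustrative objective f(x, y) = x + y.
    return (1.0, 1.0)

def grad_g(x, y):
    # Gradient of the illustrative constraint g(x, y) = x^2 + y^2 - 1.
    return (2.0 * x, 2.0 * y)

def multiplier_and_residual(p):
    """Best-fit lambda for grad f(p) = lambda * grad g(p), and the misfit norm."""
    gf = grad_f(*p)
    gg = grad_g(*p)
    # With a single constraint, least squares gives lambda = (gf . gg) / (gg . gg).
    lam = (gf[0] * gg[0] + gf[1] * gg[1]) / (gg[0] ** 2 + gg[1] ** 2)
    residual = math.hypot(gf[0] - lam * gg[0], gf[1] - lam * gg[1])
    return lam, residual

# At the constrained maximum (1/sqrt 2, 1/sqrt 2) the residual vanishes.
lam_max, res_max = multiplier_and_residual((1 / math.sqrt(2), 1 / math.sqrt(2)))

# At (1, 0), which lies on the circle but is not an extremum of f, it does not.
lam_bad, res_bad = multiplier_and_residual((1.0, 0.0))

print(lam_max, res_max)
print(lam_bad, res_bad)
```

At the maximum the gradient of $f$ is a multiple of the gradient of $g$, while at the non-extremal point a component of $\nabla f$ tangent to the circle remains.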
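The functions $g_i$ can also be coalesced into a vector-valued function $g = (g_1, \ldots, g_k) \colon N \to \mathbb{R}^k$. Then we have:
Theorem 3 Suppose the tangent map $Dg$ is surjective at each point of $M = g^{-1}(\{0\})$. If $p \in M$ is a local minimum or maximum point of $f$ restricted to $M$, then there exists a Lagrange multiplier vector $\lambda \in (\mathbb{R}^k)^*$, depending on $p$, such that
$$Df(p) = Dg(p)^* \lambda.$$
Here $Dg(p)^*$ denotes the pullback of the linear transformation $Dg(p)$, so that $Dg(p)^* \lambda = \lambda \circ Dg(p)$.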
If $Dg$ is represented by its Jacobian matrix, then the condition that it be surjective is equivalent to the Jacobian matrix having full rank.
Note the deliberate use of the space $(\mathbb{R}^k)^*$ instead of $\mathbb{R}^k$ (to which the former is isomorphic) for the Lagrange multiplier vector. It turns out that the Lagrange multiplier vector naturally lives in the dual space and not the original vector space $\mathbb{R}^k$. This distinction is particularly important in the infinite-dimensional generalizations of Lagrange multipliers. But even in the finite-dimensional setting, we do see hints that the dual space has to be involved, because a transpose appears in the matrix expression for Lagrange multipliers.
If the expression $Dg(p)^* \lambda$ is written out in coordinates, then it is apparent that the components $\lambda_i$ of the vector $\lambda$ are exactly those Lagrange multipliers from Theorems 1 and 2.
The proof of the Lagrange multiplier theorem is surprisingly short and elegant, when properly phrased in the language of abstract manifolds and differential forms.
However, for the benefit of the readers not versed in these topics, we provide, in addition to the abstract proof, a concrete translation of the arguments in the more familiar setting $N = \mathbb{R}^n$.
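Proof. Since $dg_i$ are linearly independent at each point of $M = g^{-1}(\{0\})$, $M$ is an embedded submanifold of $N$, of dimension $m = n - k$. Let $\alpha \colon U \to M$, with $U$ open in $\mathbb{R}^m$, be a coordinate chart for $M$ such that $\alpha(0) = p$. Then $\alpha^* f = f \circ \alpha$ has a local minimum or maximum at 0, and therefore $0 = d(\alpha^* f) = \alpha^* \, df$ at 0. But $D\alpha$ at 0 is an isomorphism $\mathbb{R}^m \to T_p M$, so the preceding equation says that $df(p)$ vanishes on $T_p M$.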
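Now, by the definition of $M$, we have $\alpha^* g_i = g_i \circ \alpha = 0$, so $0 = d(\alpha^* g_i) = \alpha^* \, dg_i$. So like $df(p)$, $dg_i(p)$ vanishes on $T_p M$.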
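In other words, $dg_i(p)$ is in the annihilator $(T_p M)^0$ of the subspace $T_p M \subseteq T_p N$. Since $T_p M$ has dimension $m = n - k$, and $T_p N$ has dimension $n$, the annihilator $(T_p M)^0$ has dimension $k$. Now $dg_i(p) \in (T_p M)^0$ are linearly independent, so they must in fact be a basis for $(T_p M)^0$. But we had argued that $df(p) \in (T_p M)^0$. Therefore $df(p)$ may be written as a unique linear combination of the $dg_i(p)$:
$$df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p).$$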
The last paragraph of the previous proof can also be rephrased, based on the same underlying ideas, to make evident the fact that the Lagrange multiplier vector lives in the dual space $(\mathbb{R}^k)^*$.
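Yet another proof could be devised by observing that the result is obvious if $N = \mathbb{R}^n$ and the constraint functions are just coordinate projections on $\mathbb{R}^n$:
$$g_i(y_1, \ldots, y_n) = y_i, \quad i = 1, \ldots, k.$$
We clearly must have $\partial f / \partial y_{k+1} = \cdots = \partial f / \partial y_n = 0$ at a point $p$ that minimizes $f$ over $y_1 = \cdots = y_k = 0$. The general case can be reduced to this one by a coordinate change: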
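Proof. [Alternate argument.] Since $dg_i$ are linearly independent, we can find a coordinate chart for $N$ about the point $p$, with coordinate functions $y_1, \ldots, y_n \colon N \to \mathbb{R}$ such that $y_i = g_i$ for $i = 1, \ldots, k$. Then
$$df = \frac{\partial f}{\partial y_1} \, dy_1 + \cdots + \frac{\partial f}{\partial y_n} \, dy_n = \frac{\partial f}{\partial y_1} \, dg_1 + \cdots + \frac{\partial f}{\partial y_k} \, dg_k + \frac{\partial f}{\partial y_{k+1}} \, dy_{k+1} + \cdots + \frac{\partial f}{\partial y_n} \, dy_n,$$
but $\partial f / \partial y_{k+1} = \cdots = \partial f / \partial y_n = 0$ at the point $p$. Set $\lambda_i = \partial f / \partial y_i$ at $p$.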
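Proof. We assume that $N = \mathbb{R}^n$. Consider the list vector $g = (g_1, \ldots, g_k)$ discussed earlier, and its Jacobian matrix $Dg$ in Euclidean coordinates. The $i$th row of this matrix is
$$\begin{bmatrix} \dfrac{\partial g_i}{\partial x_1} & \cdots & \dfrac{\partial g_i}{\partial x_n} \end{bmatrix} = (\nabla g_i)^{T}.$$
So the matrix $Dg$ has full rank (i.e. $\operatorname{rank} Dg = k$) if and only if the $k$ gradients $\nabla g_i$ are linearly independent.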
Consider each solution $q \in M$ of $g(q) = 0$. Since $Dg$ has full rank, we can apply the implicit function theorem, which states that there exist smooth solution parameterizations $\alpha \colon U \to M$ around each point $q \in M$. ($U$ is an open set in $\mathbb{R}^m$, $m = n - k$.) These are the coordinate charts which give $M$ a manifold structure.
We now consider specially the point $q = p$; without loss of generality, assume $\alpha(0) = p$. Then $f \circ \alpha$ is a function on Euclidean space having a local minimum or maximum at 0, so its derivative vanishes at 0. Calculating by the chain rule, we have $0 = D(f \circ \alpha)(0) = Df(p) \cdot D\alpha(0)$. In other words, $\ker Df(p) \supseteq \operatorname{range} D\alpha(0) = T_p M$. Intuitively, this says that the directional derivatives at $p$ of $f$ lying in the tangent space $T_p M$ of the manifold $M$ vanish.
By the definition of $g$ and $\alpha$, we have $g \circ \alpha = 0$. By the chain rule again, we derive $Dg(p) \cdot D\alpha(0) = 0$.
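Let the columns of $D\alpha(0)$ be the column vectors $v_1, \ldots, v_m$, which span the $m$-dimensional space $T_p M$, and look at the matrix equation $Df(p) \cdot D\alpha(0) = 0$ again. The equation for each entry of this matrix, which consists of only one row, is:
$$\nabla f(p) \cdot v_j = 0, \quad j = 1, \ldots, m.$$
In other words, $\nabla f(p)$ is orthogonal to $v_1, \ldots, v_m$, and hence it is orthogonal to the entire tangent space $T_p M$.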
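Similarly, the matrix equation $Dg(p) \cdot D\alpha(0) = 0$ can be split into individual scalar equations:
$$\nabla g_i(p) \cdot v_j = 0, \quad i = 1, \ldots, k, \quad j = 1, \ldots, m.$$
Thus $\nabla g_i(p)$ is orthogonal to $T_p M$. But $\nabla g_i(p)$ are, by hypothesis, linearly independent, and there are $k$ of these gradients, so they must form a basis for the orthogonal complement of $T_p M$, of $n - m = k$ dimensions. Hence $\nabla f(p)$ can be written as a unique linear combination of $\nabla g_i(p)$:
$$\nabla f(p) = \lambda_1 \nabla g_1(p) + \cdots + \lambda_k \nabla g_k(p).$$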
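The linear algebra in this proof can be traced on a concrete case. The sketch below uses a hypothetical problem that does not appear in the text above: minimize $f(x, y, z) = x^2 + y^2 + z^2$ subject to $g_1 = x + y + z - 1 = 0$ and $g_2 = x - 2y = 0$, whose minimizer $p = (3/7, 3/14, 5/14)$ was worked out by hand. It checks that $\nabla f(p)$ is orthogonal to the tangent line $T_p M$ and recovers the two multipliers by solving the $2 \times 2$ normal equations. Exact rational arithmetic is used so that the checks are not blurred by rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Hand-computed minimizer of the illustrative problem (an assumption of this sketch).
p = (F(3, 7), F(3, 14), F(5, 14))

grad_f = tuple(2 * c for c in p)      # gradient of x^2 + y^2 + z^2
grad_g1 = (F(1), F(1), F(1))          # gradient of x + y + z - 1
grad_g2 = (F(1), F(-2), F(0))         # gradient of x - 2y

# Here n = 3 and k = 2, so T_p M is a line; it is spanned by grad_g1 x grad_g2.
tangent = cross(grad_g1, grad_g2)
tangent_component = dot(grad_f, tangent)

# Solve the normal equations G * lam = b, where G is the Gram matrix of the
# constraint gradients, to express grad_f in the basis {grad_g1, grad_g2}.
a11, a12, a22 = dot(grad_g1, grad_g1), dot(grad_g1, grad_g2), dot(grad_g2, grad_g2)
b1, b2 = dot(grad_g1, grad_f), dot(grad_g2, grad_f)
det = a11 * a22 - a12 * a12
lam1 = (b1 * a22 - a12 * b2) / det
lam2 = (a11 * b2 - a12 * b1) / det

residual = tuple(gf - lam1 * u - lam2 * v
                 for gf, u, v in zip(grad_f, grad_g1, grad_g2))

print(tangent_component, lam1, lam2, residual)
```

The Gram matrix is invertible exactly because the constraint gradients are linearly independent, which is the hypothesis of the theorem.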
We now discuss the intuitive and geometric interpretations of Lagrange multipliers.
Each equation $g_i = 0$ defines a hypersurface $M_i$ in $\mathbb{R}^n$, a manifold of dimension $n - 1$. If we consider the tangent hyperplane at $p$ of these hypersurfaces, $T_p M_i$, the gradient $\nabla g_i(p)$ gives the normal vector to these hyperplanes.
The manifold $M$ is the intersection of the hypersurfaces $M_i$. Presumably, the tangent space $T_p M$ is the intersection of the $T_p M_i$, and the subspace perpendicular to $T_p M$ would be spanned by the normals $\nabla g_i(p)$. Now, the directional derivatives at $p$ of $f$ with respect to each vector in $T_p M$, as we have proved, vanish. So the direction of $\nabla f(p)$, the direction of the greatest change in $f$ at $p$, should be perpendicular to $T_p M$. Hence $\nabla f(p)$ can be written as a linear combination of the $\nabla g_i(p)$.
Note, however, that this geometric picture, and the manipulations with the gradients $\nabla f$ and $\nabla g_i$, do not carry over to abstract manifolds. The notions of gradients and normals to surfaces depend on the inner product structure of $\mathbb{R}^n$, which is not present in an abstract manifold (without a Riemannian metric).
On the other hand, this explains the mysterious appearance of annihilators in the last paragraph of the abstract proof. Annihilators and dual space theory serve as the proper tools to formalize the manipulations we made with the matrix equations $Df(p) \cdot D\alpha(0) = 0$ and $Dg(p) \cdot D\alpha(0) = 0$, without resorting to Euclidean coordinates, which, of course, are not even defined on an abstract manifold.
If we are willing to interpret the quantities $df$ and $dg_i$ as infinitesimals, even the abstract version of the result has an intuitive explanation. Suppose we are at the point $p$ of the manifold $M$, and consider an infinitesimal movement $\Delta p$ about this point. The infinitesimal movement $\Delta p$ is a vector in the tangent space $T_p N$, because, near $p$, $N$ looks like the linear space $T_p N$. And as $p$ moves, the function $f$ changes by a corresponding infinitesimal amount $df$ that is approximately linear in $\Delta p$.
Furthermore, the change $df$ may be decomposed as the sum of a change as $p$ moves along the manifold $M$, and a change as $p$ moves out of the manifold $M$. But if $f$ has a local minimum at $p$, then there cannot be any change of $f$ along $M$; thus $f$ only changes when moving out of $M$. Now $M$ is described by the equations $g_i = 0$, so a movement out of $M$ is described by the infinitesimal changes $dg_i$. As $df$ is linear in the change $\Delta p$, we ought to be able to write it as a weighted sum of the changes $dg_i$. The weights are, of course, the Lagrange multipliers $\lambda_i$.
The linear algebra performed in the abstract proof can be regarded as the precise, rigorous translation of the preceding argument.
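Observe that the formula for Lagrange multipliers is formally very similar to the standard formula for expressing a differential form in terms of a basis:
$$df(p) = \frac{\partial f}{\partial y_1} \, dy_1 + \cdots + \frac{\partial f}{\partial y_n} \, dy_n.$$
In fact, if $dg_i(p)$ are linearly independent, then they do form a basis for $(T_p M)^0$, which can be extended to a basis for $(T_p N)^*$. By the uniqueness of the basis representation, we must have
$$\lambda_i = \frac{\partial f}{\partial g_i}(p).$$
That is, $\lambda_i$ is the differential of $f$ with respect to changes in $g_i$.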
In applications of Lagrange multipliers to economic problems, the multipliers $\lambda_i$ are rates of substitution: they give the rate of improvement in the objective function as the constraints are relaxed.
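This sensitivity interpretation can be checked numerically on a toy case. The sketch below assumes a hypothetical problem, not one from the text: minimize $f(x, y) = x^2 + y^2$ subject to $g(x, y) = x + y - c = 0$, where "relaxing" the constraint means changing the level $c$. By symmetry the minimizer is $(c/2, c/2)$, and the code compares the multiplier from $\nabla f = \lambda \nabla g$ with a finite-difference estimate of how the optimal value changes with $c$. Sign conventions for multipliers vary between texts; the one used here matches $df = \lambda \, dg$.

```python
def constrained_min(c):
    """Minimizer and minimum value of x^2 + y^2 on the line x + y = c."""
    # By symmetry the closest point of the line to the origin is (c/2, c/2).
    x = y = c / 2.0
    return (x, y), x * x + y * y

def multiplier(c):
    """Lambda solving grad f = lambda * grad g at the constrained minimizer."""
    (x, y), _ = constrained_min(c)
    # grad f = (2x, 2y) and grad g = (1, 1), so lambda = 2x = 2y.
    return 2.0 * x

c = 3.0
h = 1e-4

# Central-difference estimate of d(optimal value)/dc.
rate = (constrained_min(c + h)[1] - constrained_min(c - h)[1]) / (2.0 * h)
lam = multiplier(c)

print(rate, lam)
```

The two numbers agree up to rounding, which is the statement that the multiplier measures the rate of change of the optimal value as the constraint level is shifted.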
In applications, sometimes we are interested in finding stationary points $p$ of $f$ on $M$, defined as points $p$ such that $df(p)$ vanishes on $T_p M$, or equivalently, that the Taylor expansion of $f$ at $p$, under any system of coordinates for $M$, has no terms of first order. The Lagrange multiplier method works for this situation too.
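The following theorem incorporates the more general notion of stationary points.
Theorem 4 Let $N$ be an $n$-dimensional differentiable manifold (without boundary), and let $f \colon N \to \mathbb{R}$ and $g_i \colon N \to \mathbb{R}$, for $i = 1, \ldots, k$, be continuously differentiable. Suppose $p \in N$ satisfies $g_i(p) = 0$ for all $i$, and $dg_i$ are linearly independent at $p$. Then $p$ is a stationary point of $f$ restricted to $M = \bigcap_{i=1}^{k} g_i^{-1}(\{0\})$ if and only if there exist Lagrange multipliers $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$ such that
$$df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p).$$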
In this formulation, $M$ is not necessarily a manifold, but it is one when intersected with a sufficiently small neighborhood about $p$. So it makes sense to talk about $T_p M$, although we are abusing notation here. The subspace in question can be more accurately described as the annihilated subspace of $\operatorname{span}\{dg_1(p), \ldots, dg_k(p)\}$.
It is also enough that $dg_i$ be linearly independent only at the point $p$. For $dg_i$ are continuous, so they will be linearly independent at points near $p$ anyway; we may then restrict our viewpoint to a sufficiently small neighborhood around $p$, and the proofs carry through.
The proof involves only simple modifications to that of Theorem 1. For instance, the converse implication follows because we have already proved that the $dg_i(p)$ form a basis for the annihilator of $T_p M$, independently of whether or not $p$ is a stationary point of $f$ on $M$.
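To see that the multiplier condition picks out stationary points and not only extrema, consider the following sketch. It uses a hypothetical example chosen for this purpose: $f(x, y) = y + x^3$ with the single constraint $g(x, y) = y = 0$. At the origin $\nabla f = (0, 1) = 1 \cdot \nabla g$, so the condition holds with $\lambda = 1$, yet $f$ restricted to the constraint line is $x^3$, which has neither a minimum nor a maximum there.

```python
def f(x, y):
    # Illustrative objective; restricted to y = 0 it equals x^3.
    return y + x ** 3

def grad_f(x, y):
    return (3.0 * x ** 2, 1.0)

def grad_g(x, y):
    # Gradient of the illustrative constraint g(x, y) = y.
    return (0.0, 1.0)

p = (0.0, 0.0)
lam = 1.0

# The multiplier equation grad f(p) = lam * grad g(p) holds exactly at p.
residual = tuple(a - lam * b for a, b in zip(grad_f(*p), grad_g(*p)))

# Yet along the constraint set y = 0, f takes values on both sides of f(p).
eps = 1e-3
below = f(-eps, 0.0)
above = f(eps, 0.0)

print(residual, below, f(*p), above)
```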
"Lagrange multipliers on manifolds" is owned by stevecheng.
This is version 21 of Lagrange multipliers on manifolds, born on 2005-07-27, modified 2007-07-08.
Classification: AMS MSC 49-00 (Calculus of variations and optimal control; optimization :: General reference works); 58C05 (Global analysis, analysis on manifolds :: Calculus on manifolds; nonlinear operators :: Real-valued functions)