calculus of variations


Imagine a bead of mass m on a wire whose endpoints are at a = (0,0) and b = (x_f, y_f), with y_f lower than the starting position. If gravity acts on the bead with force F = mg, what path (arrangement of the wire) minimizes the bead’s travel time from a to b, assuming no friction?

This is the famed brachistochrone problem, and its solution was one of the first accomplishments of the calculus of variations. Many minimization problems can be solved using the techniques introduced here.

In its general form, the calculus of variations concerns quantities

$$S[q,\dot q,t] = \int_a^b L(q(t),\dot q(t),t)\,dt \qquad (1)$$

for which we wish to find a minimum or a maximum.

To make this concrete, let’s consider a much simpler problem than the brachistochrone: what’s the shortest distance between two points p = (x_1, y_1) and q = (x_2, y_2)? Let the variable s represent distance along the path, so that the total path length from p to q is S = ∫ ds. We wish to find the path such that S is a minimum. Zooming in on a small portion of the path, we can see that

$$ds^2 = dx^2 + dy^2 \qquad (2)$$
$$ds = \sqrt{dx^2 + dy^2} \qquad (3)$$

If we parameterize the path by t, then we have

$$ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \qquad (4)$$

Let’s assume y=f(x), so that we may simplify (4) to

$$ds = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = \sqrt{1 + f'(x)^2}\,dx. \qquad (5)$$

Now we have

$$S = \int_p^q L\,dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2}\,dx \qquad (6)$$

In this case, L is particularly simple. Converting to q’s and t’s to make the comparison easier, we have L = L[f'(x)] = L[q˙(t)], not the more general L[q(t), q˙(t), t] covered by the calculus of variations. We’ll see later how to use our L’s simplicity to our advantage. For now, let’s talk more generally.
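
Before moving on, a quick numerical illustration of (6) may help. The following is a minimal sketch (assuming Python with NumPy and SciPy available; the helper name path_length is ours, purely for illustration) that evaluates the functional S for two candidate paths between the same endpoints:

```python
import numpy as np
from scipy.integrate import quad

def path_length(f_prime, x1, x2):
    """Evaluate S[f] = integral from x1 to x2 of sqrt(1 + f'(x)^2) dx, as in (6)."""
    integrand = lambda x: np.sqrt(1.0 + f_prime(x) ** 2)
    S, _ = quad(integrand, x1, x2)
    return S

# Two paths from (0, 0) to (1, 1): a straight line and a parabolic arc.
print(path_length(lambda x: 1.0, 0.0, 1.0))      # f(x) = x,   length ~1.4142 = sqrt(2)
print(path_length(lambda x: 2.0 * x, 0.0, 1.0))  # f(x) = x^2, length ~1.4789, longer
```

As expected, the straight line is assigned the smaller value of S.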

We wish to find the path described by L, passing through a point q(a) at t=a and through q(b) at t=b, for which the quantity S is a minimum; more precisely, we seek a path for which small variations in the path produce no first-order change in S, which we’ll call a “stationary point.” This is directly analogous to the idea that for a function f(t), the minimum can be found where a small change δt produces no first-order change in f(t). That is where f(t+δt) ≈ f(t); taking a Taylor series expansion of f(t) at t, we find

$$f(t+\delta t) = f(t) + \delta t\, f'(t) + O(\delta t^2) = f(t), \qquad (7)$$

with f'(t) := (d/dt) f(t). Of course, since the whole point is to consider δt → 0, once we neglect terms of O(δt^2) this is just the point where f'(t) = 0. This point, call it t = t_0, could be a minimum or a maximum, so in the usual calculus of a single variable we’d proceed by taking the second derivative, f''(t_0), and checking whether it is positive or negative to determine whether the function has a minimum or a maximum at t_0, respectively.
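
As a reminder of how this works in the single-variable case, here is a small symbolic sketch (assuming SymPy is available; the particular function is ours, chosen only for illustration):

```python
import sympy as sp

t = sp.symbols('t')
f = (t - 2) ** 2 + 1                     # an illustrative function, not from the text above

t0 = sp.solve(sp.diff(f, t), t)[0]       # stationary point: f'(t0) = 0
curvature = sp.diff(f, t, 2).subs(t, t0)

print(t0)          # 2
print(curvature)   # 2 > 0, so t0 is a minimum
```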

In the calculus of variations, we’re not considering small changes in t; we’re considering small variations of the whole path appearing in the integral of the relatively complicated function L(q, q˙, t), where q˙ = (d/dt) q(t). Also, S is a functional, and we can think of the minimization problem as the discovery of a minimum in S-space as we jiggle the parameters q and q˙.

For the shortest-distance problem, it’s clear the maximum doesn’t exist, since for any finite path length S_0 we can (intuitively) always find a curve whose length is greater than S_0. This is often true, and we’ll assume for this discussion that finding a stationary point means we’ve found a minimum.
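
To make the “jiggle the path” picture concrete before formalizing it, here is a minimal numerical sketch (assuming NumPy and SciPy; the discretization and names are ours): we represent a path by its heights at interior grid points, take S to be the total length of the resulting polyline, and let a generic optimizer jiggle those heights.

```python
import numpy as np
from scipy.optimize import minimize

# Endpoints (x1, y1) = (0, 0) and (x2, y2) = (1, 2); the interior heights are free.
x = np.linspace(0.0, 1.0, 21)
y_a, y_b = 0.0, 2.0

def S(y_interior):
    """Discretized path length: sum of straight segments through the grid points."""
    y = np.concatenate(([y_a], y_interior, [y_b]))
    return np.sum(np.sqrt(np.diff(x) ** 2 + np.diff(y) ** 2))

y0 = np.random.default_rng(0).uniform(-1.0, 3.0, size=len(x) - 2)  # a badly jiggled start
res = minimize(S, y0)

print(res.fun)                              # ~2.2361 = sqrt(5), the straight-line length
print(np.abs(res.x - 2.0 * x[1:-1]).max())  # small: the optimal heights lie on y = 2x
```

The optimizer settles on the straight line between the endpoints, which is exactly the stationary point we derive analytically below.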

Formally, we write the condition that small variations in the path produce no first-order change in S as δS = 0. To make this precise, we simply write

\begin{align*}
\delta S &:= S[q+\delta q,\,\dot q+\delta\dot q,\,t] - S[q,\dot q,t] \\
&= \int_a^b L(q+\delta q,\,\dot q+\delta\dot q)\,dt - S[q,\dot q,t]
\end{align*}

How are we to simplify this mess? We are considering small variations to the path, which suggests a Taylor series expansion of L(q+δq, q˙+δq˙) about (q, q˙):

$$L(q+\delta q,\,\dot q+\delta\dot q) = L(q,\dot q) + \delta q\,\frac{\partial}{\partial q}L(q,\dot q) + \delta\dot q\,\frac{\partial}{\partial\dot q}L(q,\dot q) + O(\delta q^2) + O(\delta\dot q^2)$$

and since we make little error by discarding higher-order terms in δq and δq˙, we have

$$\int_a^b L(q+\delta q,\,\dot q+\delta\dot q)\,dt = S[q,\dot q,t] + \int_a^b \left[\delta q\,\frac{\partial}{\partial q}L(q,\dot q) + \delta\dot q\,\frac{\partial}{\partial\dot q}L(q,\dot q)\right] dt$$

Keeping in mind that δq˙ = (d/dt) δq and noting that

$$\frac{d}{dt}\left(\delta q\,\frac{\partial}{\partial\dot q}L(q,\dot q)\right) = \delta q\,\frac{d}{dt}\frac{\partial}{\partial\dot q}L(q,\dot q) + \delta\dot q\,\frac{\partial}{\partial\dot q}L(q,\dot q),$$

a simple application of the product rule (d/dt)(fg) = f˙g + fg˙, which allows us to substitute

$$\delta\dot q\,\frac{\partial}{\partial\dot q}L(q,\dot q) = \frac{d}{dt}\left(\delta q\,\frac{\partial}{\partial\dot q}L(q,\dot q)\right) - \delta q\,\frac{d}{dt}\frac{\partial}{\partial\dot q}L(q,\dot q),$$

we can rewrite the integral, shortening L(q,q˙) to L for convenience, as:

\begin{align*}
\int_a^b \left[\delta q\,\frac{\partial L}{\partial q} + \delta\dot q\,\frac{\partial L}{\partial\dot q}\right] dt
&= \int_a^b \left[\delta q\,\frac{\partial L}{\partial q} - \delta q\,\frac{d}{dt}\frac{\partial L}{\partial\dot q} + \frac{d}{dt}\left(\delta q\,\frac{\partial L}{\partial\dot q}\right)\right] dt \\
&= \int_a^b \delta q\left[\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right] dt + \left.\delta q\,\frac{\partial L}{\partial\dot q}\right|_a^b
\end{align*}

Substituting all of this progressively back into our original expression for δS, we obtain

\begin{align*}
\delta S &= \int_a^b L(q+\delta q,\,\dot q+\delta\dot q)\,dt - S[q,\dot q,t] \\
&= S + \int_a^b \left[\delta q\,\frac{\partial L}{\partial q} + \delta\dot q\,\frac{\partial L}{\partial\dot q}\right] dt - S \\
&= \int_a^b \delta q\left[\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right] dt + \left.\delta q\,\frac{\partial L}{\partial\dot q}\right|_a^b = 0.
\end{align*}

Two conditions come to our aid. First, we’re only interested in the neighboring paths that still begin at a and end at b, which corresponds to the condition δq = 0 at a and b; this lets us cancel the final term. Second, between those two points, we’re interested in the paths which do vary, for which δq ≠ 0. This leads us to the condition

$$\int_a^b \delta q\left[\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right] dt = 0. \qquad (8)$$

The fundamental theorem of the calculus of variations states that, for a continuous function f(t),

$$\int_a^b f(t)\,g(t)\,dt = 0 \ \text{ for every continuous } g(t) \text{ with } g(a)=g(b)=0 \quad\Longrightarrow\quad f(t) = 0 \ \ \forall\, t \in (a,b). \qquad (9)$$

Since δq is just such an arbitrary variation vanishing at the endpoints, applying this theorem to (8) we obtain

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\left(\frac{\partial L}{\partial\dot q}\right) = 0. \qquad (10)$$

This condition, one of the fundamental equations of the calculus of variations, is called the Euler–Lagrange condition. When presented with a problem in the calculus of variations, the first thing one usually does is to plug the problem’s L into this equation and try to solve it.
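
For routine cases, this “plug in L” step can even be automated. Here is a hedged sketch using SymPy’s euler_equations helper (assuming a reasonably recent SymPy), applied to the shortest-path Lagrangian from (6) as a preview of the calculation below:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
f = sp.Function('f')

L = sp.sqrt(1 + f(x).diff(x) ** 2)        # the shortest-path Lagrangian from (6)
eq = euler_equations(L, [f(x)], [x])[0]   # the Euler-Lagrange condition (10) for this L

print(sp.simplify(eq))   # an equation equivalent to f''(x) = 0: the extremal is a straight line
```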

Recall our shortest-path problem, where we had arrived at

$$S = \int_{x_1}^{x_2} L\,dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2}\,dx. \qquad (11)$$

Here, x takes the place of t, f takes the place of q, and (10) becomes

$$\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0 \qquad (12)$$

Even with ∂L/∂f = 0, this is still ugly. However, because L has no explicit dependence on the independent variable x, we can use the Beltrami identity,

$$L - \dot q\,\frac{\partial L}{\partial\dot q} = C. \qquad (13)$$

(For the derivation of this useful little trick, see the corresponding entry.) Now we must simply solve

$$\sqrt{1 + f'(x)^2} - f'(x)\,\frac{\partial L}{\partial f'} = C \qquad (14)$$

which looks just as daunting, but quickly reduces to

\begin{align*}
\sqrt{1+f'(x)^2} - f'(x)\cdot\frac{1}{2}\cdot\frac{2 f'(x)}{\sqrt{1+f'(x)^2}} &= C \qquad (15) \\
\sqrt{1+f'(x)^2} - \frac{f'(x)^2}{\sqrt{1+f'(x)^2}} &= C \qquad (16) \\
\frac{1}{\sqrt{1+f'(x)^2}} &= C \qquad (17) \\
f'(x) &= \sqrt{\frac{1}{C^2} - 1} = m. \qquad (18)
\end{align*}

That is, the slope of the curve representing the shortest path between two points is a constant, which means the curve we are after, i.e. the extremal of this variational problem, must be a straight line. Through this lengthy process, we’ve shown that the shortest path between two points is a straight line.
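
The algebra in (14)–(18) is also easy to check symbolically. A minimal sketch with SymPy (the symbol fp, standing in for f'(x), is ours):

```python
import sympy as sp

fp = sp.symbols('fp', positive=True)             # fp stands for f'(x)

L = sp.sqrt(1 + fp ** 2)                         # the Lagrangian depends only on f'
beltrami = sp.simplify(L - fp * sp.diff(L, fp))  # left-hand side of the Beltrami identity (13)

print(beltrami)                                        # simplifies to 1/sqrt(fp**2 + 1), as in (17)
print(sp.simplify(beltrami - 1 / sp.sqrt(1 + fp**2)))  # 0, confirming the reduction
```

Setting this expression equal to the constant C then forces fp, i.e. the slope, to be a constant, just as in (18).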

To find the actual function f(x) given endpoints (x_1, y_1) and (x_2, y_2), simply integrate f'(x) with respect to x:

$$f(x) = \int f'(x)\,dx = \int m\,dx = mx + d \qquad (19)$$

and then apply the boundary conditions

\begin{align*}
f(x_1) &= y_1 = m x_1 + d \qquad (20) \\
f(x_2) &= y_2 = m x_2 + d \qquad (21)
\end{align*}

Subtracting the first condition from the second, we get m = (y_2 - y_1)/(x_2 - x_1), the standard equation for the slope of a line. Solving for d = y_1 - m x_1, we get

$$f(x) = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1) + y_1 \qquad (22)$$

which is the basic equation for a line passing through (x_1, y_1) and (x_2, y_2).
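
If one wants the coefficients numerically, the boundary-condition step is a two-liner. A tiny sketch (the helper name line_through is ours):

```python
def line_through(p1, p2):
    """Slope m and intercept d of the line f(x) = m*x + d through p1 and p2, per (20)-(22)."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)   # subtract (20) from (21)
    d = y1 - m * x1             # solve (20) for d
    return m, d

print(line_through((0.0, 0.0), (2.0, 3.0)))   # (1.5, 0.0)
```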

The solution to the brachistochrone problem, while slightly more complicated, follows along exactly the same lines.
