calculus of variations (Topic)

Imagine a bead of mass $m$ on a wire whose endpoints are at $a = (0,0)$ and $b = (x_f, y_f)$ , with $y_f$ lower than the starting position. If gravity acts on the bead with force $F = m g$ , what path (arrangement of the wire) minimizes the bead's travel time from $a$ to $b$ , assuming no friction?

This is the famed ``brachistochrone problem,'' and its solution was one of the first accomplishments of the calculus of variations. Many minimum problems can be solved using the techniques introduced here.

In its general form, the calculus of variations concerns quantities \begin{equation} S[q,\dot{q}, t] = \int_{a}^{b} L(q(t),\dot{q}(t), t)\, dt \end{equation}for which we wish to find a minimum or a maximum.

To make this concrete, let's consider a much simpler problem than the brachistochrone: what's the shortest path between two points $p = (x_1,y_1)$ and $q = (x_2,y_2)$ ? Let the variable $s$ represent distance along the path, so that $\int_{p}^{q} ds = S$ . We wish to find the path such that $S$ is a minimum. Zooming in on a small portion of the path, we can see that

\begin{align} ds^2 &= dx^2 + dy^2 \\ ds &= \sqrt{dx^2 + dy^2} \end{align}

If we parameterize the path by $t$ , then we have \begin{equation} ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\ dt \end{equation} Let's assume $y = f(x)$ , so that we may take $x$ itself as the parameter and simplify this to \begin{equation} ds = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\ dx = \sqrt{1 + f'(x)^2}\ dx. \end{equation} Now we have \begin{equation} S = \int_{p}^{q} L\ dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2}\ dx \end{equation}In this case, $L$ is particularly simple. Converting to $q$ 's and $t$ 's to make the comparison easier, we have $L = L[f'(x)] = L[\dot{q}(t)]$ , not the more general $L[q(t), \dot{q}(t), t]$ covered by the calculus of variations. We'll see later how to use our $L$ 's simplicity to our advantage. For now, let's talk more generally.
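The length functional $S = \int \sqrt{1 + f'(x)^2}\ dx$ can be approximated numerically by summing short chords along the curve. The following is only a minimal sketch: the function name `path_length`, the step count `n`, and the two sample curves are illustrative assumptions, not part of the entry.

```python
import math

def path_length(f, x1, x2, n=10_000):
    """Approximate S = integral of sqrt(1 + f'(x)^2) dx by summing chord lengths."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        xa = x1 + i * h
        xb = xa + h
        # Each chord has length sqrt(dx^2 + dy^2), the discrete form of ds.
        total += math.hypot(xb - xa, f(xb) - f(xa))
    return total

# Two hypothetical curves joining (0, 0) and (1, 1).
def straight(x):
    return x

def bumped(x):
    return x + 0.2 * math.sin(math.pi * x)

print("straight:", path_length(straight, 0.0, 1.0))
print("bumped:  ", path_length(bumped, 0.0, 1.0))
```

Under these assumptions the straight curve should come out close to $\sqrt{2}$ and the bumped curve somewhat longer, which is the behaviour the variational argument is meant to explain.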

We wish to find the path $q(t)$ , passing through a point $q(a)$ at $t=a$ and through $q(b)$ at $t=b$ , for which the quantity $S$ is a minimum. At such a path, small perturbations produce no first-order change in $S$ ; we'll call this a ``stationary point.'' This is directly analogous to the idea that for a function $f(t)$ , the minimum can be found where small perturbations $\delta t$ produce no first-order change in $f(t)$ . This is where $f(t + \delta t) \approx f(t)$ ; taking a Taylor series expansion of $f(t)$ at $t$ , we find \begin{equation} f(t + \delta t) = f(t) + \delta t f'(t) + O({\delta t}^2) = f(t), \end{equation}with $f'(t) := \frac{d}{dt}f(t)$ . Of course, since the whole point is to consider $\delta t \neq 0$ , once we neglect terms $O({\delta t}^2)$ this is just the point where $f'(t) = 0$ . This point, call it $t = t_0$ , could be a minimum or a maximum, so in the usual calculus of a single variable we'd proceed by taking the second derivative, $f''(t_0)$ , and seeing if it's positive or negative to see whether the function has a minimum or a maximum at $t_0$ , respectively.
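The single-variable picture is easy to check numerically. The sketch below uses a hypothetical example function $f(t) = (t-1)^2 + 3$ (chosen for illustration, not taken from the entry) and compares the change $f(t+\delta t) - f(t)$ , divided by $\delta t$ , at the stationary point $t = 1$ and at an ordinary point $t = 2$ .

```python
def f(t):
    # Hypothetical example function with a minimum at t = 1.
    return (t - 1.0) ** 2 + 3.0

def change(t, dt):
    """Change in f produced by a small perturbation dt."""
    return f(t + dt) - f(t)

for dt in (1e-1, 1e-2, 1e-3):
    at_stationary = change(1.0, dt) / dt   # shrinks with dt: no first-order change
    at_ordinary = change(2.0, dt) / dt     # tends to f'(2) = 2
    print(dt, at_stationary, at_ordinary)
```

At the stationary point the ratio shrinks along with $\delta t$ , while at the ordinary point it settles near the nonzero derivative.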

In the calculus of variations, we're not considering small perturbations in $t$ --we're considering small perturbations in the integral of the relatively complicated function $L(q,\dot{q}, t)$ , where $\dot{q} = \frac{d}{dt}q(t)$ . Also, $S$ is a functional, and we can think of the minimization problem as the discovery of a minimum in $S$ -space as we jiggle the parameters $q$ and $\dot{q}$ .

For the shortest-distance problem, it's clear that a maximum length doesn't exist, since for any finite path length $S_0$ we (intuitively) can always find a curve for which the path's length is greater than $S_0$ . This is often true, and we'll assume for this discussion that finding a stationary point means we've found a minimum.

Formally, we write the condition that small parameter perturbations produce no change in $S$ as $\delta S = 0$ . To make this precise, we simply write:

\begin{align*} \delta S &:= S[q + \delta q,\ \dot{q} + \delta \dot{q},\ t] - S[q, \dot{q},t] \\ &= \int_{a}^{b} L(q + \delta q,\ \dot{q} + \delta \dot{q})\, dt - S[q, \dot{q},t] \end{align*}

How are we to simplify this mess? We are considering small perturbations to the path, which suggests a Taylor series expansion of $L(q + \delta q,\dot{q} + \delta \dot{q})$ about $(q, \dot{q})$ : \begin{equation*} L(q + \delta q,\dot{q} + \delta \dot{q}) = L(q,\dot{q}) + \delta q \frac{\partial}{\partial q}L(q,\dot{q}) + \delta \dot{q} \frac{\partial}{\partial \dot{q}}L(q,\dot{q}) + O(\delta q^2) + O(\delta \dot{q}^2) \end{equation*}and since we make little error by discarding higher-order terms in $\delta q$ and $\delta \dot{q}$ , we have \begin{equation*} \int_{a}^{b} L(q + \delta q,\dot{q} + \delta \dot{q})\, dt = S[q, \dot{q}, t] + \int_{a}^{b} \left[ \delta q \frac{\partial}{\partial q}L(q,\dot{q}) + \delta \dot{q} \frac{\partial}{\partial \dot{q}}L(q,\dot{q}) \right] dt \end{equation*}Keeping in mind that $\delta \dot{q} = \frac{d}{dt}\delta q$ and noting that
\begin{equation*} \frac{d}{dt}\left(\delta q \frac{\partial}{\partial \dot{q}}L(q,\dot{q})\right) = \delta q \frac{d}{dt}\frac{\partial}{\partial \dot{q}}L(q,\dot{q}) + \delta \dot{q} \frac{\partial}{\partial \dot{q}}L(q,\dot{q}), \end{equation*}

a simple application of the product rule $\frac{d}{dt}(fg) = \dot{f}g + f\dot{g}$ , which allows us to substitute
\begin{equation*} \delta \dot{q} \frac{\partial}{\partial \dot{q}}L(q,\dot{q}) = \frac{d}{dt}\left(\delta q \frac{\partial}{\partial \dot{q}}L(q,\dot{q})\right) - \delta q \frac{d}{dt}\frac{\partial}{\partial \dot{q}}L(q,\dot{q}), \end{equation*}

we can rewrite the integral, shortening $L(q,\dot{q})$ to $L$ for convenience, as:
\begin{align*} \int_{a}^{b} \left[ \delta q \frac{\partial}{\partial q} L + \delta \dot{q} \frac{\partial}{\partial \dot{q}} L \right] dt &= \int_{a}^{b} \left[ \delta q \frac{\partial}{\partial q} L - \delta q \frac{d}{dt}\frac{\partial}{\partial \dot{q}} L + \frac{d}{dt}\left(\delta q \frac{\partial}{\partial \dot{q}} L \right) \right] dt \\ &= \int_{a}^{b} \delta q \left[ \frac{\partial}{\partial q} L - \frac{d}{dt}\frac{\partial}{\partial \dot{q}} L \right] dt + \delta q \frac{\partial}{\partial \dot{q}} L \Big\vert_{a}^{b} \end{align*}

Substituting all of this progressively back into our original expression for $\delta S$ , we obtain
\begin{align*} \delta S &= \int_{a}^{b} L(q + \delta q,\dot{q} + \delta \dot{q})\, dt - S[q, \dot{q}, t] \\ &= S + \int_{a}^{b} \left[\delta q \frac{\partial}{\partial q}L + \delta \dot{q} \frac{\partial}{\partial \dot{q}}L \right] dt - S \\ &= \int_{a}^{b} \delta q \left[ \frac{\partial}{\partial q} L - \frac{d}{dt}\frac{\partial}{\partial \dot{q}} L \right] dt + \delta q \frac{\partial}{\partial \dot{q}} L \Big\vert_{a}^{b} = 0. \end{align*}

Two conditions come to our aid. First, we're only interested in the neighboring paths that still begin at $q(a)$ and end at $q(b)$ , which corresponds to the condition $\delta q = 0$ at $t = a$ and $t = b$ , which lets us cancel the final term. Second, between those two points the perturbation $\delta q$ is arbitrary. This leads us to the condition \begin{equation} \int_{a}^{b} \delta q \left[ \frac{\partial}{\partial q} L - \frac{d}{dt}\frac{\partial}{\partial \dot{q}} L\right] dt = 0 \end{equation}for every admissible $\delta q$ . The fundamental lemma of the calculus of variations (sometimes called its fundamental theorem) states that if $f(t)$ is continuous on $[a,b]$ and \begin{equation} \int_{a}^{b} f(t) g(t)\, dt = 0 \end{equation}for every smooth function $g(t)$ with $g(a) = g(b) = 0$ , then $f(t) = 0$ for all $t \in (a,b)$ . Using this lemma, with $\delta q$ in the role of $g$ , we obtain \begin{equation} \frac{\partial}{\partial q} L - \frac{d}{dt}\left(\frac{\partial}{\partial \dot{q}} L\right) = 0. \end{equation} This condition, one of the fundamental equations of the calculus of variations, is called the Euler-Lagrange condition. When presented with a problem in the calculus of variations, the first thing to try is usually to plug the problem's $L$ into this equation and solve.
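The claim that $\delta S$ vanishes to first order at a stationary path can be probed numerically. The sketch below is an illustration under stated assumptions: it takes $L = \sqrt{1 + \dot{q}^2}$ on $[0,1]$ , uses a hypothetical perturbation $\eta(t) = \sin(\pi t)$ that vanishes at both endpoints, and compares $\left(S[q + \epsilon\eta] - S[q]\right)/\epsilon$ for a straight line and for a parabola with the same endpoints. The names `action`, `eta` and `first_order_ratio` are made up for this example.

```python
import math

def action(q, a, b, n=2000):
    """Approximate S[q] = integral of sqrt(1 + qdot^2) dt with a chord sum."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * h
        t1 = t0 + h
        total += math.hypot(h, q(t1) - q(t0))
    return total

def eta(t):
    # Hypothetical perturbation shape; eta(0) = eta(1) = 0 keeps the endpoints fixed.
    return math.sin(math.pi * t)

def first_order_ratio(q, eps):
    """(S[q + eps*eta] - S[q]) / eps, which tends to 0 only at a stationary path."""
    def perturbed(t):
        return q(t) + eps * eta(t)
    return (action(perturbed, 0.0, 1.0) - action(q, 0.0, 1.0)) / eps

def line(t):
    return t          # satisfies the Euler-Lagrange condition for this L

def parabola(t):
    return t * t      # same endpoints, but not a stationary path

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, first_order_ratio(line, eps), first_order_ratio(parabola, eps))
```

For the line the ratio should shrink roughly in proportion to $\epsilon$ , while for the parabola it should approach a nonzero constant, reflecting a nonvanishing first variation.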

Recall our shortest-path problem, where we had arrived at \begin{equation} S = \int_{p}^{q} L\ dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2}\ dx. \end{equation}Here, $x$ takes the place of $t$ , $f$ takes the place of $q$ , and the Euler-Lagrange condition becomes \begin{equation} \frac{\partial}{\partial f} L - \frac{d}{dx}\frac{\partial}{\partial f'} L = 0 \end{equation}Even with $\frac{\partial}{\partial f} L = 0$ , this is still ugly. However, because $L$ has no explicit dependence on the independent variable $x$ (that is, $\frac{\partial}{\partial x} L = 0$ ), we can use the Beltrami identity, \begin{equation} L - q' \frac{\partial}{\partial q'} L = C. \end{equation}(For the derivation of this useful little trick, see the corresponding entry.) Now we must simply solve \begin{equation} \sqrt{1 + f'(x)^2} - f'(x) \frac{\partial}{\partial f'} L = C \end{equation}which looks just as daunting, but quickly reduces to

\begin{align} \sqrt{1 + f'(x)^2} - f'(x)\frac{\frac{1}{2}2f'(x)}{\sqrt{1 + f'(x)^2}} &= C \\ \frac{1 + f'(x)^2 - f'(x)^2}{\sqrt{1 + f'(x)^2}} &= C \\ \frac{1}{\sqrt{1 + f'(x)^2}} &= C \\ f'(x) &= \pm\sqrt{\frac{1}{C^2} - 1} = m. \end{align}

That is, the slope of the curve representing the shortest path between two points is a constant, which means the curve must be a straight line. Through this lengthy process, we've proved that a straight line is the shortest distance between two points.
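The algebra in the reduction above can be spot-checked numerically. The sketch below, a rough check rather than a proof, evaluates the left-hand side of the Beltrami identity for $L = \sqrt{1 + p^2}$ (with $p$ standing in for $f'$ ) using a central finite difference for $\partial L / \partial p$ , and compares it with $1/\sqrt{1 + p^2}$ . The helper names and the step size `h` are assumptions made for this example.

```python
import math

def L(p):
    """Integrand of the arc-length functional as a function of the slope p = f'."""
    return math.sqrt(1.0 + p * p)

def dL_dp(p, h=1e-6):
    # Central finite-difference approximation to the partial derivative dL/dp.
    return (L(p + h) - L(p - h)) / (2.0 * h)

def beltrami_lhs(p):
    """Left-hand side of the Beltrami identity, L - p * dL/dp."""
    return L(p) - p * dL_dp(p)

for p in (0.0, 0.5, 2.0):
    print(p, beltrami_lhs(p), 1.0 / math.sqrt(1.0 + p * p))
```

The two printed columns should agree to several decimal places. Since $1/\sqrt{1 + p^2}$ can only stay constant along the curve if $p$ itself is constant, this matches the conclusion that the slope does not change.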

To find the actual function $f(x)$ given endpoints $(x_1,y_1)$ and $(x_2,y_2)$ , simply integrate with respect to $x$ : \begin{equation} f(x) = \int f'(x)\, dx = \int m\, dx = mx + d \end{equation}and then apply the boundary conditions

\begin{align} f(x_1) &= y_1 = mx_1 + d \\ f(x_2) &= y_2 = mx_2 + d \end{align}

Subtracting the first condition from the second, we get $m = \frac{y_2 - y_1}{x_2 - x_1}$ , the standard equation for the slope of a line. Solving for $d = y_1 - mx_1$ , we get \begin{equation} f(x) = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1) + y_1 \end{equation}which is the basic equation for a line passing through $(x_1,y_1)$ and $(x_2,y_2)$ .
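As a small worked instance of applying the boundary conditions, the sketch below computes $m$ and $d$ from two endpoints. The function name `line_through` and the sample points are hypothetical, and the vertical case $x_1 = x_2$ is rejected because $y = f(x)$ was assumed from the start.

```python
def line_through(x1, y1, x2, y2):
    """Return slope m and intercept d of f(x) = m*x + d through two points."""
    if x1 == x2:
        # A vertical segment cannot be written as y = f(x).
        raise ValueError("x1 and x2 must differ")
    m = (y2 - y1) / (x2 - x1)
    d = y1 - m * x1
    return m, d

# Hypothetical endpoints (1, 2) and (3, 6).
m, d = line_through(1.0, 2.0, 3.0, 6.0)
print(m, d)
```

For these sample endpoints the slope is 2 and the intercept is 0, so $f(x) = 2x$ passes through both points.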

The solution to the brachistochrone problem, while slightly more complicated, follows along exactly the same lines.




"calculus of variations" is owned by rspuzio. [ full author list (3) | owner history (3) ]
See Also: Taylor series, linear functional, Beltrami identity, Euler-Lagrange differential equation (elementary), fundamental lemma of calculus of variations


Attachments:
stationary point (Definition) by matte


This is version 9 of calculus of variations, born on 2002-02-16, modified 2006-01-19.
Object id is 1995, canonical name is CalculusOfVariations.

Classification:
AMS MSC47A60 (Operator theory :: General theory of linear operators :: Functional calculus)

Discussion
5 identical messages by mathforever on 2005-03-21 08:48:42
One can see here (the messages attached to the entry "calculus of variations") that there are 5 messages with identical contents. Probably it was because the post button was 5 times pushed. So may be it is not bad idea to leave only one of them and rest to delete. If this can be done than this message is also not needed and should be deleted as well.

Regards
Serg.

-------------------------------
knowledge can become a science
only with a help of mathematics
Beltrami Identity by GrassyKnoll_1963 on 2004-12-01 21:15:38
I am not sure if your precondition for using the Beltrami Identity is correct.

You are saying that dL/df = 0 is the precondition for using the Beltrami Identity. Should it be dL/dx = 0 instead?

All the derivatives in this message should be interpreted as partial derivatives.