Calculus of Variations

Calculus of Variations#

The Discrete Picture#

Most presentations of the calculus of variations use the functional formulation, which often requires careful manipulations of the integration by parts formula to derive the corresponding Euler-Lagrange equations. Here we present an alternative perspective afforded by considering the discrete version of the problem.

Consider a function \(f(x)\) represented by the vector \(\vect{f}\): \(f_{n} = f(x_n)\) at a set of equally spaced points \(x_n\) with spacing \(\d{x} = a = x_{n+1} - x_{n}\). The goal is to minimize (or more generally, extremize) some functional

\[\begin{gather*} \min_{f(x)} E[f], \qquad \frac{\delta E[f]}{\delta f(y)} = 0. \end{gather*}\]

In the discrete case, \(E(\vect{f})\) just becomes a scalar valued function of the vector \(\vect{f}\), and the equations of motion simply amount to

\[\begin{gather*} \min_{\vect{f}} E(\vect{f}), \qquad \pdiff{E(\vect{f})}{f_m} = \vect{\nabla}_{m}E(\vect{f}) = 0. \end{gather*}\]

The advantage of this formulation is that, if you are careful, you have a numerically exact minimization problem and corresponding gradient that can be used to solve the system with a stable algorithm like L-BFGS-M which is suitable for large-scale problems.

Of course, to do this, you must carefully specify terms like the derivative \(\dot{\vect{f}} = \mat{D}\vect{f}\) which might enter \(E(\vect{f})\). If using a finite-difference approximation, one might have

\[\begin{gather*} \mat{D}_{d} = \frac{1}{2a}\begin{pmatrix} 0 & 1\\ -1 & 0 & 1\\ & -1 & \ddots & \ddots\\ & & \ddots & 0 & 1\\ & & & -1 & 0\\ \end{pmatrix}, \qquad \mat{D}_{p} = \frac{1}{2a}\begin{pmatrix} 0 & 1 & & & -1\\ -1 & 0 & 1\\ & -1 & \ddots & \ddots\\ & & \ddots & 0 & 1\\ 1& & & -1 & 0\\ \end{pmatrix}. \end{gather*}\]

The resulting equations of motion will have terms with \(\mat{D}^{T} = -\mat{D}\), accounting for the minus sign obtained after integrating by parts. This exact relationship \(\mat{D}^{T} = -\mat{D}\) follows from the explicit choice of boundary conditions – Dirichlet here for \(\mat{D}_{d}\) and periodic for \(\mat{D}_{p}\), and imply vanishing boundary terms in the equations of motion. Other choices of boundary conditions might require corrections to this relation, corresponding to non-vanishing boundary terms in the usual functional derivative.

Example

Consider a general functional of the form

\[\begin{gather*} E[f] = \int_{0}^{L}\mathcal{L}(f, \dot{f}, x)\d{x}. \end{gather*}\]

A discretized version might look like

\[\begin{gather*} E(\vect{f}) = a\sum_{m}\mathcal{L}(f_{m}, \dot{f}_{m}, x_{m}) = a\sum_{m}\mathcal{L}(f_{m}, \sum_{n}D_{mn}f_{n}, x_{m}). \end{gather*}\]

We can now explicitly compute the variation, noting that \(\partial f_m/\partial f_n = \delta_{mn}\):

\[\begin{gather*} \pdiff{E(\vect{f})}{f_{l}} = a\sum_{m}\Biggl( \pdiff{\mathcal{L}(f_{m}, \dot{f}_{m}, x_{m})}{f_{m}} \underbrace{\pdiff{f_{m}}{f_{l}}}_{\delta_{ml}} + \pdiff{\mathcal{L}(f_{m}, \dot{f}_{m}, x_{m})}{\dot{f}_{m}} \underbrace{\pdiff{\dot{f}_{m}}{f_{l}}}_{D_{ml}} \Biggr)\\ = a\Biggl(\pdiff{\mathcal{L}(f_{l}, \dot{f}_{l}, x_{l})}{f_{l}} + \underbrace{\sum_{m}D_{ml} \pdiff{\mathcal{L}(f_{m}, \dot{f}_{m}, x_{m})}{\dot{f}_{m}} }_{\mat{D}^T\cdot(\partial\mathcal{L}/\partial \dot{\vect{f}})} \Biggr). \end{gather*}\]

This is exact: all subtleties about boundary conditions etc. will explicitly contained in the relationship between \(\mat{D}^T \approx -\mat{D}\). If the boundary conditions are such that \(\mat{D}^T = -\mat{D}\) exactly, we obtain the usual Euler-Lagrange equations

\[\begin{gather*} \pdiff{\mathcal{L}(\vect{f}, \dot{\vect{f}}, \vect{x})}{\vect{f}} = \mat{D}\cdot \pdiff{\mathcal{L}(\vect{f}, \dot{\vect{f}}, \vect{x})}{\dot{\vect{f}}}. \end{gather*}\]

Catenary#

What is the shape of a thin rope of linear mass density \(\lambda\) and fixed length \(L_0\) hanging between two hooks? This is a classic problem for calculus of variations. Expressed as an optimization problem, our goal is to minimize the potential energy of the rope while holding the length constant:

\[\begin{gather*} \min_{y(x)} E[y] \quad \Big|\quad [y(x_0), y(x_1)] = [y_0, y_1], \quad \text{and}\quad L[y] = L_0. \end{gather*}\]

The first task is to parameterize the rope. A straightforward approach is to express it as a function \(y(x)\) subject to the boundary conditions \(y(x_0) = y_0\) and \(y(x_1) = y_1\). The length and potential energy are then given by the functionals \(L[y]\) and \(E[y]\) respectively:

\[\begin{gather*} L[y] = \int_{0}^{L}\d{L}, \qquad E[y] = \int_{0}^{M}\d{m}\; gy. \end{gather*}\]

The differentials can be expressed as:

\[\begin{gather*} \d{L} = \sqrt{\d{x}^2 + \d{y}^2} = \sqrt{1 + (y')^2}\d{x}, \qquad \d{m} = \lambda \d{L}, \end{gather*}\]

giving the definite integrals

\[\begin{gather*} L[y] = \int_{x_0}^{x_1}\d{x}\;\sqrt{1+(y')^2}, \qquad E[y] = \lambda g \int_{x_0}^{x_1}\d{x}\; y\sqrt{1+(y')^2}. \end{gather*}\]

Direct Approach#

The constraint can be enforced by using the method of sec:LagrangeMultipliers, thus we look for extremal points of the functional

\[\begin{gather*} I[y] = E[y] - \mu L[y] = \int_{x_0}^{x_1}\d{x}\; \underbrace{(\lambda g y - \mu)\sqrt{1+(y')^2}}_{\mathcal{L}(y, y', x)}. \end{gather*}\]

To simplify this, we write the integrand in terms of the function \(\mathcal{L}(y, y', x)\), which is the analogue of the Lagrangian in classical mechanics.

\[\begin{gather*} I[y] = \int_{x_0}^{x_1}\d{x}\; \mathcal{L}(y, y', x). \end{gather*}\]

The solution to the extremization problem follows the usual approach of calculus of variations, which yields the Euler-Lagrange equation:

\[\begin{gather*} \diff{}{x}\pdiff{\mathcal{L}}{y'} = \pdiff{\mathcal{L}}{y}. \end{gather*}\]

Using our functional form, this gives the following solution for the catenary:

\[\begin{gather*} \diff{}{x}\left(\frac{y'(\lambda g y - \mu)} {\sqrt{1+(y')^2}}\right) = \lambda g \sqrt{1+(y')^2}. \end{gather*}\]

Expanding, then simplifying, we have

\[\begin{gather*} \frac{y''(\lambda g y - \mu) + \lambda g (y')^2}{\sqrt{1+(y')^2}} - \frac{(y')^2y''(\lambda g y - \mu)}{\sqrt{1+(y')^2}^3}= \lambda g \sqrt{1+(y')^2},\\ y''(\lambda g y - \mu) = \lambda g \frac{1 + (y')^2}{1 - (y')^2}. \end{gather*}\]

Expressing, \(y'' = \d{y'}/d{y} y'\), this separates:

\[\begin{gather*} \int y' \frac{1 - (y')^2}{1 + (y')^2}\d{y'} = \int\frac{\lambda g}{\lambda g y - \mu}\d{y}. \end{gather*}\]

However, even after solving this, we still have another integration to perform. Maybe there is a better way…

Alternative Strategies#

Let’s see if some other formulations might have helped us. Consider two expressions of the solution: \(y(x)\) as we have done above with the independent variable \(x\) (the analog of \(t\) in Lagrangian mechancs), and the inverse \(x(y)\) which switches the roll so that \(y\) is the independent variable. We can express the extremization problem in terms of the following two Lagrangians:

\[\begin{align*} \mathcal{L}_x(y, y', x) = (\lambda g y - \mu)\sqrt{1 + (y')^2},\\ \mathcal{L}_y(x, x', y) = (\lambda g y - \mu)\sqrt{1 + (x')^2}. \end{align*}\]

Conservation Laws#

Notice that \(\mathcal{L}_x(y, y', x)\) is independent of \(x\). This is the equivalent of a classical Lagrangian being time-independent, and Noether’s theorem tells us that the corresponding Hamiltonian is conserved:

\[\begin{gather*} H_x = y'\pdiff{\mathcal{L}_x(y, y', x)}{y'} - \mathcal{L}_x(y, y', x) = \text{const}. \end{gather*}\]

Similarly, \(\mathcal{L}_y(x, x', y)\) is independent of \(x\). This makes the conserved quantity a little more obvious since, from the Euler-Lagrange equations:

\[\begin{gather*} \diff{}{y}\left(\pdiff{\mathcal{L}_y(x, x', y)}{x'}\right) = \pdiff{\mathcal{L}_y(x, x', y)}{x} = 0,\\ \pdiff{\mathcal{L}_y(x, x', y)}{x'} = \text{const}. \end{gather*}\]

It turns out that these two conserved quantities are equivalent, and this trick of exchanging the dependent and independent variables works in classical mechanics to derive the conserved Hamiltonian. In any case, it should be pretty apparent that this latter form is simpler to evaluate:

\[\begin{gather*} \pdiff{\mathcal{L}_y(x, x', y)}{x'} = \frac{x'(\lambda g y - \mu)}{\sqrt{1 + (x')^2}} = \text{const}. \end{gather*}\]

Simplifying, and replacing \(x'(y) = 1/y'(x)\), we obtain

\[\begin{gather*} \lambda g y - \mu = c^{-1}\sqrt{1 + (y')^2}, \end{gather*}\]

where \(c\) is a constant.

Note

This is considerably simpler than the direct approach. When presented with a somewhat complicated problem, it can be worth your time to first consider the general form of the problem and try a few general approaches to see if you can find one that might be computationally simpler before delving into the algebraic details.

This is also separable:

\[\begin{gather*} \d{x} = \frac{\d{y}}{\sqrt{c^{2}(\lambda g y - \mu)^2 - 1}}. \end{gather*}\]

The integral here has the form

\[\begin{gather*} \int \frac{1}{\sqrt{ay^2 + by + c}}\d{y} \end{gather*}\]

which can be found in common tables.

Full Solution

Once can also check a partially remembered answer. I recall that the solution for a catenary is somehow related to the hyperbolic functions. Practical experience with hanging ropes suggest it is probably something of the form:

\[\begin{gather*} y(x) = a\cosh(bx + d), \qquad y'(x) = ab\sinh(bx + d). \end{gather*}\]

If \(ab = 1\), then \(1+(y')^2 = 1+\sinh^2(bx+c) = \cosh^2(bx+c)\) which is almost right, but won’t get the constant \(\mu\). We can get this by adding a constant, and a bit more through suggest the following form which has a minimum at \((x_{\min}, y_{\min})\):

\[\begin{gather*} y(x) = a\Bigl(\cosh\bigl((x-x_{\min})/a\bigr) - 1\Bigr) + y_{\min}, \\ y'(x) = \sinh\Bigl((x-x_{\min})/a\Bigr), \\ \begin{aligned} \sqrt{1 + (y')^2} &= \cosh\Bigl((x-x_{\min})/a\Bigr) = \frac{y(x) + y_{\min}}{a} - 1 \\ &= \frac{\lambda g y - \mu}{c}. \end{aligned} \end{gather*}\]

This is exactly the solution if

\[\begin{gather*} a = \frac{c}{\lambda g}, \qquad \frac{y_{\min}}{a} - 1 = -\frac{\mu}{c}. \end{gather*}\]

This has three parameters \(a\), \(x_{\min}\), and \(y_{\min}\), allowing us to satisfy the three constraints \(y(x_0) = y_0\), \(y(x_1) = y_1\), and \(L[y] = L_0\). The length equation becomes:

\[\begin{gather*} L[y] = \int_{x_0}^{x_1}\d{x}\; \sqrt{1+(y')^2} = \int_{x_0}^{x_1}\d{x}\; \cosh\tfrac{x-x_{\min}}{a}\\ = \left.a\sinh\tfrac{x-x_{\min}}{a}\right|_{x_0}^{x_1} = L_0. \end{gather*}\]

The algebra can be simplified considerably by choosing the origin at the minimum \(x_{\min} = 0\), \(y_{\min} = 0\). This implies \(x_0 = -x_1\) and \(y_0 = y_1\):

\[\begin{gather*} c = \mu, \qquad a = \frac{\mu}{\lambda g},\\ L_0 = 2a\sinh\tfrac{x_1}{a} \end{gather*}\]

As a final check, one should make sure that no errors were made multiplying or dividing by zero, and that this is indeed a minimum, not a maximum or saddle point. The latter check is fairly easy on physical grounds if we make sure \(a, x_1 >0\). The maximum energy has a form with \(a, x_1<0\) – an inverted catenary.

As a final check, let’s compute the conserved Hamiltonian from the first approach:

\[\begin{align*} H_x &= \overbrace{\frac{(y')^2(\lambda g y - \mu)}{\sqrt{1 + (y')^2}}}^{y'\partial\mathcal{L}_x/\partial y'} -\overbrace{(\lambda g y - \mu)\sqrt{1 + (y')^2}}^{\mathcal{L}_x} = \text{const},\\ &= \frac{(y')^2 - \Bigl(1 + (y')^2\Bigr)}{\sqrt{1 + (y')^2}}(\lambda g y - \mu),\\ &= \frac{\lambda g y - \mu}{\sqrt{1 + (y')^2}},\\ \end{align*}\]

which gives the same equation as the second approach:

\[\begin{gather*} \lambda g y - \mu = c^{-1}\sqrt{1 + (y')^2}. \end{gather*}\]

Geodesics#

A final application goes back to geometry – the computation of geodesics on a manifold \(M\). Consider a path \(X(t)\) in our coordinate chart corresponding to the path \(x(t)\). We can obtain the length of the path by integrating the speed:

\[\begin{gather*} L(t) = \int_{0}^{t} \braket{V(t)|V(t)}\d{t} = \int_{0}^{t} g_{\alpha\beta}\bigl(X(t)\bigr)V^{\alpha}(t)V^{\beta}(t)\d{t} = \int_{0}^{t} g_{\alpha\beta}\bigl(X(t)\bigr)\dot{X}^{\alpha}(t)\dot{X}^{\beta}(t)\d{t}. \end{gather*}\]

Holding the endpoints and \(t\) fixed, we can view the path-length as a functional of the path:

\[\begin{gather*} L[X] = \int_{0}^{t} \mathcal{L}(X, \dot{X}), \qquad \mathcal{L}(X, \dot{X}) = g_{\alpha\beta}\bigl(X(t)\bigr)\dot{X}^{\alpha}(t)\dot{X}^{\beta}(t). \end{gather*}\]

The geodesic equation is thus the corresponding Euler-Lagrange equation

\[\begin{gather*} \pdiff{\mathcal{L}(X, \dot{X})}{X^{\gamma}} = \diff{}{t}\pdiff{\mathcal{L}(X, \dot{X})}{\dot{X}^{\gamma}},\\ \partial_{\gamma}g_{\alpha\beta}\dot{X}^{\alpha}\dot{X}^{\beta} =\diff{}{t} g_{\alpha\beta}(\dot{X}^{\alpha}\delta^{\beta}_{\gamma} + \delta^{\alpha}_{\gamma}\dot{X}^{\beta}) =\diff{(g_{\alpha\gamma}+g_{\gamma\alpha})\dot{X}^{\alpha}}{t},\\ \Bigl( g_{\alpha\beta,\gamma} - g_{\alpha\gamma,\beta} - g_{\gamma\alpha,\beta} \Bigr)\dot{X}^{\alpha}\dot{X}^{\beta} = (g_{\alpha\gamma} + g_{\gamma\alpha})\ddot{X}^{\alpha}. \end{gather*}\]

Using the symmetry of \(g_{\alpha\beta} = g_{\beta\alpha}\) and the fact that each term of the lhs is symmetric under \(\alpha \leftrightarrow \beta\), we can rewrite this in terms of the Christoffel symbols:

\[\begin{gather*} \underbrace{\frac{1}{2}\Bigl( g_{\alpha\beta,\gamma} - g_{\alpha\gamma,\beta} - g_{\beta\gamma,\alpha} \Bigr)}_{-[\alpha \beta, \gamma]}\dot{X}^{\alpha}\dot{X}^{\beta} = g_{\alpha\gamma}\ddot{X}^{\alpha},\\ \Gamma^{\gamma}_{\alpha\beta}\dot{X}^{\alpha}\dot{X}^{\beta} + \ddot{X}^{\gamma} = 0. \end{gather*}\]