Variational ODEs#

Added in version 5.0.0.

Consider a system of differential equations in the standard form

\[ \frac{d\boldsymbol{x}}{dt} = f\left(\boldsymbol{x}, \boldsymbol{\alpha}, t\right), \]

where \(\boldsymbol{x}\) is the vector of state variables and \(\boldsymbol{\alpha}\) a vector of parameters. For a given set of initial conditions \(\boldsymbol{x}_0\) at time \(t_0\), the solution of this system will be

\[ \boldsymbol{x} = \boldsymbol{x}\left(t, \boldsymbol{x}_0, t_0, \boldsymbol{\alpha} \right). \]

When numerically solving initial-value problems, it is often useful to compute not only the solution, but also its partial derivatives with respect to the initial conditions and/or the parameters. The derivatives with respect to the initial conditions, for instance, are needed for the computation of chaos indicators and for uncertainty propagation, and they can also be used to propagate a small neighborhood in phase space around the initial conditions. The derivatives with respect to the parameters of the system are required when formulating optimisation and inversion problems such as orbit determination, trajectory optimisation and training of neural networks in neural ODEs.

There are two main methods for the computation of the partial derivatives. The first one is based on the application of automatic differentiation (AD) techniques directly to the numerical integration algorithm. This can be done either by replacing the algebra of floating-point numbers with the algebra of (generalised) dual numbers (aka truncated Taylor polynomials), or via differentiable programming techniques. The former approach is used by libraries such as pyaudi, desolver and TaylorIntegration.jl, while differentiable programming is popular in the machine learning community (with projects such as PyTorch, JAX and TensorFlow) and in the Julia programming language community.
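As a toy illustration of the first approach, the following sketch (pure Python, written for this tutorial and not taken from any of the libraries just mentioned) pushes a minimal dual number through ten explicit Euler steps of \(\frac{dx}{dt} = -x\), so that the derivative of the final state with respect to the initial condition is computed directly through the integration algorithm's arithmetic:

```python
class Dual:
    # Minimal forward-mode dual number: a value plus a first derivative.
    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)

    __rmul__ = __mul__


# Integrate dx/dt = -x with 10 explicit Euler steps of size 0.1,
# seeding eps = 1 on the initial condition x0 = 2.
h, x = 0.1, Dual(2.0, 1.0)
for _ in range(10):
    x = x + (-1.0) * x * h  # Euler step, performed on dual numbers

# x.eps now holds dx(t)/dx0 for the discretised scheme (i.e., 0.9**10).
print(x.val, x.eps)
```

Libraries such as pyaudi generalise this idea to truncated Taylor polynomials of arbitrary order in many variables.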

The second method is based on the formulation of the variational equations, that is, differential equations satisfied by the partial derivatives which are added to and solved together with the original ODEs. For instance, we can formulate differential equations for the first-order derivatives with respect to the initial conditions via elementary calculus:

\[ \frac{d}{dt}\frac{\partial \boldsymbol{x}}{\partial \boldsymbol{x}_0} = \frac{\partial }{\partial \boldsymbol{x}_0} \frac{d \boldsymbol{x}}{dt} = \frac{\partial f}{\partial \boldsymbol{x}_0} = \frac{\partial f}{\partial \boldsymbol{x}} \frac{\partial \boldsymbol{x}}{\partial \boldsymbol{x}_0}. \]

The variational ODE system then reads

\[\begin{split} \begin{cases} \frac{d\boldsymbol{x}}{dt} = f\left(\boldsymbol{x}, \boldsymbol{\alpha}, t\right) \\ \frac{d}{dt}\frac{\partial \boldsymbol{x}}{\partial \boldsymbol{x}_0} = \frac{\partial f}{\partial \boldsymbol{x}} \frac{\partial \boldsymbol{x}}{\partial \boldsymbol{x}_0} \end{cases}, \end{split}\]

and the original state vector \(\boldsymbol{x}\) has been extended to include the variational state variables \(\frac{\partial \boldsymbol{x}}{\partial \boldsymbol{x}_0}\).
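To make the variational approach concrete before switching to heyoka.py's API, here is a self-contained pure-Python sketch (with a hand-rolled fixed-step RK4; the step count and displacement sizes are arbitrary choices made for this illustration) that integrates the forced damped pendulum used as the running example below together with its first-order variational equations, and cross-checks one of the variationals against a central finite difference:

```python
import math

B = 0.4  # friction coefficient (the runtime parameter par[0] below)

def rhs(t, s):
    # s = [x, v, p_xx, p_xv, p_vx, p_vv], where e.g. p_xv = dx/dv0.
    x, v, p_xx, p_xv, p_vx, p_vv = s
    jac = -math.cos(x)  # d(dv/dt)/dx; note d(dv/dt)/dv = -B
    return [v,
            math.cos(t) - B * v - math.sin(x),
            p_vx,                   # d/dt (dx/dx0) = dv/dx0
            p_vv,                   # d/dt (dx/dv0) = dv/dv0
            jac * p_xx - B * p_vx,  # variational equation for dv/dx0
            jac * p_xv - B * p_vv]  # variational equation for dv/dv0

def rk4(s0, t0, t1, n):
    # Fixed-step classical Runge-Kutta integrator.
    h, s, t = (t1 - t0) / n, list(s0), t0
    for _ in range(n):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
        t += h
    return s

# Initial conditions for the variational variables: the identity matrix.
sol = rk4([0.2, 0.3, 1.0, 0.0, 0.0, 1.0], 0.0, 3.0, 3000)

# Cross-check dx/dx0 against a central finite difference.
eps = 1e-6
x_hi = rk4([0.2 + eps, 0.3, 1, 0, 0, 1], 0.0, 3.0, 3000)[0]
x_lo = rk4([0.2 - eps, 0.3, 1, 0, 0, 1], 0.0, 3.0, 3000)[0]
print(sol[2], (x_hi - x_lo) / (2 * eps))  # the two values agree
```

heyoka.py automates all of this: it formulates the variational equations symbolically (at arbitrary order) and integrates them with its adaptive Taylor method, as shown in the rest of this tutorial.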

heyoka.py adopts the variational approach for the computation of the partial derivatives, supporting the formulation of variational ODEs at arbitrary differentiation orders and with respect to any combination of initial conditions, parameters and initial time. In this tutorial, we will explore this feature and show a couple of interesting use cases.

Before beginning, however, let us point out for clarity (and for the benefit of the search engines indexing this page) that in the scientific literature a bewildering variety of names and monikers is used when discussing partial derivatives of ODEs and their applications. Here is a (partial) list:

  • in the astrodynamics community, the term differential algebra is often used to refer to the computation of partial derivatives via truncated Taylor polynomials (e.g., see this paper). The term actually originates from the community of beam physics, where it has been used in the context of the theoretical modelling of particle accelerators since the 90s (e.g., see this review);

  • in the mathematical community, the term jet transport is sometimes used to refer to the propagation of a small neighborhood in phase space around the initial conditions via the Taylor series constructed from the partial derivatives (e.g., see this paper). In heyoka.py, we refer to a similar idea as Taylor map evaluation;

  • in the Julia programming language community, the term local sensitivity analysis refers to the computation of the partial derivatives via the variational equations, while discrete sensitivity analysis refers to the computation of the partial derivatives by directly differentiating the numerical method’s steps (e.g., see this review);

  • in the space engineering community, the term state transition tensors is sometimes used to indicate the generalisations of the state transition matrix (which in turn is built from the first-order partial derivatives) to higher differentiation orders.

Constructing a variational ODE system#

Let us begin with the definition of a simple ODE system:

import heyoka as hy

# Create the symbolic variables x and v.
x, v = hy.make_vars("x", "v")

# Create an ODE system.
sys = [(x, v), (v, hy.cos(hy.time) - hy.par[0] * v - hy.sin(x))]

This is the forced damped pendulum system already considered in another tutorial, where we have introduced the air friction coefficient as the runtime parameter par[0].

We then proceed to create a var_ode_sys:

# Create the variational ODE system.
vsys = hy.var_ode_sys(sys, hy.var_args.vars, order=2)

Upon construction, var_ode_sys formulates the variational equations for the input ODE system sys up to the specified differentiation order. The second argument specifies with respect to which quantities the variational equations are formulated. In this case, we used the vars enumerator of the var_args enum: this means that the variational equations will be formulated with respect to the initial conditions of the state variables. In a completely equivalent manner, we could have written instead:

vsys = hy.var_ode_sys(sys, [x, v], order=2)

In this case, instead of a var_args enumerator, we passed an explicit list of state variables with respect to whose initial conditions we want to formulate the variational equations. In a similar fashion, we could have provided a list of runtime parameters instead, or even a mixed list of variables and runtime parameters. Please refer to the documentation of var_ode_sys and var_args for exhaustive explanations of what can be passed as second argument to the constructor.

Let us explore a bit the var_ode_sys class. First of all, we can access the variational system of equations:

vsys.sys
[(x, v),
 (v, ((cos(t) - (p0 * v)) - sin(x))),
 (∂[(0, 1)]x, ∂[(0, 1)]v),
 (∂[(1, 1)]x, ∂[(1, 1)]v),
 (∂[(0, 1)]v, (-(∂[(0, 1)]x * cos(x)) - (∂[(0, 1)]v * p0))),
 (∂[(1, 1)]v, (-(∂[(1, 1)]x * cos(x)) - (∂[(1, 1)]v * p0))),
 (∂[(0, 2)]x, ∂[(0, 2)]v),
 (∂[(0, 1), (1, 1)]x, ∂[(0, 1), (1, 1)]v),
 (∂[(1, 2)]x, ∂[(1, 2)]v),
 (∂[(0, 2)]v,
  (-(∂[(0, 2)]v * p0) - (((∂[(0, 1)]x * -sin(x)) * ∂[(0, 1)]x) + (∂[(0, 2)]x * cos(x))))),
 (∂[(0, 1), (1, 1)]v,
  (-(∂[(0, 1), (1, 1)]v * p0) - (((∂[(0, 1)]x * -sin(x)) * ∂[(1, 1)]x) + (∂[(0, 1), (1, 1)]x * cos(x))))),
 (∂[(1, 2)]v,
  (-(∂[(1, 2)]v * p0) - (((∂[(1, 1)]x * -sin(x)) * ∂[(1, 1)]x) + (∂[(1, 2)]x * cos(x)))))]

The first two equations are from the original system of ODEs, while the remaining ones are the variational equations. The names of the variational variables begin with the \(\partial\) symbol, followed by a sparse multiindex encoding of the differentiation indices. For instance, the variational variable ∂[(0, 1)]x is the first-order derivative of \(x\) with respect to the first variational argument, that is, \(\frac{\partial x}{\partial x_0}\). Similarly, ∂[(0, 1), (1, 1)]x is the second-order derivative of \(x\) with respect to both variational arguments, that is, \(\frac{\partial^2 x}{\partial x_0 \partial y_0}\). The ordering of the variational equations follows the same scheme explained in the tutorial about computing derivatives.

We can also query other properties of vsys. For instance, the differentiation order:

vsys.order
2

The number of state variables in the original ODE system:

vsys.n_orig_sv
2

And the list of arguments with respect to which the variational equations are formulated:

vsys.vargs
[x, v]

Constructing a variational integrator#

After the construction of a variational ODE system, we are now ready to construct a variational integrator. We can do this by simply passing the variational ODE system (instead of the original, non-variational ODE system) as first input argument to the constructor:

# Construct a variational integrator.
ta_var = hy.taylor_adaptive(vsys, [.2, .3], pars=[.4], compact_mode=True)

Note how we constructed the integrator with compact mode enabled: the formulation of the variational equations, especially at high differentiation orders, greatly increases the size of the symbolic expressions that need to be just-in-time compiled during the creation of the integrator. By enabling compact mode, we keep the compilation time at manageable levels.

Let us inspect the integrator:

ta_var
C++ datatype            : double
Tolerance               : 2.220446049250313e-16
High accuracy           : false
Compact mode            : true
Taylor order            : 20
Dimension               : 12
Time                    : 0
State                   : [0.2, 0.3, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Parameters              : [0.4]
Variational order       : 2

The screen output informs us that ta_var is a variational integrator of order 2. We can also see that, although on construction we passed the initial conditions only for the x and v state variables, the integrator automatically set up appropriate initial conditions for the variational variables. Indeed, compared to a non-variational integrator, the state vector has been augmented to store also the values of the variational variables:

ta_var.state
array([0.2, 0.3, 1. , 0. , 0. , 1. , 0. , 0. , 0. , 0. , 0. , 0. ])

Alternatively, instead of relying on the integrator to automatically set up the initial conditions of the variational variables, we could also pass a full 12-element vector of initial conditions on construction.

We are now ready to proceed to our first variational integration:

# Propagate until t=3.
ta_var.propagate_until(3.)

# Print the full state vector.
ta_var.state
array([ 0.11757215, -1.24940656, -0.425478  ,  0.41983649, -0.19171818,
       -0.51871994,  0.27771857,  0.22392433,  0.60414865, -0.1122785 ,
       -0.12689065,  0.12443791])

The first two entries in the state vector are the current values of the state variables x and v. The remaining entries are the current values of the partial derivatives.

That is all well and good, but how do we fetch the values of the derivatives we are interested in at the end of an integration? As mentioned earlier, the partial derivatives are ordered in the state vector following the same criterion explained in the tutorial about computing derivatives: first by total order of differentiation, then by component (i.e., the derivatives of x precede the derivatives of v) and finally by reverse lexicographic order with respect to the differentiation multiindices. However, navigating this ordering scheme by hand can be complicated, especially at high differentiation orders.
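For reference, the ordering just described is easy to generate programmatically. The following pure-Python sketch (independent of heyoka.py's API) lists, for a given total differentiation order, the multiindices in the order in which the corresponding derivatives appear in the state vector:

```python
from itertools import product

def multiindices(n_comp, n_args, order):
    # Multiindices of a given total differentiation order, grouped
    # first by component index, then sorted in reverse lexicographic
    # order with respect to the differentiation indices.
    out = []
    for comp in range(n_comp):
        # product() with descending ranges yields the tuples in
        # reverse lexicographic order.
        for t in product(range(order, -1, -1), repeat=n_args):
            if sum(t) == order:
                out.append([comp, *t])
    return out

# 2 state variables (x, v), 2 variational arguments, order 2.
for m in multiindices(2, 2, 2):
    print(m)
```

For two state variables, two variational arguments and order 2, this yields [0, 2, 0], [0, 1, 1], [0, 0, 2], [1, 2, 0], [1, 1, 1], [1, 0, 2].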

Variational integrators provide a couple of methods that facilitate the task of locating specific derivatives in the state vector. The first helper is get_vslice(). This method takes as input a differentiation order and, optionally, a component index, and returns a slice into the state vector corresponding to the range of indices for the requested derivatives. Let us see a couple of examples:

# Fetch the range of all order-2 derivatives.
sl = ta_var.get_vslice(order=2)
sl
slice(6, 12, None)

That is, the order-2 derivatives are between indices 6 and 12 in the state vector. We can use sl to index directly into the state vector:

# Fetch all order-2 derivatives from the state vector.
ta_var.state[sl]
array([ 0.27771857,  0.22392433,  0.60414865, -0.1122785 , -0.12689065,
        0.12443791])

If we are interested only in the order-2 derivatives of v, we can pass the additional component keyword argument:

# Fetch the range of the order-2 derivatives for v.
# component=1 means the second original state variable,
# i.e., v (component=0 would fetch the derivatives for x).
sl = ta_var.get_vslice(order=2, component=1)

# Fetch the order-2 derivatives for v.
ta_var.state[sl]
array([-0.1122785 , -0.12689065,  0.12443791])

Often fetching the values of the derivatives is not enough, and we also need to access the differentiation multiindices associated with each derivative. In order to do this, we can use the get_mindex() method, which takes as input a single index into the state vector and returns the corresponding differentiation multiindex.

Let us see a couple of examples:

ta_var.get_mindex(i=0)
[0, 0, 0]

At i=0 in the state vector we have the order-0 derivative of the first state variable, which is just a complicated way of saying that we have the current value of x. Let us see a more useful example:

# Fetch the range of all order-2 derivatives.
sl = ta_var.get_vslice(order=2)

# Print the multiindices and associated values.
for idx in range(sl.start, sl.stop):
    print(f"Multiindex: {ta_var.get_mindex(idx)}, derivative value: {ta_var.state[idx]}")
Multiindex: [0, 2, 0], derivative value: 0.27771856625922825
Multiindex: [0, 1, 1], derivative value: 0.2239243290399873
Multiindex: [0, 0, 2], derivative value: 0.6041486469599123
Multiindex: [1, 2, 0], derivative value: -0.1122785039896587
Multiindex: [1, 1, 1], derivative value: -0.12689065229074553
Multiindex: [1, 0, 2], derivative value: 0.12443790781972988

Recall that in a multiindex the first number refers to the component index (i.e., 0 for x and 1 for v), while the remaining indices refer to the differentiation orders with respect to the variational arguments.

Taylor map evaluation#

One of the most useful applications of the variational equations is the ability to compute how a small perturbation on the initial conditions and/or parameters of the system affects the current state of the system, and to do it quickly (i.e., without having to repeat the numerical integration with the updated initial conditions/parameters). This is accomplished by using the values of the partial derivatives to construct and evaluate the multivariate Taylor series of the solution around the original initial conditions/parameters of the system. This approach, when applied to perturbations on the initial conditions, is sometimes called jet transport in the literature. Here, more generally, we will call it evaluation of the Taylor map.

Variational integrators provide a specific method called eval_taylor_map() to construct and evaluate the Taylor map. Let us see a simple example. We begin by re-creating from scratch our variational integrator:

ta_var = hy.taylor_adaptive(vsys, [.2, .3], pars=[.4], compact_mode=True)

We define two small displacements on the state variables x and v:

dx = 1e-4
dv = -2e-4

And we create a non-variational integrator with displaced initial conditions with respect to the variational one:

# Non-variational integrator with displaced
# initial conditions.
ta = hy.taylor_adaptive(sys, [.2 + dx, .3 + dv], pars=[.4], compact_mode=True)

Next, we propagate both integrators up to \(t=3\):

ta_var.propagate_until(3.)
ta.propagate_until(3.);

Clearly, the values of x and v will differ in the two integrators due to the different initial conditions:

print(f"Non-variational state: {ta.state}")
print(f"Variational state    : {ta_var.state[:2]}")
Non-variational state: [ 0.11744564 -1.24932198]
Variational state    : [ 0.11757215 -1.24940656]

We can now use the eval_taylor_map() method on the variational integrator to compute the effect of the displacements dx and dv on the state of the system:

ta_var.eval_taylor_map([dx, dv])
array([ 0.11744564, -1.24932198])

eval_taylor_map() takes as input a vector of displacements (one for each variational argument), and computes and evaluates the Taylor map, returning a reference to an internal array in ta_var storing the result of the evaluation (i.e., the updated values of the state variables). We can see that, in this specific case, the evaluation of the Taylor map reproduces accurately the state vector computed by the non-variational integrator with displaced initial conditions:

print(f'Taylor map error: {ta_var.eval_taylor_map([dx, dv]) - ta.state}')
Taylor map error: [6.58681443e-13 6.68798350e-13]
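To make explicit what eval_taylor_map() is doing under the hood, here is a pure-Python re-evaluation of the order-2 Taylor map for the x coordinate, using the (rounded) derivative values printed earlier in this tutorial. The key detail is that each derivative must be divided by the factorials of its differentiation multiindex to obtain the corresponding Taylor coefficient:

```python
import math

# Values at t = 3, copied (rounded) from the integrator output above.
x_star = 0.11757215                     # current value of x
dx_dx0, dx_dv0 = -0.425478, 0.41983649  # order-1 derivatives of x
d2x = {(2, 0): 0.27771857,              # d^2 x / dx0^2
       (1, 1): 0.22392433,              # d^2 x / dx0 dv0
       (0, 2): 0.60414865}              # d^2 x / dv0^2

dx, dv = 1e-4, -2e-4  # displacements on the initial conditions

# Order-2 Taylor map for x: sum the Taylor terms, dividing each
# derivative by the factorials of its multiindex.
x_new = x_star + dx_dx0 * dx + dx_dv0 * dv
for (i, j), der in d2x.items():
    x_new += der / (math.factorial(i) * math.factorial(j)) * dx**i * dv**j

print(x_new)  # close to eval_taylor_map()'s first component, ~0.11744564
```

At these small displacements the first-order (state transition matrix) terms dominate, while the order-2 terms contribute a correction of the order of 1e-8.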

Note that the Taylor map state vector can also be fetched via the tstate property of the integrator:

ta_var.tstate
array([ 0.11744564, -1.24932198])

Warning

Accessing the Taylor map state vector via tstate will NOT trigger any Taylor map evaluation; it will merely return a reference to the internal array storing the result of the evaluation. It is your responsibility to ensure that you have called eval_taylor_map() before accessing this array via tstate.

A note on computational efficiency#

var_ode_sys internally uses the diff_tensors() and dtens APIs to formulate the variational equations. This means that the computation of the symbolic derivatives is performed in an efficient manner. For instance, reverse-mode symbolic automatic differentiation will be employed when computing the first-order variationals of ODE systems containing a large number of parameters (e.g., in neural ODEs).

See the computing derivatives tutorial for a more in-depth discussion of how heyoka.py computes symbolic derivatives.