Customising the adaptive integrator

Customising the adaptive integrator#

In the previous section we showed a few usage examples of the taylor_adaptive class using the default options. Here, we will show how the behaviour of the integrator can be customised in a variety of ways.

Error tolerance#

As we mentioned earlier, by default the taylor_adaptive class uses an error tolerance equal to the machine epsilon of the floating-point type in use. E.g., when using the double floating-point type, the tolerance is set to \(\sim 2.2\times 10^{-16}\).

The tolerance value is used by the taylor_adaptive class to control the error arising from truncating the (infinite) Taylor series representing the solution of the ODE system. In other words, taylor_adaptive strives to ensure that the magnitude of the remainders of the Taylor series is not greater than the tolerance, either in an absolute or relative sense. Absolute error control mode is activated when all elements of the state vector have a magnitude less than 1, while relative error control mode is activated when at least one element of the state vector has a magnitude greater than 1.

In order to specify a non-default tolerance, the keyword argument tol can be used when constructing an integrator object:

    // Create the symbolic variables x and v.
    auto [x, v] = make_vars("x", "v");

    // Create the integrator object
    // in double precision, specifying
    // a non-default tolerance.
    auto ta = taylor_adaptive<double>{// Definition of the ODE system:
                                      // x' = v
                                      // v' = -9.8 * sin(x)
                                      {prime(x) = v, prime(v) = -9.8 * sin(x)},
                                      // Initial conditions
                                      // for x and v.
                                      {0.05, 0.025},
                                      // Set the tolerance to 1e-9.
                                      kw::tol = 1e-9};

    // Print the integrator object to screen.
    std::cout << ta << '\n';
Tolerance               : 1.0000000000000001e-09
Taylor order            : 12
Dimension               : 2
Time                    : 0.0000000000000000
State                   : [0.050000000000000003, 0.025000000000000001]

The optimal Taylor order for a tolerance of \(10^{-9}\) is now 12 (instead of 20 for a tolerance of \(\sim 2.2\times 10^{-16}\)).

Integrating the system back and forth shows how the accuracy of the integration is reduced with respect to the default tolerance value:

    // Integrate forth to t = 10 and then back to t = 0.
    ta.propagate_until(10.);
    ta.propagate_until(0.);

    std::cout << ta << '\n';
Tolerance               : 1.0000000000000001e-09
Taylor order            : 12
Dimension               : 2
Time                    : 0.0000000000000000
State                   : [0.050000000001312848, 0.024999999997558649]

Compact mode#

By default, the just-in-time compilation process of heyoka aims at maximising runtime performance over everything else. In practice, this means that heyoka generates a timestepper function in which there are no branches and where all loops have been fully unrolled.

This approach leads to highly optimised timestepper functions, but, on the other hand, it can result in long compilation times and high memory usage for large ODE systems. Thus, heyoka provides also a compact mode option in which code generation employs more traditional programming idioms that greatly reduce compilation time and memory usage. Compact mode results in a performance degradation of \(\lesssim 2\times\) with respect to the default code generation mode, but it renders heyoka usable with ODE systems consisting of thousands of terms.

Let’s try to quantify the performance difference in a concrete case. In this example, we first construct the ODE system corresponding to an N-body problem with 6 particles via the model::nbody() utility function:

    // Create an nbody system with 6 particles.
    auto sys = model::nbody(6);

Next, we create an initial state vector for our system. The contents of the vector do not matter at this stage:

    // Create an initial state vector (6 values per body).
    auto sv = std::vector<double>(36);

Next, we time the creation of an integrator object in default code generation mode:

    // Take the current time.
    auto start = chrono::steady_clock::now();

    // Construct an integrator in default mode.
    auto ta_default = taylor_adaptive<double>{sys, sv};

    std::cout << "Default mode timing: "
              << chrono::duration_cast<chrono::milliseconds>(chrono::steady_clock::now() - start).count() << "ms\n";
Default mode timing: 3807ms

Finally, we time the creation of the same integrator object in compact mode (which can be activated via the compact_mode keyword argument):

    // Reset the start time.
    start = chrono::steady_clock::now();

    // Construct an integrator in compact mode.
    auto ta_compact = taylor_adaptive<double>{sys, sv, kw::compact_mode = true};

    std::cout << "Compact mode timing: "
              << chrono::duration_cast<chrono::milliseconds>(chrono::steady_clock::now() - start).count() << "ms\n";
Compact mode timing: 269ms

That is, in this specific example compact mode is more than 10 times faster than the default code generation mode when it comes to the construction of the integrator object. For larger ODE systems, the gap will be even wider.

High-accuracy mode#

For long-term integrations at very low error tolerances, heyoka offers an opt-in high-accuracy mode. In high-accuracy mode, heyoka employs techniques that minimise the numerical errors arising from the use of finite-precision floating-point numbers, at the cost of a slight runtime performance degradation.

Currently, high-accuracy mode changes the way heyoka evaluates the Taylor polynomials used to update the state of the system at the end of an integration timestep. Specifically, polynomial evaluation via Horner’s rule is replaced by compensated summation, which prevents catastrophic cancellation issues and ultimately helps maintaining machine precision over very long integrations.

High-accuracy mode can be enabled via the high_accuracy keyword argument.