Computations in extended precision#
As hinted in the installation instructions, heyoka supports computations not only in single and double precision, but also in extended precision. Specifically, heyoka currently supports:
the 80-bit IEEE extended-precision format (~21 decimal digits),
the 128-bit IEEE quadruple-precision format (~36 decimal digits).
How these extended precision floating-point types can be accessed and used from C++ varies depending on the platform. The 80-bit
extended-precision format is available as the C++ long double
type on most platforms based on Intel x86 processors. Quadruple-precision
computations are supported either via the long double
type (e.g., on 64-bit Linux ARM) or via the the mppp::real128
type
(provided that the platform supports the nonstandard __float128
floating-point type and that heyoka was compiled with support
for the mp++ library - see the installation instructions).
A simple example#
We will be assuming here that long double
implements the 80-bit extended-precision floating-point format.
In order to verify that heyoka indeed is able to work in extended precision, we will be monitoring the evolution of the energy constant
in a high-precision numerical integration of the simple pendulum.
Let us begin as usual with the definition of the dynamical equations and the creation of the integrator object:
// Create the symbolic variables x and v.
auto [x, v] = make_vars("x", "v");
// Create the integrator object
// in extended precision.
auto ta = taylor_adaptive<long double>{// Definition of the ODE system:
// x' = v
// v' = -9.8 * sin(x)
{prime(x) = v, prime(v) = -9.8 * sin(x)},
// Initial conditions
// for x and v.
{-1., 0.}};
In order to activate extended precision, we created an integrator object of type taylor_adaptive<long double>
- that is,
we specified long double
, instead of the usual double
, as the (only) template parameter for the taylor_adaptive
class template.
Note that, for simplicity, we still used double-precision values for the initial state: these values are automatically
converted to long double
by the integrator’s constructor. Note also that, when operating in extended precision,
all numerical values encapsulated in an integrator are represented in extended precision - this includes not only the state vector,
but also the time coordinate, the tolerance, the Taylor coefficients, etc. Similarly to double-precision integrators, the default value
of the tolerance is the machine epsilon of long double
.
Next, we define a small helper function that will allow us to monitor the evolution of the energy constant throughout the integration:
// Create a small helper to compute the energy constant
// from the state vector.
auto compute_energy = [](const auto &sv) {
using std::cos;
return (sv[1] * sv[1]) / 2 + 9.8 * (1 - cos(sv[0]));
};
Before starting the integration, we compute and store the initial energy for later use:
// Compute and store the intial energy.
const auto orig_E = compute_energy(ta.get_state());
We can now begin a step-by-step integration. At the end of each step, we will be computing and printing to screen the relative energy error:
// Integrate for a few timesteps.
for (auto i = 0; i < 20; ++i) {
using std::abs;
ta.step();
std::cout << "Relative energy error: " << abs((orig_E - compute_energy(ta.get_state())) / orig_E) << '\n';
}
Relative energy error: 0
Relative energy error: 0
Relative energy error: 9.62658e-20
Relative energy error: 0
Relative energy error: 9.62658e-20
Relative energy error: 9.62658e-20
Relative energy error: 9.62658e-20
Relative energy error: 1.92532e-19
Relative energy error: 1.92532e-19
Relative energy error: 1.92532e-19
Relative energy error: 1.92532e-19
Relative energy error: 9.62658e-20
Relative energy error: 1.92532e-19
Relative energy error: 9.62658e-20
Relative energy error: 9.62658e-20
Relative energy error: 9.62658e-20
Relative energy error: 9.62658e-20
Relative energy error: 9.62658e-20
Relative energy error: 0
Relative energy error: 9.62658e-20
The console output indeed confirms that energy is conserved at the level of the epsilon of the 80-bit extended-precision format (that is, \(\sim 10^{-19}\)).
Performing quadruple-precision integrations via the mppp::real128
type follows exactly the same pattern. That is,
one just needs to replace long double
with mppp::real128
when instantiating the integrator object.
Other classes and functions#
Besides the adaptive integrator, several other classes and functions in heyoka can be used in extended precision.
The event classes, for instance, can be constructed in extended precision by passing long double
or mppp::real128
as the template parameter (instead of double
). Note that the precision of an event
must match the precision of the integrator object in which the event is used, otherwise an error will be produced
at compilation time.
Full code listing#
#include <iostream>
#include <heyoka/heyoka.hpp>
using namespace heyoka;
int main()
{
#if !defined(HEYOKA_ARCH_PPC)
// Create the symbolic variables x and v.
auto [x, v] = make_vars("x", "v");
// Create the integrator object
// in extended precision.
auto ta = taylor_adaptive<long double>{// Definition of the ODE system:
// x' = v
// v' = -9.8 * sin(x)
{prime(x) = v, prime(v) = -9.8 * sin(x)},
// Initial conditions
// for x and v.
{-1., 0.}};
// Create a small helper to compute the energy constant
// from the state vector.
auto compute_energy = [](const auto &sv) {
using std::cos;
return (sv[1] * sv[1]) / 2 + 9.8 * (1 - cos(sv[0]));
};
// Compute and store the intial energy.
const auto orig_E = compute_energy(ta.get_state());
// Integrate for a few timesteps.
for (auto i = 0; i < 20; ++i) {
using std::abs;
ta.step();
std::cout << "Relative energy error: " << abs((orig_E - compute_energy(ta.get_state())) / orig_E) << '\n';
}
#endif
}