Quadruple-precision floats#

Note

The functionality described in this section is available only if mp++ was configured with the MPPP_WITH_QUADMATH option enabled (see the installation instructions).

Added in version 0.5.

#include <mp++/real128.hpp>

The real128 class#

class mppp::real128#

Quadruple-precision floating-point class.

This class represents real values encoded in the quadruple-precision IEEE 754 floating-point format (which features up to 36 decimal digits of precision). The class is a thin wrapper around the __float128 type and the quadmath library, available on GCC, Clang and the Intel compiler on most modern platforms, on top of which it provides the following additions:

interoperability with other mp++ classes,
consistent behaviour with respect to the conventions followed elsewhere in mp++ (e.g., values are default-initialised to zero rather than to indefinite values, conversions must be explicit, etc.),
enhanced compile-time (constexpr) capabilities,
a generic C++ API.

Most of the functionality is exposed via plain functions, with the general convention that the functions are named after the corresponding quadmath functions minus the trailing q suffix. For instance, the quadmath code

__float128 a = 1;
auto b = ::sinq(a);

that computes the sine of 1 in quadruple precision, storing the result in b, becomes in mp++

real128 a{1};
auto b = sin(a);

where the sin() function is resolved via argument-dependent lookup.
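
The lookup rule above can be sketched in generic code. This is only an illustration: the helper name generic_sin is hypothetical, and the sketch is exercised here with double so that it builds without mp++. A using-declaration supplies std::sin for the built-in types, and the unqualified call lets argument-dependent lookup search the namespace of the argument, which for a real128 argument would be expected to select mppp::sin.

```cpp
#include <cmath>

// Hypothetical helper (not part of mp++): computes the sine of x for any
// type that provides a sin() overload reachable by unqualified lookup.
template <typename T>
T generic_sin(const T &x)
{
    using std::sin; // fallback for the built-in floating-point types
    return sin(x);  // unqualified call: ADL also searches the namespace of T
}
```

The same pattern should apply to the other plain functions documented in this section.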

Various overloaded operators are provided. Alternative comparison functions treating NaNs specially are also provided for use in the C++ standard library (and wherever strict weak ordering relations are needed).

The real128 class is a literal type, and, whenever possible, operations involving real128 are marked as constexpr. Some functions which are not constexpr in the quadmath library have been reimplemented as constexpr functions via compiler builtins.

A tutorial showcasing various features of real128 is available.

See also

https://gcc.gnu.org/onlinedocs/libquadmath/

__float128 m_value#

The internal value.

This class member gives direct access to the __float128 instance stored inside a real128.

constexpr real128()#

Default constructor.

The default constructor will set this to zero.

real128(const real128&) = default#

real128(real128&&) = default#: real128 is trivially copy and move constructible.

explicit constexpr real128(const __float128 &x)#

Constructor from a __float128.

This constructor will initialise the internal __float128 value to x.

Parameters:: x – the __float128 that will be assigned to the internal value.

template<real128_interoperable T> constexpr real128(const T &x)#

Constructor from interoperable types.

This constructor will initialise the internal value to x. Depending on the value and type of x, this may not be exactly equal to x after initialisation (e.g., if x is a very large integer).

Parameters:: x – the value that will be used for the initialisation.
Throws:: std::overflow_error – in case of (unlikely) overflow errors during initialisation.

template<real128_cpp_complex T> explicit constexpr real128(const T &c)#

Note

This constructor is constexpr only if at least C++14 is being used.

Added in version 0.20.

Constructor from complex C++ types.

The initialisation is successful only if the imaginary part of c is zero.

Parameters:: c – the input complex value.
Throws:: std::domain_error – if the imaginary part of c is not zero.

template<string_type T> explicit real128(const T &s)#

Constructor from string.

This constructor will initialise this from the string_type s. The accepted string formats are detailed in the quadmath library’s documentation (https://gcc.gnu.org/onlinedocs/libquadmath/). Leading whitespace is accepted (and ignored), but trailing whitespace will raise an error.

Parameters:

s – the string that will be used to initialise this.

Throws:

std::invalid_argument – if s does not represent a valid quadruple-precision floating-point value.
unspecified – any exception thrown by memory errors in standard containers.

explicit real128(const char *begin, const char *end)#

Constructor from range of characters.

This constructor will initialise this from the content of the input half-open range, which is interpreted as the string representation of a floating-point value.

Internally, the constructor will copy the content of the range to a local buffer, add a string terminator, and invoke the constructor from string.

Parameters:

begin – the beginning of the input range.
end – the end of the input range.

Throws:

unspecified – any exception thrown by the constructor from string or by memory errors in standard containers.
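
The copy-then-parse strategy described above can be sketched as follows. This is a stand-in rather than the mp++ implementation: double and std::strtod() replace real128 and the quadmath parser, and the names parse_range and is_valid_range are hypothetical. The sketch reproduces the documented whitespace rule (leading whitespace ignored, trailing whitespace rejected).

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Stand-in for the range constructor (double/strtod instead of real128).
// Throws std::invalid_argument if the range is not a valid number.
double parse_range(const char *begin, const char *end)
{
    // Copy the half-open range into a local, null-terminated buffer.
    const std::string buffer(begin, end);
    char *stop = nullptr;
    const double value = std::strtod(buffer.c_str(), &stop);
    // strtod() skips leading whitespace. An empty parse, or anything left
    // unparsed (including trailing whitespace), is treated as an error.
    if (stop == buffer.c_str() || stop != buffer.c_str() + buffer.size()) {
        throw std::invalid_argument("invalid floating-point string: '" + buffer + "'");
    }
    return value;
}

// Non-throwing convenience wrapper around parse_range().
bool is_valid_range(const std::string &s)
{
    try {
        parse_range(s.data(), s.data() + s.size());
        return true;
    } catch (const std::invalid_argument &) {
        return false;
    }
}
```

Because the range is half-open, a prefix of a longer character array can be parsed without modifying the array.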

real128 &operator=(const real128&) = default#

real128 &operator=(real128&&) = default#: real128 is trivially copy and move assignable.

constexpr real128 &operator=(const __float128 &x)#

Note

This operator is constexpr only if at least C++14 is being used.

Assignment from a __float128.

Parameters:: x – the __float128 that will be assigned to the internal value.
Returns:: a reference to this.

template<real128_interoperable T> constexpr real128 &operator=(const T &x)#

Note

This operator is constexpr only if at least C++14 is being used.

Assignment from interoperable types.

Parameters:: x – the assignment argument.
Returns:: a reference to this.
Throws:: unspecified – any exception thrown by the construction of a real128 from x.

template<real128_cpp_complex T> constexpr real128 &operator=(const T &c)#

Note

This operator is constexpr only if at least C++14 is being used.

Added in version 0.20.

Assignment from complex C++ types.

Parameters:: c – the assignment argument.
Returns:: a reference to this.
Throws:: std::domain_error – if the imaginary part of c is not zero.

real128 &operator=(const real &x)#

constexpr real128 &operator=(const complex128 &x)#

real128 &operator=(const complex &x)#

Note

The real overload is available only if mp++ was configured with the MPPP_WITH_MPFR option enabled. The complex overload is available only if mp++ was configured with the MPPP_WITH_MPC option enabled.

Note

The complex128 overload is constexpr only if at least C++14 is being used.

Added in version 0.20.

Assignment operators from other mp++ classes.

These operators are formally equivalent to converting x to real128 and then move-assigning the result to this.

Parameters:: x – the assignment argument.
Returns:: a reference to this.
Throws:: unspecified – any exception raised by the conversion of x to real128.

template<string_type T> real128 &operator=(const T &s)#

Assignment from string.

The body of this operator is equivalent to:

return *this = real128{s};

That is, a temporary real128 is constructed from s and it is then move-assigned to this.

Parameters:: s – the string that will be used for the assignment.
Returns:: a reference to this.
Throws:: unspecified – any exception thrown by the constructor from string.

explicit constexpr operator __float128() const#

Conversion to __float128.

Returns:: a copy of the __float128 value stored internally.

template<real128_interoperable T> explicit constexpr operator T() const#

Conversion operator to interoperable types.

This operator will convert this to a real128_interoperable type.

Conversion to C++ types is implemented via direct cast, and thus no checks are performed to ensure that the value of this can be represented by the target type.

Conversion to rational, if successful, is exact.

Conversion to integral types will produce the truncated counterpart of this.

Returns:: this converted to T.
Throws:: std::domain_error – if this represents a non-finite value and T is integer or rational.

template<real128_cpp_complex T> explicit constexpr operator T() const#

Note

This operator is constexpr only if at least C++14 is being used.

Added in version 0.20.

Conversion to complex C++ types.

Returns:: this converted to the type T.

template<real128_interoperable T> constexpr bool get(T &rop) const#

template<real128_cpp_complex T> constexpr bool get(T &rop) const#

Note

The first overload is constexpr only if at least C++14 is being used. The second overload is constexpr only if at least C++20 is being used.

Conversion member functions to interoperable and complex C++ types.

These member functions, similarly to the conversion operator, will convert this to T, storing the result of the conversion into rop. Unlike the conversion operator, these functions do not raise any exception: if the conversion is successful, the functions will return true, otherwise the functions will return false. If the conversion fails, rop will not be altered. The conversion can fail only if T is either integer or rational, and this represents a non-finite value.

Added in version 0.20: The conversion function to complex C++ types.

Parameters:: rop – the variable which will store the result of the conversion.
Returns:: true if the conversion succeeds, false otherwise.

std::string to_string() const#

Convert to string.

This member function will convert this to a decimal string representation in scientific format. The number of significant digits in the output (36) guarantees that a real128 constructed from the returned string will have a value identical to the value of this.

The implementation uses the quadmath_snprintf() function from the quadmath library.

Returns:: a decimal string representation of this.
Throws:: std::runtime_error – if the internal call to the quadmath_snprintf() function fails.
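
The round-trip guarantee can be illustrated with a smaller format. The sketch below uses double as a stand-in for real128 and std::snprintf() in place of quadmath_snprintf(); the function names are hypothetical. For double, std::numeric_limits<double>::max_digits10 is 17 on IEEE platforms, and the analogous figure for quadruple precision is the 36 quoted above.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Print x in scientific format with max_digits10 significant digits.
std::string to_roundtrip_string(double x)
{
    constexpr int digits = std::numeric_limits<double>::max_digits10;
    char buf[64];
    // "%.*e" prints one digit before the decimal point and digits - 1 after it.
    const int n = std::snprintf(buf, sizeof(buf), "%.*e", digits - 1, x);
    if (n <= 0 || n >= static_cast<int>(sizeof(buf))) {
        return std::string(); // formatting failed or output was truncated
    }
    return std::string(buf, buf + n);
}

// Parse the string back into a double.
double from_string(const std::string &s)
{
    return std::strtod(s.c_str(), nullptr);
}
```

Printing fewer digits than max_digits10 would not, in general, recover the original value.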

std::tuple<std::uint_least8_t, std::uint_least16_t, std::uint_least64_t, std::uint_least64_t> get_ieee() const#

Get the IEEE representation of the value.

This member function will return a tuple containing the IEEE quadruple-precision floating-point representation of the value. The returned tuple elements are, in order:

the sign of the value (1 for a negative sign bit, 0 for a positive sign bit),
the exponent (a 15-bit unsigned value),
the high part of the significand (a 48-bit unsigned value),
the low part of the significand (a 64-bit unsigned value).

Returns:: a tuple containing the IEEE quadruple-precision floating-point representation of the value stored in this.
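
As an illustration of the layout behind these four fields, the sketch below splits a raw binary128 bit pattern by hand. It does not call mp++: the pattern is supplied as two 64-bit halves, and the names ieee_fields, split_binary128 and unbiased_exponent are hypothetical. It assumes the standard IEEE 754 binary128 layout (1 sign bit, 15 exponent bits, 112 stored significand bits, exponent bias 16383).

```cpp
#include <cstdint>
#include <tuple>

using ieee_fields = std::tuple<std::uint_least8_t, std::uint_least16_t,
                               std::uint_least64_t, std::uint_least64_t>;

// Hypothetical helper: split a binary128 bit pattern, given as its most
// significant (hi) and least significant (lo) 64-bit halves.
ieee_fields split_binary128(std::uint64_t hi, std::uint64_t lo)
{
    const auto sign = static_cast<std::uint_least8_t>(hi >> 63);                  // 1 bit
    const auto exponent = static_cast<std::uint_least16_t>((hi >> 48) & 0x7FFFu); // 15 bits
    const std::uint_least64_t sig_hi = hi & 0xFFFFFFFFFFFFull;                    // 48 bits
    return ieee_fields{sign, exponent, sig_hi, lo};                               // lo: 64 bits
}

// The stored exponent is biased by 16383 in the binary128 format.
int unbiased_exponent(std::uint_least16_t biased)
{
    return static_cast<int>(biased) - 16383;
}
```

For example, the value 1 is encoded with a zero sign bit, a biased exponent of 16383 and an all-zero stored significand.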

int ilogb() const#

real128 logb() const#

Note

The logb() function is available when using libquadmath from GCC 6 onwards. See also the MPPP_QUADMATH_HAVE_LOGBQ definition.

Added in version 0.21.

Returns:: the unbiased exponent of this, as an int or as a real128.

bool signbit() const#

Sign bit.

This member function will return the value of the sign bit of this. That is, if this is not a NaN the function will return true if this is negative or \(-0\), false otherwise. If this is NaN, the sign bit of the NaN value will be returned.

Returns:: true if the sign bit of this is set, false otherwise.

constexpr int fpclassify() const#

Note

This function is not constexpr if the Intel C++ compiler is being used.

Categorise the floating point value.

This member function will categorise the floating-point value of this into the 5 categories, represented as int values, defined by the standard:

FP_NAN for NaN,
FP_INFINITE for infinite,
FP_NORMAL for normal values,
FP_SUBNORMAL for subnormal values,
FP_ZERO for zero.

Returns:: the category to which the value of this belongs.
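
A minimal sketch of how the returned category might be consumed is shown below. The helper category_name is hypothetical and is exercised here only with built-in floating-point types; with a real128 argument, the unqualified call would be expected to resolve to mppp::fpclassify() via argument-dependent lookup.

```cpp
#include <cmath>
#include <string>

// Hypothetical helper mapping the category constants to readable names.
template <typename T>
std::string category_name(const T &x)
{
    using std::fpclassify; // fallback for the built-in floating-point types
    switch (fpclassify(x)) {
        case FP_NAN:
            return "nan";
        case FP_INFINITE:
            return "infinite";
        case FP_NORMAL:
            return "normal";
        case FP_SUBNORMAL:
            return "subnormal";
        case FP_ZERO:
            return "zero";
        default:
            return "unknown"; // implementation-defined extra categories
    }
}
```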

constexpr bool isnan() const#

constexpr bool isinf() const#

constexpr bool finite() const#

constexpr bool isfinite() const#

constexpr bool isnormal() const#

Note

These functions are not constexpr if the Intel C++ compiler is being used.

Detect NaN, infinity, finite value (finite() and isfinite()) or normal value.

Added in version 0.22: isfinite() and isnormal().

Returns:: true if the value of this is, respectively, NaN, an infinity, finite or normal, false otherwise.

constexpr real128 &abs()#

constexpr real128 &fabs()#

Note

These functions are constexpr only if at least C++14 is being used.

Note

These functions are not constexpr if the Intel C++ compiler is being used.

In-place absolute value.

These member functions will set this to its absolute value.

Added in version 0.23: The fabs() overload.

Returns:: a reference to this.

real128 &sqrt()#

real128 &cbrt()#

In-place roots.

These member functions will set this to, respectively:

\(\sqrt{x}\),
\(\sqrt[3]{x}\),

where \(x\) is the current value of this.

Returns:: a reference to this.

real128 &sin()#

real128 &cos()#

real128 &tan()#

In-place trigonometric functions.

These member functions will set this to, respectively:

\(\sin{x}\),
\(\cos{x}\),
\(\tan{x}\),

where \(x\) is the current value of this.

Returns:: a reference to this.

real128 &asin()#

real128 &acos()#

real128 &atan()#

In-place inverse trigonometric functions.

These member functions will set this to, respectively:

\(\arcsin{x}\),
\(\arccos{x}\),
\(\arctan{x}\),

where \(x\) is the current value of this.

Returns:: a reference to this.

real128 &sinh()#

real128 &cosh()#

real128 &tanh()#

In-place hyperbolic functions.

These member functions will set this to, respectively:

\(\sinh{x}\),
\(\cosh{x}\),
\(\tanh{x}\),

where \(x\) is the current value of this.

Returns:: a reference to this.

real128 &asinh()#

real128 &acosh()#

real128 &atanh()#

In-place inverse hyperbolic functions.

These member functions will set this to, respectively:

\(\operatorname{arcsinh}{x}\),
\(\operatorname{arccosh}{x}\),
\(\operatorname{arctanh}{x}\),

where \(x\) is the current value of this.

Returns:: a reference to this.

real128 &exp()#

real128 &exp2()#

real128 &expm1()#

real128 &log()#

real128 &log10()#

real128 &log2()#

real128 &log1p()#

Note

The exp2() function is available when using libquadmath from GCC 9 onwards. See also the MPPP_QUADMATH_HAVE_EXP2Q definition.

In-place logarithms and exponentials.

These member functions will set this to, respectively:

\(e^x\),
\(2^x\),
\(e^x - 1\),
\(\log{x}\),
\(\log_{10}{x}\),
\(\log_2{x}\),
\(\log{\left( 1 + x \right)}\),

where \(x\) is the current value of this.

Added in version 0.21: The exp2(), expm1() and log1p() functions.

Returns:: a reference to this.

real128 &lgamma()#

real128 &tgamma()#

In-place gamma functions.

These member functions will set this to, respectively:

\(\log\Gamma\left( x \right)\),
\(\Gamma\left( x \right)\),

where \(x\) is the current value of this.

Added in version 0.21: The tgamma() function.

Returns:: a reference to this.

real128 &j0()#

real128 &j1()#

real128 &y0()#

real128 &y1()#

Added in version 0.21.

In-place Bessel functions of the first and second kind.

These member functions will set this to, respectively:

\(J_0\left( x \right)\),
\(J_1\left( x \right)\),
\(Y_0\left( x \right)\),
\(Y_1\left( x \right)\),

where \(x\) is the current value of this.

Returns:: a reference to this.

real128 &erf()#

real128 &erfc()#

In-place error functions.

These member functions will set this to, respectively:

\(\operatorname{erf}\left( x \right)\),
\(\operatorname{erfc}\left( x \right)\),

where \(x\) is the current value of this.

Added in version 0.21: The erfc() function.

Returns:: a reference to this.

real128 &ceil()#

real128 &floor()#

real128 &nearbyint()#

real128 &rint()#

real128 &round()#

real128 &trunc()#

Added in version 0.21.

Integer rounding functions.

These member functions will set this to, respectively:

\(\left\lceil x \right\rceil\),
\(\left\lfloor x \right\rfloor\),
the nearest integer value to x, according to the current rounding mode, without raising the FE_INEXACT exception,
the nearest integer value to x, according to the current rounding mode, possibly raising the FE_INEXACT exception,
the nearest integer value to x rounding halfway cases away from zero,
\(\operatorname{trunc}\left( x \right)\),

where \(x\) is the current value of this.

Returns:: a reference to this.

Types#

type __float128#: A quadruple-precision floating-point type available on GCC, Clang and the Intel compiler. This is the type wrapped by the real128 class.

See also

https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html

Concepts#

template<typename T> concept mppp::real128_interoperable#

This concept is satisfied by real-valued types that can interoperate with real128. Specifically, this concept is satisfied if either:

T is integer, or
T is rational, or
on GCC, the Intel compiler and Clang>=7, T satisfies mppp::cpp_arithmetic, or
on Clang<7, T satisfies mppp::cpp_arithmetic, except if T is long double.

template<typename T> concept mppp::real128_cpp_complex#

Added in version 0.20.

This concept is satisfied by complex C++ types that can interoperate with real128. Specifically, this concept is satisfied if either:

on GCC, the Intel compiler and Clang>=7, T satisfies mppp::cpp_complex, or
on Clang<7, T satisfies mppp::cpp_complex, except if T is std::complex<long double>.

template<typename T, typename U> concept mppp::real128_op_types#

This concept is satisfied if the types T and U are suitable for use in the generic binary operators involving real128 and other types. Specifically, the concept will be true if either:

T and U are both real128, or
one type is real128 and the other is a real128_interoperable type.

template<typename T, typename U> concept mppp::real128_eq_op_types#

Added in version 0.20.

This concept is satisfied if the types T and U are suitable for use in the generic binary equality and inequality operators involving real128 and other types. Specifically, the concept will be true if either:

T and U satisfy real128_op_types, or
one type is real128 and the other is a real128_cpp_complex type.

Functions#

Conversion#

template<mppp::real128_interoperable T> constexpr bool mppp::get(T &rop, const mppp::real128 &x)#

template<mppp::real128_cpp_complex T> constexpr bool mppp::get(T &rop, const mppp::real128 &x)#

Note

The first overload is constexpr only if at least C++14 is being used. The second overload is constexpr only if at least C++20 is being used.

Conversion functions.

These functions will convert the input real128 x to T, storing the result of the conversion into rop. If the conversion is successful, the functions will return true, otherwise the functions will return false. If the conversion fails, rop will not be altered. The conversion can fail only if T is either integer or rational, and x represents a non-finite value.

Parameters:

rop – the variable which will store the result of the conversion.
x – the input value.

Returns:

true if the conversion succeeds, false otherwise.

mppp::real128 mppp::frexp(const mppp::real128 &x, int *exp)#

Decompose a real128 into a normalized fraction and an integral power of two.

If x is zero, this function will return zero and store zero in exp. Otherwise, this function will return a real128 \(r\) with an absolute value in the \(\left[0.5,1\right)\) range, and it will store an integer value \(n\) in exp such that \(r \times 2^n\) is equal to \(x\). If x is a non-finite value, the return value will be x and an unspecified value will be stored in exp.

Parameters:

x – the input real128.
exp – a pointer to the value that will store the exponent.

Returns:

the binary significand of x.
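
The decomposition \(x = r \times 2^n\) can be checked with a short sketch. The names decomposed, decompose and recompose are hypothetical, and the code is exercised here with double (via std::frexp() and std::ldexp()); with real128 arguments the unqualified calls would be expected to pick the mp++ functions of the same name.

```cpp
#include <cmath>

// Hypothetical result type: a fraction r and an exponent n.
template <typename T>
struct decomposed {
    T fraction;
    int exponent;
};

// Split x into r and n such that x == r * 2^n.
template <typename T>
decomposed<T> decompose(const T &x)
{
    using std::frexp; // fallback for the built-in floating-point types
    int n = 0;
    const T r = frexp(x, &n);
    return {r, n};
}

// Rebuild the original value from r and n.
template <typename T>
T recompose(const decomposed<T> &d)
{
    using std::ldexp;
    return ldexp(d.fraction, d.exponent);
}
```

For finite inputs, recomposing a decomposed value is exact, because scaling by a power of two only changes the exponent.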

int mppp::ilogb(const mppp::real128 &x)#

real128 mppp::logb(const mppp::real128 &x)#

Note

The logb() function is available when using libquadmath from GCC 6 onwards. See also the MPPP_QUADMATH_HAVE_LOGBQ definition.

Added in version 0.21.

Unbiased exponent.

Parameters:: x – the input argument.
Returns:: the unbiased exponent of x, as an int or as a real128.

Arithmetic#

mppp::real128 mppp::fma(const mppp::real128 &x, const mppp::real128 &y, const mppp::real128 &z)#

Fused multiply-add.

This function will return \(\left(x \times y\right) + z\) as if calculated to infinite precision and rounded once.

Parameters:

x – the first factor.
y – the second factor.
z – the addend.

Returns:

\(\left(x \times y\right) + z\).
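
One consequence of the single rounding is that a fused multiply-add can recover the rounding error of an ordinary product. The sketch below is an illustration only: the helper product_rounding_error is hypothetical and is exercised here with double and std::fma(). It assumes that no overflow or underflow occurs and that the product is evaluated in the precision of T.

```cpp
#include <cmath>

// Hypothetical helper: returns (exact x*y) - (rounded x*y).
template <typename T>
T product_rounding_error(const T &x, const T &y)
{
    using std::fma;       // fallback for the built-in floating-point types
    const T p = x * y;    // product rounded to the precision of T
    return fma(x, y, -p); // exact x*y minus p, rounded only once
}
```

With \(x = 1 + 2^{-30}\) and \(y = 1 - 2^{-30}\) in double precision, the exact product \(1 - 2^{-60}\) rounds to 1, and the helper returns the lost term \(-2^{-60}\).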

constexpr mppp::real128 mppp::abs(const mppp::real128 &x)#

constexpr mppp::real128 mppp::fabs(const mppp::real128 &x)#

Note

These functions are not constexpr if the Intel C++ compiler is being used.

Absolute value.

Added in version 0.23: The fabs() overload.

Parameters:: x – the real128 whose absolute value will be computed.
Returns:: \(\left| x \right|\).

mppp::real128 mppp::scalbn(const mppp::real128 &x, int n)#

mppp::real128 mppp::scalbln(const mppp::real128 &x, long n)#

mppp::real128 mppp::ldexp(const mppp::real128 &x, int n)#

Multiply by power of 2.

Added in version 0.21: The ldexp() function.

Parameters:

x – the input real128.
n – the power of 2 by which x will be multiplied.

Returns:

\(x \times 2^n\).

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::fdim(const T &x, const U &y)#

Added in version 0.21.

Positive difference.

This function returns the positive difference between x and y. That is, if \(x>y\), returns \(x-y\), otherwise returns \(+0\). Internally, the implementation uses the fdimq() function from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Parameters:

x – the first argument.
y – the second argument.

Returns:

the positive difference of x and y.

Comparison#

bool mppp::signbit(const mppp::real128 &x)#

Sign bit.

Parameters:: x – the input value.
Returns:: the sign bit of x (as returned by mppp::real128::signbit()).

constexpr int mppp::fpclassify(const mppp::real128 &x)#

Note

This function is not constexpr if the Intel C++ compiler is being used.

Categorise a real128.

Parameters:: x – the value whose floating-point category will be returned.
Returns:: the category of the value of x, as established by mppp::real128::fpclassify().

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::fmax(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::fmin(const T &x, const U &y)#

Added in version 0.21.

Max/min.

These functions will return, respectively, the maximum and minimum of the two input operands. NaNs are treated as missing data (between a NaN and a numeric value, the numeric value is chosen). Internally, the implementation uses the fmaxq() and fminq() functions from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Parameters:

x – the first argument.
y – the second argument.

Returns:

the maximum or the minimum, respectively, of the two input operands.
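
The "missing data" convention makes these functions convenient for reductions over data that may contain NaNs. The sketch below uses double and std::fmax() as a stand-in for real128 and mppp::fmax(); the helper name max_ignoring_nan is hypothetical.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: largest non-NaN element, or NaN if there is none.
double max_ignoring_nan(const std::vector<double> &values)
{
    // Start from NaN: fmax(NaN, v) yields v, so the first numeric value
    // replaces the initial accumulator.
    double result = std::numeric_limits<double>::quiet_NaN();
    for (const double v : values) {
        result = std::fmax(result, v);
    }
    return result;
}
```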

constexpr bool mppp::isnan(const mppp::real128 &x)#

constexpr bool mppp::isinf(const mppp::real128 &x)#

constexpr bool mppp::finite(const mppp::real128 &x)#

constexpr bool mppp::isfinite(const mppp::real128 &x)#

constexpr bool mppp::isnormal(const mppp::real128 &x)#

Note

These functions are not constexpr if the Intel C++ compiler is being used.

Detect special values.

These functions will return true if x is, respectively:

NaN,
an infinity,
a finite value (finite() and isfinite()),
a normal value,

and false otherwise.

Added in version 0.22: isfinite() and isnormal().

Parameters:: x – the input value.
Returns:: a boolean flag indicating if x is NaN, an infinity, a finite value or a normal value.

constexpr bool mppp::real128_equal_to(const mppp::real128 &x, const mppp::real128 &y)#

Note

This function is not constexpr if the Intel C++ compiler is being used.

Equality predicate with special NaN handling.

If both x and y are not NaN, this function is identical to the equality operator. Otherwise, this function will return true if both operands are NaN, false otherwise.

In other words, this function behaves like an equality operator which considers all NaN values equal to each other.

Parameters:

x – the first operand.
y – the second operand.

Returns:

true if \(x = y\) (including the case in which both operands are NaN), false otherwise.

constexpr bool mppp::real128_lt(const mppp::real128 &x, const mppp::real128 &y)#

Note

This function is not constexpr if the Intel C++ compiler is being used.

Less-than predicate with special NaN handling.

If both x and y are not NaN, this function is identical to the less-than operator. If at least one operand is NaN, this function will return true if x is not NaN, false otherwise.

In other words, this function behaves like a less-than operator which considers NaN values greater than non-NaN values. This function can be used as a comparator in various facilities of the standard library (e.g., std::sort(), std::set, etc.).

Parameters:

x – the first operand.
y – the second operand.

Returns:

true if \(x < y\) (with NaN values considered greater than non-NaN values), false otherwise.
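
The ordering rule can be sketched with a stand-in predicate written for double (the names nan_aware_less and sorted_nan_last are hypothetical, and this is not the mp++ implementation). Unlike the plain less-than operator, such a predicate is a strict weak ordering even in the presence of NaNs, which is what std::sort() requires.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Stand-in for the predicate described above, written for double.
bool nan_aware_less(double x, double y)
{
    const bool x_nan = std::isnan(x);
    const bool y_nan = std::isnan(y);
    if (!x_nan && !y_nan) {
        return x < y; // ordinary comparison when no NaN is involved
    }
    // At least one operand is NaN: x precedes y only if x is not NaN.
    return !x_nan;
}

// Sort a copy of the input, moving all NaNs to the end.
std::vector<double> sorted_nan_last(std::vector<double> v)
{
    std::sort(v.begin(), v.end(), nan_aware_less);
    return v;
}
```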

constexpr bool mppp::real128_gt(const mppp::real128 &x, const mppp::real128 &y)#

Note

This function is not constexpr if the Intel C++ compiler is being used.

Greater-than predicate with special NaN handling.

If both x and y are not NaN, this function is identical to the greater-than operator. If at least one operand is NaN, this function will return true if y is not NaN, false otherwise.

In other words, this function behaves like a greater-than operator which considers NaN values greater than non-NaN values. This function can be used as a comparator in various facilities of the standard library (e.g., std::sort(), std::set, etc.).

Parameters:

x – the first operand.
y – the second operand.

Returns:

true if \(x > y\) (with NaN values considered greater than non-NaN values), false otherwise.

Roots#

mppp::real128 mppp::sqrt(const mppp::real128 &x)#

mppp::real128 mppp::cbrt(const mppp::real128 &x)#

Root functions.

These functions will return, respectively:

\(\sqrt{x}\),
\(\sqrt[3]{x}\).

Parameters:: x – the input argument.
Returns:: the square or cube root of x.

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::hypot(const T &x, const U &y)#

Euclidean distance.

This function will return \(\sqrt{x^2+y^2}\). The calculation is performed without undue overflow or underflow during the intermediate steps of the calculation. Internally, the implementation uses the hypotq() function from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Added in version 0.21: Support for types other than real128.

Parameters:

x – the first argument.
y – the second argument.

Returns:

\(\sqrt{x^2+y^2}\).
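
The remark about intermediate overflow can be illustrated in double precision, where the effect is easy to trigger. The sketch below compares the naive formula with std::hypot(), used here as a stand-in for mppp::hypot(); the function names naive_norm and safe_norm are hypothetical.

```cpp
#include <cmath>

// Naive formula: the squares can overflow even when the final result
// would be representable.
double naive_norm(double x, double y)
{
    return std::sqrt(x * x + y * y);
}

// Library routine: avoids undue overflow/underflow in intermediate steps.
double safe_norm(double x, double y)
{
    return std::hypot(x, y);
}
```

With x = 3e200 and y = 4e200, the naive formula overflows to infinity, whereas the library routine returns a value close to 5e200.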

Exponentiation#

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::pow(const T &x, const U &y)#

This function will compute \(x^y\). Internally, the implementation uses the powq() function from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Parameters:

x – the base.
y – the exponent.

Returns:

\(x^y\).

Trigonometry#

mppp::real128 mppp::sin(const mppp::real128 &x)#

mppp::real128 mppp::cos(const mppp::real128 &x)#

mppp::real128 mppp::tan(const mppp::real128 &x)#

mppp::real128 mppp::asin(const mppp::real128 &x)#

mppp::real128 mppp::acos(const mppp::real128 &x)#

mppp::real128 mppp::atan(const mppp::real128 &x)#

Trigonometric functions.

These functions will return, respectively:

\(\sin x\),
\(\cos x\),
\(\tan x\),
\(\arcsin x\),
\(\arccos x\),
\(\arctan x\).

Parameters:: x – the input value.
Returns:: a trigonometric function of x.

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::atan2(const T &y, const U &x)#

Added in version 0.21.

Two-argument arctangent.

This function will compute \(\arctan\left( y,x \right)\). Internally, the implementation uses the atan2q() function from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Parameters:

y – the sine argument.
x – the cosine argument.

Returns:

\(\arctan\left( y,x \right)\).

void mppp::sincos(const mppp::real128 &x, mppp::real128 *s, mppp::real128 *c)#

Added in version 0.21.

Simultaneous sine and cosine.

This function will set the variables pointed to by s and c to, respectively, \(\sin x\) and \(\cos x\).

Parameters:

x – the input argument.
s – a pointer to the sine return value.
c – a pointer to the cosine return value.

Hyperbolic functions#

mppp::real128 mppp::sinh(const mppp::real128 &x)#

mppp::real128 mppp::cosh(const mppp::real128 &x)#

mppp::real128 mppp::tanh(const mppp::real128 &x)#

mppp::real128 mppp::asinh(const mppp::real128 &x)#

mppp::real128 mppp::acosh(const mppp::real128 &x)#

mppp::real128 mppp::atanh(const mppp::real128 &x)#

Hyperbolic functions.

These functions will return, respectively:

\(\sinh x\),
\(\cosh x\),
\(\tanh x\),
\(\operatorname{arcsinh} x\),
\(\operatorname{arccosh} x\),
\(\operatorname{arctanh} x\).

Parameters:: x – the input value.
Returns:: a hyperbolic function of x.

Logarithms and exponentials#

mppp::real128 mppp::exp(const mppp::real128 &x)#

mppp::real128 mppp::exp2(const mppp::real128 &x)#

mppp::real128 mppp::expm1(const mppp::real128 &x)#

mppp::real128 mppp::log(const mppp::real128 &x)#

mppp::real128 mppp::log10(const mppp::real128 &x)#

mppp::real128 mppp::log2(const mppp::real128 &x)#

mppp::real128 mppp::log1p(const mppp::real128 &x)#

Note

The exp2() function is available when using libquadmath from GCC 9 onwards. See also the MPPP_QUADMATH_HAVE_EXP2Q definition.

Logarithms and exponentials.

These functions will return, respectively:

\(e^x\),
\(2^x\),
\(e^x - 1\),
\(\log{x}\),
\(\log_{10}{x}\),
\(\log_2{x}\),
\(\log{\left( 1 + x \right)}\).

Added in version 0.21: The exp2(), expm1() and log1p() functions.

Parameters:: x – the input value.
Returns:: a logarithm/exponential of x.

Gamma functions#

mppp::real128 mppp::lgamma(const mppp::real128 &x)#

mppp::real128 mppp::tgamma(const mppp::real128 &x)#

Gamma functions.

These functions will return, respectively:

\(\log\Gamma\left( x \right)\),
\(\Gamma\left( x \right)\).

Added in version 0.21: The tgamma() function.

Parameters:: x – the input value.
Returns:: the result of the operation.

Bessel functions#

mppp::real128 mppp::j0(const mppp::real128 &x)#

mppp::real128 mppp::j1(const mppp::real128 &x)#

mppp::real128 mppp::jn(int n, const mppp::real128 &x)#

mppp::real128 mppp::y0(const mppp::real128 &x)#

mppp::real128 mppp::y1(const mppp::real128 &x)#

mppp::real128 mppp::yn(int n, const mppp::real128 &x)#

Added in version 0.21.

Bessel functions of the first and second kind of integral order.

These functions will return, respectively,

\(J_0\left( x \right)\),
\(J_1\left( x \right)\),
\(J_n\left( x \right)\),
\(Y_0\left( x \right)\),
\(Y_1\left( x \right)\),
\(Y_n\left( x \right)\).

Parameters:

n – the order of the Bessel function.
x – the argument.

Returns:

a Bessel function of x.

Other special functions#

mppp::real128 mppp::erf(const mppp::real128 &x)#

mppp::real128 mppp::erfc(const mppp::real128 &x)#

Error functions.

These functions will return, respectively:

\(\operatorname{erf}\left( x \right)\),
\(\operatorname{erfc}\left( x \right)\).

Added in version 0.21: The erfc() function.

Parameters:: x – the input value.
Returns:: the (complementary) error function of \(x\).

Floating-point manipulation#

mppp::real128 mppp::nextafter(const mppp::real128 &from, const mppp::real128 &to)#

Added in version 0.14.

This function returns the next representable value of from in the direction of to.

If from is equal to to, to is returned.

Parameters:

from – the real128 whose next representable value will be returned.
to – the direction of the next representable value.

Returns:

the next representable value of from in the direction of to.
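
A common use of this function is measuring the spacing between adjacent representable values. The sketch below does so in double precision with std::nextafter(), as a stand-in for the real128 function; the helper name gap_above is hypothetical.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: distance from x to the next representable value
// in the direction of positive infinity.
double gap_above(double x)
{
    const double inf = std::numeric_limits<double>::infinity();
    return std::nextafter(x, inf) - x;
}
```

For x = 1 the gap equals the machine epsilon of the type (2^-52 for double), and for x = 0 it equals the smallest positive subnormal value.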

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::copysign(const T &x, const U &y)#

Added in version 0.21.

Copy sign.

This function composes a floating point value with the magnitude of x and the sign of y. Internally, the implementation uses the copysignq() function from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Parameters:

x – the first argument.
y – the second argument.

Returns:

a value with the magnitude of x and the sign of y.
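
The sketch below illustrates the behaviour with built-in types. The helper with_sign_of is hypothetical and is exercised here with double and std::copysign(); with real128 arguments the unqualified call would be expected to resolve to mppp::copysign(). In the standard function the sign is taken from the sign bit of the second argument, so a negative zero counts as negative.

```cpp
#include <cmath>

// Hypothetical helper: give 'magnitude' the sign of 'sign_source'.
template <typename T>
T with_sign_of(const T &magnitude, const T &sign_source)
{
    using std::copysign; // fallback for the built-in floating-point types
    return copysign(magnitude, sign_source);
}
```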

Integer and remainder-related functions#

mppp::real128 mppp::ceil(const mppp::real128 &x)#

mppp::real128 mppp::floor(const mppp::real128 &x)#

mppp::real128 mppp::nearbyint(const mppp::real128 &x)#

mppp::real128 mppp::rint(const mppp::real128 &x)#

long long mppp::llrint(const mppp::real128 &x)#

long mppp::lrint(const mppp::real128 &x)#

mppp::real128 mppp::round(const mppp::real128 &x)#

long long mppp::llround(const mppp::real128 &x)#

long mppp::lround(const mppp::real128 &x)#

mppp::real128 mppp::trunc(const mppp::real128 &x)#

Added in version 0.21.

Integer rounding functions.

These functions will return, respectively:

\(\left\lceil x \right\rceil\),
\(\left\lfloor x \right\rfloor\),
the nearest integer value to x, according to the current rounding mode, without raising the FE_INEXACT exception,
the nearest integer value to x, according to the current rounding mode, possibly raising the FE_INEXACT exception, represented as a:
- real128 (rint()),
- long long (llrint()),
- long (lrint()),
the nearest integer value to x rounding halfway cases away from zero, represented as a:
- real128 (round()),
- long long (llround()),
- long (lround()),
\(\operatorname{trunc}\left( x \right)\).

Parameters: x – the input argument.
Returns: the result of the operation.
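The differences between these functions show up on halfway cases. The sketch below uses the double-precision standard library counterparts, under the assumption that the real128 versions round in the same way. The struct rounded and the function round_all() are hypothetical names introduced for illustration, and the rint()/nearbyint() results assume that the default rounding mode (round to nearest, ties to even) is in effect.

```cpp
#include <cmath>

// Results of the various rounding functions applied to the same input.
struct rounded {
    double up;           // ceil()
    double down;         // floor()
    double toward_zero;  // trunc()
    double half_away;    // round(): halfway cases away from zero
    double current_mode; // rint(): current rounding mode
    double nearby;       // nearbyint(): as rint(), never raises FE_INEXACT
    long half_away_l;    // lround()
    long current_mode_l; // lrint()
};

inline rounded round_all(double x)
{
    return {std::ceil(x),  std::floor(x),     std::trunc(x),  std::round(x),
            std::rint(x),  std::nearbyint(x), std::lround(x), std::lrint(x)};
}
```

For x = 2.5, round() gives 3 while rint() and nearbyint() give 2 under the default rounding mode; for x = -2.5 the results are -3 and -2.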

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::fmod(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> mppp::real128 mppp::remainder(const T &x, const U &y)#

Added in version 0.21.

Floating modulus and remainder.

These functions will return, respectively, the floating modulus and the remainder of the division \(x/y\).

The floating modulus is \(x - n\times y\), where \(n\) is \(x/y\) with its fractional part truncated.

The remainder is \(x - m\times y\), where \(m\) is the integral value nearest to the exact value \(x/y\) (halfway cases are rounded to the even integer).

Special values are handled as described in the C99 standard.

Internally, the implementation uses the fmodq() and remainderq() functions from the quadmath library, after the conversion of one of the operands to real128 (if necessary).

Parameters:

x – the numerator.
y – the denominator.

Returns:

the floating modulus or remainder of \(x/y\).
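The two definitions give different results whenever the fractional part of \(x/y\) exceeds one half. A minimal generic sketch follows; the helper names floating_modulus() and ieee_remainder() are hypothetical, and the code is instantiated with double so that it runs without mp++. With T = mppp::real128 the unqualified calls are expected to resolve to mppp::fmod() and mppp::remainder() via argument-dependent lookup.

```cpp
#include <cmath>

// x - n*y, with n = x/y truncated toward zero.
template <typename T>
T floating_modulus(const T &x, const T &y)
{
    using std::fmod; // fallback for builtin types, ADL for other types
    return fmod(x, y);
}

// x - m*y, with m = the integral value nearest to x/y.
template <typename T>
T ieee_remainder(const T &x, const T &y)
{
    using std::remainder;
    return remainder(x, y);
}
```

With x = 8 and y = 3, \(x/y \approx 2.67\): truncation gives \(n = 2\) and a floating modulus of 2, while rounding to nearest gives \(m = 3\) and a remainder of -1. The floating modulus takes the sign of x, whereas the remainder can have either sign.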

mppp::real128 mppp::remquo(const mppp::real128 &x, const mppp::real128 &y, int *quo)#

Added in version 0.21.

Remainder and quotient.

This function will return the remainder of the division \(x/y\). Additionally, it will store the sign and at least three of the least significant bits of \(x/y\) in quo.

Parameters:

x – the numerator.
y – the denominator.
quo – a pointer to the quotient return value.

Returns:

the remainder of \(x/y\).
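The following sketch shows the calling convention with the double-precision standard library counterpart, assuming that the real128 version behaves in the same way. The struct rem_and_quo and the function remainder_with_quotient() are hypothetical names. Since only the sign and the low-order bits of the quotient are guaranteed, the stored value should be inspected modulo 8 rather than used as the full quotient.

```cpp
#include <cmath>

// Remainder of x/y together with the partial quotient information.
struct rem_and_quo {
    double rem; // same value as remainder(x, y)
    int quo;    // sign and at least the three lowest bits of the rounded x/y
};

inline rem_and_quo remainder_with_quotient(double x, double y)
{
    int q = 0;
    const double r = std::remquo(x, y, &q);
    return {r, q};
}
```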

mppp::real128 mppp::modf(const mppp::real128 &x, mppp::real128 *iptr)#

Added in version 0.21.

Decompose in integral and fractional parts.

This function will return the fractional part of x, and it will store the integral part of x into the variable pointed to by iptr.

Parameters:

x – the input argument.
iptr – a pointer to the return value for the integral part.

Returns:

the fractional part of x.
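A minimal generic sketch of the decomposition; the helper name split() is hypothetical, and the code is instantiated with double so that it runs without mp++. With T = mppp::real128 the unqualified modf() call is expected to resolve to mppp::modf() via argument-dependent lookup. Both parts carry the sign of x.

```cpp
#include <cmath>
#include <utility>

// Returns the pair (integral part, fractional part) of x.
template <typename T>
std::pair<T, T> split(const T &x)
{
    using std::modf; // fallback for builtin types, ADL for other types
    T integral = T();
    const T fractional = modf(x, &integral);
    return {integral, fractional};
}
```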

Input/Output#

std::ostream &mppp::operator<<(std::ostream &os, const mppp::real128 &x)#

Output stream operator.

This function will direct to the output stream os the input real128 x.

Parameters:

os – the target stream.
x – the input real128.

Returns:

a reference to os.

Throws:

std::overflow_error – in case of (unlikely) overflow errors.
std::invalid_argument – if the quadmath printing primitive quadmath_snprintf() returns an error code.
unspecified – any exception raised by the public interface of std::ostream or by memory allocation errors.

Other#

std::size_t mppp::hash(const mppp::real128 &x)#

Added in version 0.12.

Hash function for real128.

This function guarantees that x == y implies hash(x) == hash(y).

Parameters: x – the argument.
Returns: a hash value for x.

Mathematical operators#

constexpr mppp::real128 mppp::operator+(const mppp::real128 &x)#

constexpr mppp::real128 mppp::operator-(const mppp::real128 &x)#

Identity and negation.

Parameters: x – the argument.
Returns: \(x\) and \(-x\), respectively.

constexpr mppp::real128 &mppp::operator++(mppp::real128 &x)#

constexpr mppp::real128 &mppp::operator--(mppp::real128 &x)#

Note

These operators are constexpr only if at least C++14 is being used.

Prefix increment and decrement.

Parameters: x – the argument.
Returns: a reference to x after it has been incremented/decremented by one.

constexpr mppp::real128 mppp::operator++(mppp::real128 &x, int)#

constexpr mppp::real128 mppp::operator--(mppp::real128 &x, int)#

Note

These operators are constexpr only if at least C++14 is being used.

Suffix increment and decrement.

Parameters: x – the argument.
Returns: a copy of x before the increment/decrement.

template<typename T, mppp::real128_op_types<T> U> constexpr mppp::real128 mppp::operator+(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr mppp::real128 mppp::operator-(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr mppp::real128 mppp::operator*(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr mppp::real128 mppp::operator/(const T &x, const U &y)#

Binary arithmetic operators.

These operators will return, respectively:

\(x+y\),
\(x-y\),
\(x \times y\),
\(x / y\).

Parameters:

x – the first operand.
y – the second operand.

Returns:

the result of the binary operation.

Throws:

unspecified – any exception thrown by the constructor of real128 from mp++ types.

template<typename T, mppp::real128_op_types<T> U> constexpr T &mppp::operator+=(T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr T &mppp::operator-=(T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr T &mppp::operator*=(T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr T &mppp::operator/=(T &x, const U &y)#

Note

These operators are constexpr only if at least C++14 is being used.

In-place arithmetic operators.

These operators will set x to, respectively:

\(x+y\),
\(x-y\),
\(x \times y\),
\(x / y\).

Parameters:

x – the first operand.
y – the second operand.

Returns:

a reference to x.

Throws:

unspecified – any exception thrown by the corresponding binary operator, or by the conversion of real128 to mp++ types.

template<typename T, mppp::real128_eq_op_types<T> U> constexpr bool mppp::operator==(const T &x, const U &y)#

template<typename T, mppp::real128_eq_op_types<T> U> constexpr bool mppp::operator!=(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr bool mppp::operator<(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr bool mppp::operator>(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr bool mppp::operator<=(const T &x, const U &y)#

template<typename T, mppp::real128_op_types<T> U> constexpr bool mppp::operator>=(const T &x, const U &y)#

Comparison operators.

These operators will return true if, respectively:

\(x=y\),
\(x \neq y\),
\(x < y\),
\(x > y\),
\(x \leq y\),
\(x \geq y\),

false otherwise.

Note

These operators will handle NaN in the same way as the builtin floating-point types. For alternative comparison functions that treat NaN specially, please see the comparison functions section.

Added in version 0.20: Equality and inequality comparison with real128_cpp_complex types.

Parameters:

x – the first operand.
y – the second operand.

Returns:

the result of the comparison.

Throws:

unspecified – any exception thrown by the constructor of real128 from mp++ types.
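The NaN behaviour of the builtin floating-point types, which these operators are documented to follow, can be sketched as below. The helper name is_unordered() is hypothetical; the template only relies on the comparison operators, so it should also be usable with real128 operands, but it is shown here with double so that it runs without mp++.

```cpp
#include <limits>

// True when x and y cannot be ordered, i.e. when at least one of them
// is NaN: every ordering and equality comparison then returns false.
template <typename T>
bool is_unordered(const T &x, const T &y)
{
    return !(x < y) && !(x > y) && !(x == y);
}
```

Because a NaN compares unequal to everything, including itself, operator<() does not define a strict weak ordering on ranges that may contain NaN values.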

Constants#

A few mathematical constants are provided. The constants are available as inline variables (e.g., mppp::pi_128, requires C++17 or later) and as constexpr functions (e.g., mppp::real128_pi(), always available). Inline variables and constexpr functions provide exactly the same functionality, but inline variables are more convenient if C++17 is an option.

Note

Some of these constants are also available as macros from the quadmath library.

constexpr unsigned mppp::real128_sig_digits()#

constexpr unsigned mppp::sig_digits_128#

The number of binary digits in the significand of a real128 (113).

constexpr mppp::real128 mppp::real128_max()#

constexpr mppp::real128 mppp::max_128#

The maximum positive finite value representable by real128.

constexpr mppp::real128 mppp::real128_min()#

constexpr mppp::real128 mppp::min_128#

The minimum positive value representable by real128 with full precision.

constexpr mppp::real128 mppp::real128_epsilon()#

constexpr mppp::real128 mppp::epsilon_128#

The difference between 1 and the next larger number representable by real128 (\(2^{-112}\)).

constexpr mppp::real128 mppp::real128_denorm_min()#

constexpr mppp::real128 mppp::denorm_min_128#

The smallest positive denormalized number representable by real128.

constexpr mppp::real128 mppp::real128_inf()#

constexpr mppp::real128 mppp::inf_128#

constexpr mppp::real128 mppp::real128_nan()#

constexpr mppp::real128 mppp::nan_128#

Positive infinity and NaN.

constexpr mppp::real128 mppp::real128_pi()#

constexpr mppp::real128 mppp::pi_128#

Quadruple-precision \(\pi\) constant.

constexpr mppp::real128 mppp::real128_e()#

constexpr mppp::real128 mppp::e_128#

Quadruple-precision \(\text{e}\) constant (Euler’s number).

constexpr mppp::real128 mppp::real128_sqrt2()#

constexpr mppp::real128 mppp::sqrt2_128#

Quadruple-precision \(\sqrt{2}\) constant.

Standard library specialisations#

template<> class std::numeric_limits<mppp::real128>#

This specialisation exposes the compile-time properties of real128 as specified by the C++ standard.

static constexpr bool is_specialized = true#

static constexpr int digits = 113#

static constexpr int digits10 = 33#

static constexpr int max_digits10 = 36#

static constexpr bool is_signed = true#

static constexpr bool is_integer = false#

static constexpr bool is_exact = false#

static constexpr int radix = 2#

static constexpr int min_exponent = -16381#

static constexpr int min_exponent10 = -4931#

static constexpr int max_exponent = 16384#

static constexpr int max_exponent10 = 4931#

static constexpr bool has_infinity = true#

static constexpr bool has_quiet_NaN = true#

static constexpr bool has_signaling_NaN = false#

static constexpr std::float_denorm_style has_denorm = std::denorm_present#

static constexpr bool has_denorm_loss = true#

static constexpr bool is_iec559 = true#

static constexpr bool is_bounded = false#

static constexpr bool is_modulo = false#

static constexpr bool traps = false#

static constexpr bool tinyness_before = false#

static constexpr std::float_round_style round_style = std::round_to_nearest#

static constexpr mppp::real128 min()#

Returns: the output of mppp::real128_min().

static constexpr mppp::real128 max()#

Returns: the output of mppp::real128_max().

static constexpr mppp::real128 lowest()#

Returns: the negative of the output of mppp::real128_max().

static constexpr mppp::real128 epsilon()#

Returns: the output of mppp::real128_epsilon().

static constexpr mppp::real128 round_error()#

Returns: 0.5.

static constexpr mppp::real128 infinity()#

Returns: the output of mppp::real128_inf().

static constexpr mppp::real128 quiet_NaN()#

Returns: the output of mppp::real128_nan().

static constexpr mppp::real128 signaling_NaN()#

Returns: 0.

static constexpr mppp::real128 denorm_min()#

Returns: the output of mppp::real128_denorm_min().
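The digits10 and max_digits10 values of this specialisation are consistent with the usual formulas for a radix-2 type with \(p\) significand digits, namely \(\left\lfloor \left( p - 1 \right) \log_{10} 2 \right\rfloor\) and \(\left\lceil 1 + p \log_{10} 2 \right\rceil\). The sketch below checks this for \(p = 113\); the function names digits10_from() and max_digits10_from() are hypothetical and not part of mp++.

```cpp
#include <cmath>

// Number of decimal digits that survive a decimal -> binary -> decimal
// round trip, for a binary significand of the given width.
inline int digits10_from(int digits)
{
    return static_cast<int>(std::floor((digits - 1) * std::log10(2.0)));
}

// Number of decimal digits needed for a lossless
// binary -> decimal -> binary round trip.
inline int max_digits10_from(int digits)
{
    return static_cast<int>(std::ceil(1 + digits * std::log10(2.0)));
}
```

With digits = 113 these functions return 33 and 36. When printing a real128 that must be read back without loss, 36 significant decimal digits are therefore needed.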

template<> class std::hash<mppp::real128>#

Added in version 0.27.

Specialisation of std::hash for mppp::real128.

The hash is computed via std::size_t mppp::hash(const mppp::real128 &).

using argument_type = mppp::real128#

using result_type = std::size_t#

Note

The argument_type and result_type type aliases are defined only until C++14.

std::size_t operator()(const mppp::real128 &x) const#

Parameters: x – the input mppp::real128.
Returns: a hash value for x.

User-defined literals#

Added in version 0.19.

template<char... Chars> mppp::real128 mppp::literals::operator""_rq()#

User-defined quadruple-precision literal.

This numeric literal operator template can be used to construct real128 instances. Floating-point literals in decimal and hexadecimal format are supported.

Throws: std::invalid_argument – if the input sequence of characters is not a valid floating-point literal (as defined by the C++ standard).
