What rule governs rounding behavior when applying static_cast<float> to a double?
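If we have a double-precision value in C++ and perform a static_cast<float> on it, will the returned value always be smaller in absolute value? My intuition says yes, for the following reasons.
- The set of possible single-precision exponents is a strict subset of the double-precision exponents.
- When converting the double-precision mantissa to single precision, the trailing bits are probably truncated so that the double's mantissa fits into the float's mantissa. However, it is not impossible that the value is sometimes rounded up to the next higher float if that is more accurate. Perhaps this is system-dependent, or defined in some standard.
I ran some numerical experiments on this with the program below. It seems that sometimes the value is rounded up and sometimes it is rounded down.
Where can I find more information on what rounding behavior to expect? Does it always round to the nearest float?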
    #include <cmath>
    #include <iostream>
    #include <limits>  // std::numeric_limits (missing in the original listing)

    int main() {
        // Start testing double precision values starting at x, going up to max
        double x = 0.98;
        constexpr double max = 1e10;
        // Loop over many possible double-precision values, print out
        // if casting to float ever produced a larger number.
        int output_counter = 0; // output every n steps
        constexpr int output_interval = 100000000;
        std::cout.precision(17);
        while (x < max) {
            // volatile to ensure compiler doesn't optimize this out
            volatile float xprime = static_cast<float>(x);
            double xprimeprime = static_cast<double>(xprime);
            if (xprimeprime > x)
                std::cout << "Found a round up! x=" << x << ", xprime = " << xprime << std::endl;
            // Go to the next higher double precision value
            x = std::nextafter(x, std::numeric_limits<double>::infinity());
            output_counter++;
            if (output_counter == output_interval) {
                std::cout << x << std::endl;
                output_counter = 0;
            }
        }
    }
The standard says in [conv.double]:
A prvalue of floating-point type can be converted to a prvalue of another floating-point type. If the source value can be exactly represented in the destination type, the result of the conversion is that exact representation. If the source value is between two adjacent destination values, the result of the conversion is an implementation-defined choice of either of those values. Otherwise, the behavior is undefined.
Note that with the <limits> header you can check the rounding style via std::numeric_limits<T>::round_style. See [round.style] for the possible values. (At least I assume that floating-point conversion counts as floating-point arithmetic.)
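I could not find an explicit answer in the Draft C++17 Standard I normally use for answers to questions such as this [1]; however, cppreference (which is generally reliable) strongly suggests that the rounding mode for floating-point conversions is implementation-defined.
However, it also states that, if IEEE-754 rules are followed, the value is rounded to the nearest representable value [2]: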
Floating-point conversions
A prvalue of a floating-point type can be converted to a prvalue of any other floating-point type.
If the conversion is listed under floating-point promotions, it is a promotion and not a conversion.
- If the source value can be represented exactly in the destination type, it does not change. If the source value is between two representable values of the destination type, the result is one of those two values (it is implementation-defined which one, although if IEEE arithmetic is supported, rounding defaults to nearest).
- Otherwise, the behavior is undefined.
Moreover, the IEEE-754 default behavior can be changed with the std::fesetround(int round) function, using one of the following rounding modes defined in the <cfenv> header:
#define FE_DOWNWARD /*implementation defined*/ // (since C++11)
#define FE_TONEAREST /*implementation defined*/ // (since C++11)
#define FE_TOWARDZERO /*implementation defined*/ // (since C++11)
#define FE_UPWARD /*implementation defined*/ // (since C++11)
[1] BlameTheBits found the relevant section in the Standard. In the C++17 draft I mentioned, this is actually §7.9.1, but it is otherwise similar.
[2] IEEE-754 actually defines 5 different rules for floating-point rounding.