Integer to float conversions with IEEE FP

c++
ieee-754

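In a C++ implementation that supports IEEE-754 floating-point arithmetic, what guarantees are there about conversions from integer types to floating-point types?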

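Specifically, is converting any integer value to any floating-point type always well-defined behaviour, possibly resulting in a value of ±inf? Or are there cases where this results in undefined behaviour?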

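(Note: I am not asking about exact conversion, only whether performing the conversion is always legal from the point of view of the language standard.)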

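In section 7.4, the IEEE-754 (2008) standard says this is well-defined behaviour. But that only concerns IEEE-754 itself, and a C/C++ implementation is free to respect it or not (see zwol's answer).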

The overflow exception shall be signaled if and only if the destination format’s largest finite number is exceeded in magnitude by what would have been the rounded floating-point result were the exponent range unbounded. The default result shall be determined by the rounding-direction attribute and the sign of the intermediate result as follows:

a) roundTiesToEven and roundTiesToAway carry all overflows to ∞ with the sign of the intermediate result.

b) roundTowardZero carries all overflows to the format’s largest finite number with the sign of the intermediate result.

c) roundTowardNegative carries positive overflows to the format’s largest finite number, and carries negative overflows to −∞.

d) roundTowardPositive carries negative overflows to the format’s most negative finite number, and carries positive overflows to +∞.

In every case, these four points give a deterministic result for a conversion from an integer type to a floating-point type.

All other cases (no overflow) are also well defined, with the deterministic results given by the IEEE-754 standard.
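To make the four overflow rules quoted above concrete, here is a minimal sketch that maps a rounding direction and the sign of the intermediate result to the default overflow result. The names `Rounding`, `overflow_result` and `overflow_examples` are made up for illustration; this only models the rule in software and does not query or change the hardware rounding mode.

```cpp
#include <cassert>
#include <limits>

// Hypothetical enum mirroring the IEEE 754-2008 rounding-direction attributes.
enum class Rounding { TiesToEven, TiesToAway, TowardZero, TowardNegative, TowardPositive };

// Default result of an overflowing operation, following rules a) to d) of
// clause 7.4 as quoted above. `negative` is the sign of the intermediate result.
template <class F>
F overflow_result(Rounding mode, bool negative) {
    const F inf = std::numeric_limits<F>::infinity();
    const F max = std::numeric_limits<F>::max();
    bool to_infinity = false;
    switch (mode) {
        case Rounding::TiesToEven:
        case Rounding::TiesToAway:     to_infinity = true;      break;  // a)
        case Rounding::TowardZero:     to_infinity = false;     break;  // b)
        case Rounding::TowardNegative: to_infinity = negative;  break;  // c)
        case Rounding::TowardPositive: to_infinity = !negative; break;  // d)
    }
    const F magnitude = to_infinity ? inf : max;
    return negative ? -magnitude : magnitude;
}

// Usage: each (mode, sign) pair has exactly one default result.
inline void overflow_examples() {
    assert(overflow_result<float>(Rounding::TiesToEven, false) ==
           std::numeric_limits<float>::infinity());
    assert(overflow_result<float>(Rounding::TowardZero, true) ==
           -std::numeric_limits<float>::max());
}
```

Whichever mode is active, the outcome is one specific value, which is the sense in which the result is deterministic.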

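IEC 60559 (the current successor standard to IEEE 754) makes integer-to-floating-point conversion well defined in all cases, as discussed in the other answer, but it is the language standard that has the final word on the subject.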

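In the base standard, C++11 section 4.9 "Floating-integral conversions", paragraph 2, makes out-of-range integer-to-floating-point conversions undefined behaviour. (The quotation is from document N3337, which is the closest approximation to the official 2011 C++ standard that is publicly available at no charge.)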

A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating point type. The result is exact if possible. If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value. [ Note: Loss of precision occurs if the integral value cannot be represented exactly as a value of the floating type. — end note ] If the value being converted is outside the range of values that can be represented, the behavior is undefined. If the source type is bool, the value false is converted to zero and the value true is converted to one.

Emphasis mine. The C standard says the same thing in different words (section 6.3.1.4, paragraph 2).
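One way to stay clear of the undefined case is to check at compile time that every value of the integer type lies within the finite range of the floating-point type. The sketch below does that with `std::numeric_limits`. The helper names `always_in_finite_range` and `convert_checked` are my own, and the test is deliberately conservative: it assumes a radix-2 floating-point type and may reject combinations that are in fact fine.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True if every value of integer type I has magnitude at most
// 2^(max_exponent - 1), which is a finite value of F. I has `digits` value
// bits, so its magnitudes never exceed 2^digits.
template <class F, class I>
constexpr bool always_in_finite_range() {
    return std::numeric_limits<F>::radix == 2 &&
           std::numeric_limits<I>::digits < std::numeric_limits<F>::max_exponent;
}

// Converts only when the out-of-range case (undefined per the paragraph
// quoted above) cannot occur for any value of I.
template <class F, class I>
F convert_checked(I value) {
    static_assert(always_in_finite_range<F, I>(),
                  "some values of I may be outside the range of F");
    const F result = static_cast<F>(value);
    assert(std::isfinite(result));  // sanity check: an in-range conversion is finite
    return result;
}
```

For example, `convert_checked<float>(1 << 24)` compiles and returns 16777216.0f exactly. On typical implementations with a binary32 `float`, the check passes for every standard integer type, so the undefined case mostly matters for unusually wide integer types or unusually narrow floating-point types.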

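The C++ standard does not discuss what it would mean for a C++ implementation to supply IEC 60559-conformant floating-point arithmetic. However, the C standard (the closest approximation to C11 available online at no charge is N1570) does discuss this in its Annex F, and C++ implementors do tend to turn to C for guidance when C++ leaves something unspecified. There is no explicit discussion of integer-to-floating-point conversion in Annex F, but there is this sentence in F.1p1: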

Since negative and positive infinity are representable in IEC 60559 formats, all real numbers lie within the range of representable values.

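Putting that sentence together with 6.3.1.4p2 suggests to me that an integer-to-floating-point conversion is meant to produce ±Inf when the integer's magnitude is outside the range of representable finite numbers. That interpretation is consistent with the conversion behaviour specified by IEC 60559, so we can be reasonably confident that this is what a C implementation claiming conformance to Annex F would do.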

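However, applying any interpretation of the C standard to C++ is risky at best; C++ has not been defined as a superset of C for a very long time. If your C++ implementation predefines the macro __STDC_IEC_559__ and/or documents compliance with IEC 60559 in some way, and you are not using the "be sloppy about floating-point math in the name of speed" compilation mode (which may be on by default), you can probably rely on out-of-range conversions to produce ±Inf. Otherwise, it's UB.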
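As a rough way to test those conditions from inside the program, something like the following could be used. It is a heuristic sketch, not a guarantee. The function name `overflow_to_inf_is_plausible` is made up, `std::numeric_limits<F>::is_iec559` stands in for "documents compliance in some way", and `__FAST_MATH__` is the macro GCC and Clang predefine under `-ffast-math`; other compilers need their own check.

```cpp
#include <cassert>
#include <limits>

#if defined(__FAST_MATH__)      // GCC/Clang: -ffast-math is in effect
constexpr bool fast_math_active = true;
#else
constexpr bool fast_math_active = false;
#endif

#if defined(__STDC_IEC_559__)   // the implementation claims Annex F conformance
constexpr bool stdc_iec_559 = true;
#else
constexpr bool stdc_iec_559 = false;
#endif

// Heuristic only: true when the implementation claims IEC 60559 behaviour,
// F has an infinity, and no fast-math mode is visibly enabled.
template <class F>
constexpr bool overflow_to_inf_is_plausible() {
    return !fast_math_active &&
           std::numeric_limits<F>::has_infinity &&
           (stdc_iec_559 || std::numeric_limits<F>::is_iec559);
}
```

A `static_assert(overflow_to_inf_is_plausible<float>(), "...")` next to code that depends on ±Inf at least makes the assumption explicit, although as far as the C++ standard is concerned the conversion is still undefined.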
