Misunderstanding Go Language specification on floating-point rounding

Question

The Constant expressions section of the Go language specification states:

A compiler may use rounding while computing untyped floating-point or complex constant expressions; see the implementation restriction in the section on constants. This rounding may cause a floating-point constant expression to be invalid in an integer context, even if it would be integral when calculated using infinite precision, and vice versa.

Does the sentence

This rounding may cause a floating-point constant expression to be invalid in an integer context

refer to something like the following?

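package main

import "fmt"

func main() {
    a := 853784574674.23846278367 // a is a float64 variable, not a constant
    fmt.Println(int8(a)) // printed 0 for the asker; the result is implementation-dependent
}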

Answer 1

int8 is a signed integer type whose values range from -128 to 127. That is why you see an unexpected value from the int8(a) conversion.

Answer 2

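The quoted part of the spec does not apply to your example, because a is not a constant expression but a variable, so int8(a) converts a non-constant expression. That conversion is covered by Spec: Conversions, in the part on conversions between numeric types: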

When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero).

[...] In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent.

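Since you are converting the non-constant expression a (holding the value 853784574674.23846278367) to an integer, the fractional part is discarded, and since the result does not fit into int8, the result is unspecified: it is implementation-dependent.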
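To make the distinction concrete, here is a minimal sketch of a non-constant conversion whose value does fit in int8, so the result is fully defined by the truncation rule. The helper name toInt8 is hypothetical and only serves to force a non-constant conversion; the commented-out line shows the constant form of the asker's conversion, which a compiler rejects instead of producing an implementation-dependent value.

```go
package main

import "fmt"

// toInt8 is a hypothetical helper. Because f is a variable (a parameter),
// int8(f) is a non-constant conversion governed by Spec: Conversions.
func toInt8(f float64) int8 {
	return int8(f)
}

func main() {
	// In-range values: the fraction is discarded (truncation towards zero).
	fmt.Println(toInt8(100.9)) // 100
	fmt.Println(toInt8(-3.99)) // -3

	// The same conversion applied to a constant is checked at compile time.
	// Uncommenting the next line makes the program fail to compile, because
	// the constant is neither integral nor within the range of int8:
	// fmt.Println(int8(853784574674.23846278367))
}
```

Only the out-of-range case from the question is implementation-dependent; in-range values such as the two above behave the same on every conforming implementation.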

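What the quoted section means is that although constants are represented with much higher precision than the built-in types (e.g. float64 or int64), the precision a compiler has to implement is not infinite (for practical reasons). Even if the floating-point literals themselves can be represented exactly, operations on them may be carried out with intermediate rounding and may not give the mathematically correct result.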

The spec includes the minimum supportable precision:

Implementation restriction: Although numeric constants have arbitrary precision in the language, a compiler may implement them using an internal representation with limited precision. That said, every implementation must:

Represent integer constants with at least 256 bits.

Represent floating-point constants, including the parts of a complex constant, with a mantissa of at least 256 bits and a signed binary exponent of at least 16 bits.

Give an error if unable to represent an integer constant precisely.

Give an error if unable to represent a floating-point or complex constant due to overflow.

Round to the nearest representable constant if unable to represent a floating-point or complex constant due to limits on precision.

For example:

package main

import "fmt"

const (
    x = 1e100000 + 1
    y = 1e100000
)

func main() {
    fmt.Println(x - y)
}

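This code should output 1, since x is 1 greater than y. Running it on the Go Playground outputs 0, because the constant expression x - y is evaluated with rounding and the +1 is lost. Both x and y are integers (they have no fractional part), so in an integer context the result should be 1. But representing 1e100000 + 1 exactly takes roughly 332,000 bits, which a compiler is not required to support (per the spec, a 256-bit mantissa is sufficient).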

If we make the constants smaller, we get the correct result:

package main

import "fmt"

const (
    x = 1e1000 + 1
    y = 1e1000
)

func main() {
    fmt.Println(x - y)
}

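This outputs the mathematically correct result, 1. Try it on the Go Playground. Representing 1e1000 + 1 exactly takes roughly 3,322 bits, which appears to be supported (and is well above the 256-bit minimum).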
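The loss of the +1 can also be reproduced at run time with math/big, which lets the mantissa size be chosen explicitly. This is only a sketch of the rounding effect under an assumed mantissa width; it is not a description of how any particular compiler stores constants. The function name plusOneSurvives and the choice of 1e100 are illustrative assumptions.

```go
package main

import (
	"fmt"
	"math/big"
)

// plusOneSurvives (a hypothetical helper) reports whether
// (1e100 + 1) - 1e100 still equals 1 when every intermediate value
// is rounded to a mantissa of prec bits.
func plusOneSurvives(prec uint) bool {
	x, _, err := big.ParseFloat("1e100", 10, prec, big.ToNearestEven)
	if err != nil {
		panic(err)
	}
	one := big.NewFloat(1)

	// The sum is rounded to prec bits, just like a limited-precision constant.
	sum := new(big.Float).SetPrec(prec).Add(x, one)
	diff := new(big.Float).SetPrec(prec).Sub(sum, x)
	return diff.Cmp(one) == 0
}

func main() {
	// 1e100 + 1 needs about 333 bits to be exact.
	fmt.Println(plusOneSurvives(256)) // false: the +1 is rounded away
	fmt.Println(plusOneSurvives(512)) // true: 512 bits hold the sum exactly
}
```

With the spec's minimum of 256 mantissa bits, the sum rounds back to 1e100 and the difference is 0, which mirrors what happens to x - y in the first example above.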


Tags: constants, go, constant-expression