Misunderstanding the Go language specification on floating-point rounding

The Go language specification, in the section on Constant expressions, states:

A compiler may use rounding while computing untyped floating-point or complex constant expressions; see the implementation restriction in the section on constants. This rounding may cause a floating-point constant expression to be invalid in an integer context, even if it would be integral when calculated using infinite precision, and vice versa.


Does the sentence

This rounding may cause a floating-point constant expression to be invalid in an integer context

refer to something like the following?

package main

import "fmt"

func main() {
    a := 853784574674.23846278367
    fmt.Println(int8(a)) // output: 0
}

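int8 is a signed integer type whose values range from -128 to 127. That is why you see an unexpected value from the int8(a) conversion.

The quoted part of the spec does not apply to your example, because a is not a constant expression but a variable, so int8(a) converts a non-constant expression. This conversion is covered by Spec: Conversions, Conversions between numeric types: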

When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero).

[...] In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent.

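Since you convert the non-constant expression a, whose value is 853784574674.23846278367, to an integer, the fraction part is discarded, and since the result does not fit into an int8, the result is not specified: it is implementation-dependent.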

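The quoted part means that while constants are represented with much higher precision than the builtin types (e.g. float64 or int64), the precision that a compiler has to implement is not infinite (for practical reasons). Even if the floating-point literals themselves are representable exactly, operations on them may be carried out with intermediate rounding and may not give the mathematically correct result.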

The spec includes the minimum supportable precision:

Implementation restriction: Although numeric constants have arbitrary precision in the language, a compiler may implement them using an internal representation with limited precision. That said, every implementation must:

  • Represent integer constants with at least 256 bits.
  • Represent floating-point constants, including the parts of a complex constant, with a mantissa of at least 256 bits and a signed binary exponent of at least 16 bits.
  • Give an error if unable to represent an integer constant precisely.
  • Give an error if unable to represent a floating-point or complex constant due to overflow.
  • Round to the nearest representable constant if unable to represent a floating-point or complex constant due to limits on precision.

For example:

package main

import "fmt"

const (
    x = 1e100000 + 1
    y = 1e100000
)

func main() {
    fmt.Println(x - y)
}

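This code should output 1, because x is 1 larger than y. Running it on the Go Playground outputs 0, because the constant expression x - y is evaluated with rounding and the +1 is lost as a result. Both x and y are integers (they have no fraction part), so in an integer context the result should be 1. But representing a number of the magnitude 1e100000 exactly requires around 332,000 bits, which a compiler is not required to support (according to the spec, a 256-bit mantissa is sufficient).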
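The same rounding can be observed directly in an integer context, which is the situation the quoted sentence describes. The sketch below assumes the standard gc toolchain (the one the Go Playground runs); other compilers may use a different internal precision. The constant h is an illustrative addition: mathematically it equals 0.5, which would not be valid as an int, yet after rounding it becomes 0 and the assignment compiles. That is the "vice versa" case.

```go
package main

import "fmt"

const (
	x = 1e100000 + 1
	y = 1e100000
	// Mathematically 0.5, which could not be assigned to an int.
	h = 1e100000 + 0.5 - 1e100000
)

func main() {
	// Mathematically 1, but the +1 is lost to rounding before the
	// difference is taken.
	var n int = x - y

	// Compiles only because rounding turned 0.5 into 0.
	var m int = h

	fmt.Println(n, m)
}
```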

If we make the constants smaller, we get the correct result:

package main

import "fmt"

const (
    x = 1e1000 + 1
    y = 1e1000
)

func main() {
    fmt.Println(x - y)
}

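This outputs the mathematically correct result, 1. You can try it on the Go Playground. Representing the number 1e1000 exactly requires around 3,322 bits, which seems to be supported (and is far above the minimum 256-bit requirement).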