Why can the maximum value of a numpy array not be expressed in that dtype?

Question

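I am converting a NumPy array from a floating-point dtype to an integer dtype. In the process, I want to clamp any value above the maximum the dtype allows to that maximum. For some reason this fails, and the conversion returns the minimum value instead. Here is code to reproduce it (Python 3, NumPy 1.22.2), using numpy.inf only as an example: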

import numpy

float_array = numpy.array([[1, +numpy.inf], [2, 2]])
dtype = numpy.dtype(numpy.int64)
# Replace +inf with the largest int64 value, then cast to int64
cut_array = numpy.nan_to_num(float_array, posinf=numpy.iinfo(dtype).max)
int_array = cut_array.astype(dtype)

This returns int_array[0,1] equal to -9223372036854775808. Why can the maximum representable value (about 9.2e+18) not actually be used with dtype int64?

I tested this, and values slightly below the maximum work fine. For example, using posinf=numpy.iinfo(dtype).max - 600 gives a correct conversion.

Answer 1

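From comments by Warren Weckesser and Tim Roberts: a double has only 53 bits of precision, so it cannot represent every int64 value exactly. For example, int(float(9223372036854775807)) = 9223372036854775808. In this example, converting to float has already rounded the original int to the nearest representable double, which effectively adds 1 to it. The value is then 2**63, one past the int64 maximum, so casting it back to int64 overflows.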
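To make the rounding concrete, here is a minimal sketch that first checks the claim above and then shows one possible workaround: clamping to the largest double that lies strictly below 2**63 instead of to numpy.iinfo(dtype).max. The name safe_max is just an illustrative choice, and using numpy.nextafter this way is my suggestion, not something from the comments.

```python
import numpy as np

info = np.iinfo(np.int64)

# 2**63 - 1 has no exact double representation; it rounds up to 2**63.
as_float = float(info.max)
print(as_float == 2.0 ** 63)          # True
print(int(as_float) - info.max)       # 1

# Largest double strictly below 2**63 (hypothetical name: safe_max).
# Doubles in [2**62, 2**63) are spaced 1024 apart, so this is 2**63 - 1024,
# which fits in int64 exactly.
safe_max = np.nextafter(np.float64(info.max), 0.0)

float_array = np.array([[1.0, np.inf], [2.0, 2.0]])
cut_array = np.nan_to_num(float_array, posinf=safe_max)
int_array = cut_array.astype(np.int64)
print(int_array[0, 1])                # 9223372036854774784
```

The clamped value is 1023 below the true int64 maximum. That is the price of going through float64, since no double exists between 2**63 - 1024 and 2**63.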


python

arrays

numpy