Python numpy.divmod and integer representation

Question

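I am trying to use numpy.divmod with very large integers, and I have noticed some strange behavior. Around 2**63 ~ 1e19 (which should be the limit of the usual in-memory representation of int in Python 3.5+), this happens: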

from numpy import divmod

test = 10**6
for i in range(15,25):
  x = 10**i
  print(i, divmod(x, test))

15 (1000000000, 0)
16 (10000000000, 0)
17 (100000000000, 0)
18 (1000000000000, 0)
19 (10000000000000.0, 0.0)
20 ((100000000000000, 0), None)
21 ((1000000000000000, 0), None)
22 ((10000000000000000, 0), None)
23 ((100000000000000000, 0), None)
24 ((1000000000000000000, 0), None)

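Somehow the quotient and remainder work fine up to 2**63, and then something changes.

My guess is that the int representation is "vectorized" (i.e. like BigInt in Scala, a little-endian Seq of Long). But in that case I would expect divmod(array, test) to return a pair of arrays: an array of quotients and an array of remainders.

I am not aware of this feature. It does not happen with the built-in divmod (everything works as expected).

Why does this happen? Is it related to the internal representation of int?

Details: numpy version 1.13.1, Python 3.6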

Answer 1

The problem is that np.divmod converts its arguments to arrays, and what happens then is really simple:

>>> import numpy as np
>>> np.array(10**19)
array(10000000000000000000, dtype=uint64)
>>> np.array(10**20)
array(100000000000000000000, dtype=object)

For 10**i with i > 19 you get an object array; in the other cases it will be a "real NumPy array".
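The cut-off points can be checked directly. The following is a minimal sketch that is not part of the original post: INT64_MAX, UINT64_MAX, dtypes and promoted are names introduced here for illustration. The promotion check at the end is offered as the likely reason the i = 19 line in the question prints floats (a uint64 value combined with a signed 64-bit divisor), not as a confirmed diagnosis of NumPy 1.13.1.

```python
import numpy as np

# Python ints are arbitrary precision; the 64-bit limits only matter once
# NumPy has to pick a fixed-width dtype for the value.
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1

dtypes = {}
for i in (18, 19, 20):
    x = 10**i
    dtypes[i] = np.array(x).dtype
    # Show whether the value fits in each 64-bit range, next to the dtype NumPy picked
    print(i, x <= INT64_MAX, x <= UINT64_MAX, dtypes[i])

# There is no 64-bit integer type that holds both uint64 and int64 values,
# so NumPy's promotion rules fall back to a floating-point type here.
promoted = np.result_type(np.uint64, np.int64)
print(promoted)
```

Since 2**63 is about 9.22e18 and 2**64 is about 1.84e19, 10**19 lands in the narrow band that only uint64 can hold, and 10**20 is past both limits.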

And, indeed, the behavior of np.divmod with object arrays does seem strange:

>>> np.divmod(np.array(10**5, dtype=object), 10)   # smaller value but object array
((10000, 0), None)

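I guess that in this case the regular Python built-in divmod computes the first returned element, and all the remaining items are filled with None, because it delegates to Python's function.

Note that object arrays generally behave differently from native-dtype arrays. They are much slower and often delegate to Python functions (which is frequently the reason for differing results).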
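If exact quotients and remainders are needed for integers beyond the 64-bit range, one possible workaround is to build the object array explicitly and use the // and % operators, which on object arrays are applied to the underlying Python ints element by element. This is only a sketch under that assumption; the names values, divisor, quotients and remainders are illustrative, and it deliberately avoids calling np.divmod on object arrays, since the behavior shown above is specific to the NumPy version in the question.

```python
import numpy as np

divisor = 10**6

# dtype=object keeps every element as an exact Python int,
# regardless of whether it would fit in 64 bits.
values = np.array([10**i for i in range(15, 25)], dtype=object)

# On object arrays these operators are evaluated per element with
# Python's own integer arithmetic, so no precision is lost.
quotients = values // divisor
remainders = values % divisor

for i, q, r in zip(range(15, 25), quotients, remainders):
    print(i, (q, r))
```

This yields the pair of arrays (quotients and remainders) that the question expected, at the cost of the slower object-array code path. For a single pair of scalars, the built-in divmod is enough.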

python

numpy

internal-representation

divmod