Python: finding the index of nan in an all-nan list raises an error only sometimes?

For an all-nan list a = [np.nan, np.nan], a.index(np.nan) returns 0, whereas for the nan returned by b = np.nanmax(a), a.index(b) raises a ValueError. The object ids of np.nan and b are different. However, if a is [2, 3.1] and c = np.array(a).tolist(), then id(a[1]) and id(c[1]) also differ, yet a.index(c[1]) raises no ValueError?

How does list.index() work? Does it compare by value equality (I guess not, otherwise a.index(np.nan) should raise an error, since np.nan != np.nan)? By object id (again I guess not, otherwise a.index(c[1]) should raise an error)? Why does a.index(np.nanmax(a)) fail for a = [np.nan, np.nan] while a.index(np.nan) works?

import numpy as np

a = [np.nan, np.nan]
b = np.nanmax(a)

print(id(np.nan), id(a[0]), id(a[1]), id(b))

a.index(np.nan)
a.index(b)

# Output:
# 47021195940144 47021195940144 47021195940144 47021566155984
#   ...
#   File "<ipython-input-2-fb7cc8fa88c0>", line 9, in <module>
#     a.index(b)
# ValueError: nan is not in list

Implementation of list.index

If you want to see how index is implemented (in C), you can look here.
To make it easier to follow, I rewrote it in Python:

import sys


def index(self, value, start=0, stop=sys.maxsize, /):
    # clamp negative start and stop the same way slicing does
    if start < 0:
        start += len(self)
        if start < 0:
            start = 0
    if stop < 0:
        stop += len(self)
        if stop < 0:
            stop = 0

    # iterate over the requested range and try to find the value;
    # enumerate from start so the returned index refers to the whole list
    for i, obj in enumerate(self[start:stop], start):
        if obj is value or obj == value:
            return i

    raise ValueError("%r is not in list" % value)
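The identity-then-equality rule can be checked without nan at all. The following is a minimal sketch; the NeverEqual class is hypothetical and exists only for this illustration. Its instances never compare equal, not even to themselves, yet list.index still finds an instance that is stored in the list, because the identity check succeeds first.

```python
class NeverEqual:
    """Hypothetical class: == is always False, even against itself (like nan)."""

    def __eq__(self, other):
        return False


x = NeverEqual()
items = [NeverEqual(), x]

equal_to_itself = (x == x)   # False: equality alone would never match
found_at = items.index(x)    # 1: matched through the identity check

# A different instance is neither identical nor equal to anything in the list
try:
    items.index(NeverEqual())
    missing_raises = False
except ValueError:
    missing_raises = True

print(equal_to_itself, found_at, missing_raises)
```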

Details on why it is implemented this way

To understand this part, I recommend reading the implementation I referenced earlier.

All the magic happens in PyObject_RichCompareBool:
when it is called the way index calls it, it behaves like x is y or x == y

This is also stated in the docs (index uses Py_EQ):

int PyObject_RichCompareBool(PyObject *o1, PyObject *o2, int opid)

Compare the values of o1 and o2 using the operation specified by opid, which must be one of Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT, or Py_GE, corresponding to <, <=, ==, !=, >, or >= respectively. Returns -1 on error, 0 if the result is false, 1 otherwise. This is the equivalent of the Python expression o1 op o2, where op is the operator corresponding to opid.

Note If o1 and o2 are the same object, PyObject_RichCompareBool() will always return 1 for Py_EQ and 0 for Py_NE.
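To make the Note concrete, here is a rough Python-level model of the Py_EQ case. The function name rich_compare_bool_eq is made up for this sketch and it is not CPython's actual code; float("nan") stands in for any nan object. The membership test `in` on a list follows the same identity-or-equality rule, which the last two lines illustrate.

```python
def rich_compare_bool_eq(o1, o2):
    """Rough model (an assumption, not CPython's code) of
    PyObject_RichCompareBool(o1, o2, Py_EQ)."""
    if o1 is o2:           # identity shortcut described in the Note
        return True
    return bool(o1 == o2)  # otherwise fall back to ==


nan = float("nan")
other_nan = float("nan")   # a second, distinct nan object

same_object = rich_compare_bool_eq(nan, nan)             # identity wins
different_object = rich_compare_bool_eq(nan, other_nan)  # nan != nan

# Membership on a list applies the same identity-or-equality rule
in_same = nan in [nan]
in_other = other_nan in [nan]

print(same_object, different_object, in_same, in_other)
```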

The -1 case is handled by Python, so we don't need to worry about it (Python raises the exception and stops running our code).
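As a small illustration of that error case, the sketch below uses a hypothetical Explosive class whose __eq__ always raises. The exception raised during the comparison simply propagates out of list.index, so the caller sees the original error rather than "is not in list".

```python
class Explosive:
    """Hypothetical class whose equality check always raises."""

    def __eq__(self, other):
        raise RuntimeError("comparison failed")


items = [1, 2, 3]

try:
    items.index(Explosive())
    caught = None
except RuntimeError as exc:
    # The error from __eq__ surfaces unchanged; index does not swallow it
    caught = str(exc)

print(caught)
```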

So how does it work?

Finally, applying what we have learned, we can see why the behavior is the way it is:

import numpy as np

instance1 = np.nan

l = [instance1]
instance2 = np.nanmax(l)  # RuntimeWarning: All-NaN axis encountered

print(instance1 is instance2 or instance1 == instance2)
# False therefore ValueError

import numpy as np

instance1 = 3.1

l = [instance1]
instance2 = np.array(l).tolist()[0]

print(instance1 is instance2 or instance1 == instance2)
# True (instance1 == instance2) therefore no ValueError

Additionally

Here are your examples again, in generalized form:

import numpy as np

instance1 = np.nan

l = [instance1]
instance2 = np.nanmax(l)  # RuntimeWarning: All-NaN axis encountered

assert instance1 is l[0]
assert instance1 is not instance2

assert not l.index(instance1)
assert not l.index(instance2)  # ValueError: nan is not in list

import numpy as np

instance1 = 3.1

l = [instance1]
instance2 = np.array(l).tolist()[0]

assert instance1 is l[0]
assert instance1 is not instance2

assert not l.index(instance1)
assert not l.index(instance2)  # no ValueError

In Python you can create a nan value object:

In [80]: mynan=float('nan')
In [81]: id(mynan)
Out[81]: 139640449759024

Make another and get a different id:

In [82]: mynan=float('nan')
In [83]: id(mynan)
Out[83]: 139640449757264

numpy has its own version:

In [84]: id(np.nan)
Out[84]: 139640952170000

which I think always gives the same id (within a given session), since np.nan is a single module-level float object.

Make a list:

In [85]: a = [.1, np.nan, .3, mynan]

np.isnan can test for nan values, even where the id and value comparisons don't work:

In [86]: np.isnan(a)
Out[86]: array([False,  True, False,  True])
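If the goal is to locate nan entries regardless of which nan object they are, a value-based test avoids the identity problem entirely. Below is a minimal stdlib-only sketch; the helper name nan_indices is an assumption made for this example, and math.isnan plays the same role here that np.isnan plays above.

```python
import math


def nan_indices(values):
    """Return the positions of all nan entries in a list (hypothetical helper)."""
    # Guard with isinstance so non-numeric items such as strings are skipped
    return [i for i, v in enumerate(values)
            if isinstance(v, float) and math.isnan(v)]


# Two distinct nan objects, mirroring np.nan and mynan in the session above
a = [0.1, float("nan"), 0.3, float("nan")]

positions = nan_indices(a)
first = positions[0] if positions else None

print(positions, first)
```

Unlike a.index(...), this does not depend on passing in the very same nan object that the list holds.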

As far as I know, list index tests id first, then ==. Remember that lists store their elements by reference.

In [87]: a.index(np.nan)
Out[87]: 1
In [88]: a.index(mynan)
Out[88]: 3
In [89]: a.index(float('nan'))
Traceback (most recent call last):
  File "<ipython-input-89-33bf9e0279e3>", line 1, in <module>
    a.index(float('nan'))
ValueError: nan is not in list