Python: finding the index of nan in an all-nan list raises an error only sometimes?
For the all-nan list a = [np.nan, np.nan], a.index(np.nan) returns 0, whereas for the nan returned by b = np.nanmax(a), a.index(b) raises a ValueError. The object ids of np.nan and b are different. However, if a is [2, 3.1] and c = np.array(a).tolist(), then id(a[1]) and id(c[1]) are also different, yet a.index(c[1]) raises no ValueError?
How does list.index() work? Does it compare by value equality (I guess not, otherwise a.index(np.nan) should raise an error, because np.nan != np.nan)? By object id (again I guess not, otherwise a.index(c[1]) should raise an error)? Why does a.index(np.nanmax(a)) fail for a = [np.nan, np.nan] while a.index(np.nan) works?
import numpy as np
a = [np.nan, np.nan]
b = np.nanmax(a)
print(id(np.nan), id(a[0]), id(a[1]), id(b))
a.index(np.nan)
a.index(b)
# Output:
# 47021195940144 47021195940144 47021195940144 47021566155984
# ...
# File "<ipython-input-2-fb7cc8fa88c0>", line 9, in <module>
# a.index(b)
# ValueError: nan is not in list
Implementation of list.index
If you want to see how index is implemented (in C), you can look here.
To make it easier to follow, I rewrote it in Python:
import sys

def index(self, value, start=0, stop=sys.maxsize, /):
    # clamp start and stop to the list boundaries
    if start < 0:
        start += len(self)
        if start < 0:
            start = 0
    if stop < 0:
        stop += len(self)
        if stop < 0:
            stop = 0
    # walk the list and return the absolute position of the first match
    for i in range(start, min(stop, len(self))):
        obj = self[i]
        # identity is checked first, equality only as a fallback
        if obj is value or obj == value:
            return i
    raise ValueError("%r is not in list" % value)
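Details on why it is implemented this way
To understand this part, I recommend reading the implementation I referenced earlier.
All the magic happens in PyObject_RichCompareBool: when it is called the way index calls it (with Py_EQ), it behaves like x is y or x == y. This fact is also stated in the docs: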
int PyObject_RichCompareBool(PyObject *o1, PyObject *o2, int opid)
Compare the values of o1 and o2 using the operation specified by opid, which must be one of Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT, or Py_GE, corresponding to <, <=, ==, !=, >, or >= respectively. Returns -1 on error, 0 if the result is false, 1 otherwise. This is the equivalent of the Python expression o1 op o2, where op is the operator corresponding to opid.
Note If o1 and o2 are the same object, PyObject_RichCompareBool() will always return 1 for Py_EQ and 0 for Py_NE.
The -1 case is handled by Python, so we don't need to worry about it (Python raises the exception and stops running our code automatically).
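The identity shortcut described in the note can be observed without nan at all. The following is a minimal sketch, assuming a hypothetical class NeverEqual (not part of the question or of any library) whose == always reports False, much like nan does:

```python
class NeverEqual:
    """Hypothetical helper: == always reports False, much like nan."""

    def __eq__(self, other):
        return False


item = NeverEqual()
items = [item]

# The very same object is found, because identity is checked before ==.
same_object_index = items.index(item)

# A different instance fails both the identity and the equality check.
try:
    items.index(NeverEqual())
    other_object_found = True
except ValueError:
    other_object_found = False

print(item == item)        # False: == alone would never match
print(same_object_index)   # 0
print(other_object_found)  # False
```

The in operator on lists goes through the same comparison, so item in items is also True here even though item == item is False.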
So how does it work?
Finally, applying what we have learned, we can see why the behavior is what it is:
import numpy as np
instance1 = np.nan
l = [instance1]
instance2 = np.nanmax(l) # RuntimeWarning: All-NaN axis encountered
print(instance1 is instance2 or instance1 == instance2)
# False therefore ValueError
import numpy as np
instance1 = 3.1
l = [instance1]
instance2 = np.array(l).tolist()[0]
print(instance1 is instance2 or instance1 == instance2)
# True (instance1 == instance2) therefore no ValueError
Additionally
Here are your examples, generalized:
import numpy as np
instance1 = np.nan
l = [instance1]
instance2 = np.nanmax(l) # RuntimeWarning: All-NaN axis encountered
assert instance1 is l[0]
assert instance1 is not instance2
assert not l.index(instance1)
assert not l.index(instance2) # ValueError: nan is not in list
and
import numpy as np
instance1 = 3.1
l = [instance1]
instance2 = np.array(l).tolist()[0]
assert instance1 is l[0]
assert instance1 is not instance2
assert not l.index(instance1)
assert not l.index(instance2) # no ValueError
In Python, you can create a nan value object:
In [80]: mynan=float('nan')
In [81]: id(mynan)
Out[81]: 139640449759024
Make another one and get a different id:
In [82]: mynan=float('nan')
In [83]: id(mynan)
Out[83]: 139640449757264
numpy has its own version:
In [84]: id(np.nan)
Out[84]: 139640952170000
which, I think, always gives the same id (within a given session).
Make a list:
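The difference between a nan created on the fly and a nan stored once as a module attribute can be reproduced with the standard library alone. This is a minimal sketch, assuming math.nan as a stand-in for np.nan (both are a single float object kept as a module attribute); the variable names are illustrative only:

```python
import math

# Each call to float("nan") builds a new float object.
first = float("nan")
second = float("nan")
print(first is second)       # False: two distinct objects

# math.nan is one object stored on the module, so every lookup
# returns that same object.
print(math.nan is math.nan)  # True

# list.index finds the shared object by identity ...
print([math.nan].index(math.nan))  # 0

# ... but not a different nan object, since nan != nan.
try:
    [first].index(second)
    found_other_nan = True
except ValueError:
    found_other_nan = False
print(found_other_nan)       # False
```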
In [85]: a = [.1, np.nan, .3, mynan]
np.isnan can test for nan values, even where the id and == tests don't work:
In [86]: np.isnan(a)
Out[86]: array([False, True, False, True])
As far as I know, list.index tests id first and then ==. Remember that lists store their elements by reference.
In [87]: a.index(np.nan)
Out[87]: 1
In [88]: a.index(mynan)
Out[88]: 3
In [89]: a.index(float('nan'))
Traceback (most recent call last):
File "<ipython-input-89-33bf9e0279e3>", line 1, in <module>
a.index(float('nan'))
ValueError: nan is not in list
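Since list.index only finds a nan that is the very same object, a lookup that should match any nan has to test for nan explicitly. A minimal sketch using only the standard library follows; the helper name nan_index is hypothetical and not part of Python or numpy:

```python
import math


def nan_index(values):
    """Return the position of the first nan in values (hypothetical helper)."""
    for i, v in enumerate(values):
        # math.isnan needs a float; skip anything else in the list
        if isinstance(v, float) and math.isnan(v):
            return i
    raise ValueError("nan is not in list")


a = [0.1, float("nan"), 0.3, float("nan")]

# Works for any nan object, regardless of its id.
print(nan_index(a))  # 1
```

Because np.float64 is a subclass of float, the same check should also cover values such as the one returned by np.nanmax.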