Python NaN problems

Question

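I am reading sets of coordinates from a CSV that was generated from a pandas DataFrame. The coordinate sets are not all the same length, so the shorter ones are padded with NaNs. This is the code I am trying to get working: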

df=pd.read_csv('contours_20150210.csv') # reading in the dataframe and xy coordinates
c131x=np.asarray(df["contour_131_x"])
c131y=np.asarray(df["contour_131_y"])
c193x=np.asarray(df["contour_193_x"])
c193y=np.asarray(df["contour_193_y"])
c211x=np.asarray(df["contour_211_x"])
c211y=np.asarray(df["contour_211_y"])

nn_193_211=[]

dist_193_211 = distance_matrix(c193,c211) #Computing the distances between all the sets of coordinates

for i in range(len(dist_193_211[:][1])):
    nn_193_211.append([np.where(dist_193_211[i] == np.nanmin(dist_193_211[i]))[0][0],np.nanmin(dist_193_211[i])]) 
# I am looking for the nearest neighbors, both the value of the distance between them and which value that is in the list of coordinates

The problem is that when the for loop reaches the NaNs, I get the following error, even though I am using np.nanmin.

/tmp/ipykernel_3022/578260609.py:2: RuntimeWarning: All-NaN slice encountered
  nn_193_211.append([np.where(dist_193_211[i] == np.nanmin(dist_193_211[i]))[0][0],np.nanmin(dist_193_211[i])])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_3022/578260609.py in <module>
      1 for i in range(len(dist_193_211[:][1])):
----> 2     nn_193_211.append([np.where(dist_193_211[i] == np.nanmin(dist_193_211[i]))[0][0],np.nanmin(dist_193_211[i])])
      3 print(nn_193_211[0:100])
      4 #print(np.max(nn_193_211),np.min(nn_193_211))

IndexError: index 0 is out of bounds for axis 0 with size 0
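The warning and the IndexError above are consistent with a row of the distance matrix being entirely NaN. In that case np.nanmin has nothing to return except NaN, and since NaN never compares equal to anything, np.where finds no matches and the [0][0] lookup fails. A minimal sketch of that behaviour, using a made-up all-NaN row rather than the real data:

```python
import warnings
import numpy as np

# Hypothetical all-NaN row, standing in for one row of the distance matrix
row = np.array([np.nan, np.nan, np.nan])

with warnings.catch_warnings():
    # Silence "RuntimeWarning: All-NaN slice encountered" for this demo
    warnings.simplefilter("ignore", RuntimeWarning)
    row_min = np.nanmin(row)  # NaN, because every entry is NaN

# NaN == NaN is False everywhere, so there are no matching indices
matches = np.where(row == row_min)[0]
print(row_min, matches.size)
```

Indexing matches[0] here would raise the same "index 0 is out of bounds for axis 0 with size 0" error.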

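I decided to simply truncate the padding NaNs (they are the only NaNs in the arrays; there is no missing data anywhere else). So I read up on NaNs in Python and ran the following test: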

print('c131x: ',c131x)
print('np.nan is np.nan:',np.nan is np.nan)
print('c131x[-1] is np.nan:',c131x[-1] is np.nan)

print(np.where(np.vectorize(c131x) is np.nan))
print(np.where(np.vectorize(c131y) is np.nan))
print(np.where(np.vectorize(c193x) is np.nan))
print(np.where(np.vectorize(c193y) is np.nan))
print(np.where(np.vectorize(c211x) is np.nan))
print(np.where(np.vectorize(c211y) is np.nan))

This is the output:

c131x:  [-202.79993465 -202.49993494 -202.19993523 ...           nan           nan
           nan]
np.nan is np.nan: True
c131x[-1] is np.nan: False
(array([], dtype=int64),)
(array([], dtype=int64),)
(array([], dtype=int64),)
(array([], dtype=int64),)
(array([], dtype=int64),)
(array([], dtype=int64),)

As I understand it, both np.nan is np.nan and c131x[-1] is np.nan should return True. Am I missing something? If I can't determine where the NaNs are, I can't slice the arrays.
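One likely explanation for the output above: `is` tests object identity, not value. Indexing a NumPy array returns a fresh NumPy scalar, which is a different object from np.nan even when it holds a NaN. Similarly, np.vectorize(c131x) is np.nan compares one wrapper object against np.nan and yields a single False, so np.where returns an empty result. The sketch below uses a small made-up array (not the real c131x) to contrast identity, equality, and np.isnan:

```python
import numpy as np

# Hypothetical padded column; the real data comes from the CSV
arr = np.array([1.0, 2.0, np.nan, np.nan])

last = arr[-1]               # a NumPy scalar created by indexing
print(type(last))            # not the same object as np.nan
print(last is np.nan)        # identity check: different objects
print(last == np.nan)        # equality check: NaN never equals anything
print(np.isnan(last))        # value check: this is the reliable test

# Element-wise NaN test over the whole array, then the matching indices
nan_positions = np.flatnonzero(np.isnan(arr))
print(nan_positions)
```

np.isnan works element-wise, so it can locate every NaN in an array without any identity or equality comparison.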

Answer 1

BlueBuffalo73's suggestion gave me an error about unsafe casting; however, it inspired me to try

c131x=c131x[:np.where(np.isnan(c131x))[0][0]]

which did work. I now have truncated coordinate arrays.
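One caveat with the one-liner above: for a column that contains no NaN at all (presumably the longest coordinate set), np.where returns an empty index array and the [0][0] lookup raises an IndexError. A minimal sketch of a guarded version follows; trim_nan_padding is a hypothetical helper name, and the sample arrays are made up for illustration:

```python
import numpy as np

def trim_nan_padding(values):
    """Return values up to (not including) the first NaN.

    Hypothetical helper: assumes NaNs occur only as trailing padding.
    """
    values = np.asarray(values, dtype=float)
    nan_idx = np.flatnonzero(np.isnan(values))
    if nan_idx.size == 0:
        # No padding present, so return the array unchanged
        return values
    return values[:nan_idx[0]]

padded = np.array([-202.8, -202.5, -202.2, np.nan, np.nan])  # made-up padded column
full = np.array([1.0, 2.0, 3.0])                             # made-up column without NaNs

print(trim_nan_padding(padded))
print(trim_nan_padding(full))
```

If NaNs could also appear in the middle of a column, a boolean mask such as values[~np.isnan(values)] would drop all of them instead of cutting at the first one.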
