Delete rows where specific columns contain some value

I have an array with 8 values per row:

data = np.array([[ 1,  2,  3, 5, 6, 7, 15, 27],
                 [ 5,  6,  7, 5, 10, 12, 23, 52],
                 [ 9, 10, 0, 0, 0, 0, 27,44]])

I want to delete every row where data[:,2:5] is equal to zero (that is, where all of columns 2 through 5 are zero).

I found that the following works, but it is verbose and I cannot extend it to more columns:

data_nonzero = np.delete(data, np.where(np.bitwise_and(np.bitwise_and((data[:,2]==0), (data[:,3]==0)), np.bitwise_and((data[:,4]==0), (data[:,5]==0)) ) )[0], 0)

I tried something like:

new_a = np.delete(data, np.s_[:,2:5] == 0, axis=0)

But that does not seem to work:

boolean array argument obj to delete must be one dimensional

Ideally, it would check 2 conditions on the 4 columns of each row. Something like:

new_a = np.delete(data, np.where(np.s_[:,2:5] == 0 | np.s_[:,2:5] > 50000), axis=0)

In this particular case, I would just use boolean indexing with the negation of your condition, i.e.

>>> data[(data[:, 2:6] != 0).any(axis=1), ...]
array([[ 1,  2,  3,  5,  6,  7, 15, 27],
       [ 5,  6,  7,  5, 10, 12, 23, 52]])

In other words, you want to select the rows that contain any nonzero value in those columns.
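The same idea seems to extend to the two-condition case from the question. Below is a minimal sketch, assuming the goal is to drop rows whose columns 2 through 5 are either all zero or all above a threshold. The 50000 threshold comes from the question; the fourth row of the sample array is made up for illustration, and `cols` and `drop` are hypothetical names.

```python
import numpy as np

# The question's array plus one invented row of large values
data = np.array([[1, 2, 3, 5, 6, 7, 15, 27],
                 [5, 6, 7, 5, 10, 12, 23, 52],
                 [9, 10, 0, 0, 0, 0, 27, 44],
                 [3, 4, 65535, 65535, 65535, 65535, 1, 2]])

# Columns 2, 3, 4 and 5 (the stop index of a slice is exclusive)
cols = data[:, 2:6]

# One boolean per row: True where every selected column is zero,
# or where every selected column exceeds the threshold
drop = (cols == 0).all(axis=1) | (cols > 50000).all(axis=1)

# Keep the rows that are not flagged
filtered = data[~drop]
print(filtered)
```

Note that `np.s_[:, 2:5]` is only a tuple of slice objects, so `np.s_[:, 2:5] == 0` evaluates to a plain `False` rather than a per-row mask, which appears to be why `np.delete` rejects it. The parentheses around each comparison also matter, because `|` binds more tightly than `==` and `>`.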

I came up with a solution.
The data.csv file contains:

var1, var2, var3, var4, var5, var6, var7
x,x,0,0,0,0,x
x,x,0,0,0,0,x
x,x,65535,65535,65535,65535,x
x,x,0,40,116,3,x
x,x,65535,95,208,2,x
x,x,3,147,277,2,x
x,x,2,203,325,2,x

Code:

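import numpy as np
from numpy import genfromtxt

# filename is defined elsewhere; filename[0] is the path to data.csv
data = genfromtxt(filename[0], delimiter=',', skip_header=1)

print('------ Original data ------')
print(data[0:7,:])

# Remove rows where columns 2, 3 and 4 are all zero.
# Passing a boolean mask to np.delete requires NumPy >= 1.19.
new_a = np.delete(data, ~np.any(data[:,2:5], axis=1), axis=0)
print('------ Rows where data[:,2:5] == 0 removed ------')
print(new_a[0:5,:])

# Remove rows where columns 2, 3 and 4 are all above 60000.
new_b = np.delete(new_a, np.all(new_a[:,2:5] > 60000, axis=1), axis=0)
print('------ Rows where data[:,2:5] > 60000 removed ------')
print(new_b[0:4,:])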

Result:

------ Original data ------
[[x x 0.00 0.00 0.00 0.00 x]
[x x 0.00 0.00 0.00 0.00 x]
[x x 65535.00 65535.00 65535.00 65535.00 x]
[x x 0.00 40.00 116.00 3.00 x]
[x x 65535.00 95.00 208.00 2.00 x]
[x x 3.00 147.00 277.00 2.00 x]
[x x 2.00 203.00 325.00 2.00 x]]

------ Rows where data[:,2:5] == 0 removed ------
[[x x 65535.00 65535.00 65535.00 65535.00 x]
[x x 0.00 40.00 116.00 3.00 x]
[x x 65535.00 95.00 208.00 2.00 x]
[x x 3.00 147.00 277.00 2.00 x]
[x x 2.00 203.00 325.00 2.00 x]]

------ Rows where data[:,2:5] > 60000 removed ------
[[66.01 -46.05 0.00 40.00 116.00 3.00 x]
[66.01 -39.46 65535.00 95.00 208.00 2.00 x]
[66.01 -32.87 3.00 147.00 277.00 2.00 x]
[66.01 -26.28 2.00 203.00 325.00 2.00 x]]
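
The two `np.delete` calls could also be folded into a single boolean mask. This is a sketch under the assumption that the `x` placeholders are ordinary numeric columns (the 1.0 values below are invented stand-ins). It uses the same `2:5` slice as the code above, which covers columns 2, 3 and 4. The names `block`, `all_zero`, `all_large` and `keep` are hypothetical.

```python
import numpy as np

# Invented stand-in for data.csv: the x columns are replaced by 1
data = np.array([[1, 1, 0, 0, 0, 0, 1],
                 [1, 1, 0, 0, 0, 0, 1],
                 [1, 1, 65535, 65535, 65535, 65535, 1],
                 [1, 1, 0, 40, 116, 3, 1],
                 [1, 1, 65535, 95, 208, 2, 1],
                 [1, 1, 3, 147, 277, 2, 1],
                 [1, 1, 2, 203, 325, 2, 1]], dtype=float)

block = data[:, 2:5]                      # columns 2, 3 and 4

all_zero = ~block.any(axis=1)             # no nonzero value in the block
all_large = (block > 60000).all(axis=1)   # every value above 60000

# Keep rows that match neither removal condition
keep = ~(all_zero | all_large)
result = data[keep]
print(result)
```

Indexing with `data[keep]` also avoids depending on how `np.delete` interprets boolean arguments: before NumPy 1.19, booleans passed to `np.delete` were cast to the integers 0 and 1 instead of being treated as a mask.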