Efficient way to extract non-None arrays from a numpy ndarray
I apologize in advance if this question seems long and basic.
Given:
import numpy as np
import time
c, q = int(3e5), int(5e5)
a = np.full( (c,q,3), None )
# fill in some non-None entries: 3D (x, y, z) positions
a[0,0, :] = np.array([-4,0.1,0])
a[0,1, :] = np.array([9.2,3.1,0])
a[0,5, :] = np.array([3,-4.3,0])
a[0,6, :] = np.array([-1,12.8,0])
a[2,1, :] = np.array([4.5,-9,0])
a[2,3, :] = np.array([-0.1,6.1,0])
a[2,8, :] = np.array([-7,1,0])
a[3,0, :] = np.array([-1,0.7,0])
a[3,6, :] = np.array([-15,26,0])
a[5,0, :] = np.array([0.1,-1.1,0])
a[7,5, :] = np.array([0,0,0])
a[8,2, :] = np.array([5,6,0])
a[9,10, :] = np.array([-1.1,1,0])
a[10,3, :] = np.array([-32,15,0])
a[11,7, :] = np.array([0,9.3,0])
a[12,2, :] = np.array([0.9,6.2,0])
a[14,9, :] = np.array([8.6,5.6,0])
a[15,5, :] = np.array([0.5,8.5,0])
Goal:
I want to extract the non-None elements from a. My current code below is very slow and inefficient because it uses plain nested for loops:
bt = time.time()
for ci in range(c):
    if any(ci == value for value in [2, 5]):
        print(f">> Generating {ci}+ ranks ...")
        poseNplus = []
        aNplus = a[ci:]
        for ci_i in range(aNplus.shape[0]):
            aNplus_Q = aNplus[ci_i]
            for qi in range(aNplus_Q.shape[0]):
                if all(aNplus_Q[qi] != None):
                    poseNplus.append( aNplus_Q[qi] )
        print(len(poseNplus), poseNplus)
et = time.time()
print(f"Took {(et-bt):.3f} s")
This takes a long time:
Took 580.888 s
Following @Marc Felix's answer, I can extract ALL non-None triplets as follows. First change the array to a = np.full( (c,q,3), np.nan ), then:
bt = time.time()
nan_values = np.any(np.isnan(a), axis=-1)
result = a[nan_values==False].reshape((-1, 3))
et = time.time()
print(f"Took {(et-bt):.3f} s")
print(result.shape)
print(result)
which returns:
Took 0.318 s
(18, 3)
[[ -4. 0.1 0. ]
[ 9.2 3.1 0. ]
[ 3. -4.3 0. ]
[ -1. 12.8 0. ]
[ 4.5 -9. 0. ] <<<--- rank2 - END: from here till end
[ -0.1 6.1 0. ]
[ -7. 1. 0. ]
[ -1. 0.7 0. ]
[-15. 26. 0. ]
[ 0.1 -1.1 0. ] <<<--- rank5 - END: from here till end
[ 0. 0. 0. ]
[ 5. 6. 0. ]
[ -1.1 1. 0. ]
[-32. 15. 0. ]
[ 0. 9.3 0. ]
[ 0.9 6.2 0. ]
[ 8.6 5.6 0. ]
[ 0.5 8.5 0. ]]
But the result I want should look like this:
>> Generating 2+ ranks ...
[[ 4.5 -9. 0. ]
[ -0.1 6.1 0. ]
[ -7. 1. 0. ]
[ -1. 0.7 0. ]
[-15. 26. 0. ]
[ 0.1 -1.1 0. ]
[ 0. 0. 0. ]
[ 5. 6. 0. ]
[ -1.1 1. 0. ]
[-32. 15. 0. ]
[ 0. 9.3 0. ]
[ 0.9 6.2 0. ]
[ 8.6 5.6 0. ]
[ 0.5 8.5 0. ]]
------------------------------------------------------------
>> Generating 5+ ranks ...
[[ 0.1 -1.1 0. ]
[ 0. 0. 0. ]
[ 5. 6. 0. ]
[ -1.1 1. 0. ]
[-32. 15. 0. ]
[ 0. 9.3 0. ]
[ 0.9 6.2 0. ]
[ 8.6 5.6 0. ]
[ 0.5 8.5 0. ]]
------------------------------------------------------------
Question:
Is there another, more time-efficient way to do this?
I am aware of this post, but its approach gives a flattened result:
b = a[a != None]
print(b)
[-4.0 0.1 0.0 9.2 3.1 0.0 3.0 -4.3 0.0 -1.0 12.8 0.0 4.5 -9.0 0.0 -0.1 6.1
0.0 -7 1 0 -1.0 0.7 0.0 -15 26 0 0.1 -1.1 0.0 0 0 0 5 6 0 -1.1 1.0 0.0
-32 15 0 0.0 9.3 0.0 0.9 6.2 0.0 8.6 5.6 0.0 0.5 8.5 0.0]
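The flat output above follows from the shape of the mask: a != None has the same shape as a, and boolean indexing with a full-shape mask always returns a 1-D array. The sketch below shows, on a tiny stand-in array, how reducing the mask over the last axis keeps whole triplets together. The array size and the names small, flat, mask, and triplets are illustrative assumptions, not part of the question.

```python
import numpy as np

small = np.full((4, 3, 3), None)       # object array, like `a` but tiny
small[0, 1, :] = [9.2, 3.1, 0]
small[2, 0, :] = [4.5, -9, 0]

# `!= None` is intentional: it compares element-wise on an object array.
flat = small[small != None]            # mask shape (4, 3, 3) -> 1-D result
mask = (small != None).all(axis=-1)    # mask shape (4, 3): one flag per triplet
triplets = small[mask]                 # one row per fully filled triplet

print(flat.shape)
print(triplets.shape)
```

The result is still an object array, so a float array with NaN as the placeholder is likely to be faster for large inputs.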
You can use np.isnan() to detect nan values. That looks like this:
nan_values = np.any(np.isnan(a), axis=-1)
Then the following should give you the correct result:
result = a[nan_values==False].reshape((-1, 3))
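As a minimal sketch of the two lines above on a tiny stand-in array (the size and the name small are illustrative assumptions): np.isnan does not accept object arrays, so the array has to be created with np.nan rather than None. Here ~nan_values is used as an equivalent spelling of nan_values==False, and the reshape is omitted because indexing a (c, q, 3) array with a (c, q) mask already yields rows of length 3.

```python
import numpy as np

small = np.full((4, 3, 3), np.nan)     # float array; NaN marks an empty slot
small[0, 1, :] = [9.2, 3.1, 0]
small[2, 0, :] = [4.5, -9, 0]
small[3, 2, :] = [0, 0, 0]             # an all-zero triplet is still valid data

nan_values = np.any(np.isnan(small), axis=-1)   # shape (4, 3)
result = small[~nan_values]                     # valid rows in row-major order
print(result)
```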
Building on @Marc Felix's answer, with a created via np.full and np.nan as in the asker's update:
nan_values = np.any(np.isnan(a[2:]), axis=-1)
result = a[2:][nan_values==False].reshape((-1, 3))
print(f">> Generating {2}+ ranks ...\n", result, '\n ------------------------------------------------------------')
nan_values = np.any(np.isnan(a[5:]), axis=-1)
result = a[5:][nan_values==False].reshape((-1, 3))
print(f">> Generating {5}+ ranks ...\n", result, '\n ------------------------------------------------------------')
This gives the expected result.
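The snippet above recomputes np.isnan for every starting rank. One possible variation, sketched here on a small stand-in array, computes the validity mask once and slices both the array and the mask for each rank. The helper triplets_from_rank and the names small, valid, and results are hypothetical and introduced only for illustration.

```python
import numpy as np

def triplets_from_rank(a, valid, start):
    """Return the valid (x, y, z) rows of a[start:], in row-major order."""
    return a[start:][valid[start:]]

small = np.full((6, 3, 3), np.nan)     # tiny stand-in for `a`
small[0, 0, :] = [-4, 0.1, 0]
small[2, 1, :] = [4.5, -9, 0]
small[3, 0, :] = [-1, 0.7, 0]
small[5, 0, :] = [0.1, -1.1, 0]

valid = ~np.isnan(small).any(axis=-1)  # computed once, shape (6, 3)
results = {start: triplets_from_rank(small, valid, start) for start in (2, 5)}
for start, rows in results.items():
    print(f">> Generating {start}+ ranks ...\n", rows)
```

Because a[start:] and valid[start:] are views, the slicing itself copies nothing; only the final boolean indexing allocates the output rows.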