Combining/Summing rows of a numpy array based on a list of indices

Question

I have an array, and I want to add specific rows together to get an array with fewer rows.

import numpy as np
a = np.arange(50).reshape(10,5)
b = [0,0,0,1,1,2,2,2,2,3]
a
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49]])

a.shape
(10, 5)

len(b)
10

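I want to use b to decide which rows get combined. The first 3 rows are summed to become the new first row, the fourth and fifth rows are summed to become the new second row, and so on.

Desired result: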

array([[15, 18, 21, 24, 27],
       [35, 37, 39, 41, 43],
       [130, 134, 138, 142, 146],
       [45, 46, 47, 48, 49]])

Loops are too inefficient for my purposes. I'm not sure whether this can be done in numpy, but perhaps pandas or xarray?

Any help is appreciated.

Answer 1

In pandas, one solution is to create a DataFrame indexed by b and aggregate the sum over that index:

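import numpy as np
import pandas as pd

a = np.arange(50).reshape(10, 5)
b = [0,0,0,1,1,2,2,2,2,3]
print(a)

# sum(level=0) was removed in pandas 2.0; group on the index level instead
c = pd.DataFrame(a, index=b).groupby(level=0).sum().to_numpy()
print(c)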
[[ 15  18  21  24  27]
 [ 35  37  39  41  43]
 [130 134 138 142 146]
 [ 45  46  47  48  49]]

Answer 2

I'd like to add a numpy solution:

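import numpy as np

a = np.arange(50).reshape(10,5)
b = np.array([0,0,0,1,1,2,2,2,2,3])  # an array, so b == x is an elementwise mask

# sum the rows of a whose label equals x
sum_common = lambda x : a[b==x,:].sum(axis=0)

indx = np.unique(b)

# in Python 3, map() returns an iterator, so materialize it with list() first
c = np.array(list(map(sum_common, indx)))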

Of course, this can be done in one line:

c = np.array([a[np.asarray(b) == x].sum(axis=0) for x in np.unique(b)])

Result:

array([[ 15,  18,  21,  24,  27],
       [ 35,  37,  39,  41,  43],
       [130, 134, 138, 142, 146],
       [ 45,  46,  47,  48,  49]])
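
The approach above still loops in Python over the unique labels. As a rough sketch of a fully vectorized alternative (not part of the original answers, and with `labels`, `inverse`, `starts` and `c_sorted` being names chosen here for illustration), `np.add.at` can accumulate each row into the output row selected by its label. If b is known to be sorted, as it is in the question, `np.add.reduceat` should give the same result by summing contiguous blocks:

```python
import numpy as np

a = np.arange(50).reshape(10, 5)
b = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3])

# Map each label to a compact output row index (also handles labels
# that are unsorted or not contiguous integers).
labels, inverse = np.unique(b, return_inverse=True)

# Unbuffered scatter-add: every row of a is added to c[inverse[i]].
c = np.zeros((labels.size, a.shape[1]), dtype=a.dtype)
np.add.at(c, inverse, a)

# Assumption: b is sorted, so each group is one contiguous run of rows.
# starts holds the first row index of each run.
starts = np.flatnonzero(np.r_[True, b[1:] != b[:-1]])
c_sorted = np.add.reduceat(a, starts, axis=0)

print(c)
print(c_sorted)
```

Which of the two is faster probably depends on the array shape and the number of groups, so it is worth timing both on real data.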
