Accumulate rows of NumPy array based on the last column

Question

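I have the following problem. I have an array of coordinate arrays: the first three entries of each row are the x, y, z coordinates, and the fourth entry is the ID of the track. I want to add a drift to the tracks, starting after the first time point of each track. Is there an easy way to add the drift dynamically to tracks of different lengths, for the whole array at once? (As you can see, the track with ID 0 has only 3 coordinate entries, while the track with ID 3 has 6.)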

import numpy as np
drift=np.array([1,1,0])
a = np.array([[1,1,1,0],[1,1,1,0],[1,1,1,0],
              [1,1,1,2],[1,1,1,2],[1,1,1,3],
              [1,1,1,3],[1,1,1,3],[1,1,1,3],
              [1,1,1,3],[1,1,1,3]])

Expected output:

output = np.array([[1,1,1,0],[2,2,1,0],[3,3,1,0],
                   [1,1,1,2],[2,2,1,2],[1,1,1,3],
                   [2,2,1,3],[3,3,1,3],[4,4,1,3],
                   [5,5,1,3],[6,6,1,3]])

Answer 1

As far as I know there is no built-in way to do this, but you can solve it with this simple loop:

import numpy as np
drift=np.array([1,1,0])
a = np.array([[1,1,1,0],[1,1,1,0],[1,1,1,0],
              [1,1,1,2],[1,1,1,2],[1,1,1,3],
              [1,1,1,3],[1,1,1,3],[1,1,1,3],
              [1,1,1,3],[1,1,1,3]])

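_id = 0  # ID of the track currently being processed
n = 0    # number of drift steps to apply to the next row of that track
for i in range(a.shape[0]):
    if a[i, 3] == _id:
        # same track: shift the coordinates by n drift steps
        a[i, 0:3] = a[i, 0:3] + n * drift
        n += 1
    else:
        # new track: its first row stays unshifted, the next row gets 1 step
        _id = a[i, 3]
        n = 1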

print(a)
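If you would rather not modify `a` in place, the same loop can be wrapped in a function. This is only a sketch: the name `add_drift_loop` is made up for illustration, and it assumes that all rows of a track are stored contiguously and that the ID is in column 3.

```python
import numpy as np


def add_drift_loop(tracks, drift):
    """Return a copy of `tracks` with the drift accumulated within each track."""
    out = tracks.copy()  # leave the caller's array untouched
    step = 0             # position of the current row within its track
    for i in range(out.shape[0]):
        # a change in the ID column marks the first row of a new track
        if i == 0 or out[i, 3] != out[i - 1, 3]:
            step = 0
        out[i, :3] += step * drift
        step += 1
    return out


drift = np.array([1, 1, 0])
a = np.array([[1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 0],
              [1, 1, 1, 2], [1, 1, 1, 2], [1, 1, 1, 3],
              [1, 1, 1, 3], [1, 1, 1, 3], [1, 1, 1, 3],
              [1, 1, 1, 3], [1, 1, 1, 3]])
result = add_drift_loop(a, drift)
print(result)
```

Comparing each row with the previous one, instead of tracking the current ID in a separate variable, avoids having to pick an initial value for that variable.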

Answer 2

Here is an example of how it can be done in a vectorized way:

import numpy as np


drift = np.array([1, 1, 0])
a = np.array([[1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 2], 
              [1, 1, 1, 2], [1, 1, 1, 3], [1, 1, 1, 3], [1, 1, 1, 3], 
              [1, 1, 1, 3], [1, 1, 1, 3], [1, 1, 1, 3]])


def multirange(counts: np.ndarray) -> np.ndarray:
    """
    Calculates concatenated ranges. Code was taken at:
    
    """
    counts = counts[counts != 0]
    counts1 = counts[:-1]
    reset_index = np.cumsum(counts1)
    incr = np.ones(counts.sum(), dtype=int)
    incr[0] = 0
    incr[reset_index] = 1 - counts1
    incr.cumsum(out=incr)
    return incr


def drifts(ids: np.ndarray,
           drift: np.ndarray) -> np.ndarray:
    diffs = np.diff(ids)
    max_drifts_per_id = np.concatenate((np.where(diffs)[0], [len(ids) - 1])) + 1
    max_drifts_per_id[1:] = max_drifts_per_id[1:] - max_drifts_per_id[:-1]
    multipliers = multirange(max_drifts_per_id)
    drifts = np.tile(drift, (len(ids), 1))
    return drifts * multipliers[:, np.newaxis]


a[:, :-1] += drifts(a[:, -1], drift)
print(a)

Output:

[[1 1 1 0]
 [2 2 1 0]
 [3 3 1 0]
 [1 1 1 2]
 [2 2 1 2]
 [1 1 1 3]
 [2 2 1 3]
 [3 3 1 3]
 [4 4 1 3]
 [5 5 1 3]
 [6 6 1 3]]

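Explanation:

The idea of the drifts function is to take an array of IDs (in our case a[:, -1], which is array([0, 0, 0, 2, 2, 3, 3, 3, 3, 3, 3])) and drift (np.array([1, 1, 0])) and produce the following array, which can then be added to the coordinate columns of the original array: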

array([[0, 0, 0],
       [1, 1, 0],
       [2, 2, 0],
       [0, 0, 0],
       [1, 1, 0],
       [0, 0, 0],
       [1, 1, 0],
       [2, 2, 0],
       [3, 3, 0],
       [4, 4, 0],
       [5, 5, 0]])

Line by line:

diffs = np.diff(ids)

Here we get an array in which every non-zero element sits at the index of the last row of a track in the ID array:

array([0, 0, 2, 0, 1, 0, 0, 0, 0, 0])

See np.diff for details.
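One consequence of detecting tracks through np.diff is worth spelling out: it finds runs of equal consecutive IDs, so it implicitly assumes that all rows of a track are stored next to each other. A small sketch (the `interleaved` array is a made-up input, not from the question):

```python
import numpy as np

ids = np.array([0, 0, 0, 2, 2, 3, 3, 3, 3, 3, 3])
diffs = np.diff(ids)  # diffs[i] == ids[i + 1] - ids[i]
print(diffs)          # [0 0 2 0 1 0 0 0 0 0]

# Hypothetical input in which ID 0 shows up again after ID 2
interleaved = np.array([0, 0, 2, 0])
interleaved_diffs = np.diff(interleaved)
# Two non-zero entries, so this is treated as three separate runs,
# even though there are only two distinct IDs.
n_runs = np.count_nonzero(interleaved_diffs) + 1
print(n_runs)         # 3
```

If your data can be interleaved like this, sorting by ID first (with a stable sort, to keep the time order within a track) would probably be needed.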

max_drifts_per_id = np.concatenate((np.where(diffs)[0], [len(ids) - 1])) + 1

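np.where(diffs)[0] gives the indices of the non-zero elements of the previous array. We append the index of the last element and increment the resulting indices by 1, so that we can get the ranges later. See np.where for details. After the concatenation and the increment, max_drifts_per_id will be: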

array([ 3,  5, 11])

max_drifts_per_id[1:] = max_drifts_per_id[1:] - max_drifts_per_id[:-1]

Here, from the previous result, we get the array of stop values of the ranges, i.e. the number of rows in each track:

array([3, 2, 6])
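The two steps above can also be written without the in-place slice assignment. A minimal sketch, assuming NumPy 1.16 or newer (where np.diff accepts the prepend argument); the names `last_rows`, `ends` and `lengths` are only for illustration:

```python
import numpy as np

ids = np.array([0, 0, 0, 2, 2, 3, 3, 3, 3, 3, 3])

last_rows = np.where(np.diff(ids))[0]                   # [2, 4]
ends = np.concatenate((last_rows, [len(ids) - 1])) + 1  # [3, 5, 11]
lengths = np.diff(ends, prepend=0)                      # [3, 2, 6]
print(lengths)
```

Prepending 0 makes the first difference equal to the first end position, which is why no special handling of the first track is needed.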

multipliers = multirange(max_drifts_per_id)

We use multirange as an efficient alternative to concatenating np.arange calls. See "How to efficiently concatenate many arange calls in numpy?" for details. The resulting multipliers will be:

array([0, 1, 2, 0, 1, 0, 1, 2, 3, 4, 5])
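To make it clearer what multirange computes, here is the straightforward version it replaces, next to another loop-free way of getting the same result. Both are sketches for comparison, not a replacement for the function above:

```python
import numpy as np

counts = np.array([3, 2, 6])

# Straightforward version: one np.arange per track, then concatenate
naive = np.concatenate([np.arange(c) for c in counts])

# Loop-free version: global row index minus the start index of each track
starts = np.cumsum(counts) - counts          # [0, 3, 5]
vectorized = np.arange(counts.sum()) - np.repeat(starts, counts)

print(naive)       # [0 1 2 0 1 0 1 2 3 4 5]
print(vectorized)  # [0 1 2 0 1 0 1 2 3 4 5]
```

The naive version is fine for a handful of tracks; the Python-level loop only starts to matter when there are very many of them.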

drifts = np.tile(drift, (len(ids), 1))

With np.tile we expand drift so that it has the same number of rows as ids:

array([[1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0],
       [1, 1, 0]])

return drifts * multipliers[:, np.newaxis]

We multiply it by multipliers to get:

array([[0, 0, 0],
       [1, 1, 0],
       [2, 2, 0],
       [0, 0, 0],
       [1, 1, 0],
       [0, 0, 0],
       [1, 1, 0],
       [2, 2, 0],
       [3, 3, 0],
       [4, 4, 0],
       [5, 5, 0]])
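As a side note, the np.tile step is not strictly required: broadcasting a column of multipliers against the 1-D drift vector should give the same array. A small sketch to check this on the example values:

```python
import numpy as np

drift = np.array([1, 1, 0])
multipliers = np.array([0, 1, 2, 0, 1, 0, 1, 2, 3, 4, 5])

# Version from the answer: repeat drift for every row, then scale each row
tiled = np.tile(drift, (len(multipliers), 1)) * multipliers[:, np.newaxis]

# Broadcasting version: shapes (11, 1) and (3,) broadcast to (11, 3)
broadcast = multipliers[:, np.newaxis] * drift

print(np.array_equal(tiled, broadcast))  # True
```

The tiled version makes the intermediate array explicit, which is handy for an explanation; the broadcasting version simply skips building it.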

Finally, this return value can be added to the coordinate columns of the original array:

a[:, :-1] += drifts(a[:, -1], drift)
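For comparison, here is a more compact variant of the same idea, checked against the expected output from the question. It is a sketch that swaps multirange for np.maximum.accumulate; the function name `add_drift` is made up, and it again assumes that the rows of each track are contiguous and that the ID is in the last column.

```python
import numpy as np


def add_drift(tracks, drift):
    """Return a copy of `tracks` with the drift accumulated within each track."""
    ids = tracks[:, -1]
    rows = np.arange(len(ids))

    # True on the first row of every run of equal IDs
    is_start = np.ones(len(ids), dtype=bool)
    is_start[1:] = ids[1:] != ids[:-1]

    # Index of the first row of the run that each row belongs to
    run_start = np.maximum.accumulate(np.where(is_start, rows, 0))
    step = rows - run_start  # 0, 1, 2, ... within each track

    out = tracks.copy()
    out[:, :-1] += step[:, np.newaxis] * drift
    return out


drift = np.array([1, 1, 0])
a = np.array([[1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 0],
              [1, 1, 1, 2], [1, 1, 1, 2], [1, 1, 1, 3],
              [1, 1, 1, 3], [1, 1, 1, 3], [1, 1, 1, 3],
              [1, 1, 1, 3], [1, 1, 1, 3]])
expected = np.array([[1, 1, 1, 0], [2, 2, 1, 0], [3, 3, 1, 0],
                     [1, 1, 1, 2], [2, 2, 1, 2], [1, 1, 1, 3],
                     [2, 2, 1, 3], [3, 3, 1, 3], [4, 4, 1, 3],
                     [5, 5, 1, 3], [6, 6, 1, 3]])
shifted = add_drift(a, drift)
print(np.array_equal(shifted, expected))  # True
```

Because row indices only grow, the running maximum of the run-start indices always equals the start of the current run, which is what makes this trick work.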
