Calculate the percentile rank of a value in a multi-dimensional array along an axis

Question

I have a 3D array.

>>> import numpy as np
>>> M2 = np.arange(24).reshape((4, 3, 2))
>>> M2
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]]])

I want to compute the percentile rank of a given value along axis=0.

For example, if the value is 4, the expected output is:

[[0.25, 0.25],
 [0.25, 0.25],
 [0.25, 0.0]]

Here the 0.25 at [0][0] is the percentile rank of 4 within [0, 6, 12, 18], and likewise for the other positions.

If the value is 2.5, the expected output is:

[[0.25, 0.25],
 [0.25, 0.0],
 [0.0, 0.0]]

I was thinking of using scipy.stats.percentileofscore, but it does not seem to work on multi-dimensional arrays.
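To make the target concrete, here is a minimal sketch of the 1-D case for the single position [0][0], assuming SciPy is installed and the default `kind='rank'` is what is wanted. The variable names `column` and `rank` are only illustrative.

```python
import numpy as np
from scipy import stats

M2 = np.arange(24).reshape((4, 3, 2))

# The values that share position [0][0] along axis 0: 0, 6, 12, 18
column = M2[:, 0, 0]

# percentileofscore returns a percentage in [0, 100]; divide by 100 for a fraction
rank = stats.percentileofscore(column, 4) / 100
print(column, rank)
```

The question is then how to get this number for every position of the remaining two axes at once.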

---------------------------- Edit ----------------------------

Inspired by Evan's comment, I came up with a solution using scipy.stats.percentileofscore.

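import numpy as np
from scipy import stats

# Collect the percentile rank of 4 for every 1-D slice along axis 0
percentile_rank_lst = []
for p in range(M2.shape[1]):
    for k in range(M2.shape[2]):
        M2_ = M2[:, p, k]
        # percentileofscore returns a percentage, so divide by 100
        percentile_rank = stats.percentileofscore(M2_, 4) / 100
        percentile_rank_lst.append(percentile_rank)

# Restore the (3, 2) layout of the remaining axes
percentile_rank_nparr = np.array(percentile_rank_lst).reshape(M2.shape[1], M2.shape[2])
print(percentile_rank_nparr)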

The output is:

[[0.25 0.25]
 [0.25 0.25]
 [0.25 0.  ]]
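As a possible shorthand for the nested loop above, `np.apply_along_axis` can hand each 1-D slice along axis 0 to `percentileofscore`. This is only a sketch: it still calls the SciPy function once per slice in Python, so it is a convenience rather than a speed-up. The name `ranks` is illustrative.

```python
import numpy as np
from scipy import stats

M2 = np.arange(24).reshape((4, 3, 2))

# Each slice M2[:, p, k] is passed as the first argument;
# the trailing 4 is forwarded as the score to rank.
ranks = np.apply_along_axis(stats.percentileofscore, 0, M2, 4) / 100
print(ranks)
```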

Answer 1

I think this will do it:

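# Assumes M is sorted in increasing order along `axis`; argmax finds the first entry > val.
# Caveat: if val >= every entry, argmax returns 0, so the result is 0 rather than 1.
def get_percentile(val, M=M2, axis=0):
    return (M > val).argmax(axis) / M.shape[axis]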

get_percentile(4)
#array([[0.25, 0.25],
#       [0.25, 0.25],
#       [0.25, 0.  ]])

get_percentile(2.5)
#array([[0.25, 0.25],
#       [0.25, 0.  ],
#       [0.  , 0.  ]])
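If the data are not sorted along the axis, or the value may exceed every entry, a counting-based variant might be safer. The sketch below is not part of the answer above. The helper name `percentile_rank` is hypothetical, and the formula is my understanding of how the default `kind='rank'` of `scipy.stats.percentileofscore` averages the strict and weak counts, so treat that correspondence as an assumption and check it against SciPy for your data.

```python
import numpy as np

def percentile_rank(M, val, axis=0):
    """Fraction-valued percentile rank of val along axis (hypothetical helper)."""
    M = np.asarray(M)
    n = M.shape[axis]
    left = (M < val).sum(axis=axis)    # entries strictly below val
    right = (M <= val).sum(axis=axis)  # entries below or equal to val
    # Average of the two counts, plus one when val itself occurs in the slice
    return (left + right + (left < right)) / (2 * n)

M2 = np.arange(24).reshape((4, 3, 2))

r4 = percentile_rank(M2, 4)
r25 = percentile_rank(M2, 2.5)
r_big = percentile_rank(M2, 100)   # val above every entry

# Shuffling along axis 0 should not change the ranks, since only counts are used
shuffled = np.random.default_rng(0).permutation(M2)
r4_shuffled = percentile_rank(shuffled, 4)
print(r4, r25, r_big, sep="\n")
```

Because it relies only on comparisons and sums, this version needs no sorting and reduces along any axis with plain NumPy.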

python

numpy

scipy

pandas

python-xarray