How to compute moving (or rolling, if you will) percentile/quantile for a 1d array in numpy?

Question

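In pandas, we have pd.rolling_quantile(). In numpy, we have np.percentile(), but I'm not sure how to do a rolling/moving version of it.

To explain what I mean by moving/rolling percentile/quantile:

Given the array [1, 5, 7, 2, 4, 6, 9, 3, 8, 10], the moving 0.5 quantile (i.e. the moving 50th percentile) with window size 3 is: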

1
5 - 1 5 7 -> 0.5 quantile = 5
7 - 5 7 2 ->                5
2 - 7 2 4 ->                4
4 - 2 4 6 ->                4
6 - 4 6 9 ->                6
9 - 6 9 3 ->                6
3 - 9 3 8 ->                8
8 - 3 8 10 ->               8
10

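So [5, 5, 4, 4, 6, 6, 8, 8] is the answer. To make the resulting sequence the same length as the input, some implementations insert NaN or None, while pandas.rolling_quantile() allows the first two quantile values to be computed over a smaller window.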

Answer 1

We can create the sliding windows with np.lib.stride_tricks.as_strided, wrapped in a helper function strided_app -

In [14]: a = np.array([1, 5, 7, 2, 4, 6, 9, 3, 8, 10]) # input array

In [15]: W = 3 # window length

In [16]: np.percentile(strided_app(a, W,1), 50, axis=-1)
Out[16]: array([ 5.,  5.,  4.,  4.,  6.,  6.,  8.,  8.])
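The strided_app helper called above is not defined in this answer. A minimal sketch of what it might look like follows, assuming the second argument is the window length and the third is the step between window starts (the parameter names L and S are illustrative, not taken from the answer):

```python
import numpy as np

def strided_app(a, L, S):
    # Assumed signature: L = window length, S = step between window starts.
    a = np.ascontiguousarray(a)
    if L > a.size:
        raise ValueError("window length exceeds array size")
    nrows = (a.size - L) // S + 1   # number of complete windows
    n = a.strides[0]                # bytes between consecutive elements
    # Each row is a view into `a`, so no data is copied; the view is
    # read-only so that overlapping windows cannot be written to by accident.
    return np.lib.stride_tricks.as_strided(
        a, shape=(nrows, L), strides=(S * n, n), writeable=False
    )

a = np.array([1, 5, 7, 2, 4, 6, 9, 3, 8, 10])
windows = strided_app(a, 3, 1)
medians = np.percentile(windows, 50, axis=-1)
```

Because as_strided does no bounds checking, the shape and strides have to be computed carefully; the size check above is there to avoid building a view that reads outside the array.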

To make the result the same length as the input, we could pad it with NaNs using np.concatenate, or more easily with np.pad. So, for W=3, it would be -

In [39]: np.pad(_, 1, 'constant', constant_values=(np.nan)) #_ is previous one
Out[39]: array([ nan,   5.,   5.,   4.,   4.,   6.,   6.,   8.,   8.,  nan])
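The padding step can be generalised to other window lengths. The sketch below is one possible way to do it, assuming NumPy 1.20 or later (where numpy.lib.stride_tricks.sliding_window_view is available) and a centered window; the function name rolling_percentile is made up for this illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_percentile(a, W, q):
    # Hypothetical helper: centered rolling percentile, NaN-padded at the edges.
    a = np.asarray(a)
    windows = sliding_window_view(a, W)        # shape (len(a) - W + 1, W)
    core = np.percentile(windows, q, axis=-1)  # one value per full window
    left = (W - 1) // 2                        # NaNs before the first full window
    right = W // 2                             # NaNs after the last full window
    return np.pad(core, (left, right), mode="constant", constant_values=np.nan)

padded = rolling_percentile([1, 5, 7, 2, 4, 6, 9, 3, 8, 10], 3, 50)
```

For W=3 this pads one NaN on each side, matching the np.pad call above. For an even W the window cannot be exactly centered, so this sketch puts the extra NaN on the right.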

Answer 2

series = pd.Series([1, 5, 7, 2, 4, 6, 9, 3, 8, 10])

In [194]: series.rolling(window = 3, center = True).quantile(.5)

Out[194]: 
0      nan
1   5.0000
2   5.0000
3   4.0000
4   4.0000
5   6.0000
6   6.0000
7   8.0000
8   8.0000
9      nan
dtype: float64

center defaults to False, so you need to set it to True explicitly for the quantile-calculation window to surround the current index symmetrically.
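The question also mentions computing the edge values over a smaller window instead of returning NaN. In pandas, one way to get that behaviour is the min_periods argument of rolling; a short sketch, assuming a reasonably recent pandas version:

```python
import pandas as pd

series = pd.Series([1, 5, 7, 2, 4, 6, 9, 3, 8, 10])

# min_periods=1 lets the first and last positions use the two values
# that are actually available inside the centered window.
result = series.rolling(window=3, center=True, min_periods=1).quantile(0.5)
```

With this setting the first and last entries are the medians of [1, 5] and [8, 10] respectively, rather than NaN.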

