Pythonic way to group sections of a list into multiple lists based on value

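Below is a plot I made, where the y-axis (v) shows the values contained in the list. As you can see, the list values alternate between segments of high values and segments of low values, so the list looks like this: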

li = [0.5,0.49,0.5,..,0.5,0.001,0.001,...,0.001,0.49,0.5,...,0.5,]

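My goal is to take each of the six high-value segments and each of the six low-value segments separately, and then compute the mean of each segment. To do that, I am trying to split the list above, put each segment into its own list, and put each of those lists into the corresponding high-value or low-value list. Roughly like this: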

high_segments = [[high_values1],[high_values2],[high_values3]]
low_segments  = [[low_values1],[low_values2],[low_values3]]

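I have been trying to build a for loop to do this, but I am struggling with how to handle the transitions between the low-value and high-value groups. Any suggestions are much appreciated.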

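I assume your input array li alternates between runs of 6 high values and runs of 6 low values, for 36 elements in total. First, numpy.reshape splits the array into consecutive 6-element subarrays. Then slicing selects the odd-numbered subarrays (1st, 3rd, 5th: the high values in the plot) and the even-numbered subarrays (2nd, 4th, 6th: the low values). Stacking the two arrays along the second axis gives the desired shape. Finally, block_reduce computes the mean over each block; with block_size=(1,2,6), each block covers one high segment together with the low segment that follows it.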

import numpy as np
# conda install -c anaconda scikit-image
from skimage.measure import block_reduce

if __name__ == '__main__':
    li = np.arange(0, 36)
    li = li.reshape(-1, 6)
    high_values = li[::2]
    low_values = li[1::2]
    combined = np.stack((high_values, low_values), axis=1)
    segment_average = block_reduce(combined, block_size=(1,2,6), func=np.mean, cval=np.mean(combined)).flatten()
    print(f"[main] input:\n{li}")
    print(f"[main] high_values:\n{high_values}")
    print(f"[main] low_values:\n{low_values}")
    print(f"[main] combined:\n{combined}")
    print(f"[main] segment average: {segment_average}")

Result:

[main] input:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]
[main] high_values:
[[ 0  1  2  3  4  5]
 [12 13 14 15 16 17]
 [24 25 26 27 28 29]]
[main] low_values:
[[ 6  7  8  9 10 11]
 [18 19 20 21 22 23]
 [30 31 32 33 34 35]]
[main] combined:
[[[ 0  1  2  3  4  5]
  [ 6  7  8  9 10 11]]

 [[12 13 14 15 16 17]
  [18 19 20 21 22 23]]

 [[24 25 26 27 28 29]
  [30 31 32 33 34 35]]]
[main] segment average: [ 5.5 17.5 29.5]
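
Note that the averages above each combine a high segment with the low segment that follows it. If one mean per individual segment is wanted instead, a minimal sketch under the same assumption (every segment is exactly 6 values long) could take the row-wise mean of the reshaped array and then slice. The names segment_means, high_means, and low_means are illustrative and not part of the answer above.

```python
import numpy as np

# Same toy input as above: 36 values, assumed to form runs of exactly 6.
li = np.arange(0, 36).reshape(-1, 6)

# One mean per 6-element segment (one per row).
segment_means = li.mean(axis=1)

# Rows 0, 2, 4 play the role of the "high" segments, rows 1, 3, 5 the "low" ones.
high_means = segment_means[::2]
low_means = segment_means[1::2]

print(high_means.tolist())  # [2.5, 14.5, 26.5]
print(low_means.tolist())   # [8.5, 20.5, 32.5]
```

This avoids the scikit-image dependency, but it still breaks down if the segments do not all have the same length.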

Use numpy and split by the mean.

import numpy as np

li = np.array([
    0.5, 0.49, 0.5,
    0.001, 0.001, 0.001,
    0.49, 0.5, 0.5,
    0, 0.002, 0.01,
])

# Split into high/low groups using the mean:
is_high = li >= li.mean()
is_low = li < li.mean()

# Determine the groups:
# np.int was removed in NumPy 1.24; the built-in int works on all versions.
diff = np.insert(np.diff(is_high), 0, False).astype(int)  # array([0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])
groups = diff.cumsum()  # array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])

high_segments = np.array([li[groups==kk] for kk in np.unique(groups[is_high])])
low_segments = np.array([li[groups==kk] for kk in np.unique(groups[is_low])])

high_segments_mean = high_segments.mean(axis=1)
low_segments_mean = low_segments.mean(axis=1)
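
The code above packs the segments into a rectangular array, which works only when every segment has the same length. As a hedged sketch for segments of unequal length, the same mean-threshold idea can be combined with np.split, keeping the segments as a list of arrays. The sample data and the names boundaries, starts, and flags are made up for illustration.

```python
import numpy as np

# Illustrative data with segments of lengths 4, 2, 3 and 4 (an assumption).
li = np.array([
    0.5, 0.49, 0.5, 0.5,
    0.001, 0.001,
    0.49, 0.5, 0.5,
    0.0, 0.002, 0.01, 0.003,
])

# Classify each value as high or low relative to the overall mean.
is_high = li >= li.mean()

# Positions where the high/low flag flips; each one starts a new segment.
boundaries = np.flatnonzero(np.diff(is_high)) + 1
segments = np.split(li, boundaries)

# The flag of the first element of each segment tells us its kind.
starts = np.concatenate(([0], boundaries))
flags = is_high[starts]

high_segments = [seg for seg, flag in zip(segments, flags) if flag]
low_segments = [seg for seg, flag in zip(segments, flags) if not flag]

high_segments_mean = [seg.mean() for seg in high_segments]
low_segments_mean = [seg.mean() for seg in low_segments]

print(high_segments_mean)
print(low_segments_mean)
```

Thresholding on the overall mean assumes the two levels are well separated; noisy data close to the threshold would produce spurious short segments.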