Divide list into groups based on adjoining difference values
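I have the following problem with grouping the elements of a list. After converting a digital image, I separated the hole centers and collected them in the list values; then, after computing the differences between adjacent elements, I got diff_ar. Now I want to get the indices of the elements that belong to one group/cluster. I assume that the maximum difference between elements within one section should be less than 3. In addition, a group may only be created if it contains at least 7 elements. As a result, I expect a list of tuples containing the start index and end index of each detected group (2 in this case).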
[Image: illustration of the issue]
values = [73.0, 143.0, 323.0, 324.0, 325.0, 325.0, 325.0, 325.0, 325.5,
          325.5, 326.0, 326.0, 326.0, 326.0, 406.0, 406.5, 432.5, 433.0,
          433.5, 434.5, 435.0, 435.0, 436.0, 436.5, 437.5, 438.0]
diff_ar = [70.0, 180.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.0,
           0.0, 0.0, 80.0, 0.5, 26.0, 0.5, 0.5, 1.0, 0.5, 0.0, 1.0, 0.5,
           1.0, 0.5]
expected_output = [(2,12),(16,24)]
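For reference, diff_ar can be reproduced from values by subtracting each element from its successor. The snippet below is only a minimal sketch using plain Python lists. The question does not say how diff_ar was actually computed, so the list comprehension is an assumption.

```python
values = [73.0, 143.0, 323.0, 324.0, 325.0, 325.0, 325.0, 325.0, 325.5,
          325.5, 326.0, 326.0, 326.0, 326.0, 406.0, 406.5, 432.5, 433.0,
          433.5, 434.5, 435.0, 435.0, 436.0, 436.5, 437.5, 438.0]

# Assumed way of building diff_ar: difference between each pair of neighbours.
diff_ar = [b - a for a, b in zip(values, values[1:])]

print(len(values), len(diff_ar))  # 26 25
print(diff_ar[:4])                # [70.0, 180.0, 1.0, 1.0]
```

Because diff_ar is one element shorter than values, the indices in expected_output appear to refer to positions in diff_ar rather than in values.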
First solution: use more_itertools.split_when.
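import more_itertools

values = [73.0, 143.0, 323.0, 324.0, 325.0, 325.0, 325.0, 325.0, 325.5,
          325.5, 326.0, 326.0, 326.0, 326.0, 406.0, 406.5, 432.5, 433.0,
          433.5, 434.5, 435.0, 435.0, 436.0, 436.5, 437.5, 438.0]
threshold = 3  # start a new group when neighbours differ by this much or more
min_items = 7  # discard groups with fewer elements than this

# Pair each value with its index, split wherever the jump between neighbours
# reaches the threshold, and keep (first index, last index) of each long-enough group.
groups = [
    (g[0][0], g[-1][0])
    for g in more_itertools.split_when(
        enumerate(values), lambda x, y: abs(x[1] - y[1]) >= threshold
    )
    if len(g) >= min_items
]
print(groups)
# [(2, 13), (16, 25)]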
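If installing more_itertools is not an option, the same splitting behaviour can be approximated with the standard library. The generator below, split_when_sketch, is a hypothetical stand-in written for illustration and is not the library's implementation. It starts a new list whenever the predicate is true for a pair of consecutive items.

```python
def split_when_sketch(iterable, pred):
    """Yield lists, starting a new one whenever pred(previous, current) is true."""
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    group = [prev]
    for item in it:
        if pred(prev, item):
            yield group
            group = []
        group.append(item)
        prev = item
    yield group  # the last group is still open when the input ends


values = [73.0, 143.0, 323.0, 324.0, 325.0, 325.0, 325.0, 325.0, 325.5,
          325.5, 326.0, 326.0, 326.0, 326.0, 406.0, 406.5, 432.5, 433.0,
          433.5, 434.5, 435.0, 435.0, 436.0, 436.5, 437.5, 438.0]
threshold = 3
min_items = 7

groups = [
    (g[0][0], g[-1][0])
    for g in split_when_sketch(
        enumerate(values), lambda x, y: abs(x[1] - y[1]) >= threshold
    )
    if len(g) >= min_items
]
print(groups)  # [(2, 13), (16, 25)]
```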
Second solution: write the loop yourself.
def split_values(values, threshold, min_items):
    result = []
    if not values:
        return result  # nothing to group in an empty list
    prev = values[0]
    last_cut = 0  # index where the current group starts
    for i, x in enumerate(values[1:], start=1):
        if abs(x - prev) >= threshold:
            # Jump found: the current group ends at i-1; keep it if long enough.
            if i - last_cut >= min_items:
                result.append((last_cut, i - 1))
            last_cut = i
        prev = x
    # The final group is still open when the loop ends.
    if len(values) - last_cut >= min_items:
        result.append((last_cut, len(values) - 1))
    return result

values = [73.0, 143.0, 323.0, 324.0, 325.0, 325.0, 325.0, 325.0, 325.5,
          325.5, 326.0, 326.0, 326.0, 326.0, 406.0, 406.5, 432.5, 433.0,
          433.5, 434.5, 435.0, 435.0, 436.0, 436.5, 437.5, 438.0]
print(split_values(values, 3, 7))
# [(2, 13), (16, 25)]
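Both solutions return index ranges into values, whereas the expected output in the question, [(2,12),(16,24)], seems to be expressed as positions in diff_ar. A group covering values[start..end] corresponds to the differences diff_ar[start..end-1], so the conversion is a one-off shift of the end index. The helper name to_diff_indices below is hypothetical and is only a sketch of that mapping.

```python
def to_diff_indices(groups):
    """Map (start, end) ranges over values to ranges over the difference list."""
    # n consecutive values have n - 1 differences between them,
    # so only the end index moves back by one.
    return [(start, end - 1) for start, end in groups]


value_groups = [(2, 13), (16, 25)]    # output of either solution
print(to_diff_indices(value_groups))  # [(2, 12), (16, 24)]
```

Note that min_items here counts values, not differences. If the limit of 7 was meant to apply to entries of diff_ar, the equivalent setting would be min_items = 8. The question does not make this explicit.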