Combining similar elements of an Array

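I want to write a function that loops over an array of arrays and combines the third element of each inner array whenever the first two elements are the same, but the only approach I can think of has very high complexity. Is there a recommended algorithm? (Python preferred, but any pseudocode or algorithm is fine.)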

Example input -> Delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]

Expected output -> Delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0, 1'], [2, 2, '0, 1'] ]

Thank you for your time.

You can use itertools.groupby (together with operator.itemgetter) for this:

from itertools import groupby
from operator import itemgetter

delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]

result = [
    [*key, ", ".join(map(itemgetter(2), group))]
    for key, group in groupby(sorted(delta), key=itemgetter(0, 1))
]

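Note: sorted is only needed when the input is not already sorted, because groupby only groups consecutive items with equal keys. Your example is already sorted.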
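
To illustrate why the sort matters, here is a minimal sketch. The `shuffled` input and the `combine` helper are made up for this illustration. Since groupby only merges adjacent rows, an unsorted input produces the same key more than once:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input: the two [1, 2, ...] rows are not adjacent.
shuffled = [[1, 2, '0'], [0, 0, '1'], [1, 2, '1']]


def combine(rows):
    """Join the third elements of consecutive rows that share the first two."""
    return [
        [*key, ", ".join(map(itemgetter(2), group))]
        for key, group in groupby(rows, key=itemgetter(0, 1))
    ]


without_sort = combine(shuffled)
with_sort = combine(sorted(shuffled))
print(without_sort)  # [[1, 2, '0'], [0, 0, '1'], [1, 2, '1']]
print(with_sort)     # [[0, 0, '1'], [1, 2, '0, 1']]
```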

You can sort the list and then use a loop:

from typing import List, Union


def merge_lists(lists: List[List[Union[int, str]]]) -> List[List[Union[int, str]]]:
    """Merges lists based on first two elements."""
    if not lists:
        return lists
    sorted_lists = sorted(lists)
    # Copy each row before extending it so the caller's sublists are not mutated.
    result = [list(sorted_lists[0])]
    for sub_list in sorted_lists[1:]:
        curr_first, curr_second, key = sub_list
        prev_first, prev_second, *keys = result[-1]
        if curr_first == prev_first and curr_second == prev_second:
            # Same first two elements: collect the third, skipping duplicates.
            if key not in keys:
                result[-1].append(key)
        else:
            result.append(list(sub_list))
    return [[first, second, ', '.join(keys)] for first, second, *keys in result]


lists = [[0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1']]
print(f'{lists = }')
print(f'{merge_lists(lists) = }')

Output:

lists = [[0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1']]
merge_lists(lists) = [[0, 0, '1'], [0, 1, '1'], [1, 2, '0, 1'], [2, 2, '0, 1']]

If the first two elements are strings rather than numbers, use something like natsort to sort them.
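
As a rough stdlib-only sketch of what a natural sort does (this is not natsort's API; `natural_key` and the `rows` data are hypothetical names for this example), digit runs can be compared as integers. Plain sorting still puts equal keys next to each other, so the grouping itself works either way; the natural order only changes the order of the output rows.

```python
import re


def natural_key(text):
    """Split text into alternating non-digit / digit chunks; digits become ints."""
    # re.split with a capturing group puts the digit runs at odd indexes,
    # so ints are only ever compared with ints and strs with strs.
    return [int(chunk) if i % 2 else chunk
            for i, chunk in enumerate(re.split(r'(\d+)', text))]


# Hypothetical rows whose first two elements are strings.
rows = [['q10', 'q2', '1'], ['q2', 'q10', '0'], ['q2', 'q3', '1']]

plain = sorted(rows)
natural = sorted(
    rows,
    key=lambda row: (natural_key(row[0]), natural_key(row[1]), row[2]),
)
print(plain)    # [['q10', 'q2', '1'], ['q2', 'q10', '0'], ['q2', 'q3', '1']]
print(natural)  # [['q2', 'q3', '1'], ['q2', 'q10', '0'], ['q10', 'q2', '1']]
```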

Grouping sorted data can be done with itertools.groupby: see trincot's answer.

Unsorted data can be grouped with a dict of lists:

def combine_third_on_first_two(delta):
    d = {}
    for a,b,c in delta:
        d.setdefault((a,b), []).append(c)
    return [(a, b, ', '.join(l)) for (a,b),l in d.items()]

delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]
print(combine_third_on_first_two(delta))
# [(0, 0, '1'), (0, 1, '1'), (1, 2, '0, 1'), (2, 2, '0, 1')]
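
For comparison, here is a sketch of the same idea using collections.defaultdict, run on input that is deliberately out of order. The function name `combine_keep_order` and the `shuffled` data are assumptions made for this example. It returns lists rather than tuples, to match the shape in the question, and makes a single pass with no sorting:

```python
from collections import defaultdict


def combine_keep_order(delta):
    """Group third elements by (first, second) in one pass, without sorting."""
    groups = defaultdict(list)
    for a, b, c in delta:
        groups[(a, b)].append(c)
    # dicts preserve insertion order (Python 3.7+), so the keys come out
    # in order of first appearance rather than in sorted order.
    return [[a, b, ', '.join(values)] for (a, b), values in groups.items()]


# Hypothetical unsorted input.
shuffled = [[2, 2, '1'], [0, 0, '1'], [2, 2, '0'], [1, 2, '0']]
combined = combine_keep_order(shuffled)
print(combined)  # [[2, 2, '1, 0'], [0, 0, '1'], [1, 2, '0']]
```

Note that the joined values follow input order too ('1, 0' here); sort each group before joining if a fixed order is needed.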

This dict-based grouping is so standard that it is already implemented in the external module more_itertools, as more_itertools.map_reduce:

from more_itertools import map_reduce
from operator import itemgetter

delta = [ [0, 0, '1'], [0, 1, '1'], [1, 2, '0'], [1, 2, '1'], [2, 2, '0'], [2, 2, '1'] ]

d = map_reduce(delta,
               keyfunc=itemgetter(0, 1), valuefunc=itemgetter(2), reducefunc=', '.join)
result = [(a,b,s) for (a,b), s in d.items()]
print(result)
# [(0, 0, '1'), (0, 1, '1'), (1, 2, '0, 1'), (2, 2, '0, 1')]