Python: find the indices of subarrays that are contiguous and increasing with a difference of 1

The input is always strictly increasing. I can write this with a for loop and a lot of if-else conditions, but is there a simpler way? Here are some examples:

input: [2,3,4,6,8,9]
output: [[0,1,2],[4,5]]

input: [1,2,3,4]
output: [[0,1,2,3]]

input: [0,1,2,4,6,7,8,10,12,13,14]
output: [[0,1,2], [4,5,6], [8,9,10]]

For example, here is the code I wrote:

def get_mergable_indices(sent_order):
    mergable_indices = []
    continuous_sents = [0]
    for index, value in enumerate(sent_order):
        if index == 0:
            continue
        else:
            if value - sent_order[index-1] == 1:
                continuous_sents.append(index)
            else:
                if len(continuous_sents)>1:
                    mergable_indices.append(continuous_sents)
                continuous_sents = [index]
    if len(continuous_sents)>1:
        mergable_indices.append(continuous_sents)
    return mergable_indices

It is too long and I would like to shorten it.

This can be done easily without using any modules.

def get_mergable_indices(sent_order):
    lst = sent_order
    out = []

    l = []
    for a in range(len(lst) - 1):  # visit every adjacent pair (a, a+1)
        if lst[a] + 1 == lst[a + 1]:  # check current number plus 1 is equal to next number
            l.append(a)
            l.append(a + 1)
        elif l:  # run is broken: store its indices and reset l to an empty list
            out.append(sorted(set(l)))  # sorted, because set order is not guaranteed
            l = []
    if l:  # store the last run only if there is one, so no empty list is appended
        out.append(sorted(set(l)))
    return out

Output

input: [2,3,4,6,8,9]
output: [[0,1,2],[4,5]]

input: [1,2,3,4]
output: [[0,1,2,3]]

input: [0,1,2,4,6,7,8,10,12,13,14]
output: [[0,1,2], [4,5,6], [8,9,10]]

This accepts any iterable (itertools.pairwise requires Python 3.10 or later):

from itertools import pairwise

def get_mergable_indices(sent_order):
    result = []
    curr = []
    for idx, (i, j) in enumerate(pairwise(sent_order)):
        if j - i == 1:
            curr.append(idx)
        elif curr:
            curr.append(idx)
            result.append(curr)
            curr = []

    if curr:
        curr.append(idx + 1)
        result.append(curr)

    return result

Output:

>>> get_mergable_indices([2, 3, 4, 6, 8, 9])
[[0, 1, 2], [4, 5]]
>>> get_mergable_indices(range(1, 5))
[[0, 1, 2, 3]]
>>> get_mergable_indices([0, 1, 2, 4, 6, 7, 8, 10, 12, 13, 14])
[[0, 1, 2], [4, 5, 6], [8, 9, 10]]
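
On Python versions older than 3.10, itertools.pairwise is not available. A minimal sketch of the same approach, assuming the well-known tee-based recipe as a stand-in (pairwise_compat and get_mergable_indices_compat are hypothetical names, not part of the answer above):

```python
from itertools import tee


def pairwise_compat(iterable):
    # Hypothetical helper: yields (s0, s1), (s1, s2), ... like itertools.pairwise.
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)


def get_mergable_indices_compat(sent_order):
    result = []
    curr = []
    last = -1  # index of the last pair seen; stays -1 for inputs with < 2 items
    for idx, (i, j) in enumerate(pairwise_compat(sent_order)):
        last = idx
        if j - i == 1:
            curr.append(idx)
        elif curr:
            # The run ends at idx: close it and start over.
            curr.append(idx)
            result.append(curr)
            curr = []
    if curr:
        # The input ended inside a run; add the index of the final element.
        curr.append(last + 1)
        result.append(curr)
    return result


print(get_mergable_indices_compat(iter([2, 3, 4, 6, 8, 9])))
```

Because tee buffers only the one-element lag between the two iterators, this should also work for generators and other one-shot iterables.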

Here is my approach:

def check_continuous(inp_list):
    idx = idy = 0
    res = [[]]
    while idx < len(inp_list) - 1:
        # Avoid appending repeated indices
        if inp_list[idx + 1] - inp_list[idx] == 1:  # the next element is exactly 1 higher
            if idx not in res[idy]:
                res[idy].append(idx)
            if idx + 1 not in res[idy]:
                res[idy].append(idx + 1)
        else:
            # Start a new group only if the current one is non-empty
            if res[idy]:
                res.append([])
                idy += 1
        idx += 1
    # Drop the empty placeholder left when the input has no run or does not end in one
    return [group for group in res if group]

print(check_continuous([2,3,4,6,8,9]))
# [[0, 1, 2], [4, 5]]
print(check_continuous([1,2,3,4]))
# [[0, 1, 2, 3]]
print(check_continuous([0,1,2,4,6,7,8,10,12,13,14]))
# [[0, 1, 2], [4, 5, 6], [8, 9, 10]]

I think this can be improved a lot.

Maybe you can try this:

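def get_mergable_indices(sent_order):
    # value - index is constant within a run of consecutive values
    lst, res = [j-i for i, j in enumerate(sent_order)], []
    ci = 0  # index at which the current group starts
    for i in sorted(set(lst)):  # sorted, because set iteration order is not guaranteed
        lci = lst.count(i)  # length of the current group
        if lci > 1:
            res.append(list(range(ci, lci + ci)))
        ci += lci
    return res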

Output:

>>> get_mergable_indices([2,3,4,6,8,9])
[[0, 1, 2], [4, 5]]
>>> get_mergable_indices([1,2,3,4])
[[0, 1, 2, 3]]
>>> get_mergable_indices([0,1,2,4,6,7,8,10,12,13,14])
[[0, 1, 2], [4, 5, 6], [8, 9, 10]]
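
The answer above works because value minus index stays constant inside a run of consecutive values. The same observation can be sketched in a single pass with itertools.groupby, which groups adjacent items sharing a key. This is only an illustration, and get_mergable_indices_groupby is a hypothetical name:

```python
from itertools import groupby


def get_mergable_indices_groupby(sent_order):
    runs = []
    # Adjacent elements share the key (value - index) exactly when each
    # value is 1 higher than the one before it.
    keyed = groupby(enumerate(sent_order), key=lambda pair: pair[1] - pair[0])
    for _, group in keyed:
        indices = [index for index, _value in group]
        if len(indices) > 1:  # keep only runs of at least two elements
            runs.append(indices)
    return runs


print(get_mergable_indices_groupby([0, 1, 2, 4, 6, 7, 8, 10, 12, 13, 14]))
```

Since groupby only merges adjacent items, this avoids the repeated lst.count calls and does not depend on the order of a set.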

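As I mentioned in the comments, np.diff may be a good choice here. The accepted answer again uses two loops, just written more compactly, and is not very different from the other answers. This problem can be solved with NumPy alone: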

import numpy as np

a = np.array([0, 1, 2, 4, 6, 7, 8, 10, 12, 13, 14])

diff = np.diff(a, prepend=a[0]-2)                                      # [2 1 1 2 2 1 1 2 2 1 1]
diff_w = np.where(diff == 1)[0]                                        # [ 1  2  5  6  9 10]

mask_ = np.diff(diff_w, prepend=diff_w[0]-2)                           # [2 1 3 1 3 1]
mask_ = mask_ != 1                                                     # [ True False  True False  True False]

con_values = np.insert(diff_w, np.where(mask_)[0], diff_w[mask_] - 1)  # [ 0  1  2  4  5  6  8  9 10]

# result = np.split(con_values, np.where(np.diff(con_values, prepend=con_values[0] - 1) != 1)[0])
result = np.split(con_values, np.where(np.diff(con_values, prepend=con_values[0] - 2) != 1)[0])[1:]
# [array([0, 1, 2], dtype=int64), array([4, 5, 6], dtype=int64), array([ 8,  9, 10], dtype=int64)]

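I have tested this code on your examples and on some others, and it works. One case to watch: if the input contains no consecutive pair at all, diff_w is empty and diff_w[0] raises an IndexError. If another example causes a problem, a small change based on this code should fix it. I wrote the code as separate steps to make it easier to follow. If it matters to you, you can combine them into a single line.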
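
For comparison, the np.diff idea can also be wrapped in a function that returns plain index lists. This is a minimal sketch rather than the code from the answer: it splits the index range wherever the step between neighbours is not 1 and then discards single-element pieces. The name consecutive_index_runs is hypothetical.

```python
import numpy as np


def consecutive_index_runs(values):
    a = np.asarray(values)
    if a.size == 0:
        return []
    # Positions where a new piece starts: one past every step that is not 1.
    breaks = np.where(np.diff(a) != 1)[0] + 1
    # Split the indices 0..n-1 at those positions.
    pieces = np.split(np.arange(a.size), breaks)
    # Keep only pieces covering at least two elements.
    return [piece.tolist() for piece in pieces if piece.size > 1]


print(consecutive_index_runs([0, 1, 2, 4, 6, 7, 8, 10, 12, 13, 14]))
```

Splitting the indices rather than the values means an input with no consecutive pairs simply yields an empty list.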