一个函数的许多递归调用之一找到了正确的结果,但它不能 "tell" 其他。有比这个丑陋的解决方法更好的解决方法吗?

One of many recursive calls of a function found the correct result, but it can't "tell" the others. Is there a better fix than this ugly workaround?

最近,我正在尝试编写一个函数来在任意深度嵌套序列中的任何位置查找原始值,以及 return 到达那里所采用的路径(作为每个连续嵌套序列中的索引列表, 为了)。我遇到了一个非常意想不到的障碍:函数正在寻找结果,但没有 returning 它!该函数保留了 returning 输出,而不是正确的输出,只有在尝试在序列中找到 而不是 的项目时才应该产生输出。

通过在函数的不同点放置 print 语句,我发现问题是在实际找到该项目的递归调用 returned 之后,其他人没有找到该项目也在 returning,而且显然比发现它的人来得晚。这意味着最终结果将从 'success' 值重置为 'fail' 值,除非 'success' 值是最后遇到的。

我尝试通过在成功案例的早期在 return 函数中放置一个额外的条件来解决这个问题,试图抢占导致错误最终结果的额外的、不必要的递归调用。现在,这个 是我 运行 问题的根本原因所在:

事先无法知道哪个递归调用(如果有的话)会找到该项目,一旦其中一个找到它,它就无法'communicating' 和其他人!

我想出的避免这个更深层次问题的唯一方法是将函数完全重构为 'set' 自身外部的变量,当且仅当 'success' 输出时 'success' =] 遇到条件。外部全局变量开始设置为 'failed to find item in sequence' 值,除 'success' 情况外不会重置。所有其他递归调用只是 return 而不做任何事情。这看起来非常丑陋且效率低下,但它确实有效。

第一次尝试

# ITERATIVE/RECURSIVE SEQUENCE TRAVERSER (First Attempt)
# Works on 'result1' but not on 'result2'

# Searches for 'item' in sequence (list or tuple) S, and returns a tuple
# containing the indices (in order of increasing depth) at which the item
# can be found, plus the depth in S at which 'item' was found.

# If the item is *not* found, returns a tuple containing an empty list and -1

def traverse(S, item, indices=[], atDepth=0):
    # If the sequence is empty, return the 'item not found' result
    if not S:
        return ([], -1)
    
    else:
        # For each element in the sequence (breadth-first)
        for i in range(len(S)):
            # Success condition base case:  found the item!
            if S[i] == item:
                return (indices + [i], atDepth)
            
            # Recursive step (depth-first):  enter nested sequence
            # and repeat procedure from beginning
            elif type(S[i]) in (list, tuple):
                return traverse(S[i], item, indices + [i], atDepth + 1) 
            
        # Fail condition base case:  searched the entire length
        # and depth of the sequence and didn't find the item, so
        # return the 'item not found' result
        else:
            print("We looked everywhere but didn't find " + str(item) + " in " + str(S) + ".")
            return ([], -1)


L = [0, 1, 2, [3, (4, 5, [6, 6.25, 6.5, 6.75, 7])], [[8, ()]], (([9], ), 10)]

result1 = traverse(L, 7)
result2 = traverse(L, 9)

print("-------------------------------------------")
print(result1)
print("-------------------------------------------")
print(result2)

第二次尝试

# ITERATIVE/RECURSIVE SEQUENCE TRAVERSER (Second Attempt)
# Does not work on either test case

# Searches for 'item' in sequence (list or tuple) S, and returns a tuple
# containing the indices (in order of increasing depth) at which the item
# can be found, plus the depth in S at which 'item' was found.

# If the item is *not* found, returns a tuple containing an empty list and -1

def traverse(S, item, indices=[], atDepth=0, returnValue=None):
    # If the sequence is empty, return the 'item not found' result
    if not S:
        print("Sequence S is empty.")
        return ([], -1)
    
    # ---  ATTEMPTED FIX:
    # If the item is found before the end of S is reached,
    # do not perform additional searches.  In addition to being
    # inefficient, doing extra steps would cause incorrect false
    # negatives for the item being in S.
    # ---  DOES NOT WORK:  the underlying issue is that the multiple recursive
    # calls generated at the same time can't communicate with each other,
    # so the others don't 'know' if one of them already found the item.
    elif returnValue:
        print("Inside 'elif' statement!")
        return returnValue
    
    else:
        # For each element in the sequence (breadth-first)
        for i in range(len(S)):
            # Success condition base case:  found the item!
            if S[i] == item:
                # Return the depth and index at that depth of the item
                print("---  Found item " + str(item) + " at index path " + str(indices) + " in current sequence")
                returnValue2 = (indices + [i], atDepth)
                print("---  Item " + str(item) + " is at index path " + str(returnValue2) + " in S, SHOULD RETURN")
                #return returnValue2  # THIS DIDN'T FIX THE PROBLEM
                #break                # NEITHER DID THIS
            
            # Recursive step (depth-first):  enter nested sequence
            # and repeat procedure from beginning
            elif type(S[i]) in (list, tuple):
                # CANNOT USE 'return' BEFORE RECURSIVE CALL, as it would cause any items
                # in the outer sequence which come after the first occurrence of a nested
                # sequence to be missed (i.e. the item could exist in S, but if it is
                # after the first nested sequence, it won't be found)
                traverse(S[i], item, indices + [i], atDepth + 1, returnValue)  # CAN'T USE 'returnValue2' HERE (out of scope);
                                                                               # so parameter can't be updated in 'if' condition
            
        # Fail condition base case:  searched the entire length
        # and depth of the sequence and didn't find the item, so
        # return the 'item not found' result
        else:
            print("We looked everywhere but didn't find " + str(item) + " in " + str(S) + ".")
            return ([], -1)


L = [0, 1, 2, [3, (4, 5, [6, 6.25, 6.5, 6.75, 7])], [[8, ()]], (([9], ), 10)]

result1 = traverse(L, 7)
result2 = traverse(L, 9)

print("-------------------------------------------")
print(result1)
print("-------------------------------------------")
print(result2)

第三次也是最后一次尝试 -- 可行,但不理想!

# ITERATIVE/RECURSIVE SEQUENCE TRAVERSER (Third Attempt)
# This 'kludge' is ** HIDEOUSLY UGLY **, but it works!

# Searches for 'item' in sequence (list or tuple) S, and generates a tuple
# containing the indices (in order of increasing depth) at which the item
# can be found, plus the depth in S at which 'item' was found.

# If the item is *not* found, returns nothing (implicitly None)
# The results of calling the function are obtained via external global variables.

# This 3rd version of 'traverse' is thus actually a void function,
# and relies on altering the global state instead of producing an output.


# -----  WORKAROUND:  If the result is found, have the recursive call that found it
# send it to global scope and use this global variable as the final result of calling
# the 'traverse' function.

# Initialize the global variables to the "didn't find the item" result,
# so the result will still be correct if the item actually isn't in the sequence.
globalVars = {'result1': ([], -1), 'result2': ([], -1)}


def traverse(S, item, send_output_to_var, indices=[], atDepth=0):
    # If the sequence is empty, return *without* doing anything to the global variable.
    # It is already initialized to the "didn't find item" result.
    if not S:
        return
    
    else:
        # For each element in the sequence (breadth-first)
        for i in range(len(S)):
            # Success condition base case:  found the item!
            if S[i] == item:
                # Set the global variable to the index path of 'item' in 'S'.
                globalVars[send_output_to_var] = (indices + [i], atDepth)
                # No need to keep on doing unnecessary work!
                return
            
            # Recursive step (depth-first):  enter nested sequence
            # and repeat procedure from beginning
            elif type(S[i]) in (list, tuple):
                # Don't use 'return' before the recursive call, or it will miss items
                # in the outer sequence after a nested sequence is encountered.
                traverse(S[i], item, send_output_to_var, indices + [i], atDepth + 1) 
            
        # Fail condition base case:  searched the entire length
        # and depth of the sequence and didn't find the item.
        else:
            # Return *without* setting the global variable, as it is
            # already initialized to the "didn't find item" result.
            return


L = [0, 1, 2, [3, (4, 5, [6, 6.25, 6.5, 6.75, 7])], [[8, ()]], (([9], ), 10)]

traverse(L, 7, 'result1')
traverse(L, 9, 'result2')

print("-------------------------------------------")
print(globalVars['result1'])
print("-------------------------------------------")
print(globalVars['result2'])

我想知道我是否遗漏了什么,实际上有一种方法可以在不使用外部变量的情况下完成这项工作。最好的解决方案是以某种方式 'shutting down' 所有其他递归调用,一旦其中一个 return 成功结果,但我不相信这是可能的(我很想错对这个!)。或者可能是某种 'priority queue' 延迟 'success' 案例递归调用(如果存在)的 return 直到 after 所有 'fail' case 递归调用有 returned?

我看过这个类似的问题: 但是尽管 ggorlen 在这里 接受的答案解决了 OP 的问题,甚至提到了这个问题(“匹配结果没有正确传递到调用堆栈”),它是为执行特定的任务,并且没有提供我正在寻找的更一般情况下的洞察力。

你的第一次尝试几乎是完美的,唯一的错误是你return在当前深度搜索第一个list/tuple的结果,不管 是否找到了 item。相反,您需要检查一个肯定的结果,只有 return 如果是一个。这样你就可以不断迭代当前深度,直到找到 item 或者根本找不到。

所以你需要改变:

return traverse(S[i], item, indices + [i], atDepth + 1) 

类似于:

t = traverse(S[i], item, indices + [i], atDepth + 1) 
if t != ([], -1):
    return t

完整代码:

def traverse(S, item, indices=[], atDepth=0):
    # If the sequence is empty, return the 'item not found' result
    if not S:
        return ([], -1)
    
    else:
        # For each element in the sequence (breadth-first)
        for i in range(len(S)):
            # Success condition base case:  found the item!
            if S[i] == item:
                return (indices + [i], atDepth)
            
            # Recursive step (depth-first):  enter nested sequence
            # and repeat procedure from beginning
            elif type(S[i]) in (list, tuple):
                t = traverse(S[i], item, indices + [i], atDepth + 1) 
                if t != ([], -1):
                    return t
            
        # Fail condition base case:  searched the entire length
        # and depth of the sequence and didn't find the item, so
        # return the 'item not found' result
        else:
            print("We looked everywhere but didn't find " + str(item) + " in " + str(S) + ".")
            return ([], -1)

你的两个测试用例的输出:

>>> traverse(L, 7)
([3, 1, 2, 4], 3)
>>> traverse(L, 9)
We looked everywhere but didn't find 9 in [6, 6.25, 6.5, 6.75, 7].
We looked everywhere but didn't find 9 in (4, 5, [6, 6.25, 6.5, 6.75, 7]).
We looked everywhere but didn't find 9 in [3, (4, 5, [6, 6.25, 6.5, 6.75, 7])].
We looked everywhere but didn't find 9 in [8, ()].
We looked everywhere but didn't find 9 in [[8, ()]].
([5, 0, 0, 0], 3)

注意 正如@FreddyMcloughlan 指出的那样,atDepth 只是 returned 列表的长度减去 1。因此您可以删除该参数从函数调用中,只需使用:


def traverse(S, item, indices=[]):
    # If the sequence is empty, return the 'item not found' result
    if not S:
        return ([], -1)
    
    else:
        # For each element in the sequence (breadth-first)
        for i in range(len(S)):
            # Success condition base case:  found the item!
            if S[i] == item:
                return (indices + [i], len(indices))
            
            # Recursive step (depth-first):  enter nested sequence
            # and repeat procedure from beginning
            elif type(S[i]) in (list, tuple):
                t = traverse(S[i], item, indices + [i]) 
                if t != ([], -1):
                    return t
            
        # Fail condition base case:  searched the entire length
        # and depth of the sequence and didn't find the item, so
        # return the 'item not found' result
        else:
            print("We looked everywhere but didn't find " + str(item) + " in " + str(S) + ".")
            return ([], -1)

你需要边走边堆叠索引,并为递归函数分配一个成功标志。当您执行递归调用时,您将推送新索引。当函数 returns 时,如果失败则弹出并继续搜索,如果成功则除了 return 成功什么都不做。最后,堆栈将是空的或充满解决方案。

堆栈可以是全局变量或作为参数传递。

我会以不同的方式编写它,使用递归调用的结果来决定是立即 return 还是继续搜索。如果找到该项目,我的版本 return 是一个 non-empty 索引列表,否则 None

def traverse(S, item):
    for i, element in enumerate(S):
        if element == item:
            return [i]
        if type(element) in (list, tuple):
            if indices := traverse(element, item):
                indices.insert(0, i)
                return indices

你的两次测试结果(Try it online!):

[3, 1, 2, 4]
[5, 0, 0, 0]

我不传入列表,而是仅从项目的位置向后构建列表(或者如果它不在任何地方,那么我根本不构建任何列表)。尽管在 0 处插入使其成为二次方,但比复制切片快得多,因此工作量要少得多。如果你想要它是线性的,你可以将这个递归函数包装在另一个函数中,并使递归函数 append 索引和包装函数 reverse在 return 将其发送给原始外部呼叫者之前列出。如果需要,包装函数还可以让您将“non-empty 列表或 None”结果转换为其他内容(如深度对)。

由于在评论中讨论了将 traverse 函数重构为迭代而不是递归,为了完整起见,这里也是该版本:

# Iterative-only version of the same function from the question

def traverse(S, item):
    indices = []
    sequence = S
    depth = 0

    # By definition, if the sequence is empty, you won't find any items in it.
    # Return the 'item not found' result.
    if not S:
        return ([], -1)

    else:
        # You have to start somewhere :D
        indices.append(0)

        while depth > -1:
            # Go back up one level once the end of a nested sequence is reached.
            while indices[depth] == len(sequence):
                depth -= 1
                
                # If the depth ever gets below 0, that means that we've scanned
                # the entire length of the outermost sequence and not found the
                # item.  Return the 'failed to find item' result.
                if depth == -1:
                    return ([], -1)

                # Remove the entry corresponding to index in the innermost
                # nested sequence we just exited
                indices.pop()

                # Reset 'sequence' using the outermost sequence S and the indices
                # computed so far, to get back to the sequence which contained
                # the previous one
                sequence = S
                for i in range(len(indices) - 1):
                    sequence = sequence[indices[i]]

                indices[depth] += 1
                # And return to the top of the outer 'while' loop to re-check
                # whether to go down another level
                continue
                
            # Success:  found the item at depth 'depth', in sequence 'sequence',
            # at index 'indices[depth]` inside that sequence.  Return 'indices'
            # and 'depth'.
            if sequence[indices[depth]] == item:
                return (indices, depth)
            
            # Recursion replacement:  enter the nested subsequence and increase the depth
            # as long as the nested subsequence is not empty.  If it is empty, treat it
            # as if it were a non-sequence type.
            elif (type(sequence[indices[depth]]) in (list, tuple)) and (sequence[indices[depth]]):
                sequence = sequence[indices[depth]]
                depth += 1
                indices.append(0)
                        
            # The item being compared is not a sequence type and isn't equal to 'item', so increment
            # the index without increasing the depth
            else:
                indices[depth] += 1
                
        # If all of S has been searched and item was not found,
        # return the 'failed to find item' result
        return ([], -1)
                            

L = [0, 1, 2, [3, (4, 5, [6, 6.25, 6.5, 6.75, 7]), 7.5], [[8, ()]], (([9], ), 10)]

# Demonstrating the function's use:  search for numbers from 0 to 10 in list L,
# in increments of 0.5
for i in range(21):
    outputString = "Looking for " + (str(i/2) if (i/2 % 1) else str(int(i/2))) + " in L: "
    indices, depth = traverse(L, i/2)
    if indices:
         print(outputString + "found at depth " + str(depth) + " by index path " + str(indices))
    else:
        print(outputString + "Not found.")

我听说迭代算法比递归算法更快,所以我尝试使用 time.time() 对所有 3 个工作版本进行基准测试。对于所有 3 个,我使用了如下所示的相同 'busy work' 测试:

L = [0, 1, 2, [3, (4, 5, [6, 6.25, 6.5, 6.75, 7]), 7.5], [[8, ()]], (([9], ), 10)]

startTime = time.time()

for i in range(1000):
    for j in range(21):
        traverse(L, j/2)

finishedAt = time.time()
print("[WHICH VERSION] took " + str(finishedAt - startTime) + " seconds")

并得到了这些结果:

对于应用了 Nick 修复的原始递归函数: Original recursive version took 0.14863014221191406 seconds

对于凯利的递归版本: Quadratic recursive version took 0.11344289779663086 seconds

对于迭代版本: Iterative version took 0.1890261173248291 seconds

正如我所预料的那样,Kelly 的版本比原来的函数更快,它有我的原始函数没有的优化。我还没有实现带有反转列表的外包装函数的线性版本,但我希望那会更快。

不幸的是,迭代版本似乎比没有优化的递归函数的原始版本,所以我想我会把它交给代码审查站点以查看可以改进的地方...

参考资料: 原始错误修复:请在此处查看 Nick 的回答:

Kelly 的递归版本:

迭代版本:见上文