Converting a recursive method (where the recursion is done inside a loop) into an iterative method

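I have a recursive algorithm that walks a hierarchical data structure, but for some data the hierarchy is so deep that I get a StackOverflowError. I have seen this happen at a depth of about 150 nodes, and the data could potentially grow much deeper than that. For context, this code runs in a constrained environment where changing the JVM stack size is not an option, and the data structure is a given: it represents different file systems with directories and files.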

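To solve the stack overflow, I tried converting the algorithm into an iterative one. This is not something I have had to do before, so I started from some examples that show how to do it for simple recursion, but I was not sure how to apply them to recursion inside a loop. I found an approach that seems to work, but the code is rather insane.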

Here is a simplified version of my original recursive method:

private CacheEntry sumUpAndCacheChildren(Node node) {
    final CacheEntry entry = getCacheEntry(node);

    if (entryIsValid(entry))
        return entry;

    Node[] children = node.listChildren();

    long size = 0;  

    if (children != null) {         
        for (Node child : children) {
            if (child.hasChildren()) {  
                size += sumUpAndCacheChildren(child).size;                  
            } else {                    
                size += child.size();
            }
        }                   
    }

    return putInCache(node, size);      
}

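Each leaf node has a size, and the size of any ancestor node is taken to be the combined size of all its descendants. I want to know this size for every node, so the size of each node is aggregated and cached.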

Here is the iterative version:

private CacheEntry sumUpAndCacheChildren(Node initialNode) {
    class StackFrame {
        final Node node;
        Node[] children;

        // Local vars
        long size;

        // Tracking stack frame state
        int stage;
        int loopIndex;

        StackFrame(Node node) {
            this.node = node;
            this.children = null;
            this.size = 0;
            this.stage = 0;
            this.loopIndex = 0;
        }
    }

    final Stack<StackFrame> stack = new Stack<StackFrame>();
    stack.push(new StackFrame(initialNode));
    CacheEntry retValue = getCacheEntry(initialNode);

    outer:
    while (!stack.isEmpty()) {
        final StackFrame frame = stack.peek();
        final Node node = frame.node;

        switch(frame.stage) {
            case 0: {
                final CacheEntry entry = getCacheEntry(node);

                if (entryIsValid(entry)) {
                    retValue = entry;
                    stack.pop();
                    continue;       
                }

                frame.children = node.listChildren();
                frame.stage = frame.children != null ? 1 : 3;
            } break;
            case 1: {
                for (int i = frame.loopIndex; i < frame.children.length; ++i) {
                    frame.loopIndex = i;
                    final Node child = frame.children[i];

                    if (child.hasChildren()) {
                        stack.push(new StackFrame(child));
                        frame.stage = 2;    // Accumulate results once all the child stacks have been calculated.
                        frame.loopIndex++;  // Make sure we restart the for loop at the next iteration the next time around.
                        continue outer;
                    } else {
                        frame.size += child.size();
                    }
                }

                frame.stage = 3;
            } break;
            case 2: {
                // Accumulate results
                frame.size += retValue.size;
                frame.stage = 1;            // Continue the for loop
            } break;
            case 3: {
                retValue = putInCache(node, frame.size);
                stack.pop();
                continue;
            }
        }
    }

    return retValue;
}

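This feels crazier than it needs to be, and it would be painful to have to do this in every place in the code where I recurse into the children and perform different operations on them. What techniques can I use to make the recursion easier to convert when I aggregate at each level and do so inside a for loop over the children?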

EDIT:

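With the help of the answers below, I was able to simplify things a great deal. The code is now almost as concise as the original recursive version. Now I just need to apply the same principles everywhere else that recurses over the same data structure.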

OK, I'll explain it in plain words, since I don't feel like coding right now:

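  1. Take the topmost element and put it into a list
  2. Start of loop
  3. Count the elements on this level and add them to your counter
  4. Get the list of children of the elements in the current list and store it separately
  5. Delete the list of current elements
  6. Write the list of children to where the list of current elements was
  7. End of loop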

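You just need a boolean in the loop header that you set to false once the list of children has no more elements. I hope I expressed myself clearly; feel free to ask questions and/or ask for clarification.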
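The steps above could be sketched in Java roughly as follows. This is only an illustration under assumptions: `LevelNode` and `LevelOrderSum` are hypothetical stand-ins invented for the example (the question's `Node` type is not shown in full), and an empty list check replaces the boolean flag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node, standing in for the question's Node type.
final class LevelNode {
    final long size;                                 // only meaningful for leaves
    final List<LevelNode> children = new ArrayList<>();

    LevelNode(long size) { this.size = size; }

    LevelNode add(LevelNode child) { children.add(child); return this; }
}

final class LevelOrderSum {
    // Sums the sizes of all leaves in the tree, one level at a time.
    static long totalSize(LevelNode root) {
        long total = 0;
        List<LevelNode> current = new ArrayList<>();
        current.add(root);                           // step 1

        while (!current.isEmpty()) {                 // steps 2 and 7
            List<LevelNode> next = new ArrayList<>();
            for (LevelNode n : current) {
                if (n.children.isEmpty()) {
                    total += n.size;                 // step 3: count the leaves on this level
                } else {
                    next.addAll(n.children);         // step 4: collect the children separately
                }
            }
            current = next;                          // steps 5 and 6: replace the current list
        }
        return total;
    }

    public static void main(String[] args) {
        LevelNode root = new LevelNode(0)
            .add(new LevelNode(3))
            .add(new LevelNode(0).add(new LevelNode(4)).add(new LevelNode(5)));
        System.out.println(totalSize(root)); // prints 12
    }
}
```

Note that this sketch yields only the grand total for the starting node; it does not produce the per-node sizes that the question wants to cache.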

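If the data structure keeps "folding out", this algorithm will get progressively slower with each iteration (roughly O(n²)). It is rather inefficient, and I'm quite sure somebody can come up with an optimization, but it will be faster than recursion and it won't produce a stack overflow. For very large data sets it could produce an OutOfMemoryError, but since only one level is iterated at any time, I guess that is quite unrealistic.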

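Since you're dealing with a tree structure and want to compute cumulative sizes, try a DFS while keeping track of each node's parent. I'm assuming here that you can't change or subclass Node, and I've kept all the function signatures you use.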

private class SizedNode {
    public long cumulativeSize;
    public Node node;
    public SizedNode parent;

    public SizedNode(SizedNode parent, Node node) {
        this.node = node;
        this.parent = parent;
    }

    public long getSize() {
        if (node.hasChildren()) {
            return cumulativeSize;
        }
        else {
            return node.size();
        }
    }
}

private void sumUpAndCacheChildren(Node start)
{
    Stack<SizedNode> nodeStack = new Stack<SizedNode>();

    // Let's start with the beginning node.
    nodeStack.push(new SizedNode(null, start));

    // Loop as long as we've got nodes to process
    while (!nodeStack.isEmpty()) {

        // Take a look at the top node
        SizedNode sizedNode = nodeStack.peek();            
        CacheEntry entry = getCacheEntry(sizedNode.node);

        if (entryIsValid(entry)) {
            // It's cached already, so we have computed its size
            nodeStack.pop();

            // Add the size to the parent, if applicable.
            if (sizedNode.parent != null) {
                // Use the cached size: cumulativeSize is still 0 for a directory that was already cached before this call.
                sizedNode.parent.cumulativeSize += entry.size;

                // If the parent's now the top guy, we're done with it so let's cache it
                if (sizedNode.parent == nodeStack.peek()) {
                    putInCache(sizedNode.parent.node, sizedNode.parent.getSize());
                }
            }
        }
        else {
            // Not cached.
            if (sizedNode.node.hasChildren()) {
                // It's got a bunch of children.
                // We can't compute the size yet, so just add the kids to the stack.
                Node[] children = sizedNode.node.listChildren();
                if (children != null) {
                    for (Node child : children) {
                        nodeStack.push(new SizedNode(sizedNode, child));
                    }    
                }                    
            }
            else {
                // It's a leaf node. Let's cache it.
                putInCache(sizedNode.node, sizedNode.node.size());
            }
        }
    }
}

You are essentially doing an iterative post-order traversal of an N-ary tree; you could search for that to find more detailed examples.

Very rough pseudocode:

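// Rough sketch in Java; Node, rootNode and putInCache are the ones from the question.
// Requires java.util.ArrayDeque and java.util.Deque.
Deque<Node> pathToCurrent = new ArrayDeque<>();      // ancestors of the current node
Deque<Node[]> childrenInStack = new ArrayDeque<>();  // each ancestor's children
Deque<Long> sizesInStack = new ArrayDeque<>();       // each ancestor's size so far
Deque<Integer> indexInNode = new ArrayDeque<>();     // child index each ancestor is at

Node current = rootNode;
Node[] currentChildren = current.listChildren();
long currentSize = 0;
int currentIndex = 0;

while (true) {
  if (currentChildren != null && currentIndex < currentChildren.length) {
    // Descend into the next child, saving the current node's state.
    pathToCurrent.push(current);
    childrenInStack.push(currentChildren);
    sizesInStack.push(currentSize);
    indexInNode.push(currentIndex);
    current = currentChildren[currentIndex];
    currentChildren = current.listChildren();
    currentSize = 0;
    currentIndex = 0;
  } else {
    // This node is a leaf, or all of its children have been handled:
    // cache its size, then return to the parent and move on to its next child.
    if (!current.hasChildren()) {
      currentSize += current.size();
    }
    putInCache(current, currentSize);
    if (pathToCurrent.isEmpty()) {
      break;  // the root has been cached, so we are done
    }
    current = pathToCurrent.pop();
    currentChildren = childrenInStack.pop();
    currentSize += sizesInStack.pop();     // add this subtree to the parent's running total
    currentIndex = indexInNode.pop() + 1;  // continue with the parent's next child
  }
}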
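Another common way to get post-order behaviour without recursion, shown here only as a sketch, is to record the nodes in pre-order first and then walk that list backwards, so every child is summed before its parent. `TreeNode` and `ReversePreOrder` are hypothetical names made up for this example, and an in-memory map stands in for the question's cache.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the question's Node type.
final class TreeNode {
    final long size;                                  // only meaningful for leaves
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(long size) { this.size = size; }

    TreeNode add(TreeNode child) { children.add(child); return this; }
}

final class ReversePreOrder {
    // Returns the aggregated size of every node in the subtree rooted at root.
    static Map<TreeNode, Long> sizes(TreeNode root) {
        // Pass 1: record nodes so that every parent precedes all of its descendants.
        List<TreeNode> order = new ArrayList<>();
        Deque<TreeNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TreeNode n = stack.pop();
            order.add(n);
            for (TreeNode c : n.children) {
                stack.push(c);
            }
        }

        // Pass 2: walk backwards, so all children are summed before their parent.
        Map<TreeNode, Long> result = new IdentityHashMap<>();
        for (int i = order.size() - 1; i >= 0; i--) {
            TreeNode n = order.get(i);
            long total = n.children.isEmpty() ? n.size : 0;
            for (TreeNode c : n.children) {
                total += result.get(c);
            }
            result.put(n, total);
        }
        return result;
    }
}
```

The trade-off in this sketch is memory: it holds a list of every node in the subtree, whereas the stack-based versions hold only the nodes still waiting to be processed.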

After adapting @Marius's answer to my use case, I came up with this:

class SizedNode {
    final Node node;
    final SizedNode parent;

    long size;
    boolean needsCaching;

    SizedNode(Node node, SizedNode parent) {
        this.parent = parent;
        this.node = node;
    }
}

private CacheEntry sumUpAndCacheChildren(Node start) {      
    final Stack<SizedNode> stack = new Stack<SizedNode>();
    stack.push(new SizedNode(start, null));
    CacheEntry returnValue = getCacheEntry(start);

    while (!stack.isEmpty()) {
        final SizedNode sizedNode = stack.pop();           
        final CacheEntry entry = getCacheEntry(sizedNode.node);

        if (sizedNode.needsCaching) {
            // We finished processing all children, and now we're done with this node.
            if (sizedNode.parent != null) {
                sizedNode.parent.size += sizedNode.size;                
            }
            returnValue = putInCache(sizedNode.node, sizedNode.size);
        } else if (entryIsValid(entry)) {
            if (sizedNode.parent != null) {
                sizedNode.parent.size += entry.size;
            }
            returnValue = entry;
        } else {                    
            // The next time we see this node again, it will be after we process all of its children.
            sizedNode.needsCaching = true;
            stack.push(sizedNode);

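            // listChildren() may return null, as the recursive version already checks.
            final Node[] children = sizedNode.node.listChildren();
            if (children != null) {
                for (Node child : children) {
                    if (child.hasChildren()) {
                        stack.push(new SizedNode(child, sizedNode));
                    } else {
                        sizedNode.size += child.size();
                    }
                }
            }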
        }
    }

    return returnValue;
}

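This is much better than the crazy mess I came up with on my first pass. It just goes to show that you really do have to think about transforming the algorithm so that it also makes sense as an iterative method. Thanks to everyone for the help!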
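For anyone who wants to try the "push the node back with a flag" idea in isolation, here is a self-contained sketch. Everything in it is assumed for illustration: `DemoNode`, `MarkerStackDemo`, `deepChain`, and the `Map` used in place of the real cache are not part of the original code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the question's Node type.
final class DemoNode {
    final long ownSize;                               // only meaningful for leaves
    final List<DemoNode> children = new ArrayList<>();

    DemoNode(long ownSize) { this.ownSize = ownSize; }
}

final class MarkerStackDemo {
    private static final class Frame {
        final DemoNode node;
        final Frame parent;
        long size;
        boolean childrenDone;                         // plays the role of needsCaching

        Frame(DemoNode node, Frame parent) { this.node = node; this.parent = parent; }
    }

    // Aggregates sizes bottom-up and stores one entry per non-leaf node in cache.
    static long sumUp(DemoNode start, Map<DemoNode, Long> cache) {
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(start, null));

        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            Long cached = cache.get(f.node);
            if (f.childrenDone) {
                // Second visit: every child has already added its size to f.size.
                cache.put(f.node, f.size);
                if (f.parent != null) f.parent.size += f.size;
            } else if (cached != null) {
                if (f.parent != null) f.parent.size += cached;
            } else {
                // First visit: re-push this node below its children.
                f.childrenDone = true;
                stack.push(f);
                for (DemoNode c : f.node.children) {
                    if (c.children.isEmpty()) f.size += c.ownSize;
                    else stack.push(new Frame(c, f));
                }
            }
        }
        return cache.get(start);
    }

    // Builds a chain of nested "directories", each holding one 1-byte "file".
    static DemoNode deepChain(int depth) {
        DemoNode root = new DemoNode(0);
        DemoNode dir = root;
        for (int i = 1; i < depth; i++) {
            DemoNode next = new DemoNode(0);
            dir.children.add(new DemoNode(1));
            dir.children.add(next);
            dir = next;
        }
        dir.children.add(new DemoNode(1));
        return root;
    }

    public static void main(String[] args) {
        // 100,000 levels deep, far beyond the ~150 levels mentioned in the question.
        System.out.println(sumUp(deepChain(100_000), new HashMap<>())); // prints 100000
    }
}
```

The explicit stack lives on the heap, so in this sketch the depth is limited by available memory rather than by the thread's call stack.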