Python 嵌入式函数内存爆炸

Question

我使用Python有一段时间了，时不时遇到一些内存爆炸的问题。我已经搜索了一些资源来解决我的问题，例如 Memory profiling embedded python 和 https://mflerackers.wordpress.com/2012/04/12/fixing-and-avoiding-memory-leaks-in-python/ 和 https://docs.python.org/2/reference/datamodel.html#object.del 然而，none 对我有用。

我现在的问题是使用内嵌函数时内存爆炸。以下代码工作正常：

class A:
  def fa:
    some operations
    get dictionary1
    combine dictionary1 to get string1
    dictionary1 = None
    return *string1*

  def fb: 
    for i in range(0, j):
      call self.fa
      get dictionary2 by processing *string1* 
      # dictionary1 and dictionary2 are basically the same. 
      update *dictionary3* by processing dictionary2
      dictionary2 = None
    return *dictionary3*

class B:
  def ga: 
    for n in range(0, m):
      call A.fb # as one argument is updated dynamically, I have to call it within the loop 
      processes *dictoinary3*
    return something

当我注意到我不需要将 dictionary1 组合到 string1 时，问题就出现了，我可以直接通过 dictionary1 到 A.fb。我是这样实现的，然后程序变得极慢，内存使用量暴增10倍以上。我已验证这两种方法都返回了正确的结果。

谁能告诉我为什么这么小的修改会导致这么大的差异？

以前，我在对多源树（具有 100,000+ 个节点）中的节点进行分级时也注意到了这一点。如果我从源节点（可能具有最大高度）开始调平，则内存使用率比可能具有最小高度的源节点差 100 倍。而平化时间差不多。

这个问题困扰了我很久。提前致谢！

如果有人感兴趣，我可以将源代码通过电子邮件发送给您以获得更清楚的解释。

Answer 1

您正在解决相同的问题这一事实不应该暗示解决方案的效率。数组排序也有同样的问题：你可以使用冒泡排序 O(n^2)，或合并排序 O(nlogn)，或者，如果你可以应用一些限制，你可以使用像 radix 这样的非比较排序算法或具有线性运行时间的桶排序。

从不同的节点开始遍历将生成不同的遍历图的方式 - 其中一些可能效率不高（重复节点更多次）。

至于 "combine dictionary1 to string1" - 它可能是一个非常昂贵的操作，并且由于此函数被递归调用（多次） - 性能可能会差得多。但这只是一个有根据的猜测，如果没有更多关于在这些函数中执行的操作的复杂性的详细信息，就无法回答。

Python 嵌入式函数内存爆炸

Python memory explosion with embedded functions

python

memory-management

memory-leaks