如何在不达到递归限制的情况下腌制 class() 具有子对象或邻居关系的对象并在加载时保留对象

Question

底部有一个 TLDR。我有一个 python class 如下，它用于为 D* 寻路创建节点（不包括寻路过程）：

import numpy as np
from tqdm import tqdm 
import sys
class Node():
    
    """A node class for D* Pathfinding"""

    def __init__(self,parent, position):
        self.parent = parent
        self.position = position
        self.tag = "New"
        self.state = "Empty"
        self.h = 0
        self.k = 0
        self.neighbours = []

    def __eq__(self, other):
        if type(other) != Node : return False
        else: return self.position == other.position
    def __lt__(self, other):
        return self.k < other.k

def makenodes(grid):
    
    """Given all the enpoints, make nodes for them"""
    endpoints = getendpoints(grid) 
    nodes = [Node(None,pos) for pos in endpoints]
    t = tqdm(nodes, desc='Making Nodes') #Just creates a progress bar
    for node in t:
        t.refresh()
        node.neighbours = getneighbours(node,grid,nodes)
    return nodes

def getneighbours(current_node,grid,spots):
'''Given the current node, link it to it's neighbours''' 
    neighbours = []
    for new_position in [(0, -1), (0, 1), (-1, 0), (1, 0)]: # Adjacent nodes

            # Get node position
            node_position = (int(current_node.position[0] + new_position[0]), int(current_node.position[1] + new_position[1]))

            # Make sure within range
            if node_position[0] > (len(maze) - 1) or node_position[0] < 0 or node_position[1] > (len(maze[len(maze)-1]) -1) or node_position[1] < 0:
                continue

            # Create new node
            new_node = spots[spots.index(Node(None, node_position))]
            neighbours.append(new_node)
    return neighbours

def getendpoints(grid):
    
    """returns all locations on the grid"""
    
    x,y = np.where(grid == 0)
    endpoints = [(x,y) for (x,y) in zip(x,y)]
    return endpoints

"""Actually Making The Nodes Goes as Follows"""
grid = np.zeros((100,100))
spots = makenodes(grid)

sys.setrecursionlimit(40000)
grid = np.zeros((100,100))
spots = makenodes(grid)

with open('100x100 Nodes Init 4 Connected', 'wb') as f:
    pickle.dump(spots, f)
print('Done 100x100')

该程序运行很好，但是这个节点制作过程在 100x100 上需要 3 分钟，但在 1000x1000 上需要一个多星期。为了解决这个问题，我使用 pickle.dump() 来保存所有节点，然后当我运行我的程序时，我可以使用 spots = pickle.load(...)。在 grid=np.zeros((40,50)) 上，我必须将递归限制设置为大约 16000，在 100x100 上，我将递归限制设置为 40000，但内核死机了。

为了帮助您直观地了解节点是什么以及它们与邻居的连接方式，请参阅我绘制的图像，其中黑色：节点和白色：连接。

如果我将 getneighbours() 更改为 return (x,y) 坐标元组（而不是节点对象），那么我可以腌制节点，因为它们不再递归。这种方法会减慢程序的其余部分，因为每次我想引用一个节点时，我都必须重建它并在列表 spots 中搜索它。（因为问题已经很长了，所以我简化了这部分的解释）。

如何保存这些节点对象，同时将它们的邻居保持为更大网格的节点？[=37=]

TLDR: 我有一个节点 class 是递归的，因为每个节点都包含对其邻居的引用，使其高度递归。我可以使用 pickle for grid = np.zeros((40,50)) 保存节点并设置相当高的递归限制 16000。对于较大的网格，我在调用 pickle.dump
时达到了最大系统递归

Answer 1

再次包含 TLDR（这样您就可以检查这是否适合您）。我找到了一种解决方法，它保留了 pickling 的省时优势，但允许节点即使在更大的尺寸下也能被 pickle。包括用于证明这一点的测试：

我更改了 makenodes() 中每个 Node() 的结构，通过剥离它的邻居来进行腌制，从而减少递归。使用问题中提到的方法，grid = np.zeros((40,50)) 将需要 sys.setrecursionlimt(16000) 左右。使用这种方法，永远不会达到 Python 的内置递归限制 1000，并且可以在加载 pickle 文件时重建预期的结构。

基本上酸洗的时候，都是按照下面的流程进行的。

列出所有节点及其相应的索引，以便：

In [1]: nodes
Out [1]: [[0, <__main__.Node at 0x7fbfbebf60f0>],...]

对邻居做同样的事情：

In [2]: neighbours
Out [2]: [[0, [<__main__.Node at 0x7fbfbebf6f60>, <__main__.Node at 0x7fbfbec068d0>]],...]

以上是任何大小的迷宫的节点和邻居的外观示例。 '...' 表示列表继续包含它包含的元素，另外注意 len(nodes) 应该等于 len(neighbours。该索引不是必需的，但我已将其作为附加检查包括在内。

变化如下：

def makenodes(grid):
    
    """Given all the enpoints, make nodes for them"""
    
    endpoints = getendpoints(grid)
    spots = [Node(None,pos) for pos in endpoints]
    neighbours = []
    nodes = []
    t = tqdm(spots, desc='Making Nodes')
    i = 0
    for node in t:
        t.refresh()
        neighbour = getneighbours(node,grid,spots)
        nodes.append([i,node])
        neighbours.append([i,neighbour])
        i+=1
            
    return nodes, neighbours

"""Actually Making the Nodes"""
grid = np.zeros((100,100))
nodes, neighbours = makenodes(grid)
arr = np.array([nodes,neighbours])
with open('100x100 Nodes Init 4 Connected.pickle', 'wb') as f:
    pickle.dump(arr,f)
print('Done 100x100')

def reconstruct(filename):

'''Reconstruct the relationships by re-assigning neighbours'''
    with open(filename, 'rb') as file:
        f = pickle.load(file)
    nodes = f[0]
    neighbours = f[1]
    reconstructed = []
    for node in nodes:
        i = node[0]
        for neighbour in neighbours[i][1]:
            node[1].neighbours.append(neighbour)
            reconstructed.append(node[1])
    return reconstructed

nodes_in = reconstruct('100x100 Nodes Init 4 Connected.pickle')

这实现了预期的输出，并且可以在重新加载（重建）时重新建立 neighbour/children 关系。此外，每个节点的邻居列表中的对象直接指向它们对应的节点，这样如果你运行只在选择的数组上这样做而不重建它：

In [3]: nodes2[1][1] is neighbours2[0][1][0]
Out[3]: True

时差创建和保存 50x50 网格的平均时间是。这样做 60 次的平均时间：

原路：9.613s
适配方式（重构）：9.822s

平均加载时间：

原路：0.0582s
适应方式：0.04071s

两个文件大小：410KB
结论

适应的方式可以处理更大的尺寸，从时代来看，具有相同的性能。这两种方法都会创建相同大小的文件，因为 pickle 只存储每个对象一次，请阅读相关文档。请注意，选择使用 50x50 网格是为了防止原始方法达到递归限制。我能够用适应的方式腌制的最大尺寸是 500x500。我没有做得更大，仅仅是因为节点制作过程在我的机器上花费了将近 2 天的时间。然而加载 500x500 只需要不到 1 秒，所以好处更加明显。

**TLDR：**我设法将节点及其相应的邻居保存为 np.array([nodes.neighbours]) 的数组，以便在 pickling 时，重新生成必要的数据当可以使用 un-pickling 时建立关系。本质上，必要的数据以不需要提高递归限制的方式计算和保存。然后您可以在加载时快速重建数据。

如何在不达到递归限制的情况下腌制 class() 具有子对象或邻居关系的对象并在加载时保留对象

How to pickle class() objects with children or neighbour relationships without hitting recursion limits and retain objects when loading

python

pickle