NumPy shared_memory array is reset to zeros inside Pool

I am trying to share a large 3-dimensional numpy array with a process Pool so that I can perform some operations on slices of that array. In my main:

import numpy as np
from multiprocessing import Pool, shared_memory

_dtype = np.dtype('float64')
n_rotations, n_coords, n_points = 7000, 3, 25600
shm = shared_memory.SharedMemory(
    create=True, size=n_rotations * n_coords * n_points * _dtype.itemsize)
rotations_name = shm.name
coordinates = np.ndarray(
    (n_rotations, n_coords, n_points), dtype=_dtype, buffer=shm.buf)
coordinates = rotations @ ellipsoid
print(coordinates.shape)  # outputs (n_rotations, n_coords, n_points)

chunks = [(rot_idx, rotations_name,
            args.output, (n_rotations, n_coords, n_points), max_rad)
            for rot_idx in range(n_rotations)]
pool = Pool(args.processes)
_res = pool.starmap_async(gen_features, chunks).get()

where gen_features is defined as follows:

def gen_features(idx: int, buf_name: str, _dir: str,
                 rot_dims: tuple, max_rad: int):
    shm = shared_memory.SharedMemory(name=buf_name)
    rotations = np.ndarray(rot_dims, dtype=np.dtype('float64'), buffer=shm.buf)
    print(rotations)  # here the np array has become zero-filled for some reason
    del rotations  # drop the view so that shm.close() can release the buffer
    shm.close()
    return idx

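After nearly an hour of debugging, it turns out that you have to "copy" the data into the shared block, as the example in the `multiprocessing.shared_memory` documentation does: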

b[:] = a[:]  # Copy the original data into shared memory

Basically, this (slice assignment writes the result into the existing shared buffer)

coordinates[:] = rotations @ ellipsoid

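instead of this (plain assignment only rebinds the name `coordinates` to a new, unshared array and leaves the shared buffer untouched)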

coordinates = rotations @ ellipsoid
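
To make the difference concrete, here is a minimal single-process sketch. The helper name `fill_shared` and its `rebind` flag are hypothetical and exist only for this illustration. It creates a shared block, stores values either by slice assignment or by rebinding, and then attaches to the block a second time by name, the way a worker would. The block is zeroed explicitly first so the outcome does not depend on what the platform puts in fresh shared memory.

```python
import numpy as np
from multiprocessing import shared_memory


def fill_shared(shape, values, rebind=False):
    """Store `values` in a new shared block and return what a second
    attachment (by name) reads back. Hypothetical helper for illustration."""
    dtype = np.dtype('float64')
    shm = shared_memory.SharedMemory(
        create=True, size=int(np.prod(shape)) * dtype.itemsize)
    try:
        view = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        view[:] = 0.0  # start from a known state
        if rebind:
            # Only rebinds the local name; the shared buffer is not written.
            view = values
        else:
            # Copies the data element by element into the shared buffer.
            view[:] = values
        # Attach by name, as a worker process would, and copy the contents out.
        other = shared_memory.SharedMemory(name=shm.name)
        seen = np.ndarray(shape, dtype=dtype, buffer=other.buf).copy()
        other.close()
        del view  # release the buffer export before closing the block
    finally:
        shm.close()
        shm.unlink()
    return seen


if __name__ == "__main__":
    vals = np.arange(24, dtype='float64').reshape(2, 3, 4)
    print(np.array_equal(fill_shared((2, 3, 4), vals), vals))
    print(fill_shared((2, 3, 4), vals, rebind=True).any())
```

With slice assignment the second attachment sees the data; with rebinding it still sees the zeros. For the large array in the question, `np.matmul(rotations, ellipsoid, out=coordinates)` should also write straight into the shared buffer and avoid the temporary result array, assuming the shapes and dtype match the output exactly.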