Efficiently slice and read images using multiprocessing

Question

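I have a large satellite image and want to run object detection model inference on it. Currently, I slice the large image, save the tiles to disk, and then read them back so that my model can output detections (boxes and masks). I know this is inefficient: once a slice/tile has been read it is no longer needed, yet I am still writing it to disk.

Is there a more efficient way to do this? Perhaps with multiprocessing or the Ray library?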

Answer 1

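As you mentioned, Ray would be a good fit, since it uses shared memory and the same code can run on one machine or on several.

Something with the following structure should work.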

import numpy as np
import ray

ray.init()

@ray.remote
def do_object_detection(image, index):
    image_slice = image[index]
    # Do object detection.
    return 1

# Store the object in shared memory.
image = np.ones((1000, 1000))
image_id = ray.put(image)

# Process the slices in parallel. You probably want to use 2D slices instead
# of 1D slices.
result_ids = [do_object_detection.remote(image_id, i) for i in range(1000)]
results = ray.get(result_ids)

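Note that the workers executing the do_object_detection tasks do not each create their own copy of the image. Instead, they access the copy of the image held in shared memory.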
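The code comment above suggests 2D slices rather than 1D rows. A minimal sketch of how the tile windows could be generated is shown here. The names `tile_windows` and `_starts`, the `tile` and `overlap` parameters, and the half-open window layout are illustrative assumptions, not part of Ray or of the original answer.

```python
from typing import Iterator, List, Tuple

# Assumed layout: half-open (row_start, row_stop, col_start, col_stop).
Window = Tuple[int, int, int, int]


def _starts(length: int, tile: int, step: int) -> List[int]:
    """Start offsets along one axis; stop once a tile reaches the edge."""
    starts = []
    pos = 0
    while True:
        starts.append(pos)
        if pos + tile >= length:
            break
        pos += step
    return starts


def tile_windows(height: int, width: int, tile: int,
                 overlap: int = 0) -> Iterator[Window]:
    """Yield windows that together cover a height x width image.

    Hypothetical helper: tiles on the bottom and right edges are
    clipped to the image bounds, so they may be smaller than `tile`.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    step = tile - overlap
    for r0 in _starts(height, tile, step):
        for c0 in _starts(width, tile, step):
            yield (r0, min(r0 + tile, height), c0, min(c0 + tile, width))
```

In the structure above, each window could be passed to the remote task in place of `index`, with the task slicing `image[r0:r1, c0:c1]`. A nonzero `overlap` may help with objects that straddle a tile boundary, at the cost of duplicate detections that then need to be merged.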

If you already have the images in separate files, another approach is to do something like the following.

import numpy as np
import ray

ray.init()

@ray.remote
def do_object_detection(filename):
    # Load the file and process it.
    return 1

filenames = ['file1.png', 'file2.png', 'file3.png']

# Process all of the images.
result_ids = [do_object_detection.remote(filename) for filename in filenames]
results = ray.get(result_ids)
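Whichever variant is used, a detector run on a tile will typically report boxes relative to that tile, so they have to be shifted back into full-image coordinates before results from different tiles are combined. The sketch below assumes boxes are `(x_min, y_min, x_max, y_max)` in pixels and that the tile's top-left offset is known (for example, recorded when the tile was cut). The function name `to_image_coords` is hypothetical.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def to_image_coords(boxes: List[Box], row_start: int,
                    col_start: int) -> List[Box]:
    """Shift tile-local boxes by the tile's top-left corner.

    x coordinates move by the column offset, y coordinates by the
    row offset.
    """
    return [
        (x0 + col_start, y0 + row_start, x1 + col_start, y1 + row_start)
        for x0, y0, x1, y1 in boxes
    ]
```

Masks would need the same offset applied, for instance by pasting each tile-sized mask into the matching region of a full-size array.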
