Cannot append results to lists with multiprocessing

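The code below generates face encodings using multiprocessing. I can print the encodings, but the problem is that knownEncodings, knownNames, no_faces, and error_in_image are all empty after execution. I know this is caused by multiprocessing, but I'm not sure how to work around it.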

import face_recognition
from imutils import paths
from multiprocessing import Pool
import pickle
import cv2
import os,sys,time

print("[INFO] quantifying faces...")

img_folder_path=sys.argv[1]

image_paths = list(paths.list_images(img_folder_path))

knownEncodings = []
knownNames = []
no_faces = []
error_in_image =[]

def create_encoding(imagePath):
    print("[INFO] processing image...")
    name = imagePath.split(os.path.sep)[-1]
    image = cv2.imread(imagePath)
    if image is None:
        return
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # detect the (x, y)-coordinates of the bounding boxes
    # corresponding to each face in the input image
    boxes = face_recognition.face_locations(rgb)

    # compute the facial embedding for the face
    if len(boxes) != 0:
        boxes = list(boxes[0])
        # pass the RGB image (not the BGR one read by OpenCV)
        encodings = face_recognition.face_encodings(rgb, [boxes])
        for encoding in encodings:  
            knownEncodings.append(encoding)
            knownNames.append(name)
        
    else:
        print("no face found", imagePath)
        no_faces.append(imagePath)



# loop over the image paths with multiprocessing
start_time = time.time()

with Pool(8) as pool:
    pool.map(create_encoding, image_paths )


end_time = time.time()
print(end_time - start_time)

# dump the facial encodings + names to disk
print("[INFO] serializing encodings...")
data = {"encodings": knownEncodings, "names": knownNames, "no_faces":no_faces,"error_in_image":error_in_image}

f_name = img_folder_path.replace("/","-")
print(f_name)
with open(f"encodings_{f_name}.pickle", "wb") as f:
    pickle.dump(data, f)

You should not use plain lists across multiple processes. You can use multiprocessing.Queue or another process-safe structure. See: How to use multiprocessing queue in Python?
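As a rough illustration of the queue approach, here is a minimal sketch. The names worker and run_demo are hypothetical, squaring a number stands in for computing a face encoding, and the "fork" start method is an assumption (POSIX only). It uses Process rather than Pool because a plain multiprocessing.Queue is meant to be handed to child processes when they are created.

```python
import multiprocessing as mp


def worker(value, queue):
    # Placeholder for the real work (e.g. computing an encoding).
    queue.put(value * value)


def run_demo(values):
    # Assumption: a fork-based platform such as Linux.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    children = [ctx.Process(target=worker, args=(v, queue)) for v in values]
    for child in children:
        child.start()
    # Read exactly one result per child before joining, so no child
    # blocks while trying to flush its data into the queue.
    collected = [queue.get() for _ in children]
    for child in children:
        child.join()
    # Arrival order is not guaranteed, so sort for a stable result.
    return sorted(collected)


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))
```

The parent is the only process that builds the final list; the children only send results back through the queue.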

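Presumably you are running under an OS, such as Linux, that uses fork to create new processes; otherwise you would need to place the code that creates new processes inside an if __name__ == '__main__': block. But you really should tag multiprocessing questions with the actual platform you are running on. Your problem (in any case) is that your global lists (e.g. knownEncodings) are not shared across processes. Each process (in the Linux case) inherits the main process's address space as it looked at the moment the process was created, but as soon as the process modifies a variable, a copy of that storage is made ("copy-on-write" semantics). Essentially, each process has its own copy of the global variables.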
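The effect described above can be reproduced in a few lines. This is only a sketch: results, worker, and run_demo are hypothetical names, and the "fork" start method is assumed, as in the paragraph above.

```python
import multiprocessing as mp

# Module-level list, playing the role of knownEncodings in the question.
results = []


def worker(value):
    # This appends to the child's private copy of the list only.
    results.append(value)


def run_demo():
    # Assumption: a fork-based platform such as Linux.
    ctx = mp.get_context("fork")
    children = [ctx.Process(target=worker, args=(i,)) for i in range(3)]
    for child in children:
        child.start()
    for child in children:
        child.join()
    # The parent's list was never touched by the children.
    return list(results)


if __name__ == "__main__":
    print(run_demo())  # prints []
```

Each child sees its own append succeed, which is why printing inside the worker looks fine while the parent's list stays empty.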

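However, there are a number of ways to share data across processes. The simplest solution, i.e. the one requiring the fewest changes to your program, is to use "managed lists" created by the multiprocessing.managers.SyncManager class, an instance of which can be obtained by calling multiprocessing.Manager() (which also results in a process being created). The changes you need are: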

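Assuming you are running on a platform that uses fork, such as Linux, replace the global definitions of the four lists at the top of your program with: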

from multiprocessing import Manager

manager = Manager()
knownEncodings = manager.list()
knownNames = manager.list()
no_faces = manager.list()
error_in_image = manager.list()
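To see a managed list working end to end without the face-recognition dependencies, here is a small self-contained sketch. It is not the exact program above: record_square and run_demo are hypothetical names, squaring a number stands in for computing an encoding, the "fork" start method is assumed, and the list proxy is passed to the workers explicitly instead of being inherited as a global.

```python
import multiprocessing as mp
from functools import partial


def record_square(shared, n):
    # 'shared' is a proxy; append() is forwarded to the manager process.
    shared.append(n * n)


def run_demo(numbers):
    # Assumption: a fork-based platform such as Linux.
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        shared = manager.list()
        with ctx.Pool(2) as pool:
            pool.map(partial(record_square, shared), numbers)
        # Copy into a plain list before the manager shuts down.
        collected = list(shared)
    # Workers append in no particular order, so sort for a stable result.
    return sorted(collected)


if __name__ == "__main__":
    print(run_demo([1, 2, 3, 4]))
```

Every append goes through the manager process, so this is slower than a plain list; copying the proxy into an ordinary list once the pool has finished keeps the rest of the program unchanged.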