Cannot append results to lists when using multiprocessing
The code below generates facial encodings using multiprocessing. I can print the encodings, but the problem is that knownEncodings, knownNames, no_faces and error_in_image are all empty after execution. I know this is caused by multiprocessing, but I am not sure how to work around it.
import face_recognition
from imutils import paths
from multiprocessing import Pool
import pickle
import cv2
import os, sys, time

print("[INFO] quantifying faces...")
img_folder_path = sys.argv[1]
image_paths = list(paths.list_images(img_folder_path))
knownEncodings = []
knownNames = []
no_faces = []
error_in_image = []

def create_encoding(imagePath):
    print("[INFO] processing image...")
    name = imagePath.split(os.path.sep)[-1]
    image = cv2.imread(imagePath)
    if image is None:
        return
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # detect the (x, y)-coordinates of the bounding boxes
    # corresponding to each face in the input image
    boxes = face_recognition.face_locations(rgb)
    # compute the facial embedding for the first detected face
    if len(boxes) != 0:
        boxes = list(boxes[0])
        encodings = face_recognition.face_encodings(rgb, [boxes])
        for encoding in encodings:
            knownEncodings.append(encoding)
            knownNames.append(name)
    else:
        print("no face found", imagePath)
        no_faces.append(imagePath)

# loop over the image paths with multiprocessing
start_time = time.time()
with Pool(8) as pool:
    pool.map(create_encoding, image_paths)
end_time = time.time()
print(end_time - start_time)

# dump the facial encodings + names to disk
print("[INFO] serializing encodings...")
data = {"encodings": knownEncodings, "names": knownNames,
        "no_faces": no_faces, "error_in_image": error_in_image}
f_name = img_folder_path.replace("/", "-")
print(f_name)
with open(f"encodings_{f_name}.pickle", "wb") as f:
    f.write(pickle.dumps(data))
You should not use plain lists across multiple processes. You can use a multiprocessing.Queue or another process-safe model instead.
How to use multiprocessing queue in Python?
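For illustration, here is a minimal sketch of that idea (the worker is a hypothetical stand-in, not the question's create_encoding): each worker puts its result on a process-safe queue and the parent drains the queue after the pool finishes. A Manager queue is used because a plain multiprocessing.Queue cannot be pickled into pool workers via map.

from multiprocessing import Pool, Manager

def worker(args):
    # hypothetical stand-in for real work such as computing an encoding
    item, queue = args
    queue.put(item * item)

if __name__ == "__main__":
    with Manager() as manager:
        queue = manager.Queue()
        with Pool(4) as pool:
            # pool.map blocks until every worker has put its result
            pool.map(worker, [(i, queue) for i in range(10)])
        results = []
        while not queue.empty():
            results.append(queue.get())
        print(results)  # all ten squares; ordering is not guaranteed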
Presumably you are running under an OS such as Linux, which uses fork to create new processes; otherwise you would need to put the code that creates the new processes inside an if __name__ == '__main__': block. But you really should tag multiprocessing questions with the actual platform you are running on. Your problem (either way) is that your global lists (e.g. knownEncodings) are not sharable across processes. Each process (in the Linux case) inherits the main process's address space as it looked at the moment the process was created, but as soon as a process modifies a variable, a private copy of that storage is made ("copy-on-write" semantics). In effect, each process ends up with its own copy of the global variables.

There are, however, a number of ways to share data across processes. The simplest solution, i.e. the one requiring the fewest changes to your program, is to use "managed lists" created by a multiprocessing.managers.SyncManager instance, which you obtain by calling multiprocessing.Manager() (this also causes an extra process to be created). Assuming you are running on a platform that uses fork, such as Linux, the change you need is to replace the global definitions of the four lists at the top of your program with:
from multiprocessing import Manager

manager = Manager()
knownEncodings = manager.list()
knownNames = manager.list()
no_faces = manager.list()
error_in_image = manager.list()
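Two hedged follow-up notes beyond the answer itself. First, a managed list is a proxy object, so before serializing you would likely want to copy it into a plain list, e.g.:

# assumption: copy the ListProxy contents into plain lists before pickling,
# since pickling the proxies would not capture the stored data itself
data = {"encodings": list(knownEncodings), "names": list(knownNames),
        "no_faces": list(no_faces), "error_in_image": list(error_in_image)}

Second, an alternative that avoids shared state entirely is to have create_encoding return its result and let pool.map collect the return values in the parent process. A minimal sketch of that approach (not the answer's managed-list fix), reusing the question's imports and globals:

def create_encoding(imagePath):
    image = cv2.imread(imagePath)
    if image is None:
        return ("error", imagePath)        # unreadable image
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    boxes = face_recognition.face_locations(rgb)
    if not boxes:
        return ("no_face", imagePath)      # nothing detected
    encodings = face_recognition.face_encodings(rgb, boxes)
    name = imagePath.split(os.path.sep)[-1]
    return ("ok", name, encodings)

if __name__ == "__main__":
    with Pool(8) as pool:
        results = pool.map(create_encoding, image_paths)
    # aggregate in the parent process; no cross-process state needed
    for result in results:
        if result[0] == "ok":
            _, name, encodings = result
            for encoding in encodings:
                knownEncodings.append(encoding)
                knownNames.append(name)
        elif result[0] == "no_face":
            no_faces.append(result[1])
        else:
            error_in_image.append(result[1])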