Python multiprocessing, can't pickle thread.lock (pymongo)
I have a class with the following method:
def get_add_new_links(self, max_num_links):
    self.get_links_m2(max_num_links)
    processes = mp.cpu_count()
    pool = mp.Pool(processes=processes)
    func = partial(worker, self)
    with open(os.path.join(self.report_path, "links.txt"), "r") as f:
        reports = pool.map(func, f.readlines())
    pool.close()
    pool.join()
where get_links_m2 is another method that creates the file "links.txt". The worker is:
def worker(obje, link):
    doc, rep = obje.get_info_m2(link)
    obje.add_new_active(doc, sure_not_exists=True)
    return rep
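The get_info_m2 method visits the link and extracts some information. The add_new_active method adds that information to MongoDB.
What could be wrong with my code? When I run it, I get this error (with traceback):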
  File "controller.py", line 234, in get_add_new_links
    reports = pool.map(func, f.readlines())
  File "/home/vladimir/anaconda3/lib/python3.5/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/vladimir/anaconda3/lib/python3.5/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/home/vladimir/anaconda3/lib/python3.5/multiprocessing/pool.py", line 385, in _handle_tasks
    put(task)
  File "/home/vladimir/anaconda3/lib/python3.5/multiprocessing/connection.py", line 206, in send
    self._send_bytes(ForkingPickler.dumps(obj))
  File "/home/vladimir/anaconda3/lib/python3.5/multiprocessing/reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.lock objects
As the PyMongo docs state:
Never do this:
client = pymongo.MongoClient()

# Each child process attempts to copy a global MongoClient
# created in the parent process. Never do this.
def func():
    db = client.mydb
    # Do something with db.

proc = multiprocessing.Process(target=func)
proc.start()
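Instead, the client must be initialized inside the worker function. In your code, partial(worker, self) makes the pool pickle self for every task it sends to a child process. If self holds a MongoClient (as add_new_active suggests), the locks inside the client cannot be pickled, which is what raises this TypeError.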