How to prevent multiprocessing from inheriting imports and globals?
I'm using multiprocessing in a larger code base where some of the import statements have side effects. How can I run a function in a background process without having it inherit the global imports?
# helper.py:
print('This message should only print once!')

# main.py:
import multiprocessing as mp
import helper  # This prints the message.

def worker():
    pass  # Unfortunately this also prints the message again.

if __name__ == '__main__':
    mp.set_start_method('spawn')
    process = mp.Process(target=worker)
    process.start()
    process.join()
Background: Importing TensorFlow initializes CUDA, which reserves some GPU memory. As a result, spawning too many processes leads to CUDA OOM errors, even when those processes don't use TensorFlow.
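For illustration, a minimal sketch (not part of the original question) of the pattern the background points at: keep the TensorFlow import out of module scope so a spawned worker never triggers CUDA initialization. The train() function and its use of tf below are hypothetical placeholders.

# main.py (sketch):
import multiprocessing as mp

def worker():
    pass  # stays free of the TensorFlow import, so no CUDA memory is reserved

def train():
    import tensorflow as tf  # only this code path pays the CUDA initialization cost
    print(tf.constant(1))

if __name__ == '__main__':
    mp.set_start_method('spawn')
    process = mp.Process(target=worker)
    process.start()
    process.join()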
A similar question without answers:
- How to avoid double imports with the Python multiprocessing module?
# helper.py:
print('This message should only print once!')

# main.py:
import multiprocessing as mp

def worker():
    pass

def main():
    # Importing the module only locally so that the background
    # worker won't import it again.
    import helper
    mp.set_start_method('spawn')
    process = mp.Process(target=worker)
    process.start()
    process.join()

if __name__ == '__main__':
    main()
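As a quick sanity check (a sketch, not from the linked question), the worker can report whether helper ended up in its sys.modules. With the import moved into main(), the spawned child should print False and the side-effect message should appear only once, in the parent.

# main.py (verification sketch):
import multiprocessing as mp
import sys

def worker():
    # Expected: False, because the child never executes main() and never imports helper.
    print('helper in child sys.modules:', 'helper' in sys.modules)

def main():
    import helper  # the side-effect print runs here, in the parent only
    mp.set_start_method('spawn')
    process = mp.Process(target=worker)
    process.start()
    process.join()

if __name__ == '__main__':
    main()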
Is there a resource that explains exactly what the multiprocessing module does when starting an mp.Process?
Super quick version (using the spawn context rather than fork):
Some things are prepared (a pair of pipes for communication, cleanup callbacks, etc.), then a new process is created with fork() followed by exec(). On Windows it's CreateProcessW(). The new Python interpreter is launched with the startup script spawn_main() and is passed the communication pipe file descriptors via a crafted command string and the -c switch. The startup script cleans up the environment a little bit, then unpickles the Process object from its communication pipe. Finally it calls the run method of the Process object.
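You can peek at that crafted command string via multiprocessing.spawn.get_command_line. It is an internal CPython helper, so the exact arguments and output vary between versions; the pipe_handle value below is just a dummy placeholder.

import multiprocessing.spawn as spawn

# Roughly what the parent runs to start a spawned child (CPython detail, may vary):
print(spawn.get_command_line(pipe_handle=0))
# e.g. ['/usr/bin/python3', '-c',
#       'from multiprocessing.spawn import spawn_main; spawn_main(pipe_handle=0)',
#       '--multiprocessing-fork']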
So what about imported modules?
Pickle semantics handle part of it, but __main__ and sys.modules need some TLC, which is handled here (during the "cleans up the environment" bit).
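To see that fixup in action, a small sketch (assuming CPython's spawn behavior): the parent's main script is re-imported in the child under the name __mp_main__, so module-level code runs again, but the if __name__ == '__main__': block does not.

# main.py (sketch):
import multiprocessing as mp
import sys

print('module imported, __name__ =', __name__)  # '__main__' in the parent, '__mp_main__' in the child

def worker():
    main_mod = sys.modules['__main__']
    print('child sees __main__ as', main_mod.__name__)  # expected: '__mp_main__'

if __name__ == '__main__':
    mp.set_start_method('spawn')
    process = mp.Process(target=worker)
    process.start()
    process.join()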