Python multiprocessing writing to a shared file

When I write to an open file that I share by passing it to worker functions run under multiprocessing, the file contents are not written correctly. Instead, '^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^' (NUL bytes) is written to the file.

Why would this happen? Can you not have many multiprocessing units writing to the same file? Do you need to use a Lock? A Queue? Am I not using multiprocessing correctly or effectively?

I figured some example code might help, but please treat it only as a reference for how I open a file and, via multiprocessing, pass the open file to another function that writes to it.

The multiprocessing file:

import multiprocessing as mp

class PrepWorker():
    def worker(self, open_file):
        for i in range(1, 1000000):
            data = GetDataAboutI() # This function would be in a separate file
            open_file.write(data)
            open_file.flush()
        return

if __name__ == '__main__':
    open_file = open('/data/test.csv', 'w+')
    jobs = []
    for i in range(4):
        p = mp.Process(target=PrepWorker().worker, args=(open_file,))
        jobs.append(p)
        p.start()

    for j in jobs:
        j.join()
        print('{0}.exitcode = {1}'.format(j.name, j.exitcode))
    open_file.close()

Why would this happen?

There are several processes which may try to call

open_file.write(data)
open_file.flush()

at the same time. In your opinion, which behavior would be appropriate if

  • a.write
  • b.write
  • a.flush
  • c.write
  • b.flush

happens?
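One way to rule out that interleaving is to give each worker its own file handle opened in append mode and guard the write/flush pair with a shared multiprocessing.Lock. This is a minimal sketch of that pattern, not code from the question or the linked answers; get_data_about(i) is a hypothetical stand-in for the question's GetDataAboutI():

import multiprocessing as mp

def get_data_about(i):
    # Hypothetical stand-in for the question's GetDataAboutI().
    return '{0}\n'.format(i)

def worker(lock, path, start, stop):
    # Each process opens its own handle in append mode instead of
    # sharing one buffered file object across processes.
    with open(path, 'a') as f:
        for i in range(start, stop):
            data = get_data_about(i)
            with lock:
                # Holding the lock makes write+flush one atomic step,
                # so no other process can interleave its own write.
                f.write(data)
                f.flush()

if __name__ == '__main__':
    lock = mp.Lock()
    jobs = []
    for n in range(4):
        p = mp.Process(target=worker,
                       args=(lock, '/data/test.csv', n * 10, (n + 1) * 10))
        jobs.append(p)
        p.start()
    for j in jobs:
        j.join()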

Can you not have many multiprocessing units writing to the same file? Do you need to use a Lock? A Queue?

Python multiprocessing safely writing to a file recommends having one queue, which is read by a single process that writes to the file. So do Writing to a file with multiprocessing and Processing single file from multiple processes in python.
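Following that advice, the sketch below has a single writer process drain a multiprocessing.Queue while the workers only put results on it; again, get_data_about(i) is a hypothetical stand-in for GetDataAboutI():

import multiprocessing as mp

def get_data_about(i):
    # Hypothetical stand-in for the question's GetDataAboutI().
    return '{0}\n'.format(i)

def worker(queue, start, stop):
    # Workers never touch the file; they only enqueue results.
    for i in range(start, stop):
        queue.put(get_data_about(i))

def writer(queue, path):
    # Exactly one process owns the file, so writes cannot interleave.
    with open(path, 'w') as f:
        for data in iter(queue.get, None):  # None is the stop sentinel
            f.write(data)
            f.flush()

if __name__ == '__main__':
    queue = mp.Queue()
    w = mp.Process(target=writer, args=(queue, '/data/test.csv'))
    w.start()
    jobs = [mp.Process(target=worker, args=(queue, n * 10, (n + 1) * 10))
            for n in range(4)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()
    queue.put(None)  # all workers are done; tell the writer to stop
    w.join()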