Zip archives created inside mpi4py cannot be opened

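I am running some Python code that creates a large number of .csv files. I do this on a supercomputer and use mpi4py to manage the parallel processing. Several processes run on each node, each process does some work, and each process creates one .csv file. I don't want to put all the .csv files on the main hard drive because there are too many. So I write them to local SSDs, with one SSD attached to each node. Then I want to zip all the .csv files on each SSD into one zip archive per node and put the archives on the main hard drive. This is how I do it: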

from mpi4py import MPI
import os
import platform
import zipfile

# MPI setup
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

def get_localrank(comm):
    comm_localrank = MPI.Comm.Split_type(comm, MPI.COMM_TYPE_SHARED, 0)
    return comm_localrank.Get_rank()

def ls_csv(path, ext=".csv"):
    all_files = os.listdir(path)
    return [ path+"/"+f for f in all_files if f.endswith(ext) ]

def ssd_to_drive(comm, drive_path):
    # only execute on one MPI process per node
    if get_localrank(comm) == 0:

        # every node's file needs a different name
        archive_name = drive_path + "/data_" + platform.node() + ".zip"

        csvs = ls_csv("ssd_path")

        zf = zipfile.ZipFile(archive_name, 'w', zipfile.ZIP_DEFLATED)

        for csvf in csvs:
            zf.write(csvf)

        zf.close()

# copy archived csvs from the SSDs back to the project drive
ssd_to_drive(comm=comm, drive_path="mypath")

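The zip files end up in the directory I want, but I can't open them. Either they are corrupted, or unzip thinks they are parts of a multi-part archive; if so, I don't know how to reconstruct them.

When I run less data_nodename.zip, I get the following (the alphanumeric string before each error message is the "nodename"):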

e33n08
End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.

f09n09
End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.

f09n10
End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.

e33n06
Zipfile is disk 37605 of a multi-disk archive, and this is not the disk on
     which the central zipfile directory begins (disk 25182).

f09n11
Zipfile is disk 40457 of a multi-disk archive, and this is not the disk on
     which the central zipfile directory begins (disk 740).

a06n10
end-of-central-directory record claims this
  is disk 11604 but that the central directory starts on disk 48929; this is a
  contradiction.  Attempting to process anyway.

f10n14
end-of-central-directory record claims this
  is disk 15085 but that the central directory starts on disk 52010; this is a
  contradiction.

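Notably, only 3 of the zip files seem to claim to be the beginning of a multi-disk archive (e33n08, f09n09, f09n10), but at least 4 "start of the central directory" disks are referenced (25182, 740, 48929, 52010).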

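So now I can't figure out whether these are corrupted, or whether zipfile really thinks it is creating a multi-disk archive, why that would be, or, if they really are multi-disk, how to reconstruct the archive.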
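One way to narrow this down from Python: the zipfile module does not handle multi-disk archives, and it writes the central directory and end-of-central-directory record only when the archive is closed. An archive that fails the checks below was therefore most likely cut off or damaged rather than split. This is a minimal sketch; `check_archive` and its return strings are hypothetical names, not part of the script above.

```python
import zipfile

def check_archive(path):
    """Classify an archive as 'ok', 'truncated-or-not-zip', or 'bad-member:<name>'."""
    # is_zipfile looks for the end-of-central-directory record near the end of the file
    if not zipfile.is_zipfile(path):
        return "truncated-or-not-zip"
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip re-reads every member and returns the first one with a bad CRC
            bad = zf.testzip()
    except zipfile.BadZipFile:
        return "truncated-or-not-zip"
    return "ok" if bad is None else "bad-member:" + bad
```

Running this over every data_<nodename>.zip shows which nodes produced incomplete archives.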

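Finally, when I run this procedure with multiple MPI tasks but only one node, the single zip archive that is created is fine and readable with less and unzip.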

I can't believe I missed this, but the files were indeed corrupted. The job had exited before the zip archives finished being written.
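To make an interrupted write easier to spot, one option is to write each archive under a temporary name and rename it only after it has been closed. This is a sketch under assumptions, not the code I originally ran: `write_archive` and the ".part" suffix are hypothetical, and it does not stop the job from being killed. It only guarantees that a file with the final name is a finished archive.

```python
import os
import zipfile

def write_archive(csv_paths, archive_name):
    """Zip csv_paths into archive_name; the final name appears only once writing is done."""
    tmp_name = archive_name + ".part"  # hypothetical suffix for in-progress archives
    with zipfile.ZipFile(tmp_name, "w", zipfile.ZIP_DEFLATED) as zf:
        for csvf in csv_paths:
            # store just the file name instead of the full SSD path
            zf.write(csvf, arcname=os.path.basename(csvf))
    # the with-block has closed the archive, so the central directory is written
    os.replace(tmp_name, archive_name)
    return archive_name
```

In the MPI script this would be called from the one process per node that has local rank 0, in place of the explicit ZipFile / write / close sequence. Any leftover ".part" file then marks a node whose archive was still being written when the job ended.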