How to extract a zip file in memory

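I have read many threads about extracting a zip file to disk, but my use case requires extracting the zip in memory. The zip file itself contains a list of further zip files. I am asking this after going through several Stack Overflow posts. Could you share any post or link with information on how to unzip a file in memory?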

The Java class java.util.zip.ZipInputStream lets you read the data of a zip archive into a byte array.
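As a minimal sketch of that idea, the class below reads every entry of an archive held in a byte array into a map from entry name to content, without touching the disk. The class name InMemoryUnzip and the helpers unzip and zipOf are hypothetical names chosen for this illustration; zipOf exists only so the demo can build a small archive in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class InMemoryUnzip {

    // Hypothetical helper: reads every non-directory entry of a zip archive
    // held in a byte array and maps entry name -> uncompressed content.
    public static Map<String, byte[]> unzip(byte[] zipBytes) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zipInput = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zipInput.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no data
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int length;
                // read() returns -1 at the end of the current entry
                while ((length = zipInput.read(buffer)) != -1) {
                    out.write(buffer, 0, length);
                }
                files.put(entry.getName(), out.toByteArray());
            }
        }
        return files;
    }

    // Hypothetical helper: builds a one-entry archive in memory for the demo.
    public static byte[] zipOf(String name, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOutput = new ZipOutputStream(bytes)) {
            zipOutput.putNextEntry(new ZipEntry(name));
            zipOutput.write(text.getBytes(StandardCharsets.UTF_8));
            zipOutput.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> files = unzip(zipOf("hello.txt", "hello"));
        System.out.println(new String(files.get("hello.txt"), StandardCharsets.UTF_8));
    }
}
```

Note that this holds every uncompressed entry in memory at once, so it is only suitable for archives whose contents comfortably fit in the heap.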

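If you want to read nested .zip files, you can use ZipInputStream (as already mentioned) and check whether each ZipEntry is itself a .zip file; if it is, it can be read recursively as the next .zip file. Something like: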

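import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Reads every entry fully into memory. Nested .zip entries are unpacked
// recursively; all other entries are handed to consumerFunction.
private static void readZipInputStream(
        InputStream inputStream, BiConsumer<ZipEntry, ByteArrayOutputStream> consumerFunction) throws IOException {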

    try (ZipInputStream zipInput = new ZipInputStream(inputStream)) {
        ZipEntry entry;
        while ((entry = zipInput.getNextEntry()) != null) {
            ByteArrayOutputStream outStream = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int length;
            while ((length = zipInput.read(buffer)) != -1) {
                outStream.write(buffer, 0, length);
            }

            if (entry.getName().endsWith(".zip")) {
                // need to go deeper...
                ByteArrayInputStream inStream = new ByteArrayInputStream(outStream.toByteArray());
                readZipInputStream(inStream, consumerFunction);
            } else {
                // do something...
                consumerFunction.accept(entry, outStream);
            }
        }
    }
}

For example, given a zip file with the following structure:

file.zip
├─1+2.zip
│ ├─1.zip
│ │ └─1.txt
│ └─2.zip
│   └─2.txt
└─3.zip
  └─3.txt

and calling the readZipInputStream function like this:

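import java.io.FileInputStream;
import java.io.IOException;

public class Application {

    // The readZipInputStream(...) method shown above belongs in this class.

    public static void main(String[] args) throws IOException {
        String path = "file.zip";
        try (FileInputStream inputStream = new FileInputStream(path)) {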
            readZipInputStream(
                    inputStream,
                    (entry, outputStream) -> {
                        System.out.println(entry.getName());
                        System.out.println("--------------------------------");
                        System.out.println(outputStream.toString());
                        System.out.println("--------------------------------");
                    }
            );
        }
    }
}

the contents of the three .txt files are printed:

1.txt
--------------------------------
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat.
--------------------------------
2.txt
--------------------------------
- Integer vel sem consectetur, ullamcorper leo quis, consequat mauris.
- Nulla efficitur sapien at velit fermentum condimentum.
- Vestibulum elementum nulla ut ipsum tempus, ut molestie sem sollicitudin.
--------------------------------
3.txt
--------------------------------
Morbi tincidunt ornare mi. Sed id risus tortor. Interdum et malesuada 
fames ac ante ipsum primis in faucibus. Pellentesque tincidunt, 
nulla a interdum porta, orci elit ultricies leo, in maximus orci 
tortor pulvinar est. Curabitur eget fermentum risus. Vestibulum euismod 
convallis eros, nec blandit neque blandit at.
--------------------------------
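
The output above shows only the entry name, so entries with the same name in different nested archives cannot be told apart. As a rough sketch of one way around that, the variant below carries a path prefix through the recursion and collects the results in a map keyed by the full nested path (for example 1+2.zip/1.zip/1.txt). The class NestedZipPaths and its methods unzipAll, collect, readEntry and zipOf are hypothetical names for this illustration, not part of the answer's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class NestedZipPaths {

    // Returns nested path -> content for every non-zip file in the archive tree.
    public static Map<String, byte[]> unzipAll(InputStream in, String prefix) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        collect(in, prefix, files);
        return files;
    }

    private static void collect(InputStream in, String prefix, Map<String, byte[]> files)
            throws IOException {
        try (ZipInputStream zipInput = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zipInput.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                String path = prefix + entry.getName();
                byte[] content = readEntry(zipInput);
                // Assumption: nested archives are recognised by extension only.
                if (entry.getName().toLowerCase(Locale.ROOT).endsWith(".zip")) {
                    collect(new ByteArrayInputStream(content), path + "/", files);
                } else {
                    files.put(path, content);
                }
            }
        }
    }

    // Copies the remainder of the current entry into a byte array.
    private static byte[] readEntry(ZipInputStream zipInput) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int length;
        while ((length = zipInput.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        return out.toByteArray();
    }

    // Demo helper: builds a one-entry archive in memory.
    public static byte[] zipOf(String name, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOutput = new ZipOutputStream(bytes)) {
            zipOutput.putNextEntry(new ZipEntry(name));
            zipOutput.write(content);
            zipOutput.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] inner = zipOf("1.txt", "one".getBytes(StandardCharsets.UTF_8));
        byte[] outer = zipOf("1.zip", inner);
        for (String path : unzipAll(new ByteArrayInputStream(outer), "").keySet()) {
            System.out.println(path);
        }
    }
}
```

Returning a map instead of invoking a callback is a design choice made here for easy inspection; the callback style of readZipInputStream could carry the same prefix argument.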