How to tar a file with Commons Compress that is too large and causes out of memory crash?

In the code below, if I give (Apache) Commons Compress a file that is several gigabytes in size, it crashes because it exhausts all of my memory.

Can I make it read and then write a small portion of the file at a time? I have been looking into chunking, but I am not sure how to do it so that the file can be reassembled after the pieces are written out in .tar format.

What is the best way to support files of any size here?

try (FileOutputStream fileOutputStream = new FileOutputStream("output.tar");
     BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
     GzipCompressorOutputStream gzipOutputStream = new GzipCompressorOutputStream(bufferedOutputStream);
     TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipOutputStream)) {

    tarArchiveOutputStream.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
    tarArchiveOutputStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

    File currentFile = new File("Huge_MultiGB_File.txt");
    String relativeFilePath = currentFile.getPath();
    TarArchiveEntry tarEntry = new TarArchiveEntry(currentFile, relativeFilePath);
    tarEntry.setSize(currentFile.length());
    tarArchiveOutputStream.putArchiveEntry(tarEntry);
    tarArchiveOutputStream.write(IOUtils.toByteArray(new FileInputStream(currentFile)));
    tarArchiveOutputStream.closeArchiveEntry();
}

Instead of first reading the whole file into memory with IOUtils, you must read a small portion of the file at a time and write it to the output in a loop.

Roughly like this:

FileInputStream source = new FileInputStream(somefile); // the large input file
// tarArchiveOutputStream is already prepared for writing
// (the entry has been put with putArchiveEntry, as in the question)

byte[] buff = new byte[1024 * 10]; // 10 KB buffer
int numBytesRead; // number of bytes read per iteration

while ((numBytesRead = source.read(buff)) > 0) {
    // while the source has bytes, read from the source and write
    // the same number of bytes to the tar output stream
    tarArchiveOutputStream.write(buff, 0, numBytesRead);
}
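The copy loop can be exercised with plain JDK streams alone; here is a minimal self-contained sketch (file names are made up for illustration) showing that memory use stays at the buffer size no matter how large the input is. With Commons Compress you would substitute the TarArchiveOutputStream for the plain output stream:

```java
import java.io.*;
import java.nio.file.*;

public class ChunkedCopy {
    // Copy from in to out through a fixed-size buffer, so memory use
    // is bounded by the buffer size regardless of how large the file is.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buff = new byte[1024 * 10]; // 10 KB buffer
        long total = 0;
        int numBytesRead;
        while ((numBytesRead = in.read(buff)) > 0) {
            out.write(buff, 0, numBytesRead);
            total += numBytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Small stand-in for the multi-GB file from the question.
        Path src = Files.createTempFile("huge", ".txt");
        Files.write(src, "hello chunked world".getBytes());
        Path dst = Files.createTempFile("copy", ".txt");
        try (InputStream in = new FileInputStream(src.toFile());
             OutputStream out = new FileOutputStream(dst.toFile())) {
            long copied = copy(in, out);
            System.out.println("copied " + copied + " bytes");
        }
        System.out.println(new String(Files.readAllBytes(dst)));
    }
}
```

Commons Compress also ships a helper that performs the same loop for you, org.apache.commons.compress.utils.IOUtils.copy(InputStream, OutputStream), so you can use that between putArchiveEntry and closeArchiveEntry instead of hand-rolling the buffer.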