Split binary file

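The split method takes two parameters: the name of the file to split and the size of each piece. Could you check whether I am on the right track, and suggest some pseudocode for what goes inside the for loop?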

import java.io.*;

public class SplitFile {

    public static void main(String[] args) throws IOException {
        Split("testfile.pdf", 256);

    }

    public static void Split(String filename, int splitSize) throws IOException {

        int numberOfFiles = 0;

        File file = new File(filename);

        numberOfFiles = ((int) file.length() / splitSize) + 1;

        for (; numberOfFiles >= 0; numberOfFiles--) {

            DataInputStream in = new DataInputStream(new BufferedInputStream(
                    new FileInputStream(filename)));

            DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(file))); //What do I put here?

        }


    }

}

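Changes needed

  • Create a separate File object for each output part, as in the code below
  • Initialise the DataInputStream outside the loop, not inside it

Code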

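File original = new File(filename);
// Round up in long arithmetic: a smaller final chunk still gets its own file,
// and an exact multiple of splitSize does not produce an empty extra file.
int numberOfFiles = (int) ((original.length() + splitSize - 1) / splitSize);

// Open the input once, outside the loop, so each part continues where the previous one stopped.
try (DataInputStream in = new DataInputStream(
        new BufferedInputStream(new FileInputStream(filename)))) {

    // Just count through the parts.
    for (int i = 0; i < numberOfFiles; i++) {
        // Part of the file being written, e.g. testfile.pdf-0, testfile.pdf-1
        File output = new File(String.format("%s-%d", filename, i));

        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(output)))) {
            // copy up to splitSize bytes from in to out here
        }
    }
}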
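
The part count deserves a closer look. An expression of the form `length / splitSize + 1` gives one part too many whenever the length is an exact multiple of `splitSize`, and casting `length()` to `int` before dividing overflows for files larger than 2 GB. A minimal sketch of a rounding-up alternative follows; `PartCount` and `partsNeeded` are names made up for this illustration and are not part of the original code.

```java
final class PartCount {

    /** Number of parts needed to hold length bytes in pieces of at most splitSize bytes. */
    static long partsNeeded(long length, int splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        // Ceiling division, done in long arithmetic so large files do not overflow.
        return (length + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        System.out.println(partsNeeded(1000, 256)); // 4: three full parts plus one of 232 bytes
        System.out.println(partsNeeded(1024, 256)); // 4: exact multiple, no empty extra part
        System.out.println(partsNeeded(0, 256));    // 0: an empty file needs no parts
    }
}
```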

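Doing the actual writing...

  • Read bytes from the input stream with read() calls
  • Write bytes to the output stream with write() calls

There are two approaches: copy one byte at a time, which is the simplest but less efficient, or use a buffer, which is harder to code but more efficient.

Buffered approach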

long length = original.length();

DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));

long pos = 0; // position in the original file
byte[] buffer = new byte[splitSize];
for (...) {
    ...

    // The file may not be exactly divisible by splitSize,
    // so the last chunk might be smaller.
    int chunk = (int) Math.min(splitSize, length - pos);
    // The offset argument is a position in the buffer, not in the file, so it is 0.
    // readFully blocks until chunk bytes have arrived; a plain read() may return fewer.
    in.readFully(buffer, 0, chunk);
    out.write(buffer, 0, chunk);

    pos += chunk;
}

One byte at a time

for (...) {
    ...
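    // A separate counter name avoids clashing with an outer loop variable named i.
    // The pos < length check stops at end of file, before read() can return -1.
    for (int n = 0; n < splitSize && pos < length; n++) {
        out.write(in.read());
        pos++;
    }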
}
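
Putting the steps above together, the buffered approach might look roughly like the following. This is a sketch rather than a definitive implementation: the class name `StreamSplitter` and the choice to return the number of parts are assumptions made for the example, and the output names follow the `filename-0`, `filename-1` pattern used above.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class StreamSplitter {

    /** Splits filename into filename-0, filename-1, ... and returns the number of parts written. */
    static int split(String filename, int splitSize) throws IOException {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        File original = new File(filename);
        long length = original.length();
        byte[] buffer = new byte[splitSize];
        long pos = 0;  // bytes of the original consumed so far
        int part = 0;  // index of the next part to write

        // The input is opened once, outside the loop, and read sequentially.
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(original)))) {
            while (pos < length) {
                // The last chunk is smaller when length is not a multiple of splitSize.
                int chunk = (int) Math.min(splitSize, length - pos);
                in.readFully(buffer, 0, chunk);

                File output = new File(String.format("%s-%d", filename, part));
                try (OutputStream out = new BufferedOutputStream(new FileOutputStream(output))) {
                    out.write(buffer, 0, chunk);
                }
                pos += chunk;
                part++;
            }
        }
        return part;
    }
}
```

Calling `StreamSplitter.split("testfile.pdf", 256)` would write `testfile.pdf-0`, `testfile.pdf-1` and so on next to the original. The whole chunk is held in memory, so this suits modest values of `splitSize`; for very large parts a smaller fixed buffer with an inner copy loop would be the safer choice.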

You can also do this with the Java NIO API, as follows.

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public final class SplitFile {

    public static void main(String[] args) throws IOException {
        split("testfile.pdf", 256);
    }

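    private static void split(String filename, int splitSize) throws IOException {
        Path inputPath = Paths.get(filename);

        // Split "name.ext" into base name and extension (assumes the name contains a dot).
        String name = inputPath.getFileName().toString();
        int i = name.lastIndexOf('.');
        String basename = name.substring(0, i);
        String ext = name.substring(i + 1);

        long size = Files.size(inputPath);

        // Round up so that an exact multiple of splitSize does not yield an empty extra part.
        int numberOfFiles = (int) ((size + splitSize - 1) / splitSize);

        try (FileChannel inputChannel = FileChannel.open(inputPath, StandardOpenOption.READ)) {
            for (int j = 0; j < numberOfFiles; j++) {
                String outputFilename = String.format("%s-%04d.%s", basename, j + 1, ext);

                // resolveSibling also works for a bare file name, where getParent() returns null.
                Path outputPath = inputPath.resolveSibling(outputFilename);

                try (FileChannel outputChannel = FileChannel.open(outputPath,
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                        StandardOpenOption.TRUNCATE_EXISTING)) {
                    // Compute the offset in long arithmetic. transferTo may copy fewer
                    // bytes than requested, so loop until the whole part is written.
                    long position = (long) j * splitSize;
                    long end = Math.min(position + splitSize, size);
                    while (position < end) {
                        position += inputChannel.transferTo(position, end - position, outputChannel);
                    }
                }
            }
        }
    }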
}
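
Whichever version is used, one way to check the result is to read the parts back in order and compare them with the original byte for byte. The sketch below assumes the caller already has the part paths in the right order; `SplitCheck` and `partsMatchOriginal` are hypothetical names introduced for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class SplitCheck {

    /** Returns true if the parts, concatenated in list order, equal the original byte for byte. */
    static boolean partsMatchOriginal(Path original, List<Path> parts) throws IOException {
        try (InputStream expected = new BufferedInputStream(Files.newInputStream(original))) {
            for (Path part : parts) {
                try (InputStream actual = new BufferedInputStream(Files.newInputStream(part))) {
                    int b;
                    while ((b = actual.read()) != -1) {
                        // Also fails if the original runs out first, since read() then returns -1.
                        if (b != expected.read()) {
                            return false;
                        }
                    }
                }
            }
            // The original must have no bytes left once every part has been consumed.
            return expected.read() == -1;
        }
    }
}
```

Comparing one byte at a time through buffered streams keeps the example short; it is meant as a sanity check, not as a fast comparison for very large files.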