Downloaded files are corrupted when buffer length is > 1

Question

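I am trying to write a function that downloads a file from a specific URL. The function produces a corrupted file unless I make the buffer an array of size 1 (as in the code below).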

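The ternary expression that I intend to use (commented out just above the buffer initialization), as well as any hard-coded integer value other than 1, produces a corrupted file.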

Note: MAX_BUFFER_SIZE is a constant, defined as 8192 (2^13) in my code.

public static void downloadFile(String webPath, String localDir, String fileName) {
    try {
        File localFile;
        FileOutputStream writableLocalFile;
        InputStream stream;

        url = new URL(webPath);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();

        int size = connection.getContentLength(); //File size in bytes
        int read = 0; //Bytes read

        localFile = new File(localDir);

        //Ensure that directory exists, otherwise create it.
        if (!localFile.exists())
            localFile.mkdirs();

        //Ensure that file exists, otherwise create it.
        //Note that if we define the file path as we do below initially and call mkdirs() it will create a folder with the file name (I.e. test.exe). There may be a better alternative, revisit later.
        localFile = new File(localDir + fileName);
        if (!localFile.exists())
            localFile.createNewFile();

        writableLocalFile = new FileOutputStream(localFile);
        stream = connection.getInputStream();

        byte[] buffer;
        int remaining;
        while (read != size) {
            remaining = size - read; //Bytes still to be read
            //remaining > MAX_BUFFER_SIZE ? MAX_BUFFER_SIZE : remaining
            buffer = new byte[1]; //Adjust buffer size according to remaining data (to be read).

            read += stream.read(buffer); //Read buffer-size amount of bytes from the stream.
            writableLocalFile.write(buffer, 0, buffer.length); //Args: Bytes to read, offset, number of bytes
        }

        System.out.println("Read " + read + " bytes.");

        writableLocalFile.close();
        stream.close();
    } catch (Throwable t) {
        t.printStackTrace();
    }
}

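I wrote it this way so that I can show users a real-time progress bar while they download. I have removed that part from the code to reduce clutter.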

Answer 1

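int len = stream.read(buffer); // number of bytes actually read, may be less than buffer.length
read += len;
writableLocalFile.write(buffer, 0, len); // write only the bytes that were read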

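You cannot use buffer.length as the number of bytes read; you need to use the return value of the read call. A read may return short (fewer bytes than the buffer holds), and in that case the rest of the buffer contains garbage (zero bytes or data from a previous read).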

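Also, instead of computing the remaining bytes and allocating a buffer dynamically, just use a fixed buffer of 16k or something similar. The last read will be short, which is fine.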
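
To illustrate the point about short reads, here is a minimal sketch of a copy loop that uses a fixed 16k buffer and writes only the bytes that read() reports. The class name ShortReadCopy, the constant BUFFER_SIZE, and the helper shortReads() are hypothetical names made up for this example; shortReads() only simulates a network stream that returns less data than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

class ShortReadCopy {
    // Assumed buffer size ("16k or something similar"); the exact value is not critical.
    static final int BUFFER_SIZE = 16 * 1024;

    /** Copies all bytes from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE]; // allocated once and reused
        long total = 0;
        int len;
        // read() returns the number of bytes actually placed in the buffer, or -1 at EOF.
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len); // write only the bytes that were read
            total += len;
        }
        return total;
    }

    /** Hypothetical test helper: a stream that hands out at most chunk bytes per read. */
    static InputStream shortReads(byte[] data, int chunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[50_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(shortReads(data, 1000), out);
        System.out.println(copied + " bytes copied, identical: "
                + Arrays.equals(data, out.toByteArray()));
    }
}
```

Writing buffer.length bytes on every iteration instead of len would append stale buffer contents after each short read, which matches the corruption described in the question.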

Answer 2

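InputStream.read() may read fewer bytes than you requested, but you always append the entire buffer to the file. You need to capture the number of bytes actually read and append only those bytes to the file.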

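Additionally:

Watch for InputStream.read() returning -1 (EOF).
The server may report an incorrect size, so the check read != size is dangerous. I would advise against relying on the Content-Length HTTP field at all. Instead, just keep reading from the input stream until you hit EOF.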
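
The two points above could be combined roughly as follows. This is only a sketch under assumptions: the class EofDownload, the Progress callback, and the method names are hypothetical and not from the original code. The loop ends on EOF (-1), and Content-Length is passed along purely as a hint for a progress bar, so a missing or wrong value cannot corrupt the file or cause an endless loop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

class EofDownload {
    /** Hypothetical progress callback; expected is -1 when the size is unknown. */
    interface Progress {
        void update(long bytesSoFar, long expected);
    }

    /** Copies until EOF; expected is only forwarded to the callback, never used to stop. */
    static long copyUntilEof(InputStream in, OutputStream out, long expected, Progress progress)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) { // -1 signals end of stream
            out.write(buffer, 0, len);          // append only the bytes actually read
            total += len;
            progress.update(total, expected);
        }
        return total;
    }

    /** Downloads webPath to target, creating parent directories if needed. */
    static long download(String webPath, Path target, Progress progress) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(webPath).toURL().openConnection();
        long expected = connection.getContentLengthLong(); // -1 if the server sent no length
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // try-with-resources closes both streams even if the copy throws.
        try (InputStream in = connection.getInputStream();
             OutputStream out = Files.newOutputStream(target)) {
            return copyUntilEof(in, out, expected, progress);
        } finally {
            connection.disconnect();
        }
    }
}
```

A progress bar built on the callback would need to treat expected as approximate, for example by clamping the displayed percentage to 100 and showing an indeterminate bar when expected is -1.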

java

inputstream

outputstream