Speed of file reading / writing with BufferedInputStream / BufferedOutputStream

Question

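I have two questions.

1. What does the program actually do if I write bis.read() instead of bis.read(bys)? (It still works, although it is much slower.)
2. Why is bos.write(bys) much faster than bos.write(bys, 0, len)? (I expected both to run at the same speed.)

Thanks!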

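import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CopyFileBfdBytes {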

    public static void main(String[] args) throws IOException {

        FileInputStream fis = new FileInputStream("d:/Test1/M1.MP3");
        BufferedInputStream bis = new BufferedInputStream(fis);

        FileOutputStream fos = new FileOutputStream("d:/Test2/M2.mp3");
        BufferedOutputStream bos = new BufferedOutputStream(fos);

        byte[] bys = new byte[8192];
        int len;
        while ((len = bis.read(bys)) != -1){
//        while ((len = bis.read()) != -1){  // 1. Why does it still work when bys in bis.read() is missing?
            bos.write(bys);
//            bos.write(bys, 0, len);     // 2. Why is this slower than bos.write(bys)?
            bos.flush();
        }
        fis.close();
        bis.close();
        fos.close();
        bos.close();
    }
}

Answer 1

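First of all, it looks like you just want to copy a file as-is. There are simpler (and possibly even more performant) ways to do that.

Other ways of copying data

Copying a file

If all you need is to copy an actual file, as in your example, you can simply use: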

package example;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SO66024231 {

    public static void main(String[] args) throws IOException {
        Files.copy(Paths.get("d:/Test1/M1.MP3"), Paths.get("d:/Test2/M2.mp3"));
    }
}

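This is most likely the most descriptive option (other developers can see at a glance what you want to do), and the underlying system can carry it out very efficiently.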
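One detail worth knowing: called without options, Files.copy throws a FileAlreadyExistsException when the target already exists. The following is a minimal sketch of how the overwrite case might be handled with StandardCopyOption.REPLACE_EXISTING. The class name, the helper copyReplacing and the temporary files are made up for illustration and stand in for the real paths.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyReplaceExample {

    // Hypothetical helper: copies source to target and overwrites target if it exists.
    public static void copyReplacing(Path source, Path target) throws IOException {
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files are used here instead of d:/Test1/M1.MP3 (demo data only).
        Path source = Files.createTempFile("copy-src", ".bin");
        Path target = Files.createTempFile("copy-dst", ".bin"); // target already exists
        Files.write(source, new byte[]{1, 2, 3});

        copyReplacing(source, target);
        System.out.println(Files.size(target)); // prints: 3

        Files.delete(source);
        Files.delete(target);
    }
}
```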

Copying data from any source to any target (InputStream to OutputStream)

If you need to transfer data from any InputStream to any OutputStream, you can use the method InputStream#transferTo(OutputStream) (available since Java 9):

package example;

import java.io.*;

public class SO66024231 {

    public static void main(String[] args) throws IOException {
        try (InputStream fis = new FileInputStream("d:/Test1/M1.MP3")) {
            try (OutputStream fos = new FileOutputStream("d:/Test2/M2.mp3")) {
                fis.transferTo(fos);
            }
        }
    }
}
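Because transferTo works on any pair of streams, the same call can be tried without touching the file system. This is a small sketch using in-memory streams; the class name and the helper drain are hypothetical. transferTo returns the number of bytes it transferred, which the demo prints.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class TransferToExample {

    // Hypothetical helper: reads the whole stream into a byte array via transferTo.
    public static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out); // requires Java 9 or newer
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {-1, 0, 3, 4, 5, 6, 7, 8, 127};

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long transferred = new ByteArrayInputStream(data).transferTo(out);
        System.out.println(transferred); // prints: 9

        System.out.println(drain(new ByteArrayInputStream(data)).length); // prints: 9
    }
}
```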

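In-depth description of your question

Note: I will talk about InputStreams and OutputStreams in general. You used BufferedInputStream and BufferedOutputStream. Those are specific implementations that buffer data internally. That internal buffering has nothing to do with the buffering I am about to describe!

InputStream

There is a fundamental difference between InputStream#read() and InputStream#read(byte[]).

InputStream#read() reads a single byte from the InputStream and returns it. The return value is an int in the range 0-255, or -1 if the stream is exhausted (no more data).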
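The int return type is what lets -1 serve as an end-of-stream marker without clashing with a real data byte: the byte 0xFF comes back as 255, not as -1. Applied to the loop in the question, this means that with bis.read() the variable len would hold the value of the byte just read rather than a count of bytes. A small sketch (the class name and the helper firstRead are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadReturnValueExample {

    // Hypothetical helper: returns whatever the first call to read() yields.
    public static int firstRead(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        // The byte -1 (0xFF) is returned as the int 255, so it is not mistaken for end of stream.
        System.out.println(firstRead(new byte[]{-1})); // prints: 255

        // An empty stream is exhausted immediately.
        System.out.println(firstRead(new byte[]{})); // prints: -1

        // Casting back to byte recovers the original signed value.
        System.out.println((byte) 255); // prints: -1
    }
}
```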

package example;

import java.io.*;

public class SO66024231 {

    public static void main(String[] args) throws IOException {
        final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
        printAllBytes(new ByteArrayInputStream(myBytes));
    }

    public static void printAllBytes(InputStream in) throws IOException {
        int currByte;
        while ((currByte = in.read()) != -1) {
            System.out.println((byte) currByte);// note the cast to byte!
        }
        
        // prints: -1, 0, 3, 4, 5, 6, 7, 8, 127
    }
}

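InputStream#read(byte[]), however, is completely different. It takes a byte[] as a parameter, which is used as a buffer. It then (internally) tries to fill the given buffer with as many bytes as it can currently get, and returns the number of bytes it actually filled, or -1 if the stream is exhausted.

Example: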

package example;

import java.io.*;

public class SO66024231 {

    public static void main(String[] args) throws IOException {
        final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
        printAllBytes(new ByteArrayInputStream(myBytes));
    }

    public static void printAllBytes(InputStream in) throws IOException {
        final byte[] buffer = new byte[2];// do not use this small buffer size. This is just for the example
        int bytesRead;

        while ((bytesRead = in.read(buffer)) != -1) {
            // loop from 0 to bytesRead, !NOT! to buffer.length!!!
            for (int i = 0; i < bytesRead; i++) {
                System.out.println(buffer[i]);
            }
        }

        // prints: -1, 0, 3, 4, 5, 6, 7, 8, 127
    }
}

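Now a bad example. The following code contains a programming error. Do not use it!

We now loop from 0 to buffer.length, but our input data contains exactly 9 bytes. This means that in the last iteration only one byte of our buffer is filled. The second byte in our buffer is not touched.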

package example;

import java.io.*;

public class SO66024231 {

    /**
     * ERRONEOUS EXAMPLE!!! DO NOT USE
     */
    public static void main(String[] args) throws IOException {
        final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
        printAllBytes(new ByteArrayInputStream(myBytes));
    }

    public static void printAllBytes(InputStream in) throws IOException {
        final byte[] buffer = new byte[2];// do not use this small buffer size. This is just for the example
        int bytesRead;

        while ((bytesRead = in.read(buffer)) != -1) {
            for (int i = 0; i < buffer.length; i++) {
                System.out.println(buffer[i]);
            }
        }

        // prints: -1, 0, 3, 4, 5, 6, 7, 8, 127, 8 <-- see; the 8 is printed because we ignored the bytesRead value in our for loop; the 8 is still in our buffer from the previous iteration
    }
}

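OutputStream

Now that I have described the differences in reading, I will describe the differences in writing.

First, the correct example (using OutputStream.write(byte[], int, int)):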

package example;

import java.io.*;
import java.util.Arrays;

public class SO66024231 {

    public static void main(String[] args) throws IOException {
        final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
        final byte[] copied = copyAllBytes(new ByteArrayInputStream(myBytes));

        System.out.println(Arrays.toString(copied));// prints: [-1, 0, 3, 4, 5, 6, 7, 8, 127]
    }

    public static byte[] copyAllBytes(InputStream in) throws IOException {
        final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        final byte[] buffer = new byte[2];
        int bytesRead;

        while ((bytesRead = in.read(buffer)) != -1) {
            bos.write(buffer, 0, bytesRead);
        }

        return bos.toByteArray();
    }
}

The bad example:

package example;

import java.io.*;
import java.util.Arrays;

public class SO66024231 {

    /*
    ERRONEOUS EXAMPLE!!!! DO NOT USE
     */
    public static void main(String[] args) throws IOException {
        final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
        final byte[] copied = copyAllBytes(new ByteArrayInputStream(myBytes));

        System.out.println(Arrays.toString(copied));// prints: [-1, 0, 3, 4, 5, 6, 7, 8, 127, 8] <-- see; the 8 is here again
    }

    public static byte[] copyAllBytes(InputStream in) throws IOException {
        final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        final byte[] buffer = new byte[2];
        int bytesRead;

        while ((bytesRead = in.read(buffer)) != -1) {
            bos.write(buffer);
        }

        return bos.toByteArray();
    }
}

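This happens because, just as in our InputStream examples, ignoring bytesRead means we write a value to the OutputStream that we do not want: the byte 8 left over from the previous iteration. Internally, OutputStream#write(byte[]) is (in most implementations) just a shortcut for OutputStream.write(buffer, 0, buffer.length). This means it writes the whole buffer to the OutputStream.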
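Putting the pieces together, the copy loop from the question could be written roughly as follows. This is only a sketch under a few assumptions: the class name and the helper copy are hypothetical, in-memory streams stand in for the files, and the flush() inside the loop is dropped on the assumption that the caller closes the stream afterwards, since closing a BufferedOutputStream flushes it anyway.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyStreamExample {

    // Hypothetical helper: copies everything from in to out and returns the byte count.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) {
            // Write only the bytes that were read in this iteration.
            out.write(buffer, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000]; // demo input, deliberately not a multiple of 8192
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied + " " + sink.size()); // prints: 10000 10000
    }
}
```

With out.write(buffer) in place of out.write(buffer, 0, len), the same input would produce 16384 bytes of output (two full 8192-byte buffers), so the copy would no longer match the original.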
