Why do two calls to commons-codec md5Hex give different results?

Question

When I use Apache commons-codec md5Hex to get the MD5 of an input stream, calling it twice gives two different results. Sample code:

public static void main(String[] args) {
    String data = "D:\\test.jpg"; // backslash must be escaped; "\t" would be a tab character
    File file = new File(data);
    InputStream is = null;
    try {
        is = new FileInputStream(file);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
    String digest = null, digest2 = null;
    try {
        System.out.println(is.hashCode());
        digest = DigestUtils.md5Hex(is);
        System.out.println(is.hashCode());

        digest2 = DigestUtils.md5Hex(is);
        System.out.println(is.hashCode());

    } catch (IOException e) {
        e.printStackTrace();
    }
    System.out.println("Digest = " + digest);
    System.out.println("Digest2 = " + digest2);
}

The output is:

1888654590
1888654590
1888654590
Digest = 5cc6c20f0b3aa9b44fe952da20cc928e
Digest2 = d41d8cd98f00b204e9800998ecf8427e

Thanks for your answers!

Answer 1

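An InputStream can only be traversed once. The first call reads it to the end and returns the MD5 of the input file. When you call md5Hex a second time, the InputStream is already positioned at the end of the file, so digest2 is the MD5 of empty input.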
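To make the effect easy to reproduce without a file on disk, here is a minimal sketch. It uses the JDK's MessageDigest instead of commons-codec so that it runs without extra dependencies; the class name StreamConsumedDemo and the local md5Hex helper are made up for this illustration and only mimic what DigestUtils.md5Hex(InputStream) is assumed to do (read until end of stream, then hash).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamConsumedDemo {

    // Hypothetical helper: drains the stream and returns its MD5 as lowercase hex.
    static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int n;
        // read() returns -1 once the end of the stream is reached
        while ((n = in.read(buffer)) != -1) {
            md.update(buffer, 0, n);
        }
        // BigInteger(1, ...) treats the digest as unsigned; %032x keeps leading zeros
        return String.format("%032x", new BigInteger(1, md.digest()));
    }

    public static void main(String[] args) throws Exception {
        InputStream is = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));

        String first = md5Hex(is);   // hashes the 5 bytes of "hello"
        String second = md5Hex(is);  // stream is already exhausted: hashes 0 bytes

        System.out.println("first  = " + first);
        System.out.println("second = " + second); // d41d8cd98f00b204e9800998ecf8427e
        System.out.println("next read = " + is.read()); // -1, end of stream
    }
}
```

The second value is the same d41d8cd9... digest as Digest2 in the question, which is what you get when nothing is fed into MD5.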

Answer 2

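d41d8cd98f00b204e9800998ecf8427e is the MD5 hash of the empty string ("").

That is because is is a stream: once you have read it to the end (inside DigestUtils.md5Hex(is)), the "cursor" sits at the end of the stream, where there is no more data. Any further read yields no bytes (read() reports end of stream with -1).

I suggest reading the contents of the stream into a byte[] instead and hashing that.
For how to get a byte[] from an InputStream, see this question.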
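A rough sketch of that suggestion follows. The names HashBytesDemo, readAll and md5Hex are invented for this example, and the hashing again goes through the JDK's MessageDigest so the snippet is self-contained; with commons-codec on the classpath you would pass the same byte[] to DigestUtils.md5Hex(byte[]) instead. It assumes the data fits comfortably in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashBytesDemo {

    // Hypothetical helper: copies everything left in the stream into a byte[].
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Hypothetical stand-in for DigestUtils.md5Hex(byte[]).
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws Exception {
        // In the question this would be a FileInputStream; an in-memory stream keeps the demo portable.
        try (InputStream is = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8))) {
            byte[] data = readAll(is); // consume the stream exactly once

            // The array can be hashed as often as needed; both lines print the same digest.
            System.out.println("Digest  = " + md5Hex(data));
            System.out.println("Digest2 = " + md5Hex(data));
        }
    }
}
```

On Java 9 and later, InputStream.readAllBytes() can replace the hand-written readAll loop.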

Answer 3

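You cannot go back in an InputStream, so calling this twice:

DigestUtils.md5Hex(is);

does not give the same result. It is better to read the data into a byte array and use: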

public static String md5Hex(byte[] data)
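If the file is too large to hold in a byte array, one alternative (not from the answers above, just a possible variation) is to open a fresh stream for every digest, since a new stream always starts at the beginning of the file. The sketch below assumes that approach; ReopenStreamDemo and md5HexOfFile are hypothetical names, and the JDK's MessageDigest stands in for commons-codec.

```java
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ReopenStreamDemo {

    // Hypothetical helper: opens its own stream, so every call starts at byte 0.
    static String md5HexOfFile(Path path) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = Files.newInputStream(path)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        return String.format("%032x", new BigInteger(1, md.digest()));
    }

    public static void main(String[] args) throws Exception {
        // A temporary file stands in for D:\test.jpg from the question.
        Path tmp = Files.createTempFile("md5-demo", ".bin");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            // Each call re-reads the file from the start, so both digests match.
            System.out.println("Digest  = " + md5HexOfFile(tmp));
            System.out.println("Digest2 = " + md5HexOfFile(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

This reads the file twice, which trades extra I/O for constant memory use.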

java

md5

inputstream