How to hash file with multiple algorithms at the same time in PHP?

I want to hash a given file using multiple algorithms, but right now I'm doing it sequentially, like this:

return [
    hash_file('md5', $uri),
    hash_file('sha1', $uri),
    hash_file('sha256', $uri)
];

Is there a way to hash the file by opening only one stream instead of N, where N is the number of algorithms I want to use? Something like this:

return hash_file(['md5', 'sha1', 'sha256'], $uri);

You can open a file pointer, then use hash_init() with hash_update() to calculate the hash over the file without opening it many times, and finally use hash_final() to get the resulting hash.

<?php
function hash_file_multi($algos, $filename) {
    if (!is_array($algos)) {
        throw new \InvalidArgumentException('First argument must be an array');
    }

    if (!is_string($filename)) {
        throw new \InvalidArgumentException('Second argument must be a string');
    }

    if (!file_exists($filename)) {
        throw new \InvalidArgumentException('Second argument, file not found');
    }

    $result = [];
    $fp = fopen($filename, "r");
    if ($fp) {
        // init hash contexts
        foreach ($algos as $algo) {
            $ctx[$algo] = hash_init($algo);
        }

        // calculate hash
        while (!feof($fp)) {
            $buffer = fgets($fp, 65536);
            if ($buffer === false) {
                break; // fgets() returns false at EOF or on read error
            }
            foreach ($ctx as $key => $context) {
                hash_update($ctx[$key], $buffer);
            }
        }

        // finalise hashes and store them in the result array
        foreach ($algos as $algo) {
            $result[$algo] = hash_final($ctx[$algo]);
        }

        fclose($fp);
    } else {
        throw new \InvalidArgumentException('Could not open file for reading');
    }
    return $result;
}

$result = hash_file_multi(['md5', 'sha1', 'sha256'], $uri);

var_dump($result['md5'] === hash_file('md5', $uri)); //true
var_dump($result['sha1'] === hash_file('sha1', $uri)); //true
var_dump($result['sha256'] === hash_file('sha256', $uri)); //true

Also posted to the PHP manual: http://php.net/manual/en/function.hash-file.php#122549
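
As a side note (not part of the original answer), the function above validates its argument types but not the algorithm names themselves. If you want to fail early on an unsupported algorithm, a minimal, untested sketch using PHP's built-in hash_algos() could look like this:

<?php
// Hypothetical pre-check: reject algorithm names that this PHP build does not support.
$algos = ['md5', 'sha1', 'sha256'];
$unsupported = array_diff($algos, hash_algos()); // hash_algos() lists all registered algorithm names
if ($unsupported !== []) {
    throw new \InvalidArgumentException('Unsupported hash algorithm(s): ' . implode(', ', $unsupported));
}
$result = hash_file_multi($algos, $uri);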

Here is a modification* that reads the file only once and also works for non-seekable streams, such as STDIN:

<?php
function hash_stream_multi($algos, $stream) {
    if (!is_array($algos)) {
        throw new \InvalidArgumentException('First argument must be an array');
    }

    if (!is_resource($stream)) {
        throw new \InvalidArgumentException('Second argument must be a resource');
    }

    $result = [];
    foreach ($algos as $algo) {
        $ctx[$algo] = hash_init($algo);
    }
    while (!feof($stream)) {
        $chunk = fread($stream, 1 << 20);  // read data in 1 MiB chunks
        if ($chunk === false) {
            break; // fread() returns false on error
        }
        foreach ($algos as $algo) {
            hash_update($ctx[$algo], $chunk);
        }
    }
    foreach ($algos as $algo) {
        $result[$algo] = hash_final($ctx[$algo]);
    }
    return $result;
}

// test: hash standard input with MD5, SHA-1 and SHA-256
$result = hash_stream_multi(['md5', 'sha1', 'sha256'], STDIN);
print_r($result);

Try it online!

It works by reading data from the input stream with fread() in chunks (of one megabyte, which should give a reasonable balance between performance and memory use) and feeding the chunks to each hash with hash_update().

*) Lawrence updated his answer while I was writing this, but I feel mine is still sufficiently distinct to justify keeping both. The main differences between this solution and Lawrence's updated version are that my function takes an input stream instead of a filename, and that I use fread() rather than fgets() (since, for hashing, there is no need to split the input at newlines).
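
For reference, here is a small, untested usage sketch (assuming $uri points to a readable local file) showing that the stream-based function produces the same results as the single-algorithm hash_file() calls:

<?php
// Open a regular file and feed its handle to the stream-based function.
$fp = fopen($uri, 'r');
$result = hash_stream_multi(['md5', 'sha1', 'sha256'], $fp);
fclose($fp);

var_dump($result['md5'] === hash_file('md5', $uri));       // true
var_dump($result['sha256'] === hash_file('sha256', $uri)); // true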