How do I get the correct hash values when reading a .txt wordlist line-by-line?

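I am trying to build a Python 3.x script that reads a .txt wordlist and converts the word on each line to its hash equivalent, but when I run the script it produces the wrong hash.

I hope you can help me figure out what I am doing wrong here.

Output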

Arguments passed to the program:
Namespace(inputHashType=['md5'], verbose=True, 
    wordlist=<_io.TextIOWrapper name='C:\Users\Mikael\Desktop\wordlist.txt' mode='rt' encoding='utf-8'>)
Verbose is set to: True

correct hash:  b61a6d542f9036550ba9c401c80f00ef
Line 1:  PT: tests      As hash: a58c6e40436bbb090294218b7d758a15

Example input file:

tests
tests1
tests2

Source code

import argparse
import sys
from Crypto.Hash import MD5, SHA1, SHA224, SHA256, SHA384, SHA512


parser = argparse.ArgumentParser(description='Hash production')
parser.add_argument('-v', action='store_true', dest='verbose', default=False, help='Print attempts')
parser.add_argument('-t', nargs=1, dest='inputHashType', help='Hash type')
parser.add_argument('-d', nargs='?', dest='wordlist', type=argparse.FileType('rt', encoding='utf-8'), default=sys.stdin, help='Dictionary (as file)')
args =  parser.parse_args()

inputHashType = ''.join(map(str, args.inputHashType)) # Formats args list as string
inputHashType.lower()

if inputHashType == 'md5':
    htype = MD5.new()

try:
    if args.verbose:
        with args.wordlist as file:
            line = file.readline()
            cnt = 1
            while line:
                word = line.encode('utf-8').rstrip()
                hashed = htype.update(word)
                hashed = htype.hexdigest()
                print("Line {}:  PT: {}      As hash: {}".format(cnt, line.strip(), hashed))
                line = file.readline()
                cnt += 1
    else:
        break
except:
    print('Error')

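The problem is that, in the try block of your code, you are re-using the same MD5 hash object for every new line by calling its update() method. This does not compute the hash of that input string alone; it accumulates the input and returns the hash of everything fed in up to that point.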

You can easily confirm that this is what is happening by using md5sum:
$ echo -n 'tests' | md5sum
b61a6d542f9036550ba9c401c80f00ef  -    # Identical to your 1st output line
$ echo -n 'teststests' | md5sum         # This is what you're calculating
a58c6e40436bbb090294218b7d758a15  -    # Identical to your 2nd output line.
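
The same accumulation can be reproduced in Python. The following is a minimal sketch that uses the standard library's hashlib as a stand-in for Crypto.Hash.MD5 (an assumption made only so the snippet runs without extra packages); the helper name cumulative_digests is hypothetical. It feeds 'tests' twice into one shared hash object, and the second digest equals the digest of 'teststests', the string md5sum hashed above.

```python
import hashlib


def cumulative_digests(words):
    """Hash each word with ONE shared MD5 object, mimicking the question's loop."""
    shared = hashlib.md5()  # created once, never reset
    digests = []
    for word in words:
        shared.update(word.encode("utf-8"))  # appends to everything fed so far
        digests.append(shared.hexdigest())   # digest of the accumulated input
    return digests


first, second = cumulative_digests(["tests", "tests"])
print(first)   # digest of 'tests'
print(second)  # digest of 'teststests', not of 'tests'
print(second == hashlib.md5(b"teststests").hexdigest())
```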

To compute the hash of each new input on its own, you need to create a fresh MD5 instance for every line by calling the new() method.
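
One possible way to restructure the loop is sketched below. It is not the original script: it uses hashlib from the standard library instead of Crypto.Hash, reads from an in-memory io.StringIO instead of the file passed with -d, and the function name hash_lines is hypothetical. The point it illustrates is only that the hash object is created inside the loop, so each word is hashed independently.

```python
import hashlib
import io


def hash_lines(lines, algorithm="md5"):
    """Yield (line_number, word, hexdigest), hashing every line independently."""
    for number, line in enumerate(lines, start=1):
        word = line.rstrip()                 # drop the trailing newline
        hasher = hashlib.new(algorithm)      # fresh hash object for each word
        hasher.update(word.encode("utf-8"))
        yield number, word, hasher.hexdigest()


# Stand-in for the wordlist file from the question (assumed contents).
wordlist = io.StringIO("tests\ntests1\ntests2\n")
for number, word, digest in hash_lines(wordlist):
    print("Line {}:  PT: {}      As hash: {}".format(number, word, digest))
```

With Crypto.Hash the equivalent change is to move the MD5.new() call from before the loop to inside it, so that htype is recreated for every line.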