CTF Type Juggling with ripemd160 hash

Question

I am trying to solve a CTF in which type juggling is supposed to be used. The code is:

if ($_GET["hash"] == hash("ripemd160", $_GET["hash"]))
{
    echo $flag;
}
else
{
    echo "<h1>Bad Hash</h1>";
}

I wrote a Python script that tries random inputs, looking for one whose ripemd160 hash starts with "0e" and is followed only by digits. The code is:

import hashlib
import random
import string

def id_generator(size, chars=string.digits):
    return ''.join(random.choice(chars) for _ in range(size))
param = "0e"
results = []
while True:
    h = hashlib.new('ripemd160')
    h.update("{0}".format(str(param)).encode('utf-8'))
    hashed = h.hexdigest()
    if param not in results:
        print(param)
        if hashed.startswith("0e") and hashed[2:].isdigit():
            print(param)
            print(hashed)
            break
        results.append(param)
    else:
        print("CHECKED")
    param = "0e" + str(id_generator(size=10))

Any suggestions on how to solve this? Thanks!

Answer 1

There seems to be some misunderstanding in the comments, so I'll start by explaining the problem a little:

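Type juggling refers to PHP's behaviour whereby variables are implicitly converted to different data types under certain conditions. For example, all of the following logical expressions evaluate to true in PHP: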

0 == 0                       // int vs. int
"0" == 0                     // str -> int
"abc" == 0                   // any non-numerical string -> 0 (PHP < 8 only)
"1.234E+03" == "0.1234E+04"  // string that looks like a float -> float
"0e215962017" == 0           // another string that looks like a float

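The last example is interesting because its MD5 hash is another string consisting of 0e followed by a run of decimal digits (0e291242476940776845150308577824). So here is another logical expression that evaluates to true in PHP: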

"0e215962017" == md5("0e215962017")

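To solve this CTF challenge, you have to find a string that is "equal" to its own hash value, but using the RIPEMD160 algorithm instead of MD5. When such a string is provided as a query string variable (e.g., ?hash=0e215962017), the PHP script will disclose the value of the flag.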
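
For anyone who wants to experiment outside PHP, the comparison can be approximated in Python. This is only a rough sketch: php_loose_eq and the NUMERIC pattern are hypothetical helpers written for illustration, and they model just the string-vs-string case, where PHP compares two numeric-looking strings as numbers and otherwise falls back to ordinary string comparison.

```python
import hashlib
import re

# Simplified model of a PHP "numeric string": optional sign, digits with an
# optional fraction, optional exponent. Real PHP has more edge cases.
NUMERIC = re.compile(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?\s*")

def php_loose_eq(a: str, b: str) -> bool:
    """Approximate PHP's == for two strings (hypothetical helper)."""
    if NUMERIC.fullmatch(a) and NUMERIC.fullmatch(b):
        # Both look like numbers, so compare their numeric values.
        return float(a) == float(b)
    return a == b

s = "0e215962017"
h = hashlib.md5(s.encode("ascii")).hexdigest()
print(h, php_loose_eq(s, h))
```

Both strings parse as zero times some power of ten, which is why the two sides count as equal under this model.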

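Fake hash collisions like this aren't difficult to find. Roughly 1 in every 256 MD5 hashes starts with '0e', and the probability that the remaining 30 characters are all digits is (10/16)^30. If you do the maths, you'll find that the probability of an MD5 hash equating to zero in PHP is approximately one in 340 million. It took me about a minute (almost 216 million attempts) to find the example above.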

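Exactly the same approach can be used to find similar values that work with RIPEMD160. You just need to test more hashes, since the extra hash digits mean that the probability of a "collision" is approximately one in 14.6 billion. That is quite a lot, but still tractable (in fact, I found a solution to this challenge in about 15 minutes, but I'm not posting it here).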
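
The two figures can be reproduced with a few lines of Python. This sketch assumes hash output is uniformly distributed over hex characters; magic_hash_odds is a hypothetical helper name used only for illustration.

```python
def magic_hash_odds(hex_digits: int) -> float:
    """Expected number of hashes per '0e' + all-digits hit (hypothetical helper)."""
    p_prefix = (1 / 16) ** 2                   # first two hex characters are '0e'
    p_digits = (10 / 16) ** (hex_digits - 2)   # every remaining character is 0-9
    return 1 / (p_prefix * p_digits)

md5_odds = magic_hash_odds(32)        # MD5 digests are 32 hex characters
ripemd160_odds = magic_hash_odds(40)  # RIPEMD160 digests are 40 hex characters
print(f"MD5:       about 1 in {md5_odds:,.0f}")
print(f"RIPEMD160: about 1 in {ripemd160_odds:,.0f}")
```

The eight extra hex characters multiply the expected work by (16/10)^8, roughly a factor of 43.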

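Your code, on the other hand, will take much, much longer to find a solution. First of all, there is absolutely no point in generating random inputs. Sequential values work just as well and are much faster to generate.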

If you use sequential input values, you also don't need to worry about repeating the same hash calculations. Your code uses a list to store previously hashed values. This is a terrible idea. Searching for an item in a list is an O(n) operation, so once your code has (unsuccessfully) tested a billion inputs, it will have to compare every new input against each of those billion inputs at every iteration, causing your code to grind to a complete standstill. Your code would actually run a lot faster if you didn't bother checking for duplicates. When you have time, I suggest you learn when to use lists, dicts and sets in Python.
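
The cost difference is easy to see with a small timing experiment. The sketch below is illustrative only: the collection size and repeat count are arbitrary assumptions, and with sequential inputs no duplicate check is needed in the first place.

```python
import timeit

# 20,000 previously "tested" inputs, stored once as a list and once as a set.
candidates = [f"0e{n}" for n in range(20_000)]
as_list = list(candidates)
as_set = set(candidates)
probe = "0e19999"  # last element: the worst case for a linear list scan

# Membership test: O(n) for the list, O(1) on average for the set.
list_time = timeit.timeit(lambda: probe in as_list, number=200)
set_time = timeit.timeit(lambda: probe in as_set, number=200)
print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

The gap widens as the collection grows, which is why a list holding a billion entries becomes unusable.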

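Another problem is that your code only tests 10-digit numbers, which means it can test at most 10 billion possible inputs. Given the numbers above, are you sure this is a sensible limit?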

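Finally, your code prints every single input string before it calculates the hash. Before your program outputs a solution, you can expect it to print somewhere in the order of a billion screenfuls of incorrect guesses. Is there any point in doing this? No.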

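Here is the code I used to find the MD5 collision mentioned earlier. You can easily adapt it to work with RIPEMD160, and you can convert it to Python if you like (although the PHP code is a lot simpler):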

$n = 0;
while (1) {
    $s = "0e$n";
    $h = md5($s);
    if ($s == $h) break;
    $n++;
}
echo "$s : $h\n";
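
A Python version of the same loop might look like the sketch below. It is not the code used to find the result above: is_magic and search are hypothetical helpers, and the check accepts only digests of the form '0e' followed by digits, which is a subset of what PHP's == would accept. Passing "ripemd160" as the algorithm is the analogous call for the challenge, but whether hashlib.new accepts that name depends on the OpenSSL build behind your Python, and a pure-Python loop will be slow over billions of candidates.

```python
import hashlib
from itertools import count, islice

def is_magic(s: str, algorithm: str) -> bool:
    """True if the hex digest of s is '0e' followed only by decimal digits."""
    digest = hashlib.new(algorithm, s.encode("ascii")).hexdigest()
    return digest.startswith("0e") and digest[2:].isdigit()

def search(algorithm: str, start: int = 0, limit=None):
    """Try '0e<n>' for sequential n; return the first match, or None."""
    for n in islice(count(start), limit):
        s = f"0e{n}"
        if is_magic(s, algorithm):
            return s
    return None

# Starting just below the MD5 value quoted earlier keeps the demo fast.
print(search("md5", start=215_962_000, limit=100))
```

Sequential counting needs no duplicate tracking and no per-candidate output, which addresses the problems described above.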

Note: Use PHP's hash_equals() function and strict comparison operators to avoid this sort of vulnerability in your own code.
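
For comparison, here is a small Python sketch of what a strict check does with the MD5 example. Python's == on strings never converts them to numbers, and hmac.compare_digest is the standard-library constant-time comparison; treating it as a rough counterpart to hash_equals() is my own analogy, not something from the challenge.

```python
import hashlib
import hmac

s = "0e215962017"
h = hashlib.md5(s.encode("ascii")).hexdigest()

# Both checks compare the characters themselves, with no numeric conversion.
strict_equal = (s == h)                       # plain string equality
constant_time = hmac.compare_digest(s, h)     # constant-time comparison
print(strict_equal, constant_time)
```

Under either check the input and its digest are simply two different strings, so the trick no longer works.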
