使用 python 进行双重哈希

Double Hashing with python

我目前正在 class 中进行哈希处理。 我需要创建一个双哈希函数,它接受一个列表并使用双哈希和 returns 一个新列表。

我了解列表如何使用双重散列,但我很难写下它的代码。

hashkey = key % len(list)
steps = q - (key % q)
new_hashkey = steps + hashkey
#q = double_hash_value

这是我在class中学到的双重哈希函数。我只是无法在代码中实现它。

这是我目前的尝试:

def double_hashing(keys, hashtable_size, double_hash_value):
    hashtable_list = [None] * hashtable_size
    for i in range(len(keys)):
        hashkey = keys[i] % hashtable_size
        if hashtable_list[hashkey] is None:
            hashtable_list[hashkey] = keys[i]
        else:
            new_hashkey = hashkey
            while hashtable_list[new_hashkey] is not None:
                steps = double_hash_value - (keys[i] % double_hash_value)
                new_hashkey = hashkey + steps
            hashtable_list[new_hashkey] = keys[i]
            return hashtable_list

values = [26, 54, 94, 17, 31, 77, 44, 51]
double = double_hashing(values, 13, 5)
print('Double =', double)

我相当确定这几乎是正确的,但我只是犯了一个愚蠢的错误或其他什么。我了解双重哈希的工作原理,但上面的代码不起作用。

这个输出应该是:

[26, None, 54, 94, 17, 31, 44, 51, None, None, None, None, 77]

但我的输出是:

[26, None, 54, 94, 17, 31, 44, None, None, None, None, None, 77]

未添加索引位置的值 51。

感谢任何帮助。谢谢。

据我所知,你的函数有两个问题:

问题 1 是在您的 while 循环中,您使用 hashkey 来更新 new_hashkey 的值,这意味着如果您的函数无法找到合适的索引在 while 循环的第一次迭代中给定值,它永远不会找到它,因为 new_hashkey 的值永远不会改变。此外,通过简单地将 steps 添加到 new_hashkey 你 运行 有一个 new_hashkey 大于你的 hashtable_size 的风险,最终会抛出 IndexError.您可以通过将该值取模 hashtable_size 来解决这个问题。 其次,您的功能 return 太早了。在您的示例中,它在遇到 44 之后正在 returning (即它第一次进入 else 块),这就是为什么您没有将 51 添加到你的输出列表。您的 return 语句实际上应该在 for 循环完成之后,以便您确保将 keys 列表中的所有值添加到输出列表中。

查看下面的更新代码(更改的行已注释):

def double_hashing(keys, hashtable_size, double_hash_value):
    hashtable_list = [None] * hashtable_size
    for i in range(len(keys)):
        hashkey = keys[i] % hashtable_size
        if hashtable_list[hashkey] is None:
            hashtable_list[hashkey] = keys[i]
        else:
            new_hashkey = hashkey
            while hashtable_list[new_hashkey] is not None:
                steps = double_hash_value - (keys[i] % double_hash_value)
                new_hashkey = (new_hashkey + steps) % hashtable_size  ## problem 1 is here
            hashtable_list[new_hashkey] = keys[i]
    return hashtable_list  ## problem 2 is here


values = [26, 54, 94, 17, 31, 77, 44, 51]
print( double_hashing(values, 13, 5) )

[26, None, 54, 94, 17, 31, 44, 51, None, None, None, None, 77]

一般来说,我们是这样用双重散列解决冲突的:如果有冲突,使用第二个散列函数如下,找到散列tableT中的下一个位置,如下图:

现在,让我们使用以下代码实现双重哈希:

def h1(k, m):
    return (2*k+3)%m

def h2(k, m):
    return (3*k+1)%m

def resolve_collision_with_double_hashing(hastable, keys, m, h1, h2):
    for k in keys:
        index = h1(k, m)
        if not hastable[index]: # no collision
            hastable[index] = k
        else:         # use double-hashing
            v = h2(k, m)
            inserted = False
            i = 1 # no need to check for i = 0, since collision already occurred
            while i < m: 
                index1 =  (index +  v * i) % m
                i += 1
                print('inserting {}, number of probings: {}'.format(k, i))
                if not hastable[index1]:
                    hastable[index1], inserted = k, True
                    break
            if not inserted:
                print('could not insert {}'.format(k))

print('hash table: ' + ' '.join(map(lambda x: str(x) if x else '', hastable)))

m = 11
hashtable = [None]*m
keys = [3,2,9,6,11,13,7,1,12,22]
resolve_collision_with_double_hashing(hashtable, keys, m, h1, h2)

# trying to insert 13, number of probings: 2
# trying to insert 13, number of probings: 3
# trying to insert 13, number of probings: 4
# inserted 13
# trying to insert 7, number of probings: 2
# trying to insert 7, number of probings: 3
# trying to insert 7, number of probings: 4
# trying to insert 7, number of probings: 5
# trying to insert 7, number of probings: 6
# trying to insert 7, number of probings: 7
# trying to insert 7, number of probings: 8
# trying to insert 7, number of probings: 9
# trying to insert 7, number of probings: 10
# trying to insert 7, number of probings: 11
# could not insert 7
# trying to insert 12, number of probings: 2
# trying to insert 12, number of probings: 3
# inserted 12
# trying to insert 22, number of probings: 2
# trying to insert 22, number of probings: 3
# trying to insert 22, number of probings: 4
# trying to insert 22, number of probings: 5
# trying to insert 22, number of probings: 6
# inserted 22
# hash table: _ _ 12 11 6 1 13 2 22 3 9