使用 python 进行双重哈希
Double Hashing with python
我目前正在 class 中进行哈希处理。
我需要创建一个双哈希函数,它接受一个列表并使用双哈希和 returns 一个新列表。
我了解列表如何使用双重散列,但我很难写下它的代码。
hashkey = key % len(list)
steps = q - (key % q)
new_hashkey = steps + hashkey
#q = double_hash_value
这是我在class中学到的双重哈希函数。我只是无法在代码中实现它。
这是我目前的尝试:
def double_hashing(keys, hashtable_size, double_hash_value):
hashtable_list = [None] * hashtable_size
for i in range(len(keys)):
hashkey = keys[i] % hashtable_size
if hashtable_list[hashkey] is None:
hashtable_list[hashkey] = keys[i]
else:
new_hashkey = hashkey
while hashtable_list[new_hashkey] is not None:
steps = double_hash_value - (keys[i] % double_hash_value)
new_hashkey = hashkey + steps
hashtable_list[new_hashkey] = keys[i]
return hashtable_list
values = [26, 54, 94, 17, 31, 77, 44, 51]
double = double_hashing(values, 13, 5)
print('Double =', double)
我相当确定这几乎是正确的,但我只是犯了一个愚蠢的错误或其他什么。我了解双重哈希的工作原理,但上面的代码不起作用。
这个输出应该是:
[26, None, 54, 94, 17, 31, 44, 51, None, None, None, None, 77]
但我的输出是:
[26, None, 54, 94, 17, 31, 44, None, None, None, None, None, 77]
未添加索引位置的值 51。
感谢任何帮助。谢谢。
据我所知,你的函数有两个问题:
问题 1 是在您的 while
循环中,您使用 hashkey
来更新 new_hashkey
的值,这意味着如果您的函数无法找到合适的索引在 while 循环的第一次迭代中给定值,它永远不会找到它,因为 new_hashkey
的值永远不会改变。此外,通过简单地将 steps
添加到 new_hashkey
你 运行 有一个 new_hashkey
大于你的 hashtable_size
的风险,最终会抛出 IndexError
.您可以通过将该值取模 hashtable_size
来解决这个问题。
其次,您的功能 return 太早了。在您的示例中,它在遇到 44
之后正在 returning (即它第一次进入 else
块),这就是为什么您没有将 51
添加到你的输出列表。您的 return 语句实际上应该在 for 循环完成之后,以便您确保将 keys
列表中的所有值添加到输出列表中。
查看下面的更新代码(更改的行已注释):
def double_hashing(keys, hashtable_size, double_hash_value):
hashtable_list = [None] * hashtable_size
for i in range(len(keys)):
hashkey = keys[i] % hashtable_size
if hashtable_list[hashkey] is None:
hashtable_list[hashkey] = keys[i]
else:
new_hashkey = hashkey
while hashtable_list[new_hashkey] is not None:
steps = double_hash_value - (keys[i] % double_hash_value)
new_hashkey = (new_hashkey + steps) % hashtable_size ## problem 1 is here
hashtable_list[new_hashkey] = keys[i]
return hashtable_list ## problem 2 is here
values = [26, 54, 94, 17, 31, 77, 44, 51]
print( double_hashing(values, 13, 5) )
[26, None, 54, 94, 17, 31, 44, 51, None, None, None, None, 77]
一般来说,我们是这样用双重散列解决冲突的:如果有冲突,使用第二个散列函数如下,找到散列tableT中的下一个位置,如下图:
现在,让我们使用以下代码实现双重哈希:
def h1(k, m):
return (2*k+3)%m
def h2(k, m):
return (3*k+1)%m
def resolve_collision_with_double_hashing(hastable, keys, m, h1, h2):
for k in keys:
index = h1(k, m)
if not hastable[index]: # no collision
hastable[index] = k
else: # use double-hashing
v = h2(k, m)
inserted = False
i = 1 # no need to check for i = 0, since collision already occurred
while i < m:
index1 = (index + v * i) % m
i += 1
print('inserting {}, number of probings: {}'.format(k, i))
if not hastable[index1]:
hastable[index1], inserted = k, True
break
if not inserted:
print('could not insert {}'.format(k))
print('hash table: ' + ' '.join(map(lambda x: str(x) if x else '', hastable)))
m = 11
hashtable = [None]*m
keys = [3,2,9,6,11,13,7,1,12,22]
resolve_collision_with_double_hashing(hashtable, keys, m, h1, h2)
# trying to insert 13, number of probings: 2
# trying to insert 13, number of probings: 3
# trying to insert 13, number of probings: 4
# inserted 13
# trying to insert 7, number of probings: 2
# trying to insert 7, number of probings: 3
# trying to insert 7, number of probings: 4
# trying to insert 7, number of probings: 5
# trying to insert 7, number of probings: 6
# trying to insert 7, number of probings: 7
# trying to insert 7, number of probings: 8
# trying to insert 7, number of probings: 9
# trying to insert 7, number of probings: 10
# trying to insert 7, number of probings: 11
# could not insert 7
# trying to insert 12, number of probings: 2
# trying to insert 12, number of probings: 3
# inserted 12
# trying to insert 22, number of probings: 2
# trying to insert 22, number of probings: 3
# trying to insert 22, number of probings: 4
# trying to insert 22, number of probings: 5
# trying to insert 22, number of probings: 6
# inserted 22
# hash table: _ _ 12 11 6 1 13 2 22 3 9
我目前正在 class 中进行哈希处理。 我需要创建一个双哈希函数,它接受一个列表并使用双哈希和 returns 一个新列表。
我了解列表如何使用双重散列,但我很难写下它的代码。
hashkey = key % len(list)
steps = q - (key % q)
new_hashkey = steps + hashkey
#q = double_hash_value
这是我在class中学到的双重哈希函数。我只是无法在代码中实现它。
这是我目前的尝试:
def double_hashing(keys, hashtable_size, double_hash_value):
hashtable_list = [None] * hashtable_size
for i in range(len(keys)):
hashkey = keys[i] % hashtable_size
if hashtable_list[hashkey] is None:
hashtable_list[hashkey] = keys[i]
else:
new_hashkey = hashkey
while hashtable_list[new_hashkey] is not None:
steps = double_hash_value - (keys[i] % double_hash_value)
new_hashkey = hashkey + steps
hashtable_list[new_hashkey] = keys[i]
return hashtable_list
values = [26, 54, 94, 17, 31, 77, 44, 51]
double = double_hashing(values, 13, 5)
print('Double =', double)
我相当确定这几乎是正确的,但我只是犯了一个愚蠢的错误或其他什么。我了解双重哈希的工作原理,但上面的代码不起作用。
这个输出应该是:
[26, None, 54, 94, 17, 31, 44, 51, None, None, None, None, 77]
但我的输出是:
[26, None, 54, 94, 17, 31, 44, None, None, None, None, None, 77]
未添加索引位置的值 51。
感谢任何帮助。谢谢。
据我所知,你的函数有两个问题:
问题 1 是在您的 while
循环中,您使用 hashkey
来更新 new_hashkey
的值,这意味着如果您的函数无法找到合适的索引在 while 循环的第一次迭代中给定值,它永远不会找到它,因为 new_hashkey
的值永远不会改变。此外,通过简单地将 steps
添加到 new_hashkey
你 运行 有一个 new_hashkey
大于你的 hashtable_size
的风险,最终会抛出 IndexError
.您可以通过将该值取模 hashtable_size
来解决这个问题。
其次,您的功能 return 太早了。在您的示例中,它在遇到 44
之后正在 returning (即它第一次进入 else
块),这就是为什么您没有将 51
添加到你的输出列表。您的 return 语句实际上应该在 for 循环完成之后,以便您确保将 keys
列表中的所有值添加到输出列表中。
查看下面的更新代码(更改的行已注释):
def double_hashing(keys, hashtable_size, double_hash_value):
hashtable_list = [None] * hashtable_size
for i in range(len(keys)):
hashkey = keys[i] % hashtable_size
if hashtable_list[hashkey] is None:
hashtable_list[hashkey] = keys[i]
else:
new_hashkey = hashkey
while hashtable_list[new_hashkey] is not None:
steps = double_hash_value - (keys[i] % double_hash_value)
new_hashkey = (new_hashkey + steps) % hashtable_size ## problem 1 is here
hashtable_list[new_hashkey] = keys[i]
return hashtable_list ## problem 2 is here
values = [26, 54, 94, 17, 31, 77, 44, 51]
print( double_hashing(values, 13, 5) )
[26, None, 54, 94, 17, 31, 44, 51, None, None, None, None, 77]
一般来说,我们是这样用双重散列解决冲突的:如果有冲突,使用第二个散列函数如下,找到散列tableT中的下一个位置,如下图:
现在,让我们使用以下代码实现双重哈希:
def h1(k, m):
return (2*k+3)%m
def h2(k, m):
return (3*k+1)%m
def resolve_collision_with_double_hashing(hastable, keys, m, h1, h2):
for k in keys:
index = h1(k, m)
if not hastable[index]: # no collision
hastable[index] = k
else: # use double-hashing
v = h2(k, m)
inserted = False
i = 1 # no need to check for i = 0, since collision already occurred
while i < m:
index1 = (index + v * i) % m
i += 1
print('inserting {}, number of probings: {}'.format(k, i))
if not hastable[index1]:
hastable[index1], inserted = k, True
break
if not inserted:
print('could not insert {}'.format(k))
print('hash table: ' + ' '.join(map(lambda x: str(x) if x else '', hastable)))
m = 11
hashtable = [None]*m
keys = [3,2,9,6,11,13,7,1,12,22]
resolve_collision_with_double_hashing(hashtable, keys, m, h1, h2)
# trying to insert 13, number of probings: 2
# trying to insert 13, number of probings: 3
# trying to insert 13, number of probings: 4
# inserted 13
# trying to insert 7, number of probings: 2
# trying to insert 7, number of probings: 3
# trying to insert 7, number of probings: 4
# trying to insert 7, number of probings: 5
# trying to insert 7, number of probings: 6
# trying to insert 7, number of probings: 7
# trying to insert 7, number of probings: 8
# trying to insert 7, number of probings: 9
# trying to insert 7, number of probings: 10
# trying to insert 7, number of probings: 11
# could not insert 7
# trying to insert 12, number of probings: 2
# trying to insert 12, number of probings: 3
# inserted 12
# trying to insert 22, number of probings: 2
# trying to insert 22, number of probings: 3
# trying to insert 22, number of probings: 4
# trying to insert 22, number of probings: 5
# trying to insert 22, number of probings: 6
# inserted 22
# hash table: _ _ 12 11 6 1 13 2 22 3 9