为什么redis中hashmap的负载因子大到5
Why load factor of hashmap in redis is as large as 5
在算法类和授权书籍中,load-factor小于1,与Java一样,默认为0.75。但是在redis源码中,负载因子是5.
54 /* Using dictEnableResize() / dictDisableResize() we make possible to
55 * enable/disable resizing of the hash table as needed. This is very important
56 * for Redis, as we use copy-on-write and don't want to move too much memory
57 * around when there is a child performing saving operations.
58 *
59 * Note that even when dict_can_resize is set to 0, not all resizes are
60 * prevented: a hash table is still allowed to grow if the ratio between
61 * the number of elements and the buckets > dict_force_resize_ratio. */
62 static int dict_can_resize = 1;
63 static unsigned int dict_force_resize_ratio = 5;
为什么?
开始重新散列的负载因子约为 1。 dict_force_resize_ratio
值是一种安全措施,即使禁用重新散列,一旦达到该负载因子,它也会强制执行。
您可以在 dict.c
的 _dictExpandIfNeeded(dict *d)
中看到这个
/* If we reached the 1:1 ratio, and we are allowed to resize the hash
* table (global setting) or we should avoid it but the ratio between
* elements/buckets is over the "safe" threshold, we resize doubling
* the number of buckets. */
if (d->ht[0].used >= d->ht[0].size &&
(dict_can_resize ||
d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
{
return dictExpand(d, d->ht[0].used*2);
}
Redis 允许 ~1 开始重新散列,因为重新散列不是一次完成的。它是通过维护两个哈希表逐步完成的。
参见dict.h
:
/* This is our hash table structure. Every dictionary has two of this as we
* implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
dictEntry **table;
unsigned long size;
unsigned long sizemask;
unsigned long used;
} dictht;
typedef struct dict {
dictType *type;
void *privdata;
dictht ht[2];
long rehashidx; /* rehashing not in progress if rehashidx == -1 */
unsigned long iterators; /* number of iterators currently running */
} dict;
在dict.c
中:
/* Performs N steps of incremental rehashing. Returns 1 if there are still
* keys to move from the old to the new hash table, otherwise 0 is returned.
*
* Note that a rehashing step consists in moving a bucket (that may have more
* than one key as we use chaining) from the old to the new hash table, however
* since part of the hash table may be composed of empty spaces, it is not
* guaranteed that this function will rehash even a single bucket, since it
* will visit at max N*10 empty buckets in total, otherwise the amount of
* work it does would be unbound and the function may block for a long time. */
int dictRehash(dict *d, int n) {...
对于 activerehashing
设置,redis.conf
中还有一些额外的见解。
# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
# order to help rehashing the main Redis hash table (the one mapping top-level
# keys to values). The hash table implementation Redis uses (see dict.c)
# performs a lazy rehashing: the more operation you run into a hash table
# that is rehashing, the more rehashing "steps" are performed, so if the
# server is idle the rehashing is never complete and some more memory is used
# by the hash table.
#
# The default is to use this millisecond 10 times every second in order to
# actively rehash the main dictionaries, freeing memory when possible.
#
# If unsure:
# use "activerehashing no" if you have hard latency requirements and it is
# not a good thing in your environment that Redis can reply from time to time
# to queries with 2 milliseconds delay.
#
# use "activerehashing yes" if you don't have such hard requirements but
# want to free memory asap when possible.
activerehashing yes
因为在Redis中,load_factor的计算方式是:ht[0].used/d->ht[0].size
,used
表示hash中元素的个数table,size
表示只底层数组的大小,所以如果发生哈希冲突,元素将存储在链表中,链表的大小不包括在ht[0].size
.
中
在算法类和授权书籍中,load-factor小于1,与Java一样,默认为0.75。但是在redis源码中,负载因子是5.
54 /* Using dictEnableResize() / dictDisableResize() we make possible to
55 * enable/disable resizing of the hash table as needed. This is very important
56 * for Redis, as we use copy-on-write and don't want to move too much memory
57 * around when there is a child performing saving operations.
58 *
59 * Note that even when dict_can_resize is set to 0, not all resizes are
60 * prevented: a hash table is still allowed to grow if the ratio between
61 * the number of elements and the buckets > dict_force_resize_ratio. */
62 static int dict_can_resize = 1;
63 static unsigned int dict_force_resize_ratio = 5;
为什么?
开始重新散列的负载因子约为 1。 dict_force_resize_ratio
值是一种安全措施,即使禁用重新散列,一旦达到该负载因子,它也会强制执行。
您可以在 dict.c
_dictExpandIfNeeded(dict *d)
中看到这个
/* If we reached the 1:1 ratio, and we are allowed to resize the hash
* table (global setting) or we should avoid it but the ratio between
* elements/buckets is over the "safe" threshold, we resize doubling
* the number of buckets. */
if (d->ht[0].used >= d->ht[0].size &&
(dict_can_resize ||
d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
{
return dictExpand(d, d->ht[0].used*2);
}
Redis 允许 ~1 开始重新散列,因为重新散列不是一次完成的。它是通过维护两个哈希表逐步完成的。
参见dict.h
:
/* This is our hash table structure. Every dictionary has two of this as we
* implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
dictEntry **table;
unsigned long size;
unsigned long sizemask;
unsigned long used;
} dictht;
typedef struct dict {
dictType *type;
void *privdata;
dictht ht[2];
long rehashidx; /* rehashing not in progress if rehashidx == -1 */
unsigned long iterators; /* number of iterators currently running */
} dict;
在dict.c
中:
/* Performs N steps of incremental rehashing. Returns 1 if there are still
* keys to move from the old to the new hash table, otherwise 0 is returned.
*
* Note that a rehashing step consists in moving a bucket (that may have more
* than one key as we use chaining) from the old to the new hash table, however
* since part of the hash table may be composed of empty spaces, it is not
* guaranteed that this function will rehash even a single bucket, since it
* will visit at max N*10 empty buckets in total, otherwise the amount of
* work it does would be unbound and the function may block for a long time. */
int dictRehash(dict *d, int n) {...
对于 activerehashing
设置,redis.conf
中还有一些额外的见解。
# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
# order to help rehashing the main Redis hash table (the one mapping top-level
# keys to values). The hash table implementation Redis uses (see dict.c)
# performs a lazy rehashing: the more operation you run into a hash table
# that is rehashing, the more rehashing "steps" are performed, so if the
# server is idle the rehashing is never complete and some more memory is used
# by the hash table.
#
# The default is to use this millisecond 10 times every second in order to
# actively rehash the main dictionaries, freeing memory when possible.
#
# If unsure:
# use "activerehashing no" if you have hard latency requirements and it is
# not a good thing in your environment that Redis can reply from time to time
# to queries with 2 milliseconds delay.
#
# use "activerehashing yes" if you don't have such hard requirements but
# want to free memory asap when possible.
activerehashing yes
因为在Redis中,load_factor的计算方式是:ht[0].used/d->ht[0].size
,used
表示hash中元素的个数table,size
表示只底层数组的大小,所以如果发生哈希冲突,元素将存储在链表中,链表的大小不包括在ht[0].size
.