Hashes memory usage is higher than Instagram test result

Hi everyone, I'm new to Redis and I've been following the Instagram engineering blog for optimization purposes. I tested the memory usage of storing 1 million keys in hashes (1000 hashes with 1000 keys each). According to the Instagram post here, it took only about 16MB of space, but my test used 38MB. Can anyone tell me where I went wrong?
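For reference, the bucketing scheme the Instagram post describes maps each numeric ID into one of 1000 hashes of roughly 1000 fields each. A minimal sketch of that mapping (the helper name `bucket_key` and the `follower:` prefix are my own, chosen to match the test code below):

```python
# Sketch of the Instagram-style bucketing scheme (hypothetical helper):
# split an ID into (hash name, field) so ~1000 entries share one hash.
def bucket_key(media_id, bucket_size=1000):
    """Return the (hash name, field) pair for a given numeric ID."""
    return 'follower:%d' % (media_id // bucket_size), media_id % bucket_size

print(bucket_key(1155315))  # -> ('follower:1155', 315)
```

The point of the scheme is that each small hash can be stored in Redis's compact ziplist encoding instead of as a full top-level key.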

Here is my test code:

# -*- coding: utf-8 -*-
import redis

def create_data():
    # 1000 hashes ('follower:0' .. 'follower:999') with 1000 fields each
    # = 1,000,000 entries total, written through a single pipeline
    r_server = redis.Redis(host='localhost', port=6379, db=5)
    p = r_server.pipeline()
    for i in xrange(1000):
        for j in xrange(1000):
            p.hset('follower:%s' % i, j, j)
    p.execute()
    size = int(r_server.info()['used_memory'])
    print '%s bytes, %s MB' % (size, size / 1024 / 1024)

create_data()

Redis INFO output:

# Server
redis_version:2.8.9
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:a9b5dff7da49156c
redis_mode:standalone
os:Linux 3.19.0-15-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.9.2
process_id:11037
run_id:c069c22be15f6b7cbd6490cea6d4ca497d8ad7cb
tcp_port:6379
uptime_in_seconds:230666
uptime_in_days:2
hz:10
lru_clock:8643496
config_file:

# Clients
connected_clients:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory
used_memory:41186920
used_memory_human:39.28M
used_memory_rss:60039168
used_memory_peak:256243984
used_memory_peak_human:244.37M
used_memory_lua:33792
mem_fragmentation_ratio:1.46
mem_allocator:jemalloc-3.2.0

# Persistence
loading:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1434659507
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok

# Stats
total_connections_received:21
total_commands_processed:3010067
instantaneous_ops_per_sec:0
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:0
keyspace_hits:10
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:2774

# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:264.43
used_cpu_user:110.01
used_cpu_sys_children:0.27
used_cpu_user_children:1.55

# Keyspace
db5:keys=1000,expires=0,avg_ttl=0

This is probably because your server uses the default setting for hash-max-ziplist-entries. Since you store 1000 fields per hash, that limit (512 by default) is exceeded, so each hash is converted to the much less memory-efficient hashtable encoding. Here is a small test I ran using your snippet:

foo@bar:/tmp$ redis-cli config get hash-max-ziplist-entries
1) "hash-max-ziplist-entries"
2) "512"
foo@bar:/tmp$ time python so.py 
56791944 bytes, 54 MB

real    0m23.225s
user    0m18.574s
sys 0m0.377s
foo@bar:/tmp$ redis-cli config set hash-max-ziplist-entries 1000
OK
foo@bar:/tmp$ redis-cli flushall
OK
foo@bar:/tmp$ time python so.py 
9112080 bytes, 8 MB

real    0m28.928s
user    0m18.663s
sys 0m0.315s
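Note that `CONFIG SET` only changes the running instance; to make the setting survive a restart it can also be placed in redis.conf. A sketch, using the same value as the test above (64 is the Redis 2.8 default for hash-max-ziplist-value, shown here only for completeness):

```
# redis.conf -- keep hashes with up to 1000 fields in the compact
# ziplist encoding (the default threshold is 512); individual fields
# or values longer than 64 bytes still force hashtable encoding
hash-max-ziplist-entries 1000
hash-max-ziplist-value 64
```

You can confirm the encoding of a key afterwards with `redis-cli object encoding follower:0`, which should report `ziplist` rather than `hashtable`.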