Redis 内存占用分析,随着业务的增长,时间的迁移,Redis 的内存占用会不断增加。常规方法使用redis-cli --bigkeys
把大 key 扫描出来,但是这个方法很难定位问题。
Redis 内存分析工具
备份线上快照,使用工具导出所有的 key,包含 key 的类型,内存大小,过期时间等。这里会用到工具redis-rdb-tools,这个工具是用 python 写的,所以我们需要一个 python 的环境
导出 rdb 每个 key 的内存占用
- bytes key 的大小,这里的 128 表示过滤出大于等于 128 字节的 key
rdb -c memory /var/redis/6379/dump.rdb --bytes 128 -f memory.csv
head memory.csv
# outout
database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11
python stat_memery_usage memory.csv
# output
user:extend:* => 635.881 MB
suspension:*:app:* => 134.631 MB
user:bomb:reward:rate:* => 70.771 MB
user:follow:* => 70.220 MB
room:*:share:* => 66.688 MB
app:daily_tasks:* => 54.152 MB
room:ranking:total:* => 34.799 MB
...
Python 工具
import argparse
import re
from collections import Counter
# 公共前缀
prefix = ['job:php', ]
# 不同部分替换为
re_replace = re.compile(r'(([a-z0-9\-]{36})|([A-Z0-9])+|\d+|android|ios)')
def init():
parser = argparse.ArgumentParser(description="analysis redis key, and find the same key")
parser.add_argument('file', default=None, type=str, help="filename")
return parser.parse_args()
def read_file(filename):
with open(filename) as f:
while True:
line = f.readline()
if line:
yield line
else:
break
def decimal_format(size, multiple=1024):
if size < 0:
raise ValueError('size must be non-negative')
suffix = ['KB', 'MB', 'GB', 'TB', ]
for suf in suffix:
size /= multiple
if size < multiple:
return '{0:.3f} {1}'.format(size, suf)
raise ValueError('size too large')
def replace(s):
for pre in prefix:
if s.startswith(pre):
return pre + ':*'
return re_replace.sub('*', s)
if __name__ == '__main__':
args = init()
counter = Counter()
hc = Counter()
count = 0
for ll in read_file(args.file):
if count == 0 and ll.startswith('database,'):
count += 1
continue
data = ll.split(',')
key = replace(data[2])
counter[key] += int(data[3])
hc[key] += 1
new_dict = sorted(counter.items(), key=lambda item: item[1], reverse=True)
for (k, v) in new_dict:
print('%s => %s => %d' % (k, decimal_format(v), hc[k]))