Redis 内存占用分析,随着业务的增长,时间的迁移,Redis 的内存占用会不断增加。常规方法使用redis-cli --bigkeys把大 key 扫描出来,但是这个方法很难定位问题。

Redis 内存分析工具

备份线上快照,使用工具导出所有的 key,包含 key 的类型,内存大小,过期时间等。这里会用到工具redis-rdb-tools,这个工具是用 python 写的,所以我们需要一个 python 的环境

导出 rdb 每个 key 的内存占用

  • bytes key 的大小,这里的 128 表示过滤出大于等于 128 字节的 key
rdb -c memory /var/redis/6379/dump.rdb --bytes 128 -f memory.csv
head memory.csv

# outout
database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11
python stat_memery_usage memory.csv

# output
user:extend:* => 635.881 MB
suspension:*:app:* => 134.631 MB
user:bomb:reward:rate:* => 70.771 MB
user:follow:* => 70.220 MB
room:*:share:* => 66.688 MB
app:daily_tasks:* => 54.152 MB
room:ranking:total:* => 34.799 MB
...

Python 工具

import argparse
import re
from collections import Counter

# 公共前缀
prefix = ['job:php', ]
# 不同部分替换为
re_replace = re.compile(r'(([a-z0-9\-]{36})|([A-Z0-9])+|\d+|android|ios)')


def init():
    parser = argparse.ArgumentParser(description="analysis redis key, and find the same key")
    parser.add_argument('file', default=None, type=str, help="filename")
    return parser.parse_args()


def read_file(filename):
    with open(filename) as f:
        while True:
            line = f.readline()
            if line:
                yield line
            else:
                break

def decimal_format(size, multiple=1024):
    if size < 0:
        raise ValueError('size must be non-negative')
    suffix = ['KB', 'MB', 'GB', 'TB', ]
    for suf in suffix:
        size /= multiple
        if size < multiple:
            return '{0:.3f} {1}'.format(size, suf)
    raise ValueError('size too large')


def replace(s):
    for pre in prefix:
        if s.startswith(pre):
            return pre + ':*'
    return re_replace.sub('*', s)


if __name__ == '__main__':
    args = init()
    counter = Counter()
    hc = Counter()
    count = 0
    for ll in read_file(args.file):
        if count == 0 and ll.startswith('database,'):
            count += 1
            continue
        data = ll.split(',')
        key = replace(data[2])
        counter[key] += int(data[3])
        hc[key] += 1

    new_dict = sorted(counter.items(), key=lambda item: item[1], reverse=True)
    for (k, v) in new_dict:
        print('%s => %s => %d' % (k, decimal_format(v), hc[k]))

推荐阅读: