Sort a list of dicts according to a list of values with regex
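I want to sort list_of_dicts by its keys, following the order given in list_months. It works once I remove the digits (the year) from the keys of list_of_dicts, but I can't figure out how to use a regex correctly inside the lambda function so that the digits can stay in the keys.
My code so far: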
import re
list_months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
list_of_dicts = [{'Apr23': '64.401'}, {'Aug23': '56.955'}, {'Dec23': '57.453'}, {'Feb23': '90.459'}, {'Jan23': '92.731'}, {'Jul23': '56.6'}, {'Jun23': '56.509'},{'Mar23': '86.209'}, {'May23': '58.705'}, {'Nov23': '57.368'}, {'Oct23': '56.711'}, {'Sep23': '57.952'}]
r = re.compile("[a-zA-Z]{3}[0-9]{2}")
print(sorted(list_of_dicts, key=lambda d: [k in d for k in list_months if re.search(r, k)], reverse=True))
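You don't need a regex here. Map each month name to its position in list_months, then sort by the first three characters of each dict's key.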
dict_months = {m:i for i, m in enumerate(list_months)}
result = sorted(list_of_dicts, key=lambda d: dict_months[next(iter(d))[:3]])
print(result)
# [{'Jan23': '92.731'}, {'Feb23': '90.459'}, {'Mar23': '86.209'}, {'Apr23': '64.401'}, {'May23': '58.705'}, {'Jun23': '56.509'}, {'Jul23': '56.6'}, {'Aug23': '56.955'}, {'Sep23': '57.952'}, {'Oct23': '56.711'}, {'Nov23': '57.368'}, {'Dec23': '57.453'}]
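If you also want to take the year into account, use
def sortby(d):
    key = next(iter(d))  # the dict's only key, e.g. 'Jan23'
    # sort by the two-digit year first, then by month position
    return int(key[3:]), dict_months[key[:3]]
result = sorted(list_of_dicts, key=sortby)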
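The year-aware key only makes a visible difference when the keys span more than one year. The sketch below illustrates that with a small made-up list (the `mixed` data is hypothetical, not from the question), assuming every key is a three-letter month followed by a two-digit year.

```python
list_months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
dict_months = {m: i for i, m in enumerate(list_months)}

def sortby(d):
    key = next(iter(d))  # the dict's only key, e.g. 'Jan24'
    # tuples compare element by element: year first, then month position
    return int(key[3:]), dict_months[key[:3]]

# Hypothetical data spanning two years
mixed = [{'Feb24': '1.0'}, {'Dec23': '2.0'}, {'Jan24': '3.0'}, {'Mar23': '4.0'}]
by_year_then_month = sorted(mixed, key=sortby)
print(by_year_then_month)
# [{'Mar23': '4.0'}, {'Dec23': '2.0'}, {'Jan24': '3.0'}, {'Feb24': '1.0'}]
```

Sorting by month alone would put Jan24 ahead of Mar23, which is usually not what is wanted for time-ordered data.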
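One approach using re.sub and list.index.
Note that list.index is a linear scan, O(n) in the length of list_months, and it runs once per element being sorted, which is relatively expensive compared with a dict lookup.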
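def get_key_loc(dic):
    k = list(dic.keys())[0]  # each dict holds a single key such as 'Jan23'
    # strip the digits (the year) and look up the month's position
    return list_months.index(re.sub(r"\d+", "", k))
sorted(list_of_dicts, key=get_key_loc)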
Output:
[{'Jan23': '92.731'},
{'Feb23': '90.459'},
{'Mar23': '86.209'},
{'Apr23': '64.401'},
{'May23': '58.705'},
{'Jun23': '56.509'},
{'Jul23': '56.6'},
{'Aug23': '56.955'},
{'Sep23': '57.952'},
{'Oct23': '56.711'},
{'Nov23': '57.368'},
{'Dec23': '57.453'}]
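For readers who do want a regex in the sort key, as the question asked, here is a minimal sketch. It assumes every key is exactly three letters followed by two digits; the names `month_index`, `key_pattern`, and `regex_key` are illustrative and do not come from the answers above. Capture groups split the key into month and year, and a dict lookup replaces the linear list.index scan.

```python
import re

list_months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
month_index = {m: i for i, m in enumerate(list_months)}

# group 1: three-letter month, group 2: two-digit year
key_pattern = re.compile(r"([A-Za-z]{3})(\d{2})")

def regex_key(d):
    k = next(iter(d))  # the dict's only key
    match = key_pattern.fullmatch(k)
    if match is None:
        raise ValueError(f"unexpected key format: {k!r}")
    month, year = match.groups()
    return int(year), month_index[month]

sample = [{'Apr23': '64.401'}, {'Jan23': '92.731'},
          {'Dec23': '57.453'}, {'Feb23': '90.459'}]
regex_sorted = sorted(sample, key=regex_key)
print(regex_sorted)
# [{'Jan23': '92.731'}, {'Feb23': '90.459'}, {'Apr23': '64.401'}, {'Dec23': '57.453'}]
```

Using fullmatch rather than search means a malformed key raises an error instead of being sorted silently into the wrong place.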
You don't really need dict_months; use Python's included batteries:
from datetime import datetime
# '%b' is the abbreviated month name, '%y' the two-digit year
sorted(list_of_dicts, key=lambda x: datetime.strptime(next(iter(x)), '%b%y'))
Or, with the third-party python-dateutil package:
import dateutil.parser
sorted(list_of_dicts, key=lambda x: dateutil.parser.parse(next(iter(x))))
Output:
[{'Jan23': '92.731'},
{'Feb23': '90.459'},
{'Mar23': '86.209'},
{'Apr23': '64.401'},
{'May23': '58.705'},
{'Jun23': '56.509'},
{'Jul23': '56.6'},
{'Aug23': '56.955'},
{'Sep23': '57.952'},
{'Oct23': '56.711'},
{'Nov23': '57.368'},
{'Dec23': '57.453'}]
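To see what the datetime approach does with these keys, here is a small sketch. It assumes the two digits are a two-digit year (hence '%y') and that the process runs under an English or C locale, because '%b' matches locale-dependent month abbreviations. The `sample` data and the `parse_key` helper are hypothetical.

```python
from datetime import datetime

def parse_key(d):
    # 'Jan23' -> datetime(2023, 1, 1); the day defaults to 1
    return datetime.strptime(next(iter(d)), '%b%y')

print(parse_key({'Jan23': '92.731'}))
# 2023-01-01 00:00:00

# Hypothetical data spanning two years
sample = [{'Jun24': 'a'}, {'Nov23': 'b'}, {'Jan24': 'c'}, {'Apr23': 'd'}]
ordered_keys = [next(iter(d)) for d in sorted(sample, key=parse_key)]
print(ordered_keys)
# ['Apr23', 'Nov23', 'Jan24', 'Jun24']
```

Because the key is a full datetime, the year is taken into account automatically and no separate month table is required.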