How to sort a list of strings delimited by '.' that also have numbers in the middle?
I have a list of strings containing commands separated by a dot (.), like this:
DeviceA.CommandA.1.Hello,
DeviceA.CommandA.2.Hello,
DeviceA.CommandA.11.Hello,
DeviceA.CommandA.3.Hello,
DeviceA.CommandB.1.Hello,
DeviceA.CommandB.1.Bye,
DeviceB.CommandB.What,
DeviceA.SubdeviceA.CommandB.1.Hello,
DeviceA.SubdeviceA.CommandB.2.Hello,
DeviceA.SubdeviceB.CommandA.1.What
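I want to sort them in natural order:
- Field position determines priority (for example, commands starting with DeviceA always come before those starting with DeviceB, and so on)
- Text fields are sorted alphabetically
- Numeric fields are sorted in ascending numeric order
So the sorted output should be: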
DeviceA.CommandA.1.Hello,
DeviceA.CommandA.2.Hello,
DeviceA.CommandA.3.Hello,
DeviceA.CommandA.11.Hello,
DeviceA.CommandB.1.Bye,
DeviceA.CommandB.1.Hello,
DeviceA.SubdeviceA.CommandB.1.Hello,
DeviceA.SubdeviceA.CommandB.2.Hello,
DeviceA.SubdeviceB.CommandA.1.What,
DeviceB.CommandB.What
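Also note that the command length is dynamic: the number of dot-separated fields can be arbitrary.
So far I have tried this without success (numbers are sorted alphabetically, e.g. 11 comes before 5):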
list = [
    "DeviceA.CommandA.1.Hello",
    "DeviceA.CommandA.2.Hello",
    "DeviceA.CommandA.11.Hello",
    "DeviceA.CommandA.3.Hello",
    "DeviceA.CommandB.1.Hello",
    "DeviceA.CommandB.1.Bye",
    "DeviceB.CommandB.What",
    "DeviceA.SubdeviceA.CommandB.1.Hello",
    "DeviceA.SubdeviceA.CommandB.2.Hello",
    "DeviceA.SubdeviceB.CommandA.1.What"
]
sorted_list = sorted(list, key=lambda x: x.split('.'))
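To see why this attempt falls short, here is a minimal sketch (the variable names are illustrative, not from the question). After split('.') every field is still a string, and strings compare character by character, so "11" sorts before "2":

```python
# Fields stay strings after split('.'), so they compare lexicographically.
fields = ["1", "2", "11", "3"]

as_text = sorted(fields)               # character-by-character comparison
as_numbers = sorted(fields, key=int)   # numeric comparison

print(as_text)      # ['1', '11', '2', '3']
print(as_numbers)   # ['1', '2', '3', '11']
```

Converting with int only works when every field is numeric, which is not the case for the command strings here.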
Edit: fixed typos.
Something like this should get you going.
from pprint import pprint
data_list = [
    "DeviceA.CommandA.1.Hello",
    "DeviceA.CommandA.2.Hello",
    "DeviceA.CommandA.3.Hello",
    "DeviceA.CommandB.1.Hello",
    "DeviceA.CommandB.1.Bye",
    "DeviceB.CommandB.What",
    "DeviceA.SubdeviceA.CommandB.1.Hello",
    "DeviceA.SubdeviceA.CommandB.15.Hello",  # added test case to ensure numbers are sorted numerically
    "DeviceA.SubdeviceA.CommandB.2.Hello",
    "DeviceA.SubdeviceB.CommandA.1.What",
]
def get_sort_key(s):
    # Turning the pieces to integers would fail some comparisons (1 vs "What")
    # so instead pad them on the left to a suitably long string
    return [
        bit.rjust(30, "0") if bit.isdigit() else bit
        for bit in s.split(".")
    ]
# Note the key function must be passed as a kwarg.
sorted_list = sorted(data_list, key=get_sort_key)
pprint(sorted_list)
The output is:
['DeviceA.CommandA.1.Hello',
'DeviceA.CommandA.2.Hello',
'DeviceA.CommandA.3.Hello',
'DeviceA.CommandB.1.Bye',
'DeviceA.CommandB.1.Hello',
'DeviceA.SubdeviceA.CommandB.1.Hello',
'DeviceA.SubdeviceA.CommandB.2.Hello',
'DeviceA.SubdeviceA.CommandB.15.Hello',
'DeviceA.SubdeviceB.CommandA.1.What',
'DeviceB.CommandB.What']
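An alternative to zero padding, sketched here rather than taken from the answer above: tag each field so that an integer is never compared directly with a string. The name natural_key and the choice to rank numeric fields before text fields at the same position are assumptions made for this illustration.

```python
def natural_key(s):
    """Build a sort key in which numeric fields compare as integers.

    Each field becomes a (tag, number, text) tuple, so an int is never
    compared with a str (which would raise TypeError in Python 3).
    """
    key = []
    for field in s.split("."):
        if field.isdecimal():
            key.append((0, int(field), ""))   # numeric field
        else:
            key.append((1, 0, field))         # text field
    return key


commands = [
    "DeviceA.CommandA.1.Hello",
    "DeviceA.CommandA.2.Hello",
    "DeviceA.CommandA.11.Hello",
    "DeviceA.CommandA.3.Hello",
    "DeviceA.CommandB.1.Hello",
    "DeviceA.CommandB.1.Bye",
    "DeviceB.CommandB.What",
    "DeviceA.SubdeviceA.CommandB.1.Hello",
    "DeviceA.SubdeviceA.CommandB.2.Hello",
    "DeviceA.SubdeviceB.CommandA.1.What",
]

for command in sorted(commands, key=natural_key):
    print(command)
```

With the question's data this should print the commands in the requested order, with no fixed padding width to choose.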
Specifying a key in sorted seems to do what you want:
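import re
def my_key(s):
    # Split at the first run of digits: text prefix, then the number as an int
    n = re.search(r"\d+", s)
    return (s[:n.span()[0]], int(n[0])) if n else (s,)
print(sorted(l, key=my_key))  # l is the list of strings from the question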
Output:
['DeviceA.CommandA.1.Hello', 'DeviceA.CommandA.2.Hello', 'DeviceA.CommandA.3.Hello', 'DeviceA.CommandA.11.Hello', 'DeviceA.CommandB.1.Hello', 'DeviceA.CommandB.1.Bye', 'DeviceA.SubdeviceA.CommandB.1.Hello', 'DeviceA.SubdeviceA.CommandB.2.Hello', 'DeviceA.SubdeviceB.CommandA.1.What', 'DeviceB.CommandB.What']
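Note that in this output DeviceA.CommandB.1.Hello still precedes DeviceA.CommandB.1.Bye, because the key stops at the first number. A possible extension, offered only as a sketch: convert every run of digits, not just the first, and keep the text in between. The name digit_run_key and the shortened sample list are made up for this example.

```python
import re


def digit_run_key(s):
    """Split s into alternating text and digit runs; digit runs become ints."""
    # re.split with a capturing group keeps the digit runs in the result.
    # The pieces alternate text, digits, text, ... and always start with a
    # (possibly empty) text piece, so ints are only compared with ints.
    return [int(piece) if piece.isdecimal() else piece
            for piece in re.split(r"(\d+)", s)]


commands = [
    "DeviceA.CommandB.1.Hello",
    "DeviceA.CommandA.11.Hello",
    "DeviceA.CommandB.1.Bye",
    "DeviceA.CommandA.2.Hello",
    "DeviceB.CommandB.What",
]

print(digit_run_key("DeviceA.CommandA.11.Hello"))
# ['DeviceA.CommandA.', 11, '.Hello']

for command in sorted(commands, key=digit_run_key):
    print(command)
# DeviceA.CommandA.2.Hello
# DeviceA.CommandA.11.Hello
# DeviceA.CommandB.1.Bye
# DeviceA.CommandB.1.Hello
# DeviceB.CommandB.What
```

Because the dots stay inside the text pieces, this compares runs of characters rather than whole fields, so it may not honor strict field-by-field priority for every possible input.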
There are many ways to do this. Here is one that doesn't rely on importing any additional modules:
LOS = ['DeviceA.CommandA.1.Hello',
'DeviceA.CommandA.2.Hello',
'DeviceA.CommandA.11.Hello',
'DeviceA.CommandA.3.Hello',
'DeviceA.CommandB.1.Hello',
'DeviceA.CommandB.1.Bye',
'DeviceB.CommandB.What',
'DeviceA.SubdeviceA.CommandB.1.Hello',
'DeviceA.SubdeviceA.CommandB.2.Hello',
'DeviceA.SubdeviceB.CommandA.1.What']
def func(s):
    tokens = s.split('.')
    for i, token in enumerate(tokens):
        try:
            v = int(token)
            return ('.'.join(tokens[0:i]), v)
        except ValueError:
            pass
    return (s, 0)
print(sorted(LOS, key=func))