Concatenate and simplify a list containing number and letter pairs
I have a list of strings that represent numbers. I can't use int because some of the numbers have a letter appended, such as '33a' or '33b'.
['21', '23a', '23b', '23k', '23l', '23x', '25', '33a', '33b', '33c', '33d', '33e', '33f', '34', '34', '35a', '35a' ]
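My goal is to concatenate the numbers into a single string, separated by forward slashes.
If a number is repeated and its appended letters are consecutive in the alphabet, the notation should be simplified as follows: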
['23a'/'23b'] --> '23a-b'
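If a number without an appended letter is repeated, it should be listed only once. The same applies to repeated identical number-and-letter pairs.
For the full example, the desired output looks like this: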
'21/23a-b/23k-l/23x/25/33a-f/34/35a'
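With the code below I can concatenate the numbers and exclude duplicates, but I have not managed to simplify the numbers with letters as in the example above.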
numbers = ['21', '23a', '23b', '23k', '23l', '23x', '25', '33a', '33b', '33c', '33d', '33e', '33f', '34', '34', '35a', '35a' ]
concat_numbers = ""
numbers_set = list(set(numbers))
numbers_set.sort()
for number in numbers_set:
    concat_numbers += number + "/"
print(concat_numbers)
>>> '21/23a/23b/23k/23l/23x/25/33a/33b/33c/33d/33e/33f/34/35a/'
Any hints on how to achieve this in the most Pythonic way?
This can be done by using a defaultdict(list) and rebuilding the output from it, like this:
from collections import defaultdict
from itertools import takewhile

data = ['21', '23a', '23b', '23k', '23l', '23x', '25', '33a', '33b',
        '33c', '33d', '33e', '33f', '34', '34', '35a', '35a']
data.sort()  # easier if letters are sorted - so sort it

d = defaultdict(list)
for n in data:
    # split in number/letters
    number = ''.join(takewhile(str.isdigit, n))
    letter = n[len(number):]
    # add to dict
    d[number].append(letter)
print(d)
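One caveat with data.sort(): it compares strings, so a list that mixes one-, two- and three-digit numbers would not come out in numeric order. The sketch below shows one possible way around that, using a hypothetical helper split_entry (not part of the answer) that assumes every entry is digits followed by optional lowercase letters and can serve as a sort key.

```python
import re

# Hypothetical helper, not from the answer: split "23a" into (23, "a").
# Assumes every entry is digits followed by optional lowercase letters.
_ENTRY = re.compile(r"(\d+)([a-z]*)")

def split_entry(entry):
    match = _ENTRY.fullmatch(entry)
    if match is None:
        raise ValueError(f"unexpected entry: {entry!r}")
    return int(match.group(1)), match.group(2)

sample = ['100', '21', '23b', '23a', '9']
print(sorted(sample))                   # ['100', '21', '23a', '23b', '9']
print(sorted(sample, key=split_entry))  # ['9', '21', '23a', '23b', '100']
```

For the example data in the question every number has two digits, so the plain string sort already gives the right order.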
We now have a dictionary with the "numbers" as keys and all their letters as a list, which needs some further cleanup:
# concat letters that follow each other
def unify(l):
    # assumes l is sorted and holds either only '' or only single letters
    # (ord('') would raise if a number appeared both bare and with a letter)
    u = [""]
    # remember start/end values
    first = l[0]
    last = l[0]
    # iterate the list of letters given
    for letter in l:
        # for same letters or a->b letters, move last forward
        if last == letter or ord(last) == ord(letter)-1:
            last = letter
        else:
            # letter range stopped, add single letter or range to list
            u.append(first if last == first else f"{first}-{last}")
            # start over with new values
            first = letter
            last = letter
    # add if not part of the resulting list already
    if not u[-1].startswith(first):
        # either single letter or range
        u.append(first if last == first else f"{first}-{last}")
    # ignore empty results in u
    return ",".join(w for w in u if w)

# unify letters
for key, value in d.items():
    d[key] = unify(value)
print(d)
Then build the final output:
r = "/".join(f"{ky}{v}" for ky,vl in d.items() for v in vl.split(","))
print(r)
Output:
# collected split key/values
defaultdict(<class 'list'>,
{'21': [''], '23': ['a', 'b', 'k', 'l', 'x'],
'25': [''], '33': ['a', 'b', 'c', 'd', 'e', 'f'],
'34': ['', ''], '35': ['a', 'a']})
# unified values
defaultdict(<class 'list'>,
{'21': '', '23': 'a-b,k-l,x', '25': '',
'33': 'a-f', '34': '', '35': 'a'})
# as string
21/23a-b/23k-l/23x/25/33a-f/34/35a
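The unify step above calls ord() on the suffix, so it would raise if the same number appeared both bare and with a letter (for example '23' and '23a'). As a rough alternative sketch, not the answer's method, the whole task can also be written as one function with itertools.groupby. The name simplify is hypothetical, and the code assumes each entry is digits followed by at most one lowercase letter.

```python
import re
from itertools import groupby

def simplify(entries):
    """Join entries with '/', collapsing runs of consecutive letter suffixes."""
    # Parse into (number, suffix) pairs; the set drops exact duplicates.
    parsed = set()
    for entry in entries:
        match = re.fullmatch(r"(\d+)([a-z]?)", entry)
        if match is None:
            raise ValueError(f"unexpected entry: {entry!r}")
        parsed.add((int(match.group(1)), match.group(2)))

    parts = []
    # Sorting the tuples orders by number first, then by suffix.
    for number, group in groupby(sorted(parsed), key=lambda p: p[0]):
        suffixes = [suffix for _, suffix in group]
        if "" in suffixes:
            parts.append(str(number))  # the bare number is listed on its own
            suffixes.remove("")
        # Consecutive letters share the same (ord - index) value.
        runs = groupby(enumerate(suffixes), key=lambda t: ord(t[1]) - t[0])
        for _, run in runs:
            letters = [suffix for _, suffix in run]
            if len(letters) == 1:
                parts.append(f"{number}{letters[0]}")
            else:
                parts.append(f"{number}{letters[0]}-{letters[-1]}")
    return "/".join(parts)

numbers = ['21', '23a', '23b', '23k', '23l', '23x', '25', '33a', '33b',
           '33c', '33d', '33e', '33f', '34', '34', '35a', '35a']
print(simplify(numbers))  # 21/23a-b/23k-l/23x/25/33a-f/34/35a
```

Because the numbers are converted to int before sorting, this version also keeps numeric order when the entries have different digit counts, and a non-consecutive pair such as '23a' and '23c' stays as two separate items.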