How to add the positions of a string in a list to a new list of doubles?
Example:
r is a text file loaded into a list:
r = ['John is american', 'Bea is french', 'John is american', 'Ray is german', 'John is american', 'Bea is french', 'Bea is french', '', 'Lisa is dutch']
What I want to do is count the number of occurrences of each string and add its positions in r:
finallist = ['string', frequency, [positions in r]]
finallist = [['John is american', 3, [0,2,4]], ['Bea is french', 3, [1,5,6]], ['Ray is german', 1, [3]], ['Lisa is dutch', 1, [7]]]
I know how to count the strings in r:
[[x,r.count(x)] for x in set(r)]
(or by using the Counter class from the collections library)
But how do I add the positions of each string in r to the final list?
Use a dictionary to track the positions of each sentence (building lists); the final length of each list is also the frequency count:
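from collections import defaultdict
# Map each distinct string to the list of indices where it occurs in r.
pos = defaultdict(list)
for i, sentence in enumerate(r):
    pos[sentence].append(i)
# The length of each position list is that string's frequency.
finallist = [[sentence, len(positions), positions] for sentence, positions in pos.items()]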
Demo:
>>> from collections import defaultdict
>>> r = ['John is american', 'Bea is french', 'John is american', 'Ray is german', 'John is american', 'Bea is french', 'Bea is french', '', 'Lisa is dutch']
>>> pos = defaultdict(list)
>>> for i, sentence in enumerate(r):
...     pos[sentence].append(i)
...
>>> [[sentence, len(positions), positions] for sentence, positions in pos.items()]
[['John is american', 3, [0, 2, 4]], ['Bea is french', 3, [1, 5, 6]], ['Ray is german', 1, [3]], ['', 1, [7]], ['Lisa is dutch', 1, [8]]]
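Note that the desired output in the question has no entry for the empty string, while the demo above reports ['', 1, [7]]. If blank lines should be left out, one possible variation is to skip them while collecting positions. This is only a sketch under that assumption; the function name build_positions and the skip_blank flag are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict


def build_positions(lines, skip_blank=True):
    """Return [string, frequency, [positions]] entries for each distinct line."""
    pos = defaultdict(list)
    for i, sentence in enumerate(lines):
        # Assumption: blank or whitespace-only lines are not wanted in the result.
        if skip_blank and not sentence.strip():
            continue
        # Indices always refer to positions in the original list.
        pos[sentence].append(i)
    return [[sentence, len(positions), positions] for sentence, positions in pos.items()]


r = ['John is american', 'Bea is french', 'John is american', 'Ray is german',
     'John is american', 'Bea is french', 'Bea is french', '', 'Lisa is dutch']
finallist = build_positions(r)
```

Because the indices come from enumerate() over the unfiltered list, 'Lisa is dutch' is still reported at position 8, matching the demo rather than the 7 shown in the question.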
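If output order is important and you do not yet have access to Python 3.6 (in beta at the time this answer was written, but its dict implementation preserves insertion order), then you could use an OrderedDict instance and use dict.setdefault() to materialise the initial empty list for each key: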
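from collections import OrderedDict
# OrderedDict remembers the order in which keys were first inserted.
pos = OrderedDict()
for i, sentence in enumerate(r):
    # setdefault() creates the empty list the first time a sentence is seen.
    pos.setdefault(sentence, []).append(i)
finallist = [[sentence, len(positions), positions] for sentence, positions in pos.items()]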
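Since Python 3.7, insertion order is a guaranteed part of the language for the built-in dict, so on those versions a plain dict with the same dict.setdefault() call should give the same ordered result. A minimal sketch, assuming Python 3.7 or newer:

```python
r = ['John is american', 'Bea is french', 'John is american', 'Ray is german',
     'John is american', 'Bea is french', 'Bea is french', '', 'Lisa is dutch']

# A plain dict keeps keys in first-insertion order on Python 3.7+.
pos = {}
for i, sentence in enumerate(r):
    # setdefault() returns the existing list, or stores and returns a new one.
    pos.setdefault(sentence, []).append(i)

finallist = [[sentence, len(positions), positions] for sentence, positions in pos.items()]
```

On older interpreters the OrderedDict version above remains the safer choice, since dict ordering was not guaranteed there.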