Printing a dictionary in Python with n elements per line
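Given a .txt file of about 200,000 lines with a single word on each line, I need to count how many times each letter appears as the first letter of a word. I have a dictionary with keys 'a' - 'z' and the count as each value. I need to print it in the form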
a:10,978 b:7,890 c:12,201 d:9,562 e:6,008
f:7,095 g:5,660 (...)
The dictionary currently prints like this
[('a', 10898), ('b', 9950), ('c', 17045), ('d', 10675), ('e', 7421), ('f', 7138), ('g', 5998), ('h', 6619), ('i', 7128), ('j', 1505), ('k'...
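How do I remove the brackets and parentheses and print only 5 counts per line? Also, after I sorted the dictionary by key, it started printing as key, value instead of key:value.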
def main():
    file_name = open('dictionary.txt', 'r').readlines()
    alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    letter = {}
    for i in alphabet:
        letter[i] = 0
    for n in letter:
        for p in file_name:
            if p.startswith(n):
                letter[n] = letter[n] + 1
    letter = sorted(letter.items())
    print(letter)

main()
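You can use the following. It walks through your sorted list in groups of 5 elements and prints each group in the desired format. For example, given: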
letter = [('a', 10898), ('b', 9950), ('c', 17045), ('d', 10675), ('e', 7421), ('f', 7138), ('g', 5998), ('h', 6619), ('i', 7128), ('j', 1505)]
Replace print(letter) with the following:
for grp in range(0, len(letter), 5):
    print(' '.join(elm[0] + ':' + '{:,}'.format(elm[1]) for elm in letter[grp:grp+5]))
a:10,898 b:9,950 c:17,045 d:10,675 e:7,421
f:7,138 g:5,998 h:6,619 i:7,128 j:1,505
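Since the title asks for n elements per line, the same grouping can be wrapped in a small helper that takes the group size as a parameter. This is only a sketch: the name format_counts and its per_line argument are made up for illustration and are not part of the original answer.

```python
def format_counts(pairs, per_line=5):
    """Return display lines holding `per_line` 'key:count' items each."""
    lines = []
    for start in range(0, len(pairs), per_line):
        group = pairs[start:start + per_line]
        # '{:,}' inserts the thousands separator into each count
        lines.append(" ".join("{}:{:,}".format(key, count) for key, count in group))
    return lines


letter = [('a', 10898), ('b', 9950), ('c', 17045), ('d', 10675), ('e', 7421),
          ('f', 7138), ('g', 5998)]

for line in format_counts(letter, per_line=3):
    print(line)
```

With per_line=3 the last line simply holds the one leftover item, because slicing past the end of a list returns a shorter slice instead of raising an error.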
A collections.Counter dict will get the count of the first letter of every line; then split the sorted items into chunks and join them:
from collections import Counter

with open('dictionary.txt') as f:  # automatically closes your file
    # iterate once over the file object as opposed to storing 200k lines
    # and 26 iterations over the lines
    c = Counter(line[0] for line in f)

srt = sorted(c.items())
# create five element chunks from the sorted items
chunks = (srt[i:i+5] for i in range(0, len(srt), 5))
for chk in chunks:
    # format and join; k, v avoid reusing the name c inside the generator
    print(" ".join("{}:{:,}".format(k, v) for k, v in chk))
If the file might contain anything other than the letters a-z, filter with isalpha in the generator expression:
c = Counter(line[0] for line in f if line[0].isalpha())
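To see what the isalpha filter drops, here is a small sketch that uses io.StringIO as a stand-in for the real file. The sample words are invented, and the .lower() call is an extra assumption (that 'A' and 'a' should share one count), not something the original code does.

```python
import io
from collections import Counter

# Stand-in for open('dictionary.txt'); the contents are made up for illustration.
sample = io.StringIO("apple\nAvocado\nbanana\n42nd\n\n-dash\ncherry\n")

# Lines starting with a digit, a newline (blank line) or punctuation are skipped.
counts = Counter(line[0].lower() for line in sample if line[0].isalpha())

print(sorted(counts.items()))  # [('a', 2), ('b', 1), ('c', 1)]
```

A blank line still has the newline as its first character, so line[0] does not raise an IndexError there; isalpha just rejects it.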
The format specifier for a thousands separator (the ',' in '{:,}') was added in Python 2.7.
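As a quick illustration of that specifier on its own, here are a few equivalent spellings. The value is just one of the counts from the question, and the variable names are arbitrary.

```python
count = 10898

with_method = "{:,}".format(count)    # str.format with the ',' option
with_builtin = format(count, ",")     # the built-in format() takes the same spec
padded = "{:>8,}".format(count)       # ',' combines with width and alignment

print(with_method)   # 10,898
print(with_builtin)  # 10,898
print(repr(padded))  # '  10,898'
```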