Counting words that start with each a-z character using a dictionary in Python

Question

Load the dictionary in Python as a list from the following URL: "https://cfstatic.org/static/words.txt". Using this word list, create a Python dictionary (or an array if you are using PHP) with the following properties:

key: a letter of the alphabet [a-z]
value: number of words in the dictionary starting with that letter

The result I want is:

a : number of words starting with a
b : number of words starting with b
[...]
z : number of words starting with z

Here is what I have done so far:

import urllib2  # Python 2 library that handles URL access
try:
    # urlopen returns a file-like object that can be iterated line by line
    input_file = urllib2.urlopen('https://cfstatic.org/static/words.txt')
    myNames = []
    for line in input_file:
        myNames.append(line.strip())  # strip the trailing newline
    print myNames                     # print the file contents as a list
except urllib2.URLError as e:         # handle the case where the URL cannot be opened
    print "Error Message : %s" % e
else:
    print "File reading operation successful!!!"

Answer 1

Change your for loop to build a dictionary instead of just a list. Something like this:

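alphabet = {}
for line in input_file:
    line = line.strip()
    if not line:  # skip blank lines; line[0] would raise IndexError
        continue
    starts_with = line[0]
    if starts_with in alphabet:
        alphabet[starts_with].append(line)
    else:
        alphabet[starts_with] = [line]
# replace each list of words with its length to get the counts
for key in alphabet:
    alphabet[key] = len(alphabet[key])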

As one of the other answers suggests, you can also do it this way (there is no need to store the words themselves):

alphabet = {}
for line in input_file:
    line = line.strip()
    if not line:  # skip blank lines; line[0] would raise IndexError
        continue
    starts_with = line[0]
    if starts_with in alphabet:
        alphabet[starts_with] += 1
    else:
        alphabet[starts_with] = 1

print(alphabet)
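The if/else counting pattern above can be shortened with collections.defaultdict, which supplies a starting value of 0 for any key seen for the first time. The following is a minimal sketch written for Python 3, not the answerer's code; the function name count_first_letters and the sample data are made up for illustration. It assumes the lines are text strings (in Python 3, lines read from urlopen are bytes and would need decoding first).

```python
from collections import defaultdict


def count_first_letters(lines):
    """Count how many non-empty lines start with each character."""
    counts = defaultdict(int)  # missing keys start at 0
    for line in lines:
        word = line.strip()
        if not word:  # skip blank lines, which have no first character
            continue
        counts[word[0]] += 1
    return dict(counts)  # convert back to a plain dict


# hypothetical sample standing in for the lines of the word file
sample = ["apple\n", "avocado\n", "\n", "banana\n"]
print(count_first_letters(sample))
```

The same effect is available without an import by writing counts[key] = counts.get(key, 0) + 1 on a plain dict.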

Answer 2

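I would build a list of the letters, then iterate over your list of words and either add each first letter to the dictionary or increment its counter, as follows: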

letters = [chr(l) for l in range(97, 123)]  # 'a' through 'z'
d = {}
for word in myNames:
    # skip empty lines and words that do not start with a-z
    if word and word[0] in letters:
        d[word[0]] = d.get(word[0], 0) + 1

Hope this helps. If you need an explanation, let me know.
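A list of letters like the one above is also handy for reporting, because iterating over it visits every letter from a to z in order, including letters that no word starts with. The sketch below (Python 3) shows one way to produce the "letter : count" layout asked for in the question; format_counts and the example counts are hypothetical names and data introduced only for illustration.

```python
def format_counts(counts):
    """Return one 'letter : count' line per letter a-z, in order."""
    letters = [chr(code) for code in range(97, 123)]  # 'a' through 'z'
    # counts.get(letter, 0) reports 0 for letters that never appeared
    return ["%s : %d" % (letter, counts.get(letter, 0)) for letter in letters]


# hypothetical counts, shaped like the dictionary built by the loop above
example = {"a": 2, "n": 1, "m": 1, "f": 1}
for line in format_counts(example):
    print(line)
```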

Answer 3

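Counter from the collections module (https://docs.python.org/2/library/collections.html#collections.Counter) is made for exactly this.

Convert the list of words into a list of first characters (the map(...) call below), then feed that iterable directly into a collections.Counter object: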

>>> import collections
>>> words = ["aap", "noot", "mies", "foo", "appel"]
>>> collections.Counter(map(lambda x: x[0], words))
Counter({'a': 2, 'f': 1, 'm': 1, 'n': 1})
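As the output above shows, a Counter only holds the letters it has actually seen. To get the a-z dictionary from the question, the Counter can be projected onto the full alphabet, since a Counter returns 0 for missing keys. This is a sketch for Python 3 under a few assumptions of mine: the helper name first_letter_counts is invented, uppercase initials are folded to lowercase, and empty strings are skipped.

```python
from collections import Counter
from string import ascii_lowercase


def first_letter_counts(words):
    """Map every letter a-z to the number of words starting with it."""
    # generator expression instead of map/lambda; empty strings are skipped
    counter = Counter(word[0].lower() for word in words if word)
    # a Counter yields 0 for keys it has not seen, so every letter gets a value
    return {letter: counter[letter] for letter in ascii_lowercase}


words = ["aap", "noot", "Mies", "foo", "appel", ""]
result = first_letter_counts(words)
print(result["a"], result["m"], result["z"])
```

Words whose first character is not a letter (digits, punctuation) are counted by the Counter but left out of the final dictionary.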

Answer 4

The simplest thing I can think of is:

from string import ascii_lowercase
output_dict = dict.fromkeys(ascii_lowercase, 0)  # every letter starts at 0
text = " this is a text message"  # sample text
for word in text.split():
    ch = word[0]  # only the first character of each word counts
    if ch in ascii_lowercase:
        output_dict[ch] += 1

for character, count in output_dict.items():
    if count:
        print("%s : count is %s" % (character, count))

If you don't want to use the string module, or want to restrict the set of characters yourself, you can write:

alphabets_lower = "abcdefghijklmnopqrstuvwxyz"
output_dict = dict.fromkeys(alphabets_lower, 0)
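Putting the pre-seeded dictionary together with the download step from the question might look like the sketch below. It targets Python 3, where urllib2 has become urllib.request and lines read from the response are bytes, so they are decoded first. The function names, the UTF-8 encoding, and the lowercasing of words are my assumptions, not something the question specifies.

```python
import io
from string import ascii_lowercase
from urllib.request import urlopen


def count_from_stream(stream, encoding="utf-8"):
    """Count first letters from a binary file-like object, one word per line."""
    counts = dict.fromkeys(ascii_lowercase, 0)  # every letter starts at 0
    for raw in stream:
        word = raw.decode(encoding, errors="replace").strip().lower()
        # skip blank lines and words that do not start with a-z
        if word and word[0] in counts:
            counts[word[0]] += 1
    return counts


def count_from_url(url):
    """Fetch a word list over HTTP(S) and count first letters."""
    with urlopen(url) as response:
        return count_from_stream(response)


# offline demonstration with an in-memory stand-in for the downloaded file
fake_file = io.BytesIO(b"Apple\nant\n\nbee\n42\n")
demo = count_from_stream(fake_file)
print(demo["a"], demo["b"], demo["z"])
```

Calling count_from_url("https://cfstatic.org/static/words.txt") would apply the same counting to the word list from the question; that call needs network access and is not exercised here.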

Have fun :-)
