defaultdict counts spaces in Python

Question

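Why does defaultdict count the spaces in my string?

I am using a defaultdict to count how many times each character appears in a string of words. However, my code also counts the spaces between the words. How can I count only the characters in the words and ignore the spaces?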

from collections import defaultdict

def count_var(word):
    d = defaultdict(int)
    for val in word:
        d[val]+=1
    return d

ct = count_var('big data examiner')


print(ct)

defaultdict(<type 'int'>, {'a': 3, ' ': 2, 'b': 1, 'e': 2, 'd': 1, 'g': 1, 'i': 2, 'm': 1, 'n': 1, 'r': 1, 't': 1, 'x': 1})

Answer 1

Change this line

ct = count_var('big data examiner')

to

ct = count_var('big data examiner'.split())

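This counts words instead of characters. As for why spaces are counted: a space is a valid character, just like any letter or digit, so it is counted too.

Also note that collections.Counter exists and is better suited to this problem, especially since you are already importing from collections.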
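To illustrate why split() changes what is counted, here is a minimal sketch that reuses count_var from the question (the parameter is renamed to items purely for illustration, and Python 3 is assumed). Iterating over a string yields single characters, while iterating over a list yields its elements:

```python
from collections import defaultdict

def count_var(items):
    # Count each value produced by iterating over `items`.
    d = defaultdict(int)
    for val in items:
        d[val] += 1
    return d

text = 'big data examiner'
char_counts = count_var(text)          # iterates character by character
word_counts = count_var(text.split())  # iterates over ['big', 'data', 'examiner']

print(dict(char_counts))
print(dict(word_counts))
```

The space shows up only in char_counts, because the list returned by split() never contains the separators.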

Edit

Using collections.Counter follows the same idea as above.

This counts characters:

>>> from collections import Counter
>>> Counter('big data examiner')
Counter({'a': 3, 'i': 2, 'e': 2, ' ': 2, 't': 1, 'b': 1, 'n': 1, 'd': 1, 'm': 1, 'g': 1, 'x': 1, 'r': 1})

This counts words:

>>> Counter('big data examiner'.split())
Counter({'big': 1, 'data': 1, 'examiner': 1})

Edit #2: counting all non-space characters

You can use str.replace(' ', ''):

>>> from collections import Counter
>>> Counter('big data examiner'.replace(' ', ''))
Counter({'a': 3, 'i': 2, 'e': 2, 'x': 1, 'b': 1, 'r': 1, 'g': 1, 'n': 1, 't': 1, 'm': 1, 'd': 1})
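Note that str.replace(' ', '') removes only the plain space character. If the input might also contain tabs or newlines (an assumption that goes beyond the original question), one possible variation is to split on any whitespace and join the pieces back together. The sample string below is made up for illustration:

```python
from collections import Counter

text = 'big\tdata\nexaminer  now'

# str.split() with no arguments splits on any run of whitespace,
# so joining the pieces drops spaces, tabs and newlines alike.
counts = Counter(''.join(text.split()))

print(counts)
```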

Answer 2

To answer the specific question:

why does default dict count for the number of empty spaces in my list?

Because a space is still a character. For example:

>>> list('big data examiner')
['b', 'i', 'g', ' ', 'd', 'a', 't', 'a', ' ', 'e', 'x', 'a', 'm', 'i', 'n', 'e', 'r']
               # ^                        ^

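As currently written, your code counts every character, including spaces. If you want to exclude spaces from the count, you need to do so explicitly: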

from collections import defaultdict

def count_var(word):
    d = defaultdict(int)
    for val in word:
        if val != ' ':  # exclude spaces
            d[val] += 1
    return d

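Alternatively, rather than excluding ' ' while counting, simply don't use that key in whatever you do with d afterwards.

Note that collections also provides Counter, which can simplify your code significantly: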
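As a rough sketch of that second approach, the counting function can stay exactly as in the question and the space key can be dealt with afterwards. The name letters_only is made up for illustration:

```python
from collections import defaultdict

def count_var(word):
    d = defaultdict(int)
    for val in word:
        d[val] += 1
    return d

ct = count_var('big data examiner')

# Option 1: skip the space key when reading the counts.
letters_only = {ch: n for ch, n in ct.items() if ch != ' '}

# Option 2: remove the key in place; the default of None avoids a
# KeyError when the input contained no spaces at all.
ct.pop(' ', None)

print(letters_only)
```

Option 1 leaves ct untouched, which may be preferable if the full counts are still needed elsewhere.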

>>> from collections import Counter
>>> Counter(char for char in 'big data examiner' if char != ' ')
Counter({'a': 3, 'e': 2, 'i': 2, 'b': 1, 'd': 1, 'g': 1, 'm': 1, 'n': 1, 'r': 1, 't': 1, 'x': 1})
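Beyond shorter code, Counter comes with a few conveniences that a plain defaultdict lacks. A small sketch, assuming Python 3:

```python
from collections import Counter

counts = Counter(char for char in 'big data examiner' if char != ' ')

print(counts.most_common(1))  # the single most frequent character
print(counts[' '])            # missing keys read as 0 and are not added
```

Unlike defaultdict(int), looking up a missing key in a Counter does not insert it, so checking counts[' '] leaves the result unchanged.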
