Unable to add value to Python dictionary and write to a file

Question

I'm trying to check whether a word already exists in a dict. If it doesn't, I add the key and a value to the dict:

import io

dict = {}
with io.open("fileo.txt", "r", encoding="utf-8") as filei:
    for Word in filei:
        Word = Word.split()
        if not Word in dict:
            dict[Word] = 1
        elif Word in dict:
            dict[Word] = dict[Word] + 1
    print [unicode(i) for i in dict.items()]

It throws the following error:

if not Word in dict:
TypeError: unhashable type: 'list'

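If I remove the Word = Word.split() line it works, but then the whole line is treated as a single key. That doesn't help me. I want to count every individual word it sees.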

Answer 1

Word = Word.split() turns Word into a list, and you cannot use a list (or any other unhashable type) as a dictionary key.
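To illustrate the rule, here is a minimal sketch (the sample line is made up, and the snippet is written to run on Python 3). Using the list itself as a key fails, while each string inside the list is hashable and works fine:

```python
line = "the cat saw the dog"   # hypothetical sample line
words = line.split()           # a list of str

counts = {}
try:
    counts[words] = 1          # a list is mutable, hence unhashable
except TypeError as exc:
    error_message = str(exc)   # the message mentions the unhashable list type

# Each individual word is a str, which is hashable.
for word in words:
    counts[word] = counts.get(word, 0) + 1
```

So the fix is to loop over the words and use each one as a key, not the list as a whole.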

You should consider using collections.Counter, but here is your existing code with only slight modifications:

import io

with io.open("fileo.txt", "r", encoding="utf-8") as filei:
    d = dict()
    for line in filei:
        words = line.strip().split()
        for word in words:
            if word in d:
                d[word] += 1
            else:
                d[word] = 1
    print d
    print [unicode(i) for i in d.items()]

Answer 2

Since you have already split the line into words, you can use a for loop to check and count each one:

words = Word.split()
for word in words:
    if not word in dict:
        ...

But since you are only counting words, I'd suggest using Counter instead:

import io
from collections import Counter

with io.open("fileo.txt", "r", encoding="utf-8") as f:
    word_count = Counter()
    for line in f:
        words = line.strip().split()
        word_count.update(words)
    print [unicode(word) for word in word_count.most_common(100)]

This counts each unique word and prints the 100 most common words at the end.

It can be written more briefly (provided your file is not too large, since the whole file is read at once):

with io.open("fileo.txt", "r", encoding="utf-8") as f:
    word_count = Counter(word.strip() for word in f.read().split())
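As a rough illustration of what Counter gives you, here is a small sketch that uses an in-memory string in place of the file (the text is invented for the example). Note that split() with no argument already splits on any whitespace, including newlines, so no extra stripping is needed here:

```python
from collections import Counter

text = u"the cat saw the dog\nthe dog ran\n"  # hypothetical file contents

word_count = Counter(text.split())   # one count per distinct word
top_two = word_count.most_common(2)  # (word, count) pairs, highest count first
missing = word_count[u"bird"]        # a Counter returns 0 for unseen keys
```

With this sample text, top_two is [("the", 3), ("dog", 2)].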

Answer 3

If you don't want to import and use a defaultdict or a Counter, use dict.setdefault and avoid the if/else. Use the word strings as keys:

import io

dct = {}
with io.open("fileo.txt", "r", encoding="utf-8") as filei:
    for line in filei:
        words = line.split()
        for word in words:
            word = word.lower()
            # if the key does not exist, add it with a default value of 0
            dct.setdefault(word, 0)
            dct[word] += 1  # increment the count

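Use lowercase names for variables, and don't use dict as a variable name, because it shadows the Python built-in dict. I'm assuming you consider Word and word to be the same word, so you need to call lower on each word to catch any case where a word contains uppercase letters.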
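For comparison, if you don't mind the import, the same counting loop could be sketched with collections.defaultdict. The list of lines below is a made-up stand-in for the file:

```python
from collections import defaultdict

lines = ["The cat saw the dog", "the Dog ran"]  # hypothetical stand-in for the file

dct = defaultdict(int)      # missing keys start at int() == 0
for line in lines:
    for word in line.split():
        dct[word.lower()] += 1  # "The" and "the" are counted together
```

Here dct["the"] ends up as 3 and dct["dog"] as 2, with no setdefault call or if/else needed.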

If you want to store the dictionary in a file, use pickle or json:

import pickle
with open("count.pkl", "wb") as f:
    pickle.dump(dct, f)

Loading it back is simple:

with open("count.pkl", "rb") as f:
    dct = pickle.load(f)

Use json if you want human-readable output in the file:

import json
with open("count.json", "w") as f:
    json.dump(dct, f)


with open("count.json") as f:
    dct = json.load(f)
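Since the input file is UTF-8, the words may contain non-ASCII characters. By default json.dump escapes those as \uXXXX sequences. The sketch below assumes Python 3, writes to a temporary directory so that it is self-contained, and uses invented sample counts. It shows one way to keep the output readable and then load it back:

```python
import io
import json
import os
import tempfile

dct = {u"caf\u00e9": 2, u"the": 3}   # hypothetical counts with a non-ASCII word

path = os.path.join(tempfile.mkdtemp(), "count.json")

# ensure_ascii=False keeps non-ASCII words readable instead of \uXXXX escapes
with io.open(path, "w", encoding="utf-8") as f:
    json.dump(dct, f, ensure_ascii=False, indent=2, sort_keys=True)

with io.open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)            # round-trips back to an equal dict
```

Keep in mind that JSON object keys are always strings, which is fine here because the keys are words.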
