Unable to add a value to a Python dictionary and write it to a file
I am trying to check whether a word exists in a dict. If it does not, I add the key and value to the dict:
    dict = {}
    with io.open("fileo.txt", "r", encoding="utf-8") as filei:
        for Word in filei:
            Word = Word.split()
            if not Word in dict:
                dict[Word] = 1
            elif Word in dict:
                dict[Word] = dict[Word] + 1
    print [unicode(i) for i in dict.items()]
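It throws the following error:
        if not Word in dict:
    TypeError: unhashable type: 'list'
If I remove the Word = Word.split() part, it works, but then it treats the whole line as one key. That does not help me. I want to count each individual word.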
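Word = Word.split() makes Word a list, and you cannot use a list (or any other unhashable type) as a dictionary key.
You should consider using collections.Counter, but here is a slight modification of your existing code: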
    import io

    with io.open("fileo.txt", "r", encoding="utf-8") as filei:
        d = dict()
        for line in filei:
            # split the line into individual words (strings are hashable)
            words = line.strip().split()
            for word in words:
                if word in d:
                    d[word] += 1
                else:
                    d[word] = 1
    print d
    print [unicode(i) for i in d.items()]
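To illustrate the hashability point, here is a minimal sketch. It assumes Python 3 (the code above is Python 2), and the sample line is made up rather than taken from fileo.txt. It shows that the list is rejected as a key while each string inside it is accepted:

```python
# Illustrative only: the sample line is made up, not read from the question's file.
line = "the cat and the hat"
words = line.split()          # a list of strings

counts = {}
error_message = None
try:
    counts[words] = 1         # a list is mutable, so it cannot be hashed
except TypeError as exc:
    error_message = str(exc)  # same kind of error as in the question

for word in words:            # each individual string is hashable
    counts[word] = counts.get(word, 0) + 1

print(error_message)
print(counts)
```

A tuple of the words, tuple(words), would be accepted as a key, but it would still count whole lines rather than single words.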
Since you have already split the line into words, you can use a for loop to check and count each one:
    words = Word.split()
    for word in words:
        if not word in dict:
            ...
But since you are only counting words, I suggest using Counter instead:
    import io
    from collections import Counter

    with io.open("fileo.txt", "r", encoding="utf-8") as f:
        word_count = Counter()
        for line in f:
            words = line.strip().split()
            word_count.update(words)
    print [unicode(word) for word in word_count.most_common(100)]
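This counts every distinct word and prints the 100 most common ones at the end.
It can be written more briefly (provided your file is not too large, because the whole file is read at once):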
    with io.open("fileo.txt", "r", encoding="utf-8") as f:
        word_count = Counter(word.strip() for word in f.read().split())
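As a rough, self-contained sketch of the Counter approach, the following assumes Python 3 and uses an in-memory io.StringIO with made-up text as a stand-in for fileo.txt, so it can be run without any file on disk:

```python
import io
from collections import Counter

# Hypothetical sample text standing in for the contents of fileo.txt.
sample = io.StringIO("one two two\nthree three three\n")

word_count = Counter()
for line in sample:
    # update() with a list adds one count per list item
    word_count.update(line.split())

# most_common(n) returns (word, count) pairs, highest count first
top_two = word_count.most_common(2)
print(top_two)
```

Passing the list from split() matters here: calling update() with the raw line string would count individual characters instead of words.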
If you do not want to import and use a defaultdict or Counter dict, use dict.setdefault and avoid the if/else. Use the word strings as keys:
    import io

    dct = {}
    with io.open("fileo.txt", "r", encoding="utf-8") as filei:
        for line in filei:
            words = line.split()
            for word in words:
                word = word.lower()
                # if key does not exist add it and set a default value of 0
                dct.setdefault(word, 0)
                dct[word] += 1  # increment the count
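Use lowercase names for variables, and do not use dict as a variable name, because it shadows the built-in Python dict. I assume you consider Word and word to be the same word, so you need to call lower on each word to catch any case where it contains uppercase letters.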
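For comparison, here is a minimal sketch of the defaultdict alternative mentioned above. It assumes Python 3, and the input lines are made up to stand in for lines read from the file:

```python
from collections import defaultdict

# Hypothetical input lines; in the answer these would come from the open file.
lines = ["The cat", "the Hat the"]

dct = defaultdict(int)        # a missing key starts at int(), which is 0
for line in lines:
    for word in line.split():
        dct[word.lower()] += 1   # lower() folds "The" and "the" together

print(dict(dct))
```

The trade-off is one import in exchange for dropping the explicit setdefault call; the resulting counts are the same.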
If you want to store the dict in a file, use pickle or json:
    import pickle
    with open("count.pkl", "wb") as f:
        pickle.dump(dct, f)
To load it again, simply:
    with open("count.pkl", "rb") as f:
        dct = pickle.load(f)
Use json if you want human-readable output in the file:
    import json
    # write the counts out
    with open("count.json", "w") as f:
        json.dump(dct, f)
    # read them back in
    with open("count.json") as f:
        dct = json.load(f)
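Because the input file is UTF-8, the words may contain non-ASCII characters, which json.dump escapes by default. The sketch below assumes Python 3, uses made-up counts, and writes to a temporary directory (a hypothetical location chosen only so the example is self-contained). It shows one way to keep such words readable in the output:

```python
import io
import json
import os
import tempfile

# Made-up counts, including one non-ASCII word ("cafe" with an acute accent).
dct = {"caf\u00e9": 2, "the": 5}

# Hypothetical location: a fresh temporary directory instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), "count.json")

with io.open(path, "w", encoding="utf-8") as f:
    # ensure_ascii=False writes the accented character itself, not a \u escape
    json.dump(dct, f, ensure_ascii=False, indent=2, sort_keys=True)

with io.open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded == dct)
```

JSON object keys must be strings, which suits a word-count dict; a dict with other key types would not round-trip unchanged.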