Python urllib KeyError

Question

I want to count all the words from a specific URL:

import urllib.request
url = 'http://www.py4inf.com/code/romeo.txt'
fhand = urllib.request.Request(url)
resp = urllib.request.urlopen(fhand)
counts = dict()
for line in resp:
    words = line.split()
    print (words)
    for word in words:
        counts[word] = counts[word] +1
print (counts)

I get an error when I run this:

[b'But', b'soft', b'what', b'light', b'through', b'yonder', b'window', b'breaks']
Traceback (most recent call last):
  File "C:/Python/Hello/Exercise.py", line 13, in <module>
    counts[word] = counts[word] +1
KeyError: b'But'

Why is b' prepended to every word on every line? If I use the same code to read from a file, it works fine.

Answer 1

You are trying to increment the value for a key that does not exist yet. For example:

counts = {}
counts["test"] = counts["test"] + 1 # counts["test"] does not exist...

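Because "test" is not in counts yet, reading counts["test"] raises a KeyError.

The simple fix is to check whether the key is already there. If it is not, set its value to 1: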

import urllib.request
url = 'http://www.py4inf.com/code/romeo.txt'
fhand = urllib.request.Request(url)
resp = urllib.request.urlopen(fhand)
counts = dict()
for line in resp:
    words = line.split()
    print (words)
    for word in words:
        counts[word] = counts[word]+1 if word in counts else 1
print (counts)
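
On the b' prefix from the question: urlopen yields each line as bytes, whereas a file opened in text mode yields str, so the words print as b'But' and so on. That is separate from the KeyError. A minimal sketch of decoding before counting, assuming the text is UTF-8 and using a hard-coded stand-in (the hypothetical sample_lines) instead of the real network response:

```python
# Stand-in for the lines yielded by urllib.request.urlopen(...).
# Hypothetical sample data; the real response also yields bytes objects.
sample_lines = [
    b"But soft what light through yonder window breaks\n",
    b"It is the east and Juliet is the sun\n",
]

counts = {}
for line in sample_lines:
    # Assumption: the response body is UTF-8 encoded text.
    words = line.decode("utf-8").split()
    for word in words:
        # Same membership check as in the answer above.
        counts[word] = counts[word] + 1 if word in counts else 1

print(counts)
```

With the decode step the keys are plain str values such as 'But' rather than b'But'.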

Answer 2

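I got it. Although I declared counts as a dictionary, I was adding to it as if it were a list.

For a dictionary, I tried

counts[word] = counts.get(word, 0) + 1

and it worked.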
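
A self-contained sketch of the dict.get approach, in case it helps to see it run without a network call. The function name count_words and the sample sentence are made up for illustration:

```python
def count_words(lines):
    """Count whitespace-separated words across an iterable of str lines."""
    counts = {}
    for line in lines:
        for word in line.split():
            # get() returns 0 when the word has not been seen yet,
            # so the first occurrence is stored as 0 + 1.
            counts[word] = counts.get(word, 0) + 1
    return counts

result = count_words(["the east and the sun"])
print(result)
```

Unlike indexing with counts[word], get never raises KeyError for a missing key; it returns the supplied default instead.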

Answer 3

It seems like every day there is a question whose answer is defaultdict.

import urllib.request
from collections import defaultdict

url = 'http://www.py4inf.com/code/romeo.txt'
fhand = urllib.request.Request(url)
resp = urllib.request.urlopen(fhand)
counts = defaultdict(int) # pass a default type in, int() == 0
for line in resp:
    words = line.split()
    print (words)
    for word in words:
        counts[word] = counts[word] +1
print (counts)

With a regular dictionary, counts[word] is not defined yet and raises a KeyError. A simple implementation of defaultdict might look something like this:

class defaultdict(dict):
    def __init__(self, default_type, *args, **kwargs):
        # this allows for the regular dictionary constructor to be used
        dict.__init__(self, *args, **kwargs) 
        self._type = default_type

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            dict.__setitem__(self, key, self._type())
            return dict.__getitem__(self, key)

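I'm sure there is a better way to do this, but it should work roughly the same. In counts[word] = counts[word] + 1, the right-hand side is evaluated first and goes through the new definition of __getitem__, so the key already holds its default value by the time __setitem__ stores the result.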
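
One possibly tidier variant relies on the __missing__ hook, which dict.__getitem__ calls on subclasses when a key is absent. This is only a sketch of the idea, not the actual collections.defaultdict source; the class name DefaultDictSketch is made up here:

```python
class DefaultDictSketch(dict):
    """Illustrative stand-in for collections.defaultdict (hypothetical name)."""

    def __init__(self, default_factory, *args, **kwargs):
        # Remaining arguments go to the regular dict constructor.
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is not present.
        value = self.default_factory()
        self[key] = value
        return value


counts = DefaultDictSketch(int)
for word in "the east and the sun".split():
    counts[word] += 1
print(counts)
```

Because only __missing__ is overridden, membership tests such as word in counts do not create entries; a default is inserted only when a missing key is read by indexing.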
