Why isn't WSD matching WordNet?

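I'm working with WSD and WordNet, and I'm trying to figure out why they give different results. My understanding is that, with the code below, the disambiguate call assigns the most likely synset to each word: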

from pywsd import disambiguate
from nltk.corpus import wordnet as wn

mysent = 'I went to have a drink in a bar'

wsd = disambiguate(mysent)
# print each (token, synset) pair on its own line
for pair in wsd:
    print(pair)

This gives me the following output:

('I', None)
('went', Synset('travel.v.01'))
('to', None)
('have', None)
('a', None)
('drink', Synset('swallow.n.02'))
('in', None)
('a', None)
('bar', Synset('barroom.n.01'))

What I find odd is that the word 'I' comes back as None, because looking the same word up in WordNet gives me four possible senses. Surely 'I' should map to at least one of them?

wn.synsets('I')

Out:
[Synset('iodine.n.01'), Synset('one.n.01'), Synset('i.n.03'), Synset('one.s.01')]

In your sentence above, 'I' is a pronoun. The WordNet FAQ states:

Q: Why is WordNet missing: of, an, the, and, about, above, because, etc.

A: WordNet only contains "open-class words": nouns, verbs, adjectives, and adverbs. Thus, excluded words include determiners, prepositions, pronouns, conjunctions, and particles.
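
To make the FAQ's point concrete, here is a minimal sketch of the kind of part-of-speech filter that would produce None for 'I'. It is only an illustration, not pywsd's actual code: penn_to_wordnet_pos is a hypothetical helper, and the Penn Treebank tags in the list are written by hand rather than produced by a tagger. The one-letter codes 'n', 'v', 'a' and 'r' are the POS labels WordNet uses for nouns, verbs, adjectives and adverbs.

```python
# Hypothetical helper (not part of pywsd or NLTK): map a Penn Treebank tag
# to WordNet's one-letter POS code, or None for closed-class tags.
def penn_to_wordnet_pos(tag):
    if tag.startswith('NN'):
        return 'n'  # noun
    if tag.startswith('VB'):
        return 'v'  # verb
    if tag.startswith('JJ'):
        return 'a'  # adjective
    if tag.startswith('RB'):
        return 'r'  # adverb
    return None     # pronouns, determiners, prepositions, ...

# Hand-written tags for the example sentence (an assumption for illustration).
tagged = [('I', 'PRP'), ('went', 'VBD'), ('to', 'TO'), ('have', 'VB'),
          ('a', 'DT'), ('drink', 'NN'), ('in', 'IN'), ('a', 'DT'), ('bar', 'NN')]

# Pair every token with the WordNet POS it could be looked up under.
lookup_pos = [(word, penn_to_wordnet_pos(tag)) for word, tag in tagged]
for pair in lookup_pos:
    print(pair)
```

In this sketch 'I' is tagged PRP, so it gets no WordNet POS and no lookup would happen. The synsets returned by a direct lookup of 'I' are noun and adjective senses such as iodine.n.01, none of which is the pronoun. pywsd's own filtering may differ from this sketch; your output also shows None for 'have', which a pure POS filter like this one would not exclude.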