WordNet lemmatizer in NLTK is not working for adverbs

from nltk.stem import WordNetLemmatizer
x = WordNetLemmatizer()   
x.lemmatize("angrily", pos='r')
Out[41]: 'angrily'

Here is the reference documentation for the POS tags in NLTK's wordnet module: http://www.nltk.org/_modules/nltk/corpus/reader/wordnet.html

I may be missing something basic. Please let me know.

Try:

>>> from nltk.corpus import wordnet as wn
>>> wn.synset('angrily.r.1').lemmas()[0].pertainyms()[0].name()
u'angry'

For more details, see "Getting adjective from an adverb in nltk or other NLP library".
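The one-liner above takes the first lemma of one specific synset. A slightly more general version might look like the sketch below. The helper name `adverb_to_adjectives` is hypothetical (it is not part of NLTK), and it assumes the object passed in behaves like `nltk.corpus.wordnet`, that is, it offers `synsets(word, pos=...)`, `Synset.lemmas()`, `Lemma.name()` and `Lemma.pertainyms()`.

```python
def adverb_to_adjectives(wordnet, adverb):
    """Collect adjectives that WordNet links to `adverb` via pertainym pointers.

    `wordnet` is expected to be NLTK's reader (nltk.corpus.wordnet).
    This helper is a hypothetical illustration, not an NLTK function.
    """
    adjectives = []
    # pos='r' restricts the lookup to adverb synsets.
    for synset in wordnet.synsets(adverb, pos='r'):
        for lemma in synset.lemmas():
            # A synset may hold several adverbs; keep only the one asked about.
            if lemma.name().lower() != adverb.lower():
                continue
            # Pertainyms are stored on the Lemma, not on the Synset.
            for pertainym in lemma.pertainyms():
                if pertainym.name() not in adjectives:
                    adjectives.append(pertainym.name())
    return adjectives


# Usage, assuming NLTK and its WordNet corpus are installed:
#   from nltk.corpus import wordnet as wn
#   adverb_to_adjectives(wn, "angrily")
```

An empty list means WordNet records no pertainym pointer for that adverb, so the caller still has to decide what to fall back to (for example, keeping the adverb unchanged, as the lemmatizer does).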

The question is: why do you have to go through the lemmas to get the pertainyms?

>>> wn.synset('angrily.r.1').pertainyms()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Synset' object has no attribute 'pertainyms'

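That is because WordNet treats pertainymy as a lexical link between words of different word classes rather than as a relation between synsets; see http://wordnet.princeton.edu/man/wngloss.7WN.html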

Pertainyms are relational adjectives and do not follow the structure just described. Pertainyms do not have antonyms; the synset for a pertainym most often contains only one word or collocation and a lexical pointer to the noun that the adjective is "pertaining to". Participial adjectives have lexical pointers to the verbs that they are derived from.

However, if we look at the Java interface, getting the pertainyms of a synset is as simple as AdjectiveSynset.getPertainyms() (http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/AdjectiveSynset.html).

So I guess it depends on who writes the interface and what view they take of the adjective-adverb relation.

Personally, I would expect pertainyms to be attached directly to the synset rather than to the lemma.
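If a synset-level view like the Java one is wanted in NLTK, it can be approximated by merging the lemma-level pointers. The sketch below is only an illustration: `synset_pertainyms` is a hypothetical helper, not an NLTK method, and it assumes its argument offers `lemmas()` with each lemma offering `pertainyms()`, as NLTK's `Synset` and `Lemma` objects do.

```python
def synset_pertainyms(synset):
    """Return the pertainyms of every lemma in `synset`, without duplicates.

    Hypothetical convenience wrapper; NLTK itself only exposes
    pertainyms() on Lemma objects.
    """
    merged = []
    for lemma in synset.lemmas():
        for pertainym in lemma.pertainyms():
            if pertainym not in merged:
                merged.append(pertainym)
    return merged


# Usage, assuming NLTK and its WordNet corpus are installed:
#   from nltk.corpus import wordnet as wn
#   [p.name() for p in synset_pertainyms(wn.synset('angrily.r.1'))]
```

The wrapper loses the information about which word in the synset each pointer came from, which is the detail WordNet's lexical-relation design keeps.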