How do I extract the offset of a WordNet synset given a synset in Python NLTK?

Question

A sense offset in WordNet is an 8-digit number followed by a POS tag. For example, the offset for the synset 'dog.n.01' is "02084071-n". I tried the following code:

    from nltk.corpus import wordnet as wn

    ss = wn.synset('dog.n.01')
    offset = str(ss.offset)
    print (offset)

However, I get this output:

    <bound method Synset.offset of Synset('dog.n.01')>

How do I get the actual offset in this format: "02084071-n"?

Answer 1

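In NLTK 3, offset and pos are methods rather than attributes, so they must be called. ss.offset() returns an integer, which zfill(8) pads with leading zeros to 8 digits:

    >>> from nltk.corpus import wordnet as wn
    >>> ss = wn.synset('dog.n.01')
    >>> offset = str(ss.offset()).zfill(8) + '-' + ss.pos()
    >>> offset
    u'02084071-n'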
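
If you need this in more than one place, the same formatting can be wrapped in a small helper. This is only a sketch: the names format_offset_id and synset_offset_id are made up for illustration and are not part of NLTK. It assumes the object passed in exposes offset() and pos() methods, as NLTK 3 Synset objects do.

```python
def format_offset_id(offset, pos):
    """Return the offset zero-padded to 8 digits, then '-', then the POS tag."""
    return "{:08d}-{}".format(offset, pos)


def synset_offset_id(synset):
    """Build the offset ID for an object with offset() and pos() methods.

    Hypothetical helper: it only assumes the NLTK 3 style interface where
    offset() returns an int and pos() returns a one-letter POS tag.
    """
    return format_offset_id(synset.offset(), synset.pos())
```

With the WordNet corpus installed, calling synset_offset_id(wn.synset('dog.n.01')) should give the same string as the one-liner above.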

python

nlp

nltk

wordnet

semantics