While working with NLTK, how do I manipulate an nltk.corpus.reader.wordnet.Synset?
import nltk
from nltk.corpus import wordnet as wn
synsets = wn.synsets('killed','v')
sense=synsets[0]
Here sense is of type nltk.corpus.reader.wordnet.Synset. Printing it shows Synset('kill.v.01').
When I try to use SentiWordNet:
from nltk.corpus import sentiwordnet
k = sentiwordnet.senti_synset('kill.v.01')
print(k)
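This prints the positive and negative scores for 'kill'.
My question: how can I use sense (from snippet 1) in snippet 2?
When I pass it in directly, it throws this error:
lemma, pos, synset_index_str = name.lower().rsplit('.', 2)
AttributeError: 'Synset' object has no attribute 'lower'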
A Synset's attributes are returned by accessor (get) methods on the Synset object, for example:
>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
>>> dog = wn.synsets('dog')[0]
>>> dog.definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> dog.lemma_names()
[u'dog', u'domestic_dog', u'Canis_familiaris']
>>> dog.pos()
u'n'
>>> dog.offset()
2084071
>>> dog.name()
u'dog.n.01'
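If you want the synset name, POS, and synset ID as a single index, use synset.name(), which returns a unicode string (the same form that senti_synset() is given in the question, e.g. 'kill.v.01'):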
>>> type(dog.name())
<type 'unicode'>
>>> name, pos, sid = dog.name().split('.')
>>> name
u'dog'
>>> pos
u'n'
>>> sid
u'01'
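The traceback in the question shows that the library itself splits the name with rsplit('.', 2) rather than split('.'). A minimal sketch of why that can matter, assuming a lemma may itself contain a period; the helper name split_synset_name and the dotted example name are made up for illustration:

```python
def split_synset_name(name):
    """Split a synset name such as 'dog.n.01' into (lemma, pos, sense id).

    rsplit with maxsplit=2 cuts only at the last two periods, so any
    period inside the lemma part is left untouched.
    """
    lemma, pos, sid = name.rsplit('.', 2)
    return lemma, pos, sid


print(split_synset_name('dog.n.01'))   # ('dog', 'n', '01')
print(split_synset_name('a.b.n.02'))   # ('a.b', 'n', '02'), hypothetical dotted lemma
```

With plain split('.') the second call would produce four pieces and the three-way unpacking would raise a ValueError.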
These are the methods and attributes accessible on a Synset object:
>>> dir(dog)
['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_all_hypernyms', '_definition', '_examples', '_frame_ids', '_hypernyms', '_instance_hypernyms', '_iter_hypernym_lists', '_lemma_names', '_lemma_pointers', '_lemmas', '_lexname', '_max_depth', '_min_depth', '_name', '_needs_root', '_offset', '_pointers', '_pos', '_related', '_shortest_hypernym_paths', '_wordnet_corpus_reader', 'also_sees', 'attributes', 'causes', 'closure', 'common_hypernyms', 'definition', 'entailments', 'examples', 'frame_ids', 'hypernym_distances', 'hypernym_paths', 'hypernyms', 'hyponyms', 'instance_hypernyms', 'instance_hyponyms', 'jcn_similarity', 'lch_similarity', 'lemma_names', 'lemmas', 'lexname', 'lin_similarity', 'lowest_common_hypernyms', 'max_depth', 'member_holonyms', 'member_meronyms', 'min_depth', 'name', 'offset', 'part_holonyms', 'part_meronyms', 'path_similarity', 'pos', 'region_domains', 'res_similarity', 'root_hypernyms', 'shortest_path_distance', 'similar_tos', 'substance_holonyms', 'substance_meronyms', 'topic_domains', 'tree', 'unicode_repr', 'usage_domains', 'verb_groups', 'wup_similarity']
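Applied to the question, this suggests handing senti_synset() the string from sense.name() instead of the Synset object. The following is a sketch under that assumption, not a definitive implementation; senti_scores and demo are hypothetical helper names, and demo() needs the 'wordnet' and 'sentiwordnet' corpora to be installed:

```python
def senti_scores(sense, senti_reader):
    """Return (positive, negative, objective) scores for a WordNet Synset.

    senti_reader is expected to behave like nltk.corpus.sentiwordnet.
    Its senti_synset() is given the synset *name* (a string such as
    'kill.v.01'), so we pass sense.name() rather than the Synset itself.
    """
    senti = senti_reader.senti_synset(sense.name())
    return senti.pos_score(), senti.neg_score(), senti.obj_score()


def demo():
    """Run the lookup from the question; requires NLTK and its corpora."""
    from nltk.corpus import wordnet as wn
    from nltk.corpus import sentiwordnet as swn

    sense = wn.synsets('killed', 'v')[0]   # Synset('kill.v.01') in the question
    print(sense.name())                    # the string form of the synset
    print(senti_scores(sense, swn))
```

Calling demo() in an environment with NLTK installed should print the synset name followed by its three scores.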