What does each number in a WordNet sense mean?
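WordNet senses encode some information about the sense in their IDs. Looking at the lemma_from_key method, it appears that the first three numbers are pos_number, lexname_index, and lex_id. What are the other two? Is there documentation describing, more specifically, what each of them means?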
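The WordNet sense index documentation, senseidx(5WN), describes this. A sense key is represented as lemma%lex_sense, where lex_sense is encoded as:
ss_type:lex_filenum:lex_id:head_word:head_id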
lemma is the ASCII text of the word or collocation as found in the
WordNet database index file corresponding to pos. lemma is in lower
case, and collocations are formed by joining individual words with an
underscore (_) character.
ss_type is a one digit decimal integer representing the synset type
for the sense. See Synset Type below for a listing of the numbers
corresponding to each synset type.
lex_filenum is a two digit decimal integer representing the name of
the lexicographer file containing the synset for the sense. See
lexnames(5WN) for the list of lexicographer file names and their
corresponding numbers.
lex_id is a two digit decimal integer that, when appended onto lemma,
uniquely identifies a sense within a lexicographer file. lex_id
numbers usually start with 00, and are incremented as additional
senses of the word are added to the same file, although there is no
requirement that the numbers be consecutive or begin with 00. Note
that a value of 00 is the default, and therefore is not present in
lexicographer files. Only non-default lex_id values must be explicitly
assigned in lexicographer files. See wninput(5WN) for information on
the format of lexicographer files.
head_word is only present if the sense is in an adjective satellite
synset. It is the lemma of the first word of the satellite's head
synset.
head_id is a two digit decimal integer that, when appended onto
head_word, uniquely identifies the sense of head_word within a
lexicographer file, as described for lex_id. There is a value in this
field only if head_word is present.
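
To make the field layout concrete, here is a minimal sketch of a parser that splits a sense key into the fields described above. It assumes the full key has the form lemma%lex_sense as given in senseidx(5WN). The names SenseKey, parse_sense_key, and SS_TYPES are hypothetical helpers made up for this illustration and are not part of NLTK or WordNet. The synset type numbers are the ones listed in the "Synset Type" section of the same documentation.

```python
from typing import NamedTuple, Optional

# Synset type numbers as listed in senseidx(5WN).
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


class SenseKey(NamedTuple):
    """Hypothetical container for the parts of a sense key."""
    lemma: str
    ss_type: int
    lex_filenum: int
    lex_id: int
    head_word: Optional[str]  # only set for adjective satellites
    head_id: Optional[int]    # only set when head_word is set


def parse_sense_key(key: str) -> SenseKey:
    """Split 'lemma%ss_type:lex_filenum:lex_id:head_word:head_id'."""
    # Split off the lemma at the last '%'.
    lemma, lex_sense = key.rsplit("%", 1)
    # lex_sense always has five colon-separated fields; the last two
    # are empty strings unless the sense is an adjective satellite.
    ss_type, lex_filenum, lex_id, head_word, head_id = lex_sense.split(":")
    return SenseKey(
        lemma=lemma,
        ss_type=int(ss_type),
        lex_filenum=int(lex_filenum),
        lex_id=int(lex_id),
        head_word=head_word or None,
        head_id=int(head_id) if head_id else None,
    )


if __name__ == "__main__":
    key = parse_sense_key("dog%1:05:00::")
    print(key, SS_TYPES[key.ss_type])
    # Illustrative satellite-style key (made up to show the format,
    # not checked against the WordNet database).
    print(parse_sense_key("example%5:00:01:headword:02"))
```

For a key such as dog%1:05:00::, the last two fields are empty, which is why only three numbers are visible for most senses. The other two fields are filled in only for adjective satellites.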