How to check if words are in synsets or not?
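I am trying to compare two lists of words to check whether:
the word1 list contains words that are also in the synsets of the word2 list
the word2 list contains words that are also in the synsets of the word1 list
If the words are in the synsets, the function should return True.
Here is my code: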
from nltk.corpus import wordnet as wn

word1 = ['study', 'car']
word2 = ['learn', 'motor']

def getSynonyms(word1):
    synonymList1 = []
    for data1 in word1:
        wordnetSynset1 = wn.synsets(data1)
        tempList1=[]
        for synset1 in wordnetSynset1:
            synLemmas = synset1.lemma_names()
            for i in xrange(len(synLemmas)):
                word = synLemmas[i].replace('_',' ')
                if word not in tempList1:
                    tempList1.append(word)
        synonymList1.append(tempList1)
    return synonymList1

def checkSynonyms(word1, word2):
    for i in xrange(len(word1)):
        for j in xrange(len(word2)):
            d1 = getSynonyms(word1)
            d2 = getSynonyms(word2)
            if word1[i] in d2:
                return True
            elif word2[j] in d1:
                return True
            else:
                return False

print word1
print
print word2
print
print getSynonyms(word1)
print
print getSynonyms(word2)
print
print checkSynonyms(word1, word2)
print
But this is the output:
['study', 'car']
['learn', 'motor']
[[u'survey', u'study', u'work', u'report', u'written report', u'discipline',
u'subject', u'subject area', u'subject field', u'field', u'field of study',
u'bailiwick', u'sketch', u'cogitation', u'analyze', u'analyse', u'examine',
u'canvass', u'canvas', u'consider', u'learn', u'read', u'take', u'hit the
books', u'meditate', u'contemplate'], [u'car', u'auto', u'automobile',
u'machine', u'motorcar', u'railcar', u'railway car', u'railroad car',
u'gondola', u'elevator car', u'cable car']]
[[u'learn', u'larn', u'acquire', u'hear', u'get word', u'get wind', u'pick
up', u'find out', u'get a line', u'discover', u'see', u'memorize',
u'memorise', u'con', u'study', u'read', u'take', u'teach', u'instruct',
u'determine', u'check', u'ascertain', u'watch'], [u'motor', u'drive',
u'centrifugal', u'motive']]
False
As we can see, the word 'study' from word1 is also in the synsets of word2 >> u'study'
Why does it return False?
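You want to compare the string values in word1 against the string values in d2, but the test if word1[i] in d2: does not do that.
Since d2 is a list of lists, that test compares the string from word1 against each inner list of d2 as a whole. For example, it would compare: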
'study' == [u'survey', u'study', u'work', u'report', u'written report', u'discipline',
u'subject', u'subject area', u'subject field', u'field', u'field of study',
u'bailiwick', u'sketch', u'cogitation', u'analyze', u'analyse', u'examine',
u'canvass', u'canvas', u'consider', u'learn', u'read', u'take', u'hit the
books', u'meditate', u'contemplate']
which will definitely return False.
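To see this behaviour in isolation, here is a minimal sketch. The name nested and its contents are made up for illustration (abbreviated from the output in the question); the point is only that in on a list of lists checks the inner lists themselves, not the strings inside them.

```python
# Hypothetical, abbreviated stand-in for d2 (a list of lists of strings).
nested = [['learn', 'study', 'read'], ['motor', 'drive']]

# 'study' is compared with each inner list as a whole, so no element matches.
in_outer = 'study' in nested

# Looking inside one inner list compares string with string.
in_inner = 'study' in nested[0]

print(in_outer)
print(in_inner)
```

The first test evaluates to False and the second to True, which is the same situation as in checkSynonyms.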
Therefore, instead of
if word1[i] in d2:
you should use
if word1[i] in d2[k]:
where k is an index that iterates over the inner lists of d2.
Hope this helps.
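One way this fix could look in full is sketched below. It is not the asker's original function: check_synonyms and its parameters are names invented for this example, and the synonym lists are hard-coded, abbreviated copies of the output shown in the question so that the sketch runs without NLTK. In real use they would come from getSynonyms(word1) and getSynonyms(word2). The sketch also moves return False to the very end, on the assumption that the function should only give up after every word has been checked.

```python
# Abbreviated synonym lists copied from the question's output
# (one inner list per input word). Assumption: in real use these
# would be produced by getSynonyms(word1) and getSynonyms(word2).
word1 = ['study', 'car']
word2 = ['learn', 'motor']
d1 = [['survey', 'study', 'work', 'learn', 'read', 'take'],
      ['car', 'auto', 'automobile', 'machine', 'motorcar']]
d2 = [['learn', 'larn', 'acquire', 'study', 'read', 'take'],
      ['motor', 'drive', 'centrifugal', 'motive']]


def check_synonyms(words_a, syns_a, words_b, syns_b):
    """Return True if a word from either list occurs in any inner
    synonym list belonging to the other list."""
    for word in words_a:
        # Membership is tested against each inner list (d2[k]),
        # not against the outer list of lists.
        if any(word in syn_list for syn_list in syns_b):
            return True
    for word in words_b:
        if any(word in syn_list for syn_list in syns_a):
            return True
    # Report False only after every word has been tried.
    return False


result = check_synonyms(word1, d1, word2, d2)
print(result)
```

With the data above, 'study' is found in the first inner list of d2, so the check succeeds. Computing the synonym lists once, outside the loops, also avoids calling getSynonyms again on every iteration.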