How to find semantic similarity between a list of words?
Input:
listToStr = 'degeneration agents alpha alternative amd analysis angiogenesis anti anti vegf appears associated based best bevacizumab blindness blood'
The code I am using:
simi = []
tokens = nlp(listToStr)
length = len(tokens)
for i in range(length):
    #print(i)
    sim = tokens[i].similarity(tokens[i+1])
    simi.append(sim)
print(simi)
Error:
[E040] Attempt to access token at 17, max length 17.
How can I get rid of this error?
I am using spaCy. Here is the link:
https://www.geeksforgeeks.org/python-word-similarity-using-spacy/#:~:text=Python%20%7C%20Word%20Similarity%20using%20spaCy,simple%20method%20for%20this%20task.
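Inside the for loop, tokens[i + 1] produces an index past the end of the token list: on the last iteration i is length - 1, so i + 1 equals length (17 here), which is out of range. You can do this instead: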
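import spacy

# Note: en_core_web_sm ships without static word vectors, so its similarity
# scores are less reliable; en_core_web_md or en_core_web_lg include vectors.
nlp = spacy.load("en_core_web_sm")
listToStr = 'degeneration agents alpha alternative amd analysis angiogenesis anti anti vegf appears associated based best bevacizumab blindness blood'
simi = []
tokens = nlp(listToStr)
for idx, tok in enumerate(tokens):
    sim = []
    # tokens[idx:] starts at tok itself, so the slice never runs past the end
    for nextok in tokens[idx:]:
        sim.append(tok.similarity(nextok))
    simi.append(sim)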
This compares each token with itself and with every token after it in the sentence, so the result is a list of lists.
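If you only want the similarity of each word to the one right after it, as in the original loop, pairing the sequence with itself shifted by one avoids the out-of-range index. The following is a minimal sketch, not spaCy-specific: `adjacent_similarities` and `cosine` are hypothetical helper names, and the toy 2-D vectors stand in for real tokens so the example runs without a model.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def adjacent_similarities(
    items: Sequence[T], similarity: Callable[[T, T], float]
) -> List[float]:
    """Return similarity(items[i], items[i + 1]) for every adjacent pair."""
    # zip stops at the shorter input, so the last item is never paired
    # with a non-existent successor: n items give n - 1 scores.
    return [similarity(a, b) for a, b in zip(items, items[1:])]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors (stand-in for a real similarity)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy vectors (assumed data, for illustration only).
vectors = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
scores = adjacent_similarities(vectors, cosine)
print(scores)
```

With spaCy, the same helper should work on the parsed text, for example `adjacent_similarities(nlp(listToStr), lambda a, b: a.similarity(b))`, which would give 16 scores for the 17 tokens above. The equivalent change to the original loop is `for i in range(length - 1):`.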