Sorting the result for sentence similarity in python
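I am trying to find the similarity between each sentence of a sentence-tokenized document and a search sentence, saving the results in a list. I want to sort the results by similarity score, but when I try to sort the output by that score I get an error.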
results = []
# Embed every document sentence and compute its similarity to the search text
for docs_sent_token in docs_sent_tokens:
    sentence_embeddings = model.encode(docs_sent_token)
    sim_score1 = cosine_sim(search_sentence_embeddings, sentence_embeddings)
    if sim_score1 > 0:
        results.append({
            sim_score1,
            docs_sent_token,
        })
results.sort(key=lambda k : k['sim_score1'] , reverse=True)
print(results)
This is the error I get.
TypeError: 'set' object is not subscriptable
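The error can be reproduced without the model or the embeddings. Braces that hold bare values, with no `key: value` pairs, build a set, and a set cannot be indexed by a key or a position. The snippet below is a minimal sketch with a made-up score and sentence:

```python
# Braces holding bare values (no key: value pairs) build a set, not a dict.
item = {0.91, "Sentence 1"}
print(type(item).__name__)  # set

try:
    item["sim_score1"]  # same lookup the sort key performs
except TypeError as exc:
    message = str(exc)
    print(message)
```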
This problem can be solved by using a dictionary.
    if sim_score1 > 0:
        results.append({
            'Score': sim_score1,
            'Token': docs_sent_token,
        })
results.sort(key=lambda k: k['Score'], reverse=True)
print(results)
But is there any way to do the sorting using a list? I want to get the result in this format.
[{0.91, 'Sentence 1'}, {0.87, 'Sentence 2'}, {0.33, 'Sentence 3'}, {0.30, 'Sentence 4'},
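A set has no index or key that could indicate which value to sort by. You can build a list of tuples or dicts instead, sort that list, and convert each item to a set afterwards.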
results.append((
    sim_score1,
    docs_sent_token
))
# results = [(0.91, 'Sentence 1'), (0.33, 'Sentence 3'), (0.87, 'Sentence 2'), (0.30, 'Sentence 4')]
results.sort(key=lambda k: k[0], reverse=True)
results = [set(t) for t in results]

# or

results.append({
    'Score': sim_score1,
    'Token': docs_sent_token
})
# results = [{'Score': 0.91, 'Token': 'Sentence 1'}, {'Score': 0.33, 'Token': 'Sentence 3'}, {'Score': 0.87, 'Token': 'Sentence 2'}, {'Score': 0.30, 'Token': 'Sentence 4'}]
results.sort(key=lambda k: k['Score'], reverse=True)
results = [set(d.values()) for d in results]

print(results)
Output
[{0.91, 'Sentence 1'}, {0.87, 'Sentence 2'}, {0.33, 'Sentence 3'}, {0.3, 'Sentence 4'}]
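If the set format is not a hard requirement, keeping the sorted tuples may be the safer choice: a set does not guarantee the order of its elements, so the score is not certain to be printed before the sentence. The following is a self-contained sketch of the tuple approach. The `cosine_sim` helper and the two-dimensional embeddings are stand-ins invented for illustration, since the question's `model` and `cosine_sim` are not shown.

```python
import math
from operator import itemgetter


def cosine_sim(a, b):
    """Plain-Python cosine similarity; a stand-in for the question's helper."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy embeddings; in the question these come from model.encode(...)
search_embedding = [1.0, 0.0]
sentence_embeddings = {
    "Sentence 1": [0.9, 0.1],
    "Sentence 2": [0.1, 0.9],
    "Sentence 3": [0.5, 0.5],
    "Sentence 4": [-1.0, 0.0],  # negative similarity, so it is filtered out
}

results = []
for sentence, embedding in sentence_embeddings.items():
    score = cosine_sim(search_embedding, embedding)
    if score > 0:
        # A tuple keeps a fixed position for the score, so it can be a sort key
        results.append((score, sentence))

# Sort by the first tuple element (the score), highest first
results.sort(key=itemgetter(0), reverse=True)

ranked_sentences = [sentence for _, sentence in results]
print(ranked_sentences)
```

Each tuple can still be unpacked as `score, sentence` later, which a set does not allow.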