Solr

Question

I want a term match to be scored only once, not once for every occurrence.

Example search query: Parle G Biscuits

Document 1 - Parle G Biscuits
Document 2 - Parle G Biscuits. I can eat 10 packets of Parle G Biscuits anytime. 
Document 3 - Parle G Biscuits V2 

I want to rank documents as Doc 1 > Doc 3 > Doc 2
Default answer from Solr - Doc 2 > Doc 1 > Doc 3

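This happens because the query string is found twice in the longer document. If I could somehow stop both occurrences from being scored, I would get the result I want, because Documents 2 and 3 would then be slightly penalized for their greater length.

How can I modify Solr to work this way?

Thanks!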

Answer 1

If you don't need term positions (for example, if you don't search with phrases such as foo:"word1 word2"), you can set the field to drop all term frequency information, payloads and positions: omitTermFreqAndPositions="true".

If true, omits term frequency, positions, and payloads from postings for this field. This can be a performance boost for fields that don't require that information. It also reduces the storage space required for the index. Queries that rely on position that are issued on a field with this option will silently fail to find documents. This property defaults to true for all field types that are not text fields.
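As a rough sketch, the property can be set on the field definition in schema.xml. The field name `product_name` is hypothetical, and `text_general` is assumed to be the text type from Solr's example schema; substitute your own names.

```xml
<!-- schema.xml: "product_name" is a hypothetical field name -->
<field name="product_name" type="text_general" indexed="true" stored="true"
       omitTermFreqAndPositions="true"/>
```

This is an index-time option, so existing documents need to be reindexed before the change affects scoring.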

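Since there is no separate setting that drops only the term frequency, you will have to implement a custom similarity if you need either of the other two features that this setting disables.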
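To illustrate what such a custom similarity would change, here is a toy model in plain Java with no Lucene dependency. It assumes a simplified TF-IDF style score: each matching query term contributes tf(freq), and the sum is multiplied by 1/sqrt(number of tokens in the document). It ignores idf, query normalization and Lucene's lossy norm encoding, so the numbers will not match real Solr scores. The class name `FlatTfDemo` and its methods are made up for this example. In Lucene 4.x the real hook would, as far as I know, be overriding `tf(float freq)` in a subclass of `DefaultSimilarity` so that it returns 1 for any freq greater than 0.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

public class FlatTfDemo {
    static final String QUERY = "Parle G Biscuits";
    static final String DOC1 = "Parle G Biscuits";
    static final String DOC2 =
        "Parle G Biscuits. I can eat 10 packets of Parle G Biscuits anytime.";
    static final String DOC3 = "Parle G Biscuits V2";

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /**
     * Toy score (an assumption, not Lucene's formula): sum of tf(freq) over the
     * distinct query terms, multiplied by a length norm of 1/sqrt(token count).
     * With flatTf the tf factor is 1 for any match; otherwise it is sqrt(freq).
     */
    static double score(String query, String doc, boolean flatTf) {
        List<String> docTokens = tokenize(doc);
        if (docTokens.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (String term : new LinkedHashSet<>(tokenize(query))) {
            int freq = Collections.frequency(docTokens, term);
            if (freq > 0) {
                sum += flatTf ? 1.0 : Math.sqrt(freq);
            }
        }
        return sum / Math.sqrt(docTokens.size());
    }

    public static void main(String[] args) {
        String[] docs = {DOC1, DOC2, DOC3};
        for (int i = 0; i < docs.length; i++) {
            System.out.printf("Doc %d: sqrt tf = %.3f, flat tf = %.3f%n",
                i + 1, score(QUERY, docs[i], false), score(QUERY, docs[i], true));
        }
    }
}
```

With the flat tf, only document length separates the three documents in this model, so the order is Doc 1 > Doc 3 > Doc 2, and Doc 2 no longer gets a boost from repeating the query terms. The toy model does not reproduce the default ordering reported in the question, which depends on parts of the real scoring that are left out here.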

Solr - No extra score for repeating words from query in document

lucene

solr4