What is the difference between the default, FastVector and Postings highlighters in Solr?
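I am developing a web application using the Solr search engine and am trying to understand how highlighting works.
What is the difference between the default, FastVector and Postings highlighters in Solr?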
Standard Highlighter
The Standard Highlighter is the Swiss Army knife of the highlighters.
It has the most sophisticated and fine-grained query representation of
the three highlighters. For example, this highlighter is capable of
providing precise matches even for advanced query parsers such as the
surround parser. It does not require any special data structures such
as termVectors, although it will use them if they are present. If they
are not, this highlighter will re-analyze the document on-the-fly to
highlight it. This highlighter is a good choice for a wide variety of
search use-cases.
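
As a rough illustration of how little setup the Standard Highlighter needs, the sketch below builds the query string for a highlighted search. It assumes a hypothetical stored field called `content` and a hypothetical helper `standard_highlight_params`; `hl`, `hl.fl`, `hl.snippets` and `hl.fragsize` are the usual Solr highlighting request parameters, but check them against the documentation for your Solr version.

```python
from urllib.parse import urlencode, parse_qs


def standard_highlight_params(query, field):
    """Build request parameters that turn on highlighting for one field.

    No term vectors or offsets are assumed on the field: the Standard
    Highlighter can re-analyze the stored text at query time.
    """
    return {
        "q": query,
        "hl": "true",        # enable highlighting
        "hl.fl": field,      # field to highlight (hypothetical name)
        "hl.snippets": 2,    # at most two snippets per field
        "hl.fragsize": 100,  # approximate snippet size in characters
    }


# "content" is an illustrative field name, not one taken from the question.
query_string = urlencode(standard_highlight_params("content:solr", "content"))
print(query_string)
```

The resulting string would be appended to a select URL of your own core or collection.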
FastVector Highlighter
The FastVector Highlighter requires term vector options (termVectors,
termPositions, and termOffsets) on the field, and is optimized with
that in mind. It tends to work better for more languages than the
Standard Highlighter, because it supports Unicode breakiterators. On
the other hand, its query-representation is less advanced than the
Standard Highlighter: for example it will not work well with the
surround parser. This highlighter is a good choice for large documents
and highlighting text in a variety of languages.
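
The term vector requirement can be sketched as a field definition. The example below builds a JSON payload in the shape accepted by Solr's Schema API `add-field` command, with the three options named above switched on. The field name `content`, the type `text_general` and the helper name are assumptions made for illustration. The request parameter `hl.useFastVectorHighlighter` is how releases of this era select the highlighter; newer releases use `hl.method=fastVector` instead, so verify against your version.

```python
import json


def fvh_field_definition(name, field_type):
    """Return an add-field payload with the term vector options enabled."""
    return {
        "add-field": {
            "name": name,          # hypothetical field name
            "type": field_type,    # hypothetical field type
            "indexed": True,
            "stored": True,
            "termVectors": True,   # required by the FastVector Highlighter
            "termPositions": True,
            "termOffsets": True,
        }
    }


# Request parameters that ask for the FastVector Highlighter (older releases).
fvh_request_params = {
    "hl": "true",
    "hl.fl": "content",
    "hl.useFastVectorHighlighter": "true",
}

payload = json.dumps(fvh_field_definition("content", "text_general"), indent=2)
print(payload)
```

Documents indexed before the field options were changed would need to be reindexed for the term vectors to exist.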
Postings Highlighter
The Postings Highlighter requires storeOffsetsWithPositions to be
configured on the field. This is a much more compact and efficient
structure than term vectors, but is not appropriate for huge numbers
of query terms (e.g. wildcard queries). Like the FastVector
Highlighter, it supports Unicode algorithms for dividing up the
document. On the other hand, it has the most coarse
query-representation: it focuses on summary quality and ignores the
structure of the query completely, ranking passages based solely on
query terms and statistics. This highlighter is a good choice for classic
full-text keyword search.
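
The field requirements of the three highlighters can be summarised as a small lookup. This is only a sketch of the rules quoted above: the dictionary keys and the helper `usable_highlighters` are names invented for illustration and are not part of Solr.

```python
# Field options each highlighter needs, as described in the quoted text.
REQUIRED_FIELD_OPTIONS = {
    "standard": set(),  # works without special structures
    "fastvector": {"termVectors", "termPositions", "termOffsets"},
    "postings": {"storeOffsetsWithPositions"},
}


def usable_highlighters(field_options):
    """Return the highlighters whose required options are all enabled."""
    enabled = {name for name, on in field_options.items() if on}
    return sorted(
        highlighter
        for highlighter, required in REQUIRED_FIELD_OPTIONS.items()
        if required <= enabled  # every required option is switched on
    )


# A field with offsets stored in the postings list but no term vectors.
print(usable_highlighters({"stored": True, "storeOffsetsWithPositions": True}))
# prints ['postings', 'standard']
```

The lookup covers only what a field must provide. Query type (for example the surround parser or wildcard queries) and document size still matter when picking among the highlighters that remain.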