Exact word not boosting much in Solr
I followed this reference link:
https://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_make_exact-case_matches_score_higher
and tried an example. My schema.xml is configured as follows.
<field name="product_name" type="text_wslc" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="nameString" type="string_ci" indexed="true" stored="false" required="true" />
<copyField source="product_name" dest="nameString"/>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
<fieldType name="text_wslc" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            preserveOriginal="1"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" />
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.KStemFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            preserveOriginal="1"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" />
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.KStemFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Exact-word search works fine with this setup.
However, fuzzy search with exact-match boosting does not give the expected results.
This is my query:
/select?q=(laptop bag)&defType=dismax&qf=nameString^22+product_name^0.1
Any help?
You need to create a new field type like this:
<fieldType name="string_ci" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
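To illustrate why this field type behaves like a case-insensitive exact match, here is a toy Python model, not Solr code. It relies on the fact that KeywordTokenizerFactory emits the whole field value as a single token and LowerCaseFilterFactory lowercases it. The function names are hypothetical, and the handling of the unquoted query assumes that dismax splits unquoted input on whitespace before each term reaches the field's analyzer.

```python
def string_ci_tokens(value):
    """Toy stand-in for KeywordTokenizerFactory + LowerCaseFilterFactory:
    the entire value becomes one lowercased token."""
    return [value.lower()]


def matches(indexed_tokens, query_tokens):
    """A query term matches only if it equals an indexed token exactly."""
    return any(term in indexed_tokens for term in query_tokens)


# What the string_ci field would hold for a product named "Laptop Bag".
indexed = string_ci_tokens("Laptop Bag")

# Quoted phrase: the whole phrase is analyzed as one unit.
quoted_query = string_ci_tokens("laptop bag")

# Unquoted input (assumption): split on whitespace first, then each
# word is analyzed separately by the field's analyzer.
unquoted_query = [tok for word in "laptop bag".split()
                  for tok in string_ci_tokens(word)]

print(indexed)                            # ['laptop bag']
print(matches(indexed, quoted_query))     # True
print(matches(indexed, unquoted_query))   # False
```

Under these assumptions, only the quoted form can hit the single-token field, which is why the exact-match boost never applies to an unquoted query.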
Now create the field nameSrting like this:
<field name="nameSrting" type="string_ci" indexed="true" stored="true"/>
and copy the contents of product_name into nameSrting, like this:
<copyField source="product_name" dest="nameSrting"/>
Now run a query that uses double quotes to request the exact phrase, like this:
http://localhost:8983/solr/Dummy2/select?q="laptop+bag"&wt=json&defType=dismax&qf=nameSrting^222+product_name^0.1
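If you build this request from code, the quotes, the `^` boosts and the space inside `qf` should be URL-encoded. Below is a minimal sketch using only the Python standard library. The helper name `build_exact_phrase_url` and its parameters are hypothetical; the host, core name, field names and boosts are simply the ones from the URL above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the example URL above; adjust host and core as needed.
SOLR_SELECT = "http://localhost:8983/solr/Dummy2/select"


def build_exact_phrase_url(phrase, exact_field="nameSrting",
                           text_field="product_name",
                           exact_boost=222, text_boost=0.1):
    """Build a dismax request that asks for `phrase` as a quoted phrase.

    Note: this sketch does not escape double quotes inside `phrase`.
    """
    params = {
        "q": f'"{phrase}"',   # double quotes request the exact phrase
        "wt": "json",
        "defType": "dismax",
        # qf entries are separated by a space, encoded as '+' in the URL
        "qf": f"{exact_field}^{exact_boost} {text_field}^{text_boost}",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


url = build_exact_phrase_url("laptop bag")
print(url)
```

Decoding the query string again returns `"laptop bag"` for `q` and `nameSrting^222 product_name^0.1` for `qf`, so it should be the same request as the hand-written URL.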