Solr suggester has no results
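I am trying to set up a suggester for Solr. I have several information fields. Here is an example (field: value):
gene: EGFR
cac: abf.4c3C
ccd: frl.dlgfX
subject: EGFR - this change
id: 7390
Now I want Solr to retrieve documents as the user types, regardless of whether the user starts typing a gene name or an ID or ...
The suggester in solrconfig.xml looks like this (more or less copy/pasted from the example):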
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">suggest_muripedia</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
<!-- Suggester component -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_muripedia</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">_text_cf</str>
    <str name="weightField">subject</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
_text_cf is a field populated by copy rules from the fields above. It is defined as follows:
{
"name":"_text_cf",
"type":"mytext",
"multiValued":true,
"indexed":true,
"stored":true},
The field type mytext looks like this:
{
"name":"mytext",
"class":"solr.TextField",
"positionIncrementGap":"100",
"multiValued":true,
"indexAnalyzer":{
"tokenizer":{
"class":"solr.PatternTokenizerFactory",
"pattern":"-"},
"filters":[{
"class":"solr.TrimFilterFactory"},
{
"class":"solr.StopFilterFactory",
"words":"stopwords.txt",
"ignoreCase":"true"},
{
"class":"solr.LowerCaseFilterFactory"}]},
"queryAnalyzer":{
"tokenizer":{
"class":"solr.PatternTokenizerFactory",
"pattern":"-"},
"filters":[{
"class":"solr.TrimFilterFactory"},
{
"class":"solr.StopFilterFactory",
"words":"stopwords.txt",
"ignoreCase":"true"},
{
"class":"solr.SynonymGraphFilterFactory",
"expand":"true",
"ignoreCase":"true",
"synonyms":"synonyms.txt"},
{
"class":"solr.LowerCaseFilterFactory"}]}},
The query I tried does not return any results:
suggest?q=egfr
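I don't know how to fix this, and I suspect I haven't fully understood what happens during a suggest request.
The suggester actually did return results, but because suggestAnalyzerFieldType was set to string, the query was case-sensitive: a string field is not analyzed, so the lowercase query egfr never matched the stored value EGFR. Changing the field type to my own field type mytext, which lowercases its tokens, solved the problem.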
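For reference, here is a sketch of what the corrected suggester component might look like. It assumes nothing else in the configuration changes: every line is taken from the question except suggestAnalyzerFieldType, which now points at mytext instead of string.

```xml
<!-- Suggester component with the analyzer field type fix applied -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_muripedia</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">_text_cf</str>
    <str name="weightField">subject</str>
    <!-- Was "string" (no analysis, case-sensitive). mytext applies
         LowerCaseFilterFactory, so "egfr" can match "EGFR". -->
    <str name="suggestAnalyzerFieldType">mytext</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```

Because buildOnStartup is false, the suggester probably has to be rebuilt after the config change before the new analyzer takes effect, for example with a request such as suggest?suggest.build=true&suggest.q=egfr against the /suggest handler defined above.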