How to boost indices query inside a boolean query

Question

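What I'm trying to achieve is partial matching against customized searchable fields per index. I generate a match_phrase_prefix with the value to search for, and if the value has more than one word, I generate another one per word. (I could use prefix, but it is either buggy or has undocumented settings.)

In this case, I'm trying to look up "belden cable"; the query looks like this: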

{
    "query":{
        "bool":{
            "should":
            [
                {
                    "indices":{
                        "indices":["addresss"],
                        "query":{
                            "bool":{
                                "should":
                                [
                                    {"match_phrase_prefix":{"name":"BELDEN CABLE"}},
                                    {"match_phrase_prefix":{"name":"BELDEN"}},
                                    {"match_phrase_prefix":{"name":"CABLE"}}
                                ]
                            }
                        },
                        "no_match_query":"none"
                    }
                },
                {
                    "indices":{
                        "indices":["customers"],
                        "query":{
                            "bool":{
                                "should":[
                                    {"match_phrase_prefix":{"_all":"BELDEN CABLE"}},
                                    {"match_phrase_prefix":{"_all":"CABLE"}},
                                    {"match_phrase_prefix":{"_all":"BELDEN"}}
                                ]
                            }
                        },
                        "no_match_query":"none"
                    }
                }
            ]
        }
    }
}
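
The generation step described above (one match_phrase_prefix for the full value, plus one per word when there are several words) could be sketched roughly as follows. This is only an illustration in Python: the helper names build_prefix_clauses, build_indices_clause and build_search_body are hypothetical, and the mapping of index name to searchable field is assumed from the query shown above.

```python
def build_prefix_clauses(field, value):
    """Return one match_phrase_prefix clause for the full value,
    plus one clause per word when the value has several words."""
    clauses = [{"match_phrase_prefix": {field: value}}]
    words = value.split()
    if len(words) > 1:
        clauses.extend({"match_phrase_prefix": {field: word}} for word in words)
    return clauses


def build_indices_clause(index, field, value):
    """Wrap the clauses for one index in an indices query (hypothetical helper)."""
    return {
        "indices": {
            "indices": [index],
            "query": {"bool": {"should": build_prefix_clauses(field, value)}},
            "no_match_query": "none",
        }
    }


def build_search_body(targets, value):
    """targets maps an index name to its searchable field (an assumption)."""
    should = [build_indices_clause(index, field, value)
              for index, field in targets.items()]
    return {"query": {"bool": {"should": should}}}


# Example: the two indices and fields used in the query above.
body = build_search_body({"addresss": "name", "customers": "_all"}, "BELDEN CABLE")
```

The resulting body has the same structure as the query above, although the order of the per-word clauses follows the word order of the input.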

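My goal is to get the results that contain "belden cable" first, followed by the ones that match only "belden" or "cable".

This returns (for example) 4 results that have "belden cable", then one result that only has "cable", then more results with "belden cable".

How can I boost the results that contain the complete search value ("belden cable")?

I've tried splitting the indices query into one for the full two-word value and one for the separate words, but that gives worse relevance.

I've also tried adding a boost to the match_phrase_prefix for "belden cable", and the results did not change.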
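
For reference, a boost on a match_phrase_prefix clause is normally written in the long form, with query and boost keys under the field name. The sketch below builds such a clause in Python; the helper name boosted_phrase_prefix and the boost value 5 are purely illustrative, and the exact form used in the attempt described above is not shown in the question.

```python
def boosted_phrase_prefix(field, value, boost):
    """Build a match_phrase_prefix clause in long form with a boost.
    The helper name is hypothetical; the boost value is up to the caller."""
    return {"match_phrase_prefix": {field: {"query": value, "boost": boost}}}


# Illustrative only: boost the full phrase, leave the single words unboosted.
should = [
    boosted_phrase_prefix("name", "BELDEN CABLE", 5),
    {"match_phrase_prefix": {"name": "BELDEN"}},
    {"match_phrase_prefix": {"name": "CABLE"}},
]
```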

Answer 1

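What you actually need is a different way of analyzing the input data. Treat what follows as a starting point for your final solution, because you need to take into account the full set of requirements for both your queries and your data analysis. Searching with ES is not only about queries, but also about how you structure and prepare the data.

The idea is that you want your data analyzed so that belden cable is also kept as a single term. With a mapping of "name": {"type": "string"}, the standard analyzer is used, which means the list of terms in the index is belden and cable. What you actually need is [belden cable, belden, cable]. So I suggest using the shingle token filter.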
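
To make the target term list concrete, here is a rough Python approximation of what a lowercase filter followed by a shingle filter emits. It is a sketch, not the Lucene implementation: it assumes whitespace splitting is close enough to the standard tokenizer for simple inputs like these, and it assumes the shingle filter's default settings (shingles of exactly two tokens, with single tokens also emitted). The function name shingle_terms is hypothetical.

```python
def shingle_terms(text, min_size=2, max_size=2, output_unigrams=True):
    """Approximate lowercase + shingle analysis for simple inputs.
    Whitespace splitting stands in for the standard tokenizer (an assumption)."""
    tokens = text.lower().split()
    terms = []
    for i, token in enumerate(tokens):
        if output_unigrams:
            terms.append(token)
        # Emit every shingle of the allowed sizes that starts at position i.
        for size in range(min_size, max_size + 1):
            if i + size <= len(tokens):
                terms.append(" ".join(tokens[i:i + size]))
    return terms


print(shingle_terms("belden cable"))
# ['belden', 'belden cable', 'cable']
print(shingle_terms("Belden Cable YYY"))
# ['belden', 'belden cable', 'cable', 'cable yyy', 'yyy']
```

With the plain standard analyzer only the single words would be indexed; the extra two-word terms are what let a document containing the full phrase be told apart from one containing just one of the words.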

DELETE /addresss
PUT /addresss
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_shingle": {
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "shingle"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "analyzer_shingle"
        }
      }
    }
  }
}
DELETE /customers
PUT /customers
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_shingle": {
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "shingle"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "_all": {
        "analyzer": "analyzer_shingle"
      }
    }
  }
}

POST /addresss/test/_bulk
{"index":{}}
{"name": "belden cable"}
{"index":{}}
{"name": "belden cable yyy"}
{"index":{}}
{"name": "belden cable xxx"}
{"index":{}}
{"name": "belden bla"}
{"index":{}}
{"name": "cable bla"}

POST /customers/test/_bulk
{"index":{}}
{"field1": "belden", "field2": "cable"}
{"index":{}}
{"field1": "belden cable yyy"}
{"index":{}}
{"field2": "belden cable xxx"}
{"index":{}}
{"field2": "belden bla"}
{"index":{}}
{"field2": "cable bla"}

GET /addresss,customers/test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "indices": {
            "indices": [
              "addresss"
            ],
            "query": {
              "bool": {
                "should": [
                  {
                    "match_phrase_prefix": {
                      "name": "BELDEN CABLE"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "name": "BELDEN"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "name": "CABLE"
                    }
                  }
                ]
              }
            },
            "no_match_query": "none"
          }
        },
        {
          "indices": {
            "indices": [
              "customers"
            ],
            "query": {
              "bool": {
                "should": [
                  {
                    "match_phrase_prefix": {
                      "_all": "BELDEN CABLE"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "_all": "CABLE"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "_all": "BELDEN"
                    }
                  }
                ]
              }
            },
            "no_match_query": "none"
          }
        }
      ]
    }
  }
}

Tags: elasticsearch, solr-boost