Is it possible to set the type parameter in a simple query string query in Elasticsearch?

When using a query string query in ES and matching multiple fields, I can set a type parameter to configure how ES combines/scores the matches across those fields.

For example, I want to match two fields in my index and combine the scores of both fields:

GET /_search
{
    "query": {
        "query_string" : {
            "query" : "test",
            "fields": ["titel", "content"],
            "type": "most_fields"
        }
    }
}

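This parameter seems to be missing when using simple query string. What is the default mode for simple query string? How are the scores chosen/combined? Is it possible to set the type?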

The simple query string query has no type parameter. It sums the scores of the individual fields.

Consider the index below, and let's see how different queries calculate the score, using the explain API.

Mapping:

PUT testindex6
{
  "mappings": {
    "properties": {
      "title":{
        "type": "text"
      },
      "description":{
        "type": "text"
      }
    }
  }
}

Data:

POST testindex6/_doc
{
  "title":  "dog",
  "description":"dog is brown"
}

1. query_string with best_fields (the default)

Finds documents which match any field, but uses the _score from the best field

GET testindex6/_search?explain=true
{
  "query": {
    "query_string": {
      "default_field": "*",
      "query": "dog brown",
      "type":"best_fields"
    }
  }
}

Result:

  "_explanation" : {
          "value" : 0.5753642,
          "description" : "max of:",
          "details" : [
            {
              "value" : 0.5753642,
              "description" : "sum of:",              
            },
            {
              "value" : 0.2876821,
              "description" : "sum of:",              
            }
          ]
        }   

best_fields takes the highest score among the matched fields.

2. query_string with most_fields

Sums the scores of the matched fields.

GET testindex6/_search?explain=true
{
  "query": {
    "query_string": {
      "default_field": "*",
      "query": "dog brown",
      "type":"most_fields"
    }
  }
}

Result:

"_explanation" : {
          "value" : 0.8630463,
          "description" : "sum of:",
          "details" : [
            {
              "value" : 0.5753642,
              "description" : "sum of:"
              ....
            },
            {
              "value" : 0.2876821,
              "description" : "sum of:"              
              ....
            }
          ]
        }
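
To make the difference between the two results concrete, here is a minimal Python sketch that recombines the two per-field scores from the explain output. The `combine` function and its `mode` argument are made up for illustration and are not part of any Elasticsearch client. It only mirrors the "max of:" and "sum of:" descriptions and ignores everything else, such as `tie_breaker`.

```python
import math

# Per-field scores copied from the explain output above
# (the excerpt does not show which value belongs to which field).
field_scores = [0.5753642, 0.2876821]


def combine(scores, mode):
    """Combine per-field scores as the explain output describes.

    "best_fields" corresponds to "max of:", "most_fields" to "sum of:".
    Hypothetical helper: a simplification, not Elasticsearch's scoring code.
    """
    if mode == "best_fields":
        return max(scores)
    if mode == "most_fields":
        return sum(scores)
    raise ValueError(f"unsupported mode: {mode}")


best = combine(field_scores, "best_fields")
most = combine(field_scores, "most_fields")

print(f"best_fields: {best:.7f}")  # best_fields: 0.5753642
print(f"most_fields: {most:.7f}")  # most_fields: 0.8630463
```

Both numbers agree with the top-level "value" reported in the respective explanations.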

3. simple_query_string

Query:

GET testindex6/_search?explain=true
{
  "query": {
    "simple_query_string": {
      "query": "dog brown",
      "fields": ["*"]
    }
  }
}

Result:

"_explanation" : {
          "value" : 0.8630463,
          "description" : "sum of:",
          "details" : [
            {
              "value" : 0.5753642,
              "description" : "sum of:",              
            },
            {
              "value" : 0.2876821,
              "description" : "sum of:"              
            }
          ]
        }

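So you can see that most_fields and simple_query_string produce the same score (both sum the per-field scores). But there is a difference between them. Consider the following index.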

I created a field title of type text with a subfield shingles that uses a shingle analyzer.

PUT index_2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_shingle": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "shingle"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "shingles": {
            "search_analyzer": "analyzer_shingle",
            "analyzer": "analyzer_shingle",
            "type": "text"
          }
        }
      }
    }
  }
}

Data:

POST index_2/_doc
{
  "title":"the brown fox"
}

1. most_fields query:

GET index_2/_search?explain=true
{
  "query": {
    "query_string": {
      "query": "brown fox",
      "fields": ["*"],
      "type":"most_fields"
    }
  }
}

Result:

"_explanation" : {
          "value" : 1.3650365,
          "description" : "sum of:",
          "details" : [
            {
              "value" : 0.7896724,
              "description" : "sum of:",              
            },
            {
              "value" : 0.5753642,
              "description" : "sum of:",              
            }
          ]
        }

2. simple_query_string query:

GET index_2/_search?explain=true
{
  "query": {
    "simple_query_string": {
      "query": "brown fox",
      "fields": ["*"]
    }
  }
}

Result:

"_explanation" : {
          "value" : 1.2632996,
          "description" : "sum of:",
          "details" : [
            {
              "value" : 0.6316498,
              "description" : "sum of:",              
            },
            {
              "value" : 0.6316498,
              "description" : "sum of:"              
            }
          ]
        }

You can see that the scores of most_fields and simple_query_string differ here, even though both sum the per-field scores.

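The reason is that most_fields uses each field's own analyzer at query time (remember that title (standard) and title.shingles (analyzer_shingle) have different analyzers), whereas simple_query_string uses the index's default analyzer (standard) for all fields.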
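
To see why the analyzer matters, here is a rough Python sketch of what the two analyzers do to the query text "brown fox". Both functions are hypothetical stand-ins, not Lucene's implementation: `standard_tokens` just lowercases and splits on whitespace, and `shingle_tokens` assumes the shingle filter's defaults (two-word shingles, unigrams kept).

```python
def standard_tokens(text):
    """Rough stand-in for the standard tokenizer plus lowercase filter."""
    return text.lower().split()


def shingle_tokens(tokens, size=2):
    """Emit each unigram, followed by the `size`-word shingle starting at it.

    Simplified: assumes 2-word shingles with unigrams kept, and ignores
    token positions and filler tokens.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + size <= len(tokens):
            out.append(" ".join(tokens[i:i + size]))
    return out


query = "brown fox"
print(standard_tokens(query))                  # ['brown', 'fox']
print(shingle_tokens(standard_tokens(query)))  # ['brown', 'brown fox', 'fox']
```

With the shingle analyzer the query carries an extra "brown fox" term, which is presumably what changes the score of the title.shingles field when each field is searched with its own analyzer.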

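If we change the most_fields query and force it to use the standard analyzer for all fields, instead of each field's own analyzer, the score is the same.

Query: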

GET index_2/_search?explain=true
{
  "query": {
    "query_string": {
      "query": "brown fox",
      "fields": ["*"],
      "type":"most_fields",
      "analyzer": "standard"
    }
  }
}

Result:

"_explanation" : {
          "value" : 1.2632996,
          "description" : "sum of:",
          "details" : [
            {
              "value" : 0.6879354,
              "description" : "sum of:"
            },
            {
              "value" : 0.5753642,
              "description" : "sum of:"              
            }
          ]
        }

I think simple_query_string is intended for simple scenarios. If you use different analyzers for different fields, use query_string or a bool query of match clauses instead.
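
As a sketch of the bool/match alternative, the Python snippet below builds a request body with one match clause per field. The helper name `bool_match_query` is invented for this example. It relies on two things: a match clause analyzes the text with its own field's search analyzer unless told otherwise, and a bool query adds up the scores of its matching should clauses.

```python
import json


def bool_match_query(text, fields):
    """Build a bool query with one `match` clause per field.

    Hypothetical helper: returns a plain dict, so it can be sent as the
    search request body with whichever client you use.
    """
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: {"query": text}}} for field in fields]
            }
        }
    }


body = bool_match_query("brown fox", ["title", "title.shingles"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after GET index_2/_search?explain=true to compare its explanation with the queries above.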