How to get exact matches for more than one phrase

Question

Here is the query I use to get an exact match:

GET courses/_search
{
  "query": {
    "term" : {
         "name.keyword": "Anthropology 230"
      }
  }
}

I need to find both Anthropology 230 and Anthropology 250.

How can I get exact matches for both?

Answer 1

You can try match, match_phrase, or match_phrase_prefix.

Using match:

GET courses/_search
{
    "query": {
        "match" : {
            "name" : "Anthropology 230"
        }
    },
    "_source": "name"
}

Using match_phrase:

GET courses/_search
{
    "query": {
        "match_phrase" : {
            "name" : "Anthropology"
        }
    },
    "_source": "name"
}

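Or using regexp. The pattern is matched against whole indexed terms and is not analyzed, so it has to target the name.keyword subfield, where the full string is kept as one term:

GET courses/_search
{
    "query": {
        "regexp" : {
            "name.keyword" : "Anthropology [0-9]{3}"
        }
    },
    "_source": "name"
}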

Answer 2

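The mistake is that you are using a term query on a keyword field. Neither the term query nor the keyword field is analyzed, so Elasticsearch looks for exactly the same search string in the inverted index.

What you should do instead is search a text field, which you already have if you did not define a mapping yourself. I am assuming that is the case, because the .keyword subfield used in your query is created automatically when no mapping is defined.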
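For reference, this is the shape that default dynamic mapping usually gives a string field. It is written out here as a Python dict purely for illustration. The question does not show the real mapping of the courses index, so treat this as an assumption and check it with GET courses/_mapping. The helper searchable_paths is hypothetical.

```python
# Assumed mapping for the "name" field under default dynamic mapping.
# The real index may differ; this is an illustration, not its actual mapping.
default_name_mapping = {
    "name": {
        "type": "text",  # analyzed: use match / match_phrase here
        "fields": {
            "keyword": {  # addressed as name.keyword; one unanalyzed term
                "type": "keyword",
                "ignore_above": 256,
            }
        },
    }
}


def searchable_paths(properties):
    """List each field plus its multi-field subpaths (hypothetical helper)."""
    paths = []
    for field, spec in properties.items():
        paths.append(field)
        for sub in spec.get("fields", {}):
            paths.append(f"{field}.{sub}")
    return paths


print(searchable_paths(default_name_mapping))
```

The two paths behave differently: name is tokenized at index time, while name.keyword keeps the whole string as a single term.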

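Now you can use the match query below. It is analyzed and uses the standard analyzer, which lowercases the text and splits it on word boundaries, so your two sample documents are indexed as the tokens anthropology, 230 and anthropology, 250.

A simple and efficient query that returns both documents: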

{
    "query": {
        "match" : {
            "name" : "Anthropology 230"
        }
    }
}

And the search result:

 "hits": [
      {
        "_index": "matchterm",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.8754687,
        "_source": {
          "name": "Anthropology 230"
        }
      },
      {
        "_index": "matchterm",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.18232156,
        "_source": {
          "name": "Anthropology 250"
        }
      }
    ]

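The query above matches both documents because the search text is analyzed into two tokens, anthropology and 230, and match combines them with OR by default. The anthropology token alone is therefore enough to match both documents, and the document that also contains 230 scores higher.

You should definitely read about the analysis process, and you can also try the analyze API to see the tokens generated for any text.

Analyze API output for your text: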
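The difference between the two query types can be sketched in a few lines of Python. This is only a rough model: rough_standard_tokens is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters, which is much simpler than the real standard analyzer, and the two matching functions ignore scoring entirely.

```python
import re


def rough_standard_tokens(text):
    """Crude approximation of the standard analyzer: lowercase, then split."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def match_or(query, doc):
    """Model of match with the default OR operator: any shared token hits."""
    return bool(set(rough_standard_tokens(query)) & set(rough_standard_tokens(doc)))


def term_exact(query, doc):
    """Model of term on a keyword field: the whole string must be identical."""
    return query == doc


docs = ["Anthropology 230", "Anthropology 250"]
match_hits = [d for d in docs if match_or("Anthropology 230", d)]
term_hits = [d for d in docs if term_exact("Anthropology 230", d)]

print(match_hits)  # both documents share the token "anthropology"
print(term_hits)   # only the identical string
```

In this model the match-style lookup returns both course names, while the term-style lookup returns only Anthropology 230.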

POST http://{{hostname}}:{{port}}/{{index-name}}/_analyze

{
  "analyzer": "standard",
  "text": "Anthropology 250"
}


{
    "tokens": [
        {
            "token": "anthropology",
            "start_offset": 0,
            "end_offset": 12,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "250",
            "start_offset": 13,
            "end_offset": 16,
            "type": "<NUM>",
            "position": 1
        }
    ]
}

Answer 3

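Assuming you may have more 'Anthropology nnn' items, this should do what you need. The clauses go under should rather than must, because a document with a single name value cannot satisfy both term clauses at once:

"query": {
    "bool": {
        "should": [
            {"term": {"name.keyword": "Anthropology 230"}},
            {"term": {"name.keyword": "Anthropology 250"}}
        ],
        "minimum_should_match": 1
    }
}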
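To match documents whose name is exactly one of several values, a terms query on the keyword subfield is a compact option when the list of course names grows. The sketch below builds such a request body in plain Python. The helper name exact_names_query is hypothetical, and the field name name.keyword is taken from the question.

```python
import json


def exact_names_query(names, field="name.keyword"):
    """Build a search body matching documents whose field equals any given name."""
    return {"query": {"terms": {field: list(names)}}}


body = exact_names_query(["Anthropology 230", "Anthropology 250"])
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of GET courses/_search. Because it targets the keyword subfield, each value has to match the stored string exactly, including case and spacing.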
