Elasticsearch query for multiple terms

Question

I am trying to create a search query that allows searching by name and type. I have indexed those values, and my records in Elasticsearch look like this:

{
  _index: "assets",
  _type: "asset",
  _id: "eAOEN28BcFmQazI-nngR",
  _score: 1,
  _source: {
    name: "test.png",
    mediaType: "IMAGE",
    meta: {
      content-type: "image/png",
      width: 3348,
      height: 1890,
    },
    createdAt: "2019-12-24T10:47:15.727Z",
    updatedAt: "2019-12-24T10:47:15.727Z",
  }
}

For example, how would I create a query that finds all assets that are named "test" and are images?

I tried a multi_match query, but it does not return the correct results:

{
  "query": {
    "multi_match" : {
      "query":      "*test* IMAGE",
      "type":       "cross_fields",
      "fields":     [ "name", "mediaType" ],
      "operator":   "and" 
    }
  }
}

The query above returns 0 results, and if I change the operator to "or" it returns all assets of type IMAGE.

Any suggestions would be greatly appreciated. TIA!

EDIT: Added the mapping. Below is the mapping:

{
    "assets": {
        "aliases": {},
        "mappings": {
            "properties": {
                "__v": {
                    "type": "long"
                },
                "createdAt": {
                    "type": "date"
                },
                "deleted": {
                    "type": "date"
                },
                "mediaType": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "meta": {
                    "properties": {
                        "content-type": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "width": {
                            "type": "long"
                        },
                        "height": {
                            "type": "long"
                        }
                    }
                },
                "name": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "originalName": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "updatedAt": {
                    "type": "date"
                }
            }
        },
        "settings": {
            "index": {
                "creation_date": "1575884312237",
                "number_of_shards": "1",
                "number_of_replicas": "1",
                "uuid": "nSiAoIIwQJqXQRTyqw9CSA",
                "version": {
                    "created": "7030099"
                },
                "provided_name": "assets"
            }
        }
    }
}

Answer 1

Did you try best_fields?

{
  "query": {
    "multi_match" : {
      "query":      "Will Smith",
      "type":       "best_fields",
      "fields":     [ "name", "mediaType" ],
      "operator":   "and" 
    }
  }
}

Answer 2

You don't have to use a wildcard expression for this simple query.

First, change the analyzer on the `name` field.

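You need to create a custom analyzer that replaces . with a space, because the default standard analyzer does not do that: it indexes test.png as the single token test.png, so a search for test finds nothing. With the custom analyzer, the inverted index contains both test and png, so searching for test returns test.png. The main benefit of this approach is that it avoids very expensive regex queries.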
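To make the tokenization difference concrete, here is a rough Python sketch. It only approximates the two analyzers and is not Elasticsearch code: it assumes the standard tokenizer does not split on a dot that sits between two word characters, and the helper names `standard_like_tokens` and `my_analyzer_tokens` are made up for this illustration.

```python
import re

# Rough stand-in for the standard tokenizer: runs of word characters,
# where a dot between two word characters does not split the token.
_TOKEN = re.compile(r"\w+(?:\.\w+)*")


def standard_like_tokens(text):
    """Approximate the default analysis of a text field (tokenize, then lowercase)."""
    return [token.lower() for token in _TOKEN.findall(text)]


def my_analyzer_tokens(text):
    """Approximate my_analyzer: replace dots with spaces, then tokenize.

    No lowercasing happens here, because the my_analyzer definition in this
    answer does not list any token filters.
    """
    return _TOKEN.findall(text.replace(".", " "))


print(standard_like_tokens("test.png"))  # ['test.png']
print(my_analyzer_tokens("test.png"))    # ['test', 'png']
```

A match query for test can only find a document if test is one of the indexed tokens, which is the case only in the second output. If names can contain uppercase letters, adding `"filter": ["lowercase"]` to the analyzer definition is one option to keep the search case-insensitive.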

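Below is the updated mapping with the custom analyzer, which does the job for you. The analyzer of an existing field cannot be changed in place, so create the index with this mapping and re-index all the docs.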

{
    "aliases": {},
    "mappings": {
        "properties": {
            "__v": {
                "type": "long"
            },
            "createdAt": {
                "type": "date"
            },
            "deleted": {
                "type": "date"
            },
            "mediaType": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    }
                }
            },
            "meta": {
                "properties": {
                    "content-type": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "width": {
                        "type": "long"
                    },
                    "height": {
                        "type": "long"
                    }
                }
            },
            "name": {
                "type": "text",
                "analyzer" : "my_analyzer"
            },
            "originalName": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    }
                }
            },
            "updatedAt": {
                "type": "date"
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "standard",
                    "char_filter": [
                        "replace_dots"
                    ]
                }
            },
            "char_filter": {
                "replace_dots": {
                    "type": "mapping",
                    "mappings": [
                        ". => \u0020"
                    ]
                }
            }
        },
        "index": {
            "number_of_shards": "1",
            "number_of_replicas": "1"
        }
    }
}
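For the re-indexing step, one possible route is to create a new index with the mapping above and copy the documents over with the `_reindex` API. The sketch below only builds the request body and does not talk to a cluster; the destination name `assets_v2` and the helper `build_reindex_body` are assumptions made for this example.

```python
import json

SOURCE_INDEX = "assets"    # index name taken from the question
DEST_INDEX = "assets_v2"   # hypothetical name for the newly created index


def build_reindex_body(source, dest):
    """Build the request body for POST /_reindex."""
    return {"source": {"index": source}, "dest": {"index": dest}}


# Intended order of operations:
#   1. PUT /assets_v2 with the mapping and settings shown above
#   2. POST /_reindex with the body printed below
print(json.dumps(build_reindex_body(SOURCE_INDEX, DEST_INDEX), indent=4))
```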

Second, you should change your query to a bool query, as shown below:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "name": "test"
                    }
                },
                {
                    "match": {
                        "mediaType.keyword": "IMAGE"
                    }
                }
            ]
        }
    }
}

This uses must with 2 match queries, which means it returns docs only when all the clauses of the must query match.
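If the request is built from application code, the same bool query can be assembled as a plain dictionary. This is a minimal sketch: the function name `build_asset_query` is hypothetical, and the code only constructs the JSON body without sending it to Elasticsearch.

```python
import json


def build_asset_query(name, media_type):
    """Build a bool query in which both clauses must match."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Analyzed full-text match against the name field.
                    {"match": {"name": name}},
                    # Exact match against the keyword sub-field.
                    {"match": {"mediaType.keyword": media_type}},
                ]
            }
        }
    }


body = build_asset_query("test", "IMAGE")
print(json.dumps(body, indent=4))
```

Keeping the two conditions as separate clauses avoids mixing the name and the media type into one query string, which is what made the cross_fields attempt hard to reason about.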

I tested my solution by creating the index, inserting a few sample docs, and querying them. Let me know if you need any help.
