Elasticsearch-dsl with nested filters and AND and OR conditions with exact match

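Three parameters come from the front end:

  1. State - a string.
  2. Categories - an array of strings. A string can consist of several words.
  3. Tags - same as categories.

All parameters are optional.

If several parameters are passed, they must be combined with AND (a match on state AND category AND tag). If several categories or tags are submitted, a match on at least one of them is enough.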

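That is, if a request arrives with the parameters

{"state": "Alaska", "categories": ["category 1", "category 2"]}

the response should contain only articles whose state is Alaska and that have at least one of the categories category 1 or category 2. Articles that satisfy only one of these conditions do not match.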
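The intended matching rule can be written down as a plain-Python predicate, independent of Elasticsearch. This is only a sketch of the requirement as described above: the function name `article_matches` and the shape of the `article` dict are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


def article_matches(
    article: Dict[str, Any],
    state: Optional[str] = None,
    categories: Optional[List[str]] = None,
    tags: Optional[List[str]] = None,
) -> bool:
    """Return True if the article satisfies every filter that was supplied."""
    # An omitted (None or empty) parameter places no constraint on the article.
    if state and article.get("state") != state:
        return False
    # Within categories or tags, one exact match is enough (OR).
    if categories and not set(categories) & set(article.get("categories", [])):
        return False
    if tags and not set(tags) & set(article.get("tags", [])):
        return False
    # Every supplied filter matched (AND across parameters).
    return True
```

Note that the comparison is on whole strings, so `"category"` does not match an article tagged `"category 1"`.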

I send the request to Elasticsearch from Python (3.7), using the elasticsearch-dsl library.

I built the three filters from Q objects (using match inside them).

combined_filter = state_filter & categories_filter & tags_filter

The categories and tags lists are split into subfilters that are combined through OR.

query = queries.pop()
for item in queries:
    query |= item

This is the query that gets built for Elasticsearch.

Bool(minimum_should_match=1, 
    must=[Match(state='Alaska'), MatchAll()], 
    should=[Match(categories='category 1'), Match(categories='category 2')]
)

Why does this logic find entries whose category / tag names are not an exact match?

from typing import List

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Q, Search
from flask import request
from flask.views import MethodView

from .models import ArticleModel  # ArticleModel.__tablename__ == "articles"


es = Elasticsearch()


class ArticleSearchAPIView(MethodView):
    """
    Search articles using ElasticSearch
    """

    @staticmethod
    def filter_create(queries: List[Q]) -> Q:
        """
        Creates Q.OR filter
        """
        query = queries.pop()
        for item in queries:
            query |= item
        return query

    def get(self) -> dict:
        """
        Search article
        First request - with empty params
        """
        search = Search(using=es, index=ArticleModel.__tablename__)
        state_filter = categories_filter = tags_filter = Q()
        result = "Articles not found."

        data = request.get_json()
        categories = data.get("categories")
        tags = data.get("tags")
        state = data.get("state")

        if state:
            state_filter = Q("match", state=state)

        if categories:
            queries = [Q("match", categories=value) for value in categories]
            categories_filter = self.filter_create(queries)

        if tags:
            queries = [Q("match", tags=value) for value in tags]
            tags_filter = self.filter_create(queries)

        combined_filter = state_filter & categories_filter & tags_filter
        found = (
            search.filter(combined_filter)
            .execute()
            .to_dict()["hits"]
            .get("hits")
        )

        if found:
            result = [article["_source"] for article in found]
        return {"response": result}

Update


The relationships between Article and Category, and between Article and Tag, are many-to-many (MTM).

Mapping

{
  "articles": {
    "mappings": {
      "properties": {
        ...
        "categories": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "state": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "tags": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
        ...
      }
    }
  }
}

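You can use a bool query.

See the Elasticsearch Boolean Query documentation.

In a bool query, 'must' is the equivalent of the 'AND' operator and 'should' acts as the 'OR' operator. Note that when a 'must' clause is also present, the 'should' clauses are optional and only affect scoring unless 'minimum_should_match' is set.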

{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user" : "kimchy" }
      },
      "should" : [
        { "term" : { "tag" : "wow" } },
        { "term" : { "tag" : "elasticsearch" } }
      ],
      "minimum_should_match" : 1
    }
  }
}
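Whether the term clauses in a query like this match exactly depends on the field they target. With the mapping from the question, categories is an analyzed text field with a categories.keyword sub-field that stores the whole value. A match query on the text field analyzes the input into tokens and, with the default OR operator, accepts a document that shares any token. The following is a rough pure-Python illustration of that difference. The tokenizer here (lowercase, split on non-word characters) is a simplified stand-in for the standard analyzer, not its actual implementation, and all function names are made up for the example.

```python
import re
from typing import List, Set


def analyze(text: str) -> Set[str]:
    """Simplified stand-in for the standard analyzer: lowercase and split into word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def match_text(field_values: List[str], query: str) -> bool:
    """Mimics `match` on a text field: any shared token counts as a hit (OR operator)."""
    indexed: Set[str] = set()
    for value in field_values:
        indexed |= analyze(value)
    return bool(indexed & analyze(query))


def term_keyword(field_values: List[str], query: str) -> bool:
    """Mimics `term` on a keyword field: the whole value must be identical."""
    return query in field_values


doc_categories = ["category 10"]
print(match_text(doc_categories, "category 1"))    # True: both contain the token "category"
print(term_keyword(doc_categories, "category 1"))  # False: "category 10" != "category 1"
```

This is why searching for "category 1" with match can also return articles in "category 2" or "category 10", while term or terms against the .keyword sub-field does not.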

I decided that elasticsearch-dsl is not needed here.

This is the solution I came up with.

from typing import Dict, List, Tuple, Union

from elasticsearch import Elasticsearch
from flask import request
from flask.views import MethodView

from .models import ArticleModel  # ArticleModel.__tablename__ == "articles"


es = Elasticsearch()


class ArticleSearchAPIView(MethodView):
    """
    Search articles using ElasticSearch
    """

    def get(
        self,
    ) -> Union[
        Dict[str, Union[list, List[str]]],
        Tuple[Dict[str, str], int],
        Dict[str, Union[list, str]],
    ]:
        """
        Search articles
        """
        data = request.get_json()
        categories = data.get("categories")
        tags = data.get("tags")
        state = data.get("state")
        result = "Articles not found."

        query = {"bool": {"must": []}}
        if state:
            query["bool"]["must"].append({"term": {"state.keyword": state}})
        if categories:
            query["bool"]["must"].append(
                {"terms": {"categories.keyword": categories}}
            )
        if tags:
            query["bool"]["must"].append({"terms": {"tags.keyword": tags}})

        found = es.search(
            index=ArticleModel.__tablename__, body={"query": query},
        )["hits"].get("hits")

        if found:
            result = [article["_source"] for article in found]
        return {"response": result}
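One possible variation, offered as a sketch rather than as part of the solution above: the query construction can be pulled out into a pure function so it can be checked without a running cluster. The helper name `build_article_query` is hypothetical. It also places the clauses in the bool query's `filter` context instead of `must`; both require every clause to match, but `filter` skips relevance scoring, which is not needed for exact filtering. When no parameter is supplied it falls back to `match_all`.

```python
from typing import Any, Dict, List, Optional


def build_article_query(
    state: Optional[str] = None,
    categories: Optional[List[str]] = None,
    tags: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build the query body for the optional state / categories / tags filters."""
    clauses: List[Dict[str, Any]] = []
    if state:
        # Exact comparison against the un-analyzed keyword sub-field.
        clauses.append({"term": {"state.keyword": state}})
    if categories:
        # `terms` matches when the field holds at least one of the listed values.
        clauses.append({"terms": {"categories.keyword": categories}})
    if tags:
        clauses.append({"terms": {"tags.keyword": tags}})
    if not clauses:
        # No filters supplied: return every article.
        return {"match_all": {}}
    # Every clause in `filter` must match (AND); no score is computed.
    return {"bool": {"filter": clauses}}
```

The result would be passed to the search call in the same way as before, as body={"query": build_article_query(state, categories, tags)}.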