Elasticsearch match query over an array field

Suppose I have 3 documents:

doc_1 = {
    "citedIn": [
        "Bar Councils Act, 1926 - Section 15",
        "Contract Act, 1872 - Section 23"
    ]
}

doc_2 = {
    "citedIn":[
        "15 C. B 400", 
        "Contract Act, 1872 - Section 55"
    ]
}

doc_3 = {
    "citedIn":[
        "15 C. B 400", 
        "Contract Act, 1872 - Section 15"
    ]
}

Here the citedIn field is an array. Now I want to run a standard match query:

{
    "query": {
        "match": {"citedIn": {"query": "Contract act 15", "operator": "and"}}
    }
}

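The query above returns all 3 documents, but it is supposed to return only doc_3, because only doc_3 contains Contract, Act, and 15 together in a single array element.

How can I achieve this?

Any suggestion or solution would be appreciated.

Update with nested data type:

I tried a nested field. Here is my mapping: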

{
    "mappings": {
        "properties": {
            "citedIn": {
                "type": "nested",
                "include_in_parent": true,
                "properties": {
                    "someFiled": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        }
    }
}

Here is my data:

doc_1 = {
    "citedIn": [
        {"someFiled" : "Bar Councils Act, 1926 - Section 15"},
        {"someFiled" : "Contract Act, 1872 - Section 23"}
    ]
}

doc_2 = {
    "citedIn":[
        {"someFiled" : "15 C. B 400"},
        {"someFiled" : "Contract Act, 1872 - Section 55"}
    ]
}

doc_3 = {
    "citedIn":[
        {"someFiled" : "15 C. B 400"},
        {"someFiled" : "Contract Act, 1872 - Section 15"}
    ]
}

Here is my query:

{
    "query": {
        "match": {"citedIn.someFiled": {"query": "Contract act 15", "operator": "and"}}
    }
}

But I still get the same result.

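You cannot achieve this with your original mapping, because what you are indexing in the citedIn field is an array of strings, and every Elasticsearch field is multi-valued by default, as it is in Lucene, the search library on which Elasticsearch is built.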

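Please read the Elasticsearch documentation on arrays for more information, especially the important note at the end.

As explained there, all the strings in the array belong to the same field, so ES cannot tell whether your search terms come from the same string in the array, which is why your search returns all the documents.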

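Unless you index these strings as part of another field, such as a nested field, you will not be able to achieve your use case. For that you need to provide a field name: each array element becomes a map in which the key is your field name and the value is the string, and you then query on that field name.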
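To make the difference concrete, here is a minimal Python sketch that imitates both behaviours on the three documents from the question. It is only an illustration: `tokenize` is a rough stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and `match_flattened` and `match_per_element` are hypothetical helper names, not Elasticsearch APIs.

```python
import re

def tokenize(text):
    # Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())

DOCS = {
    "doc_1": ["Bar Councils Act, 1926 - Section 15", "Contract Act, 1872 - Section 23"],
    "doc_2": ["15 C. B 400", "Contract Act, 1872 - Section 55"],
    "doc_3": ["15 C. B 400", "Contract Act, 1872 - Section 15"],
}

def match_flattened(values, query):
    # Array of strings: every value contributes to one shared set of terms.
    terms = {t for v in values for t in tokenize(v)}
    return all(q in terms for q in tokenize(query))

def match_per_element(values, query):
    # Nested-style matching: all query terms must occur in the same element.
    wanted = tokenize(query)
    return any(all(q in set(tokenize(v)) for q in wanted) for v in values)

query = "Contract act 15"
flat_hits = [name for name, vals in DOCS.items() if match_flattened(vals, query)]
nested_hits = [name for name, vals in DOCS.items() if match_per_element(vals, query)]
print(flat_hits)    # ['doc_1', 'doc_2', 'doc_3']
print(nested_hits)  # ['doc_3']
```

The flattened check finds "contract", "act" and "15" somewhere in every document, while the per-element check only succeeds for doc_3.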

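Adding a working example with index data, mapping, search query, and search result.

You need to use a nested query to search nested fields. A plain match query, as in your update, does not evaluate each nested object separately.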

Index mapping:

{
    "mappings": {
        "properties": {
            "citedIn": {
                "type": "nested"
            }
        }
    }
}

Index data:

{
    "citedIn": [
        {
            "someFiled": "Bar Councils Act, 1926 - Section 15"
        },
        {
            "someFiled": "Contract Act, 1872 - Section 23"
        }
    ]
}
{
    "citedIn": [
        {
            "someFiled": "15 C. B 400"
        },
        {
            "someFiled": "Contract Act, 1872 - Section 55"
        }
    ]
}
{
    "citedIn": [
        {
            "someFiled": "15 C. B 400"
        },
        {
            "someFiled": "Contract Act, 1872 - Section 15"
        }
    ]
}

Search query:

{
    "query": {
        "nested": {
            "path": "citedIn",
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "citedIn.someFiled": "contract"
                            }
                        },
                        {
                            "match": {
                                "citedIn.someFiled": "act"
                            }
                        },
                        {
                            "match": {
                                "citedIn.someFiled": 15
                            }
                        }
                    ]
                }
            },
            "inner_hits": {}
        }
    }
}
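
As a possible shorthand, the three match clauses can likely be collapsed into a single match with "operator": "and", as in the question, provided it is wrapped in the nested query. The sketch below builds that request body as a Python dict and prints it as JSON; `nested_and_match` is a hypothetical helper name used only for illustration, and the body should select the same documents as the query above.

```python
import json

def nested_and_match(path, field, text):
    """Build a nested query that requires every term of `text` in one nested element."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "match": {
                        # Full field name is "<path>.<field>", e.g. citedIn.someFiled
                        f"{path}.{field}": {"query": text, "operator": "and"}
                    }
                },
                # Ask Elasticsearch to report which nested elements matched.
                "inner_hits": {},
            }
        }
    }

body = nested_and_match("citedIn", "someFiled", "Contract act 15")
print(json.dumps(body, indent=4))
```

The printed JSON can be sent as the body of a search request against the index with the nested mapping.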

Search result:

"inner_hits": {
          "citedIn": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": 1.620718,
              "hits": [
                {
                  "_index": "stof_64170705",
                  "_type": "_doc",
                  "_id": "3",
                  "_nested": {
                    "field": "citedIn",
                    "offset": 1
                  },
                  "_score": 1.620718,
                  "_source": {
                    "someFiled": "Contract Act, 1872 - Section 15"
                  }
                }
              ]
            }
          }
        }
      }
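
If you want to read the matching array elements out of such a response in code, something along these lines should work. This is a sketch under assumptions: `matched_elements` is a hypothetical helper, and the sample `response` is a trimmed dict whose inner part mirrors the result above, while the outer `hits.hits` wrapper is assumed to follow the usual search response shape.

```python
def matched_elements(response, path):
    """Yield (doc_id, offset, source) for each nested element that matched."""
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(path, {})
        for nested_hit in inner.get("hits", {}).get("hits", []):
            # _nested.offset is the position of the element inside the array.
            yield hit["_id"], nested_hit["_nested"]["offset"], nested_hit["_source"]

# Trimmed sample response; the outer wrapper is an assumption, not copied from the answer.
response = {
    "hits": {
        "hits": [
            {
                "_id": "3",
                "inner_hits": {
                    "citedIn": {
                        "hits": {
                            "hits": [
                                {
                                    "_nested": {"field": "citedIn", "offset": 1},
                                    "_source": {"someFiled": "Contract Act, 1872 - Section 15"},
                                }
                            ]
                        }
                    }
                },
            }
        ]
    }
}

results = list(matched_elements(response, "citedIn"))
print(results)
```

Here offset 1 points at the second element of citedIn in document 3, which is the only element containing all three terms.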