Elasticsearch: filter documents that contain an array of empty strings
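I have documents in Elasticsearch, and I want to filter out the ones whose array contains only empty strings or nothing at all (an empty array).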
#doc 1
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc": {
      "field": ["", ""]
    }
  }
}
#doc 2
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc": {
      "field": []
    }
  }
}
#doc 3
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc": {
      "field": ["hello", ""]
    }
  }
}
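Is it possible to filter out only doc 1 and doc 2 from the documents above? In those two, "field" either has nothing in the array or contains only empty strings.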
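Try the query below. It returns only the documents whose array is empty or consists entirely of empty strings. It uses the city.keyword field from my test index, so replace it with the path to your own field.
The first should clause checks whether an empty string is one of the array's values. The second should clause checks whether the field has no indexed value at all, which is the case for an empty array. The must_not clause with the wildcard pattern ?* then removes every document whose array has at least one non-empty element.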
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "city.keyword": {
              "value": ""
            }
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "exists": {
                  "field": "city.keyword"
                }
              }
            ]
          }
        }
      ],
      "must_not": [
        {
          "wildcard": {
            "city.keyword": "?*"
          }
        }
      ]
    }
  }
}
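For the documents in the question, the same idea might look like the sketch below. It assumes that doc.field was dynamically mapped as text with a keyword sub-field, so the path doc.field.keyword is an assumption and not something the question confirms. As far as I can tell, the two should clauses only affect scoring: any document that survives the must_not wildcard already has either only empty-string values or no indexed value, so this shorter form should select the same documents.

```json
{
  "query": {
    "bool": {
      "must_not": [
        {
          "wildcard": {
            "doc.field.keyword": {
              "value": "?*"
            }
          }
        }
      ]
    }
  }
}
```

Note that this also matches documents where the field is missing entirely, just as the must_not/exists clause in the longer query does.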
Here are the sample documents in my index:
{
  "hits" : [
    {
      "_index" : "arrayindex",
      "_type" : "_doc",
      "_id" : "4g3P2H4BrzeQ9ErqJwUL",
      "_score" : 1.0,
      "_source" : {
        "city" : [
          "",
          ""
        ]
      }
    },
    {
      "_index" : "arrayindex",
      "_type" : "_doc",
      "_id" : "4w3P2H4BrzeQ9ErqXgWT",
      "_score" : 1.0,
      "_source" : {
        "city" : [ ]
      }
    },
    {
      "_index" : "arrayindex",
      "_type" : "_doc",
      "_id" : "5A3P2H4BrzeQ9ErqhwUI",
      "_score" : 1.0,
      "_source" : {
        "city" : [
          "hello",
          ""
        ]
      }
    },
    {
      "_index" : "arrayindex",
      "_type" : "_doc",
      "_id" : "5Q3q2H4BrzeQ9ErqOAXW",
      "_score" : 1.0,
      "_source" : {
        "city" : [
          "hello",
          "sagar"
        ]
      }
    }
  ]
}
Sample output after running the query above:
{
  "hits" : [
    {
      "_index" : "arrayindex",
      "_type" : "_doc",
      "_id" : "4g3P2H4BrzeQ9ErqJwUL",
      "_score" : 0.5619608,
      "_source" : {
        "city" : [
          "",
          ""
        ]
      }
    },
    {
      "_index" : "arrayindex",
      "_type" : "_doc",
      "_id" : "4w3P2H4BrzeQ9ErqXgWT",
      "_score" : 0.0,
      "_source" : {
        "city" : [ ]
      }
    }
  ]
}