Count and sort by the number of occurrences in an array
I have a type called account with the following mapping:
"country" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"followingClientIds" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
},
"fielddata" : true
},
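followingClientIds is an array of string IDs of the other accounts I follow.
I want to build a query that fetches every account from one country and sorts them by the number of accounts we both follow.
Here is the query I have tried so far: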
GET account/_search
{
"size": 20,
"query": {
"bool": {
"filter": {
"term": {
"country.keyword": "AT"
}
}
}
},
"sort": [
{
"followingClientIds.keyword": {
"order": "asc",
"nested_filter": {
"terms": {
"followingClientIds.keyword": [
"dFbEW23hVZ3w8jhH9LeCw3QG33UjuF5C"
]
}
}
}
}
]
}
For example, I have these 3 documents in the account type:
{
"username": "user2",
"country": "AT",
"followingClientIds": ["abc"]
},
{
"username": "user3",
"country": "AT",
"followingClientIds": ["abc", "bcd", "cde"]
},
{
"username": "user4",
"country": "AT",
"followingClientIds": ["abc"]
}
Suppose I send this country and these followingClientIds to the query to sort by:
{
"country": "AT",
"followingClientIds": ["abc", "bcd", "cde"]
}
I would like the result to look like this:
{
"username": "user3",
"country": "AT",
"followingClientIds": ["abc", "bcd", "cde"],
"fields": { // don't really need this custom field, but it would be cool
"mutual_following_count": 3
}
},
{
"username": "user2",
"country": "AT",
"followingClientIds": ["abc"],
"fields": {
"mutual_following_count": 1
}
},
{
"username": "user4",
"country": "AT",
"followingClientIds": ["abc"],
"fields": {
"mutual_following_count": 1
}
}
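If you're looking for a standalone computed field called mutual_following_count, you can get one with a script field (script_fields) built from the script below. But you won't be able to sort on it.
The only other option is a script sort, which first computes a value and then sorts by it. The resulting query could look like this: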
{
"size": 20,
"query": {
"bool": {
"filter": {
"term": {
"country.keyword": "AT"
}
}
}
},
"sort": [
{
"_script": {
"type": "number",
"order": "desc",
"script": {
"lang": "painless",
"params": {
"followingClientIds": ["abc", "bcd", "cde"]
},
"source": """
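// deduplicate; read the keyword sub-field so IDs are compared untokenized
// (the analyzed text field would lowercase and split them)
def fromSource = doc['followingClientIds.keyword']
.stream()
.distinct()
.collect(Collectors.toList());
def fromParams = params.followingClientIds
.stream()
.distinct()
.collect(Collectors.toList());
// size() already returns an int, so no cast is needed for a number sort
return fromParams.findAll(x -> fromSource.contains(x)).size();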
"""
}
}
}
]
}
The downside is that you cannot name that sort value. It comes back in each hit's sort array, not as mutual_following_count or under any other name.
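If you want both the ordering and the named value, one possible approach is to repeat the same script in script_fields alongside the script sort. The request below is only a sketch under a few assumptions: it reads the followingClientIds.keyword sub-field from the mapping in the question so that IDs are compared untokenized, it sets "_source": true explicitly so the original documents are returned next to the computed field, and it reuses the example parameter values from above. It has not been run against a real cluster.

```
GET account/_search
{
  "size": 20,
  "_source": true,
  "query": {
    "bool": {
      "filter": {
        "term": {
          "country.keyword": "AT"
        }
      }
    }
  },
  "script_fields": {
    "mutual_following_count": {
      "script": {
        "lang": "painless",
        "params": {
          "followingClientIds": ["abc", "bcd", "cde"]
        },
        "source": """
          // assumption: same counting logic as the sort script, on the keyword sub-field
          def fromSource = doc['followingClientIds.keyword'];
          def fromParams = params.followingClientIds
                            .stream()
                            .distinct()
                            .collect(Collectors.toList());
          // number of requested IDs that this account also follows
          return fromParams.findAll(x -> fromSource.contains(x)).size();
        """
      }
    }
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "order": "desc",
        "script": {
          "lang": "painless",
          "params": {
            "followingClientIds": ["abc", "bcd", "cde"]
          },
          "source": """
            // identical script; the sort cannot refer to the script field by name
            def fromSource = doc['followingClientIds.keyword'];
            def fromParams = params.followingClientIds
                              .stream()
                              .distinct()
                              .collect(Collectors.toList());
            return fromParams.findAll(x -> fromSource.contains(x)).size();
          """
        }
      }
    }
  ]
}
```

Each hit should then carry the count twice: once in its sort array and once under fields.mutual_following_count, where script field values are returned as arrays. The script is evaluated once for sorting and once for the field, so the extra cost is worth keeping in mind on large result sets.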