Elasticsearch Terms aggregation with unknown datatype
I'm indexing data with an unknown schema in Elasticsearch using dynamic mapping, i.e. we don't know the shape, datatypes, etc. of much of the data ahead of time. In queries, I want to be able to aggregate on any field. Strings are (by default) mapped as both text and keyword types, and only the latter can be aggregated on. So for strings, my terms aggregations must look like this:
    "aggs": {
      "something": {
        "terms": {
          "field": "something.keyword"
        }
      }
    }
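But other types, such as numbers and booleans, don't have this .keyword sub-field, so aggregations on those types must look like this (which in turn fails for text fields):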
    "aggs": {
      "something": {
        "terms": {
          "field": "something"
        }
      }
    }
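Is there any way to specify a terms aggregation that basically says "if something.keyword exists, use that, otherwise just use something", without a significant performance hit?
Requiring datatype information to be supplied at query time might be an option for me, but ideally I'd like to avoid it if possible.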
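If the primary use case is aggregations, it may be worth changing the dynamic mapping for string properties so that they are indexed as the keyword datatype, with a multi-field sub-field indexed as the text datatype, i.e. in dynamic_templates: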
    "dynamic_templates": [
      {
        "strings": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword",
            "ignore_above": 256,
            "fields": {
              "text": {
                "type": "text"
              }
            }
          }
        }
      }
    ],
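With a template like this in place, a string field is itself a keyword, so the same aggregation shape should work for strings, numbers, and booleans alike. The search body below is a minimal sketch of that: it assumes a field named something (as in the question), and the search phrase "some words" is a hypothetical placeholder. Full-text matching goes through the something.text sub-field, while the terms aggregation targets the bare field name.

```json
{
  "size": 0,
  "query": {
    "match": {
      "something.text": "some words"
    }
  },
  "aggs": {
    "something": {
      "terms": {
        "field": "something"
      }
    }
  }
}
```

Note that a dynamic template only affects fields mapped after it is added, so fields already mapped as text would need a reindex to pick up the new mapping.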