In Kibana's Vega, how can I create layers from two different aggs in one request?
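In Elasticsearch's HTTP API, a single request to the _search API can contain both a bucket aggregation and a metric aggregation. In Kibana's Vega environment, how can I create a Vega visualization that uses a single _search request with both a bucket aggregation and a metric aggregation, and then draws a chart in which one layer uses the data from the buckets and another layer uses the data from the metric?
To make the question more concrete, consider this example:
Suppose we are a hat manufacturer. Multiple stores sell our hats. We have an Elasticsearch index, hat-sales, that holds one document for each hat sold. Each document records the store where the hat was sold.
Here are two example documents from that index: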
{
"type": "top",
"color": "black",
"price": 19,
"store": "Macy's"
}
{
"type": "fez",
"color": "red",
"price": 94,
"store": "Walmart"
}
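I want to create a bar chart showing the number of hats sold by the top 3 stores. I also want a horizontal rule on this chart showing the average number of hats sold across all stores, not just the top 3. Here is a sketch of what I'd like the chart to look like:
If we do it like this and let Vega calculate the average: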
{
"$schema": "https://vega.github.io/schema/vega-lite/v2.json",
"title": "Hat Sales",
"data": {
"url": {
"index": "hat-sales",
"body": {
"size": 0,
"query": {"match_all": {}},
"aggs": {"stores": {"terms": {"field": "store.keyword", "size": 3}}}
}
},
"format": {"property": "aggregations.stores.buckets"}
},
"transform": [
{"calculate": "datum.key", "as": "store"},
{"calculate": "datum.doc_count", "as": "count"}
],
"layer": [
{
"name": "Sales of top 3 stores",
"mark": "bar",
"encoding": {
"x": {"type": "nominal", "field": "store", "sort": "-y"},
"y": {"type": "quantitative", "field": "count"}
}
},
{
"name": "Average number of sales over all stores",
"mark": {"type": "rule", "color": "red"},
"encoding": {"y": {"aggregate": "mean", "field": "count"}}
}
]
}
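which looks like this:
then the horizontal rule is the average of only the top 3 stores. Instead, we need to add another metric aggregation to the Elasticsearch request, one that computes the average number of hats sold per store across all stores. We want to do something like this: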
{
"$schema": "https://vega.github.io/schema/vega-lite/v2.json",
"title": "Hat Sales",
"data": {
"url": {
"index": "hat-sales",
"body": {
"size": 0,
"query": {"match_all": {}},
"aggs": {
"stores": {"terms": {"field": "store.keyword", "size": 3}},
"global": {
"filters": {
"filters": {"all": {"exists": {"field": "store.keyword"}}}
},
"aggs": {
"count": {"value_count": {"field": "store.keyword"}},
"unique_count": {"cardinality": {"field": "store.keyword"}},
"global_average": {
"bucket_script": {
"buckets_path": {"total": "count", "unique": "unique_count"},
"script": "params.total / params.unique"
}
}
}
}
}
}
},
"format": {"property": "aggregations.stores.buckets"}
},
"transform": [
{"calculate": "datum.key", "as": "store"},
{"calculate": "datum.doc_count", "as": "count"}
],
"layer": [
{
"name": "Sales of top 3 stores",
"mark": "bar",
"encoding": {
"x": {"type": "nominal", "field": "store", "sort": "-y"},
"y": {"type": "quantitative", "field": "count"}
}
},
{
"name": "Average number of sales over all stores",
"mark": {"type": "rule", "color": "red"},
??????????????????
}
]
}
But how can I make one layer use the data from "aggregations.stores.buckets" and the other layer use the data from "aggregations.global.buckets", so that it can access global_average?
I did get it working with this:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "A simple bar chart with embedded data.",
"data": {
"url": {
"index": "hat-sales",
"body": {
"size": 0,
"query": {"match_all": {}},
"aggs": {
"stores": {"terms": {"field": "store.keyword", "size": 3}},
"global": {
"filters": {
"filters": {"all": {"exists": {"field": "store.keyword"}}}
},
"aggs": {
"count": {"value_count": {"field": "store.keyword"}},
"unique_count": {"cardinality": {"field": "store.keyword"}},
"global_average": {
"bucket_script": {
"buckets_path": {"total": "count", "unique": "unique_count"},
"script": "params.total / params.unique"
}
}
}
}
}
}
}
},
"transform": [
{"flatten": ["aggregations.stores.buckets"]},
{"calculate": "datum['aggregations.stores.buckets'].key", "as": "store"},
{
"calculate": "datum['aggregations.stores.buckets'].doc_count",
"as": "count"
},
{
"calculate": "datum.aggregations.global.buckets.all.global_average.value",
"as": "global_average"
}
],
"layer": [
{
"name": "Sales of top 3 stores",
"mark": "bar",
"encoding": {
"x": {"type": "nominal", "field": "store", "sort": "-y"},
"y": {"type": "quantitative", "field": "count"}
}
},
{
"name": "Global Average",
"mark": {"type": "rule", "color": "red"},
"encoding": {"y": {"field": "global_average", "type": "quantitative"}}
}
]
}
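It's not ideal, because the flatten transform makes each individual datum object somewhat larger: every flattened datum still carries the rest of the response. It's also confusing that once you flatten aggregations.stores.buckets, that string becomes the literal name of a field on datum, "aggregations.stores.buckets", which has to be accessed with bracket notation because it contains periods.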
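A possible variation on the spec above, offered as an untested sketch rather than a verified fix: Vega-Lite's flatten transform accepts an "as" list that names the output field, and each layer can carry its own "transform" while sharing the top-level "data". The output name "bucket" is my own choice, not something from the original spec. I'm assuming Kibana still issues only one _search request for the shared top-level "data" block, and that the response shape matches what the working spec above relies on.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Untested sketch: per-layer transforms over one shared _search response. The field name 'bucket' is an illustrative choice.",
  "title": "Hat Sales",
  "data": {
    "url": {
      "index": "hat-sales",
      "body": {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
          "stores": {"terms": {"field": "store.keyword", "size": 3}},
          "global": {
            "filters": {
              "filters": {"all": {"exists": {"field": "store.keyword"}}}
            },
            "aggs": {
              "count": {"value_count": {"field": "store.keyword"}},
              "unique_count": {"cardinality": {"field": "store.keyword"}},
              "global_average": {
                "bucket_script": {
                  "buckets_path": {"total": "count", "unique": "unique_count"},
                  "script": "params.total / params.unique"
                }
              }
            }
          }
        }
      }
    }
  },
  "layer": [
    {
      "name": "Sales of top 3 stores",
      "transform": [
        {"flatten": ["aggregations.stores.buckets"], "as": ["bucket"]},
        {"calculate": "datum.bucket.key", "as": "store"},
        {"calculate": "datum.bucket.doc_count", "as": "count"}
      ],
      "mark": "bar",
      "encoding": {
        "x": {"type": "nominal", "field": "store", "sort": "-y"},
        "y": {"type": "quantitative", "field": "count"}
      }
    },
    {
      "name": "Global Average",
      "transform": [
        {
          "calculate": "datum.aggregations.global.buckets.all.global_average.value",
          "as": "global_average"
        }
      ],
      "mark": {"type": "rule", "color": "red"},
      "encoding": {"y": {"field": "global_average", "type": "quantitative"}}
    }
  ]
}
```

With "as": ["bucket"], the flattened value should be reachable as datum.bucket, so the bracket notation is no longer needed. Because the flatten now sits inside the bar layer, the rule layer works on the single un-flattened response object and draws one rule instead of three overlapping ones. The flattened datums are just as large as before, so this only addresses the naming confusion, not the size concern.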