Vega-Lite: multiple aggregations in transforms
I want to apply two different aggregations in a transform, because they need different groupby conditions, but this does not seem to be possible with Vega-Lite.
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"response":200,"request":"/ST"},
{"response":500,"request":"/ST"},
{"response":200,"request":"/PP"},
{"response":500,"request":"/PP"},
{"response":200,"request":"/CP"},
{"response":200,"request":"/CP"},
{"response":500,"request":"/CP"},
{"response":500,"request":"/CP"},
{"response":500,"request":"/CP"},
{"response":500,"request":"/CP"},
{"response":500,"request":"/CP"},
{"response":500,"request":"/CP"},
{"response":503,"request":"/CP"},
{"response":503,"request":"/CP"},
{"response":503,"request":"/CP"}
]
},
"transform": [
{
"aggregate": [{
"op": "count",
"as": "response_count"
}],
"groupby": ["response","request"]
},
{
"aggregate": [{
"op": "count",
"as": "response_c"
}],
"groupby": ["request"]
}
],
"mark": "bar",
"encoding": {
"x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
"y": {"field": "request", "type": "nominal"},
"color": {"field": "response", "type": "nominal"}
}
}
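Is there any way to achieve this? Are multiple aggregations like this supported?
Yes, multiple aggregations like this are supported, but your chart ends up referencing undefined fields, because your second aggregate does not carry them through. You start with the following data: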
[
{"response": 200, "request": "/ST"},
{"response": 500, "request": "/ST"},
{"response": 200, "request": "/PP"},
{"response": 500, "request": "/PP"},
{"response": 200, "request": "/CP"},
{"response": 200, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 503, "request": "/CP"},
{"response": 503, "request": "/CP"},
{"response": 503, "request": "/CP"}
]
The first aggregate groups by "response" and "request" and adds a "response_count" to each group, giving:
[
{"response": 200, "request": "/ST", "response_count": 1},
{"response": 500, "request": "/ST", "response_count": 1},
{"response": 200, "request": "/PP", "response_count": 1},
{"response": 500, "request": "/PP", "response_count": 1},
{"response": 200, "request": "/CP", "response_count": 2},
{"response": 500, "request": "/CP", "response_count": 6},
{"response": 503, "request": "/CP", "response_count": 3}
]
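To make the grouping concrete, here is a minimal Python sketch that mimics what a count aggregate does to the rows. It is not how Vega-Lite is implemented; `count_by` is a hypothetical helper written only for this illustration.

```python
from collections import Counter

# The 15 input rows from the question, written compactly.
raw = [(200, "/ST"), (500, "/ST"), (200, "/PP"), (500, "/PP")]
raw += [(200, "/CP")] * 2 + [(500, "/CP")] * 6 + [(503, "/CP")] * 3
rows = [{"response": r, "request": q} for r, q in raw]


def count_by(rows, groupby, as_name):
    """Hypothetical stand-in for a count aggregate: one output row per
    group, holding only the groupby fields plus the new count field."""
    counts = Counter(tuple(row[f] for f in groupby) for row in rows)
    return [dict(zip(groupby, key), **{as_name: n}) for key, n in counts.items()]


first = count_by(rows, ["response", "request"], "response_count")
for row in first:
    print(row)

print(len(first))  # 7 groups, one per (response, request) pair
```

The 15 input rows collapse to 7, one per distinct ("response", "request") pair.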
Your second aggregate takes the output of the first, groups by "request", and adds a "response_c" to each group, giving:
[
{"request": "/ST", "response_c": 2},
{"request": "/PP", "response_c": 2},
{"request": "/CP", "response_c": 3}
]
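Note that any field you do not reference in an aggregate, either as a groupby field or as an aggregated output, is dropped.
Your encoding then references fields that no longer exist in the dataset ("response_count" and "response"), which is why the chart comes out blank.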
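The dropped fields can be checked with a small Python sketch. It starts from the first aggregate's output as listed above, applies a simulated count grouped by "request", and compares the surviving fields with the ones the encoding asks for. The variable names are illustrative only.

```python
from collections import Counter

# Output of the first aggregate, copied from the table above.
first = [
    {"response": 200, "request": "/ST", "response_count": 1},
    {"response": 500, "request": "/ST", "response_count": 1},
    {"response": 200, "request": "/PP", "response_count": 1},
    {"response": 500, "request": "/PP", "response_count": 1},
    {"response": 200, "request": "/CP", "response_count": 2},
    {"response": 500, "request": "/CP", "response_count": 6},
    {"response": 503, "request": "/CP", "response_count": 3},
]

# Simulated second aggregate: count rows per "request" and keep nothing else.
counts = Counter(row["request"] for row in first)
second = [{"request": req, "response_c": n} for req, n in counts.items()]

# Fields used by the x, y and color channels of the chart.
encoded = ["response_count", "request", "response"]
missing = [f for f in encoded if f not in second[0]]

print(second[0])  # {'request': '/ST', 'response_c': 2}
print(missing)    # ['response_count', 'response']
```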
You can fix this by specifying in the second aggregate what to do with the dropped fields; for example, you could keep the sum of "response_count" and the minimum of "response":
{
"data": {
"values": [
{"response": 200, "request": "/ST"},
{"response": 500, "request": "/ST"},
{"response": 200, "request": "/PP"},
{"response": 500, "request": "/PP"},
{"response": 200, "request": "/CP"},
{"response": 200, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 503, "request": "/CP"},
{"response": 503, "request": "/CP"},
{"response": 503, "request": "/CP"}
]
},
"transform": [
{
"aggregate": [{"op": "count", "as": "response_count"}],
"groupby": ["response", "request"]
},
{
"aggregate": [
{"op": "count", "as": "response_c"},
{"op": "sum", "field": "response_count", "as": "response_count"},
{"op": "min", "field": "response", "as": "response"}
],
"groupby": ["request"]
}
],
"mark": "bar",
"encoding": {
"x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
"y": {"field": "request", "type": "nominal"},
"color": {"field": "response", "type": "nominal"}
}
}
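As a rough check of what this spec hands to the chart, the following Python sketch reproduces the second aggregate (count, sum and min per "request") on the first aggregate's output. It only simulates the arithmetic and makes no claim about Vega-Lite internals.

```python
from collections import defaultdict

# Output of the first aggregate (see the table earlier in this answer).
first = [
    {"response": 200, "request": "/ST", "response_count": 1},
    {"response": 500, "request": "/ST", "response_count": 1},
    {"response": 200, "request": "/PP", "response_count": 1},
    {"response": 500, "request": "/PP", "response_count": 1},
    {"response": 200, "request": "/CP", "response_count": 2},
    {"response": 500, "request": "/CP", "response_count": 6},
    {"response": 503, "request": "/CP", "response_count": 3},
]

# Group the rows by "request".
groups = defaultdict(list)
for row in first:
    groups[row["request"]].append(row)

# One output row per request, with the three aggregated fields.
fixed = [
    {
        "request": request,
        "response_c": len(members),                                    # op: count
        "response_count": sum(m["response_count"] for m in members),  # op: sum
        "response": min(m["response"] for m in members),              # op: min
    }
    for request, members in groups.items()
]

for row in fixed:
    print(row)
```

Every field the encoding uses now exists, so the chart renders. Each request is reduced to a single row whose "response" is 200, though, so the bars no longer show the per-status breakdown.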
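In this particular case, it is probably better to leave out the second aggregate entirely; the second aggregation then effectively happens visually, through the stacking of the bars: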
{
"data": {
"values": [
{"response": 200, "request": "/ST"},
{"response": 500, "request": "/ST"},
{"response": 200, "request": "/PP"},
{"response": 500, "request": "/PP"},
{"response": 200, "request": "/CP"},
{"response": 200, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 500, "request": "/CP"},
{"response": 503, "request": "/CP"},
{"response": 503, "request": "/CP"},
{"response": 503, "request": "/CP"}
]
},
"transform": [
{
"aggregate": [{"op": "count", "as": "response_count"}],
"groupby": ["response", "request"]
}
],
"mark": "bar",
"encoding": {
"x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
"y": {"field": "request", "type": "nominal"},
"color": {"field": "response", "type": "nominal"}
}
}
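To see why stacking stands in for the second aggregation, here is a small Python sketch of zero-based stacking over the first aggregate's output. The segment order simply follows the input order, which is an assumption; the renderer may order the segments differently, but the total bar length per request is the same either way.

```python
# Output of the single aggregate used in the spec above.
first = [
    {"response": 200, "request": "/ST", "response_count": 1},
    {"response": 500, "request": "/ST", "response_count": 1},
    {"response": 200, "request": "/PP", "response_count": 1},
    {"response": 500, "request": "/PP", "response_count": 1},
    {"response": 200, "request": "/CP", "response_count": 2},
    {"response": 500, "request": "/CP", "response_count": 6},
    {"response": 503, "request": "/CP", "response_count": 3},
]

offsets = {}   # running bar length per request
segments = []  # one coloured segment per (response, request) row
for row in first:
    start = offsets.get(row["request"], 0)
    end = start + row["response_count"]
    segments.append({**row, "x_start": start, "x_end": end})
    offsets[row["request"]] = end

for seg in segments:
    print(seg)

print(offsets)  # {'/ST': 2, '/PP': 2, '/CP': 11}
```

The final offsets are the per-request totals, which is the number a second sum aggregate grouped by "request" would have produced.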