Vega-Lite: multiple aggregations in transforms

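I want to apply two different aggregations in a transform, because they have different groupby conditions, but this does not seem to be possible with Vega-Lite: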

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"response": 200, "request": "/ST"},
      {"response": 500, "request": "/ST"},
      {"response": 200, "request": "/PP"},
      {"response": 500, "request": "/PP"},
      {"response": 200, "request": "/CP"},
      {"response": 200, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"}
    ]
  },
  "transform": [
    {
      "aggregate": [{"op": "count", "as": "response_count"}],
      "groupby": ["response", "request"]
    },
    {
      "aggregate": [{"op": "count", "as": "response_c"}],
      "groupby": ["request"]
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
    "y": {"field": "request", "type": "nominal"},
    "color": {"field": "response", "type": "nominal"}
  }
}

Is there any way to achieve this? Are multiple aggregations like this supported?

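Yes, multiple aggregations like this are supported, but your chart ends up referencing undefined fields because you did not carry them through your aggregations. You start with the following data: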

[
  {"response": 200, "request": "/ST"},
  {"response": 500, "request": "/ST"},
  {"response": 200, "request": "/PP"},
  {"response": 500, "request": "/PP"},
  {"response": 200, "request": "/CP"},
  {"response": 200, "request": "/CP"},
  {"response": 500, "request": "/CP"},
  {"response": 500, "request": "/CP"},
  {"response": 500, "request": "/CP"},
  {"response": 500, "request": "/CP"},
  {"response": 500, "request": "/CP"},
  {"response": 500, "request": "/CP"},
  {"response": 503, "request": "/CP"},
  {"response": 503, "request": "/CP"},
  {"response": 503, "request": "/CP"}
]

The first aggregation groups by "response" and "request" and adds a "response_count" to each group, which results in:

[
  {"response": 200, "request": "/ST", "response_count": 1},
  {"response": 500, "request": "/ST", "response_count": 1},
  {"response": 200, "request": "/PP", "response_count": 1},
  {"response": 500, "request": "/PP", "response_count": 1},
  {"response": 200, "request": "/CP", "response_count": 2},
  {"response": 500, "request": "/CP", "response_count": 6},
  {"response": 503, "request": "/CP", "response_count": 3}
]

Your second aggregation takes this result, groups by "request", and adds a "response_c" to each group, which results in:

[
  {"request": "/ST", "response_c": 2},
  {"request": "/PP", "response_c": 2},
  {"request": "/CP", "response_c": 3}
]

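Note that any field you do not reference in an aggregation, either as a groupby field or as an aggregated output, is dropped.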

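Your encoding then references fields ("response_count" and "response") that no longer exist in the dataset, which is why the chart is blank.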
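To make the dropped fields concrete, here is a minimal sketch that keeps both of your aggregations unchanged and encodes only the two fields that survive them, "request" and "response_c". The encoding is an illustrative assumption, not something taken from your original spec:

```json
{
  "data": {
    "values": [
      {"response": 200, "request": "/ST"},
      {"response": 500, "request": "/ST"},
      {"response": 200, "request": "/PP"},
      {"response": 500, "request": "/PP"},
      {"response": 200, "request": "/CP"},
      {"response": 200, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"}
    ]
  },
  "transform": [
    {
      "aggregate": [{"op": "count", "as": "response_count"}],
      "groupby": ["response", "request"]
    },
    {
      "aggregate": [{"op": "count", "as": "response_c"}],
      "groupby": ["request"]
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "response_c", "type": "quantitative"},
    "y": {"field": "request", "type": "nominal"}
  }
}
```

This should draw one bar per request, but it has no per-response breakdown, because "response" and "response_count" are gone by the time the encoding runs. That breakdown is what the blank chart is missing.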

You can fix this by specifying in the second aggregation what to do with the dropped fields; for example, you could keep the sum of "response_count" and the minimum of "response":

{
  "data": {
    "values": [
      {"response": 200, "request": "/ST"},
      {"response": 500, "request": "/ST"},
      {"response": 200, "request": "/PP"},
      {"response": 500, "request": "/PP"},
      {"response": 200, "request": "/CP"},
      {"response": 200, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"}
    ]
  },
  "transform": [
    {
      "aggregate": [{"op": "count", "as": "response_count"}],
      "groupby": ["response", "request"]
    },
    {
      "aggregate": [
        {"op": "count", "as": "response_c"},
        {"op": "sum", "field": "response_count", "as": "response_count"},
        {"op": "min", "field": "response", "as": "response"}
      ],
      "groupby": ["request"]
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
    "y": {"field": "request", "type": "nominal"},
    "color": {"field": "response", "type": "nominal"}
  }
}
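
Another option, which goes beyond the fix above, is Vega-Lite's joinaggregate transform. It computes the aggregate per group and joins the result back onto every row instead of collapsing the rows, so no fields are dropped. A minimal sketch, assuming the intent is to keep the per-response rows and attach the per-request count as "response_c":

```json
{
  "data": {
    "values": [
      {"response": 200, "request": "/ST"},
      {"response": 500, "request": "/ST"},
      {"response": 200, "request": "/PP"},
      {"response": 500, "request": "/PP"},
      {"response": 200, "request": "/CP"},
      {"response": 200, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"}
    ]
  },
  "transform": [
    {
      "aggregate": [{"op": "count", "as": "response_count"}],
      "groupby": ["response", "request"]
    },
    {
      "joinaggregate": [{"op": "count", "as": "response_c"}],
      "groupby": ["request"]
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
    "y": {"field": "request", "type": "nominal"},
    "color": {"field": "response", "type": "nominal"}
  }
}
```

With this variant, each of the seven rows produced by the first aggregation keeps "response", "request", and "response_count", and gains a "response_c" value (2 for "/ST" and "/PP", 3 for "/CP") that is available to the encoding if you need it.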

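In this particular case, the better approach is probably to omit the second aggregation entirely; the second aggregation then essentially happens visually, through the stacking of the bars: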

{
  "data": {
    "values": [
      {"response": 200, "request": "/ST"},
      {"response": 500, "request": "/ST"},
      {"response": 200, "request": "/PP"},
      {"response": 500, "request": "/PP"},
      {"response": 200, "request": "/CP"},
      {"response": 200, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"}
    ]
  },
  "transform": [
    {
      "aggregate": [{"op": "count", "as": "response_count"}],
      "groupby": ["response", "request"]
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "response_count", "type": "quantitative", "stack": "zero"},
    "y": {"field": "request", "type": "nominal"},
    "color": {"field": "response", "type": "nominal"}
  }
}
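
The same stacked chart can also be sketched without a "transform" block at all, by putting the count directly in the encoding. This is a shorthand and an assumption about what you want to display, not part of your original spec: Vega-Lite groups by the other encoded fields ("request" and "response" here) when an encoding channel specifies an aggregate.

```json
{
  "data": {
    "values": [
      {"response": 200, "request": "/ST"},
      {"response": 500, "request": "/ST"},
      {"response": 200, "request": "/PP"},
      {"response": 500, "request": "/PP"},
      {"response": 200, "request": "/CP"},
      {"response": 200, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 500, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"},
      {"response": 503, "request": "/CP"}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"aggregate": "count", "type": "quantitative", "stack": "zero"},
    "y": {"field": "request", "type": "nominal"},
    "color": {"field": "response", "type": "nominal"}
  }
}
```

Since there is no named "response_count" field in this version, the x-axis title will differ from the transform-based spec; add a "title" to the x channel if you want to control it.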