Filtering an aggregated chart with another aggregation field

Question

I am trying to produce something similar to the Top-K example.

Except that, instead of filtering on and displaying the same aggregated field, I want to:

display one aggregate (the maximum of the daily temperature)
and filter on another aggregate (the mean of the daily temperature)

I created an Observable notebook here to build my test case, and this is how far I have got.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {"url": "data/seattle-weather.csv"},
  "transform": [
    {"timeUnit": "month", "field": "date", "as": "month_date"},
    {
      "joinaggregate": [
        {"op": "mean", "field": "precipitation", "as": "mean_precipitation"},
        {"op": "max", "field": "precipitation", "as": "max_precipitation"}
      ],
      "groupby": ["month_date"]
    },
    {
      "aggregate": [
        {"as": "aggregation", "field": "precipitation", "op": "mean"}
      ],
      "groupby": ["month_date"]
    },
    {"window": [{"op": "row_number", "as": "rank"}]},
    {"calculate": "datum.rank <= 100? datum.month_date : null", "as": "dates"},
    {"filter": "datum.dates != null"}
  ],
  "encoding": {
    "x": {"field": "dates", "type": "ordinal", "timeUnit": "month"}
  },
  "layer": [
    {
      "mark": {"type": "bar"},
      "encoding": {
        "y": {
          "aggregate": "max",
          "field": "precipitation",
          "type": "quantitative"
        }
      }
    },
    {
      "mark": "tick",
      "encoding": {
        "y": {
          "aggregate": "mean",
          "field": "precipitation",
          "type": "quantitative"
        },
        "color": {"value": "red"},
        "size": {"value": 15}
      }
    }
  ]
}

I feel I am missing something like GroupBy.ngroup from pandas.DataFrame.

Answer 1

You can follow Vega-Lite's Filtering Top-K Items example and add an extra aggregate transform. Here is an example adapting your spec from above (vega editor):

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "title": "Top Months by Mean Precipitation",
  "data": {"url": "data/seattle-weather.csv"},
  "transform": [
    {"timeUnit": "month", "field": "date", "as": "month_date"},
    {
      "aggregate": [
        {"op": "mean", "field": "precipitation", "as": "mean_precipitation"},
        {"op": "max", "field": "precipitation", "as": "max_precipitation"}
      ],
      "groupby": ["month_date"]
    },
    {
      "window": [{"op": "row_number", "as": "rank"}],
      "sort": [{"field": "mean_precipitation", "order": "descending"}]
    },
    {"filter": "datum.rank < 10"}
  ],
  "encoding": {
    "x": {
      "field": "month_date",
      "type": "ordinal",
      "timeUnit": "month",
      "title": "month (descending by max precip)",
      "sort": {
        "field": "max_precipitation",
        "op": "average",
        "order": "descending"
      }
    }
  },
  "layer": [
    {
      "mark": {"type": "bar"},
      "encoding": {
        "y": {
          "field": "mean_precipitation",
          "type": "quantitative",
          "title": "precipitation (mean & max)"
        }
      }
    },
    {
      "mark": "tick",
      "encoding": {
        "y": {"field": "max_precipitation", "type": "quantitative"},
        "color": {"value": "red"},
        "size": {"value": 15}
      }
    }
  ]
}
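
To make the transform chain easier to follow, here is a rough plain-Python sketch of the same three steps: aggregate per month, rank the months by their mean, then filter on that rank while keeping the max for display. The sample rows and the function name `top_k_by_mean` are made up for illustration; this is not the Seattle dataset and not how Vega-Lite is implemented internally.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (month, daily precipitation). Not the Seattle data.
rows = [
    ("Jan", 4.0), ("Jan", 10.0),
    ("Feb", 1.0), ("Feb", 3.0),
    ("Mar", 6.0), ("Mar", 6.5),
]

def top_k_by_mean(rows, k):
    """Aggregate per month, rank by mean (descending), keep the top k."""
    # Step 1: mirrors the "aggregate" transform grouped by month.
    groups = defaultdict(list)
    for month, value in rows:
        groups[month].append(value)
    aggregated = [
        {"month": m, "mean_precipitation": mean(v), "max_precipitation": max(v)}
        for m, v in groups.items()
    ]
    # Step 2: mirrors the "window" transform, row_number sorted by mean descending.
    aggregated.sort(key=lambda r: r["mean_precipitation"], reverse=True)
    for rank, record in enumerate(aggregated, start=1):
        record["rank"] = rank
    # Step 3: mirrors the "filter" transform, keeping ranks 1..k.
    return [r for r in aggregated if r["rank"] <= k]

top2 = top_k_by_mean(rows, k=2)
for r in top2:
    # The filter used the mean, but the max is still available to plot.
    print(r["month"], r["mean_precipitation"], r["max_precipitation"])
```

Note that the spec above filters with `datum.rank < 10`, which keeps ranks 1 through 9; use `datum.rank <= 10` if you want ten months.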

filtering

group-by

vega-lite

altair