获取 vega 数据集中的最后一个数据
Getting the last datum in a vega dataset
我有一个数据源 A,我想创建一个仅包含 A 的最后一个元素的新数据源 B。在 Vega 中执行此操作的最佳方法是什么?
我能够使用以下方法在 Vega Editor 中运行它:
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"data": [
{
"name": "source",
"url": "https://raw.githubusercontent.com/vega/vega/master/docs/data/cars.json",
"transform": [
{
"type": "filter",
"expr": "datum['Horsepower'] != null && datum['Miles_per_Gallon'] != null && datum['Acceleration'] != null"
}
]
},
{
"name": "avg",
"source":"source",
"transform":[
{
"type":"aggregate",
"groupby":["Horsepower"],
"ops": ["average"],
"fields":["Miles_per_Gallon"],
"as":["Avg_Miles_per_Gallon"]
}
]
},
{
"name":"last",
"source": "avg",
"transform": [
{
"type": "aggregate",
"ops": ["max"],
"fields": ["Horsepower"],
"as": ["maxHorsepower"]
},
{
"type": "lookup",
"from": "avg",
"key": "Horsepower",
"fields": ["maxHorsepower"],
"values": ["Horsepower","Avg_Miles_per_Gallon"]
}
]
}
]
}
maxHorsepower
Horsepower
Avg_Miles_per_Gallon
230
230
16
我很想知道是否有更好的方法,但这对我有用。
这样做相对简单。尽管我对您在聚合中使用“max”感到有点困惑,因为这不是最后一个值?
这两种方法都是我使用这一系列转换获取数据集中最后一个值的解决方案,
transform: [
{
type: window
ops: [
row_number
]
}
{
type: joinaggregate
fields: [
row_number
]
ops: [
max
]
as: [
max_row_number
]
}
{
type: filter
expr: datum.row_number==datum.max_row_number
}
]
我有一个数据源 A,我想创建一个仅包含 A 的最后一个元素的新数据源 B。在 Vega 中执行此操作的最佳方法是什么?
我能够使用以下方法在 Vega Editor 中运行它:
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"data": [
{
"name": "source",
"url": "https://raw.githubusercontent.com/vega/vega/master/docs/data/cars.json",
"transform": [
{
"type": "filter",
"expr": "datum['Horsepower'] != null && datum['Miles_per_Gallon'] != null && datum['Acceleration'] != null"
}
]
},
{
"name": "avg",
"source":"source",
"transform":[
{
"type":"aggregate",
"groupby":["Horsepower"],
"ops": ["average"],
"fields":["Miles_per_Gallon"],
"as":["Avg_Miles_per_Gallon"]
}
]
},
{
"name":"last",
"source": "avg",
"transform": [
{
"type": "aggregate",
"ops": ["max"],
"fields": ["Horsepower"],
"as": ["maxHorsepower"]
},
{
"type": "lookup",
"from": "avg",
"key": "Horsepower",
"fields": ["maxHorsepower"],
"values": ["Horsepower","Avg_Miles_per_Gallon"]
}
]
}
]
}
maxHorsepower | Horsepower | Avg_Miles_per_Gallon |
---|---|---|
230 | 230 | 16 |
我很想知道是否有更好的方法,但这对我有用。
这样做相对简单。尽管我对您在聚合中使用“max”感到有点困惑,因为这不是最后一个值?
这两种方法都是我使用这一系列转换获取数据集中最后一个值的解决方案,
transform: [
{
type: window
ops: [
row_number
]
}
{
type: joinaggregate
fields: [
row_number
]
ops: [
max
]
as: [
max_row_number
]
}
{
type: filter
expr: datum.row_number==datum.max_row_number
}
]