xarray.Dataset: conditionally indexing variables
I am starting with an HRRR file downloaded from NCEP and reading it into an xarray.Dataset like this:
ds: xr.Dataset = xr.open_dataset(file, engine="pynio")
<xarray.Dataset>
Dimensions:           (ygrid_0: 1059, xgrid_0: 1799, lv_HYBL0: 50,
                       lv_HTGL1: 2, lv_HTGL2: 2, lv_TMPL3: 2,
                       lv_SPDL4: 3, lv_HTGL5: 2, lv_HTGL6: 2,
                       lv_DBLL7: 2, lv_HTGL8: 2, lv_HTGL9: 3)
Coordinates:
  * lv_HTGL6          (lv_HTGL6) float32 1e+03 4e+03
  * lv_TMPL3          (lv_TMPL3) float32 253.0 263.0
  * lv_HTGL1          (lv_HTGL1) float32 10.0 80.0
  * lv_HYBL0          (lv_HYBL0) float32 1.0 2.0 3.0 ... 49.0 50.0
    gridlat_0         (ygrid_0, xgrid_0) float32 ...
    gridlon_0         (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: ygrid_0, xgrid_0, lv_HTGL2, lv_SPDL4, lv_HTGL5,
                                lv_DBLL7, lv_HTGL8, lv_HTGL9
Data variables: (12/149)
    TMP_P0_L1_GLC0    (ygrid_0, xgrid_0) float32 ...
    TMP_P0_L103_GLC0  (ygrid_0, xgrid_0) float32 ...
    TMP_P0_L105_GLC0  (lv_HYBL0, ygrid_0, xgrid_0) float32 ...
    POT_P0_L103_GLC0  (ygrid_0, xgrid_0) float32 ...
    DPT_P0_L103_GLC0  (ygrid_0, xgrid_0) float32 ...
    LHTFL_P0_L1_GLC0  (ygrid_0, xgrid_0) float32 ...
    ...                ...
    lv_HTGL5_l0       (lv_HTGL5) float32 ...
    lv_SPDL4_l1       (lv_SPDL4) float32 ...
    lv_SPDL4_l0       (lv_SPDL4) float32 ...
    lv_HTGL2_l1       (lv_HTGL2) float32 ...
    lv_HTGL2_l0       (lv_HTGL2) float32 ...
    gridrot_0         (ygrid_0, xgrid_0) float32 ...
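For now I only care about the variables that share these 3 coordinates:
[lv_HYBL0, gridlat_0, gridlon_0]
I can manually select/index the variables that have the coordinates I want, for example:
ds[["TMP_P0_L105_GLC0",...]]
But I would prefer a more abstract approach. In pandas I would do some kind of boolean indexing along the lines of ... ds[ds.variables[ds.coords.isin(["gridlat_0","gridlon_0","lv_HYBL0"])]]
Unfortunately, this does not work.
How can I select variables based on a condition on the coordinates each variable is tied to?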
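You can still do something similar. You can index the dataset with a list of keys, and decide which keys to keep by testing the elements of each array's dims attribute (a tuple). Note that dims holds dimension names, and gridlat_0 and gridlon_0 are 2-D coordinates defined on (ygrid_0, xgrid_0), so the comparison has to use ygrid_0 and xgrid_0.
In this case:
# dimension names behind the lv_HYBL0, gridlat_0 and gridlon_0 coordinates
required_dims = ['lv_HYBL0', 'ygrid_0', 'xgrid_0']
# sorted tuple, so the comparison ignores dimension order
required_dims = tuple(sorted(required_dims))
subset = ds[[
    k for k, v in ds.data_vars.items()
    if tuple(sorted(v.dims)) == required_dims
]]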
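Since the question is phrased in terms of coordinates rather than dimensions, one possible variant tests each variable's coords directly. The sketch below is only an illustration on a small synthetic dataset: the shapes and zero-filled values are made up, and only the names come from the question. It relies on the fact that a data variable taken from a Dataset carries every coordinate whose dimensions it spans.

```python
import numpy as np
import xarray as xr

# Small synthetic stand-in for the HRRR dataset (shapes are made up).
ny, nx, nz = 3, 4, 2
ds = xr.Dataset(
    data_vars={
        "TMP_P0_L1_GLC0": (("ygrid_0", "xgrid_0"), np.zeros((ny, nx))),
        "TMP_P0_L105_GLC0": (
            ("lv_HYBL0", "ygrid_0", "xgrid_0"),
            np.zeros((nz, ny, nx)),
        ),
        "lv_HTGL2_l0": (("lv_HTGL2",), np.zeros(2)),
    },
    coords={
        "lv_HYBL0": ("lv_HYBL0", np.arange(1.0, nz + 1)),
        "gridlat_0": (("ygrid_0", "xgrid_0"), np.zeros((ny, nx))),
        "gridlon_0": (("ygrid_0", "xgrid_0"), np.zeros((ny, nx))),
    },
)

required_coords = {"lv_HYBL0", "gridlat_0", "gridlon_0"}

# Keep the data variables that carry every required coordinate.
names = [
    name
    for name, var in ds.data_vars.items()
    if required_coords <= set(var.coords)
]
subset = ds[names]
print(list(subset.data_vars))
```

The subset test (`<=`) also keeps variables that carry additional coordinates; switching it to `==` would demand an exact match.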
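I found that the drop_dims method works well. It drops the listed dimensions together with every variable that uses them, so the helper below returns all dimensions except the ones to keep:
from typing import Mapping
import numpy as np

def dont_drop(dims: Mapping, *args: str):
    a = np.array(tuple(dims))
    # np.any, not np.all: a name only has to match one of the names to keep
    mask = np.any(a == np.array(args)[:, np.newaxis], axis=0)
    return a[~mask]

ds.drop_dims(dont_drop(ds.dims, "lv_HYBL0", "ygrid_0", "xgrid_0"))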
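As a rough check of what drop_dims leaves behind, here is a minimal sketch on a synthetic dataset (the shapes and values are made up; only the variable and dimension names come from the question). It uses a plain list comprehension in place of the NumPy mask, which is a swapped-in simplification rather than the helper above.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the HRRR dataset (shapes are made up).
ds = xr.Dataset(
    {
        "TMP_P0_L1_GLC0": (("ygrid_0", "xgrid_0"), np.zeros((3, 4))),
        "TMP_P0_L105_GLC0": (
            ("lv_HYBL0", "ygrid_0", "xgrid_0"),
            np.zeros((2, 3, 4)),
        ),
        "lv_HTGL2_l0": (("lv_HTGL2",), np.zeros(2)),
    }
)

keep = {"lv_HYBL0", "ygrid_0", "xgrid_0"}

# Every dimension that is not explicitly kept gets dropped,
# along with the variables defined on it.
to_drop = [dim for dim in ds.sizes if dim not in keep]
reduced = ds.drop_dims(to_drop)

# drop_dims keeps any variable whose dims are a subset of `keep`,
# so 2-D fields survive; filter again for an exact match if needed.
exact = reduced[
    [name for name, var in reduced.data_vars.items() if set(var.dims) == keep]
]
print(list(reduced.data_vars), list(exact.data_vars))
```

In this sketch the 2-D variable TMP_P0_L1_GLC0 is still present after drop_dims, which is why the second filter is needed when only the 3-D variables are wanted.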