How can I find the values of all variables corresponding to the maximum in one variable?

Question

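I have an xarray Dataset of daily data with many variables. For each year, I want to extract the maximum of q_routed and the corresponding values of the other variables on the day that maximum occurs.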

    <xarray.Dataset>
    Dimensions:    (latitude: 1, longitude: 1, param_set: 1, time: 17167)
    Coordinates:
      * time       (time) datetime64[ns] 1970-01-01 ...
      * latitude   (latitude) float32 44.5118
      * longitude  (longitude) float32 -111.435
      * param_set  (param_set) |S1 b''
    Data variables:
        ppt        (time, param_set, latitude, longitude) float64 ...
        pet        (time, param_set, latitude, longitude) float64 ...
        obsq       (time, param_set, latitude, longitude) float64 ...
        q_routed   (time, param_set, latitude, longitude) float64 ...

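The command below gives the maximum of q_routed in each year, but not the values of the other variables on that day, so it is not what I want.

    ncdat['q_routed'].groupby('time.year').max()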

Attempt

I tried

    ncdat.groupby('time.year').argmax('time')

which leads to this error:

    ValueError: All-NaN slice encountered

How can I do this?

Answer 1

For this sort of operation, you will probably want to use a custom reduce function:

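    def my_func(ds, dim=None):
        # Find the position along `dim` where q_routed is largest and take
        # every variable in the group at that same position.
        return ds.isel(**{dim: ds['q_routed'].argmax(dim)})

    # Run the reduction once per yearly group. Newer xarray versions
    # encourage GroupBy.map, which is called the same way as GroupBy.apply.
    new = ncdat.groupby('time.year').apply(my_func, dim='time')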
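To try the reduction above without the original file, here is a minimal sketch on synthetic data. The names `demo`, `pick_at_q_max`, and `yearly`, the random values, and the `param_set` label "a" are all made up for illustration; only the dimension layout follows the question. It uses `GroupBy.map`, and assumes a reasonably recent xarray in which `argmax` accepts a dimension name.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical two-year daily dataset laid out like the one in the question.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
dims = ("time", "param_set", "latitude", "longitude")
shape = (time.size, 1, 1, 1)
rng = np.random.default_rng(0)
demo = xr.Dataset(
    {
        "ppt": (dims, rng.random(shape)),
        "q_routed": (dims, rng.random(shape)),
    },
    coords={
        "time": time,
        "param_set": ["a"],
        "latitude": [44.5118],
        "longitude": [-111.435],
    },
)


def pick_at_q_max(ds, dim="time"):
    # Index every variable at the position of the q_routed maximum along `dim`.
    return ds.isel({dim: ds["q_routed"].argmax(dim)})


# One row per year; `time` survives as a coordinate holding the date of each maximum.
yearly = demo.groupby("time.year").map(pick_at_q_max)
print(yearly)
```

Here `yearly['q_routed']` should match the plain yearly maximum of q_routed, while `yearly['ppt']` holds the ppt value from the same day rather than the yearly ppt maximum.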

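Note that argmax does not work well on all-NaN arrays, so you may want to apply this function only at locations that contain data, or fill the existing NaNs beforehand. Something like this could work: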

    # determine where you have valid data (checked on the first time step)
    mask = ncdat['q_routed'].isel(time=0).notnull()

    # fill nans with a missing flag of some kind
    ncdat2 = ncdat.fillna(-9999)

    # do the groupby operation/reduction and reapply the mask
    new = ncdat2.groupby('time.year').apply(my_func, dim='time').where(mask)
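The fill-then-mask idea can be sketched on a small synthetic grid in which one cell is entirely NaN. This is a variation under stated assumptions rather than the exact code above: the names `grid`, `pick_at_q_max`, `has_data`, and `masked` are hypothetical, and the mask is computed per year from all time steps instead of from the first time step only, so that a year with no valid q_routed at a location is masked as well.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical grid with two latitudes; the second cell has no q_routed data.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
dims = ("time", "latitude")
rng = np.random.default_rng(1)
q = rng.random((time.size, 2))
q[:, 1] = np.nan
grid = xr.Dataset(
    {"q_routed": (dims, q), "ppt": (dims, rng.random((time.size, 2)))},
    coords={"time": time, "latitude": [44.5, 45.5]},
)


def pick_at_q_max(ds, dim="time"):
    # Index every variable at the position of the q_routed maximum along `dim`.
    return ds.isel({dim: ds["q_routed"].argmax(dim)})


# True where a (year, latitude) pair has at least one valid q_routed value.
has_data = grid["q_routed"].notnull().groupby("time.year").any("time")

# Fill NaNs with a sentinel so argmax never sees an all-NaN slice,
# then blank out the pairs that had no real data.
filled = grid.fillna(-9999)
masked = filled.groupby("time.year").map(pick_at_q_max).where(has_data)
print(masked)
```

Because `fillna` is applied to the whole dataset, a NaN in another variable on the selected day would show up as -9999 in the result; filling only q_routed is one way around that if it matters for your data.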

Tags: python, pandas, python-xarray, xarray