xarray: select only the timesteps that fall on the first of the month

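I want to select only the timesteps that fall on the first day of a month. The reason is that the data are all NaN for dates that are not the first of the month.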

Creating a dummy dataset:

import numpy as np
import pandas as pd
import xarray as xr

times = [
    pd.to_datetime('2017-01-01'),
    pd.to_datetime('2017-01-31'),
    pd.to_datetime('2017-02-01'),
    pd.to_datetime('2017-02-02'),
    pd.to_datetime('2017-03-01'),
    pd.to_datetime('2017-03-29'),
    pd.to_datetime('2017-03-30'),
    pd.to_datetime('2017-04-01'),
]
data = np.ones((8, 3, 3))
# set every timestep that is not the first of a month to NaN
data[[1, 3, 5, 6], :, :] = np.nan

lat = [0, 1, 2]
lon = [0, 1, 2]


ds = xr.Dataset(
    {'data': (['time', 'lat', 'lon'], data)},
    coords={
        'lon': lon,
        'lat': lat,
        'time': times,
    }
)

ds

Out[]:
<xarray.Dataset>
Dimensions:  (lat: 3, lon: 3, time: 8)
Coordinates:
  * lon      (lon) int64 0 1 2
  * lat      (lat) int64 0 1 2
  * time     (time) datetime64[ns] 2017-01-01 2017-01-31 ... 2017-04-01
Data variables:
    data     (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0

Ideally, I want an output that selects only the times at indices [0, 2, 4, 7]:

<xarray.Dataset>
Dimensions:  (lat: 3, lon: 3, time: 4)
Coordinates:
  * lon      (lon) int64 0 1 2
  * lat      (lat) int64 0 1 2
  * time     (time) datetime64[ns] 2017-01-01 2017-02-01 2017-03-01 2017-04-01
Data variables:
    data     (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0

The way I have it working so far is to use xarray's .where() combined with the square-bracket time subsetting ('time.day'):

ds.where(ds['time.day'] == 1, drop=True)


<xarray.Dataset>
Dimensions:  (lat: 3, lon: 3, time: 4)
Coordinates:
  * lon      (lon) int64 0 1 2
  * lat      (lat) int64 0 1 2
  * time     (time) datetime64[ns] 2017-01-01 2017-02-01 2017-03-01 2017-04-01
Data variables:
    data     (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0

Using where combined with drop=True is a fine way to do it, but probably the most direct way is to use sel with a boolean DataArray:

ds.sel(time=ds.time.dt.day == 1)

<xarray.Dataset>
Dimensions:  (lat: 3, lon: 3, time: 4)
Coordinates:
  * lon      (lon) int64 0 1 2
  * lat      (lat) int64 0 1 2
  * time     (time) datetime64[ns] 2017-01-01 2017-02-01 2017-03-01 2017-04-01
Data variables:
    data     (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
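
As a rough comparison of the two approaches, here is a minimal sketch. It assumes a hypothetical integer variable named count (not part of the dataset above) so that any dtype change becomes visible, and it uses isel with the boolean mask, which is the positional counterpart of the sel call shown above. As far as I understand, where() is a masking operation and may promote integer data to float, while indexing leaves the values untouched.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.to_datetime([
    "2017-01-01", "2017-01-31", "2017-02-01", "2017-02-02",
    "2017-03-01", "2017-03-29", "2017-03-30", "2017-04-01",
])

# Hypothetical integer variable "count", used only to make dtype effects visible.
ds = xr.Dataset(
    {"count": (["time", "lat", "lon"], np.ones((8, 3, 3), dtype="int64"))},
    coords={"time": times, "lat": [0, 1, 2], "lon": [0, 1, 2]},
)

# Boolean mask along the time dimension: True where the day of month is 1.
first_of_month = ds.time.dt.day == 1

# Indexing with the mask picks timesteps without modifying the values.
by_isel = ds.isel(time=first_of_month)

# Masking approach from the question: same timesteps are kept after drop=True.
by_where = ds.where(first_of_month, drop=True)

print(by_isel.time.values)
print(by_isel["count"].dtype, by_where["count"].dtype)
```

Both results keep the same four timesteps; comparing the printed dtypes shows whether the masking route changed the variable's type in your xarray version.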