xarray: select only the timesteps which are the first of the month
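I want to select only the timesteps whose date is the first of the month. The reason is that the data are all nan for dates that are not the first of the month.
Making a dummy dataset: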
import numpy as np
import pandas as pd
import xarray as xr

# Eight timesteps; only four of them fall on the first of a month.
times = [
    pd.to_datetime('2017-01-01'),
    pd.to_datetime('2017-01-31'),
    pd.to_datetime('2017-02-01'),
    pd.to_datetime('2017-02-02'),
    pd.to_datetime('2017-03-01'),
    pd.to_datetime('2017-03-29'),
    pd.to_datetime('2017-03-30'),
    pd.to_datetime('2017-04-01'),
]
data = np.ones((8, 3, 3))
# Timesteps that are not the first of the month hold only nan.
data[[1, 3, 5, 6], :, :] = np.nan
lat = [0, 1, 2]
lon = [0, 1, 2]
ds = xr.Dataset(
    {'data': (['time', 'lat', 'lon'], data)},
    coords={
        'lon': lon,
        'lat': lat,
        'time': times,
    }
)
ds
Out[]:
<xarray.Dataset>
Dimensions: (lat: 3, lon: 3, time: 8)
Coordinates:
* lon (lon) int64 0 1 2
* lat (lat) int64 0 1 2
* time (time) datetime64[ns] 2017-01-01 2017-01-31 ... 2017-04-01
Data variables:
data (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
Ideally, I would like an output that selects only the times at indices [0, 2, 4, 7].
<xarray.Dataset>
Dimensions: (lat: 3, lon: 3, time: 4)
Coordinates:
* lon (lon) int64 0 1 2
* lat (lat) int64 0 1 2
* time (time) datetime64[ns] 2017-01-01 2017-02-01 2017-03-01 2017-04-01
Data variables:
data (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
What I have working so far is xarray's .where() combined with the square-bracket time subsetting:
ds.where(ds['time.day'] == 1, drop=True)
<xarray.Dataset>
Dimensions: (lat: 3, lon: 3, time: 4)
Coordinates:
* lon (lon) int64 0 1 2
* lat (lat) int64 0 1 2
* time (time) datetime64[ns] 2017-01-01 2017-02-01 2017-03-01 2017-04-01
Data variables:
data (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
Using where with drop=True is a workable approach, but the most direct way is probably to use sel with a boolean DataArray:
ds.sel(time=ds.time.dt.day == 1)
<xarray.Dataset>
Dimensions: (lat: 3, lon: 3, time: 4)
Coordinates:
* lon (lon) int64 0 1 2
* lat (lat) int64 0 1 2
* time (time) datetime64[ns] 2017-01-01 2017-02-01 2017-03-01 2017-04-01
Data variables:
data (time, lat, lon) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
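For comparison, here is a minimal self-contained sketch that runs both approaches on the dummy dataset and checks that they pick the same four timesteps. The names is_first, by_where, by_sel and by_isel are only illustrative, and the isel variant with integer positions is an extra alternative that is not part of the answer above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Rebuild the dummy dataset from the question.
times = pd.to_datetime([
    "2017-01-01", "2017-01-31", "2017-02-01", "2017-02-02",
    "2017-03-01", "2017-03-29", "2017-03-30", "2017-04-01",
])
data = np.ones((8, 3, 3))
data[[1, 3, 5, 6], :, :] = np.nan
ds = xr.Dataset(
    {"data": (["time", "lat", "lon"], data)},
    coords={"lon": [0, 1, 2], "lat": [0, 1, 2], "time": times},
)

# Boolean mask along time: True where the day of the month is 1.
is_first = ds.time.dt.day == 1

# Approach from the question: mask, then drop the fully masked timesteps.
by_where = ds.where(is_first, drop=True)

# Approach from the answer: index directly with the boolean DataArray.
by_sel = ds.sel(time=is_first)

# Illustrative alternative: convert the mask to integer positions for isel.
by_isel = ds.isel(time=np.flatnonzero(is_first.values))

print(by_sel.time.values)
```

One difference that may matter in practice: where() masks values with nan before dropping, so integer variables would likely be promoted to float, whereas sel and isel only index and should leave the dtypes as they are.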