How to deal with NaT/1970 dates so that python-xarray ds.time.dt.season works?

I have this python-xarray dataset:

<xarray.Dataset>
Dimensions:       (airport: 8, profnum: 9993, level: 3)
Coordinates:
  * airport       (airport) <U9 'Frankfurt' 'Windhoek' ... 'Madras' 'Hyderabad'
  * profnum       (profnum) int64 0 1 2 3 4 5 ... 9987 9988 9989 9990 9991 9992
  * level         (level) int64 0 1 2
    time          (airport, profnum, level) datetime64[ns] 2008-01-01T10:27:0...
    yearMonthDay  (airport, profnum, level) object '08-01-01' '08-01-01' ... nan
Data variables:
    iasi          (airport, profnum, level) float64 0.5094 1.345 ... nan nan
    IM            (airport, profnum, level) float64 0.515 1.775 ... nan nan
    IMS           (airport, profnum, level) float64 0.5221 1.514 ... nan nan
    err           (airport, profnum, level) float64 0.04518 0.2714 ... nan nan
    std           (airport, profnum, level) float64 0.0324 0.1542 ... nan nan
    dfs           (airport, profnum, level) float64 1.476 nan nan ... nan nan

ds.time shows some 1970-01-01 dates, which I managed to change to np.datetime64("NaT") where needed, but ds.time.dt.season does not like them. So I do this:

ds = ds.where( (ds.time.dt.year >= 2008) & (ds.time.dt.year <= currentYear), drop=True)
ds = ds.where( (ds.time.dt.year >= 2008) & (ds.time.dt.year <= currentYear), other=np.nan )

I expected that after this I would no longer see any 1970 dates in ds.time, but the replacement does not work.

It looks like "other" needs to be a float, because

ds.where( (ds.time.dt.year >= 2008) & (ds.time.dt.year <= currentYear), other=np.datetime64("NaT"))

yields

TypeError: The DTypes <class 'numpy.dtype[datetime64]'> and <class 'numpy.dtype[float64]'> do not have a common DType. For example they cannot be stored in a single array unless the dtype is `object`.

This is strange, because ds.time is datetime64.

Thanks

The statement

ds.where(
    (ds.time.dt.year >= 2008) & (ds.time.dt.year <= currentYear),
    other=np.datetime64("NaT"),
)

can be read as

wherever 2008 ≤ year ≤ currentYear, return ds, otherwise return NaT

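This causes a problem because the operation is applied to every variable in the dataset. Since all of your data variables are of type float64, you get this error. To replace values only in time, restrict the where call to ds.time: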

ds.time.where(
    (ds.time.dt.year >= 2008) & (ds.time.dt.year <= currentYear),
    other=np.datetime64("NaT"),
)
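
Note that the call above returns a new DataArray and does not modify ds, so the result still has to be written back. A minimal sketch of that step, assuming a small toy dataset in place of yours (the values, the current_year variable, and the choice of assign_coords are illustrative assumptions, not taken from your code):

```python
import numpy as np
import xarray as xr

# Toy stand-in for the dataset in the question (hypothetical values).
times = np.array(
    [["2008-01-01T10:27", "1970-01-01T00:00"],
     ["2009-07-15T12:00", "2010-10-03T08:30"]],
    dtype="datetime64[ns]",
)
ds = xr.Dataset(
    data_vars={
        "iasi": (("airport", "profnum"), np.array([[0.5, 1.3], [0.7, 0.9]])),
    },
    coords={
        "airport": ["Frankfurt", "Windhoek"],
        "profnum": [0, 1],
        # time is a non-index coordinate, as in the question
        "time": (("airport", "profnum"), times),
    },
)
current_year = 2024  # assumption: stands in for currentYear

# Build the mask once, from the time coordinate only.
valid = (ds.time.dt.year >= 2008) & (ds.time.dt.year <= current_year)

# Replace out-of-range timestamps with NaT; only the datetime64 array is touched.
cleaned_time = ds.time.where(valid, other=np.datetime64("NaT"))

# Write the cleaned values back as the time coordinate.
ds = ds.assign_coords(time=(ds.time.dims, cleaned_time.data))

# Optionally mask the float data variables at the same positions (NaN fill).
ds = ds.where(ds.time.notnull())
```

As far as I can tell, Dataset.where only masks the data variables and leaves coordinates such as time as they are, which would explain why the 1970 dates were still visible after your two ds.where calls.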