How to correct coordinate metadata attributes like the time calendar in an Xarray dataset?

Unfortunately, the NetCDF file I'm reading into xarray has the calendar attribute on its time coordinate specified as gregorian_proleptic instead of the CF-standard proleptic_gregorian.

How can I fix this?

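I tried simply changing the attribute, but xarray must already have stored the metadata somewhere else that I need to modify, because when I then call decode_cf it still thinks the calendar is gregorian_proleptic. Here is what I tried: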

import xarray as xr
import fsspec

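url = 's3://noaa-ofs-pds/dbofs.20200826/nos.dbofs.fields.f044.20200826.t06z.nc'
ncfile = fsspec.open(url)
# Skip time decoding on open so the invalid calendar does not raise immediately
ds = xr.open_dataset(ncfile.open(), decode_times=False)

# Correct the calendar attribute on the time coordinate
ds.ocean_time.attrs['calendar'] = 'proleptic_gregorian'

# Now try to decode the times
xr.decode_cf(ds, decode_times=True)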

This produces:

---------------------------------------------------------------------------
OutOfBoundsDatetime                       Traceback (most recent call last)
/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in decode_cf_datetime(num_dates, units, calendar, use_cftime)
    157         try:
--> 158             dates = _decode_datetime_with_pandas(flat_num_dates, units, calendar)
    159         except (KeyError, OutOfBoundsDatetime, OverflowError):

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in _decode_datetime_with_pandas(flat_num_dates, units, calendar)
    105             "Cannot decode times from a non-standard calendar, {!r}, using "
--> 106             "pandas.".format(calendar)
    107         )

OutOfBoundsDatetime: Cannot decode times from a non-standard calendar, 'gregorian_proleptic', using pandas.

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in _decode_cf_datetime_dtype(data, units, calendar, use_cftime)
     76     try:
---> 77         result = decode_cf_datetime(example_value, units, calendar, use_cftime)
     78     except Exception:

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in decode_cf_datetime(num_dates, units, calendar, use_cftime)
    160             dates = _decode_datetime_with_cftime(
--> 161                 flat_num_dates.astype(float), units, calendar
    162             )

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in _decode_datetime_with_cftime(num_dates, units, calendar)
     97     return np.asarray(
---> 98         cftime.num2date(num_dates, units, calendar, only_use_cftime_datetimes=True)
     99     )

cftime/_cftime.pyx in cftime._cftime.num2date()

cftime/_cftime.pyx in cftime._cftime.to_calendar_specific_datetime()

KeyError: 'gregorian_proleptic'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-41-174237fc09de> in <module>
----> 1 xr.decode_cf(ds, decode_times=True)

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/conventions.py in decode_cf(obj, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    594         drop_variables=drop_variables,
    595         use_cftime=use_cftime,
--> 596         decode_timedelta=decode_timedelta,
    597     )
    598     ds = Dataset(vars, attrs=attrs)

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/conventions.py in decode_cf_variables(variables, attributes, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    496             stack_char_dim=stack_char_dim,
    497             use_cftime=use_cftime,
--> 498             decode_timedelta=decode_timedelta,
    499         )
    500         if decode_coords:

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/conventions.py in decode_cf_variable(name, var, concat_characters, mask_and_scale, decode_times, decode_endianness, stack_char_dim, use_cftime, decode_timedelta)
    336         var = times.CFTimedeltaCoder().decode(var, name=name)
    337     if decode_times:
--> 338         var = times.CFDatetimeCoder(use_cftime=use_cftime).decode(var, name=name)
    339 
    340     dimensions, data, attributes, encoding = variables.unpack_for_decoding(var)

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in decode(self, variable, name)
    425             units = pop_to(attrs, encoding, "units")
    426             calendar = pop_to(attrs, encoding, "calendar")
--> 427             dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
    428             transform = partial(
    429                 decode_cf_datetime,

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/coding/times.py in _decode_cf_datetime_dtype(data, units, calendar, use_cftime)
     85             "if it is not installed."
     86         )
---> 87         raise ValueError(msg)
     88     else:
     89         dtype = getattr(result, "dtype", np.dtype("object"))

ValueError: unable to decode time units 'days since 2016-01-01 00:00:00' with "calendar 'gregorian_proleptic'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.

In case it's useful, here is the full notebook.

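In the dataset in question there is another, non-coordinate variable (dstart) that also carries the incorrect gregorian_proleptic calendar attribute.

If that one is corrected as well: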

ds.dstart.attrs['calendar']='proleptic_gregorian'
xr.decode_cf(ds, decode_times=True)

the CF decoding works.
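
Since more than one variable can carry the misspelled calendar, it may be easier to correct every occurrence in one pass than to patch attributes one at a time. The following is a minimal sketch of that idea, run on a small synthetic dataset rather than the NOAA file. The helper name `fix_calendar_attrs` is hypothetical and not part of xarray, and the variable names and values only imitate the layout described above.

```python
import numpy as np
import xarray as xr


def fix_calendar_attrs(ds, wrong="gregorian_proleptic", right="proleptic_gregorian"):
    """Replace a misspelled calendar attribute on every variable, in place.

    Hypothetical helper (not an xarray API). Returns the names that were changed.
    """
    changed = []
    for name, var in ds.variables.items():  # includes coordinates and data variables
        if var.attrs.get("calendar") == wrong:
            var.attrs["calendar"] = right
            changed.append(name)
    return changed


# Synthetic stand-in for a dataset opened with decode_times=False:
# raw numbers plus CF time attributes, with the calendar misspelled twice.
bad = "gregorian_proleptic"
raw = xr.Dataset(
    {"dstart": ((), 1700.0, {"units": "days since 2016-01-01 00:00:00", "calendar": bad})},
    coords={
        "ocean_time": (
            "ocean_time",
            [0.0, 3600.0],
            {"units": "seconds since 2016-01-01 00:00:00", "calendar": bad},
        )
    },
)

changed = fix_calendar_attrs(raw)
decoded = xr.decode_cf(raw, decode_times=True)
print(sorted(changed))
print(decoded["ocean_time"].values)
```

With a real file, the same helper would be called on the dataset returned by `xr.open_dataset(..., decode_times=False)` before passing it to `xr.decode_cf`.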