xarray 使用 cfgrib 在 s3 上读取远程 grib 文件

xarray read remote grib file on s3 using cfgrib

crgrib 引擎可以读取远程文件吗?根据 Martin Durant 的评论,它看起来不像 (https://github.com/ecmwf/cfgrib/issues/198#issuecomment-772852412)

s3 上托管了一个较小的 grib 文件:https://mf-nwp-models.s3.amazonaws.com/index.html#arpege-world/v2/2021-02-16/00/UGRD/10m/(注意不要单击文件,因为它会下载)。

当我尝试阅读它时使用 sf3s 我得到

import s3fs
import xarray as xr

fs = s3fs.S3FileSystem(anon=True)

uri = "s3://mf-nwp-models/arpege-world/v2/2021-02-16/00/UGRD/10m/0h.grib2"

file = s3fs.S3Map(uri, s3=fs)
ds = xr.open_dataset(file, engine="cfgrib")

Can't create file '<File-like object S3FileSystem, mf-nwp-models/arpege-world/v2/2021-02-16/00/UGRD/10m/0h.grib2>.90c91.idx'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 342, in from_indexpath_or_filestream
    with compat_create_exclusive(indexpath) as new_index_file:
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 274, in compat_create_exclusive
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
FileNotFoundError: [Errno 2] No such file or directory: '<File-like object S3FileSystem, mf-nwp-models/arpege-world/v2/2021-02-16/00/UGRD/10m/0h.grib2>.90c91.idx'
Can't read index file '<File-like object S3FileSystem, mf-nwp-models/arpege-world/v2/2021-02-16/00/UGRD/10m/0h.grib2>.90c91.idx'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 352, in from_indexpath_or_filestream
    index_mtime = os.path.getmtime(indexpath)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/genericpath.py", line 55, in getmtime
    return os.stat(filename).st_mtime
FileNotFoundError: [Errno 2] No such file or directory: '<File-like object S3FileSystem, mf-nwp-models/arpege-world/v2/2021-02-16/00/UGRD/10m/0h.grib2>.90c91.idx'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/xarray/backends/api.py", line 572, in open_dataset
    store = opener(filename_or_obj, **extra_kwargs, **backend_kwargs)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/xarray/backends/cfgrib_.py", line 45, in __init__
    self.ds = cfgrib.open_file(filename, **backend_kwargs)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 650, in open_file
    index = open_fileindex(path, grib_errors, indexpath, index_keys).subindex(filter_by_keys)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 637, in open_fileindex
    return stream.index(index_keys, indexpath=indexpath)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 269, in index
    return FileIndex.from_indexpath_or_filestream(self, index_keys, indexpath)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 370, in from_indexpath_or_filestream
    return cls.from_filestream(filestream, index_keys)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 297, in from_filestream
    for message in filestream:
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/messages.py", line 240, in __iter__
    with open(self.path, 'rb') as file:
TypeError: expected str, bytes or os.PathLike object, not S3File

我想我是通过 https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally

得到的
import fsspec
import xarray as xr

uri = "simplecache::s3://mf-nwp-models/arpege-world/v2/2021-02-16/00/UGRD/10m/0h.grib2"

file = fsspec.open_local(uri, s3={'anon': True}, filecache={'cache_storage':'/tmp/files'})

ds = xr.open_dataset(file, engine="cfgrib")