Open remote zarr store with many groups and keep coordinates using xarray
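I want to read the remote zarr store at https://hrrrzarr.s3.amazonaws.com/index.html#sfc/20210208/20210208_00z_anl.zarr/. Information about the store is at https://mesowest.utah.edu/html/hrrr/zarr_documentation/zarrFileVariables.html
I am able to read in a variable, but it does not seem to pick up the coordinates or attributes associated with that variable (I am most likely missing kwargs to open_mfdataset or open_zarr). Because there are different levels of nesting, I am not sure what the correct path to pass is.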
import xarray as xr
import s3fs
fs = s3fs.S3FileSystem(anon=True)
uri = "s3://hrrrzarr/sfc/20210208/20210208_00z_anl.zarr/10m_above_ground/UGRD/10m_above_ground"
file = s3fs.S3Map(uri, s3=fs)
ds = xr.open_mfdataset([file], engine="zarr")
>>> ds
<xarray.Dataset>
Dimensions: (projection_x_coordinate: 1799, projection_y_coordinate: 1059)
Dimensions without coordinates: projection_x_coordinate, projection_y_coordinate
Data variables:
UGRD (projection_y_coordinate, projection_x_coordinate) float16 dask.array<chunksize=(150, 150), meta=np.ndarray>
uri = "s3://hrrrzarr/sfc/20210208/20210208_00z_anl.zarr/10m_above_ground/UGRD"
file = s3fs.S3Map(uri, s3=fs)
ds = xr.open_mfdataset([file], engine="zarr")
>>> ds
<xarray.Dataset>
Dimensions: (projection_x_coordinate: 1799, projection_y_coordinate: 1059)
Coordinates:
* projection_x_coordinate (projection_x_coordinate) float64 -2.698e+06 ......
* projection_y_coordinate (projection_y_coordinate) float64 -1.587e+06 ......
Data variables:
forecast_period timedelta64[ns] ...
forecast_reference_time datetime64[ns] ...
height float64 ...
pressure float64 ...
time datetime64[ns] ...
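Xarray does not understand nested zarr groups; it expects all variables and coordinates to live in a single flat group. I think your only option here is to merge the datasets manually. Have you tried the following, with file1 and file2 being the two S3Map objects from your question?
ds = xr.open_mfdataset([file1, file2], engine="zarr")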
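A minimal sketch of the manual merge, assuming the layout shown in the question: the group at `<level>/<variable>` holds the coordinates and the group one level deeper holds the data array. The helper names `group_uris` and `open_variable` are hypothetical and not part of xarray or s3fs. Passing an `S3Map` to `xr.open_dataset` follows the usage in the question and may need adjusting for other zarr versions.

```python
def group_uris(store, level, variable):
    """Return (data_uri, coord_uri) for one variable in a nested store.

    Assumed layout (taken from the two paths in the question):
      <store>/<level>/<variable>          coordinates and scalar metadata
      <store>/<level>/<variable>/<level>  the data array itself
    """
    coord_uri = f"{store.rstrip('/')}/{level}/{variable}"
    data_uri = f"{coord_uri}/{level}"
    return data_uri, coord_uri


def open_variable(store, level, variable):
    """Open both groups anonymously and merge them into one Dataset."""
    # Imported here so group_uris stays usable without these packages.
    import s3fs
    import xarray as xr

    fs = s3fs.S3FileSystem(anon=True)
    data_uri, coord_uri = group_uris(store, level, variable)
    parts = [
        xr.open_dataset(s3fs.S3Map(uri, s3=fs), engine="zarr")
        for uri in (data_uri, coord_uri)
    ]
    # The data group has dimensions without coordinates; merging with the
    # coordinate group attaches the coordinates that share those dimension names.
    return xr.merge(parts)
```

For the store in the question, the call would be `open_variable("s3://hrrrzarr/sfc/20210208/20210208_00z_anl.zarr", "10m_above_ground", "UGRD")`. This is expected to behave much like passing both mappers to `open_mfdataset`, but it makes the two-group structure explicit and has not been checked against every variable in the store.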