open_mfdataset with xarray failing to find coordinates

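I am trying to download a batch of GOES-16 radiance data and open the files together in xarray with xr.open_mfdataset() for analysis. The netCDF files have a coordinate t, a timestamp that I want to concatenate along, but when I try this I get the error ValueError: Could not find any dimension coordinates to use to order the datasets for concatenation. Below are my code and links to download two sample .nc files.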

Download the two files:

wget https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2019/141/02/OR_ABI-L1b-RadF-M6C14_G16_s20191410240370_e20191410250078_c20191410250143.nc
wget https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2019/141/03/OR_ABI-L1b-RadF-M6C14_G16_s20191410310370_e20191410320078_c20191410320142.nc

Code:

import xarray as xr
ds_sst = xr.open_mfdataset("OR_ABI-L1b-RadF*nc", concat_dim='t',combine='by_coords')

What can I do to make this work so that I can open a few dozen of these files at once?

Use combine='nested' instead

From the xarray documentation on combining by coordinates:

Attempt to auto-magically combine the given datasets into one by using dimension coordinates.

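In these files 't' is a scalar coordinate, not a dimension coordinate, so the xarray magic does not work here: xarray's combine_by_coords looks for dimension coordinates whose values differ between the imported netCDF files and uses them to order the datasets. The only dimension coordinates here, x and y, are the same in every file, so nothing is left to order by.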
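A quick way to see the difference is to check whether a name appears among the dataset's dimensions as well as its coordinates. The sketch below uses a tiny in-memory dataset as a hypothetical stand-in for one GOES file (the variable name Rad and the coordinate values are made up for illustration), so it runs without downloading anything.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for a single GOES file: 'x' and 'y' are dimension
# coordinates, while 't' is a scalar coordinate with no dimension of its own.
ds = xr.Dataset(
    {"Rad": (("y", "x"), np.zeros((2, 3), dtype="float32"))},
    coords={
        "y": [0.1, -0.1],
        "x": [-0.1, 0.0, 0.1],
        "t": np.datetime64("2019-05-21T02:45:22", "ns"),
    },
)

is_coordinate = "t" in ds.coords  # 't' is attached as a coordinate
is_dimension = "t" in ds.sizes    # but no dimension is named 't'
print(is_coordinate, is_dimension)  # prints: True False
```

The same two checks on one of the real files, opened with xr.open_dataset, should show whether 't' needs to be promoted before combine='by_coords' can use it.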

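In this case you need to be more explicit: use combine='nested' and give the name of the new dimension with concat_dim='t'. Because a coordinate named 't' already exists, xarray automatically promotes it to a dimension coordinate.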

ds_sst = xr.open_mfdataset("OR_ABI-L1b-RadF*nc", concat_dim='t', combine='nested')
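The promotion can be reproduced in memory with xr.combine_nested, the function that corresponds to combine='nested'. This is only a sketch: make_scan is a hypothetical helper that builds small fake datasets with a scalar 't' coordinate, not the real GOES files.

```python
import numpy as np
import xarray as xr

def make_scan(timestamp):
    """Build a tiny made-up dataset whose 't' is a scalar coordinate."""
    return xr.Dataset(
        {"Rad": (("y", "x"), np.zeros((2, 3), dtype="float32"))},
        coords={
            "y": [0.1, -0.1],
            "x": [-0.1, 0.0, 0.1],
            "t": np.datetime64(timestamp, "ns"),
        },
    )

# The list order determines the order along the new dimension.
scans = [make_scan("2019-05-21T02:45:22"), make_scan("2019-05-21T03:15:22")]

# Concatenate along a new dimension named after the existing scalar coordinate.
combined = xr.combine_nested(scans, concat_dim="t")
print(combined["Rad"].dims)
```

Note that the nested strategy keeps the datasets in the order they are supplied and does not sort by the values of 't', so the inputs need to already be in time order.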

The resulting dataset looks like this:

<xarray.Dataset>
Dimensions:                                           (band: 1, num_star_looks: 24, number_of_image_bounds: 2, number_of_time_bounds: 2, t: 2, x: 5424, y: 5424)
Coordinates:
    band_wavelength_star_look                         (num_star_looks) float32 dask.array<chunksize=(24,), meta=np.ndarray>
    x_image                                           float32 0.0
    y_image                                           float32 0.0
    band_wavelength                                   (band) float32 dask.array<chunksize=(1,), meta=np.ndarray>
    band_id                                           (band) int8 dask.array<chunksize=(1,), meta=np.ndarray>
    t_star_look                                       (num_star_looks) datetime64[ns] dask.array<chunksize=(24,), meta=np.ndarray>
  * y                                                 (y) float32 0.151844 ... -0.151844
  * x                                                 (x) float32 -0.151844 ... 0.151844
  * t                                                 (t) datetime64[ns] 2019-05-21T02:45:22.400760064 2019-05-21T03:15:22.406056960
Dimensions without coordinates: band, num_star_looks, number_of_image_bounds, number_of_time_bounds
Data variables:
    Rad                                               (t, y, x) float32 dask.array<chunksize=(1, 5424, 5424), meta=np.ndarray>
    DQF                                               (t, y, x) float32 dask.array<chunksize=(1, 5424, 5424), meta=np.ndarray>
    time_bounds                                       (t, number_of_time_bounds) datetime64[ns] dask.array<chunksize=(1, 2), meta=np.ndarray>
    goes_imager_projection                            (t) int32 -2147483647 -2147483647
    y_image_bounds                                    (t, number_of_image_bounds) float32 dask.array<chunksize=(1, 2), meta=np.ndarray>
    x_image_bounds                                    (t, number_of_image_bounds) float32 dask.array<chunksize=(1, 2), meta=np.ndarray>
    nominal_satellite_subpoint_lat                    (t) float64 0.0 0.0
    nominal_satellite_subpoint_lon                    (t) float64 -75.2 -75.2
    nominal_satellite_height                          (t) float64 3.579e+04 3.579e+04
    geospatial_lat_lon_extent                         (t) float32 9.96921e+36 9.96921e+36
    yaw_flip_flag                                     (t) float64 0.0 0.0
    esun                                              (t) float64 nan nan
    kappa0                                            (t) float64 nan nan
    planck_fk1                                        (t) float64 8.51e+03 8.51e+03
    planck_fk2                                        (t) float64 1.286e+03 1.286e+03
    planck_bc1                                        (t) float64 0.2252 0.2252
    planck_bc2                                        (t) float64 0.9992 0.9992
    valid_pixel_count                                 (t) float64 2.305e+07 2.305e+07
    missing_pixel_count                               (t) float64 268.0 290.0
    saturated_pixel_count                             (t) float64 0.0 0.0
    undersaturated_pixel_count                        (t) float64 0.0 0.0
    focal_plane_temperature_threshold_exceeded_count  (t) float64 0.0 0.0
    min_radiance_value_of_valid_pixels                (t) float64 8.217 8.472
    max_radiance_value_of_valid_pixels                (t) float64 125.5 123.2
    mean_radiance_value_of_valid_pixels               (t) float64 82.01 81.96
    std_dev_radiance_value_of_valid_pixels            (t) float64 24.64 24.53
    maximum_focal_plane_temperature                   (t) float64 62.12 62.12
    focal_plane_temperature_threshold_increasing      (t) float64 81.0 81.0
    focal_plane_temperature_threshold_decreasing      (t) float64 81.0 81.0
    percent_uncorrectable_L0_errors                   (t) float64 0.0 0.0
    earth_sun_distance_anomaly_in_AU                  (t) float64 1.012 1.012
    algorithm_dynamic_input_data_container            (t) int32 -2147483647 -2147483647
    processing_parm_version_container                 (t) int32 -2147483647 -2147483647
    algorithm_product_version_container               (t) int32 -2147483647 -2147483647
    star_id                                           (t, num_star_looks) float32 dask.array<chunksize=(1, 24), meta=np.ndarray>
Attributes:
    naming_authority:          gov.nesdis.noaa
    Conventions:               CF-1.7
    Metadata_Conventions:      Unidata Dataset Discovery v1.0
    standard_name_vocabulary:  CF Standard Name Table (v35, 20 July 2016)
    institution:               DOC/NOAA/NESDIS > U.S. Department of Commerce,...
    project:                   GOES
    production_site:           WCDAS
    production_environment:    OE
    spatial_resolution:        2km at nadir
    orbital_slot:              GOES-East
    platform_ID:               G16
    instrument_type:           GOES R Series Advanced Baseline Imager
    scene_id:                  Full Disk
    instrument_ID:             FM1
    title:                     ABI L1b Radiances
    summary:                   Single emissive band ABI L1b Radiance Products...
    keywords:                  SPECTRAL/ENGINEERING > INFRARED WAVELENGTHS > ...
    keywords_vocabulary:       NASA Global Change Master Directory (GCMD) Ear...
    iso_series_metadata_id:    a70be540-c38b-11e0-962b-0800200c9a66
    license:                   Unclassified data.  Access is restricted to ap...
    processing_level:          National Aeronautics and Space Administration ...
    cdm_data_type:             Image
    dataset_name:              OR_ABI-L1b-RadF-M6C14_G16_s20191410240370_e201...
    production_data_source:    Realtime
    timeline_id:               ABI Mode 6
    date_created:              2019-05-21T02:50:14.3Z
    time_coverage_start:       2019-05-21T02:40:37.0Z
    time_coverage_end:         2019-05-21T02:50:07.8Z
    id:                        abb3657a-03c0-47a9-a1ba-f3196c07c5a9

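Alternatively, you can define a function that promotes the coordinate 't' to a dimension coordinate and pass it to the preprocess argument of open_mfdataset. This function is applied to each imported netCDF file before it is combined with the others.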

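def preprocessing(ds):
    # Promote the scalar coordinate 't' to a length-1 dimension coordinate.
    return ds.expand_dims(dim='t')

# concat_dim is omitted: combine='by_coords' infers the dimension from 't',
# and recent xarray versions raise an error if concat_dim is passed with it.
ds_sst = xr.open_mfdataset("OR_ABI-L1b-RadF*nc", combine='by_coords', preprocess=preprocessing)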

The result is the same as above.
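The effect of the preprocess step can be checked without the files by applying expand_dims by hand and then calling xr.combine_by_coords. As a sketch under the same assumptions as before, make_scan below is a hypothetical helper producing small fake datasets; the first call is expected to fail with the ValueError from the question, and the second to succeed once 't' has been promoted.

```python
import numpy as np
import xarray as xr

def make_scan(timestamp):
    """Build a tiny made-up dataset whose 't' is a scalar coordinate."""
    return xr.Dataset(
        {"Rad": (("y", "x"), np.zeros((2, 3), dtype="float32"))},
        coords={
            "y": [0.1, -0.1],
            "x": [-0.1, 0.0, 0.1],
            "t": np.datetime64(timestamp, "ns"),
        },
    )

scans = [make_scan("2019-05-21T02:45:22"), make_scan("2019-05-21T03:15:22")]

# Without promotion there is no differing dimension coordinate to order by.
try:
    xr.combine_by_coords(scans)
    raw_failed = False
except ValueError:
    raw_failed = True

# After promotion, 't' is a dimension coordinate that differs between the
# datasets, so combine_by_coords can order and concatenate along it.
promoted = [scan.expand_dims("t") for scan in scans]
combined = xr.combine_by_coords(promoted)
print(raw_failed, combined.sizes["t"])
```

Unlike the nested strategy, by_coords orders the datasets by the values of 't' itself, which may be preferable when the file list is not guaranteed to be in time order.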