Concatenation of several hdf5 having a dimension common to two groups

Question

I want to concatenate several HDF5 files. This is the header shown by Panoply:

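I want to concatenate along the npixels dimension. However, if I run ncrcat on npixels, it reports 'variable unknown'. Indeed, ncdump -c does not show an npixels dimension at all. Instead it shows phony_dim_0 in the Data_Fields group and phony_dim_4 in Geolocation_Fields, each with 655 pixels.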

I made both of these dimensions unlimited (record dimensions):

ncks --mk_rec_dmn phony_dim_0 ${file} ${file}
ncks -O --mk_rec_dmn phony_dim_4 ${file} ${file}

If I then run:

ncrcat Valid_CO_SOFRID-v4.0_200???.he5 Valid_CO_SOFRID-v4.0_200801-200907.he5 -v Latitude,Longitude,Day,Hour,Minute,"CO Total Column"

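(one-dimensional variables only), it seems to work for the Geolocation_Fields variables. For the Data_Fields variables, I get the expected number of elements, but they all have the same value (possibly the mean). The output is the same if I keep only one variable: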

ncrcat -d phony_dim_0,0, Valid_CO_SOFRID-v4.0_200???.he5 Valid_CO_SOFRID-v4.0_200801-200907_dim0.he5 -v "CO Total Column"

I actually also need an additional two-dimensional variable, but that does not work:

ERROR: nco_put_vara() failed to nc_put_vara() variable "CO"
nco_err_exit(): ERROR Short NCO-generated message (usually name of function that triggered error): nco_put_vara()
nco_err_exit(): ERROR Error code is -40. Translation into English with nc_strerror(-40) is "NetCDF: Index exceeds dimension bound"

Thanks

Answer 1

I got it working with h5py \o/

import h5py
import numpy as np
import glob
import collections
import os

def nested_dict():
    return collections.defaultdict(nested_dict)

path = '/home/loip/Documents/tech/sofrid/data/windhoekSameDayOnly/'
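# glob.glob returns files in arbitrary order; sort so the months are appended chronologically
listOfAllFiles = sorted(glob.glob( path + 'Valid_CO_SOFRID-v2.2_200???.he5' ))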

os.chdir( path )

dico = nested_dict()
with h5py.File('Valid_CO_SOFRID-v2.2_200801-200907.he5', 'w') as h5w:
    for iFile, eachFile in enumerate(listOfAllFiles):   
        with h5py.File(eachFile, 'r') as h5r:
            for group in ['HDFEOS']:
                for subgroup1 in ['SWATHS']:
                    for subgroup2 in ['CO']:
                        for subgroup3 in h5r[group][subgroup1][subgroup2].keys(): #Data Fields & Geolocation Fields
                            for varName in h5r[group][subgroup1][subgroup2][subgroup3].keys():
                                dataThisTime = h5r[group][subgroup1][subgroup2][subgroup3][varName][:]
                                if iFile == 0:
                                    dico[group][subgroup1][subgroup2][subgroup3][varName] = dataThisTime
                                else:
                                    if dico[group][subgroup1][subgroup2][subgroup3][varName].ndim == 2:
                                        dico[group][subgroup1][subgroup2][subgroup3][varName] = np.append( dico[group][subgroup1][subgroup2][subgroup3][varName], dataThisTime, axis=1 )
                                    else:
                                        dico[group][subgroup1][subgroup2][subgroup3][varName] = np.append( dico[group][subgroup1][subgroup2][subgroup3][varName], dataThisTime, axis=0 )
                                if iFile == len(listOfAllFiles)-1:
                                    h5w.create_dataset(f'{group}/{subgroup1}/{subgroup2}/{subgroup3}/{varName}', 
                                                       data=dico[group][subgroup1][subgroup2][subgroup3][varName])
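
The axis choice in the script above (axis 1 for two-dimensional variables, axis 0 for one-dimensional ones) can be isolated in a small helper. The following is only a sketch under that same assumption about where the pixel axis sits. The function name `concat_along_pixels` and the sample arrays are made up for illustration and do not come from the SOFRID files. It collects the per-file arrays in a list and calls `np.concatenate` once, instead of calling `np.append` for every file.

```python
import numpy as np


def concat_along_pixels(chunks):
    """Concatenate per-file arrays of one variable along the pixel axis.

    Assumption (taken from the script above): 1-D variables are indexed by
    pixel on axis 0, while 2-D variables carry the pixel axis as axis 1.
    """
    chunks = [np.asarray(c) for c in chunks]
    ndim = chunks[0].ndim
    if ndim not in (1, 2):
        raise ValueError(f"expected 1-D or 2-D arrays, got {ndim}-D")
    axis = 1 if ndim == 2 else 0
    # A single concatenate copies the data once; np.append inside a loop
    # re-copies the accumulated array for every additional file.
    return np.concatenate(chunks, axis=axis)


# Made-up stand-ins for two input files: 3 and 2 pixels, 4 vertical levels.
lat_parts = [np.array([10.0, 11.0, 12.0]), np.array([13.0, 14.0])]
co_parts = [np.zeros((4, 3)), np.ones((4, 2))]

lat_all = concat_along_pixels(lat_parts)
co_all = concat_along_pixels(co_parts)
print(lat_all.shape, co_all.shape)
```

In the script above, this would mean appending each `dataThisTime` to a per-variable list while looping over the files, and calling the helper once per variable before `create_dataset`.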

h5py

nco