Problem related to h5py and create_dataset

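Maybe this is a silly question, but so far I haven't found a solution. I got some code from someone else, who probably uses a different setup than I do (e.g. Python 2 instead of 3, etc.). I made some small changes to get things working, but I'm stuck on what is probably a simple problem related to h5py.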

The part of the code where it breaks looks like this:

labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']
for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
base.create_dataset('Labels', data=labels_ALL)
base.create_dataset('Units', data=units_ALL)

The problem seems to be in base.create_dataset:

Traceback (most recent call last):

  File "C:\Users\DaniJ\Documents\PostDoc_Jena\Trips, Conf, etc\Sinfonia Workshop\Exercise_1\exercise_1_SINFONIA_for_One\NR_chem_SINGLE_NoEu.py", line 252, in <module>
    base.create_dataset('Labels', data=labels_ALL)

  File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\group.py", line 136, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)

  File "C:\Users\DaniJ\anaconda3\lib\site-packages\h5py\_hl\dataset.py", line 118, in make_new_dset
    tid = h5t.py_create(dtype, logical=1)

  File "h5py\h5t.pyx", line 1634, in h5py.h5t.py_create

  File "h5py\h5t.pyx", line 1656, in h5py.h5t.py_create

  File "h5py\h5t.pyx", line 1717, in h5py.h5t.py_create

TypeError: No conversion path for dtype: dtype('<U10')

The variable base appears to be an h5py._hl.files.File object.

Can anyone tell me how to solve this?

Thanks

Best regards, Dani

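Did you get your problem solved? I'm 99.9% sure it is related to your Labels data: it is probably a NumPy array rather than a list. I wrote 3 short examples to demonstrate the difference.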

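  1. The first code segment uses a list and successfully creates the datasets in file SO_69900543_1.h5.
  2. The second code segment reproduces your error. It converts the list to a NumPy array, then fails when it tries to create the dataset in file SO_69900543_2.h5. Note that it gives the same error message you got: TypeError: No conversion path for dtype: dtype('<U10').
  3. The third code segment shows how to convert the numpy.str_ elements to str (which fixes the problem in segment #2). Note that each Labels value is converted with str() before it is appended to labels_ALL.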

Maybe this will help you find (and fix) your Unicode data problem.
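To find the offending elements in your own data, a quick check like the one below may help. It is only a sketch: the two-element lists are made-up stand-ins for your real labels_ALL and Labels, and not_plain_str is a name I chose for illustration. It lists every element whose type is not exactly the built-in str, and shows the fixed-width Unicode dtype ('<U...') that NumPy produces for such a list, which is the kind of dtype named in your error message.

```python
import numpy as np

# Made-up stand-ins for the real data: a plain list that is extended
# with elements taken from a NumPy array, as in the question.
labels_ALL = ['ionic_str', 'psi0']
Labels = np.array(['H+', 'Na+'])
for label in Labels:
    labels_ALL.append(label)  # appends numpy.str_ objects, not str

# Collect (index, type name) for every element that is not exactly str.
not_plain_str = [(i, type(x).__name__)
                 for i, x in enumerate(labels_ALL)
                 if type(x) is not str]
print(not_plain_str)

# Converting the list to an array yields a fixed-width Unicode dtype.
arr_dtype = np.asarray(labels_ALL).dtype
print(arr_dtype)
```

If the first print shows an empty list, every element is already a plain str and the cause is probably somewhere else.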

Code segment 1 (works):

import h5py

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']

for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')
with h5py.File('SO_69900543_1.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)

Code segment 2 (returns TypeError):

import h5py
import numpy as np

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array
# This will trigger the error when creating the dataset
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']

for i in range(len(Labels)):
    labels_ALL.append(Labels[i])
    units_ALL.append('(mol/L)')

for i in range(len(labels_ALL)):   
    print(i, type(labels_ALL[i]), type(units_ALL[i]))

with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)  

Code segment 3 (works):

import h5py
import numpy as np

Labels = ['H+','Na+','Cl-','OH-','>SOH_x','>SO-_x','>SONa_x','>SOH2+_x','>SOH2Cl_x','>SOH_y','>SO-_y','>SONa_y']
# Convert Labels List to NumPy array
# This will trigger the error when creating the dataset if not modified
Labels = np.array(Labels)
labels_ALL = ['ionic_str','psi0','psi1','psi2','psid','zeta','sig0','sig1','sig2','sigd','sig0_eq','sig1_eq','sig2_eq','sigd_eq','ch_bal_EDL','ch_bal_aq', 'sum_resid']
units_ALL = ['(mol/L)','(V)','(V)','(V)','(V)','(V)','(C/m**2)','(C/m**2)','(C/m**2)','(C/m**2)','(mol(eq))','(mol(eq))','(mol(eq))','(mol(eq))','(C/m**2)','(mol(eq)/L)',' ']

for i in range(len(Labels)):
    # use str() to convert from 'numpy.str_' to 'str'
    labels_ALL.append(str(Labels[i])) 
    units_ALL.append('(mol/L)')

for i in range(len(labels_ALL)):   
    print(i, type(labels_ALL[i]), type(units_ALL[i]))
    
with h5py.File('SO_69900543_2.h5','w') as base:
    base.create_dataset('Labels', data=labels_ALL)
    base.create_dataset('Units', data=units_ALL)
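
If you would rather not depend on h5py working out a string type from the list contents, another option is to pass the dtype explicitly. The sketch below assumes h5py 2.10 or later (where h5py.string_dtype() is available). The short label lists and the file name labels_sketch.h5 are made up for illustration. It still converts every element with str() first, as in segment 3, and then asks for a variable-length UTF-8 string dataset.

```python
import os
import tempfile

import h5py
import numpy as np

# Made-up sample values; the list ends up mixing str and numpy.str_.
labels_ALL = ['ionic_str', 'psi0']
Labels = np.array(['H+', 'Na+'])
labels_ALL.extend(Labels)

# Variable-length UTF-8 string type, passed explicitly to create_dataset.
str_dtype = h5py.string_dtype(encoding='utf-8')

# Illustrative file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'labels_sketch.h5')
with h5py.File(path, 'w') as base:
    base.create_dataset('Labels',
                        data=[str(x) for x in labels_ALL],
                        dtype=str_dtype)

# Read the values back to check what was stored.
with h5py.File(path, 'r') as base:
    raw = base['Labels'][()]

# Depending on the h5py version, elements come back as bytes or str.
stored = [x.decode('utf-8') if isinstance(x, bytes) else str(x) for x in raw]
print(stored)
```

The same pattern should apply to the Units dataset. Note that strings read back from the file may be bytes rather than str, so code that reads the file may need the decode step shown above.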