Create a nested dictionary using column and row names as keys with a dictionary comprehension

Context: I have the following DataFrame:

                    gene_id  Control_3Aligned.sortedByCoord.out.gtf  Control_4Aligned.sortedByCoord.out.gtf  ...  NET_101Aligned.sortedByCoord.out.gtf  NET_103Aligned.sortedByCoord.out.gtf  NET_105Aligned.sortedByCoord.out.gtf
0  ENSG00000213279|Z97192.2                                       0                                       0  ...                                     3                                     2                                     7     
1     ENSG00000132680|KHDC4                                     625                                     382  ...                                   406                                   465                                   262     
2     ENSG00000145041|DCAF1                                     423                                     104  ...                                   231                                   475                                   254     
3    ENSG00000102547|CAB39L                                     370                                     112  ...                                   265                                   393                                   389     
4     ENSG00000173826|KCNH6                                       0                                       0  ...                                     0                                     0                                     0 

I want a nested dictionary like this example:

   {Control_3Aligned.sortedByCoord.out.gtf: 
             {ENSG00000213279|Z97192.2:0, 
              ENSG00000132680|KHDC4:625,...},
    Control_4Aligned.sortedByCoord.out.gtf: 
             {ENSG00000213279|Z97192.2:0, 
              ENSG00000132680|KHDC4:382,...}}

So the general format is:

{column_name : {row_name:value,...},...}

I am trying something like this:

sample_dict ={}

for column in df.columns[1:]:
    for index in range(0,len(df.index)+1):
        sample_dict.setdefault(column, {row_name:value for row_name,value in zip(df.iloc[index,0], df.loc[index,column])})
        sample_dict[column] += {row_name:value for row_name,value in zip(df.iloc[index,0], df.loc[index,column])}

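But I keep getting TypeError: 'numpy.int64' object is not iterable. The problem seems to be in zip(): zip only accepts iterables, and in this example I am passing it single values instead. I am probably also filling the dictionary incorrectly.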

Any help is very welcome! Thank you in advance.
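
For reference, the cause of the TypeError can probably be reproduced without pandas. This is a minimal sketch, assuming that df.iloc[index, 0] returns a single string and df.loc[index, column] returns a single number. A plain Python int stands in for numpy.int64 here, since neither is iterable. The names gene_ids and counts are made up for illustration.

```python
# Single cell values, as returned by df.iloc[index, 0] and df.loc[index, column]
row_name = "ENSG00000132680|KHDC4"
value = 625  # stand-in for a numpy.int64 scalar

# zip() needs iterables; a scalar number is not one, so this raises TypeError
try:
    dict(zip(row_name, value))
    raised = False
except TypeError:
    raised = True

# zip() works once it receives two whole sequences of equal length
gene_ids = ["ENSG00000213279|Z97192.2", "ENSG00000132680|KHDC4"]
counts = [0, 625]
column_dict = dict(zip(gene_ids, counts))
print(raised, column_dict)
```

In other words, zip() pairs up whole columns, not individual cells, so the inner loop over row indices is not needed.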

Managed to do it like this:

sample_dict = {}
gene_list = []
# Collect the gene IDs in row order (assumes the default 0..n-1 integer index)
for index in range(0, len(df.index)):
    temp_data = df.loc[index, 'gene_id']
    gene_list.append(temp_data)

# Build one {gene_id: value} dictionary per sample column
for column in df.columns[1:]:
    gene_dict = {}
    for index in range(0, len(df.index)):
        # Keep only the first value seen for each gene ID
        if gene_list[index] not in gene_dict:
            gene_dict[gene_list[index]] = df.loc[index, column]
    sample_dict[column] = gene_dict

# Inspect the first (column_name, gene_dict) pair to check the result
dict_pairs = sample_dict.items()
pairs_iterator = iter(dict_pairs)
first_pair = next(pairs_iterator)
first_pair
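
The same structure can probably be built more compactly. This is a sketch, not a drop-in replacement: it uses a two-row, two-sample DataFrame with values copied from the example above, and it assumes gene_id is the first column and that its values are unique. With dict(zip(...)), a repeated gene_id would keep the last value, whereas the loop above keeps the first.

```python
import pandas as pd

# Small stand-in for the real data (values taken from the example above)
df = pd.DataFrame({
    "gene_id": ["ENSG00000213279|Z97192.2", "ENSG00000132680|KHDC4"],
    "Control_3Aligned.sortedByCoord.out.gtf": [0, 625],
    "Control_4Aligned.sortedByCoord.out.gtf": [0, 382],
})

# Dictionary comprehension: one {gene_id: value} dict per sample column
sample_dict = {
    column: dict(zip(df["gene_id"], df[column]))
    for column in df.columns[1:]
}

# Alternative using pandas directly: to_dict() returns {column: {index: value}}
sample_dict_alt = df.set_index("gene_id").to_dict()

print(sample_dict["Control_4Aligned.sortedByCoord.out.gtf"])
```

Both variants pass whole columns to zip() or to pandas, so no explicit loop over row indices is needed.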