Exporting a nested dictionary with multiple values and variable keys to Excel

Question

This is my second attempt at this question. What I need is to export the following dictionary to Excel.

{1: {'Field Cluster': ['This', 'This', 'This'], 
     'Exploration Block': ['Is', 'Is', 'Is'], 
     'Producing since': [1923.0, 1923.0, 1923.0], 
     'Fluids': ['A ', 'A ', 'A '], 
     'Reservoirs': ['Test', 'Test', 'Test'], 
     'Area (km2)': ['File', 'File', 'File'], 
     'Depth (m)': ['A\nHuge\nDepth', 'A\nHuge\nDepth', 'A\nHuge\nDepth'],   
     'Concession License No.': ['UNIX license', 'UNIX license', 'UNIX license'], 
     'License Expiry Date / Extension': ['Everlasting', 'Everlasting', 'Everlasting'], 
     'Working Interest with SB': ['There is one\n', 'There is one\n', 'There is one\n'], 
     'Government approval:': ['It is!', 'It is!', 'It is!'], 
     'Last study:': ['Million years ago', 'Million years ago', 'Million years ago'], 
     'Parameters': ['Horizon1', 'Horizon2', 'Horizon3'], 
     'Reservoir rock': ['First', 'Second', 'Third'], 
     'Net pay thickness (m)': [1.0, 21.0, 41.0], 
     'Avr. porosity (%)': [2.0, 22.0, 42.0], 
     'Average absolute permeability  (mD)': [3.0, 23.0, 43.0], 
     'Swi (%)': [4.0, 24.0, 44.0], 
     'Initial pressure (at)': [5.0, 25.0, 45.0], 
     'Bubble Pressure (at.)': [6.0, 26.0, 46.0], 
     'Dew Point Pressure (at)': [7.0, 27.0, 47.0], 
     'Initial Solution Ratio (Stm3/m3)': [8.0, 28.0, 48.0], 
     'Initial Condensate Gas Ratio (g/Stm3)': [9.0, 29.0, 49.0], 
     'Oil density (kg/cm)': [10.0, 30.0, 50.0], 
     'Oil viscosity (Pb) (cP)': [11.0, 31.0, 51.0], 
     'Contaminants (H2S, CO2)': [12.0, 32.0, 52.0], 
     'Initial Oil in Place (e3 to)': [13.0, 33.0, 53.0], 
     'Initial NGL in Place (e3 to)': [14.0, 34.0, 54.0]}, 
 2: {'Field Cluster': ['This fff', 'This fff', 'This fff', 'This fff'],                 
     'Exploration Block': ['fff', 'fff', 'fff', 'fff'], 
     'Producing since': ['1923fff', '1923fff', '1923fff', '1923fff'],     
     'Fluids': ['A fff', 'A fff', 'A fff', 'A fff'],
     'Reservoirs': ['Test', 'Test', 'Test', 'Test'], 
     'Area (km2)': ['File', 'File', 'File', 'File'], 
     'Depth (m)': ['A\nHuge\nDepthfff', 'A\nHuge\nDepthfff', 'A\nHuge\nDepthfff', 'A\nHuge\nDepthfff'], 
     'Concession License No.': ['UNIX license', 'UNIX license', 'UNIX license', 'UNIX license'], 
     'License Expiry Date / Extension': ['Everlastingfff', 'Everlastingfff', 'Everlastingfff', 'Everlastingfff'], 
     'Working Interest': ['There is one\n', 'There is one\n', 'There is one\n', 'There is one\n'], 
     'Gouvernment approval:': ['ffff', 'ffff', 'ffff', 'ffff'], 
     'Last study:': ['Million years fffff', 'Million years fffff', 'Million years fffff', 'Million years fffff'], 
     'Parameters': ['Horizon1', 'Horizon2', 'Horizon3', 'Horizon4'],     
     'Reservoir rock': ['First', 'Second', 'Third', 'Fourth'], 
     'Net pay thickness (m)': [1.0, 21.0, 41.0, 61.0], 
     'Avr. porosity (%)': [2.0, 22.0, 42.0, 62.0], 
     'Average absolute permeability  (mD)': [3.0, 23.0, 43.0, 63.0], 
     'Swi (%)': [4.0, 24.0, 44.0, 64.0], 
     'Initial Oil in Place (e3 to)': [13.0, 33.0, 53.0, 73.0], 
     'Initial NGL in Place (e3 to)': [14.0, 34.0, 54.0, 74.0], 
     'Initial Gas (assoc.) in Place (e6 m3) sol.gas/gas cap': [15.0, 35.0, 55.0, 75.0], 
     'Initial Gas (non assoc.) in Place (e6 m3)': [16.0, 36.0, 56.0, 76.0],    
     'Primary recovery / drive mechanism\nNone': ['Wow\nA', 'Recovery\nNone', 'Mechanism\nNone', 'Nice\nNone', ''], 
     'Secondary recovery': ['Another one', '', '', '', ''], 
     'Total Wells': ['1000', '-', '-', '-', ''], 
     'Productive wells (oil/gas)': ['500', '-', '-', '-', ''], 
     'Injection wells (water/gas)': ['500', '-', '-', '-', ''], 
     'Rate of best producer in the field (tons / e3 Sm3/day)': ['30', '-', '-', '-', ''], 
     'WOW Production (Something)': ['1', 2.0, '3', '4', '']}}

The previous post got two answers. The first one:

df=pd.DataFrame(d) # assuming d is the name of the dict
cols=df.columns
final=pd.concat([pd.DataFrame(df[i].dropna().tolist()) for i in cols],axis=1,keys=cols)
final.index=df.index
print(final)

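This one only works for the first nested dictionary. The key problem is that some keys are missing from the second sub-dictionary, and the values are ordered according to the key order of the first dictionary. As a result, values end up next to the wrong parameters.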

The other answer is very similar. It works for the test dictionary, but not for the dictionary above:

df=pd.DataFrame(d) # assuming d is the name of the dict
cols=df.columns
final=pd.concat([pd.DataFrame(v).T for k,v in d.items()],axis=1,sort=False,keys=d.keys())
final.index=df.index
print(final)

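For the actual dictionary, this code returns only two rows, with the parameters shown as tuples. It also takes only the second sub-dictionary into account.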

In short, this is what I want. Suppose we have this small dictionary, which is very similar to the actual one:

{1: 
    {'Parameter 1': ['Value 1', 'Value 2', 'Value 3'], 
     'Parameter 2': ['Value 11', 'Value 22', 'Value 33'], 
     'Parameter 3': ['Num1', 'Num2', 'Num3']},
 2:
    {'Parameter 1': ['Data 1', 'Data 2', 'Data 3'], 
     'Parameter 2': ['Data 11', 'Data 22', 'Data 33'], 
     'Parameter 4': ['Numb11', 'Numb22', 'Numb33']}
}

From it, I would like to get a table like this:

            |              1              |             2            |
----------------------------------------------------------------------
Parameter 1 | Value 1 | Value 2 | Value 3 | Data 1 | Data 2 | Data 3 |
----------------------------------------------------------------------
Parameter 2 | Value 11| Value 22| Value 33| Data 11| Data 22| Data 33|
----------------------------------------------------------------------
Parameter 3 |   Num1  |   Num2  |   Num3  |        |        |        |
----------------------------------------------------------------------
Parameter 4 |         |         |         | Numb11 | Numb22 | Numb33 |
----------------------------------------------------------------------

So every value lines up with its own parameter, and all the parameters are listed in the first column without repetition.

Answer 1

The following does the same job as the code you posted (but in two fewer lines):

df_to_concat = {k: pd.DataFrame(v).transpose() for (k, v) in d.items()}
df = pd.concat(df_to_concat.values(), keys=df_to_concat.keys(), axis='columns')
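
As a quick check, here is a minimal sketch of the same idea applied to the small dictionary from the question. The variable names `frames` and `small` are only illustrative, and the dictionary is passed to `pd.concat` directly so that its keys become the top column level.

```python
import pandas as pd

# Small dictionary copied from the question
d = {
    1: {'Parameter 1': ['Value 1', 'Value 2', 'Value 3'],
        'Parameter 2': ['Value 11', 'Value 22', 'Value 33'],
        'Parameter 3': ['Num1', 'Num2', 'Num3']},
    2: {'Parameter 1': ['Data 1', 'Data 2', 'Data 3'],
        'Parameter 2': ['Data 11', 'Data 22', 'Data 33'],
        'Parameter 4': ['Numb11', 'Numb22', 'Numb33']},
}

# One frame per outer key: parameters as rows, list positions as columns
frames = {k: pd.DataFrame(v).transpose() for k, v in d.items()}

# Passing the dict itself makes its keys the top level of the columns
small = pd.concat(frames, axis='columns')

print(small.shape)  # (4, 6)
print(small)
```

Rows are matched by parameter name, so 'Parameter 3' is empty (NaN) under key 2 and 'Parameter 4' is empty under key 1.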

But your large dictionary contains lists of unequal length, which raises the following error:

ValueError: arrays must all be same length
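
To see which lists are responsible, a small check along these lines might help. It is only a sketch: `list_lengths` is a hypothetical helper, and the dictionary below is a made-up miniature standing in for the real `d`.

```python
# Hypothetical stand-in for the real dictionary `d`: one list is too long
d = {
    1: {'Parameters': ['Horizon1', 'Horizon2'], 'Swi (%)': [4.0, 24.0]},
    2: {'Parameters': ['Horizon1', 'Horizon2'], 'Total Wells': ['1000', '-', '']},
}

def list_lengths(nested):
    """Map each outer key to {length: [inner keys whose list has that length]}."""
    report = {}
    for outer_key, sub in nested.items():
        by_length = {}
        for inner_key, values in sub.items():
            by_length.setdefault(len(values), []).append(inner_key)
        report[outer_key] = by_length
    return report

lengths = list_lengths(d)
# Sub-dictionaries with more than one distinct length are the ragged ones
ragged = {k: v for k, v in lengths.items() if len(v) > 1}
print(ragged)  # {2: {2: ['Parameters'], 3: ['Total Wells']}}
```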

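The last keys have an extra empty value at the end. When I remove it manually, the code works. If you want to do this programmatically, you can run something like the following before creating the dataframes. It trims the trailing values from the lists that contain too many items: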

# Shortest list length within each sub-dictionary
min_length = {k: min(len(one_list) for one_list in v.values()) for (k, v) in d.items()}
# Cut every list down to that length so all lists in a sub-dictionary match
new_d = {
    k: {k2: one_list[:min_length[k]] for (k2, one_list) in v.items()}
    for (k, v) in d.items()
}
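
Putting the pieces together, the whole route from the nested dictionary to a spreadsheet could look roughly like the sketch below. `nested_dict_to_frame`, `sample` and the file name `output.xlsx` are assumptions made for illustration, and the sample data is a made-up miniature of the real dictionary.

```python
import pandas as pd

def nested_dict_to_frame(nested):
    """Trim each sub-dictionary's lists to a common length, then combine."""
    frames = {}
    for outer_key, sub in nested.items():
        shortest = min(len(values) for values in sub.values())
        trimmed = {name: values[:shortest] for name, values in sub.items()}
        # Parameters become rows; list positions become columns
        frames[outer_key] = pd.DataFrame(trimmed).transpose()
    # Outer keys become the top level of the column MultiIndex
    return pd.concat(frames, axis='columns')

# Hypothetical miniature of the real data, with one list carrying an extra ''
sample = {
    1: {'Parameters': ['Horizon1', 'Horizon2'], 'Swi (%)': [4.0, 24.0]},
    2: {'Parameters': ['Horizon1', 'Horizon2'], 'Total Wells': ['1000', '-', '']},
}
table = nested_dict_to_frame(sample)
print(table)

# Writing to Excel needs an engine such as openpyxl to be installed;
# 'output.xlsx' is a placeholder file name.
# table.to_excel('output.xlsx')
```

Note that trimming discards whatever sits beyond the shortest list, so it is only safe when the surplus entries are empty, as they are in the question's data.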


dictionary

xlrd

dataframe

python-3.x

pandas