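Exporting a nested dictionary with multiple values and variable keys to Excel
This is a second attempt at my earlier question, "Exporting a nested dictionary with multiple values and variable keys to excel". What I need is to export the following dictionary to Excel.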
{1: {'Field Cluster': ['This', 'This', 'This'],
'Exploration Block': ['Is', 'Is', 'Is'],
'Producing since': [1923.0, 1923.0, 1923.0],
'Fluids': ['A ', 'A ', 'A '],
'Reservoirs': ['Test', 'Test', 'Test'],
'Area (km2)': ['File', 'File', 'File'],
'Depth (m)': ['A\nHuge\nDepth', 'A\nHuge\nDepth', 'A\nHuge\nDepth'],
'Concession License No.': ['UNIX license', 'UNIX license', 'UNIX license'],
'License Expiry Date / Extension': ['Everlasting', 'Everlasting', 'Everlasting'],
'Working Interest with SB': ['There is one\n', 'There is one\n', 'There is one\n'],
'Government approval:': ['It is!', 'It is!', 'It is!'],
'Last study:': ['Million years ago', 'Million years ago', 'Million years ago'],
'Parameters': ['Horizon1', 'Horizon2', 'Horizon3'],
'Reservoir rock': ['First', 'Second', 'Third'],
'Net pay thickness (m)': [1.0, 21.0, 41.0],
'Avr. porosity (%)': [2.0, 22.0, 42.0],
'Average absolute permeability (mD)': [3.0, 23.0, 43.0],
'Swi (%)': [4.0, 24.0, 44.0],
'Initial pressure (at)': [5.0, 25.0, 45.0],
'Bubble Pressure (at.)': [6.0, 26.0, 46.0],
'Dew Point Pressure (at)': [7.0, 27.0, 47.0],
'Initial Solution Ratio (Stm3/m3)': [8.0, 28.0, 48.0],
'Initial Condensate Gas Ratio (g/Stm3)': [9.0, 29.0, 49.0],
'Oil density (kg/cm)': [10.0, 30.0, 50.0],
'Oil viscosity (Pb) (cP)': [11.0, 31.0, 51.0],
'Contaminants (H2S, CO2)': [12.0, 32.0, 52.0],
'Initial Oil in Place (e3 to)': [13.0, 33.0, 53.0],
'Initial NGL in Place (e3 to)': [14.0, 34.0, 54.0]},
2: {'Field Cluster': ['This fff', 'This fff', 'This fff', 'This fff'],
'Exploration Block': ['fff', 'fff', 'fff', 'fff'],
'Producing since': ['1923fff', '1923fff', '1923fff', '1923fff'],
'Fluids': ['A fff', 'A fff', 'A fff', 'A fff'],
'Reservoirs': ['Test', 'Test', 'Test', 'Test'],
'Area (km2)': ['File', 'File', 'File', 'File'],
'Depth (m)': ['A\nHuge\nDepthfff', 'A\nHuge\nDepthfff', 'A\nHuge\nDepthfff', 'A\nHuge\nDepthfff'],
'Concession License No.': ['UNIX license', 'UNIX license', 'UNIX license', 'UNIX license'],
'License Expiry Date / Extension': ['Everlastingfff', 'Everlastingfff', 'Everlastingfff', 'Everlastingfff'],
'Working Interest': ['There is one\n', 'There is one\n', 'There is one\n', 'There is one\n'],
'Gouvernment approval:': ['ffff', 'ffff', 'ffff', 'ffff'],
'Last study:': ['Million years fffff', 'Million years fffff', 'Million years fffff', 'Million years fffff'],
'Parameters': ['Horizon1', 'Horizon2', 'Horizon3', 'Horizon4'],
'Reservoir rock': ['First', 'Second', 'Third', 'Fourth'],
'Net pay thickness (m)': [1.0, 21.0, 41.0, 61.0],
'Avr. porosity (%)': [2.0, 22.0, 42.0, 62.0],
'Average absolute permeability (mD)': [3.0, 23.0, 43.0, 63.0],
'Swi (%)': [4.0, 24.0, 44.0, 64.0],
'Initial Oil in Place (e3 to)': [13.0, 33.0, 53.0, 73.0],
'Initial NGL in Place (e3 to)': [14.0, 34.0, 54.0, 74.0],
'Initial Gas (assoc.) in Place (e6 m3) sol.gas/gas cap': [15.0, 35.0, 55.0, 75.0],
'Initial Gas (non assoc.) in Place (e6 m3)': [16.0, 36.0, 56.0, 76.0],
'Primary recovery / drive mechanism\nNone': ['Wow\nA', 'Recovery\nNone', 'Mechanism\nNone', 'Nice\nNone', ''],
'Secondary recovery': ['Another one', '', '', '', ''],
'Total Wells': ['1000', '-', '-', '-', ''],
'Productive wells (oil/gas)': ['500', '-', '-', '-', ''],
'Injection wells (water/gas)': ['500', '-', '-', '-', ''],
'Rate of best producer in the field (tons / e3 Sm3/day)': ['30', '-', '-', '-', ''],
'WOW Production (Something)': ['1', 2.0, '3', '4', '']}}
The previous post received two answers. The first one:
import pandas as pd

df = pd.DataFrame(d)  # assuming d is the name of the dict
cols = df.columns
final = pd.concat([pd.DataFrame(df[i].dropna().tolist()) for i in cols], axis=1, keys=cols)
final.index = df.index
print(final)
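This one only works for the first nested dictionary. The key problem is that some keys are missing from the second sub-dictionary, and the values are laid out in the order used by the first dictionary. As a result, values end up next to the wrong parameters.
The other answer is very similar. It works for the test dictionary, but not for the dictionary above: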
df = pd.DataFrame(d)  # assuming d is the name of the dict
cols = df.columns
final = pd.concat([pd.DataFrame(v).T for k, v in d.items()], axis=1, sort=False, keys=d.keys())
final.index = df.index
print(final)
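For the actual dictionary, this code returns only two rows, with the parameters packed into tuples. It also only takes the second sub-dictionary into account.
In short, this is what I want. Suppose we have this small dictionary, which is very similar to the real one: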
{1:
{'Parameter 1': ['Value 1', 'Value 2', 'Value 3'],
'Parameter 2': ['Value 11', 'Value 22', 'Value 33'],
'Parameter 3': ['Num1', 'Num2', 'Num3']},
2:
{'Parameter 1': ['Data 1', 'Data 2', 'Data 3'],
'Parameter 2': ['Data 11', 'Data 22', 'Data 33'],
'Parameter 4': ['Numb11', 'Numb22', 'Numb33']}
}
I want to get a table like this from it:
            |               1                |               2                |
-------------------------------------------------------------------------------
Parameter 1 | Value 1  | Value 2  | Value 3  | Data 1   | Data 2   | Data 3   |
-------------------------------------------------------------------------------
Parameter 2 | Value 11 | Value 22 | Value 33 | Data 11  | Data 22  | Data 33  |
-------------------------------------------------------------------------------
Parameter 3 | Num1     | Num2     | Num3     |          |          |          |
-------------------------------------------------------------------------------
Parameter 4 |          |          |          | Numb11   | Numb22   | Numb33   |
-------------------------------------------------------------------------------
So every value sits next to its own parameter, and all parameters are listed in the first column without repetition.
The following does the same job as the code you posted (in two fewer lines):
df_to_concat = {k: pd.DataFrame(v).transpose() for (k, v) in d.items()}
df = pd.concat(df_to_concat.values(), keys=df_to_concat.keys(), axis='columns')
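To make the behaviour concrete, here is a minimal sketch that runs the same idea on the small test dictionary from the question. It passes the dictionary of frames straight to pd.concat, which uses the dictionary keys as the column groups; the names frames and table are only illustrative.

```python
import pandas as pd

# The small test dictionary from the question
d = {
    1: {'Parameter 1': ['Value 1', 'Value 2', 'Value 3'],
        'Parameter 2': ['Value 11', 'Value 22', 'Value 33'],
        'Parameter 3': ['Num1', 'Num2', 'Num3']},
    2: {'Parameter 1': ['Data 1', 'Data 2', 'Data 3'],
        'Parameter 2': ['Data 11', 'Data 22', 'Data 33'],
        'Parameter 4': ['Numb11', 'Numb22', 'Numb33']},
}

# One frame per outer key, transposed so that parameters become the row index
frames = {k: pd.DataFrame(v).transpose() for k, v in d.items()}

# Side-by-side concatenation aligns rows on the parameter name;
# a parameter missing from one sub-dictionary is filled with NaN
table = pd.concat(frames, axis='columns')
print(table)
```

Because the rows are aligned on the parameter names rather than on position, 'Parameter 3' is empty under key 2 and 'Parameter 4' is empty under key 1, which is the layout asked for.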
But your large dictionary contains lists of unequal length, which gives the following error:
ValueError: arrays must all be same length
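If it is not obvious which lists are responsible, a small check along these lines can help. It is only a sketch: list_lengths is a hypothetical helper, and sample is a made-up dictionary standing in for the real one.

```python
def list_lengths(d):
    """Map each outer key to {list length: [inner keys whose list has that length]}."""
    report = {}
    for outer_key, sub in d.items():
        by_length = {}
        for inner_key, values in sub.items():
            by_length.setdefault(len(values), []).append(inner_key)
        report[outer_key] = by_length
    return report


# Made-up data: under key 2, list 'b' has one trailing empty string too many
sample = {1: {'a': [1, 2, 3], 'b': [4, 5, 6]},
          2: {'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8, '']}}

# Only report the sub-dictionaries that mix several list lengths
for outer_key, by_length in list_lengths(sample).items():
    if len(by_length) > 1:
        print(outer_key, by_length)
```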
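The lists for the last few keys have an extra empty value at the end. When I remove it by hand, the code works. If you want to do this programmatically, you can do something like the following before creating the DataFrames. It trims the trailing values from the lists that contain too many items: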
# Length of the shortest list in each sub-dictionary
min_length = {k: min(len(one_list) for one_list in v.values()) for k, v in d.items()}

# Trim every list to that length so that pd.DataFrame(v) accepts it
new_d = {}
for k, v in d.items():
    new_v = {}
    for k2, one_list in v.items():
        new_v[k2] = one_list[:min_length[k]]
    new_d[k] = new_v
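Putting the pieces together, the trimming, the concatenation and the export could look roughly like this. Treat it as a sketch under assumptions: the function names, the sample data and the file name fields.xlsx are made up for illustration, and to_excel needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def trim_to_shortest(d):
    """Cut every list in a sub-dictionary down to that sub-dictionary's shortest list."""
    trimmed = {}
    for k, v in d.items():
        shortest = min(len(one_list) for one_list in v.values())
        trimmed[k] = {k2: one_list[:shortest] for k2, one_list in v.items()}
    return trimmed


def build_table(d):
    """One row per parameter, one group of columns per outer key."""
    frames = {k: pd.DataFrame(v).transpose() for k, v in trim_to_shortest(d).items()}
    return pd.concat(frames, axis='columns')


def export_table(d, path='fields.xlsx'):
    """Write the table to an Excel file (hypothetical default file name)."""
    build_table(d).to_excel(path)


# Made-up excerpt shaped like the real dictionary, including the trailing ''
sample = {
    1: {'Parameters': ['Horizon1', 'Horizon2', 'Horizon3'],
        'Swi (%)': [4.0, 24.0, 44.0]},
    2: {'Parameters': ['Horizon1', 'Horizon2', 'Horizon3', 'Horizon4'],
        'Total Wells': ['1000', '-', '-', '-', '']},
}
table = build_table(sample)
print(table)
```

Calling export_table(d) on the real dictionary would then write the file. Keep in mind that trimming silently drops whatever sits beyond the shortest list, so it is only appropriate when the extra items are empty placeholders, as they are here.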