Parse JSON to Excel - Pandas + xlwt
I have this about half working, but I need some help formatting the data in the output sheet.
My current code...
import json
import pandas as pd

# json.loads() expects a string, so the response is kept as JSON text here.
response = '{"sic2":[{"confidence":1.0,"label":"73"}],"sic4":[{"confidence":0.5,"label":"7310"}],"sic8":[{"confidence":0.5,"label":"73101000"},{"confidence":0.25,"label":"73102000"},{"confidence":0.25,"label":"73109999"}]}'
# Create a Pandas dataframe from the data.
df = pd.DataFrame.from_dict(json.loads(response), orient='index')
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()
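The output looks like this...
What I want is this...
I think I first need to extract and organize the headers. That includes manually assigning a header to columns that don't get one by default, such as the SIC column.
After that, I can feed the data into the columns under their respective headers.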
You can iterate over the keys of the JSON object, create a dataframe from each one, and then use pd.concat to combine them all:
import json
import pandas as pd
response = '{"sic2":[{"confidence":1.0,"label":"73"}],"sic4":[{"confidence":0.5,"label":"7310"}],"sic8":[{"confidence":0.5,"label":"73101000"},{"confidence":0.25,"label":"73102000"},{"confidence":0.25,"label":"73109999"}]}'
json_data = json.loads(response)
all_frames = []
for k, v in json_data.items():
    df = pd.DataFrame(v)
    df['SIC Category'] = k
    all_frames.append(df)
final_data = pd.concat(all_frames).set_index('SIC Category')
print(final_data)
This prints:
              confidence     label
SIC Category
sic2                1.00        73
sic4                0.50      7310
sic8                0.50  73101000
sic8                0.25  73102000
sic8                0.25  73109999
You can then export it to Excel as before with final_data.to_excel(writer, sheet_name='Sheet1').
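For completeness, here is a minimal end-to-end sketch of the same idea, wrapped in two small functions. The names build_sic_frame and export_sic_frame, the "row" level name, and the output file name are made up for illustration and are not from the original answer. It passes a dict of frames to pd.concat so the JSON keys become the index directly, and it assumes an Excel engine such as xlsxwriter or openpyxl is installed.

```python
import json

import pandas as pd

RESPONSE = (
    '{"sic2":[{"confidence":1.0,"label":"73"}],'
    '"sic4":[{"confidence":0.5,"label":"7310"}],'
    '"sic8":[{"confidence":0.5,"label":"73101000"},'
    '{"confidence":0.25,"label":"73102000"},'
    '{"confidence":0.25,"label":"73109999"}]}'
)


def build_sic_frame(response: str) -> pd.DataFrame:
    """Flatten the JSON response into one row per (category, label) pair."""
    data = json.loads(response)
    # One small frame per top-level key (sic2, sic4, sic8).
    frames = {key: pd.DataFrame(records) for key, records in data.items()}
    # Concatenating a dict uses its keys as the outer index level.
    combined = pd.concat(frames, names=["SIC Category", "row"])
    # Drop the inner positional level so only the category remains.
    return combined.reset_index(level="row", drop=True)


def export_sic_frame(frame: pd.DataFrame, path: str, sheet_name: str = "Sheet1") -> None:
    """Write the frame to an Excel file; the index name becomes the first header."""
    # The context manager closes the writer and saves the file on exit.
    with pd.ExcelWriter(path) as writer:
        frame.to_excel(writer, sheet_name=sheet_name)


final_data = build_sic_frame(RESPONSE)
```

Calling export_sic_frame(final_data, 'pandas_simple.xlsx') should then produce a sheet whose header row reads SIC Category, confidence, label. Using the writer as a context manager avoids having to call close() by hand.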