Turning a huge TextEdit file of JSON into a Pandas dataframe

Question

I have a very large list of files in the form of a JSON TextEdit document; each file has 6 key-value pairs.

I would like to turn each key into a column name of a Pandas DataFrame and list its values under that column.

{'column1': "stuff stuff", 'column2': "details details, ....}

Is there a standard way to do this?

I think you could start by loading the file into a DataFrame with:

import pandas as pd
df = pd.read_table(file_name)

I think each column could then be created by iterating over each JSON document with groupby.

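Edit: I think the right approach is to parse each JSON object into a DataFrame, and then write a function that iterates over all of the JSON objects and builds a single DataFrame.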

Answer 1

Take a look at read_json or json_normalize. You would indeed most likely read each file and then use, for instance, pd.concat to combine them as needed.
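As a rough sketch of that idea, the helper below reads each file, turns it into a small DataFrame with json_normalize, and stacks the results with pd.concat. The function name load_json_files and the assumption that every file holds either one JSON object or a list of JSON objects are mine, not something stated in the question, so adjust to your actual layout.

```python
import json

import pandas as pd


def load_json_files(paths):
    """Hypothetical helper: read each JSON file and stack the rows."""
    frames = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as f:
            # Assumption: the file holds one JSON object or a list of them.
            data = json.load(f)
        # json_normalize turns the keys into columns (one row per object).
        frames.append(pd.json_normalize(data))
    # Combine the per-file frames once at the end.
    return pd.concat(frames, ignore_index=True)
```

Calling load_json_files(["a.json", "b.json"]) (file names made up for illustration) would give one row per JSON object, with the keys as columns.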

Something along the following lines should work, depending on what your files look like (this assumes each JSON dictionary takes up one line in the file):

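import json
import pandas as pd

# Assumes one flat JSON dictionary per line of the file.
with open('workfile', 'r') as f:
    records = [json.loads(line) for line in f if line.strip()]
# Each dictionary becomes one row; its keys become the column names.
df = pd.DataFrame(records)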
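If the file really is one JSON object per line, read_json can also parse it directly via its lines=True option. This is only a sketch: the wrapper name read_json_lines is made up here, and it assumes every line is valid JSON (double-quoted strings, unlike the single-quoted sample in the question).

```python
import pandas as pd


def read_json_lines(path):
    """Hypothetical wrapper: load a one-JSON-object-per-line file."""
    # lines=True tells read_json to treat every line as a separate record.
    return pd.read_json(path, lines=True)
```

For a file too large to parse in one go, read_json also accepts a chunksize argument together with lines=True, in which case it returns an iterator of DataFrames rather than a single frame.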
