Split a JSON list of dictionaries in a DataFrame column into new rows in Python
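I'm new to Python. I've tried to find an answer, but nothing seems to work; most answers assume the entire dataset is in JSON format.
I use the code below to retrieve the data via pyodbc: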
formula = """select id, type, custbody_attachment_1 from transaction """
lineitem = pd.read_sql_query(formula, cnxn)
It gives me something like the following:
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Internal_ID | Type | Formula_Text |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2895531 | Bill | |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3492009 | Bill | [{"FL":"https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile0","NM":"someFileName0"}] |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3529162 | Bill | [{"FL":"https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile1","NM":"someFileName1"},{"FL":"https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile2","NM":"someFileName2"}] |
+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I need output like this. (A cell may contain more than two links.)
+-------------+------+---------------------------------------------------------------------+---------------+
| Internal_ID | Type | FL | NM |
+-------------+------+---------------------------------------------------------------------+---------------+
| 2895531 | Bill | | |
+-------------+------+---------------------------------------------------------------------+---------------+
| 3492009 | Bill | https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile0 | someFileName0 |
+-------------+------+---------------------------------------------------------------------+---------------+
| 3529162 | Bill | https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile1 | someFileName1 |
+-------------+------+---------------------------------------------------------------------+---------------+
| 3529162 | Bill | https://.app.netsuite.com/core/media/media.nl?id=someLinkToTheFile2 | someFileName2 |
+-------------+------+---------------------------------------------------------------------+---------------+
I tried treating it as JSON (it looks like JSON data to me), but ran into one problem after another. Finally I ran
print(lineitem['custbody_attachment_1'])
and got the following in the Python console:
999 [{"FL":"https://4811553.app.netsuite.com/core/...
Name: custbody_attachment_1, Length: 1000, dtype: object
So I don't know how to transform it so that new rows are created.
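One thing worth checking first: the printed output above shows double-quoted keys, which suggests each cell is a plain str rather than a Python list, and `explode` only splits list-like values. A small sketch with a made-up cell value (the `example.invalid` URL and the variable names are placeholders, not taken from the real data) shows the difference:

```python
import json

import pandas as pd

# Hypothetical stand-in for one value of lineitem['custbody_attachment_1'].
cells = pd.Series(['[{"FL": "https://example.invalid/a", "NM": "a.pdf"}]'])

# Each cell is a str, which is why the column's dtype is reported as object.
cell_types = cells.map(type).tolist()

# explode only splits list-like values, so a string passes through unchanged.
exploded_raw = cells.explode()

# After json.loads each cell is a real list of dicts that explode can split.
parsed = cells.map(json.loads)
exploded_parsed = parsed.explode()

print(cell_types)
print(type(exploded_raw.iloc[0]).__name__, type(exploded_parsed.iloc[0]).__name__)
```

If the first print shows `str` for the real column as well, the cells need to be parsed before any reshaping.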
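import json
# The column holds JSON text, so parse each non-empty cell into a list of dicts before exploding.
df['Formula_Text'] = df['Formula_Text'].apply(lambda s: json.loads(s) if isinstance(s, str) and s.strip() else [])
df = df.explode('Formula_Text').reset_index(drop=True)
df = pd.concat([df.drop(columns='Formula_Text'), pd.DataFrame([d if isinstance(d, dict) else {} for d in df['Formula_Text']])], axis=1)
print(df)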
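A self-contained sketch of the full reshaping, runnable without a database connection: it parses each cell's JSON text, explodes the resulting lists into rows, and spreads the dict keys into columns. The helper name `split_json_column` and the shortened `example.invalid` URLs are assumptions made for illustration; the sample rows are modeled on the tables in the question.

```python
import json

import pandas as pd


def split_json_column(df, column):
    """Return one row per dict stored as JSON text in `column` (hypothetical helper)."""
    out = df.copy()
    # Parse JSON text into lists of dicts; blank or NULL cells become [].
    out[column] = out[column].apply(
        lambda s: json.loads(s) if isinstance(s, str) and s.strip() else []
    )
    # One row per list element; an empty list yields a single NaN row.
    out = out.explode(column).reset_index(drop=True)
    # Spread each dict's keys into columns; NaN placeholders become empty dicts.
    expanded = pd.DataFrame(
        [d if isinstance(d, dict) else {} for d in out[column]],
        index=out.index,
    )
    return pd.concat([out.drop(columns=column), expanded], axis=1)


# Sample rows modeled on the tables in the question (URLs shortened).
sample = pd.DataFrame({
    "Internal_ID": [2895531, 3492009, 3529162],
    "Type": ["Bill", "Bill", "Bill"],
    "Formula_Text": [
        "",
        '[{"FL":"https://example.invalid/file0","NM":"someFileName0"}]',
        '[{"FL":"https://example.invalid/file1","NM":"someFileName1"},'
        '{"FL":"https://example.invalid/file2","NM":"someFileName2"}]',
    ],
})

result = split_json_column(sample, "Formula_Text")
print(result)
```

Rows whose cell is empty are kept with missing values in `FL` and `NM`, which matches the first row of the desired output. With the real data, the call would presumably be `split_json_column(lineitem, 'custbody_attachment_1')`.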