Unable to pull the key:value pairs from a dictionary column into multiple columns

I used the solution from this post (Split / Explode a column of dictionaries into separate columns with pandas), but nothing changed in my df.

Here is the df before running the code:

    number  status_timestamps
0   234234  {"created": "2020-11-30T19:44:42Z", "complete"...
1   2342    {"created": "2020-12-14T13:43:48Z", "complete"...

Here is an example of a dictionary in that column:

{"created": "2020-11-30T19:44:42Z", 
"complete": "2021-01-17T14:20:58Z",
 "invoiced": "2020-12-16T22:55:02Z", 
 "confirmed": "2020-11-30T21:16:48Z", 
 "in_production": "2020-12-11T18:59:26Z",
 "invoice_needed": "2020-12-11T22:00:09Z",
 "accepted": "2020-12-01T00:00:23Z", 
 "assets_uploaded": "2020-12-11T17:16:53Z", 
 "notified": "2020-11-30T21:17:48Z", 
 "processing": "2020-12-11T18:49:50Z",
 "classified": "2020-12-11T18:49:50Z"}

Here is what I tried; the df did not change:

df_final = pd.concat([df, df['status_timestamps'].progress_apply(pd.Series)], axis = 1).drop('status_timestamps', axis = 1)
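One thing worth checking here is whether the cells actually hold dicts or just strings that look like dicts. The sketch below assumes the column was loaded from a CSV, in which case every cell would be a `str`; the two-row frame is a hypothetical stand-in built from the sample values above, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the real frame: the cells are JSON-looking
# *strings*, which is what pd.read_csv would produce (an assumption).
df = pd.DataFrame({
    "number": [234234, 2342],
    "status_timestamps": [
        '{"created": "2020-11-30T19:44:42Z", "complete": "2021-01-17T14:20:58Z"}',
        '{"created": "2020-12-14T13:43:48Z"}',
    ],
})

# Count the Python types stored in the column.
cell_types = df["status_timestamps"].map(type).value_counts()
print(cell_types)

# True only if every cell is a real dict that can be expanded by key.
all_dicts = df["status_timestamps"].map(lambda v: isinstance(v, dict)).all()
print("all cells are dicts:", all_dicts)
```

If this reports `str` rather than `dict`, the strings need to be parsed before any of the split/explode recipes can produce one column per key.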

Here is what happens in the notebook:

Next time, please provide a minimal reproducible example of what you tried.

If I follow the solution from the post you mentioned, it works.

Here is the code I used:

import pandas as pd

json_data = {"created": "2020-11-30T19:44:42Z", 
"complete": "2021-01-17T14:20:58Z",
 "invoiced": "2020-12-16T22:55:02Z", 
 "confirmed": "2020-11-30T21:16:48Z", 
 "in_production": "2020-12-11T18:59:26Z",
 "invoice_needed": "2020-12-11T22:00:09Z",
 "accepted": "2020-12-01T00:00:23Z", 
 "assets_uploaded": "2020-12-11T17:16:53Z", 
 "notified": "2020-11-30T21:17:48Z", 
 "processing": "2020-12-11T18:49:50Z",
 "classified": "2020-12-11T18:49:50Z"}
 
df = pd.DataFrame({"number": 2342, "status_timestamps": [json_data]})

# fastest solution proposed by your reference post
df.join(pd.DataFrame(df.pop('status_timestamps').values.tolist()))
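Two details of the line above are easy to miss: `pop` removes the column from `df` in place, and `join` returns a new frame, so outside a notebook cell the result has to be assigned. A minimal sketch with two rows, assuming the cells are real dicts; the second row keeps only the `created` key to show how a missing key is handled, and the variable names are illustrative.

```python
import pandas as pd

# Illustrative rows; the second dict has fewer keys on purpose.
df = pd.DataFrame({
    "number": [234234, 2342],
    "status_timestamps": [
        {"created": "2020-11-30T19:44:42Z", "complete": "2021-01-17T14:20:58Z"},
        {"created": "2020-12-14T13:43:48Z"},
    ],
})

# pop() removes the column from df in place and returns it as a Series.
timestamps = df.pop("status_timestamps")

# One column per key; reusing the original index keeps the rows aligned
# even if the frame does not have the default 0..n-1 index.
expanded = pd.DataFrame(timestamps.tolist(), index=timestamps.index)

# join() returns a new frame, so the result is stored explicitly.
df_final = df.join(expanded)
print(df_final)
```

Keys that are missing from a row end up as NaN in the corresponding column.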

I was able to use another answer from that post, but I switched to the safer literal_eval option, since that answer used eval.

Here is the working code:

import pandas as pd
from ast import literal_eval
df = pd.read_csv('c:/status_timestamps.csv')
# The CSV stores each dict as a string, so parse it back into a real dict first
df["status_timestamps"] = df["status_timestamps"].apply(lambda x: dict(literal_eval(x)))
# Expand each dict into one column per key
df2 = df["status_timestamps"].apply(pd.Series)
df_final = pd.concat([df, df2], axis=1).drop('status_timestamps', axis=1)
df_final
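Since the sample strings use double quotes and look like valid JSON, `json.loads` may be another safe way to parse them, and it can be combined with the list-of-dicts constructor instead of `apply(pd.Series)`. This is only a sketch under that assumption: the one-row frame stands in for the CSV contents, and the final `pd.to_datetime` step is an optional extra that is not part of the answer above.

```python
import json
import pandas as pd

# Stand-in for the CSV contents: the dict is stored as a JSON string.
raw = pd.DataFrame({
    "number": [234234],
    "status_timestamps": [
        '{"created": "2020-11-30T19:44:42Z", "complete": "2021-01-17T14:20:58Z"}'
    ],
})

# Parse each JSON string into a dict (assumes every cell is valid JSON).
parsed = raw["status_timestamps"].map(json.loads)

# Build one column per key, aligned on the original index.
expanded = pd.DataFrame(parsed.tolist(), index=raw.index)

# Optional: turn the ISO 8601 strings into timezone-aware datetimes.
expanded = expanded.apply(pd.to_datetime, utc=True)

df_final = raw.drop(columns="status_timestamps").join(expanded)
print(df_final.dtypes)
```

If some cells are not strict JSON (single quotes, for example), `json.loads` will raise an error and `literal_eval` remains the better fit.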