[python]: Convert pandas column of dictionary items to individual rows in a DataFrame

I have a pandas DataFrame that looks like this:

date_time    country  src_type  edges
2021-05-01   DE       home      {"home": 10, "nav": 3}
2021-05-03   IN       nav       {"support": 1}
2021-05-04   AE       cart      {"chat": 1, "about": 4, "home": 5}
2021-05-07   US       about     {}

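edges is a dictionary that maps each edge's dst_type to its edge_count. I want every item in the dictionary to become its own row in the DataFrame.

The expected output should make this clearer: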

date_time    country  src_type  dst_type  edge_count
2021-05-01   DE       home      home      10
2021-05-01   DE       home      nav       3
2021-05-03   IN       nav       support   1
2021-05-04   AE       cart      chat      1
2021-05-04   AE       cart      about     4
2021-05-04   AE       cart      home      5

The last row of the original DataFrame is dropped because its edges dictionary is empty:

date_time    country  src_type  edges
. . .
2021-05-07   US       about     {}

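Currently, I'm doing the following:

import pandas as pd

# df is the DataFrame shown above.
records = []

# Emit one record per (row, dictionary item) pair; rows with an empty dict emit nothing.
for _, row in df.iterrows():
    for dst_type, edge_count in sorted(row["edges"].items()):
        records.append(
            (row["date_time"], row["country"], row["src_type"], dst_type, edge_count)
        )

df = pd.DataFrame.from_records(
    records, columns=["date_time", "country", "src_type", "dst_type", "edge_count"]
)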

However, this is very slow because iterating over a DataFrame row by row takes time. I'd like to vectorize this operation to make it faster. Any pointers or suggestions?

I'd appreciate any help, since this would speed up our processing. Thanks!

You can use pd.DataFrame() to convert the dictionaries to new columns, with the dict keys as column labels. Then use .melt() to turn the new columns into individual rows. Sort by the date_time column as required using .sort_values(). Finally, use .dropna() to remove the rows that have no value (NaN) in the resulting edge_count column:

df2 = df.drop('edges', axis=1).join(pd.DataFrame(df['edges'].tolist()))

(df2.melt(id_vars=['date_time', 'country', 'src_type'], var_name='dst_type', value_name='edge_count')
    .sort_values('date_time')
    .dropna(subset=['edge_count'])
)

Result:

     date_time country src_type dst_type  edge_count
0   2021-05-01      DE     home     home        10.0
4   2021-05-01      DE     home      nav         3.0
9   2021-05-03      IN      nav  support         1.0
18  2021-05-04      AE     cart    about         4.0
14  2021-05-04      AE     cart     chat         1.0
2   2021-05-04      AE     cart     home         5.0
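
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach on the sample data from the question. It makes a few assumptions beyond the answer above. date_time is held as plain strings. The names id_cols, wide and result are only illustrative. The index of df is passed to pd.DataFrame() so the join still aligns if df does not have a default 0..n-1 index. edge_count is cast back to integers, because the NaN placeholders turn the column into floats (the 10.0, 3.0 in the output above).

```python
import pandas as pd

# Sample data copied from the question (date_time kept as strings: an assumption).
df = pd.DataFrame(
    {
        "date_time": ["2021-05-01", "2021-05-03", "2021-05-04", "2021-05-07"],
        "country": ["DE", "IN", "AE", "US"],
        "src_type": ["home", "nav", "cart", "about"],
        "edges": [
            {"home": 10, "nav": 3},
            {"support": 1},
            {"chat": 1, "about": 4, "home": 5},
            {},
        ],
    }
)

id_cols = ["date_time", "country", "src_type"]

# Expand each dict into one column per key. Reusing df.index keeps the join
# aligned even when df does not have a default RangeIndex.
wide = df[id_cols].join(pd.DataFrame(df["edges"].tolist(), index=df.index))

result = (
    wide.melt(id_vars=id_cols, var_name="dst_type", value_name="edge_count")
    .dropna(subset=["edge_count"])       # keys a row did not have show up as NaN
    .astype({"edge_count": "int64"})     # no NaN left, so integers are safe again
    .sort_values(id_cols + ["dst_type"])
    .reset_index(drop=True)
)

print(result)
```

Sorting on dst_type as well gives the same row order as the sorted(...) loop in the question. The intermediate frame has one column per distinct dictionary key, so it can get wide when there are many distinct keys; it is worth timing both versions on real data before switching.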