How to flatten list of dictionaries in multiple columns of pandas dataframe

I have a dataframe in which each record stores a list of dictionaries, like this:

row prodect_id recommend_info
  0 XQ002      [{"recommend_key":"XXX567","recommend_point":50},
                {"recommend_key":"XXX236","recommend_point":20},
                {"recommend_key":"XXX090","recommend_point":35}]
  1 XQ003      [{"recommend_key":"XXX089","recommend_point":30},
                {"recommend_key":"XXX567","recommend_point":20}]

I want to flatten the lists of dictionaries so that the result looks like this:

row prodect_id recommend_info_recommend_key recommend_info_recommend_point
  0 XQ002      XXX567                       50
  1 XQ002      XXX236                       20
  2 XQ002      XXX090                       35
  3 XQ003      XXX089                       30
  4 XQ003      XXX567                       20

I know how to convert a single list of dictionaries to a dataframe, like this:

import pandas as pd

d = [{"recommend_key":"XXX089","recommend_point":30},
     {"recommend_key":"XXX567","recommend_point":20}]

df = pd.DataFrame(d)

row recommend_key recommend_point
  0 XXX089        30
  1 XXX567        20

But I don't know how to do this on a dataframe when one column, or several columns, store lists of dictionaries:

row  col_a  col_b                  col_c
  0  B001   [{"a":"b"},{"a":"c"}]  [{"y":11},{"a":"c"}]
  1  D009   [{"c":"o"},{"g":"c"}]  [{"y":11},{"a":"c"},{"l":"c"}]   
  2  G068   [{"c":"b"},{"a":"c"}]  [{"a":56},{"d":"c"}]
  3  C004   [{"d":"a"},{"b":"c"}]  [{"c":22},{"a":"c"},{"b":"c"}]
  4  F011   [{"h":"u"},{"d":"c"}]  [{"h":27},{"d":"c"}]

Try:

pd.concat([df.explode('recommend_info').drop(['recommend_info'], axis=1),
           df.explode('recommend_info')['recommend_info'].apply(pd.Series)],
          axis=1)

You can repeat the same step for each column that holds a list of dictionaries.

Here is an example:

>>> df = pd.DataFrame({'a': [[{3: 4, 5: 6}, {3:8, 5: 1}],
...                          [{3:2, 5:4}, {3: 8, 5: 10}]],
...                    'b': ['X', "Y"]})
>>> df
                               a  b
0   [{3: 4, 5: 6}, {3: 8, 5: 1}]  X
1  [{3: 2, 5: 4}, {3: 8, 5: 10}]  Y
>>> df = pd.concat([df.explode('a').drop(['a'], axis=1),
...                 df.explode('a')['a'].apply(pd.Series)],
...                axis=1)
>>> df
   b  3   5
0  X  4   6
0  X  8   1
1  Y  2   4
1  Y  8  10
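
For the data in the question, the same idea might look like the sketch below. It assumes pandas 1.1 or later (for the ignore_index argument of explode) and that every cell holds a non-empty list of dictionaries. The recommend_info_ prefix is added with add_prefix to match the column names asked for; the variable names exploded, expanded and flat are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "prodect_id": ["XQ002", "XQ003"],
    "recommend_info": [
        [{"recommend_key": "XXX567", "recommend_point": 50},
         {"recommend_key": "XXX236", "recommend_point": 20},
         {"recommend_key": "XXX090", "recommend_point": 35}],
        [{"recommend_key": "XXX089", "recommend_point": 30},
         {"recommend_key": "XXX567", "recommend_point": 20}],
    ],
})

# One row per dictionary; ignore_index=True renumbers the rows 0..n-1.
exploded = df.explode("recommend_info", ignore_index=True)

# Expand each dictionary into columns and prefix them with the source column name.
expanded = pd.DataFrame(
    exploded["recommend_info"].tolist(), index=exploded.index
).add_prefix("recommend_info_")

# Put the remaining columns and the expanded columns side by side.
flat = pd.concat([exploded.drop(columns="recommend_info"), expanded], axis=1)
print(flat)
```

Building the frame from tolist() instead of apply(pd.Series) is a swapped-in variation, not part of the answer above; it tends to be faster on larger frames.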

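I have a dataframe with multiple columns. One of the columns contains a list, and each list holds a dictionary. I need to expand the dictionary and append it to the same row it came from. Ricardo's answer mostly worked for me. I generalized it below: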

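import pandas as pd

def explode_column_from_list_dict(df_in, column_name_to_explode):
    # One row per list element; explode returns a new frame, so df_in is not modified.
    exploded = df_in.explode(column_name_to_explode)
    df = pd.concat(
        [
            # All other columns, repeated once per list element.
            exploded.drop([column_name_to_explode], axis=1),
            # Each dictionary expanded into its own columns.
            exploded[column_name_to_explode].apply(pd.Series),
        ],
        axis=1,
    )
    return df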
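
For the multi-column case in the question (col_b and col_c), one possible extension is to loop over the columns and prefix the new columns with the source column name, so that keys shared between columns (such as "a") do not collide. This is only a sketch: flatten_dict_list_columns is a hypothetical helper name, it assumes pandas 1.1 or later and non-empty lists of dictionaries in every cell, and it uses the first two rows of the question's second table.

```python
import pandas as pd

def flatten_dict_list_columns(df_in, columns):
    """Explode each listed column of dict lists, one column at a time."""
    df = df_in.copy()
    for col in columns:
        # One row per dictionary, with a fresh 0..n-1 index.
        exploded = df.explode(col, ignore_index=True)
        # Dictionary keys become columns named "<col>_<key>".
        expanded = pd.DataFrame(
            exploded[col].tolist(), index=exploded.index
        ).add_prefix(f"{col}_")
        df = pd.concat([exploded.drop(columns=col), expanded], axis=1)
    return df

df = pd.DataFrame({
    "col_a": ["B001", "D009"],
    "col_b": [[{"a": "b"}, {"a": "c"}], [{"c": "o"}, {"g": "c"}]],
    "col_c": [[{"y": 11}, {"a": "c"}], [{"y": 11}, {"a": "c"}, {"l": "c"}]],
})

flat = flatten_dict_list_columns(df, ["col_b", "col_c"])
print(flat)
```

Note that exploding columns one after another pairs every element of col_b with every element of col_c from the same original row, so a row with 2 and 3 dictionaries yields 6 rows. Keys missing from a given dictionary show up as NaN. Whether that cross product is the desired shape depends on the use case.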