How to flatten a list of dictionaries in multiple columns of a pandas DataFrame
I have a DataFrame in which each record stores a list of dictionaries, like this:
row prodect_id recommend_info
0 XQ002 [{"recommend_key":"XXX567","recommend_point":50},
{"recommend_key":"XXX236","recommend_point":20},
{"recommend_key":"XXX090","recommend_point":35}]
1 XQ003 [{"recommend_key":"XXX089","recommend_point":30},
{"recommend_key":"XXX567","recommend_point":20}]
I want to flatten the list of dictionaries so that it looks like this:
row prodect_id recommend_info_recommend_key recommend_info_recommend_point
0 XQ002 XXX567 50
1 XQ002 XXX236 20
2 XQ002 XXX090 35
3 XQ003 XXX089 30
4 XQ003 XXX567 20
I know how to convert a single list of dictionaries into a DataFrame, like this:
d = [{"recommend_key":"XXX089","recommend_point":30},
{"recommend_key":"XXX567","recommend_point":20}]
df = pd.DataFrame(d)
row recommend_key recommend_point
0 XXX089 30
1 XXX567 20
But I don't know how to do this for a DataFrame when one column, or several columns, store lists of dictionaries:
row col_a col_b col_c
0 B001 [{"a":"b"},{"a":"c"}] [{"y":11},{"a":"c"}]
1 D009 [{"c":"o"},{"g":"c"}] [{"y":11},{"a":"c"},{"l":"c"}]
2 G068 [{"c":"b"},{"a":"c"}] [{"a":56},{"d":"c"}]
3 C004 [{"d":"a"},{"b":"c"}] [{"c":22},{"a":"c"},{"b":"c"}]
4 F011 [{"h":"u"},{"d":"c"}] [{"h":27},{"d":"c"}]
Try:
pd.concat([df.explode('recommend_info').drop(['recommend_info'], axis=1),
df.explode('recommend_info')['recommend_info'].apply(pd.Series)],
axis=1)
You can repeat the same step for each column that holds a list of dictionaries.
Here is an example:
>>> df = pd.DataFrame({'a': [[{3: 4, 5: 6}, {3:8, 5: 1}],
... [{3:2, 5:4}, {3: 8, 5: 10}]],
... 'b': ['X', "Y"]})
>>> df
a b
0 [{3: 4, 5: 6}, {3: 8, 5: 1}] X
1 [{3: 2, 5: 4}, {3: 8, 5: 10}] Y
>>> df = pd.concat([df.explode('a').drop(['a'], axis=1),
... df.explode('a')['a'].apply(pd.Series)],
... axis=1)
>>> df
b 3 5
0 X 4 6
0 X 8 1
1 Y 2 4
1 Y 8 10
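To get the prefixed column names from the question (recommend_info_recommend_key and so on) and a clean 0..n-1 index, something along these lines should work. This is only a sketch: it assumes pandas 1.1 or later (for the `ignore_index` argument of `explode`) and that every list is non-empty and contains only dictionaries. The names `exploded`, `expanded` and `flat` are mine, not from the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "prodect_id": ["XQ002", "XQ003"],
    "recommend_info": [
        [{"recommend_key": "XXX567", "recommend_point": 50},
         {"recommend_key": "XXX236", "recommend_point": 20},
         {"recommend_key": "XXX090", "recommend_point": 35}],
        [{"recommend_key": "XXX089", "recommend_point": 30},
         {"recommend_key": "XXX567", "recommend_point": 20}],
    ],
})

# One row per dictionary; ignore_index=True renumbers the rows 0..n-1.
exploded = df.explode("recommend_info", ignore_index=True)

# Build columns from the dictionaries and prefix them with the source column name.
expanded = pd.DataFrame(
    exploded["recommend_info"].tolist(), index=exploded.index
).add_prefix("recommend_info_")

# Put the untouched columns and the new columns side by side.
flat = pd.concat([exploded.drop(columns="recommend_info"), expanded], axis=1)
print(flat)
```

Building the frame from `.tolist()` instead of `.apply(pd.Series)` is just a design choice here; it tends to be faster on larger frames, but it will not cope with missing values the way `apply(pd.Series)` does.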
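I have a DataFrame with multiple columns. One of the columns contains a list, and each list holds a dictionary. I needed to expand the dictionaries and append the result to the row they came from. Ricardo's answer mostly worked for me. I generalized it below: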
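import pandas as pd


def explode_column_from_list_dict(df_in, column_name_to_explode):
    # One row per list element; explode returns a new frame, so df_in is untouched.
    exploded = df_in.explode(column_name_to_explode)
    return pd.concat(
        [
            exploded.drop([column_name_to_explode], axis=1),
            # Turn each dictionary into columns keyed by the dictionary's keys.
            exploded[column_name_to_explode].apply(pd.Series),
        ],
        axis=1,
    )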
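For the multi-column case in the question, the same idea can be looped over several columns. The helper below, `flatten_dict_list_columns`, is a hypothetical name of my own and only a sketch: it assumes pandas 1.1 or later and non-empty lists of dictionaries, and it prefixes the new columns so that keys shared between col_b and col_c don't collide. Note that exploding two list columns one after the other gives every combination of their elements per original row, which may or may not be what you want.

```python
import pandas as pd


def flatten_dict_list_columns(df, columns):
    """Explode each listed column in turn and spread its dict keys into prefixed columns."""
    out = df.copy()
    for col in columns:
        # One row per dictionary, with a fresh 0..n-1 index.
        out = out.explode(col, ignore_index=True)
        # Keys missing from a given dictionary become NaN in that row.
        expanded = pd.DataFrame(out[col].tolist(), index=out.index).add_prefix(f"{col}_")
        out = pd.concat([out.drop(columns=col), expanded], axis=1)
    return out


df = pd.DataFrame({
    "col_a": ["B001", "D009"],
    "col_b": [[{"a": "b"}, {"a": "c"}], [{"c": "o"}, {"g": "c"}]],
    "col_c": [[{"y": 11}, {"a": "c"}], [{"y": 11}, {"a": "c"}, {"l": "c"}]],
})

flat = flatten_dict_list_columns(df, ["col_b", "col_c"])
print(flat)
```

With the two sample rows above, B001 has 2 x 2 combinations and D009 has 2 x 3, so the result has 10 rows.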