Map a DataFrame array column with a dict
I have a dataframe with array columns:
id_food1 id_food2
[1] NaN
[2] NaN
[2 3] [1]
I want to map these columns using a dictionary with the following values:
food_dict = {1: 'cake',
             2: 'choco',
             3: 'cream'}
I would like to get something like this:
id_food1  id_food2  id_food1_name  id_food2_name
[1]       NaN       [cake]         0
[2]       NaN       [choco]        0
[2 3]     [1]       [choco,cream]  [cake]
I know how to do this when the column is not an array like this:
data['id_food1_name'] = data['id_food1'].map(food_dict)
But I can't get it to work when the column holds arrays.
Any help would be appreciated.
Use Series.explode to flatten the values, map them, and finally aggregate them back into lists per index:
data['id_food1_name'] = (data['id_food1'].explode().astype(float)
.map(food_dict).groupby(level=0).agg(list))
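To make the three steps easier to follow, here is a minimal sketch that runs them one at a time on the sample data. It assumes each cell already holds a real Python list (not a string such as '[2 3]'), and the intermediate names `exploded` and `names` are only illustrative.

```python
import numpy as np
import pandas as pd

food_dict = {1: 'cake', 2: 'choco', 3: 'cream'}

# Assumption: cells are real Python lists (or NaN), not strings.
data = pd.DataFrame({'id_food1': [[1], [2], [2, 3]],
                     'id_food2': [np.nan, np.nan, [1]]})

# 1. One row per list element; the original index label is repeated.
exploded = data['id_food1'].explode()

# 2. Look up each id in the dictionary.
names = exploded.map(food_dict)

# 3. Collect the names back into one list per original row.
data['id_food1_name'] = names.groupby(level=0).agg(list)

print(data['id_food1_name'].tolist())
```

Because the grouping key is the repeated index produced by explode, the aggregated result lines up with the original rows when it is assigned back.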
For all columns:
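# convert strings to lists
import ast
import numpy as np

c = ['id_food1', 'id_food2']

def f(x):
    try:
        return ast.literal_eval(x)
    except (ValueError, SyntaxError, TypeError):
        # not a parsable literal (e.g. NaN): treat as missing
        return np.nan

# in pandas >= 2.1, DataFrame.applymap is deprecated in favor of DataFrame.map
data[c] = data[c].applymap(f)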
Alternative solution for converting to lists:
data[c] = data[c].stack().str.strip('[]').str.split().unstack()
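If the cells are strings such as '[2 3]' (space-separated, which ast.literal_eval cannot parse), the same strip-and-split idea can be written as a plain function. This is only a sketch: `to_int_list` is a hypothetical helper name, and it assumes every string cell contains integers separated by spaces or commas.

```python
import numpy as np
import pandas as pd

# Assumption: cells are strings like '[2 3]' or NaN.
data = pd.DataFrame({'id_food1': ['[1]', '[2]', '[2 3]'],
                     'id_food2': [np.nan, np.nan, '[1]']})
c = ['id_food1', 'id_food2']

def to_int_list(value):
    """Hypothetical helper: '[2 3]' -> [2, 3]; non-string cells become NaN."""
    if not isinstance(value, str):
        return np.nan
    # Drop the brackets, treat commas as spaces, then split on whitespace.
    tokens = value.strip('[]').replace(',', ' ').split()
    return [int(tok) for tok in tokens]

for col in c:
    data[col] = data[col].apply(to_int_list)

print(data)
```

Converting to integers at this stage means the later dictionary lookup can use the ids directly as keys.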
Then map:
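for x in c:
    # keep only ids present in food_dict and look up their names
    f = lambda ids: [food_dict[int(y)] for y in ids if int(y) in food_dict]
    data[f'{x}_name'] = data[x].dropna().apply(f)
    # rows that were NaN have no list; fill them with 0
    data[f'{x}_name'] = data[f'{x}_name'].fillna(0)

print(data)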
id_food1 id_food2 id_food1_name id_food2_name
0 [1] NaN [cake] 0
1 [2] NaN [choco] 0
2 [2, 3] [1] [choco, cream] [cake]
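For reuse, the mapping loop could be wrapped in a function. The sketch below is one possible way to do that, not part of the original answer: `add_name_columns` and its `fill` parameter are hypothetical names, and it assumes the columns already hold lists of integer ids (or NaN).

```python
import numpy as np
import pandas as pd

food_dict = {1: 'cake', 2: 'choco', 3: 'cream'}

def add_name_columns(df, columns, mapping, fill=0):
    """Hypothetical helper: add a '<col>_name' column for each list-valued column."""
    out = df.copy()
    for col in columns:
        # Map only the non-missing rows; ids absent from `mapping` are skipped.
        names = out[col].dropna().apply(
            lambda ids: [mapping[i] for i in ids if i in mapping])
        # Rows dropped above are restored as NaN by reindex, then filled.
        out[f'{col}_name'] = names.reindex(out.index).fillna(fill)
    return out

data = pd.DataFrame({'id_food1': [[1], [2], [2, 3]],
                     'id_food2': [np.nan, np.nan, [1]]})
result = add_name_columns(data, ['id_food1', 'id_food2'], food_dict)
print(result)
```

Working on a copy leaves the input frame untouched, which makes the function easier to call repeatedly while experimenting.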