Grabbing data from a separate dataframe using an id whose column has a different name in each dataframe, and appending strings into one value

I have two dataframes:

import pandas as pd

jedis = {'jedi_id': ["2", "4", "6", "1"],
         'name': ["Kylo", "Bastila", "Revan", "Steve from Minecraft"],
         # The value for this column was missing in the post; empty placeholders are assumed.
         'Looted Items': ["", "", "", ""]}

inventory = {'jedi_number': ["9", "4", "6", "1", "1", "0", "2", "6", "1", "55", "4",
                             "4", "0", "9"],
             'Loot': ["Holocron", "Bantha Fodder", "Blaster", "Bantha Fodder", "Credits",
                      "Bantha Fodder", "Blaster", "Bantha Fodder", "Holocron", "Blaster",
                      "Holocron", "bread loaf", "Credits", "Holocron"]}

jedis_df = pd.DataFrame(jedis)
inventory_df = pd.DataFrame(inventory)

If anyone could help with this, it would be greatly appreciated.

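groupby.sum automatically excludes non-numeric columns (or, depending on the pandas version, concatenates the strings with no separator), so it does not join the strings within each group the way you expect.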

The fix is to run ', '.join over each group's Loot values.

Option 1: groupby.agg

inventory_df.groupby("jedi_number")['Loot'].agg(', '.join)

Option 2: groupby.apply

inventory_df.groupby("jedi_number")['Loot'].apply(lambda x: ', '.join(x))

Both options produce the same output:

jedi_number
0                  Bantha Fodder, Credits
1        Bantha Fodder, Credits, Holocron
2                                 Blaster
4     Bantha Fodder, Holocron, bread loaf
55                                Blaster
6                  Blaster, Bantha Fodder
9                      Holocron, Holocron
Name: Loot, dtype: object

Replacing inventory_df.groupby("jedi_number").sum() in your expression for x with either of these options should give the desired result.
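
Since the question's expression for x is not shown here, the following is only a sketch of one way to carry the joined strings back into jedis_df. It assumes that jedi_id in jedis_df corresponds to jedi_number in inventory_df, and that a jedi with no inventory rows should get an empty string. The name loot_by_jedi is made up for illustration.

```python
import pandas as pd

jedis_df = pd.DataFrame({
    "jedi_id": ["2", "4", "6", "1"],
    "name": ["Kylo", "Bastila", "Revan", "Steve from Minecraft"],
})
inventory_df = pd.DataFrame({
    "jedi_number": ["9", "4", "6", "1", "1", "0", "2", "6", "1", "55", "4",
                    "4", "0", "9"],
    "Loot": ["Holocron", "Bantha Fodder", "Blaster", "Bantha Fodder", "Credits",
             "Bantha Fodder", "Blaster", "Bantha Fodder", "Holocron", "Blaster",
             "Holocron", "bread loaf", "Credits", "Holocron"],
})

# Join each jedi's loot into one comma-separated string.
# The result is a Series indexed by jedi_number (hypothetical helper name).
loot_by_jedi = inventory_df.groupby("jedi_number")["Loot"].agg(", ".join)

# Look up each jedi_id in that Series. The column names differ, but map()
# only needs the values to match the Series index. Ids with no inventory
# rows would come back as NaN, so they are replaced with "" (an assumption).
jedis_df["Looted Items"] = jedis_df["jedi_id"].map(loot_by_jedi).fillna("")

print(jedis_df)
```

Series.map avoids a merge because the lookup runs against the index of the grouped result, so the differing column names never need to be reconciled. A merge with left_on="jedi_id" and right_on="jedi_number" would be an alternative if more than one column had to be carried over.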