Pandas Dataframe: groupby id to find max column value and return corresponding value of another column

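I have a large DataFrame containing different food entries. Each food has one nutrient (A, B, C, D), and the corresponding value is in another column. I want to define a function that takes a specific nutrient as an argument and returns the name of the food with the highest value for that nutrient. If the argument does not exist, it should return 'Sorry, {requested nutrient} not found'.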

import pandas as pd

df = pd.DataFrame([[0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63, 0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
               list('DAABBBBABCBDDD'),
               ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs', 'spam', 'fish', 'bread',
                'salad', 'milk', 'soda', 'juice'],
               ['***', '**', '****', '*', '***', '*', '**', '***', '*', '*', '****', '**', '**', '****']]).T
df.columns = ['value', 'nutrient', 'food', 'price']

I tried the following:

def food_for_nutrient(lookup_nutrient, dataframe=df):
    max_values = dataframe.groupby(['nutrient'])['value'].max()
    result = max_values[lookup_nutrient]
    return print(result)

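It seems to identify the maximum value for the nutrient correctly, but it returns only the nutrient value. I need the corresponding str from the food column. For example, if I pass the following argument

food_for_nutrient('A')

the output I want is: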

banana

My second problem is that my if statement does not work. It always returns the else branch.

def food_for_nutrient(lookup_nutrient, dataframe=df):
    max_values = dataframe.groupby(['nutrient'])['value'].max()
    if lookup_nutrient in dataframe['nutrient']:
        result = max_values[lookup_nutrient]
        return print(result)
    else:
        return print(f'Sorry, {lookup_nutrient} not found.')

food_for_nutrient('A')

Thank you very much for your help!

Try this:

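def food_for_nutrient(lookup_nutrient):
    try:
        # Keep only the rows for this nutrient, index them by food name,
        # and return the food (index label) with the largest value.
        return df[df['nutrient'] == lookup_nutrient].set_index('food')['value'].astype(float).idxmax()
    except ValueError:
        # idxmax raises ValueError on an empty selection, i.e. an unknown nutrient.
        return f'Sorry, {lookup_nutrient} not found.'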

Output:

>>> food_for_nutrient('A')
'banana'

>>> food_for_nutrient('B')
'cheese'

>>> food_for_nutrient('C')
'bread'

>>> food_for_nutrient('D')
'apple'

>>> food_for_nutrient('E')
'Sorry, E not found.'
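
As for why the if statement in the question always falls through to else: the `in` operator on a Series tests the index labels (0 to 13 here), not the values, so 'A' is never found. The sketch below shows that, plus one possible groupby-based variant of the function. The name food_for_nutrient_grouped is made up for illustration, and the DataFrame is rebuilt from the question's data as a dict of columns so the block runs on its own.

```python
import pandas as pd

# Same data as in the question, built column by column (an assumption made
# here only to keep the example self-contained).
df = pd.DataFrame({
    'value': [0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63, 0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
    'nutrient': list('DAABBBBABCBDDD'),
    'food': ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs', 'spam', 'fish', 'bread',
             'salad', 'milk', 'soda', 'juice'],
})

# `in` on a Series checks the index labels, not the stored values.
print('A' in df['nutrient'])          # False
print('A' in df['nutrient'].values)   # True


def food_for_nutrient_grouped(lookup_nutrient, dataframe=df):
    """Hypothetical groupby-based variant of food_for_nutrient."""
    values = dataframe['value'].astype(float)
    # For each nutrient, the row label of the row holding the maximum value.
    best_rows = values.groupby(dataframe['nutrient']).idxmax()
    # best_rows is indexed by nutrient, so `in` on its index is a value check.
    if lookup_nutrient not in best_rows.index:
        return f'Sorry, {lookup_nutrient} not found.'
    return dataframe.loc[best_rows[lookup_nutrient], 'food']


print(food_for_nutrient_grouped('A'))  # banana
print(food_for_nutrient_grouped('E'))  # Sorry, E not found.
```

Computing idxmax per group gives the row label of each maximum rather than the maximum itself, which is what lets the function look up the food column afterwards.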