How to group by and calculate a new field with Python pandas?

I want to group my dataframe by a specific column named 'Fruit' and calculate the percentage of each fruit that is 'Good'.

See my initial dataframe below:

import pandas as pd
df = pd.DataFrame({'Fruit': ['Apple','Apple','Banana'], 'Condition': ['Good','Bad','Good']})

Dataframe:

    Fruit   Condition
0   Apple   Good
1   Apple   Bad
2   Banana  Good

Below is my desired output dataframe:

    Fruit   Percentage
0   Apple   50%
1   Banana  100%

Note: because there is 1 'Good' apple and 1 'Bad' apple, the percentage of good apples is 50%.

Below is my attempt, which overwrites all of the columns:

groupedDF = df.groupby('Fruit')
groupedDF.apply(lambda x: x[(x['Condition'] == 'Good')].count()/x.count())

See the resulting table below, which appears to calculate the percentage in the existing columns rather than in a new column:

        Fruit  Condition
Fruit
Apple     0.5        0.5
Banana    1.0        1.0
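
The attempt above calls count() on the whole group frame, so it returns one ratio per existing column. Here is a minimal sketch of a variant that stays close to that attempt, assuming it is acceptable to select only the 'Condition' column before calling apply. The name good_share is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['Apple', 'Apple', 'Banana'],
                   'Condition': ['Good', 'Bad', 'Good']})

# Select the single column first so apply sees a Series per group,
# then divide the number of 'Good' rows by the group size.
good_share = (
    df.groupby('Fruit')['Condition']
      .apply(lambda s: (s == 'Good').sum() / s.count())
      .rename('Percentage')  # name the resulting Series
      .reset_index()         # turn the Fruit index back into a column
)
print(good_share)
```

This gives one 'Percentage' column holding a fraction per fruit instead of repeating the ratio under every original column.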

We can compare Condition with eq and take advantage of the fact that True is 1 and False is 0 when processed as numbers, then take the groupby mean over Fruit:

new_df = (
    df['Condition'].eq('Good').groupby(df['Fruit']).mean().reset_index()
)

new_df:

    Fruit  Condition
0   Apple        0.5
1  Banana        1.0
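
To see why the mean works here, the intermediate boolean Series can be inspected on its own. This is just an illustrative sketch on the same sample data; the variable names is_good and by_hand are not part of the answer above:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['Apple', 'Apple', 'Banana'],
                   'Condition': ['Good', 'Bad', 'Good']})

# eq returns a boolean Series, one flag per row
is_good = df['Condition'].eq('Good')
print(is_good.astype(int).tolist())  # booleans viewed as 1s and 0s

# The mean of 1s and 0s is the share of True values,
# which matches sum / count computed per group by hand.
grouped = is_good.groupby(df['Fruit'])
by_hand = grouped.sum() / grouped.count()
print(by_hand)
```

The per-group sum counts the 'Good' rows and the per-group count gives the group size, so their ratio should equal what mean() returns directly.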

We can further map to a format string and rename to get the output into the desired form shown in the question:

new_df = (
    df['Condition'].eq('Good')
        .groupby(df['Fruit']).mean()
        .map('{:.0%}'.format)  # Change to Percent Format
        .rename('Percentage')  # Rename Column to Percentage
        .reset_index()  # Restore RangeIndex and make Fruit a Column
)

new_df:

    Fruit Percentage
0   Apple        50%
1  Banana       100%

*Naturally, further manipulations can be done as well.
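
One thing to keep in mind is that mapping to a format string turns the values into strings, which makes later arithmetic or sorting by value awkward. If that matters, a possible variation is to keep a numeric column and add the formatted text separately. This is a sketch under that assumption, and the 'Label' column name is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['Apple', 'Apple', 'Banana'],
                   'Condition': ['Good', 'Bad', 'Good']})

pct = (
    df['Condition'].eq('Good')
      .groupby(df['Fruit']).mean()
      .mul(100)              # fraction -> percentage points, still numeric
      .rename('Percentage')
      .reset_index()
)

# Hypothetical extra column holding the display text only
pct['Label'] = pct['Percentage'].map('{:.0f}%'.format)
print(pct)
```

The 'Percentage' column stays a float and can be filtered or sorted, while 'Label' carries the text for display.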