如何根据具有布尔值的多列查找 Python 中一列的平均值

How to find mean of a column in Python based on multiple columns with Boolean values

我有一个数据集,我需要在其中为每种类型找到平均评分最高的 2 部电影。有了给定的数据布局,我该如何实现呢?有人可以帮我了解解决方法吗?

为了简化它,我创建了新的列,将 'genres' 中的值分成不同的类型,并为其分配了值 1 和 0。因此,分组依据将基于这些新的独特类型,而不是 'genres' 列。例如,如果我将 'Comedy' 列筛选为 1,则标题 'abc'、'pqr'、'ghi' 和 'mno' 的平均评分为 3.75、2.9 、2.25 和 3.12。因此,根据喜剧类型的平均评分排名前 2 位的电影将是 abc 和 mno。其他独特的类型也会遵循类似的逻辑。

我的原始数据集与下面的示例数据集非常相似:

| movieId | title |                      genres                     | userId | rating | release_year | review_year | Children | Action | Mystery | Sci-Fi | (no genres   listed) | Drama | Fantasy | Film-Noir | Romance | Thriller | Western | Comedy | Documentary | Crime | War | Horror | Animation | IMAX | Adventure | Musical |
|:-------:|:-----:|:-----------------------------------------------:|:------:|:------:|--------------|:-----------:|:--------:|:------:|:-------:|:------:|:--------------------:|:-----:|:-------:|:---------:|:-------:|:--------:|:-------:|:------:|:-----------:|:-----:|:---:|:------:|:---------:|:----:|:---------:|:-------:|
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 3437   | 3.5    | 1995         | 2009        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 7107   | 3      | 1995         | 2014        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 17211  | 4      | 1995         | 2010        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 18056  | 3      | 1995         | 2013        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 22223  | 5      | 1995         | 2014        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 28268  | 3.5    | 1995         | 2016        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 32984  | 4.5    | 1995         | 2013        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 33260  | 0.5    | 1995         | 2014        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 35223  | 3.5    | 1995         | 2009        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 54852  | 4.5    | 1995         | 2014        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 56556  | 4      | 1995         | 2016        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 59622  | 5      | 1995         | 2017        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 78317  | 4.5    | 1995         | 2011        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 1       | abc   | Adventure__Animation__Children__Comedy__Fantasy | 80922  | 4      | 1995         | 2005        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 1         | 0    | 1         | 0       |
| 2       | xyz   | Adventure__Children__Fantasy                    | 8667   | 3      | 2000         | 2010        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 0      | 0           | 0     | 0   | 0      | 0         | 0    | 1         | 0       |
| 2       | xyz   | Adventure__Children__Fantasy                    | 16458  | 2      | 2000         | 2005        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 0      | 0           | 0     | 0   | 0      | 0         | 0    | 1         | 0       |
| 2       | xyz   | Adventure__Children__Fantasy                    | 16464  | 3.5    | 2000         | 2011        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 0      | 0           | 0     | 0   | 0      | 0         | 0    | 1         | 0       |
| 2       | xyz   | Adventure__Children__Fantasy                    | 19786  | 4      | 2000         | 2005        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 0      | 0           | 0     | 0   | 0      | 0         | 0    | 1         | 0       |
| 2       | xyz   | Adventure__Children__Fantasy                    | 30494  | 3.5    | 2000         | 2005        | 1        | 0      | 0       | 0      | 0                    | 0     | 1       | 0         | 0       | 0        | 0       | 0      | 0           | 0     | 0   | 0      | 0         | 0    | 1         | 0       |
| 3       | pqr   | Comedy__Romance                                 | 204961 | 4      | 2008         | 2016        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 3       | pqr   | Comedy__Romance                                 | 220587 | 3.5    | 2008         | 2016        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 3       | pqr   | Comedy__Romance                                 | 224358 | 3      | 2008         | 2014        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 3       | pqr   | Comedy__Romance                                 | 231990 | 2.5    | 2008         | 2005        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 3       | pqr   | Comedy__Romance                                 | 234484 | 3      | 2008         | 2016        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 3       | pqr   | Comedy__Romance                                 | 251794 | 1.5    | 2008         | 2011        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 4       | ghi   | Comedy__Drama__Romance                          | 4294   | 3      | 2012         | 2016        | 0        | 0      | 0       | 0      | 0                    | 1     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 4       | ghi   | Comedy__Drama__Romance                          | 81593  | 2.5    | 2012         | 2009        | 0        | 0      | 0       | 0      | 0                    | 1     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 4       | ghi   | Comedy__Drama__Romance                          | 114192 | 2.5    | 2012         | 2013        | 0        | 0      | 0       | 0      | 0                    | 1     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 4       | ghi   | Comedy__Drama__Romance                          | 178767 | 0.5    | 2012         | 2014        | 0        | 0      | 0       | 0      | 0                    | 1     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 4       | ghi   | Comedy__Drama__Romance                          | 212863 | 2      | 2012         | 2005        | 0        | 0      | 0       | 0      | 0                    | 1     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 4       | ghi   | Comedy__Drama__Romance                          | 251932 | 3      | 2012         | 2005        | 0        | 0      | 0       | 0      | 0                    | 1     | 0       | 0         | 1       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 5       | mno   | Comedy                                          | 16481  | 4      | 2015         | 2016        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 5       | mno   | Comedy                                          | 25066  | 4      | 2015         | 2014        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 5       | mno   | Comedy                                          | 27833  | 3      | 2015         | 2005        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |
| 5       | mno   | Comedy                                          | 31864  | 1.5    | 2015         | 2014        | 0        | 0      | 0       | 0      | 0                    | 0     | 0       | 0         | 0       | 0        | 0       | 1      | 0           | 0     | 0   | 0      | 0         | 0    | 0         | 0       |

作为数据框:

pd.DataFrame({'movieId': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5],
              'title': ['abc','abc','abc','abc','abc','abc','abc','abc','abc','abc','abc','abc','abc','abc','xyz','xyz','xyz','xyz','xyz','pqr','pqr','pqr','pqr','pqr','pqr','ghi','ghi','ghi','ghi','ghi','ghi','mno','mno','mno','mno'],
              'genres': ['Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Animation__Children__Comedy__Fantasy','Adventure__Children__Fantasy','Adventure__Children__Fantasy','Adventure__Children__Fantasy','Adventure__Children__Fantasy','Adventure__Children__Fantasy','Comedy__Romance','Comedy__Romance','Comedy__Romance','Comedy__Romance','Comedy__Romance','Comedy__Romance','Comedy__Drama__Romance','Comedy__Drama__Romance','Comedy__Drama__Romance','Comedy__Drama__Romance','Comedy__Drama__Romance','Comedy__Drama__Romance','Comedy','Comedy','Comedy','Comedy'],
              'userId': [3437,7107,17211,18056,22223,28268,32984,33260,35223,54852,56556,59622,78317,80922,8667,16458,16464,19786,30494,204961,220587,224358,231990,234484,251794,4294,81593,114192,178767,212863,251932,16481,25066,27833,31864],
              'rating': [3.5,3.0,4.0,3.0,5.0,3.5,4.5,0.5,3.5,4.5,4.0,5.0,4.5,4.0,3.0,2.0,3.5,4.0,3.5,4.0,3.5,3.0,2.5,3.0,1.5,3.0,2.5,2.5,0.5,2.0,3.0,4.0,4.0,3.0,1.5],
              'release_year': [1995,1995,1995,1995,1995,1995,1995,1995,1995,1995,1995,1995,1995,1995,2000,2000,2000,2000,2000,2008,2008,2008,2008,2008,2008,2012,2012,2012,2012,2012,2012,2015,2015,2015,2015],
              'review_year': [2009,2014,2010,2013,2014,2016,2013,2014,2009,2014,2016,2017,2011,2005,2010,2005,2011,2005,2005,2016,2016,2014,2005,2016,2011,2016,2009,2013,2014,2005,2005,2016,2014,2005,2014],
              'Children': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Action': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Mystery': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Sci-Fi': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              '(no genres listed)': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Drama': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0],
              'Fantasy': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Film-Noir': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Romance': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0],
              'Thriller': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Western': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Comedy': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
              'Documentary': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Crime': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'War': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Horror': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Animation': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'IMAX': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Adventure': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              'Musical': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]})

我的输出类似于下面的输出:

| Genre     | title | avg rating |
|-----------|-------|------------|
| Children  | abc   | 3.75       |
| Drama     | ghi   | 2.25       |
| Fantasy   | abc   | 3.75       |
| Fantasy   | xyz   | 3.2        |
| Romance   | pqr   | 2.9        |
| Romance   | ghi   | 2.25       |
| Comedy    | abc   | 3.75       |
| Comedy    | mno   | 3.12       |
| Animation | abc   | 3.75       |
| Adventure | abc   | 3.75       |
| Adventure | xyz   | 3.2        |

请注意,这是一个示例数据集,其中有许多值为 0 的独特流派。因此,最终输出的 'genre' 列中不存在这些独特流派。此外,其中一些独特的类型只有一部电影属于它们的类别,因此它们只有一个列表。

你想获得每种类型的前 2 部电影,你首先需要 melt 虚拟变量,然后 groupby+agg:

(df.melt(id_vars=df.columns[:7], var_name='genre')
   .query('value == 1')
   .groupby(['genre', 'movieId'], as_index=False).agg({'title': 'first', 'rating': 'mean'})
   .groupby('genre').head(2)
)

输出:

        genre  movieId title    rating
0   Adventure        1   abc  3.750000
1   Adventure        2   xyz  3.200000
2   Animation        1   abc  3.750000
3    Children        1   abc  3.750000
4    Children        2   xyz  3.200000
5      Comedy        1   abc  3.750000
6      Comedy        3   pqr  2.916667
9       Drama        4   ghi  2.250000
10    Fantasy        1   abc  3.750000
11    Fantasy        2   xyz  3.200000
12    Romance        3   pqr  2.916667
13    Romance        4   ghi  2.250000