How to find the average score for each movie based on reviews - Python
I have a DataFrame like this:
    UserID     Review MovieID
0    10112       Good  MOV001
1    10112  Excellent  MOV002
2    10112    Average  MOV003
3    10113       Good  MOV001
4    10113        Bad  MOV002
5    10113       Good  MOV003
6    10113  Excellent  MOV004
7    10114       Good  MOV001
8    10114        Bad  MOV002
9    10114       Good  MOV003
10   10114  Excellent  MOV004
I have converted the reviews to integer values:
movies.Review[movies.Status == 'Average'] = 2
movies.Review[movies.Status == 'Good'] = 3
movies.Review[movies.Status == 'Excellent'] = 5
movies.Review[movies.Status == 'Very Good'] = 4
movies.Review[movies.Status == 'Okay'] = 1
movies.Review[movies.Status == 'Bad'] = 0
movies
Now my DataFrame looks like this:
    UserID  Review MovieID
0    10112       3  MOV001
1    10112       5  MOV002
2    10112       2  MOV003
3    10113       3  MOV001
4    10113       0  MOV002
5    10113       3  MOV003
6    10113       5  MOV004
7    10114       3  MOV001
8    10114       0  MOV002
9    10114       3  MOV003
10   10114       5  MOV004
How do I now find the average score for each movie based on these reviews? Can anyone help me?
First, you don't need all those movies.Review[movies.Status==...] = ... assignments. Use np.select or map instead:
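# Lookup table from each text label to its numeric score
Status_convert = {'Bad':0, 'Okay':1, 'Average':2,
                  'Good':3, 'Very Good':4, 'Excellent':5}
# Status is the column that the code in the question reads the labels from
movies['Review'] = movies.Status.map(Status_convert)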
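The np.select route mentioned above is not spelled out, so here is a minimal sketch of how it might look. The four-row sample frame is made up for illustration, and using NaN as the default for unrecognized labels is an assumption rather than something the answer specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the text labels live in a "Status" column,
# as in the code from the question.
movies = pd.DataFrame({
    "MovieID": ["MOV001", "MOV002", "MOV003", "MOV004"],
    "Status": ["Good", "Bad", "Average", "Excellent"],
})

labels = ["Bad", "Okay", "Average", "Good", "Very Good", "Excellent"]
scores = [0, 1, 2, 3, 4, 5]

# One boolean mask per label; np.select takes the score of the first
# mask that is True for each row, and the default where none match.
conditions = [movies["Status"] == label for label in labels]
movies["Review"] = np.select(conditions, scores, default=np.nan)

print(movies)
```

For a plain label-to-number lookup like this one, map with a dictionary is usually the shorter option; np.select tends to pay off when the conditions are more than simple equality checks.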
Then you can do:
movies.groupby('MovieID')['Review'].mean()
Output:
MovieID
MOV001 3.000000
MOV002 1.666667
MOV003 2.666667
MOV004 5.000000
Name: Review, dtype: float64
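To reproduce the result end to end, the following sketch rebuilds the sample data from the question and applies the same map and groupby steps. Storing the labels in a Status column is an assumption based on the code in the question, and the extra count column is an illustrative addition, not part of the original answer.

```python
import pandas as pd

# Sample data from the question. The labels are placed in "Status" here
# (an assumption, matching the column the question's code reads from).
movies = pd.DataFrame({
    "UserID": [10112] * 3 + [10113] * 4 + [10114] * 4,
    "Status": ["Good", "Excellent", "Average",
               "Good", "Bad", "Good", "Excellent",
               "Good", "Bad", "Good", "Excellent"],
    "MovieID": ["MOV001", "MOV002", "MOV003",
                "MOV001", "MOV002", "MOV003", "MOV004",
                "MOV001", "MOV002", "MOV003", "MOV004"],
})

status_convert = {"Bad": 0, "Okay": 1, "Average": 2,
                  "Good": 3, "Very Good": 4, "Excellent": 5}

# Labels missing from the dictionary would become NaN, which mean() skips.
movies["Review"] = movies["Status"].map(status_convert)

# Average score per movie, plus how many reviews each average is based on.
summary = movies.groupby("MovieID")["Review"].agg(["mean", "count"])
print(summary)
```

Showing the count next to the mean makes it easier to see that, for example, the average for MOV004 rests on only two reviews.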