How to get rid of certain rows when calculating mean with pandas making a chart with plotly

Question

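I am trying to make a bar chart with plotly and pandas that plots the mean rating of the different categories in my dataframe. Each category is rated 1-5, and the categories are the columns whose names end in "rating". I can get this working fine. However, in my categories (different columns in the dataframe), a value of -1 means the category was not rated. How can I make sure the -1 values are not taken into account when I calculate the mean and plot the chart?

My code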

# Plot to find mean rating for different categories
import pandas as pd
import plotly.express as px

# Take columns we are interested in and stack them into column named 'Category'
# This will allow us to group by category and calculate mean rating
dfm = pd.melt(df, id_vars=["id", "course", "date", "overall_rating", "job_prospects_desc", "course_lecturer_desc", "facilities_desc", "student_support_desc", "local_life_desc"],
              value_vars=["job_prospects_rating", "course_lecturer_rating", "facilities_rating", "student_support_rating", "local_life_rating"],
              var_name='Category')

# Group by category and calculate mean rating of the melted 'value' column only
# (the -1 "not rated" values are still included in this mean)
dfg = dfm.groupby(['Category'])['value'].mean().reset_index()

print(df)

fig2 = px.bar(dfg, x='Category', y='value', color='Category',
              category_orders={'Category': ['job_prospects_rating', 'course_lecturer_rating', 'facilities_rating', 'student_support_rating', 'local_life_rating']},
              color_discrete_map={
                  'job_prospects_rating': 'lightblue',
                  'course_lecturer_rating': 'blue',
                  'facilities_rating': 'pink',
                  'student_support_rating': 'purple',
                  'local_life_rating': 'violet'},
              title="Mean Rating For Different Student Categories At The University of Lincoln")

fig2.update_yaxes(title='Mean rating (1-5)')
fig2.show()

Answer 1

import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.array([[0, 0, 0, 1, 1, 1], [1, 2, -1, 4, 5, 6], [7, 8, 9, 10, -1, 12]]).T, columns=['Category', 'A', 'B'])
df1 = df.replace(-1, np.nan)  # turn the -1 "not rated" sentinel into NaN
df1.groupby(['Category']).mean()

The logic is very simple: replace -1 with NaN and forget about them, because pandas' mean() skips NaN values by default.
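Applied to the melted layout from the question, the same idea might look like the sketch below. The sample data and the shortened column list are made up for illustration; only the -1 sentinel and the 'Category' / 'value' column names come from the question. Filtering the -1 rows out before grouping should give the same means.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; -1 marks a category that was not rated.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "job_prospects_rating": [5, -1, 3],
    "facilities_rating": [-1, -1, 4],
})
rating_cols = ["job_prospects_rating", "facilities_rating"]

# Stack the rating columns into 'Category' / 'value' pairs.
dfm = pd.melt(df, id_vars=["id"], value_vars=rating_cols, var_name="Category")

# Option 1: turn -1 into NaN; mean() skips NaN by default.
dfm_nan = dfm.assign(value=dfm["value"].replace(-1, np.nan))
dfg = dfm_nan.groupby("Category", as_index=False)["value"].mean()

# Option 2: drop the -1 rows entirely before grouping.
dfg_filtered = (
    dfm[dfm["value"] != -1]
    .groupby("Category", as_index=False)["value"]
    .mean()
)

print(dfg)
```

The resulting dfg can be passed to px.bar exactly as in the question. One difference between the two options: a category with no ratings at all stays in the result as NaN with option 1, but disappears from the result with option 2.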

python

data-analysis

dataframe

pandas

plotly