Pandas - get values on a graph using quantile
I have this DataFrame, df_players:
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   TableIndex      739 non-null    object
 1   PlayerID        739 non-null    int64
 2   GameWeek        739 non-null    int64
 3   Date            739 non-null    object
 4   Points          739 non-null    int64
 5   Price           739 non-null    float64
 6   BPS             739 non-null    int64
 7   SelectedBy      739 non-null    int64
 8   NetTransfersIn  739 non-null    int64
 9   MinutesPlayed   739 non-null    float64
 10  CleanSheet      739 non-null    float64
 11  Saves           739 non-null    float64
 12  PlayersBasicID  739 non-null    int64
 13  PlayerCode      739 non-null    object
 14  FirstName       739 non-null    object
 15  WebName         739 non-null    object
 16  Team            739 non-null    object
 17  Position        739 non-null    object
 18  CommentName     739 non-null    object
I am using this function, which calls quantile() (with the value passed in through the 'cut' parameter), to plot the distribution of players:
import matplotlib.pyplot as plt
import seaborn as sns

def jointplot(X, Y, week=None, title=None,
              positions=None, height=6,
              xlim=None, ylim=None, cut=0.015,
              color=CB91_Blue, levels=30, bw=0.5, top_rows=100000):
    if positions is None:
        positions = ['GKP', 'DEF', 'MID', 'FWD']
    # Check if week is given as a list
    if week is None:
        # Note: range(n) yields 0..n-1, so the highest game week is not included
        week = list(range(max(df_players['GameWeek'])))
    if not isinstance(week, list):
        week = [week]
    # Keep rows with at least 45 minutes played, in the chosen weeks and positions
    df_played = df_players.loc[(df_players['MinutesPlayed'] >= 45)
                               & (df_players['GameWeek'].isin(week))
                               & (df_players['Position'].isin(positions))].head(top_rows)
    # Default axis limits: trim a fraction `cut` from each tail of the data
    if xlim is None:
        xlim = (df_played[X].quantile(cut),
                df_played[X].quantile(1 - cut))
    if ylim is None:
        ylim = (df_played[Y].quantile(cut),
                df_played[Y].quantile(1 - cut))
    sns.jointplot(x=X, y=Y, data=df_played,
                  kind="kde", xlim=xlim, ylim=ylim,
                  color=color, n_levels=levels,
                  height=height, bw=bw)
    plt.suptitle(title, fontsize=18)
    plt.show()
Calling it:
jointplot('Price', 'Points', positions=['FWD'],
color=color_list[3], title='Forwards')
This produces the plot, where:
xlim = (4.5, 11.892999999999995)
ylim = (1.0, 13.0)
In my case, these x and y limits let me zoom in on the region where the data points are, using the range between the (cut) and (1-cut) quantiles.
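To make the quantile-based limits concrete, here is a minimal sketch on toy data. The helper name quantile_limits and the sample prices are made up for illustration and are not part of the code above; it only mirrors how xlim and ylim are derived from cut.

```python
import pandas as pd

# Toy data standing in for one column of df_players (values are invented).
toy = pd.DataFrame({"Price": [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 9.0, 12.5]})

def quantile_limits(series, cut=0.015):
    """Return (lower, upper) bounds that trim a fraction `cut` from each tail."""
    return (series.quantile(cut), series.quantile(1 - cut))

# A large cut is used here only so the trimming is visible on 8 rows.
xlim = quantile_limits(toy["Price"], cut=0.25)
print(xlim)
```

With the default cut=0.015, only the most extreme 1.5% on each side falls outside the limits, which is why the outlier prices do not stretch the axes.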
QUESTION
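Now I would like to get the 'WebName' of the players that fall within a certain region of the plot, like this:
After plotting, I can pick a target region and roughly define its range through xlim and ylim: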
jointplot('Price', 'Points', positions=['FWD'],
xlim=(5.5, 7.0), ylim=(11.5, 13.0),
color=color_list[3], title='Forwards')
This zooms in on the red region above.
But how can I get the names of the players inside that region?
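You can select the part of the players DataFrame that falls within the bounds taken from the plot:
selected = df_players[
    (df_players.Points >= points_lbound)
    & (df_players.Points <= points_ubound)
    & (df_players.Price >= price_lbound)
    & (df_players.Price <= price_ubound)
]
The list of names is then selected.WebName (the column is called WebName, not WebNames).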
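As a self-contained sketch of the same idea, the snippet below wraps the selection in a helper that takes the xlim and ylim tuples passed to the plot. The function name names_in_region and the toy rows are assumptions made for illustration. It uses Series.between, which is inclusive on both ends by default, and optionally applies the position filter so the result matches what was plotted.

```python
import pandas as pd

# Toy stand-in for df_players; names and values are invented for illustration.
toy_players = pd.DataFrame({
    "WebName": ["A", "B", "C", "D"],
    "Position": ["FWD", "FWD", "MID", "FWD"],
    "Price": [6.0, 8.5, 6.5, 5.5],
    "Points": [12, 12, 13, 9],
})

def names_in_region(df, xlim, ylim, x="Price", y="Points", positions=None):
    """Return the unique WebName values whose (x, y) lie inside the rectangle."""
    # between() keeps rows with lower <= value <= upper
    mask = df[x].between(*xlim) & df[y].between(*ylim)
    if positions is not None:
        mask &= df["Position"].isin(positions)
    return df.loc[mask, "WebName"].unique().tolist()

names = names_in_region(toy_players, xlim=(5.5, 7.0), ylim=(11.5, 13.0),
                        positions=["FWD"])
print(names)
```

If the plot also filters on MinutesPlayed or GameWeek, the same filters would need to be applied before the lookup, otherwise the list may include players that were never drawn.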