Table bubble plot with pandas and altair

I have the following df_teams_full_stats:

Data columns (total 35 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   Unnamed: 0             428 non-null    int64  
 1   MatchID                428 non-null    int64  
 2   For Team               428 non-null    object 
 3   Against Team           428 non-null    object 
 4   Date                   428 non-null    object 
 5   GameWeek               428 non-null    int64  
 6   Home                   428 non-null    object 
 7   Possession             428 non-null    float64
 8   Touches                428 non-null    int64  
 9   Passes                 428 non-null    int64  
 10  Tackles                428 non-null    int64  
 11  Clearances             428 non-null    int64  
 12  Corners                428 non-null    int64  
 13  Offsides               428 non-null    int64  
 14  Fouls Committed        428 non-null    int64  
 15  Yellow Cards           428 non-null    int64  
 16  Goals                  428 non-null    int64  
 17  XG                     428 non-null    float64
 18  Shots On Target        428 non-null    int64  
 19  Total Shots            428 non-null    int64  
 20  Goals Conceded         428 non-null    int64  
 21  Shots Conceded         428 non-null    int64  
 22  XGC                    428 non-null    float64
 23  Shots In Box           428 non-null    int64  
 24  Close Shots            428 non-null    int64  
 25  Headers                428 non-null    int64  
 26  Shots Centre           428 non-null    int64  
 27  Shots Left             428 non-null    int64  
 28  Shots Right            428 non-null    int64  
 29  Shots In Box Conceded  428 non-null    int64  
 30  Close Shots Conceded   428 non-null    int64  
 31  Headers Conceded       428 non-null    int64  
 32  Shots Centre Conceded  428 non-null    int64  
 33  Shots Left Conceded    428 non-null    int64  
 34  Shots Right Conceded   428 non-null    int64  
dtypes: float64(3), int64(28), object(4)
memory usage: 117.2+ KB

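I am trying to group and sort it to produce this chart:

Following this example: table_bubble_plot_github, the idea is to pick some target 'x', say 'Goals', and have it determine the bubble size for every team's match-up against each of the other teams.

This is what I have so far: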

import altair as alt


def team_match_ups(df_teams_full_stats, x, y):
    target_x = x[0]
    target_y = y[0]

    df_temp = df_teams_full_stats.set_index("For Team")
    df_temp.fillna(0.0, inplace=True)

    df_temp[target_x] = df_temp.groupby(['For Team'])[target_x].mean()
    X = df_temp[target_x].to_frame()
    print("x", X)

    df_temp[target_y] = df_temp.groupby(['For Team'])[target_y].mean()
    Y = df_temp[target_y].to_frame()

    df = df_temp.reset_index()

    #sorted_teams = sorted(df_teams_full_stats['For Team'].unique())

    bubble_plot = alt.Chart(df).mark_circle().encode(
            alt.Y(f'{target_x}:O', bin=True),
            alt.X(f'{target_y}:O', bin=True),
            alt.Size(f'{target_x}:Q',
            scale=alt.Scale(range=[0, 1500])),
            #alt.Color('Color', legend=None, scale=None),
            tooltip = [alt.Tooltip(f'{target_x}:Q'),
                       alt.Tooltip(f'{target_y}:Q')],
        )


    return bubble_plot

But this is far from ideal:


Question

How can I get the team names to label the x and y grid, and have target_x set the bubble size?

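If you want the x and y axes to be team names, you should map the x and y encodings to the columns that contain the team names. It might look something like this (untested, since you did not provide access to your data):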

alt.Chart(df_teams_full_stats).mark_circle().encode(
  x='For Team:N',
  y='Against Team:N',
  size='Goals:Q'
)