Problem writing a function that adds columns for transfer from and to club
I have a problem with one of my projects. I am trying to build a clear overview of football transfers, and I currently have this table:
ClubID | PlayerID | FromDate | ToDate | TeamName | c_Person |
---|---|---|---|---|---|
1 | 1 | 2010-01-01 | 2012-01-01 | Club A | Player 1 |
2 | 1 | 2012-02-01 | 2015-02-01 | Club B | Player 1 |
3 | 1 | 2015-05-01 | 2018-02-01 | Club C | Player 1 |
1 | 2 | 2010-01-01 | 2018-02-02 | Club A | Player 2 |
1 | 2 | 2018-03-02 | 2020-02-01 | Club A | Player 2 |
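However, I want to add the columns FromClub and ToClub. If Player 1 first played for Club A from 2010-01-01 to 2012-01-01 and then transferred to Club B, playing there from 2012-02-01 to 2015-02-01, I want 'FromClub' and 'ToClub' to show that transfer.
I want the table to look like this: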
ClubID | PlayerID | FromDate | ToDate | TeamName | c_Person | FromClub | ToClub |
---|---|---|---|---|---|---|---|
1 | 1 | 2010-01-01 | 2012-01-01 | Club A | Player 1 | NaN | NaN |
2 | 1 | 2012-02-01 | 2015-02-01 | Club B | Player 1 | Club A | Club B |
3 | 1 | 2015-05-01 | 2018-02-01 | Club C | Player 1 | Club B | Club C |
1 | 2 | 2010-01-01 | 2018-02-02 | Club A | Player 2 | NaN | NaN |
1 | 2 | 2018-03-02 | 2020-02-01 | Club A | Player 2 | NaN | NaN |
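I have been trying to write a function for this but cannot get it to work. I hope someone can help me solve it.
Here is the code that creates the first table: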
import pandas as pd
from datetime import datetime
df = pd.DataFrame({'ClubID': [1, 2, 3, 1, 1],
                   'PlayerID': [1, 1, 1, 2, 2],
                   'FromDate': ["2010-01-01", "2012-02-01", "2015-05-01", "2010-01-01", "2018-03-02"],
                   'ToDate': ["2012-01-01", "2015-02-01", "2018-02-01", "2018-02-02", "2020-02-01"],
                   'TeamName': ["Club A", "Club B", "Club C", "Club A", "Club A"],
                   'c_Person': ["Player 1", "Player 1", "Player 1", "Player 2", "Player 2"]})
# convert the date columns to datetime format
df['FromDate'] = pd.to_datetime(df['FromDate'])
df['ToDate'] = pd.to_datetime(df['ToDate'])
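Thanks in advance!
First, for every row in the dataframe, add the team each player was at before that row's spell (this assumes each player's rows are already in chronological order):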
df['PreviousTeam'] = df.groupby('PlayerID')['TeamName'].shift()
>>> df
ClubID FromDate PlayerID TeamName ToDate c_Person PreviousTeam
0 1 2010-01-01 1 Club A 2012-01-01 Player 1 NaN
1 2 2012-02-01 1 Club B 2015-02-01 Player 1 Club A
2 3 2015-05-01 1 Club C 2018-02-01 Player 1 Club B
3 1 2010-01-01 2 Club A 2018-02-02 Player 2 NaN
4 1 2018-03-02 2 Club A 2020-02-01 Player 2 Club A
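However, if a player's next spell is at the same club, the previous team is identical to the current team (row 4), which is not a real transfer. Apply the following to filter those rows out: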
df['FromClub'] = df[df['PreviousTeam'] != df['TeamName']]['PreviousTeam']
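The assignment above relies on pandas index alignment: rows dropped by the boolean mask have no matching label, so they end up as NaN in the new column. As a minimal sketch, the same step can be written with `Series.where`, which keeps the value where the condition holds and fills NaN elsewhere. The small `demo` frame below is made-up illustration data that only borrows the question's column names.

```python
import pandas as pd

# Made-up illustration data using the question's column names.
demo = pd.DataFrame({
    'PlayerID': [1, 1, 2, 2],
    'TeamName': ['Club A', 'Club B', 'Club A', 'Club A'],
})
demo['PreviousTeam'] = demo.groupby('PlayerID')['TeamName'].shift()

is_new_team = demo['PreviousTeam'] != demo['TeamName']

# Boolean-mask form: the filtered Series is realigned to the full index.
from_club_mask = demo[is_new_team]['PreviousTeam'].reindex(demo.index)

# Equivalent form: keep PreviousTeam where the team changed, NaN otherwise.
from_club_where = demo['PreviousTeam'].where(is_new_team)

print(from_club_where)
```

Both forms leave NaN for a player's first row (there is no previous team) and for rows where the club did not change.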
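Finally, the ToClub column can be derived from FromClub: wherever FromClub is set, a transfer took place, so ToClub is simply the current team: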
df['ToClub'] = df[~df['FromClub'].isna()]['TeamName']
>>> df.drop('PreviousTeam', axis=1)
ClubID FromDate PlayerID TeamName ToDate c_Person FromClub ToClub
0 1 2010-01-01 1 Club A 2012-01-01 Player 1 NaN NaN
1 2 2012-02-01 1 Club B 2015-02-01 Player 1 Club A Club B
2 3 2015-05-01 1 Club C 2018-02-01 Player 1 Club B Club C
3 1 2010-01-01 2 Club A 2018-02-02 Player 2 NaN NaN
4 1 2018-03-02 2 Club A 2020-02-01 Player 2 NaN NaN
Putting everything into one function, you can call the following on your dataframe and get the desired output:
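def fill_club_details(df):
    # team from the same player's previous row (NaN for the first row)
    df['PreviousTeam'] = df.groupby('PlayerID')['TeamName'].shift()
    # keep the previous team only where it differs from the current one
    df['FromClub'] = df[df['PreviousTeam'] != df['TeamName']]['PreviousTeam']
    # a transfer happened wherever FromClub is set
    df['ToClub'] = df[~df['FromClub'].isna()]['TeamName']
    return df.drop('PreviousTeam', axis=1)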
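Note that the function above adds columns to the dataframe passed in and assumes each player's rows are already ordered by date. A hedged variant is sketched below: it sorts by `PlayerID` and `FromDate` first, works on a copy, and restores the original row order at the end. The name `add_transfer_columns` is hypothetical, and the sample rows are the question's data deliberately shuffled to show the sorting step.

```python
import pandas as pd


def add_transfer_columns(df):
    """Return a copy of df with FromClub/ToClub columns (hypothetical helper)."""
    # Sort so that shift() really looks at the chronologically previous spell.
    out = df.sort_values(['PlayerID', 'FromDate']).copy()
    prev = out.groupby('PlayerID')['TeamName'].shift()
    # A transfer needs a previous team that differs from the current one.
    moved = prev.notna() & (prev != out['TeamName'])
    out['FromClub'] = prev.where(moved)
    out['ToClub'] = out['TeamName'].where(moved)
    # Restore the caller's original row order.
    return out.sort_index()


# The question's data with rows shuffled on purpose.
df = pd.DataFrame({
    'ClubID': [2, 1, 3, 1, 1],
    'PlayerID': [1, 1, 1, 2, 2],
    'FromDate': pd.to_datetime(['2012-02-01', '2010-01-01', '2015-05-01',
                                '2018-03-02', '2010-01-01']),
    'TeamName': ['Club B', 'Club A', 'Club C', 'Club A', 'Club A'],
})

result = add_transfer_columns(df)
print(result)
```

The input `df` is left untouched, so the function can be called repeatedly without accumulating helper columns.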