Pandas: Pivot dataframe with text and combine columns
I'm working with Python and Pandas and have a table like this:
Name Team Fixture Line-up Min IN Min Out
0 Player 1 RAY J1 Starting 68
1 Player 2 RAY J1 Bench 74
2 Player 3 RSO J2 Starting 45
3 Player 4 RSO J2 Bench 45
I need to pivot the table so that the 'Fixture' values become new columns containing the 'Line-up' text plus the Min IN or Min Out value. The result should look like this:
Name Team J1 J2
0 Player 1 RAY Starting - 68
1 Player 2 RAY Bench - 74
2 Player 3 RSO Starting - 45
3 Player 4 RSO Bench - 45
Is there any way to do this? Thanks in advance!
You can modify the Line-up column so that it includes the Min value, then pivot:
out = (df.assign(**{'Line-up': df['Line-up'] + ' - ' +
                    df.filter(like='Min').bfill(axis=1).iloc[:, 0].astype(int).astype(str)})
         # pivot takes keyword arguments only in pandas 2.0 and later
         .pivot(index=['Name', 'Team'], columns='Fixture', values='Line-up')
         .rename_axis(columns=None).reset_index())
Output:
Name Team J1 J2
0 Player 1 RAY Starting - 68 NaN
1 Player 2 RAY Bench - 74 NaN
2 Player 3 RSO NaN Starting - 45
3 Player 4 RSO NaN Bench - 45
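For reference, here is a minimal self-contained sketch of the same approach. The sample frame is a hypothetical reconstruction of the question's table (which of the two Min columns holds each value is an assumption), and the final fillna('') is an optional extra step that swaps NaN for blank cells to match the layout asked for in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's table; the placement of
# each value in "Min IN" vs "Min Out" is an assumption.
df = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Player 4"],
    "Team": ["RAY", "RAY", "RSO", "RSO"],
    "Fixture": ["J1", "J1", "J2", "J2"],
    "Line-up": ["Starting", "Bench", "Starting", "Bench"],
    "Min IN": [np.nan, 74, np.nan, 45],
    "Min Out": [68, np.nan, 45, np.nan],
})

# First non-missing value across the Min columns, row by row, as text.
minutes = df.filter(like="Min").bfill(axis=1).iloc[:, 0].astype(int).astype(str)

out = (
    df.assign(**{"Line-up": df["Line-up"] + " - " + minutes})
      .pivot(index=["Name", "Team"], columns="Fixture", values="Line-up")
      .rename_axis(columns=None)   # drop the "Fixture" label on the columns
      .reset_index()
      .fillna("")                  # optional: blank cells instead of NaN
)
print(out)
```

Leave out the fillna('') call if you would rather keep NaN for fixtures a player did not appear in.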
N.B. This assumes the empty cells in the Min columns are NaN values. If they are actually empty strings '', you can convert them to NaN first, like so:
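import pandas as pd

# .replace('', pd.NA) turns the empty strings into missing values before bfill
out = (df.assign(**{'Line-up': df['Line-up'] + ' - ' +
                    df.filter(like='Min').replace('', pd.NA).bfill(axis=1).iloc[:, 0].astype(int).astype(str)})
         # pivot takes keyword arguments only in pandas 2.0 and later
         .pivot(index=['Name', 'Team'], columns='Fixture', values='Line-up')
         .rename_axis(columns=None).reset_index())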
Another version:
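# Assumes every cell is a string and missing minutes are empty strings ''
df = (
    df.set_index(["Name", "Team", "Fixture"])
    # join the non-empty cells of each row: Line-up plus whichever Min value is present
    .apply(lambda x: " - ".join(x[x != ""]), axis=1)
    # move the Fixture level of the index into columns
    .unstack(level=2)
    .reset_index()
)
df.columns.name = ""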
Prints:
Name Team J1 J2
0 Player 1 RAY Starting - 68 NaN
1 Player 2 RAY Bench - 74 NaN
2 Player 3 RSO NaN Starting - 45
3 Player 4 RSO NaN Bench - 45
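If the Min columns might mix numbers, empty strings and missing values, the join step needs a little more care, since str.join only accepts strings. A hedged sketch of one way to handle that follows; the helper name join_non_empty and the sample data are hypothetical, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample: minutes stored as a mix of ints, '' and None (an assumption).
df = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Player 4"],
    "Team": ["RAY", "RAY", "RSO", "RSO"],
    "Fixture": ["J1", "J1", "J2", "J2"],
    "Line-up": ["Starting", "Bench", "Starting", "Bench"],
    "Min IN": ["", 74, None, 45],
    "Min Out": [68, "", 45, None],
})

def join_non_empty(row):
    """Join the cells of a row that are neither missing nor empty strings."""
    parts = [str(v) for v in row if pd.notna(v) and v != ""]
    return " - ".join(parts)

wide = (
    df.set_index(["Name", "Team", "Fixture"])
      .apply(join_non_empty, axis=1)   # one combined string per original row
      .unstack("Fixture")              # Fixture values become columns
      .rename_axis(columns=None)
      .reset_index()
)
print(wide)
```

Converting each value with str() before joining is what lets this version accept numeric minutes; the filter on pd.notna keeps missing values from showing up as text in the result.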