Unexpected behaviour with Pandas on Databricks

I am using Pandas on Databricks, and whenever I call sum() or mean() I do not get the result I expect, whereas other functions such as value_counts() and sort_values() work fine.

Here is the code:

df_fpl.groupby("team")["goals_scored"].sum()

Here is the output:

ARS    0000000000000000000000000000000000000000000000...
AVL    0000000000000000000000000000000000000000000000...
BHA    0000000000000000000000000000000000000000000000...
BUR    0000000000000000000000000000000000000000000000...
CHE    0000000000000000000000000000000000000000000000...
CRY    0000000000000000000100000000010000000000000000...
EVE    0000000000000000000000000000000000000000000000...
FUL    0000000010000000000000000000000000000000000000...
LEE    0000000000000000000000000000000000000000000000...
LEI    0000000000000000000000000000000000000000000000...
LIV    0000000000000000000000000000000000000000000000...
MCI    0000000000000000000000000000000000000000000000...
MUN    0000000000000000000000000000000000000000000000...
NEW    0000000000000000000000100000010110111100000000...
SHU    0000000000000000000000000000000000000000001000...
SOU    0000000001000100000000000000000000001000000000...
TOT    0000000000000000000000000000000000000000000000...
WBA    0000000000000000000000000000000000000000000000...
WHU    0000000000000000000002001010112100000000000000...
WOL    0000000000000000000000000000000000000000000000...
Name: goals_scored, dtype: object

What I actually need is:

ARS 15
AVL 20
BUR 10

and so on.

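The problem is that the column in question holds strings rather than numbers, as the "dtype: object" line in the output shows. Calling sum() on a string column concatenates the values within each group instead of adding them, which is why each team ends up with a long run of digits. Converting the column to int before grouping should fix it.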
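
The conversion can be sketched as follows. This is a minimal example, not the asker's actual data: the small df_fpl frame built here is a hypothetical stand-in that only reuses the column names from the question, and it assumes every value in goals_scored is a clean integer string.

```python
import pandas as pd

# Hypothetical stand-in for df_fpl: goals_scored is stored as strings,
# which is what produces "dtype: object" in the question's output.
df_fpl = pd.DataFrame({
    "team": ["ARS", "ARS", "AVL", "AVL", "AVL"],
    "goals_scored": ["1", "0", "2", "0", "1"],
})

# Convert the string column to integers before aggregating.
# astype(int) raises ValueError if any value is not a valid integer string.
df_fpl["goals_scored"] = df_fpl["goals_scored"].astype(int)

# With a numeric column, sum() adds the values within each team.
totals = df_fpl.groupby("team")["goals_scored"].sum()
print(totals)
```

If the real column contains blanks or other non-numeric entries, astype(int) will raise an error. In that case pd.to_numeric(df_fpl["goals_scored"], errors="coerce") may be a better fit, since it turns unparseable values into NaN instead of failing.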