Unexpected behaviour with Pandas on Databricks
I am using Pandas on Databricks, and whenever I call sum() or mean() I do not get the result I want, while other functions such as value_counts() and sort_values() work fine.
Here is the code:
df_fpl.groupby("team")["goals_scored"].sum()
Here is the output:
ARS 0000000000000000000000000000000000000000000000...
AVL 0000000000000000000000000000000000000000000000...
BHA 0000000000000000000000000000000000000000000000...
BUR 0000000000000000000000000000000000000000000000...
CHE 0000000000000000000000000000000000000000000000...
CRY 0000000000000000000100000000010000000000000000...
EVE 0000000000000000000000000000000000000000000000...
FUL 0000000010000000000000000000000000000000000000...
LEE 0000000000000000000000000000000000000000000000...
LEI 0000000000000000000000000000000000000000000000...
LIV 0000000000000000000000000000000000000000000000...
MCI 0000000000000000000000000000000000000000000000...
MUN 0000000000000000000000000000000000000000000000...
NEW 0000000000000000000000100000010110111100000000...
SHU 0000000000000000000000000000000000000000001000...
SOU 0000000001000100000000000000000000001000000000...
TOT 0000000000000000000000000000000000000000000000...
WBA 0000000000000000000000000000000000000000000000...
WHU 0000000000000000000002001010112100000000000000...
WOL 0000000000000000000000000000000000000000000000...
Name: goals_scored, dtype: object
What I actually need is:
ARS 15
AVL 20
BUR 10
and so on.
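The problem is that the column in question holds strings (note the dtype: object in your output), so sum() joins the values together instead of adding them. Converting the column to int should fix it.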
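A minimal sketch of that conversion, in case it helps. The small DataFrame below is a made-up stand-in for df_fpl (the real data is not shown in the question), and it assumes every value in goals_scored is a clean integer string:

```python
import pandas as pd

# Hypothetical stand-in for df_fpl: goals_scored was loaded as strings.
df_fpl = pd.DataFrame({
    "team": ["ARS", "ARS", "AVL", "AVL"],
    "goals_scored": ["0", "2", "1", "3"],
})

# Confirm the column is not numeric yet (it reports object or a string dtype).
print(df_fpl["goals_scored"].dtype)

# Convert the strings to integers, then aggregate as before.
df_fpl["goals_scored"] = df_fpl["goals_scored"].astype(int)
totals = df_fpl.groupby("team")["goals_scored"].sum()
print(totals)
```

If some rows contain blanks or other non-numeric text, astype(int) will raise an error; pd.to_numeric(df_fpl["goals_scored"], errors="coerce") is one alternative that turns those values into NaN instead.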