Use the humanize.intword function on every row and numeric column of a dataframe

Question

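I have a dataset with very large numbers. To make them easier to read, I want to apply the humanize.intword function to every column except the date column.

When I select only one column, it works: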

pred_df["Predictions"].apply(lambda x: humanize.intword(x))

When I try to select the other numeric columns, I get an error:

pred_df.apply(lambda row : humanize.intword(row['Predictions'],row['Lower'], row['Upper']), axis = 1)

TypeError: sequence item 0: expected str instance, float found

I also tried the list comprehension suggested in this post, but I am probably doing something wrong. It works for a single column:


[humanize.intword(x) for x in pred_df["Predictions"]]

When I try it over different columns, I get an error:

[humanize.intword(row[1], row[11]) for row in zip(pred_df["Predictions"],pred_df["Lower"])]
IndexError: tuple index out of range

My dataframe has 12 rows and 4 columns. Can you help me understand what the problem is?

Answer 1

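The problem is that humanize.intword takes a single value and converts it, whereas the goal here is to convert many numbers. One way to do that is applymap, which calls a function on every element of a dataframe: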

df.set_index("fiscal_date").applymap(humanize.intword)

We first set the date as the index so that it is left out of the conversion. If you like, you can turn it back into a column afterwards with reset_index().

As for why the errors occur:

When I select only one column, it works:

Because you select a Series, apply passes intword one entry of that column at a time, which is exactly what it expects, so it works.

When I try to select other numeric columns, I get an error:

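Because you are passing 3 values to intword, but it accepts at most 1 + 1 arguments: the first is the value to convert and the second is an optional format string. (I believe the error message should have been something like "this function takes 1 to 2 arguments but 3 were given".)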

It works for one single column:

Again, this is equivalent to the first apply over a single column.

When I try over different columns I get an error:

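Again, intword can only handle one value at a time. (The error itself, though, arises because you use 11 as an index into row, which has only 2 elements: the entries from the 2 zipped columns.)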

Tags: python, list-comprehension, humanize, dataframe, pandas