Efficient way to replace a large number of entries in a dataframe

Question

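I'm building an automation program for work that generates month-end reports automatically. The challenge I've run into is finding an efficient way to do a large number of replacements without a for loop and a pile of if statements.

I have a file with about 113 entries that tells me which entries need to be replaced with which other entry: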

Uom	Actual UOM
0	ML
3	ML
4	UN
7	ML
11	ML
12	ML
19	ML
55	ML
4U	GR

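There are a lot of repeats where I change several values to the same value (3, 7, 11, etc. all become ML), but it seems I would still have to run through a fair number of if statements for each cell. In another language I would probably use a switch statement for this, but Python doesn't seem to have them.

The pseudocode I have in mind: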

for each in dataframe
   if (3,7,11, etc...)
      change cell to ML
   if (4)
      change cell to UN
   if (4U)
      change cell to GR
etc.

Is there a more efficient way to do this, or am I on the right track?

Answer 1

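If you use plain if statements on a column, pandas may raise a warning or error such as "The truth value of a Series is ambiguous...".

I'm not sure I fully understand what you're trying to achieve, but to get you started, if you want to modify the "Uom" column you can do this: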
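To illustrate that error, here is a minimal sketch. The sample data is hypothetical, and the codes are assumed to be stored as strings because values such as "4U" are not numeric.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question's real file
df = pd.DataFrame({"Uom": ["3", "4", "4U"]})

try:
    # The comparison produces a boolean Series (one value per row),
    # which a plain `if` cannot reduce to a single True/False.
    if df["Uom"] == "3":
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The comparison runs on the whole column at once, which is why the replacements are better expressed as boolean masks than as per-cell if statements.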

# Parentheses are required because | binds more tightly than ==
mask = (df["Uom"] == 3) | (df["Uom"] == 7) | (df["Uom"] == 11)
df.loc[mask, "Uom"] = "ML"
df.loc[df["Uom"] == 4, "Uom"] = "UN"
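When many codes map to the same unit, chaining comparisons gets long. One possible alternative is Series.isin, sketched below. The sample frame is hypothetical, and the codes are assumed to be strings since "4U" appears among them.

```python
import pandas as pd

# Hypothetical sample data; codes are kept as strings because "4U" is not numeric
df = pd.DataFrame({"Uom": ["3", "7", "4", "11", "4U"]})

# Every code from the question's table that should become ML
ml_codes = ["0", "3", "7", "11", "12", "19", "55"]

# isin builds one boolean mask covering the whole list of codes
df.loc[df["Uom"].isin(ml_codes), "Uom"] = "ML"
df.loc[df["Uom"] == "4", "Uom"] = "UN"
df.loc[df["Uom"] == "4U", "Uom"] = "GR"

print(df)
```

This still hard-codes the groups, so it suits a handful of rules better than a 113-row mapping file.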

Answer 2

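I would create a dictionary from your mapping_df (I'm assuming the dataframe you posted is named mapping_df) and then map the results onto your main dataframe.

This way you don't need to declare anything by hand, so even if new rows are added to the 113-row mapping_df, the code will still run smoothly: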

# Create a dictionary with your Uom as Key
d = dict(zip(mapping_df.Uom,mapping_df['Actual UOM']))

# And then use map on your main_df Uom column 
main_df['Actual Uom'] = main_df['Uom'].map(d)

Something like the above should work.
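For a version that can be run end to end, here is a small sketch of the same dictionary approach. Both frames are hypothetical samples, the codes are assumed to be strings, and the fillna step is an addition of mine for codes that are missing from the mapping.

```python
import pandas as pd

# Hypothetical stand-ins for the mapping file and the main report data
mapping_df = pd.DataFrame({
    "Uom": ["0", "3", "4", "7", "11", "4U"],
    "Actual UOM": ["ML", "ML", "UN", "ML", "ML", "GR"],
})
main_df = pd.DataFrame({"Uom": ["3", "4", "4U", "99"]})

# Build the lookup dictionary: code -> actual unit
d = dict(zip(mapping_df["Uom"], mapping_df["Actual UOM"]))

# map returns NaN for codes missing from the dictionary;
# fillna keeps the original code in those rows instead
main_df["Actual Uom"] = main_df["Uom"].map(d).fillna(main_df["Uom"])

print(main_df)
```

If the mapping file is loaded with pd.read_csv, passing dtype=str is one way to keep codes like 3 and 4U the same type, so that the dictionary keys match the values in the main dataframe.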
