Apply recode pattern to many columns

Question

I have a dataframe with the following columns:

Name, Year, V1, V2, V5, V10, V12...

This Table contains about 40 Vx variables. Each of them takes values from 1 to 5. I want to recode them as follows:

1-3 = 0 and
4-5 = 1

I know how to replace the data in a single column like this:

Table['V1_F'] = Table['V1'].apply(lambda x: 0 if x <4 else 1)

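But I don't know how to apply this efficiently across multiple columns, or whether I really have to write this replacement for every column. Ideally it would be "do this for all columns except Name and Year".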

Any help is welcome.

Answer 1

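Collect the names of all columns to recode in a variable, compare them against the threshold to get a boolean mask, then cast to integer so that True/False becomes 1/0: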

cols = Table.columns.difference(['Name','Year'])
Table[cols] = (Table[cols] >= 4).astype(int)

Or use numpy.where (with numpy imported as np):

Table[cols] = np.where(Table[cols] < 4, 0, 1)
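
As a minimal sketch, both variants can be tried on a tiny frame. The sample values below are made up for illustration; only the column layout follows the question. The last step is an assumption about what the asker may want: it keeps the original columns and adds suffixed ones, mirroring the V1_F name used in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column layout follows the question.
Table = pd.DataFrame({
    "Name": ["a", "b", "c"],
    "Year": [2001, 2002, 2003],
    "V1": [1, 4, 5],
    "V2": [3, 2, 4],
})

# Every column except the two identifier columns.
cols = Table.columns.difference(["Name", "Year"])

# Variant 1: boolean mask cast to integers (4-5 -> 1, 1-3 -> 0).
by_mask = (Table[cols] >= 4).astype(int)

# Variant 2: numpy.where returns a plain array, so wrap it for comparison.
by_where = pd.DataFrame(
    np.where(Table[cols] < 4, 0, 1), index=Table.index, columns=cols
)

# Both variants should produce the same values.
print((by_mask.values == by_where.values).all())

# Keep the originals and add suffixed columns, like V1_F in the question.
result = Table.join(by_mask.add_suffix("_F"))
print(result)
```

Assigning back with Table[cols] = ... overwrites the original values, while the join at the end leaves them in place.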

Answer 2

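Two possible solutions are listed below:

applymap, if you need a more complex function
Your logic is binary, so build a boolean truth matrix and convert it back to an integer representation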

import numpy as np
import pandas as pd

# randint's upper bound is exclusive, so 6 is needed to generate values 1-5
df = pd.DataFrame({**{"Name":np.random.choice(["this","that","other"],15),"Year":np.random.choice(range(1990,2021),15)},
             **{f"V{i}":np.random.randint(1,6,15) for i in range(10)}})

df2 = df.copy()
# every column whose name starts with "V"
v_cols = [c for c in df.columns if c.startswith("V")]
# solution 1: elementwise function (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
df.loc[:, v_cols] = df.loc[:, v_cols].applymap(lambda v: 0 if v<=3 else 1)
# solution 2: values above 3 become True, which casts to 1
df2.loc[:, v_cols] = (df2.loc[:, v_cols] > 3).astype(int)

Name	Year	V0	V1	V2	V3	V4	V5	V6	V7	V8	V9
this	1998	0	1	0	0	1	0	0	0	0	0
that	2010	1	0	0	0	0	1	0	0	1	0
this	2004	0	0	0	0	1	0	0	1	0	0
this	1992	0	1	1	0	0	1	0	0	1	1
this	1990	0	0	1	0	0	0	0	0	0	1
this	2020	0	0	1	1	0	1	0	1	0	1
this	2016	0	1	0	0	0	0	1	0	1	0
other	1997	1	0	0	0	1	1	0	0	1	0
that	2000	1	0	1	0	0	1	1	0	0	0
that	2020	0	0	1	0	1	0	0	0	0	1
that	1991	0	0	0	0	0	0	1	0	0	1
other	2015	0	0	0	0	0	0	1	1	0	0
this	2020	0	0	0	1	0	0	0	0	0	0
other	2005	1	0	0	0	1	0	1	0	0	0
other	2008	1	0	0	0	0	0	1	0	0	0
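
To check that the elementwise route and the vectorized route agree, here is a small sketch on fixed data. The sample values and the recode helper are hypothetical. The hasattr check is an assumption about the installed pandas version: it picks DataFrame.map where it exists and falls back to applymap otherwise.

```python
import pandas as pd

# Hypothetical sample data with the same layout as the example above.
df = pd.DataFrame({
    "Name": ["this", "that"],
    "Year": [1998, 2010],
    "V0": [1, 5],
    "V1": [4, 3],
})
v_cols = [c for c in df.columns if c.startswith("V")]


def recode(v):
    """Hypothetical helper for the rule in the question: 1-3 -> 0, 4-5 -> 1."""
    return 0 if v <= 3 else 1


values = df[v_cols]
# DataFrame.map replaces applymap in newer pandas; fall back on older versions.
elementwise = values.map if hasattr(values, "map") else values.applymap
slow = elementwise(recode)

# Vectorized equivalent of the same rule.
fast = (values > 3).astype(int)

# True when every recoded cell matches.
print((slow == fast).all().all())
```

The elementwise version calls a Python function once per cell, so it is mainly worth using when the rule cannot be written as a simple comparison.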
