Pandas: delete and shift cells in a column based on multiple conditions
I have a situation where I want to delete and shift cells in a pandas DataFrame based on some conditions. My DataFrame looks like this:
Value_1 ID_1 Value_2 ID_2 Value_3 ID_3
A 1 D 1 G 1
B 1 E 2 H 1
C 1 F 2 I 3
C 1 F 2 H 1
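Now I want to check the following condition:
ID_2 and ID_3 should always be less than or equal to ID_1. If either of them is greater than ID_1, that cell (together with its Value cell) should be deleted and replaced by the cell from the next column, shifting it left.
The output should look like this: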
Value_1 ID_1 Value_2 ID_2 Value_3 ID_3
A 1 D 1 G 1
B 1 H 1 blank nan
C 1 blank nan blank nan
C 1 H 1 blank nan
You can create a mask from the condition, here for values greater than ID_1, with DataFrame.gt:
cols1 = ['Value_2','Value_3']
cols2 = ['ID_2','ID_3']
m = df[cols2].gt(df['ID_1'], axis=0)
print(m)
    ID_2   ID_3
0  False  False
1   True  False
2   True   True
3   True  False
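As a quick check of what axis=0 does here, the sketch below rebuilds the sample frame from the question and compares the DataFrame.gt result with the same comparison written out one column at a time. The name m_manual is only illustrative and not part of the answer.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "Value_1": ["A", "B", "C", "C"],
    "ID_1": [1, 1, 1, 1],
    "Value_2": ["D", "E", "F", "F"],
    "ID_2": [1, 2, 2, 2],
    "Value_3": ["G", "H", "I", "H"],
    "ID_3": [1, 1, 3, 1],
})

cols2 = ["ID_2", "ID_3"]

# axis=0 aligns df["ID_1"] with the row index, so each row's ID_2 and ID_3
# are compared against that same row's ID_1.
m = df[cols2].gt(df["ID_1"], axis=0)

# The same comparison, one column at a time (illustrative only).
m_manual = pd.DataFrame({c: df[c] > df["ID_1"] for c in cols2})

print(m.equals(m_manual))
```

Without axis=0, the Series would be aligned against the column labels instead of the rows, which is not the comparison wanted here.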
Then replace the matched cells with missing values using DataFrame.mask:
df[cols2] = df[cols2].mask(m)
df[cols1] = df[cols1].mask(m.to_numpy())
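The second line converts the mask with to_numpy() because the mask is labeled ID_2/ID_3 while the target columns are Value_2/Value_3, and DataFrame.mask aligns a boolean DataFrame on its labels. A minimal sketch of the positional behavior, using a small made-up two-row frame rather than the full sample:

```python
import pandas as pd

# Two-row illustration; the data is a cut-down version of the question's frame.
values = pd.DataFrame({"Value_2": ["D", "E"], "Value_3": ["G", "H"]})

# Boolean mask labeled with the ID column names, not the Value column names.
m = pd.DataFrame({"ID_2": [False, True], "ID_3": [False, False]})

# A NumPy array carries no labels, so the mask is applied by position:
# first mask column -> Value_2, second mask column -> Value_3.
masked = values.mask(m.to_numpy())
print(masked)
```

This relies on cols1 and cols2 listing the Value and ID columns in the same order.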
Finally, use DataFrame.shift and set the new columns with Series.mask:
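# shift the ID columns one position to the left (ID_3 moves into ID_2)
df1 = df[cols2].shift(-1, axis=1)
# where ID_2 was removed, take the shifted value; ID_3 is then emptied
df['ID_2'] = df['ID_2'].mask(m['ID_2'], df1['ID_2'])
df['ID_3'] = df['ID_3'].mask(m['ID_2'])
# same for the Value columns
df2 = df[cols1].shift(-1, axis=1)
df['Value_2'] = df['Value_2'].mask(m['ID_2'], df2['Value_2'])
df['Value_3'] = df['Value_3'].mask(m['ID_2'])
print(df)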
  Value_1  ID_1 Value_2  ID_2 Value_3  ID_3
0       A     1       D   1.0       G   1.0
1       B     1       H   1.0     NaN   NaN
2       C     1     NaN   NaN     NaN   NaN
3       C     1       H   1.0     NaN   NaN
If necessary, replace the missing values in the Value columns with empty strings at the end:
df[cols1] = df[cols1].fillna('')
print(df)
  Value_1  ID_1 Value_2  ID_2 Value_3  ID_3
0       A     1       D   1.0       G   1.0
1       B     1       H   1.0           NaN
2       C     1           NaN           NaN
3       C     1       H   1.0           NaN
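For convenience, the steps above can be collected into one runnable sketch. The function name drop_and_shift is hypothetical (it is not a pandas API), and the sketch assumes exactly the two ID/Value pairs from the question; it works on a copy so the input frame is left unchanged.

```python
import pandas as pd


def drop_and_shift(df):
    """Hypothetical helper wrapping the steps above for ID_2/ID_3 and Value_2/Value_3."""
    out = df.copy()
    cols1 = ["Value_2", "Value_3"]
    cols2 = ["ID_2", "ID_3"]

    # True where an ID is greater than ID_1 in the same row.
    m = out[cols2].gt(out["ID_1"], axis=0)

    # Blank out the offending ID cells and their Value cells.
    out[cols2] = out[cols2].mask(m)
    out[cols1] = out[cols1].mask(m.to_numpy())

    # Shift left by one column and fill the removed *_2 cells from *_3.
    df1 = out[cols2].shift(-1, axis=1)
    df2 = out[cols1].shift(-1, axis=1)
    out["ID_2"] = out["ID_2"].mask(m["ID_2"], df1["ID_2"])
    out["ID_3"] = out["ID_3"].mask(m["ID_2"])
    out["Value_2"] = out["Value_2"].mask(m["ID_2"], df2["Value_2"])
    out["Value_3"] = out["Value_3"].mask(m["ID_2"])

    # Optional: show empty strings instead of NaN in the Value columns.
    out[cols1] = out[cols1].fillna("")
    return out


# Sample frame copied from the question.
df = pd.DataFrame({
    "Value_1": ["A", "B", "C", "C"],
    "ID_1": [1, 1, 1, 1],
    "Value_2": ["D", "E", "F", "F"],
    "ID_2": [1, 2, 2, 2],
    "Value_3": ["G", "H", "I", "H"],
    "ID_3": [1, 1, 3, 1],
})

result = drop_and_shift(df)
print(result)
```

With more than two ID/Value pairs the shift step would need to be repeated or generalized, which this sketch does not attempt.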