How to add several new columns to a DataFrame according to some rules

Question

I have a DataFrame with many columns, and some of them start with test_.

Here is an example containing only those test_ columns:

import pandas as pd

c = pd.DataFrame({'test_pierce':[10,30,40,50],'test_failure':[30,10,20,10] })

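What I need to do:

For every column in my DataFrame that starts with test_, I want to create another column right after it that classifies its values, as follows: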

if test_ > 30.0:
   Y
else:
   N

The goal is this output:

d = pd.DataFrame({'test_pierce':[10,30,40,50],'class_test_pierce':['N','N','Y','Y'],'test_failure':[30,10,20,10], 'class_test_failure':['N','N','N','N'] })

What I have done:

I have the columns that need to be classified:

cols = [col for col in c.columns if col.startswith('test_')]

I cannot work out how to continue from here.

Answer 1

One form that can get you started:

cols = [col for col in c.columns if col.startswith('test_')]

for col in cols:
    # Row-wise apply: label each row 'Y' if its value in col exceeds 30, else 'N'
    c[f'class_{col}'] = c.apply(lambda x: 'Y' if x[col] > 30.0 else 'N', axis=1)

Output:

   test_pierce  test_failure class_test_pierce class_test_failure
0           10            30                 N                  N
1           30            10                 N                  N
2           40            20                 Y                  N
3           50            10                 Y                  N
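
Note that this output has the class_ columns at the end of the frame, not directly after their test_ columns as the question requested. The following is a minimal sketch, assuming the frames are named c and d as in the question, that checks the values match the expected frame once the column order is set aside:

```python
import pandas as pd

c = pd.DataFrame({'test_pierce': [10, 30, 40, 50],
                  'test_failure': [30, 10, 20, 10]})
# Expected frame, copied from the question
d = pd.DataFrame({'test_pierce': [10, 30, 40, 50],
                  'class_test_pierce': ['N', 'N', 'Y', 'Y'],
                  'test_failure': [30, 10, 20, 10],
                  'class_test_failure': ['N', 'N', 'N', 'N']})

for col in [name for name in c.columns if name.startswith('test_')]:
    c[f'class_{col}'] = c.apply(lambda row: 'Y' if row[col] > 30.0 else 'N', axis=1)

# The class_ columns were appended at the end, so the column order differs from d
same_order = list(c.columns) == list(d.columns)

# Selecting the columns in d's order makes the two frames comparable;
# dtype checking is relaxed because only the values matter here
pd.testing.assert_frame_equal(c[list(d.columns)], d, check_dtype=False)
```

Because the rule is a strict "greater than", a value of exactly 30 is classified as N.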

Answer 2

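Code that produces the requested column order. It is a bit ugly because you asked for each new column to sit right after its test_ column. Without that requirement the code is simpler.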

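import numpy as np

# Remember each test_ column together with its original position
cols = [(i, name) for i, name in enumerate(c.columns) if name.startswith('test_')]

count = 1  # offset: one for "right after", plus one per column already inserted
for index, col in cols:
    value = np.where(c[col] > 30.0, 'Y', 'N')
    c.insert(index + count, 'class_' + col, value)
    count += 1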
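
An alternative to tracking insert positions is to build the desired column order as a list and reindex once at the end. This is only a sketch under the same assumptions as above (a frame named c, a threshold of 30.0); the name ordered is introduced here for illustration:

```python
import numpy as np
import pandas as pd

c = pd.DataFrame({'test_pierce': [10, 30, 40, 50],
                  'test_failure': [30, 10, 20, 10]})

ordered = []  # final column order, built left to right
for col in list(c.columns):  # snapshot, since c gains columns inside the loop
    ordered.append(col)
    if col.startswith('test_'):
        c[f'class_{col}'] = np.where(c[col] > 30.0, 'Y', 'N')
        ordered.append(f'class_{col}')

# Select the columns in the interleaved order
c = c[ordered]
```

This avoids the running offset, at the cost of one extra column selection at the end.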

Code without the requested order:

cols = [col for col in c.columns if col.startswith('test_')]

for col in cols:
    c[f'class_{col}'] = np.where(c[col] > 30.0, 'Y', 'N')
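
If the order does not matter, the loop can also be dropped by classifying all test_ columns in one step. The sketch below assumes the same threshold; the id column and the names tests, classes and result are hypothetical and are there only to show that non-test columns are left untouched:

```python
import numpy as np
import pandas as pd

# 'id' is a hypothetical extra column, not part of the question's data
c = pd.DataFrame({'id': [1, 2, 3, 4],
                  'test_pierce': [10, 30, 40, 50],
                  'test_failure': [30, 10, 20, 10]})

# Keep only the columns whose name starts with test_
tests = c.filter(regex='^test_')

# np.where works element-wise on the whole sub-frame and returns a 2D array
classes = pd.DataFrame(np.where(tests > 30.0, 'Y', 'N'),
                       index=tests.index,
                       columns=[f'class_{col}' for col in tests.columns])

# The class_ columns are appended to the right of the original frame
result = pd.concat([c, classes], axis=1)
```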


pandas

jupyter-notebook