Dropping a number of columns in a pandas DataFrame on one line

Question

Hello, I have a pandas DataFrame that looks like this:

    Product_Code  W0  W1  W2  W3  W4  W5  W6  W7  W8      ...        \
806         P815   0   0   1   0   0   2   1   0   0      ...         
807         P816   0   1   0   0   1   2   2   6   0      ...         
808         P817   1   0   0   0   1   1   2   1   1      ...         
809         P818   0   0   0   1   0   0   0   0   1      ...         
810         P819   0   1   0   0   0   0   0   0   0      ...         

     Normalized 42  Normalized 43  Normalized 44  Normalized 45  \
806           0.00           0.33           0.33           0.00   
807           0.43           0.43           0.57           0.29   
808           0.50           0.00           0.00           0.50   
809           0.00           0.00           0.00           0.50   
810           0.00           0.00           0.00           0.00

But I don't need most of these columns. In fact I only need W0 and W4, so I want to remove all the others. Here is what I tried:

raw_data = [ raw_data.drop( [i], 1, inplace = True )  for i in raw_data if i is not 'W0' and i is not  'W4'  ]

After half an hour I found out that, for some reason, the "is not" comparison does not work for strings, and I was wondering why. So now I have a stable solution:

#WORKS !!!!
# for i in raw_data:
#     if i != 'W0' and i != 'W4':
#         raw_data.drop( [i], 1, inplace = True )

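But I don't like it at all. I have commented it out because it takes up a lot of space and it isn't pretty. I want to make the one-line loop with an if expression work. Is that possible? The problem is that: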

  raw_data = [ raw_data.drop( [i], 1, inplace = True )  for i in raw_data if i != 'W0' and i != 'W4'  ]

tries to convert the DataFrame into a list. How should this be done properly?

Answer 1

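You can pass the whole list of unwanted columns to a single drop call:

raw_data.drop([i for i in raw_data if i != 'W0' and i != 'W4'],
              axis=1, inplace=True)

This answers the question, but the condition as you originally stated it does not do what you expect. The condition you set is if i is not 'W0' and i is not 'W4', which compares object identity rather than string contents, so it can be true for every column name, including W0 and W4. Use != for the comparison, as above.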
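
To make the difference concrete, here is a minimal sketch on a small made-up frame (the values and the names keep, scratch, result, and trimmed are invented for illustration, not taken from your data). It shows why the list comprehension in the question ends up as a list rather than a DataFrame: drop(..., inplace=True) returns None, so the comprehension collects None values. It also shows a single drop call with a value-based filter; "not in" compares by equality, just like !=.

```python
import pandas as pd

# Hypothetical stand-in for the real data; the values are invented.
raw_data = pd.DataFrame({
    "Product_Code": ["P815", "P816"],
    "W0": [0, 0],
    "W1": [0, 1],
    "W4": [0, 1],
})

keep = ["W0", "W4"]

# drop(..., inplace=True) modifies the frame and returns None, so a
# comprehension built from those calls is a list of None values.
scratch = raw_data.copy()
result = [
    scratch.drop([c], axis=1, inplace=True)
    for c in list(scratch.columns)
    if c not in keep
]

# One drop call with the full list of unwanted labels returns a new frame.
to_drop = [c for c in raw_data.columns if c not in keep]
trimmed = raw_data.drop(to_drop, axis=1)

print(result)
print(list(trimmed.columns))
```

Assigning the comprehension back to raw_data would therefore throw the DataFrame away, even though the columns were dropped from it as a side effect.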

Answer 2

This should work:

 raw_data = pd.DataFrame(raw_data, index=your_index, columns=['W0', 'W4'])
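
If the index does not need to change, it may be simpler to select the wanted columns directly. The sketch below compares three ways of getting the same result on a made-up frame (the values and the names keep, rebuilt, selected, and dropped are assumptions for illustration): the constructor call above without the index argument, plain column selection, and dropping the complement computed with Index.difference.

```python
import pandas as pd

# Hypothetical stand-in for the real data; the values are invented.
raw_data = pd.DataFrame({
    "Product_Code": ["P815", "P816"],
    "W0": [0, 0],
    "W1": [0, 1],
    "W4": [0, 1],
})

keep = ["W0", "W4"]

# Rebuild the frame with only the listed columns (index left unchanged).
rebuilt = pd.DataFrame(raw_data, columns=keep)

# Select the wanted columns; this returns a new DataFrame.
selected = raw_data[keep]

# Drop every column that is not in keep.
dropped = raw_data.drop(columns=raw_data.columns.difference(keep))

print(list(rebuilt.columns))
print(list(selected.columns))
print(list(dropped.columns))
```

All three leave raw_data itself untouched, so the result has to be assigned back if the original name should refer to the trimmed frame.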

python

pandas

scikit-learn

sklearn-pandas