Isolating rows of a DataFrame in a loop based on multiple conditions

Question

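I recently asked a question related to this one. The answer back then was simple (I wasn't using a specific column), but this time I don't have that column, and none of the additional answers provided actually worked :/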

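The problem is with a multi-label DataFrame, where you want to isolate the rows in which a given class column contains 1 and all the other columns contain 0. This is my code so far, but it loops forever and crashes Colab.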

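In this case I only want that Action row, but I'm also trying to loop over the columns: first append all rows where Action is 1 and every column in column_list is 0, then the rows where History is 1 and all others are 0, and so on.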

Again, the options provided at the link give me a "The truth of the answer is ambiguous" error.

Index |  Drama | Western | Action | History |
   0        1        1         0         0
   1        0        0         0         1
   2        0        0         1         0


# Column list to be popped
column_list = list(balanced_df.columns)[1:]

single_labels = []
i=0

# 28 columns total
while i < 27:
  # defining/resetting the full column list at the start of each loop
  column_list = list(balanced_df.iloc[:,1:])
  # Pop column name at index i
  x = column_list.pop(i)

  # storing the results in a list of lists
  # Filters for the popped column where the column is 1 & the remaining columns are set to 0
  single_labels.append(balanced_df[(balanced_df[x] == 1) & (balanced_df[column_list]==0)])

  # increment the column index number for the next run
  i+=1

The output here would look something like:

single_labels[0]

    Index |  Drama | Western | Action | History |
       2        0        0         1         0


single_labels[1]
    Index |  Drama | Western | Action | History |
       1        0        0         0         1

Answer 1

You don't need a loop. You rarely need loops with pandas. If you are selecting rows based on a condition, you should use boolean indexing.
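As a minimal sketch of what boolean indexing means here, assume a small frame shaped like the one in the question (the name `df`, the list `others`, and the values are illustrative, not taken from the asker's real data). A mask used for row selection has to be a single boolean Series with one entry per row, so a frame-wide comparison such as `df[others] == 0` needs to be reduced with `.all(axis=1)` before it is combined with another condition:

```python
import pandas as pd

# Illustrative data modelled on the question's table (values are assumptions).
df = pd.DataFrame({
    "Drama":   [1, 0, 0],
    "Western": [1, 0, 0],
    "Action":  [0, 0, 1],
    "History": [0, 1, 0],
})

# Hypothetical list of every label column except the one being isolated.
others = ["Drama", "Western", "History"]

# One boolean per row: Action is 1 AND every other label column is 0.
mask = (df["Action"] == 1) & (df[others] == 0).all(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
action_only = df[mask]
print(action_only)
```

Without the `.all(axis=1)` reduction, the right-hand side is a whole DataFrame of booleans rather than a per-row Series, which is likely why the condition in the question did not select rows as intended.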

In your case:

df.loc[df.sum(axis='columns').eq(1)]

For example:

import pandas

pandas.DataFrame({
    'A': [1, 0, 0, 0, 0, 1, 1, 0, 0],
    'B': [0, 1, 0, 0, 1, 0, 1, 0, 0],
    'C': [0, 0, 1, 0, 1, 0, 0, 1, 0],
    'D': [0, 0, 0, 1, 0, 1, 0, 1, 0],
}).loc[lambda df: df.sum(axis='columns').eq(1)].values.tolist()

Output:

[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
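If the per-label grouping from the question is still wanted, the same mask can be split by column afterwards. The following is only a sketch under some assumptions: the labels are strictly 0/1 (so a row sum of 1 means exactly one label is set), the first column is a non-label column as the question's `iloc[:, 1:]` suggests, and the names `label_cols`, `labels`, and `single` are made up for illustration. It stores the result in a dict keyed by column name rather than in the question's list:

```python
import pandas as pd

# Illustrative data; "Index" stands in for a non-label first column (an assumption).
balanced_df = pd.DataFrame({
    "Index":   [0, 1, 2],
    "Drama":   [1, 0, 0],
    "Western": [1, 0, 0],
    "Action":  [0, 0, 1],
    "History": [0, 1, 0],
})

# Only the label columns take part in the sum, so the first column is skipped.
label_cols = list(balanced_df.columns[1:])
labels = balanced_df[label_cols]

# Rows where exactly one label is set (valid only for 0/1 labels).
single = balanced_df[labels.sum(axis=1).eq(1)]

# Split those rows by which label is set; iterating over the column
# list is bounded, unlike a hand-maintained while-loop counter.
single_labels = {col: single[single[col].eq(1)] for col in label_cols}

print(single_labels["Action"])
print(single_labels["History"])
```

Labels with no single-label rows (Drama and Western in this toy data) simply map to empty frames.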

python

data-manipulation

while-loop

multilabel-classification