Locating row by value in multiple columns

Question

This is my data frame (df):

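I want to iterate over my rows, and if I see the value "999" anywhere in a row (excluding id):

I need to make sure the entire row is "999".
I need to make sure the second row with the same id value also contains "999".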

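For example, for id=5, I have one row that contains 999, while the second row with id=5 does not contain "999".

Expected output:

Here is what I have so far: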

num_of_p = len(df.columns) - 1
for v in df.index:
    if (sum(df.iloc[v] == 999) != num_of_p):
        if (sum(df.iloc[v] == 999) != 0):
            raise Exception("***** value 999 should apply to the entire row - please check and re-run *****")

This code works for my first condition. I am having trouble working out the second one. Any help would be greatly appreciated!

Answer 1

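You can use boolean indexing. Note that this overwrites every row of an affected id with 999 rather than raising an error: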

m = df.loc[:, "p1":].apply(lambda x: 999 in x.values, axis=1)
df.loc[df["id"].isin(df.loc[m, "id"]), "p1":] = 999
print(df)

Prints:

   id   p1   p2   p3   p4
0   2    0    0    0    0
1   2    1    1    1    1
2   4  999  999  999  999
3   4  999  999  999  999
4   5  999  999  999  999
5   5  999  999  999  999
6   9    1    1    1    1
7   9    0    0    0    0
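
The row-wise apply above can also be expressed with vectorized comparisons, which tends to be faster on larger frames. A minimal sketch, assuming a made-up frame with columns id and p1 to p4 (the question's actual data is not shown here, so the values below are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; the question's actual frame is not reproduced here.
df = pd.DataFrame({
    "id": [2, 2, 4, 4, 5, 5, 9, 9],
    "p1": [0, 1, 999, 999, 999, 1, 1, 0],
    "p2": [0, 1, 999, 999, 999, 1, 1, 0],
    "p3": [0, 1, 999, 999, 999, 1, 1, 0],
    "p4": [0, 1, 999, 999, 999, 1, 1, 0],
})

# True for each row that has 999 in at least one non-id column.
m = df.loc[:, "p1":].eq(999).any(axis=1)

# ids that own at least one such row.
ids_with_999 = df.loc[m, "id"].unique()
```

Here m plays the same role as the mask built with apply, so it can be used in the same df.loc assignment.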

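EDIT: To get the "single" rows that contain 999, that is, ids where only one row of the pair has it (assuming ids always come in pairs):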

m = df.loc[:, "p1":].apply(lambda x: 999 in x.values, axis=1)

x = df.loc[m, "id"].value_counts()
print('Rows that contain 999 and are "single":')
print(x[x == 1].index.values)

Prints:

Rows that contain 999 and are "single":
[5]
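
If the goal is to raise an error rather than overwrite or list the offending ids, both conditions from the question can be checked in one pass. This is only a sketch under assumptions: the helper name check_999, its parameters, and the sample frame are hypothetical and not part of the original thread.

```python
import pandas as pd

def check_999(df, id_col="id", sentinel=999):
    """Hypothetical helper: raise ValueError if the sentinel is used inconsistently."""
    is_sentinel = df.drop(columns=id_col).eq(sentinel)
    any_hit = is_sentinel.any(axis=1)  # row has the sentinel somewhere
    all_hit = is_sentinel.all(axis=1)  # row is the sentinel everywhere

    # Condition 1: a row that contains the sentinel must contain it in every column.
    partial = any_hit & ~all_hit
    if partial.any():
        raise ValueError(f"rows only partly {sentinel}: {df.index[partial].tolist()}")

    # Condition 2: within one id, either every row is the sentinel or none is.
    per_id = all_hit.groupby(df[id_col]).agg(["any", "all"])
    mixed = per_id.index[per_id["any"] & ~per_id["all"]]
    if len(mixed) > 0:
        raise ValueError(f"ids with unmatched {sentinel} rows: {mixed.tolist()}")

# Hypothetical sample data: id 5 has one all-999 row and one ordinary row.
df = pd.DataFrame({
    "id": [2, 2, 4, 4, 5, 5, 9, 9],
    "p1": [0, 1, 999, 999, 999, 1, 1, 0],
    "p2": [0, 1, 999, 999, 999, 1, 1, 0],
})

try:
    check_999(df)
    message = None
except ValueError as exc:
    message = str(exc)
```

Unlike the value_counts approach, this check does not assume that every id appears exactly twice; it flags any id whose rows disagree about the sentinel.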
