How to use str.startswith for multiple columns?

Question

I have a dataframe that looks like this (image: my data).

I use the following to filter users whose ID starts with b, c, e, f, or 5, and it works:

df[df.userA.str.startswith(('b','c','e','f','5'))]

I now want to apply the same filter to both columns userA and userB, but this attempt did not work:

df[[df.userA.str.startswith(('b','c','e','f','5'))] and [df.userB.str.startswith(('b','c','e','f','5'))]]

Any ideas?

Answer 1

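You cannot use and here, because in Python it returns the first operand whose truthiness is False (or the last operand if there is no such operand in the and chain). In your attempt both operands are non-empty lists, which are truthy, so the whole expression simply evaluates to the second list and the userA condition is discarded.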
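To make that behavior concrete, here is a minimal sketch in plain Python. The names a, b, result and empty_first are made up for illustration; the two lists stand in for the list-wrapped conditions from the question.

```python
# Plain Python `and` returns one of its operands unchanged;
# it never combines them element by element.
a = [True, False]   # non-empty list, so it is truthy
b = [False, True]   # non-empty list, so it is truthy

# Both operands are truthy, so `and` returns the last one (b itself).
result = a and b

# If the first operand is falsy (an empty list), `and` returns it immediately.
empty_first = [] and b

print(result)       # the same object as b
print(empty_first)  # the empty list
```

Dropping the list wrappers does not help either: using and directly on two pandas Series raises a ValueError, because a Series has no single truth value.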

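However, you can use the & and | operators as the element-wise logical and and or, respectively, to combine multiple conditions.

So in your case you probably want: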

df[
    df.userA.str.startswith(('b','c','e','f','5')) &
    df.userB.str.startswith(('b','c','e','f','5'))
]

(this gives the rows of the dataframe df where both userA and userB start with one of the characters in ('b','c','e','f','5')); or

df[
    df.userA.str.startswith(('b','c','e','f','5')) |
    df.userB.str.startswith(('b','c','e','f','5'))
]

(this gives the rows of the dataframe df where at least one of userA or userB starts with one of the characters in ('b','c','e','f','5')).
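A small end-to-end sketch of both variants follows. The sample data is invented (the original dataframe was only shown as an image), and the names prefixes, a_ok, b_ok, cols and mask are just illustrative. It assumes a pandas version in which str.startswith accepts a tuple of prefixes, as in the question. The last step is one possible way to extend the idea to a longer list of columns, not the only one.

```python
import pandas as pd

# Hypothetical sample data standing in for the dataframe in the question.
df = pd.DataFrame({
    "userA": ["bob", "carl", "alice", "anna", "eve"],
    "userB": ["frank", "zed", "dora", "ben", "5ive"],
})

prefixes = ("b", "c", "e", "f", "5")

# One boolean Series per column: True where the value starts with a prefix.
a_ok = df["userA"].str.startswith(prefixes)
b_ok = df["userB"].str.startswith(prefixes)

# Element-wise AND: rows where both columns match.
both = df[a_ok & b_ok]

# Element-wise OR: rows where at least one column matches.
either = df[a_ok | b_ok]

# Possible generalization to any list of columns: build a boolean
# DataFrame column by column, then reduce across each row.
cols = ["userA", "userB"]
mask = df[cols].apply(lambda s: s.str.startswith(prefixes))
all_cols = df[mask.all(axis=1)]   # use mask.any(axis=1) for the OR variant

print(both)
print(either)
```

With this sample data, both and all_cols keep rows 0 and 4, while either keeps rows 0, 1, 3 and 4.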
