How to check whether a substring in a dataframe is contained in a long string variable?

Question

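My question runs in the opposite direction from the usual use of str.contains() to check a string. I want to check whether a substring stored in a dataframe is contained in a long string variable.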

The dataframe looks like this:

Account	Substring	Category
1001	Cash Payment	Category #1
1002	Credit Card Payment	Category #2

The long string variable is long_str = "Cash Payment by Customer".

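So, when using .loc to search/filter the records in the dataframe, is there a function like str.contains() that works the other way around and selects the rows whose Substring is contained in long_str?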

Below is the code I tried for filtering the dataframe, but str.contains() does not work here. Thanks!

df.loc[df['Substring'].str.contains(long_str)]

Answer 1

You can simply use the pandas.Series.apply method:

>>> long_str = "Cash Payment by Customer"
>>> df.loc[df.Substring.apply(lambda x: x in long_str)]
   Account     Substring     Category
0     1001  Cash Payment  Category #1
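
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the dataframe from the question (the column names and values are taken from the table and output above) and applies the same `x in long_str` test. The attempt in the question fails because `str.contains(long_str)` tests whether `long_str` occurs inside each `Substring` value, which is the reverse of what is wanted. The case-insensitive variant at the end is an optional addition, not part of the original answer.

```python
import pandas as pd

# Dataframe reconstructed from the question.
df = pd.DataFrame({
    "Account": [1001, 1002],
    "Substring": ["Cash Payment", "Credit Card Payment"],
    "Category": ["Category #1", "Category #2"],
})
long_str = "Cash Payment by Customer"

# Boolean mask: True where the row's substring occurs inside long_str.
mask = df["Substring"].apply(lambda s: s in long_str)
matches = df.loc[mask]
print(matches)

# Optional (an assumption, not in the original answer): ignore case
# by lowercasing both sides before the membership test.
long_lower = long_str.lower()
mask_ci = df["Substring"].str.lower().apply(lambda s: s in long_lower)
```

Note that `apply` calls the lambda once per row, so this is a plain Python loop under the hood. That is fine for small lookup tables like this one.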

Tags: python, string, contains, dataframe, pandas