Difference between .find() and the 'in' operator in Python

Question

I am working with a pandas DataFrame named filteredDS.

Goal:

Find all rows whose question column contains the word 'King'.

When I add the column king_quest using the in operator, like this:

filteredDS['king_quest'] = filteredDS.question.apply(lambda x: x if ' King ' in x else None).reset_index(drop = True)
filtered_king_df = filteredDS[~filteredDS.king_quest.isnull()].reset_index()
print(filtered_king_df)

I get a DataFrame of about 2000 rows. When I add it using the .find() function instead, like this:

filteredDS['king_quest'] = filteredDS.question.apply(lambda x: x if x.find('king') else None).reset_index(drop = True)
filtered_king_df = filteredDS[~filteredDS.king_quest.isnull()].reset_index()
print(filtered_king_df)

I get a DataFrame of about 3000 rows.

Note: in both cases, every row in the question column contains the word 'king'.

Can you tell me why this happens?

Answer 1

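There may be several issues here.

Your two searches look for different values: one looks for ' King ' (capitalized, with a space on each side), the other for 'king'.
x.find('king') returns the index of the first match, or -1 if there is none. Used directly as a condition, -1 is truthy and 0 is falsy, so your lambda keeps every row that does not contain 'king' and drops only the rows that start with it. If you want to test with find(), compare against -1, for example x.find('king') != -1 (not > 0, which would miss a match at index 0), but that is less readable than 'king' in x.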
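
To make the two points above concrete, here is a minimal sketch in plain Python. The questions list is made-up sample data standing in for the question column (it is not the asker's dataset), and the variable names are only for illustration.

```python
# Hypothetical sample data standing in for the question column.
questions = [
    "king Arthur ruled where?",        # 'king' at index 0
    "Who was the King of France?",     # capitalized, with spaces around it
    "Name a famous viking explorer",   # 'king' inside another word
    "What is the capital of Peru?",    # no match at all
]

# find() used as a condition: 0 is falsy, -1 (not found) is truthy.
kept_by_find = [q for q in questions if q.find("king")]

# find() compared explicitly against -1.
kept_by_find_fixed = [q for q in questions if q.find("king") != -1]

# Plain membership test, same result as the fixed find() version.
kept_by_in = [q for q in questions if "king" in q]

# The first filter from the question: different case, extra spaces.
kept_by_in_spaced = [q for q in questions if " King " in q]

print(len(kept_by_find), len(kept_by_find_fixed),
      len(kept_by_in), len(kept_by_in_spaced))
```

On this sample the bare find() filter keeps three rows, including the one with no match at all, while the explicit comparison and the in test agree with each other. The ' King ' filter picks a different row entirely, which is the kind of mismatch that can produce different row counts.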

Answer 2

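The in operator

The in operator checks whether a value is present in a sequence. It evaluates to True if the value is found in the given sequence and False otherwise.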

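# Python program to illustrate
# finding a common member of two lists
# using the 'in' operator
list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9]
for item in list1:
    if item in list2:
        print("found")
        break  # stop at the first common member so the else branch is skipped
else:
    # runs only when the loop finished without hitting break
    print("not found")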

The find() method

The find() method returns the lowest index at which the substring is found in the given string. If the substring is not found, it returns -1.

word = 'the tea looks good, this tea is for me;Thank you'

# returns first occurrence of Substring
result = word.find('tea')
print("Substring 'tea' found at index:", result)
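
For comparison, a small sketch of the two side by side on the same string: in answers "is it there?" with a bool, while find() answers "where is it?" with an index. The substring 'coffee' is just an arbitrary example of something that is absent.

```python
word = 'the tea looks good, this tea is for me;Thank you'

has_tea = 'tea' in word            # membership test gives a bool
tea_index = word.find('tea')       # index of the first occurrence

has_coffee = 'coffee' in word      # absent substring
coffee_index = word.find('coffee') # absent substring gives -1

print(has_tea, tea_index)        # True 4
print(has_coffee, coffee_index)  # False -1
```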

Answer 3

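Here is a more complete explanation of the find() method:

It looks for a substring and returns the index of the first occurrence of that substring
It does not raise an error when the substring does not exist; in that case it returns -1
It is a string method; it is not available on lists or other general sequences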

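It has a two-parameter form, find(string, position), where string is the substring to look for and position is the index at which the search starts. If you do not specify a position, find() starts from the beginning of the string.

It also has a three-parameter form: the first two parameters are the same as in the two-parameter form, and the third is the index at which the search stops. You can think of it as having this signature: find(string, starting_position, ending_position), where ending_position itself is not included in the search.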
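
A short sketch of the start and end arguments, using a made-up string s. Note that the returned index is always relative to the whole string, not to the start position.

```python
s = "abcabcabc"

first = s.find("abc")            # search the whole string
second = s.find("abc", 1)        # start searching at index 1
too_short = s.find("abc", 1, 5)  # only s[1:5] == "bcab" is searched
in_range = s.find("abc", 1, 6)   # s[1:6] == "bcabc" contains a full match

print(first, second, too_short, in_range)  # 0 3 -1 3
```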

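I have not used filteredDS before and am not very familiar with it, but I hope the example below helps you work out how to apply find() to your case. This code prints the index of every occurrence of the substring "it" in the string text.

text = " Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software. It is widely used."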

index = text.find("it")
while index != -1:
    print(index)
    index = text.find("it", index+1)

Tags: python, find, in-operator, pandas