How to get the start and end indexes of a word using re.compile and finditer()

Question

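For a string like

comment = "love programming javascripts!!!"

I used re.compile() and finditer() to get the start and end indexes of "script" inside 'javascripts!!!'. What I need are the start and end indexes of the complete word 'javascripts!!!' that contains "script".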

Answer 1

How about doing it without a regex?

comment = "I love javascript!!!"

all_scripts = [word for word in comment.split() if "script" in word]

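# Start and end indexes of the first matching word (end is inclusive)

start = comment.find(all_scripts[0])
end = start + len(all_scripts[0]) - 1

print(start, end)  # 7 19

The code above gives the indexes of the first word containing "script".

If you want the indexes of every such word rather than only the first, try this instead: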

comment = "I love javascript!!!"

all_scripts = [word for word in comment.split() if "script" in word]

all_indexes = [(comment.find(word), comment.find(word) + len(word) - 1) for word in all_scripts]

print(all_indexes)  # [(7, 19)]
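
One caveat with the list comprehension above: comment.find(word) always returns the first occurrence, so if the same word appears twice, both entries get the same indexes. A minimal sketch of one way around that, assuming words are separated by whitespace, is to remember where the previous word ended and search from there. The helper name find_word_spans and the longer sample string are made up for illustration.

```python
def find_word_spans(text, needle):
    """Return (start, end) pairs, end inclusive, for each
    whitespace-separated word in text that contains needle."""
    spans = []
    pos = 0  # position where the search for the next word begins
    for word in text.split():
        # Searching from pos gives repeated words their own position
        start = text.find(word, pos)
        pos = start + len(word)
        if needle in word:
            spans.append((start, pos - 1))
    return spans


# Hypothetical sample with a repeated word
comment = "I love javascript!!! typescript and javascript!!!"
print(find_word_spans(comment, "script"))  # [(7, 19), (21, 30), (36, 48)]
```

The end index is kept inclusive here to match the code above.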

Answer 2

Using a regex:

import re

string = 'love programming javascripts!!!'
p = re.compile(r'[\w!]+')

for i in p.finditer(string):
    # Keep only the matched words that contain "script"
    if 'script' in i.group():
        print(i.start(), i.end())  # 17 31 (end() is exclusive)

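You could also use p = re.compile(r'\S+') as the pattern. It depends on what you need: [\w!]+ matches only runs of word characters and exclamation marks, while \S+ matches any run of non-whitespace characters, so other punctuation attached to the word is included as well.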
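
If you would rather not filter in Python, the substring test can probably be folded into the pattern itself. This is only a sketch, assuming a "word" means any run of non-whitespace characters; the names text, pattern and spans are just for illustration.

```python
import re

text = 'love programming javascripts!!!'

# \S* on both sides extends the match to the whole non-whitespace run
# around the literal "script"
pattern = re.compile(r'\S*script\S*')

spans = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]
print(spans)  # [('javascripts!!!', 17, 31)]
```

As with the loop above, m.end() is exclusive, so subtract 1 if you need the index of the last character.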


python

python-re