Finding substrings in a string with multiple '\n's

Question

My goal is to extract the text between search_term_start and search_term_end. The problem is that this only works when the string contains no '\n' characters. The code below raises an AttributeError.

import re

logs = 'cut-this-out \n\n givemethisstring \n\n and-this-out-too'

search_term_start = '''cut-this-out'''
search_term_end = '''and-this-out-too'''

total_pages = re.search(search_term_start + '(.*)' + search_term_end, logs)
print(total_pages.group(1))

If I remove the '\n' characters from logs, the program runs as I expect:

import re

logs = 'cut-this-out givemethisstring and-this-out-too'

search_term_start = '''cut-this-out'''
search_term_end = '''and-this-out-too'''

total_pages = re.search(search_term_start + '(.*)' + search_term_end, logs)
print(total_pages.group(1))

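It seems I cannot search for a substring in a string that contains '\n' characters. How can I retrieve and save that substring without removing the '\n' characters from the original string?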

Answer 1

re.DOTALL is the flag you are looking for. As the re documentation puts it:

Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline. Corresponds to the inline flag (?s).
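To see why the original code fails, here is a minimal sketch using the same logs string from the question. The variable names no_flag, with_flag, and inline are only illustrative. Without the flag, re.search finds nothing and returns None, and calling .group(1) on None is what raises the AttributeError.

```python
import re

logs = 'cut-this-out \n\n givemethisstring \n\n and-this-out-too'
pattern = 'cut-this-out(.*)and-this-out-too'

# Without DOTALL, '.' cannot cross a newline, so the pattern never
# reaches the end marker and re.search returns None.
no_flag = re.search(pattern, logs)
print(no_flag)  # None; no_flag.group(1) would raise AttributeError

# With DOTALL, '.' also matches '\n', so the capture spans the newlines.
with_flag = re.search(pattern, logs, re.DOTALL)
print(repr(with_flag.group(1)))

# The inline form (?s) at the start of the pattern is equivalent.
inline = re.search('(?s)' + pattern, logs)
print(inline.group(1) == with_flag.group(1))
```

Note that the captured group still includes the surrounding spaces and newlines.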

Try this:

import re

logs = 'cut-this-out \n\n givemethisstring \n\n and-this-out-too'

search_term_start = '''cut-this-out'''
search_term_end = '''and-this-out-too'''


c = re.compile(search_term_start + r'(.*)' + search_term_end, re.DOTALL)
print(c.search(logs).group(1))
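If the markers might contain regex metacharacters, or the end marker could appear more than once, a slightly more defensive variant may be useful. The following is only a sketch under those assumptions: extract_between is a hypothetical helper name, not part of the re module. It escapes both markers with re.escape, uses a non-greedy (.*?) so the match stops at the first end marker, and returns None instead of raising when nothing matches.

```python
import re

def extract_between(text, start, end):
    """Return the stripped text between start and end, or None if absent.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # re.escape makes the markers match literally, even if they contain
    # characters such as '.', '(' or '*'.
    pattern = re.compile(
        re.escape(start) + r'(.*?)' + re.escape(end),
        re.DOTALL,  # let '.' match newlines as well
    )
    match = pattern.search(text)
    # Check for None before calling .group() to avoid an AttributeError.
    return match.group(1).strip() if match else None

logs = 'cut-this-out \n\n givemethisstring \n\n and-this-out-too'

print(extract_between(logs, 'cut-this-out', 'and-this-out-too'))
print(extract_between(logs, 'not-in-logs', 'and-this-out-too'))
```

The call to .strip() is a design choice: drop it if the surrounding whitespace and newlines should be preserved.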


python

python-3.x

python-re