Find first ReGex pattern following a different pattern

Question

Objective: Find a second pattern, and count it as a match only if it is the first occurrence of that pattern following a different pattern.

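Background:

I am using Python 2.7 regular expressions.

I have a specific regex match that is giving me trouble. I am trying to capture the text between the square brackets in the example below.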

  Sample comments:

    [98 g/m2 Ctrl (No IP) 95 min 340oC         ]

    [    ]

I need this line:

98 g/m2 Ctrl (No IP) 95 min 340oC

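The problem is that the indeterminate number of spaces, tabs, and newlines between the search pattern Sample comments: and the match I want is giving me trouble.

Best attempt:

I can easily match the first part,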

match = re.findall(r'Sample comments:[.+\n+]+', string)

but I cannot extend the match to capture the part between the square brackets that I want,

match = re.findall(r'Sample comments:[.+\n+]+\[(.+)\]', string)

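My thoughts:

Is there a way to use ReGex to find the first instance of the pattern \[(.+)\] after the pattern Sample comments: has matched? Or is there a more robust way to find the text between the square brackets in my example case?

Thanks,

Michael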

Answer 1

I am not sure I understood your question correctly, but re.findall(r'Sample comments:[^\[]*\[([^\]]*)\]', string) seems to work.

Or re.findall(r'Sample comments:[^\[]*\[[ \t]*([^\]]*?)[ \t]*\]', string) if you want to strip the trailing spaces from your line?
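As a quick check, here is a minimal sketch that runs both patterns against the sample from the question. The names `text`, `loose`, and `trimmed` are only illustrative, and the sample string is my reconstruction of the input, so the exact whitespace in your data may differ.

```python
import re

# Reconstructed sample input (an assumption about the real data):
# blank lines and indentation separate the label from the brackets.
text = (
    "Sample comments:\n"
    "\n"
    "    [98 g/m2 Ctrl (No IP) 95 min 340oC         ]\n"
    "\n"
    "    [    ]\n"
)

# [^\[]* skips everything that is not "[", including newlines,
# so the gap after the label can be any mix of whitespace.
loose = re.findall(r'Sample comments:[^\[]*\[([^\]]*)\]', text)

# Same idea, but [ \t]* on both sides of a lazy group keeps the
# padding inside the brackets out of the captured text.
trimmed = re.findall(
    r'Sample comments:[^\[]*\[[ \t]*([^\]]*?)[ \t]*\]', text
)

print(repr(loose[0]))    # captured text still ends with the padding
print(repr(trimmed[0]))  # captured text without the padding
```

Only the first bracketed block is returned, because the second one is not preceded by its own Sample comments: label.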

Answer 2

I suggest using

r'Sample comments:\s*\[(.*?)\s*]'

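See the regex demo and the IDEONE demo.

The point is that \s* matches zero or more whitespace characters, both vertical (line breaks) and horizontal. See the Python re reference: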

\s
When the UNICODE flag is not specified, it matches any whitespace character, this is equivalent to the set [ \t\n\r\f\v]. The LOCALE flag has no extra effect on matching of the space. If UNICODE is set, this will match the characters [ \t\n\r\f\v] plus whatever is classified as space in the Unicode character properties database.
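To make the quoted definition concrete, here is a small sketch comparing \s with . across a line break. The strings `whitespace` and `gap` are made-up test inputs, not data from the question. Note that the quote describes Python 2; in Python 3, \s on a str pattern also covers Unicode whitespace by default, which does not change the result here.

```python
import re

# The six characters listed in the quoted definition of \s.
whitespace = " \t\n\r\f\v"
matched = re.findall(r'\s', whitespace)
all_matched = len(matched) == len(whitespace)

# Hypothetical gap between the label and the opening bracket.
gap = "Sample comments:\n\n\t    ["

# \s* crosses newlines, tabs, and spaces alike.
spans_gap = re.match(r'Sample comments:\s*\[', gap) is not None

# .* stops at the first newline (no DOTALL flag), so this fails.
dot_spans_gap = re.match(r'Sample comments:.*\[', gap) is not None

print(all_matched)
print(spans_gap)
print(dot_spans_gap)
```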

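Pattern details:

Sample comments: - a literal character sequence
\s* - 0 or more whitespace characters
\[ - a literal [
(.*?) - Group 1 (the value returned by re.findall), capturing 0+ characters other than a newline, as few as possible, up to the first...
\s* - 0+ whitespace characters and
] - a literal ] (note that it does not need to be escaped outside a character class).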
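The pattern can be tried out with a short sketch like the one below. The sample string is my reconstruction of the input from the question, and `text`, `pattern`, and `first` are illustrative names only.

```python
import re

# Reconstructed sample input (an assumption about the real data).
text = (
    "Sample comments:\n"
    "\n"
    "    [98 g/m2 Ctrl (No IP) 95 min 340oC         ]\n"
    "\n"
    "    [    ]\n"
)

pattern = re.compile(r'Sample comments:\s*\[(.*?)\s*]')

# findall returns the contents of Group 1 for every match.
results = pattern.findall(text)
print(results)

# search stops at the first match, which is handy when only the
# first bracketed block after the label is of interest.
m = pattern.search(text)
first = m.group(1) if m else None
print(repr(first))
```

Because the lazy group is followed by \s*, the padding before the closing bracket is consumed outside the group and the captured text comes back already trimmed. The second block, [    ], is ignored since no Sample comments: label precedes it.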

python

regex

python-2.7

regex-lookarounds