Regular expressions: match text inside delimiters

Question

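I will try to be as clear as possible, and I hope this question helps others who run into the same problem.

In my file.txt, I want to match only the text between "(" and ")", using the commands "grep" and "sed". Example: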

This is my line and (this is the text to match!), and bla bla bla...

But some lines may look like this:

Another line (text to match 1;) something else, (text to match 2 )

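The problem:
Expressions such as grep '(.*)' file.txt or sed 's/(.*)//' <file.txt
will not work, because .* is greedy by default. On the second example, that means the match runs from the first ( to the last ):
Another line (text to match 1;) something else, (text to match 2 )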
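
To make the greedy behaviour concrete, here is a minimal sketch. It assumes a grep that supports the -o option (GNU and BSD grep do, but POSIX does not require it), and the function name greedy_match is only for illustration:

```shell
# Sample line from the second example above.
line='Another line (text to match 1;) something else, (text to match 2 )'

# In a basic regular expression, unescaped ( and ) are literal characters.
# -o prints only the matched part of each line.
greedy_match() {
  grep -o '(.*)'
}

printf '%s\n' "$line" | greedy_match
# prints: (text to match 1;) something else, (text to match 2 )
```

The single match runs from the first ( to the last ), so both parenthesised texts and everything between them end up in one match.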

The solution must be a non-greedy match, so I tried the non-greedy quantifier ?:

grep -E '\(.*?\)' file.txt

or with sed:

sed -r 's/\(.*\)//' <file.txt

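In this case we need -E and -r so that grep and sed read extended regular expressions, and we also need a \ before ( and ).
But even this solution does not seem to work, and I don't know why.
Then I tried something like: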

grep '(.*)[^(]*' file.txt

to find lines that contain only one "(text to match)". If, for example, I wanted to rewrite the text inside the (), the syntax would be:

sed 's/(.*)\([^(]*\)/(new text)/'<file.txt

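Although this seems to work, I found that on some lines '(.*)[^(]*' matches the same thing as the old (.*) (which is a mystery...).

Is there a better solution?

Thanks in advance.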

Answer 1

This regular expression should work:

/\(([^\)]+)\)/g

As you can see, it works:

https://regex101.com/r/rR2uF3/1
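
The /g flag and the \) inside the brackets are PCRE/JavaScript conventions, as used on regex101. For grep, a rough equivalent might look like the sketch below. It assumes a grep that supports -o (GNU and BSD grep do), and match_parens is just an illustrative name. Inside a POSIX bracket expression a backslash is a literal character, so the class is written [^)] rather than [^\)]:

```shell
line='Another line (text to match 1;) something else, (text to match 2 )'

# -E: extended regular expressions, where \( and \) are literal parentheses.
# -o: print every match on its own line (this plays the role of /g).
match_parens() {
  grep -oE '\([^)]+\)'
}

printf '%s\n' "$line" | match_parens
# prints:
# (text to match 1;)
# (text to match 2 )
```

Because of the +, an empty pair () is not reported. Use * instead of + if empty pairs should match too.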

Answer 2

With GNU awk this is quite simple:

s='Another line (text to match 1;) something else, (text to match 2 )'

awk 'BEGIN{ FPAT="\\([^)]*\\)" } {for (i=1; i<=NF; i++) print $i}' <<< "$s"
(text to match 1;)
(text to match 2 )
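
FPAT is a gawk extension. Where only a POSIX awk is available, the same result could probably be obtained with a match() loop. This is a sketch under that assumption, and extract_parens is a made-up name:

```shell
s='Another line (text to match 1;) something else, (text to match 2 )'

extract_parens() {
  awk '{
    rest = $0
    # match() sets RSTART and RLENGTH for the leftmost match in rest.
    while (match(rest, /\([^)]*\)/)) {
      print substr(rest, RSTART, RLENGTH)
      # Continue searching after the text just printed.
      rest = substr(rest, RSTART + RLENGTH)
    }
  }'
}

printf '%s\n' "$s" | extract_parens
# prints:
# (text to match 1;)
# (text to match 2 )
```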

Answer 3

You just need:

$ cat file
Another line (text to match 1;) something else, (text to match 2 )

$ sed 's/(\([^)]*\)/(foo/' file
Another line (foo) something else, (text to match 2 )

$ sed 's/(\([^)]*\)/(foo/2' file
Another line (text to match 1;) something else, (foo)

$ sed 's/(\([^)]*\)/(foo/g' file
Another line (foo) something else, (foo)

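The non-greedy quantifier ? is never needed, and it is rarely useful enough to justify how much harder it makes your regular expression to read and understand. It is also not supported by all tools. When debugging a "greedy" match problem, always start by changing .* (if present) to [^x]*, where x is whatever character immediately follows the string you are interested in, which is ) in this case.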
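
As a sketch of that rule with a different delimiter (the sample line and the variable names are made up for illustration), x becomes " when the text of interest sits between double quotes:

```shell
line='say "one" then "two"'

# Greedy: .* runs from the first quote to the last one.
greedy=$(printf '%s\n' "$line" | sed 's/".*"/"X"/')

# Bounded: [^"]* cannot cross a closing quote, so each pair is replaced separately.
bounded=$(printf '%s\n' "$line" | sed 's/"[^"]*"/"X"/g')

printf '%s\n' "$greedy"    # say "X"
printf '%s\n' "$bounded"   # say "X" then "X"
```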

regex

grep

sed

greedy