Why does this sed command output "[18" instead of "18"?
echo [18%] | sed s:[\[%\]]::g
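I'm really confused, because the exact same pattern successfully replaces [18%] in vim. I also tested the expression in a few online regex tools, and they all say it will match [, %, and ] as expected. I tried adding the -r option, and I tried wrapping the substitution command in quotes.
I know I could use other commands to accomplish this, but I want to know why it behaves this way so I can understand sed better.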
$ echo [18%] | sed s:[][%]::g
18
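sed supports POSIX.2 regular expression syntax: basic (BRE) syntax by default, and extended syntax with the -r flag. In POSIX.2 syntax, basic or extended, you include a right square bracket in a character class by making it the first character in the class. Backslashes don't help.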
This is annoying, because nearly every other modern language and tool uses Perl or Perl-like regex syntax. POSIX syntax is an anachronism.
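To make the difference concrete, here is a small sketch comparing the two patterns. The function names broken and fixed are made up for illustration, and the commands are quoted so the shell passes the pattern to sed unchanged. In the first pattern the bracket expression ends at the first ], so sed looks for one of \, [ or % followed by a literal ], and only "%]" matches.

```sh
# Original pattern: the list is \ [ % and the final ']' is a literal
# character that must follow it, so only "%]" is removed.
broken() { printf '%s\n' "$1" | sed 's:[\[%\]]::g'; }

# ']' placed first is a member of the list, so the list is ] [ %.
fixed() { printf '%s\n' "$1" | sed 's:[][%]::g'; }

broken '[18%]'   # prints [18
fixed '[18%]'    # prints 18
```

Quoting does not change the result of the original command: unquoted, the shell strips the backslashes and sed receives [[%]], which still parses as a list followed by a literal ].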
You can read about the POSIX.2 syntax in the regex(7) manual page.
A bracket expression is a list of characters enclosed in "[]". It normally
matches any single character from the list (but see below). If the list begins
with '^', it matches any single character (but see below) not from the rest of
the list. If two characters in the list are separated by '-', this is shorthand
for the full range of characters between those two (inclusive) in the collating
sequence, for example, "[0-9]" in ASCII matches any decimal digit. It is illegal(!)
for two ranges to share an endpoint, for example, "a-c-e". Ranges are
very collating-sequence-dependent, and portable programs should avoid relying on
them.
To include a literal ']' in the list, make it the first character (following a
possible '^'). To include a literal '-', make it the first or last character, or
the second endpoint of a range. To use a literal '-' as the first endpoint of a
range, enclose it in "[." and ".]" to make it a collating element (see below).
With the exception of these and some combinations using '[' (see next paragraphs),
all other special characters, including '\', lose their special significance
within a bracket expression.
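The placement rules quoted above for ']', '^' and '-' can be tried out with a short sketch like the following. The function names keep_brackets and strip_listed are hypothetical, chosen only for this example.

```sh
# '^' negates the list, and ']' is still literal when it directly follows '^',
# so this deletes every character that is not a square bracket.
keep_brackets() { printf '%s\n' "$1" | sed 's/[^][]//g'; }

# ']' first and '-' last are both literal, so the list is ] % -
strip_listed() { printf '%s\n' "$1" | sed 's/[]%-]//g'; }

keep_brackets '[18%]'    # prints []
strip_listed '[18%-]'    # prints [18
```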