Ignore specific character in Npp regex

Question

I'm working with Notepad++-flavored regex. This...

Find: ([^`]{1,23} )
Replace: $0\n

...takes this input string...

Now is the time for all good men to come to the aid of the party.

...and produces this output string:

Now is the time for all

good men to come to the

aid of the party.

It splits the string into lines of 24 or fewer non-backtick (`) characters, breaking after a space. It only works if the last character of the input string is also a space.

This string...

Now is the time for all good men to █come to the aid█ of the party.

...splits differently:

Now is the time for all

good men to █come to

the aid█ of the party.

I'm looking for a way to skip the █ characters, that is, to process the input string as if the █ characters weren't there.

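[Note: the ` (backtick) character is reserved for surrounding text-formatting tags, which are inserted later. The █ characters will mark "this stretch of text will have tags inserted later", so they will eventually be stripped, but not yet. I'm using █ (full block) here to stand in for the Unicode 7F (DEL) character, because 7F doesn't display properly. If absolutely necessary, I can also use Perl-flavored regex in AHK.]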

These regex patterns fail to ignore █:

(([^`]|█?){1,23} )
((([^`])|(█)?){1,23} )
((([^`])|(?:█)){1,23} )

So, is there a way to do this?
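A likely reason the attempts above still count █: the class [^`] excludes only the backtick, so it matches █ itself, and in each alternation it is tried first. The short Python sketch below illustrates this with the original Find pattern. Python's re module is an assumption here (Notepad++ uses a different engine), and the variable names are only illustrative.

```python
import re

# [^`] rejects only the backtick, so the full block is accepted
# and uses up one of the 23 repetitions.
block_is_counted = re.fullmatch(r"[^`]", "\u2588") is not None

# The original Find pattern, applied to the second input string
# (with the trailing space the question says is required).
original = re.compile(r"[^`]{1,23} ")
text = "Now is the time for all good men to █come to the aid█ of the party. "
chunks = original.findall(text)

for chunk in chunks:
    print(repr(chunk))
```

The second chunk ends after "to " because the █ before "come" pushes "the" past the 23-character limit.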

Answer 1

You can use the following pattern:

(?:[^`█]█*){1,23}[ ]

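This matches any character other than a backtick or a full block, followed by zero or more full block characters, and lets that whole group repeat 1 to 23 times. Because each repetition must start with a non-block character, the full block characters do not count toward the {1,23} quantifier.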

Demo.
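For readers who want to try the pattern outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module behaves like the Notepad++ engine for this pattern, and the helper name wrap is hypothetical. Python writes the code point as \u2588 rather than \x{2588}.

```python
import re

# Each repetition is one counted character plus any full blocks that follow it.
PATTERN = re.compile(r"(?:[^`\u2588]\u2588*){1,23}[ ]")

def wrap(text: str) -> str:
    """Append a newline after every match, like Replace: $0\\n in Notepad++."""
    return PATTERN.sub(lambda m: m.group(0) + "\n", text)

# The input ends with a space, as the original pattern requires.
text = "Now is the time for all good men to █come to the aid█ of the party. "
wrapped = wrap(text)
print(wrapped)
```

With the blocks ignored, the breaks fall in the same places as in the block-free input: after "all", after "the", and at the end.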

You can also use the Unicode code point (which I think looks better):

(?:[^`\x{2588}]\x{2588}*){1,23}[ ]

Also, if the last character (of the last match) doesn't have to be a space, you can use:

(?:[^`\x{2588}]\x{2588}*){1,23}(?: |$)
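The end-of-input variant can be checked the same way. This is again a sketch that assumes Python's re module as a stand-in for Notepad++, with the name split_lines being hypothetical. The input here has no trailing space.

```python
import re

# Same counting group, but the chunk may end at a space or at end of input.
PATTERN_EOL = re.compile(r"(?:[^`\u2588]\u2588*){1,23}(?: |$)")

def split_lines(text: str) -> list[str]:
    """Return the matched chunks in order, trailing spaces included."""
    return PATTERN_EOL.findall(text)

text = "Now is the time for all good men to █come to the aid█ of the party."
pieces = split_lines(text)

for piece in pieces:
    print(repr(piece))
```

The final chunk, "aid█ of the party.", is matched by the $ branch, so the last line is no longer lost when the trailing space is missing.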


regex

notepad++