Perl Regex for mathjax syntax

Question

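I'm having trouble writing a Perl regex that changes \ characters according to the following rules:

the match sequence should start with \(
it should end with \)
any \ character inside that matched sequence should be replaced with a double backslash \\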

Sample text:

Se la \probabilit&agrave; dell'evento\ A &egrave; \(\frac{3}{4} \) e la
probabilit&agrave; dell'evento B &egrave; \(\frac{1}{4}\)&nbsp;
\(\frac{3}{4} +\frac{3}{4}\)&nbsp;.
\(\frac{1}{4} - \frac{3}{4}\)&nbsp;.
\(\frac{3}{16}\)&nbsp;.
\(\frac{1}{2}\)&nbsp;.

should become:

Se la \probabilit&agrave; dell'evento\ A &egrave; \\(\\frac{3}{4} \\) e la
probabilit&agrave; dell'evento B &egrave; \\(\\frac{1}{4}\\)&nbsp;
\\(\\frac{3}{4} +\\frac{3}{4}\\)&nbsp;.
\\(\\frac{1}{4} - \\frac{3}{4}\\)&nbsp;.
\\(\\frac{3}{16}\\)&nbsp;.
\\(\\frac{1}{2}\\)&nbsp;.

So far this is my best attempt:

s/(\\\()(.*)(\\)(.*)(\\\))/\\\\\($2\\\\$4\\\\\)/mg

which yields:

Se la \probabilit&agrave; dell'evento\ A &egrave; \\(\\frac{3}{4} \\) e la
probabilit&agrave; dell'evento B &egrave; \\(\\frac{1}{4}\\)&nbsp;
\\(\frac{3}{4} +\\frac{3}{4}\\)&nbsp;.
\\(\frac{1}{4} - \\frac{3}{4}\\)&nbsp;.
\\(\\frac{3}{16}\\)&nbsp;.
\\(\\frac{1}{2}\\)&nbsp;.

As you can see,

\\(\frac{3}{4} +\\frac{3}{4}\\)&nbsp;.
\\(\frac{1}{4} - \\frac{3}{4}\\)&nbsp;.

are wrong: the first \frac in each line is not doubled.

How can I modify my regex to fit my needs?

Answer 1

I tested @sln's regex

s/(?x)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/g;

and it seems to work, although it is still something of a mystery to me.

Update with explanation

Formatted and tested:

 (?s)               # Inline Dot-All modifier
 (?:                # Cluster start
      (?! \A )           # Not beginning of string
      \G                 # G anchor - If matched before, start at end of last match
      [^\\]*             # Many non-escape's
      \K                 # Previous is not part of match
      \\                 # A lone escape
   |                   # or,
                         # Start of an opening '\('
      \\                 # A lone escape
      (?= \( )           #   followed by an open parenth
 )                  # Cluster end
 (?=                # Lookahead, each match validates a final '\)'
      .*? 
      (?<= \\ )
      \) 
 )
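
To make the breakdown above concrete, here is a minimal sketch that applies the regex to two lines of the question's sample text. It uses the `(?s)` form from the formatted breakdown, and assumes Perl 5.14 or later for the `/r` flag (`\K` itself needs 5.10). The variable names `$text` and `$fixed` are only illustrative.

```perl
use strict;
use warnings;

# Two lines taken from the sample text in the question.
# A quoted heredoc keeps every backslash literal.
my $text = <<'END';
Se la \probabilit&agrave; dell'evento\ A &egrave; \(\frac{3}{4} \) e la
\(\frac{3}{4} +\frac{3}{4}\)&nbsp;.
END

# /g repeats the match; \G chains each match to the end of the previous one.
# /r returns the modified copy and leaves $text untouched.
my $fixed = $text =~ s/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/gr;

print $fixed;
```

The backslashes in `\probabilit` and `dell'evento\` come before the first `\(`, so neither alternative can match them: the `\G` branch is not active yet, and the other branch requires a following `(`.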

Answer 2

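Posting an updated version of my original regex.

The original validated the closing \) at the end for every escape.
Looking at it again, it can be sped up by doing that validation only
once, when it finds the start of a block.

At the bottom is a benchmark comparing the two methods.

Updated regex:

$str =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;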

Formatted and tested:

 (?s)               # Dot-All modifier
 (?:                # Cluster start
      (?! \A )           # Not beginning of string
      \G                 # G anchor - If matched before, start at end of last match
      (?! \) )           # Last was an escape, so ')' ends the block
      [^\\]*             # Many non-escape's
      \K                 # Previous is not part of match
      \\                 # A lone escape
   |                   # or,
                         # New Block Check - 
      \\                 # A lone escape then,
      (?=                # One time Validation:
           \(                 #  an opening '('
           .*?                #  anything
           \\ \)              #  then a final '\)'
      )                  # -------------
 )                  # Cluster end
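
Besides speed, the `(?! \) )` check appears to change behavior on text that sits between two blocks. The sketch below runs both regexes on a made-up input (the string in `$input` is an assumption, not from the question) with a stray `\bar` between two math blocks. As far as I can tell, the original regex keeps chaining past the closing `\)` and doubles that stray backslash, while the updated one stops at the end of each block. It assumes Perl 5.14 or later for `/r`.

```perl
use strict;
use warnings;

# Hypothetical input: two math blocks with a stray "\bar" between them.
my $input = '\(a\) foo \bar \(b\)';

# Original regex: every match only checks that some '\)' lies further ahead.
my $old = $input =~ s/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/\\\\/gr;

# Updated regex: validates once per block; (?!\)) ends the chain at '\)'.
my $new = $input =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/gr;

print "old: $old\n";
print "new: $new\n";
```

If text outside `\( ... \)` can contain backslashes, as in the question's `\probabilit`, the updated regex looks like the safer choice.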

Benchmark:

Sample: \( \\\\\\\\\\\\\\\\\\\\\\\\\\\\\ \)

Results

New Regex:   (?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   31
Elapsed Time:    1.25 s,   1253.92 ms,   1253924 µs


Old Regex:   (?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   31
Elapsed Time:    3.95 s,   3952.31 ms,   3952307 µs
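
The timings above presumably come from a separate regex benchmarking tool. A rough way to repeat the comparison in Perl itself is sketched below with the core `Benchmark` module. It assumes the sample is `\(`, 29 backslashes, `\)`, which is consistent with the 31 matches reported for each regex. The helper `count_matches` and the iteration count are illustrative choices, and absolute timings will differ from the figures above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sample: '\( ' + 29 backslashes + ' \)' (31 backslashes in total).
my $sample = '\( ' . ('\\' x 29) . ' \)';

my $new_re = qr/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/;
my $old_re = qr/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/;

# s///g returns the number of substitutions made, i.e. the match count.
sub count_matches {
    my ($re) = @_;
    my $copy = $sample;                  # work on a copy each time
    my $n = ($copy =~ s/$re/\\\\/g);
    return $n || 0;
}

printf "new: %d matches, old: %d matches\n",
    count_matches($new_re), count_matches($old_re);

# Arbitrary iteration count; raise it for steadier timings.
cmpthese(20_000, {
    new => sub { count_matches($new_re) },
    old => sub { count_matches($old_re) },
});
```

`cmpthese` prints a table of rates and relative percentages, which makes it easy to see whether the one-time validation pays off on a given Perl build.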
