How can I translate this Perl5/PCRE to Perl 6 regex?

Question

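Just to get this out of the way, I would use index, substr or similar, as they are the obvious solution for my specific case, but I'm making a grammar and so I can only use regex. :(

That said, advice on translating Perl5/PCRE regexes to Perl 6 regexes is good SO content anyway, because Perl 6 is gaining popularity and its regex engine is very different.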

Here's a regex to only match a string which doesn't contain any of a given list of characters.

^(?:(?!\/).)*$
^            # assert position at start of string
(?:          # begin a noncapturing group 
   (?!       # negative lookahead: following regex must not match the string
      \/     # literal forward slash
    )        # end negative lookahead
    .        # any character, once
 )*          # the previous noncapturing group, 0..Inf times
 $           # assert position at end of string

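Obviously, this doesn't work in Perl 6 for a number of reasons.

For the reason stated above, I'd like to use it in Perl 6. Here's what I tried to translate it to, based on CTRL-F ing the perl6 regex docs for "non capturing" and "negative lookahead":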

[ \/ <!before .*> \/ <!after .*> || .? ]*

And the breakdown (I think?):

[       # begin a noncapturing group which apparently look like a charclass in p6
\/      # a literal forward slash  
<!before .*> # negative lookahead for the immediately preceding regex (literal /)
\/      # a literal /
<!after .*>  # negative lookbehind for the immediately preceding regex
|| .?   # force this to be a noncapturing group, not a charclass
]*      # end noncapturing group and allow it to match 0..Inf times

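I implemented it as my regex not-in { ... } and then used it as /^<not-in>$/. However, it returns Nil for every string, which means it isn't working properly.

I haven't been able to find a Perl 6 equivalent of http://regex101.com, so playing with it isn't as easy as it would be with Perl 5.

How do I translate this to Perl 6?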

Answer 1

A literal translation of your original regex ^(?:(?!\/).)*$ to Perl 6 syntax is:

^ [ <!before \/> . ]* $

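The direct translation is simple enough:

(?:...) is replaced with [...]
(?!...) is replaced with <!before ...>
the x modifier (insignificant whitespace) is on by default

Everything else stays the same in this example.

I tested it with a simple: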

say "Match" if "ab/c" ~~ /^ [ <!before \/> . ]* $/; # doesn't match
say "Match" if "abc"  ~~ /^ [ <!before \/> . ]* $/; # Match
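Since the question is about a grammar, here is a minimal sketch of how the translated pattern might sit inside one. The names NoSlash and not-in are hypothetical, chosen only for illustration, and token is used instead of regex on the assumption that this pattern needs no backtracking.

```raku
# Hypothetical grammar; NoSlash and not-in are illustrative names only.
grammar NoSlash {
    # The whole input must consist of characters accepted by not-in.
    token TOP    { ^ <not-in> $ }
    # Zero or more characters, none of which is a forward slash.
    token not-in { [ <!before \/> . ]* }
}

say so NoSlash.parse("abc");    # True
say so NoSlash.parse("ab/c");   # False
```

Grammar.parse returns a match object on success and Nil on failure, so prefixing it with so reduces the result to a Bool.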

Answer 2

Short answer

Regex that matches only strings lacking a forward slash: /^ <-[ / ]>* $/

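/    start of regex
^    beginning of string
<-[  open a negated character class (without the -, this would be a normal character class)
/    the character the class will not match
]>   close the character class
*    zero or more "copies" of this class
$    end of string
/    end of regex

Whitespace in Perl 6 regexes is ignored by default.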

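Full answer

If I understand correctly, you're just trying to match a string that does not contain a forward slash. In that case, just use a negated character class.

A character class containing a and b would be written: <[ab]>

A character class containing anything except a or b would be written: <-[ab]>

A character class containing anything except / would be written <-[ / ]>, and the regex for making sure that no character in the string is a forward slash would be /^ <-[ / ]>* $/.

This code matches when the string lacks a forward slash and doesn't match when it contains one: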

say "Match" if "abc/" ~~ /^ <-[ / ]>* $/; # Doesn't match
say "Match" if "abcd" ~~ /^ <-[ / ]>* $/; # Matches

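The preferred way to check that a single character is absent is the index function. However, if you want to exclude more than one character, just use a negated character class listing all the characters you don't want to find in the string.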
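A small sketch of both approaches side by side. The variable names are made up for illustration, and the backslash is just an example of a second excluded character.

```raku
# index returns the position of the first occurrence, or Nil if absent.
my $with    = "ab/c";
my $without = "abc";
say $with.index('/').defined;      # True
say $without.index('/').defined;   # False

# Negated class excluding two characters: forward slash and backslash.
my $no-slashes = /^ <-[ / \\ ]>* $/;
say so "abc"   ~~ $no-slashes;     # True
say so "ab/c"  ~~ $no-slashes;     # False
say so "a\\bc" ~~ $no-slashes;     # False
```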

Answer 3

Just to get this out of the way

Your question starts with:

Just to get this out of the way, I would use index, substr or similar, as they are the obvious solution for my specific case but I'm making a grammar and so I can only use regex. :(

To be pedantic, you can use them. In fact, you can embed arbitrary code in Perl 6 regexes.

A typical Perl 6 example:

/ (\d**1..3) <?{ $/ < 256 }> / # match an octet

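The \d**1..3 bit matches 1 to 3 decimal digits. The (...) parens around that bit tell Perl 6 to capture what it matched, which makes it available through the special match variable $/ (as $0).

The <?{ ... }> bit is a code assertion. If the code returns a true value, the regex continues. If not, it backtracks or fails.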
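To make the backtrack-or-fail behaviour concrete, here is a hedged, anchored variant of the octet example. The variable name $octet is hypothetical, and this sketch tests $0 (the first positional capture) rather than $/.

```raku
# Anchored so that a failed assertion cannot be rescued by a shorter match.
my $octet = /^ (\d ** 1..3) <?{ $0 < 256 }> $/;

say so "255" ~~ $octet;   # True
say so "7"   ~~ $octet;   # True
say so "256" ~~ $octet;   # False
```

For "256" the assertion rejects the three-digit capture. The engine then backtracks to "25" and "2", which pass the assertion but cannot reach the end-of-string anchor, so the overall match fails.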

Using index and friends inside a regex (in this case I've picked substr-eq) is cumbersome and arguably insane. But it's doable:

say "a/c" ~~ / a <?{ $/.orig.substr-eq: '/', $/.to }> . c /;
say "abc" ~~ / a <?{ $/.orig.substr-eq: '/', $/.to }> . c /

displays:

｢a/c｣
Nil

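(Calling .orig on a match object returns the original string that was, or is being, matched against. Calling .to returns the index in that original string that the match reached or has reached so far; "abc" ~~ / a { say $/.orig, $/.to } bc / displays abc1.)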
