Why does .NET Regex with the ECMAScript option support \A?

I have a .NET Standard 2.1 C# application that needs to run regular expressions in the ECMAScript flavor.

According to the MSDN documentation, I can use RegexOptions.ECMAScript:

Enables ECMAScript-compliant behavior for the expression.

I know that ECMAScript does not support the \A anchor (according to this link, and from trying Regex101 with the ECMAScript option). But .NET does seem to support it. Example:

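using System;
using System.Text.RegularExpressions;

// \A is a .NET start-of-string anchor; the pattern is compiled with the ECMAScript option.
Regex ecmaRegex = new Regex(@"\A\d{3}", RegexOptions.ECMAScript);
var matches = ecmaRegex.Matches("901-333-");

Console.WriteLine($"number of matches: {matches.Count}"); // number of matches: 1
Console.WriteLine($"The match: {matches[0]}"); // The match: 901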

I expected no match at all. What am I missing?

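You need to look further, in the "ECMAScript Matching Behavior" article, for the answer.

This option does not redefine the meaning of the .NET-specific anchors; they are still supported.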
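
As a quick check of that claim, here is a minimal sketch contrasting `^` with `\A` when RegexOptions.ECMAScript is combined with RegexOptions.Multiline. The class name, helper method, and two-line input are made up for illustration; only the Regex calls come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Counts matches of a pattern under ECMAScript + Multiline options.
    // (Hypothetical helper, not part of any library.)
    public static int CountMatches(string pattern, string input)
    {
        var options = RegexOptions.ECMAScript | RegexOptions.Multiline;
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        string input = "901-333-\n555-123-"; // illustrative two-line input

        // ^ with Multiline matches at the start of every line.
        Console.WriteLine(CountMatches(@"^\d{3}", input));  // 2

        // \A stays a start-of-string anchor, so Multiline does not affect it.
        Console.WriteLine(CountMatches(@"\A\d{3}", input)); // 1
    }
}
```

So if the goal is to reject patterns that a JavaScript engine would interpret differently, that check probably has to happen outside of Regex, since RegexOptions.ECMAScript does not do it.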

The behavior of ECMAScript and canonical regular expressions differs in three areas: character class syntax, self-referencing capturing groups, and octal versus backreference interpretation.

Character class syntax. Because canonical regular expressions support Unicode whereas ECMAScript does not, character classes in ECMAScript have a more limited syntax, and some character class language elements have a different meaning. For example, ECMAScript does not support language elements such as the Unicode category or block elements \p and \P. Similarly, the \w element, which matches a word character, is equivalent to the [a-zA-Z_0-9] character class when using ECMAScript and [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}] when using canonical behavior. For more information, see Character Classes.

Self-referencing capturing groups. A regular expression capture class with a backreference to itself must be updated with each capture iteration.

Resolution of ambiguities between octal escapes and backreferences.

Regular expression: \0 followed by 0 to 2 octal digits
Canonical behavior: Interpret as an octal. For example, \044 is always interpreted as an octal value and means "$".
ECMAScript behavior: Same behavior.

Regular expression: \ followed by a digit from 1 to 9, followed by no additional decimal digits
Canonical behavior: Interpret as a backreference. For example, \9 always means backreference 9, even if a ninth capturing group does not exist. If the capturing group does not exist, the regular expression parser throws an ArgumentException (https://docs.microsoft.com/en-us/dotnet/api/system.argumentexception).
ECMAScript behavior: If a single decimal digit capturing group exists, backreference to that digit. Otherwise, interpret the value as a literal.

Regular expression: \ followed by a digit from 1 to 9, followed by additional decimal digits
Canonical behavior: Interpret the digits as a decimal value. If that capturing group exists, interpret the expression as a backreference. Otherwise, interpret the leading octal digits up to octal 377; that is, consider only the low 8 bits of the value. Interpret the remaining digits as literals. For example, in the expression \3000, if capturing group 300 exists, interpret as backreference 300; if capturing group 300 does not exist, interpret as octal 300 followed by 0.
ECMAScript behavior: Interpret as a backreference by converting as many digits as possible to a decimal value that can refer to a capture. If no digits can be converted, interpret as an octal by using the leading octal digits up to octal 377; interpret the remaining digits as literals.
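
The three areas quoted above can be probed with a short sketch like the following. The class name, the `Parses` helper, and the sample inputs and patterns are illustrative assumptions, not taken from the question; the expected values in the comments follow from the quoted rules.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EcmaDifferences
{
    // True if the pattern can be constructed without a parse error.
    // (Hypothetical helper; parse errors surface as ArgumentException or a subclass.)
    public static bool Parses(string pattern, RegexOptions options)
    {
        try { _ = new Regex(pattern, options); return true; }
        catch (ArgumentException) { return false; }
    }

    public static void Main()
    {
        const RegexOptions Ecma = RegexOptions.ECMAScript;

        // 1. Character classes: \w is ASCII-only under ECMAScript.
        //    "\u00E9" is a lowercase letter (e with acute accent).
        Console.WriteLine(Regex.IsMatch("\u00E9", @"\w"));       // True
        Console.WriteLine(Regex.IsMatch("\u00E9", @"\w", Ecma)); // False

        // 2. Self-referencing group: \1 appears inside group 1 itself.
        //    Canonically, \1 is still unset on the first iteration and cannot match.
        string input = "aa aaaa aaaaaa ";
        string selfRef = @"((a+)(\1) ?)+";
        Console.WriteLine(Regex.Match(input, selfRef).Success);       // False
        Console.WriteLine(Regex.Match(input, selfRef, Ecma).Success); // True

        // 3. Octal vs. backreference: \9 with no ninth capturing group.
        Console.WriteLine(Parses(@"(a)\9", RegexOptions.None)); // False (parse error)
        Console.WriteLine(Parses(@"(a)\9", Ecma));              // True (treated as a literal)

        // \0 followed by octal digits behaves the same in both modes: \044 is "$".
        Console.WriteLine(Regex.IsMatch("$", @"\044"));       // True
        Console.WriteLine(Regex.IsMatch("$", @"\044", Ecma)); // True
    }
}
```

None of these differences involves anchors, which is consistent with \A continuing to work under RegexOptions.ECMAScript.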