Why does a non-capturing group behave better than a capturing group when using lookbehind?

This does not work: (\d+\s*|(?<=\d+\s*)-\s*\d\s*)+

But this does work: ((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)

Failing tests (group 1 from the second pattern versus group 1 from the first):

"12-34" gives "12-34" (correct) versus "4"  (incorrect)
"1-23"  gives "1-23"  (correct) versus "3"  (incorrect)
"12-3"  gives "12-3"  (correct) versus "-3" (incorrect)

"123" and "1234" work with both patterns.

I did not test this in a console application; it is used from MSSQL with .NET 3.5:

C# DLL

using System.Data.SqlTypes; //SqlInt32, ...
using Microsoft.SqlServer.Server; //SqlFunction, ...
using System.Collections; //IEnumerable
using System.Collections.Generic; //List
using System.Text.RegularExpressions;
internal struct MatchResult {
    /// <summary>Which match or group this is</summary>
    public int ID;
    /// <summary>Where the match or group starts in the input string</summary>
    public int Pos;
    /// <summary>What string matched the pattern</summary>
    public string Match;
}
public class RE {
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, IsPrecise = true, SystemDataAccess = SystemDataAccessKind.None, FillRowMethodName = "FBsRow")]
    public static IEnumerable FBs(string str, string pattern, SqlInt32 opt) {
        if (str == null || pattern == null || opt.IsNull) return null;
        GroupCollection gs = Regex.Match(str, pattern, (RegexOptions)opt.Value).Groups;
        int gid = 0;
        List<MatchResult> r = new List<MatchResult>(gs.Count);
        // Group 0 is the whole match; groups 1..n are the capturing groups.
        foreach (Group g in gs)
            r.Add(new MatchResult { ID = gid++, Pos = g.Index, Match = g.Value });
        return r;
    }

    public static void FBsRow(object obj, ref SqlInt32 ID, ref SqlInt32 Pos, ref string FB) { MatchResult g = (MatchResult)obj; ID = g.ID; Pos = g.Pos; FB = g.Match; }
}

MSSQL

go
sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE
go
IF OBJECT_ID(N'dbo.FBs') IS NOT NULL DROP FUNCTION dbo.FBs
go
IF EXISTS (SELECT 1 FROM sys.assemblies AS A WHERE A.name = 'SQL_CLR') DROP ASSEMBLY SQL_CLR
go
CREATE ASSEMBLY SQL_CLR FROM 'C:\src\SQL_CLR.dll'
go
CREATE FUNCTION dbo.FBs(@str nvarchar(max),@pattern nvarchar(max),@opt int=1)
RETURNS TABLE (ID int,Pos int,FB nvarchar(max)) WITH EXECUTE AS CALLER
AS EXTERNAL NAME SQL_CLR.[RE].FBs
go
;with P(p) as (select * from (values ('(\d+\s*|(?<=\d+\s*)-\s*\d\s*)+'),('((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)')) P(t)),
T(t) as (select * from (values ('12-34'),('12-3'),('1-23'),('1234'),('123')) T(t))
select *,iif(t=FB,'PASS','FAIL') from P cross join T outer apply dbo.FBs(t,p,0) where ID=1
go

Tests in MSSQL

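Your problem is that, when the regexes are applied to 12-34, the group captured by the first regex has the value 4, while the group captured by the second regex has the value 12-34.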

Your regexes are structured as follows, where X stands for the common part \d+\s*|(?<=\d+\s*)-\s*\d\s*:

(X)+
((?:X)+)

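The difference is that in the first one you are repeating a capturing group. The captured value contains only the result of the last iteration.

In the second one you are capturing a repeated group. That is the correct approach for the semantics you want.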
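The difference can be observed directly in .NET, which records every iteration of a repeated capturing group in Group.Captures while Group.Value returns only the last one. The following is a minimal console sketch, not part of the CLR function from the question; the class name RepeatedGroupDemo and its helper methods are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatedGroupDemo
{
    // The alternation shared by both patterns in the question.
    const string Common = @"\d+\s*|(?<=\d+\s*)-\s*\d\s*";

    // Value of group 1 for the first match of pattern in input.
    public static string Group1(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    // Every capture recorded for group 1, one entry per iteration.
    public static string[] Group1Captures(string pattern, string input)
    {
        CaptureCollection cs = Regex.Match(input, pattern).Groups[1].Captures;
        string[] result = new string[cs.Count];
        for (int i = 0; i < cs.Count; i++) result[i] = cs[i].Value;
        return result;
    }

    public static void Main()
    {
        string repeatedCapture = "(" + Common + ")+";     // repeats a capturing group
        string capturedRepeat = "((?:" + Common + ")+)";  // captures a repeated group

        // Group 1 holds only the last iteration of the repeated group.
        Console.WriteLine(Group1(repeatedCapture, "12-34"));  // 4
        // Group 1 wraps the whole repetition.
        Console.WriteLine(Group1(capturedRepeat, "12-34"));   // 12-34
        // The earlier iterations are still available through Captures.
        Console.WriteLine(string.Join("|", Group1Captures(repeatedCapture, "12-34")));  // 12|-3|4
    }
}
```

With the first pattern, the three iterations over 12-34 match 12, -3 and 4 in turn, which is why group 1 ends up as 4. The same reasoning gives 3 for 1-23 and -3 for 12-3, matching the failing rows in the question.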

For more information on this, see Repeating a Capturing Group vs. Capturing a Repeated Group.