Why does a passive (non-capturing) group behave better than a normal one when using lookbehind?
This does not work: (\d+\s*|(?<=\d+\s*)-\s*\d\s*)+
But this does work: ((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)
Failing tests (result of the second pattern versus result of the first):
"12-34" gives "12-34" (correct) versus "4" (incorrect)
"1-23" gives "1-23" (correct) versus "3" (incorrect)
"12-3" gives "12-3" (correct) versus "-3" (incorrect)
"123" and "1234" work with both patterns.
Not tested in a console application! I am using this in MSSQL with .NET 3.5:
C# DLL
using System.Data.SqlTypes; //SqlInt32, ...
using Microsoft.SqlServer.Server; //SqlFunction, ...
using System.Collections; //IEnumerable
using System.Collections.Generic; //List
using System.Text.RegularExpressions;
internal struct MatchResult {
    /// <summary>Which match or group this is</summary>
    public int ID;
    /// <summary>Where the match or group starts in the input string</summary>
    public int Pos;
    /// <summary>What string matched the pattern</summary>
    public string Match;
}
public class RE {
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, IsPrecise = true, SystemDataAccess = SystemDataAccessKind.None, FillRowMethodName = "FBsRow")]
    public static IEnumerable FBs(string str, string pattern, SqlInt32 opt) {
        if (str == null || pattern == null || opt.IsNull) return null;
        // Only the first match is examined; one row is returned per group (group 0 is the whole match).
        var gs = Regex.Match(str, pattern, (RegexOptions)opt.Value).Groups;
        int gid = 0;
        List<MatchResult> r = new List<MatchResult>(gs.Count);
        foreach (Group g in gs) r.Add(new MatchResult { ID = gid++, Pos = g.Index, Match = g.Value });
        return r;
    }
    // Fill-row callback: copies one MatchResult into the output columns (ID, Pos, FB).
    public static void FBsRow(object obj, ref SqlInt32 ID, ref SqlInt32 Pos, ref string FB) {
        MatchResult g = (MatchResult)obj;
        ID = g.ID; Pos = g.Pos; FB = g.Match;
    }
}
MSSQL
go
sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE
go
IF OBJECT_ID(N'dbo.FBs') IS NOT NULL DROP FUNCTION dbo.FBs
go
go
if exists(select 1 from sys.assemblies as A where A.name='SQL_CLR') DROP ASSEMBLY SQL_CLR
go
CREATE ASSEMBLY SQL_CLR FROM 'C:\src\SQL_CLR.dll'
go
CREATE FUNCTION dbo.FBs(@str nvarchar(max),@pattern nvarchar(max),@opt int=1)
RETURNS TABLE (ID int,Pos int,FB nvarchar(max)) WITH EXECUTE AS CALLER
AS EXTERNAL NAME SQL_CLR.[RE].FBs
go
;with P(p) as (select * from (values ('(\d+\s*|(?<=\d+\s*)-\s*\d\s*)+'),('((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)')) P(t)),
T(t) as (select * from (values ('12-34'),('12-3'),('1-23'),('1234'),('123')) T(t))
select *,iif(t=FB,'PASS','FAIL') from P cross join T outer apply dbo.FBs(t,p,0) where ID=1
go
Tests in MSSQL
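Your issue is that, when the regexes are applied to 12-34, the group captured by the first regex has the value 4, whereas the group captured by the second regex has the value 12-34.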
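Your regexes have the following structure, where X stands for the common part \d+\s*|(?<=\d+\s*)-\s*\d\s*:
first: (X)+
second: ((?:X)+)
The difference is that in the first one you are repeating a capturing group, so the captured value contains only the result of the last iteration.
In the second one you are capturing a repeated group, which is the correct approach for the semantics you want.
For more on this, see Repeating a Capturing Group vs. Capturing a Repeated Group.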
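To make the difference concrete, here is a minimal console sketch (not part of the original SQL CLR setup). The class and method names CaptureDemo, GroupValue and AllCaptures are made up for illustration; the two patterns are copied from the question. It relies on the fact that in .NET a repeated group's Value is its last capture, while Group.Captures keeps every iteration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class CaptureDemo
{
    // Patterns copied from the question.
    public const string Repeated = @"(\d+\s*|(?<=\d+\s*)-\s*\d\s*)+";
    public const string Wrapped  = @"((?:\d+\s*|(?<=\d+\s*)-\s*\d\s*)+)";

    // Value of group 1. For a repeated capturing group this is
    // the text matched by the last iteration only.
    public static string GroupValue(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    // Every iteration captured by group 1, in order, joined with '|'.
    public static string AllCaptures(string pattern, string input)
    {
        List<string> parts = new List<string>();
        foreach (Capture c in Regex.Match(input, pattern).Groups[1].Captures)
            parts.Add(c.Value);
        return string.Join("|", parts.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(GroupValue(Repeated, "12-34"));   // last iteration: 4
        Console.WriteLine(AllCaptures(Repeated, "12-34"));  // all iterations: 12|-3|4
        Console.WriteLine(GroupValue(Wrapped, "12-34"));    // whole run: 12-34
    }
}
```

If changing the pattern is not an option, Groups[0] (the row with ID = 0 in the question's function) should hold the whole match for either pattern, so filtering on ID = 0 instead of ID = 1 may be another way to get 12-34.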