How to get groups from a repeated pattern with a quantifier in a regular expression

Question

I have the following string:

(a,b,c,d,e)

I want to find all the comma-separated values with a regular expression.

If I strip the parentheses:

a,b,c,d,e

and use the following regex:

([^,]),?

I get one match, with one group, for each comma-separated value.

But if I want the regex to handle the enclosing parentheses as well:

\((([^,]),?)+\)

I still get only one match and one group. The group contains only the last comma-separated value.

I also tried group constructs like these:

(?:....)
(...?)
(...)?

But I could not get the comma-separated values through regex groups.

How can I do this when the comma-separated values are enclosed in parentheses?

Answer 1

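This is generally how repeated groups work: you do not get a separate group for each repetition, only the last one. If you want the values between the commas, it is better to use the string functions available in your programming language to strip the parentheses first and then split the string on the commas.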

For example, in Ruby:

 '(a,b,c,d,e,f)'.gsub(/[()]/, '').split(',')
 # => ["a", "b", "c", "d", "e", "f"]
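To make the "only the last one" behaviour concrete, here is a minimal Ruby sketch using the pattern from the question. The variable names are illustrative only. It also shows one possible alternative to splitting: scanning for one value at a time, so that each value is its own match rather than a repetition of a single group.

```ruby
text = '(a,b,c,d,e)'

# The pattern from the question: group 2 sits inside a repeated group.
m = text.match(/\((([^,]),?)+\)/)
whole = m[0]      # the full match, "(a,b,c,d,e)"
last_only = m[2]  # only the final repetition survives: "e"

# Matching one value at a time instead yields every value.
values = text.scan(/[^(),]+/)  # ["a", "b", "c", "d", "e"]
```

Unlike the single-character class in the question, the `scan` pattern uses `+`, so it would also pick up values longer than one character.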

Answer 2

I figured it out. In C#, every group of a match exposes a Captures property that you can use.

Using the regex:

\((([^,]),?)+\)

do:

        string text = "(a,b,c,d,e)";
        Regex rgx = new Regex(@"\((([^,]),?)+\)"); // verbatim string, so the backslashes reach the regex engine
        MatchCollection matches = rgx.Matches(text);

You then have 1 item in the MatchCollection, containing the following 3 groups:

[0]: \((([^,]),?)+\) => (a,b,c,d,e)
[1]: ([^,]),? (repeated by +) => value and optional comma, e.g. "a," or "b," or "e"
[2]: [^,] => value only, e.g. a or b or ...

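The Captures list inside each group stores every value that the quantifier matched, one entry per repetition. So use group [2] and its Captures to get all the values.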

So the solution is:

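        // requires: using System.Collections.Generic; using System.Text.RegularExpressions;
        string text = "(a,b,c,d,e)";
        // verbatim string (@"...") so the backslashes reach the regex engine
        Regex rgx = new Regex(@"\((([^,]),?)+\)");
        MatchCollection matches = rgx.Matches(text);

        // now get the captured values: one Capture per repetition of group 2
        CaptureCollection captures = matches[0].Groups[2].Captures;

        // and extract them into a list
        List<string> values = new List<string>();
        foreach (Capture capture in captures)
        {
            values.Add(capture.Value);
        }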
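The Captures collection is a .NET feature. In a language whose match object exposes only the last repetition of a group (Ruby's MatchData, as far as I know), a similar result can be had in two steps: validate the whole string with one regex, then collect the values from the inner part. The following is only a sketch under that assumption. `extract_values` is a hypothetical helper name, and the pattern allows multi-character values, which goes slightly beyond the single-character class used in the question.

```ruby
# Hypothetical helper: returns the values inside "(...)",
# or nil if the string does not have that shape.
def extract_values(text)
  # Step 1: validate the whole string and capture everything
  # between the parentheses in group 1.
  m = text.match(/\A\(([^,()]+(?:,[^,()]+)*)\)\z/)
  return nil unless m

  # Step 2: collect each value from the captured inner part.
  m[1].scan(/[^,]+/)
end

extract_values('(a,b,c,d,e)')  # => ["a", "b", "c", "d", "e"]
extract_values('a,b,c')        # => nil, the parentheses are missing
```

Keeping validation and extraction separate means the repeated group never has to carry the values itself, so the "last repetition only" behaviour does not get in the way.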

c#

regex

pattern-matching

multiple-matches

regex-group