Regex capture group with back references

Question

I am trying to parse some text files that follow a fairly repetitive pattern, and regex does the job well. But I stumbled on a scenario like this:

2 people:
Juan
Gabriella

I want to capture Juan and Gabriella in separate groups, so that the result of my regex looks like this:

Match 0: 2 people
Group 1: Juan
Group 2: Gabriella

I tried:

/^\d+\speople.*:$\n(.*)$\n/gm

The result is:

Match 0: 2 people
Group 1: Juan

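I think we could use a back reference, but I am not sure how to use it in this case.

Regex: https://regexr.com/3k86r

Update:

As mentioned in the comments, doing it that way is unlikely to be possible, so how about capturing Juan and Gabriella in the same group and splitting them afterwards?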

So the regex will now look for 3 consecutive line breaks to group the items Juan\nGabriella and Foo\nBar\nBazz

2 people:
Juan
Gabriella


3 people:
Foo
Bar
Bazz

Tried:

\d+\speople+:$([\s\S]*(?=\n{3,}))

https://regexr.com/3k888

Answer 1

So the regex will now look for 3 consecutive line breaks to group the items Juan\nGabriella and Foo\nBar\nBazz

You can use

/(?:^|\n)\d+\s*people:([\s\S]*?)(?=\n{3}|$)/

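See the regex demo.

Details

(?:^|\n) - start of string or an LF
\d+ - 1+ digits
\s* - 0+ whitespace characters
people: - a literal substring
([\s\S]*?) - Group 1 capturing any 0+ chars, as few as possible, up to the first...
(?=\n{3}|$) - 3 consecutive LF symbols or the end of the string.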
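To illustrate why the lazy quantifier in Group 1 matters, here is a minimal sketch that contrasts it with a greedy variant. The greedy pattern is not part of the answer; it is included only for comparison, and the sample string is assumed to use plain LF line breaks.

```javascript
// Sample input, assumed for illustration: blocks separated by three LF characters.
const sample = "2 people:\nJuan\nGabriella\n\n\n3 people:\nFoo\nBar\nBazz";

// Lazy quantifier: Group 1 stops at the first position that is followed
// by three LFs (or at the end of the string).
const lazy = /(?:^|\n)\d+\s*people:([\s\S]*?)(?=\n{3}|$)/;

// Greedy variant, for contrast only: Group 1 grabs everything up to the
// end of the string, swallowing the second block as well.
const greedy = /(?:^|\n)\d+\s*people:([\s\S]*)(?=\n{3}|$)/;

const lazyGroup = sample.match(lazy)[1];
const greedyGroup = sample.match(greedy)[1];

console.log(JSON.stringify(lazyGroup));   // "\nJuan\nGabriella"
console.log(JSON.stringify(greedyGroup)); // the rest of the string after "2 people:"
```

With the lazy form, each match ends right before the triple line break, so the next block can be matched separately.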

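JS demo:

var rx = /(?:^|\n)\d+\s*people:([\s\S]*?)(?=\n{3}|$)/g;
var str = "2 people:\nJuan\nGabriella\n\n\n3 people:\nFoo\nBar\nBazz";
let m;
// With the g flag, each exec() call resumes after the previous match
while ((m = rx.exec(str))) {
  // Group 1 holds the block of names; trim it and split into one name per line
  console.log(m[1].trim().split("\n"));
}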
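If the declared count is needed alongside the names, the same idea could be wrapped in a small helper. This is only a sketch: the function name parsePeopleBlocks and the extra capture group around \d+ are additions for illustration, not part of the pattern above, and String.prototype.matchAll requires an ES2020-capable runtime.

```javascript
// Hypothetical helper (not from the original answer): returns one entry
// per "N people:" block found in the text.
function parsePeopleBlocks(text) {
  // Same pattern as in the demo, with an added capture group around the digits.
  const rx = /(?:^|\n)(\d+)\s*people:([\s\S]*?)(?=\n{3}|$)/g;
  const blocks = [];
  for (const m of text.matchAll(rx)) {
    blocks.push({
      count: Number(m[1]),            // the number before "people:"
      names: m[2].trim().split("\n"), // one name per line
    });
  }
  return blocks;
}

const input = "2 people:\nJuan\nGabriella\n\n\n3 people:\nFoo\nBar\nBazz";
const blocks = parsePeopleBlocks(input);
console.log(blocks);
```

Note that the lookahead expects LF-only line breaks; files with CRLF line endings would need the pattern adjusted, which this sketch does not cover.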

Tags: javascript, regex, backreference