How to create capturing groups with regex re.compile?

Question

I can find the string successfully, but I cannot split the match object into the correct groups.

The full string is as follows:

 Technology libraries: Techlibhellohellohello

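(all on one line). What I want to do is find this line in a file (this works), but when I add to the dictionary I only want to add the "Technology libraries" part and not everything else. I want to use .group() and specify which group, but only Techlibhellohellohello seems to come back as group(1), and nothing else shows up. Also, there is leading whitespace before "Technology libraries".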

Pattern to match

is_startline_1 = re.compile(r" Technology libraries: (.*)$")

Matching the line

startline1_match = is_startline_1.match(line)

Adding to the dictionary

bookmark_dict['context']        = startline1_match.group(1)

Desired output: .group(1) or .group(2) should contain "Technology libraries"

Answer 1

Here, we might just want to wrap the first part in a capturing group as well:

import re

regex = r"(Technology libraries: )(.*)$"

test_str = "Technology libraries: Techlibhellohellohello"

# Raw string, so that \1 and \2 reach re.sub as group references
subst = r"\1\n\2"

# count=0 replaces every match; change it to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

This JavaScript demo shows how the capturing groups work:

const regex = /(Technology libraries: )(.*)$/gm;
const str = `Technology libraries: Techlibhellohellohello`;
const subst = `$1\n$2`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
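Since the question is about reading the pieces with .group() rather than substituting, here is a minimal Python sketch of the same two-group pattern used with re.compile. It assumes the line has no leading whitespace, as in the test string above; the variable name `label_and_value` is only illustrative.

```python
import re

# Two capturing groups: the label (including colon and space) and the rest of the line.
pattern = re.compile(r"(Technology libraries: )(.*)$")

line = "Technology libraries: Techlibhellohellohello"
label_and_value = pattern.match(line)

if label_and_value:
    print(repr(label_and_value.group(1)))  # the label part
    print(repr(label_and_value.group(2)))  # everything after the label
    print(label_and_value.groups())        # all captured groups as a tuple
```

The original pattern had only one pair of parentheses, which is why group(1) was the only group available.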

RegEx

If this is not the expression you need, you can modify it on regex101.com.

 (Technology libraries: )(.*)

RegEx Circuit

You can also visualize your expressions at jex.im.

If you want to drop the colon and the spaces, you can simply add a middle capturing group for them:

Demo

(Technology libraries)(:\s+)(.*)

Python Code

import re

regex = r"(Technology libraries)(:\s+)(.*)"

test_str = ("Technology libraries: Techlibhellohellohello\n"
    "Technology libraries:     Techlibhellohellohello")

# Raw string, so that \1 and \3 reach re.sub as group references;
# group 2 (colon and spaces) is left out of the result
subst = r"\1\n\3"

# count=0 replaces every match; change it to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

JavaScript Demo

const regex = /(Technology libraries)(:\s+)(.*)/gm;
const str = `Technology libraries: Techlibhellohellohello
Technology libraries:     Techlibhellohellohello`;
const subst = `$1\n$3`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
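As an alternative sketch (not part of the answer's generated code), the separator can be left outside any group, so that only the label and the value are numbered. The `lines` and `results` names are assumptions made for this example.

```python
import re

# The colon and spaces sit between the groups, so they are matched but not captured.
pattern = re.compile(r"(Technology libraries):\s+(.*)")

lines = [
    "Technology libraries: Techlibhellohellohello",
    "Technology libraries:     Techlibhellohellohello",
]

results = []
for line in lines:
    m = pattern.match(line)
    if m:
        # group(1) is the label, group(2) is the value
        results.append((m.group(1), m.group(2)))

for label, value in results:
    print(label, "->", value)
```

With this layout there are two groups instead of three, so the value is group(2) rather than group(3).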

If you want to capture the whitespace before "Technology libraries", simply add it to a capturing group:

^(\s+)(Technology libraries)(:\s+)(.*)$

Demo

Python Test

import re

regex = r"^(\s+)(Technology libraries)(:\s+)(.*)$"

test_str = ("    Technology libraries: Techlibhellohellohello\n"
    "       Technology libraries:     Techlibhellohellohello")

# Raw string, so that \2 and \4 reach re.sub as group references;
# groups 1 and 3 (the whitespace and the separator) are left out
subst = r"\2\n\4"

# count=0 replaces every match; change it to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

JavaScript Demo

const regex = /^(\s+)(Technology libraries)(:\s+)(.*)$/gm;
const str = `    Technology libraries: Techlibhellohellohello
       Technology libraries:     Techlibhellohellohello`;
const subst = `$2\n$4`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
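To tie this back to the dictionary in the question, here is a hedged sketch that reuses the question's names (`is_startline_1`, `startline1_match`, `bookmark_dict`). The named groups `label` and `value` and the `"value"` dictionary key are assumptions added for illustration; they do not appear in the original code.

```python
import re

# \s* tolerates any amount of leading whitespace; named groups avoid
# having to count parentheses when reading the match.
is_startline_1 = re.compile(
    r"^\s*(?P<label>Technology libraries):\s+(?P<value>.*)$"
)

line = " Technology libraries: Techlibhellohellohello"
bookmark_dict = {}

startline1_match = is_startline_1.match(line)
if startline1_match:
    bookmark_dict["context"] = startline1_match.group("label")
    bookmark_dict["value"] = startline1_match.group("value")  # hypothetical key

print(bookmark_dict)
```

Because `.` does not match a newline and `$` also matches just before a trailing newline, the same pattern should work on lines read directly from a file.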
