Why does re.findall not find the match in this case?

Question

I am trying to construct an example string that matches a given regular expression:

test_re = r'\s([0-9A-Z]+\w*)\s+\S*[Aa]lloy\s'

However, the code below only returns ['1AZabc']:

import re 
txt = " 1AZabc sdfsdfAlloy "
test_re = r'\s([0-9A-Z]+\w*)\s+\S*[Aa]lloy\s'
# test_re = r'\s+\S*[Aa]lloy\s'
x = re.findall(test_re,txt)
print(x)

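Why is the text after the space (the part matched by \s+ and everything that follows it) not captured by re? What would be a simple, valid example string that matches test_re?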

Answer 1

Your code works and finds everything. You have simply misunderstood regex groups and how findall uses them:

# code partially generated by regex101.com to demonstrate the issue
# see  https://regex101.com/r/Gngy0r/1

import re

regex = r"\s([0-9A-Z]+\w*)\s+\S*?[Aa]lloy\s"

test_str = " 1AZabc sdfsdfAlloy "

matches = re.finditer(regex, test_str, re.MULTILINE)

for match_num, match in enumerate(matches, start=1):
    print(f"Match {match_num} was found at {match.start()}-{match.end()}: {match.group()}")

    # Group 0 is the full match, so the capturing groups start at 1
    for group_num in range(1, len(match.groups()) + 1):
        print(f"Group {group_num} found at {match.start(group_num)}-{match.end(group_num)}: {match.group(group_num)}")

# use findall and print its results
print(re.findall(regex, test_str))

Output:

# full match that you got 
Match 1 was found at 0-20:  1AZabc sdfsdfAlloy 
# and what was captured
Group 1 found at 1-7: 1AZabc

# findall only gives you the groups ...
['1AZabc']
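
The rule behind this output: re.findall returns the full matches when the pattern has no capturing groups, the group's text when it has exactly one, and a tuple per match when it has several. A minimal sketch of that rule, using a made-up sample string and patterns that are not part of the question:

```python
import re

text = "a1 b2 c3"  # hypothetical sample text, not from the question

# No capturing group: findall returns each full match
no_groups = re.findall(r"[a-z]\d", text)

# Exactly one capturing group: findall returns only that group's text
one_group = re.findall(r"([a-z])\d", text)

# Several capturing groups: findall returns one tuple per match
two_groups = re.findall(r"([a-z])(\d)", text)

# A non-capturing group (?:...) does not count as a group for findall
non_capturing = re.findall(r"(?:[a-z])\d", text)

print(no_groups)      # ['a1', 'b2', 'c3']
print(one_group)      # ['a', 'b', 'c']
print(two_groups)     # [('a', '1'), ('b', '2'), ('c', '3')]
print(non_capturing)  # ['a1', 'b2', 'c3']
```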

Either remove the ( ) or put everything you are interested in inside the ( ):

regex = r"\s([0-9A-Z]+\w*\s+\S*?[Aa]lloy)\s"
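
Here is a rough sketch of how both options behave on the string from the question. The variable names and the two-group pattern are illustrative additions of mine, and " A alloy " is just one short string I believe matches the original test_re, offered as a possible answer to the second part of the question:

```python
import re

test_str = " 1AZabc sdfsdfAlloy "

# Option 1: no capturing group, so findall returns the full match
# (including the leading and trailing whitespace matched by \s)
full = re.findall(r"\s[0-9A-Z]+\w*\s+\S*?[Aa]lloy\s", test_str)

# Option 2: one group around everything of interest
wide_group = re.findall(r"\s([0-9A-Z]+\w*\s+\S*?[Aa]lloy)\s", test_str)

# Illustrative variant: two groups give one tuple per match
two_groups = re.findall(r"\s([0-9A-Z]+\w*)\s+(\S*?[Aa]lloy)\s", test_str)

print(full)        # [' 1AZabc sdfsdfAlloy ']
print(wide_group)  # ['1AZabc sdfsdfAlloy']
print(two_groups)  # [('1AZabc', 'sdfsdfAlloy')]

# A short string that also matches the pattern from the question
test_re = r"\s([0-9A-Z]+\w*)\s+\S*[Aa]lloy\s"
minimal = re.findall(test_re, " A alloy ")
print(minimal)     # ['A']
```

If the surrounding whitespace is not wanted, option 2 is probably the more convenient of the two, since the \s on either side stays outside the group.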

python

python-re