Whitespace followed by brackets (non lazy) in Python using regex

Question

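I am trying to do the following: from a list of strings, extract everything before the first occurrence of whitespace (possibly more than one) followed by a round bracket "(".

I have tried the following: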

re.findall(r"(.*)\s\(", line)

But it gives, for example, the following string:

Carrollton (University of West Georgia)[2]*Dahlonega (North Georgia College & State University)[2]

Thanks in advance.

Answer 1

You can use a lookahead for this. Try this regex:

[a-z A-Z]+(?=[ ]+[\(]+)
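
A minimal sketch of how this lookahead pattern could be applied with re.search, using a sample string from the question. The helper name before_paren_lookahead is hypothetical and not part of the answer. Note that the character class only covers ASCII letters and spaces, so a name containing a hyphen, digit, or accented letter would be matched only partially.

```python
import re

# Pattern from this answer: a run of ASCII letters/spaces that is
# followed (lookahead, not consumed) by 1+ spaces and 1+ "(".
LOOKAHEAD = re.compile(r'[a-z A-Z]+(?=[ ]+[\(]+)')

def before_paren_lookahead(text):
    """Hypothetical helper: return the first match, or None if there is none."""
    m = LOOKAHEAD.search(text)
    return m.group() if m else None

sample = 'Carrollton (University of West Georgia)[2]'
first = before_paren_lookahead(sample)
print(first)
```

Because the lookahead does not consume the space or the bracket, the returned text ends right before the whitespace that precedes "(".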

Answer 2

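To extract everything before the first occurrence of a whitespace character followed by a round bracket (, you can use re.search (this method only extracts the first match):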

re.search(r'^(.*?)\s\(', text, re.S).group(1)
re.search(r'^\S*(?:\s(?!\()\S*)*', text).group()

See the regex #1 demo and the regex #2 demo. Note that the second one, though longer, is much more efficient since it follows the unroll-the-loop principle.
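
As a rough sketch, the two expressions can be wrapped in small helpers to compare how they behave. The function names and the 'Athens' input are illustrative assumptions. One difference worth knowing: when the input contains no whitespace followed by "(", the first pattern finds no match, while the second pattern matches the whole string.

```python
import re

LAZY = re.compile(r'^(.*?)\s\(', re.S)          # regex #1: lazy dot
UNROLLED = re.compile(r'^\S*(?:\s(?!\()\S*)*')  # regex #2: unrolled form

def before_paren_lazy(text):
    """Hypothetical helper: Group 1 of regex #1, or None when nothing matches."""
    m = LAZY.search(text)
    return m.group(1) if m else None

def before_paren_unrolled(text):
    """Hypothetical helper: regex #2 can match an empty string, so search always succeeds."""
    return UNROLLED.search(text).group()

sample = 'Carrollton (University of West Georgia)[2]'
print(before_paren_lazy(sample))
print(before_paren_unrolled(sample))
print(before_paren_lazy('Athens'))      # no " (" in the input
print(before_paren_unrolled('Athens'))  # whole string is returned
```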

Details

^ - start of string
(.*?) - Group 1: any 0+ characters, as few as possible
\s\( - a whitespace character and a ( character.
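
To illustrate why "as few as possible" matters, here is a small comparison of the greedy pattern from the question with the lazy pattern above, run on the sample string from the question. The variable names are only for illustration.

```python
import re

line = ('Carrollton (University of West Georgia)[2]'
        '*Dahlonega (North Georgia College & State University)[2]')

# Greedy: .* runs to the end of the string, then backtracks to the LAST " (".
greedy = re.findall(r'(.*)\s\(', line)

# Lazy: .*? grows one character at a time and stops at the FIRST " (".
lazy = re.search(r'^(.*?)\s\(', line, re.S).group(1)

print(greedy)
print(lazy)
```

The greedy version captures everything up to the last whitespace-plus-bracket pair, while the lazy version stops at the first one.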

Or, better:

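^\S* - start of string, then 0+ non-whitespace characters
(?:\s(?!\()\S*)* - zero or more occurrences of:
- \s(?!\() - a whitespace character not followed by (
- \S* - 0+ non-whitespace characters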
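
The same breakdown can be written directly into the pattern with re.VERBOSE, which ignores unescaped whitespace inside the pattern and allows # comments. This is only an annotated restatement of the regex above; the name UNROLLED_VERBOSE is an assumption made for this sketch.

```python
import re

UNROLLED_VERBOSE = re.compile(r'''
    ^\S*            # start of string, then 0+ non-whitespace characters
    (?:             # non-capturing group, repeated 0 or more times:
        \s(?!\()    #   one whitespace character not followed by "("
        \S*         #   then 0+ non-whitespace characters
    )*
''', re.VERBOSE)

sample = 'Isla Vista (University of California, Santa Barbara)[2]'
result = UNROLLED_VERBOSE.search(sample).group()
print(result)
```

The match stops at the first whitespace character that is immediately followed by "(", so inner spaces such as the one in "Isla Vista" are kept.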

See the Python demo:

import re
strs = ['Isla Vista (University of California, Santa Barbara)[2]','Carrollton (University of West Georgia)[2]','Dahlonega (North Georgia College & State University)[2]']
rx = re.compile(r'^\S*(?:\s(?!\()\S*)*', re.S)
for s in strs:
    m = rx.search(s) 
    if m:
        print('{} => {}'.format(s, m.group()))
    else:
        print("{}: No match!".format(s))


python

regex

regex-greedy

python-re