How to catch a pattern that's not in the non-capturing group? - Python
Given the string:
I'll be going home I've the 'v ' isn't want I want to split but I want to catch tokens like 'v and 'w ' .
The goal is to catch:
'v
'v
'w
while avoiding 've, 'll and 't.
I tried catching 've, 'll and 't with (?i)\'(?:ve|ll|t)\b, for example:
>>> import re
>>> x = "I'll be going home I've the 'v ' isn't want I want to split but I want to catch tokens like 'v and 'w ' ."
>>> pattern = r"(?i)\'(?:ve|ll|t)\b"
>>> re.findall(pattern, x)
["'ll", "'ve", "'t"]
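I also tried negating the non-capturing group in (?i)\'(?:ve|ll|t)\b like this: (?i)\'[^(?:ve|ll|t)]\b, but it does not return the desired targets 'v and 'w.
How can I capture a substring that follows a single quote but is not in a predefined list of substrings, i.e. 'll, 've and 't?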
I also tried this, which did not work:
pattern = "(?i)\'(?:[^ve|ll|t|\s])\b"
But [^...] only matches single characters, not substrings.
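To make the point about [^...] concrete, here is a minimal sketch. The sample string and variable names are made up for illustration. Inside a character class, ( ? : | ) are ordinary characters, so the "negated group" is really just a set of excluded single characters.

```python
import re

# Inside [...] the characters ( ? : | ) are literals, so this class means
# "any ONE character except ( ? : v e | l t )", not "anything but ve, ll or t".
char_class = re.compile(r"'[^(?:ve|ll|t)]")

# The same class written without the misleading group syntax.
equivalent = re.compile(r"'[^():?|elvt]")

sample = "'v 'w 'e 'x"  # hypothetical test input
class_matches = char_class.findall(sample)
print(class_matches)               # ["'w", "'x"]
print(equivalent.findall(sample))  # ["'w", "'x"]
```

Note that 'v is rejected simply because v is one of the excluded characters, which is why this approach cannot separate 'v from 've.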
The negative lookahead counterpart of a non-capturing group is (?!...), so it would be something like (?i)\'(?!ve|ll|t)\w\b:
>>> pattern = r"(?i)\'(?!ve|ll|t)\w\b"
>>> x = "I'll be going home I've the 'v ' isn't want I want to split but I want to catch tokens like 'v and 'w ' ."
>>> re.findall(pattern, x)
["'v", "'v", "'w"]
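If the list of suffixes to exclude is not fixed, the same lookahead can be assembled from a list. This is only a sketch: the helper name build_exclusion_pattern is hypothetical, and it assumes the same single-character targets as the pattern above.

```python
import re

def build_exclusion_pattern(suffixes):
    """Build '(?!a|b|c)\\w\\b from a list of suffixes (hypothetical helper)."""
    # re.escape guards against suffixes that contain regex metacharacters.
    alternation = "|".join(re.escape(s) for s in suffixes)
    return re.compile(r"'(?!" + alternation + r")\w\b", re.IGNORECASE)

x = ("I'll be going home I've the 'v ' isn't want I want to split "
     "but I want to catch tokens like 'v and 'w ' .")

pattern = build_exclusion_pattern(["ve", "ll", "t"])
tokens = pattern.findall(x)
print(tokens)  # ["'v", "'v", "'w"]
```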
Maybe this one works?
\'(?!ve|ll|t|\s)\w+
You can use a lookahead assertion to filter out what you don't want.
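The two lookahead patterns in this thread behave differently on longer tokens. The sketch below compares them on a made-up sample string. The third pattern, with \b inside the lookahead, is not from the thread; it is an assumed variant for the case where only the exact suffixes should be rejected rather than every token that starts with one.

```python
import re

sample = "'vx 'w 'tab 't 've"  # hypothetical test input

# \w\b accepts exactly one word character after the quote.
single_char = re.findall(r"'(?!ve|ll|t)\w\b", sample)

# \w+ accepts longer tokens, but still rejects anything starting with t.
multi_char = re.findall(r"'(?!ve|ll|t|\s)\w+", sample)

# Assumed variant: \b inside the lookahead rejects only the exact suffixes.
exact_only = re.findall(r"'(?!(?:ve|ll|t)\b)\w+", sample)

print(single_char)  # ["'w"]
print(multi_char)   # ["'vx", "'w"]
print(exact_only)   # ["'vx", "'w", "'tab"]
```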
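Update
In some regex engines, including Python's re module, it is lookbehind assertions (not lookaheads) that must be fixed-length.
This means a lookbehind such as (?<!ve|t) is invalid there, because ve and t have two different lengths; the lookahead (?!ve|t) is fine.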
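As a quick check of how Python's own re module treats alternatives of different lengths, the sketch below compiles the same alternation once as a lookahead and once as a lookbehind. The sample input is made up, and other engines may behave differently.

```python
import re

# Lookahead with alternatives of different lengths compiles and works in re.
lookahead = re.compile(r"'(?!ve|t)\w")
print(lookahead.findall("'ve 't 'w"))  # ["'w"]

# The same alternation inside a lookbehind is rejected by re at compile time.
try:
    re.compile(r"(?<!ve|t)x")
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = str(exc)

print(lookbehind_error)
```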