How to capture all of the groups before and after a specific character

Question

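I am trying to capture all of the groups that come before a ; character. I also need to capture the last group, which does not end with a ; character. Here are my regex and my statement.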

Regex:

((\*|\/|\)|\(|[-+]\d+|[-+]?\d*\.\d+|\d+|\w+d?|\+|\-|=|{|}|:=|while|do|if|else|then|skip|or|and|not|>=)+;)+

Statement:

x1:=0; x2:=1; x3:= (x1,x2,+); x4:=5; while {(x4,0,>=)} do {x4:= (x4,1,-); x1:=x2; x2:=x3; x3:= (x1, x2,+)}

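My regex only captures the first group. I need to capture all of the groups, including the last one.

So the final list of groups should look like this: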

['x1:=0', 'x2:=1', 'x3:= (x1,x2,+)', 'x4:=5', 'while {(x4,0,>=)} do {x4:= (x4,1,-)', 'x1:=x2', 'x2:=x3', 'x3:= (x1, x2,+)']

Answer 1

It looks like you can just use split:

ting = 'x1:=0; x2:=1; x3:= (x1,x2,+); x4:=5; while {(x4,0,>=)} do {x4:= (x4,1,-); x1:=x2; x2:=x3; x3:= (x1, x2,+)}'
ting2 = ting.split(';')
# ['x1:=0', ' x2:=1', ' x3:= (x1,x2,+)', ' x4:=5', ' while {(x4,0,>=)} do {x4:= (x4,1,-)', ' x1:=x2', ' x2:=x3', ' x3:= (x1, x2,+)}']
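
Note that splitting on ';' alone leaves a leading space on every piece after the first. If that matters, here is a minimal sketch that strips each piece and drops any empty ones (for example, if the statement ever ended with a trailing ;). The names statement and parts are only illustrative, and the closing } on the last piece is left in place:

```python
statement = 'x1:=0; x2:=1; x3:= (x1,x2,+); x4:=5; while {(x4,0,>=)} do {x4:= (x4,1,-); x1:=x2; x2:=x3; x3:= (x1, x2,+)}'

# Split on every semicolon, trim surrounding whitespace from each piece,
# and skip pieces that are empty after trimming.
parts = [piece.strip() for piece in statement.split(';') if piece.strip()]
print(parts)
```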

Answer 2

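There are two very simple ways to do this, and one of them doesn't even need a regex. Here is some code showing both implementations. The pattern you want is: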

' ?([^;]+);?'
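
To make the pattern easier to read, here is the same expression written with re.VERBOSE and one comment per piece. This is only an annotated sketch: the short input string is made up for illustration, and the space is written as [ ] because verbose mode ignores bare whitespace in the pattern.

```python
import re

# Same pattern as ' ?([^;]+);?', spread out with comments.
annotated = re.compile(r"""
    [ ]?      # optional single leading space, kept outside the group
    ([^;]+)   # capture one or more characters that are not a semicolon
    ;?        # optional semicolon, absent after the last statement
""", re.VERBOSE)

sample = 'a:=1; b:=2; c:=3'  # hypothetical short input
groups = annotated.findall(sample)
print(groups)
```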

Example code:

import re

statement = 'x1:=0; x2:=1; x3:= (x1,x2,+); x4:=5; while {(x4,0,>=)} do {x4:= (x4,1,-); x1:=x2; x2:=x3; x3:= (x1, x2,+)}'

# The quick way: split on a semicolon followed by a space
print('Quick way:')
print(statement.split('; '))

# The ~magic~ regex way: capture every run of non-semicolon characters
print('Regex way:')
pattern = ' ?([^;]+);?'
print(re.compile(pattern).findall(statement))

Output:

Quick way:
['x1:=0', 'x2:=1', 'x3:= (x1,x2,+)', 'x4:=5', 'while {(x4,0,>=)} do {x4:= (x4,1,-)', 'x1:=x2', 'x2:=x3', 'x3:= (x1, x2,+)}']
Regex way:
['x1:=0', 'x2:=1', 'x3:= (x1,x2,+)', 'x4:=5', 'while {(x4,0,>=)} do {x4:= (x4,1,-)', 'x1:=x2', 'x2:=x3', 'x3:= (x1, x2,+)}']
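
Both approaches above assume that each ; is followed by at most one space. If the spacing can vary, one possible variation (not part of the original answer) is re.split with a separator that swallows any whitespace around the semicolon. A minimal sketch, with illustrative variable names:

```python
import re

statement = 'x1:=0; x2:=1; x3:= (x1,x2,+); x4:=5; while {(x4,0,>=)} do {x4:= (x4,1,-); x1:=x2; x2:=x3; x3:= (x1, x2,+)}'

# Split on a semicolon together with any whitespace on either side of it,
# then drop empty pieces (which would appear if the input ended with ';').
pieces = [piece for piece in re.split(r'\s*;\s*', statement) if piece]
print(pieces)
```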

python

regex

text-processing