Balancing reluctant and greedy matching

Question

I am trying to match the two address lines below (mostly made-up addresses):

2320 ZINER CIR East 43123
1111 ZINER CIR East Bernstadt 43123

My regex is built from city names, and East Bernstadt is a city name. However, a street can also end with "East". My dilemma is that if I match "East" greedily, as in:

\d+ [^ ]+ CIR( East)?( East Bernstadt)?(?: \d+)?

...then only the first line matches in full (the other is only a partial match). If I use a reluctant match, as in:

\d+ [^ ]+ CIR( East)??( East Bernstadt)?(?: \d+)?

...then the second line matches in full but the first does not.
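The behavior described above can be reproduced with a minimal Java sketch. It assumes the patterns are applied with `Matcher.find()`, which the question does not state; with `Matcher.matches()` the engine would be forced to backtrack until the whole input is consumed, so the symptom would not show up. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctantDemo {
    // The two patterns from the question, copied verbatim (Java string escaping only).
    static final String GREEDY = "\\d+ [^ ]+ CIR( East)?( East Bernstadt)?(?: \\d+)?";
    static final String RELUCTANT = "\\d+ [^ ]+ CIR( East)??( East Bernstadt)?(?: \\d+)?";

    // Returns the first substring found by the regex, or null if nothing is found.
    // find() accepts the first successful match; it does not look for the longest one.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "2320 ZINER CIR East 43123",
            "1111 ZINER CIR East Bernstadt 43123"
        };
        for (String line : lines) {
            System.out.println("greedy:    " + firstMatch(GREEDY, line));
            System.out.println("reluctant: " + firstMatch(RELUCTANT, line));
        }
    }
}
```

With the greedy pattern the second line stops at "1111 ZINER CIR East", because `( East)?` has already consumed the word that the city group needs. With the reluctant pattern the first line stops at "2320 ZINER CIR", because `( East)??` prefers to match nothing and every later group is optional.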

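How can I change the regex so that both lines match in full? "East" and "East Bernstadt" must stay in separate parts of the expression.

Edit: I cannot use a single parenthesized group to handle both "East" and "East Bernstadt"; both lines above must match, and "1234 Ziner CIR East East Bernstadt" must also match (some streets carry a cardinal direction).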

Answer 1

Try this:

\d+\s+\S+\s+CIR(?:(?!\sEast Bernstadt)\s+East)?(?:\s+East Bernstadt)?(?: +\d+)?


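Explanation:
\s: a whitespace character (space, tab, newline, carriage return, vertical tab)
\S: a character that is not a whitespace character as defined by \s
(?!…): negative lookahead

The negative lookahead keeps the optional street-direction group from consuming "East" when that word is the start of "East Bernstadt", so the city group can still match it.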
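As a quick check of the pattern above, here is a minimal Java sketch that runs it against the three address lines from the question. The class and method names are hypothetical, and using `Matcher.find()` is an assumption chosen so that a partial match would be visible.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadAddressDemo {
    // The regex from the answer, copied verbatim (Java string escaping only).
    static final Pattern ADDRESS = Pattern.compile(
        "\\d+\\s+\\S+\\s+CIR(?:(?!\\sEast Bernstadt)\\s+East)?(?:\\s+East Bernstadt)?(?: +\\d+)?");

    // Returns the first substring found by the pattern, or null if nothing is found.
    static String firstMatch(String input) {
        Matcher m = ADDRESS.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "2320 ZINER CIR East 43123",
            "1111 ZINER CIR East Bernstadt 43123",
            "1234 Ziner CIR East East Bernstadt"
        };
        for (String line : lines) {
            String match = firstMatch(line);
            // A line is fully matched when the found substring equals the whole line.
            System.out.println(line.equals(match) + " <- " + match);
        }
    }
}
```

For all three inputs the found substring is the entire line. In the third line the lookahead does not block the first "East", because what follows it is " East East Bernstadt" rather than " East Bernstadt".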

Tags: java, regex, non-greedy, regex-greedy