Regex to filter out word ending with a “string” suffix - without lookahead
I'm trying to come up with a Ruby regex that matches the following strings:
MAINT: Refactor something
STRY-1: Add something
STRY-2: Update something
but does not match the following:
MAINT: Refactored something
STRY-1: Added something
STRY-2: Updated something
MAINT: Refactoring something
STRY-3: Adding something
STRY-4: Updating something
Basically, the first word after the : should not end in ed or ing.
I have been using the following regex for GitLab commit messages for some time.
^(MAINT|(STRY|PRB)-\d+):\s(?:(?!(?:ed|ing)\b)[A-Za-z])+\s([a-zA-Z0-9._\-"].*)
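Recently, however, they appear to have switched to google/re2, which does not support lookahead.
Is it possible to rewrite this regex so that it does not use lookahead?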
Without a regex:
str = "MAINT: Refactor something
STRY-1: Add something
STRY-2: Update something"
p str.lines.none?{|line| line.split[1].end_with?("ed", "ing")}
# => true
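The same check can be applied per line to see which messages pass. This is a minimal sketch: the helper name acceptable_verb? and the sample messages are assumptions made for illustration, and the prefix itself (MAINT, STRY-n) is not validated here.

```ruby
# Hypothetical helper: true when the first word after the prefix
# does not end in "ed" or "ing".
def acceptable_verb?(line)
  first_word = line.split[1]        # token after "MAINT:" or "STRY-1:"
  return false if first_word.nil?   # nothing follows the prefix
  !first_word.end_with?("ed", "ing")
end

messages = [
  "MAINT: Refactor something",
  "STRY-1: Added something",
  "STRY-3: Adding something",
  "STRY-2: Update something"
]

accepted = messages.select { |m| acceptable_verb?(m) }
p accepted
# => ["MAINT: Refactor something", "STRY-2: Update something"]
```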
str =<<_
MAINT: Refactor something
STRY-1: Added something
MAINT: Refactoring something
Add something
STRY-3: Adding something
STRY-1: Add something
MAINT: Refactored something
Refactor something
STRY-4: Updating something
STRY-9: Update something
STRY-2: Updated something
_
r = /
  ^                      # match beginning of line
  (?:                    # begin non-capture group
    MAINT\:[ ]+Refactor  # match string
    |                    # or
    STRY-\d+\:[ ]+       # match string
    (?:Add|Update)       # match 'Add' or 'Update'
  )                      # end non-capture group
  [ ]+something          # match one or more spaces followed by 'something'
  $                      # match end of line
/x                       # free-spacing regex definition mode
str.scan(r)
#=> ["MAINT: Refactor something",
#    "STRY-1: Add something",
#    "STRY-9: Update something"]
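To match a space in the regex I used a character class containing a space ([ ]). This is necessary because free-spacing mode strips spaces that are not inside a character class. Written in the conventional way, the regex is as follows: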
/^(?:MAINT\: +Refactor|STRY-\d+\: +(?:Add|Update)) +something$/
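The effect of free-spacing mode on literal spaces can be checked directly. A minimal sketch, where the variable names and the short test strings are assumptions chosen only for illustration:

```ruby
# In free-spacing (/x) mode, whitespace outside a character class is ignored.
ignored  = /a b/x     # behaves like /ab/
explicit = /a[ ]b/x   # the bracketed space is matched literally

p ignored.match?("ab")    # => true
p ignored.match?("a b")   # => false
p explicit.match?("a b")  # => true
p explicit.match?("ab")   # => false
```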
You are dealing with a regex that has to watch for three endings:
ed\b
ing\b
ied\b
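Without lookahead, you have to account for each character position of the ending separately, for example e[^d]\b and [^e]d\b. Writing out all of these cases, you would use this regex: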
^(MAINT|(STRY|PRB)-\d+):\s*(?i:\w*(e[a-ce-z]|[a-df-z]d|i(n[a-fh-z]|[a-mo-z]g|e[a-ce-z]|[a-df-z]d)|[a-hj-z]ng|[a-hj-z][a-df-mo-z][a-cefh-z])|\w)\s([a-zA-Z0-9._\-"].*)
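The pattern above can be checked against the sample messages from the question. This sketch runs it under Ruby's own regex engine rather than RE2, so it only confirms the matching logic and assumes RE2 treats these constructs the same way; the constant and variable names are introduced here for illustration.

```ruby
# The lookahead-free pattern from the answer, copied verbatim.
NO_LOOKAHEAD = /^(MAINT|(STRY|PRB)-\d+):\s*(?i:\w*(e[a-ce-z]|[a-df-z]d|i(n[a-fh-z]|[a-mo-z]g|e[a-ce-z]|[a-df-z]d)|[a-hj-z]ng|[a-hj-z][a-df-mo-z][a-cefh-z])|\w)\s([a-zA-Z0-9._\-"].*)/

should_match = [
  "MAINT: Refactor something",
  "STRY-1: Add something",
  "STRY-2: Update something"
]

should_not_match = [
  "MAINT: Refactored something",
  "STRY-1: Added something",
  "STRY-2: Updated something",
  "MAINT: Refactoring something",
  "STRY-3: Adding something",
  "STRY-4: Updating something"
]

# Collect the messages the pattern accepts and the ones it rejects.
matched  = should_match.select { |m| NO_LOOKAHEAD.match?(m) }
rejected = should_not_match.reject { |m| NO_LOOKAHEAD.match?(m) }

p matched == should_match       # => true
p rejected == should_not_match  # => true
```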