Regex to capture optional characters

Question

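I want to pull a base string (Wax), plus some of the data immediately before and after it, out of a longer string. I can't get a match on the last item in the list below (noWax).

Can anyone flex their regex muscles? I'm new to regex, so optimization suggestions are welcome as long as all of the matches below are found.

What I'm using in Regex101: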

/(?<Wax>Wax(?:Only|-?\d+))/mg

Original string	Text to extract in a capturing group
Loc3_341001_WaxOnly_S212	WaxOnly
Loc4_34412-a_Wax4_S231	Wax4
Loc3a_231121-a_Wax-4-S451	Wax-4
Loc3_34112_noWax_S311	noWax
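
To make the failure concrete, here is a minimal Python sketch that runs the pattern above against the four sample strings. It assumes Python's built-in `re` module, which spells named groups `(?P<Wax>...)` rather than `(?<Wax>...)`; the names `QUESTION_PATTERN`, `SAMPLES`, and `extract_wax` are illustrative only and are not part of the question.

```python
import re

# The question's pattern, with the named group written the way Python's re expects.
QUESTION_PATTERN = re.compile(r"(?P<Wax>Wax(?:Only|-?\d+))")

# The four sample strings from the table above.
SAMPLES = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]


def extract_wax(pattern, text):
    """Return the text captured by the group named Wax, or None if nothing matches."""
    match = pattern.search(text)
    return match.group("Wax") if match else None


question_results = [extract_wax(QUESTION_PATTERN, s) for s in SAMPLES]
print(question_results)  # ['WaxOnly', 'Wax4', 'Wax-4', None]
```

The last string yields no match for two reasons: the pattern has no way to include a leading 'no', and it requires 'Only' or digits after 'Wax', whereas 'noWax' is followed directly by '_'.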

Answer 1

Here is one way to do it, using a conditional:

(?<Wax>(no)?Wax(?(2)|(?:Only|-?\d+)))

See an online demo.

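(no)?: optional capture group.
(?: if (start of the conditional).
- (2): tests whether capture group 2, (no), took part in the match. With PCRE-style numbering the named group Wax counts as group 1, so (no) is group 2. If it matched, match nothing more.
- |: or (otherwise).
- (?:Only|-?\d+): non-capturing group that matches 'Only', or an optional '-' followed by one or more digits.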
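
As a rough check of the conditional approach, the sketch below ports the pattern to Python's `re` module, which also supports `(?(2)yes|no)` conditionals but needs the named group written as `(?P<Wax>...)`. The variable names are illustrative assumptions, not part of the answer.

```python
import re

# Conditional pattern from this answer, adapted to Python's named-group syntax.
# Group 1 is the named group Wax; group 2 is the optional (no).
CONDITIONAL_PATTERN = re.compile(r"(?P<Wax>(no)?Wax(?(2)|(?:Only|-?\d+)))")

# Sample strings copied from the question.
samples = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]

conditional_results = []
for text in samples:
    match = CONDITIONAL_PATTERN.search(text)
    # Collect the named capture, or None when the pattern does not match.
    conditional_results.append(match.group("Wax") if match else None)

print(conditional_results)  # ['WaxOnly', 'Wax4', 'Wax-4', 'noWax']
```

When 'no' is present, the empty "yes" branch means nothing after 'Wax' is consumed; when it is absent, the suffix 'Only' or the digits are required.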

Answer 2

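I assume the following requirements for a match.

The match must contain 'Wax'.
'Wax' is preceded by '_' or '_no'. In the latter case, 'no' is included in the match.
'Wax' may be followed by:
- 'Only' followed by '_', in which case 'Only' is part of the match, or
- one or more digits followed by '_', in which case the digits are part of the match, or
- '-' followed by one or more digits and then another '-', in which case the first '-' and the digits are part of the match.

If these assumptions are correct, the strings can be matched with the following regular expression: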

(?<=_)(?:(?:no)?Wax(?:(?:Only|\d+)?(?=_)|\-\d+(?=-)))

Demo

The regular expression can be broken down as follows:

(?<=_)            # positive lookbehind asserts previous character is '_'
(?:               # begin non-capture group
  (?:no)?         # optionally match 'no'
  Wax             # match literal
  (?:             # begin non-capture group
    (?:Only|\d+)? # optionally match 'Only' or >=1 digits
    (?=_)         # positive lookahead asserts next character is '_'
    |             # or
    \-\d+         # match '-' followed by >= 1 digits
    (?=-)         # positive lookahead asserts next character is '-'
  )               # end non-capture group
)                 # end non-capture group
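
The breakdown above maps directly onto a verbose-mode pattern. The following Python sketch assumes the built-in `re` module with `re.VERBOSE` and checks the four strings from the question; the names `ANSWER2_PATTERN`, `samples`, and the extra input `Loc1_123_noWaxy_S1` are hypothetical and only serve the illustration.

```python
import re

# Lookaround-based pattern from this answer, laid out with re.VERBOSE so the
# comments can live inside the pattern itself.
ANSWER2_PATTERN = re.compile(
    r"""
    (?<=_)              # previous character must be '_'
    (?:                 # begin non-capture group
      (?:no)?           # optionally match 'no'
      Wax               # match the literal 'Wax'
      (?:               # begin non-capture group
        (?:Only|\d+)?   # optionally match 'Only' or one or more digits
        (?=_)           # next character must be '_'
        |               # or
        \-\d+           # match '-' followed by one or more digits
        (?=-)           # next character must be '-'
      )                 # end non-capture group
    )                   # end non-capture group
    """,
    re.VERBOSE,
)

# Sample strings copied from the question.
samples = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]

# The whole match is the wanted text, because the lookarounds consume nothing.
answer2_results = [
    m.group(0) if (m := ANSWER2_PATTERN.search(s)) else None for s in samples
]
print(answer2_results)  # ['WaxOnly', 'Wax4', 'Wax-4', 'noWax']

# Hypothetical input: 'Wax' is followed by 'y', so neither lookahead succeeds.
rejected = ANSWER2_PATTERN.search("Loc1_123_noWaxy_S1")
print(rejected)  # None
```

Because the delimiters are checked with lookarounds rather than consumed, this version is stricter about context than a pattern that only looks at 'Wax' and its suffix.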

regex

regex-lookarounds