C# Regex - only match if substring exists?

OK, so I think I've got a handle on negation. Now what about selecting only the matches that contain a specified substring?

Given:

This is a random bit of information from 0 to 1.
  This is a non-random bit of information I do NOT want to match
This is the end of this bit

This is a random bit of information from 0 to 1.
  This is a random bit of information I do want to match
This is the end of this bit

And trying the following regex:

/(?s)This is a random bit(?:(?=This is a random).)*?This is the end/g

Why doesn't this work? What am I missing?

I'm using regexstorm.com for testing...

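You broke the tempered greedy token by turning its negative lookahead into a positive one. It cannot work that way, because the positive lookahead requires the text at every position after This is a random bit to be equal to This is a random.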

You need to:

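  • Match the leading delimiter (This is a random bit)
  • Match any 0+ characters that start neither a leading/closing delimiter nor the required text inside this block
  • Match the specific string inside the block (This is a random)
  • Match any 0+ characters that do not start a leading/closing delimiter
  • Match the closing delimiter (This is the end)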

So, use

(?s)This is a random bit(?:(?!This is a random bit|This is the end|This is a random).)*This is a random(?:(?!This is a random bit|This is the end).)*This is the end

regex demo

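  • (?s) - DOTALL mode on (. also matches newlines; the same as RegexOptions.Singleline in .NET)
  • This is a random bit - the leading delimiter
  • The first tempered greedy token, written out with comments:
      (?:                        # Start of the tempered greedy token
        (?!This is a random bit  # Leading delimiter
          |This is the end       # Trailing delimiter
          |This is a random)     # Specific string inside
        .                        # Any character
      )*                         # End of the tempered greedy token
  • This is a random - the specified substring
  • (?:(?!This is a random bit|This is the end).)* - another tempered greedy token that matches any text other than the leading/closing delimiters, up to the first...
  • This is the end - the closing delimiter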
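
To check the pattern outside an online tester, here is a minimal C# sketch that runs it against the sample text from the question. The class name BlockMatcher and the method FindBlocks are hypothetical names made up for this illustration; only Regex.Matches and the pattern itself come from .NET and the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for this illustration only.
static class BlockMatcher
{
    // The pattern from the answer, split across lines for readability.
    // The inline (?s) makes '.' match newlines as well.
    const string Pattern =
        @"(?s)This is a random bit" +                                       // leading delimiter
        @"(?:(?!This is a random bit|This is the end|This is a random).)*" + // tempered greedy token
        @"This is a random" +                                               // required substring
        @"(?:(?!This is a random bit|This is the end).)*" +                 // tempered greedy token
        @"This is the end";                                                 // closing delimiter

    public static MatchCollection FindBlocks(string input) =>
        Regex.Matches(input, Pattern);

    static void Main()
    {
        // Sample text from the question, joined with '\n' so the
        // result does not depend on the source file's line endings.
        string input = string.Join("\n",
            "This is a random bit of information from 0 to 1.",
            "  This is a non-random bit of information I do NOT want to match",
            "This is the end of this bit",
            "",
            "This is a random bit of information from 0 to 1.",
            "  This is a random bit of information I do want to match",
            "This is the end of this bit");

        foreach (Match m in FindBlocks(input))
        {
            Console.WriteLine(m.Value);
            Console.WriteLine("---");
        }
    }
}
```

On this input only the second block should be reported: the first block never contains This is a random between its delimiters, so the required literal cannot match there.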

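I hope you understand that (?:(?=This is a random).) can match only once and, if quantified, never twice in a row. Take Th as a shorter example of a lookahead: Th satisfies it, but once T is consumed the next character is h, and text starting at h can never satisfy the lookahead Th. The regex engine then moves on to the next part of the pattern and never returns to the lookahead. Use a negative lookahead instead.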
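
The difference can be seen by measuring how much text each kind of token consumes from the start of the same string. This is a rough sketch; LookaheadDemo and PrefixLength are hypothetical names introduced only for the demonstration, and the patterns are anchored with ^ so that both are tried at position 0.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, named for this illustration only.
static class LookaheadDemo
{
    // Length of the text consumed by the first match of the pattern.
    public static int PrefixLength(string text, string pattern) =>
        Regex.Match(text, pattern).Length;

    static void Main()
    {
        string text = "This is a random bit";

        // Positive lookahead: every iteration needs the remaining text to
        // start with "This is a random", which holds only at position 0.
        Console.WriteLine(PrefixLength(text, @"^(?:(?=This is a random).)*")); // 1

        // Negative lookahead: an iteration succeeds wherever the remaining
        // text does NOT start with "This is the end", so all 20 characters match.
        Console.WriteLine(PrefixLength(text, @"^(?:(?!This is the end).)*")); // 20
    }
}
```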