Regex to match a multiline tag that contains one word but not another

Question

In multiline text, I need to replace every <img> tag that contains one text (dog) but does not contain another (cat).

So given this text:

<img black 
dog>
<img dog white cat>
<img black dog>
<img cat and dog>
<img red fox>
<img black dog>

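The following tags should be found: the first, the third, and the sixth, which contain dog but not cat.

There are many ways to do this with a single-line regex using ^ and $, but I can't get it to work across multiple lines.

My first attempt used the single-line option (/s) like this: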

/<img ((?!cat).)*?(dog)>/gs

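But it also selects the tag before the last dog (red fox), because it is not greedy enough.

Then I tried to make it greedy (adding ?) by using \s\S, without the /s option: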

/<img ((?!cat)[\s\S.])*?(dog)?>/g

Now it also matches the fifth tag (<img red fox>), even though it has no dog.

How can I select my 3 dogs, without the cat or the fox?

Link to my attempt on regex101: https://regex101.com/r/AGgb4z/1

Answer 1

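You can match <img and then assert that no cat follows inside the same tag, using the negative lookahead (?![^<>]*cat)

Use the negated character class [^<>]* to match any character except < and > to the left and right of dog. Because a negated character class also matches newlines, the pattern works across lines without any flags.

You can add word boundaries, for example \bcat\b, if cat and dog should not be part of a longer word.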

<img (?![^<>]*cat)[^<>]*dog[^<>]*>
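
As a quick check, here is a minimal Python sketch that runs this pattern against the sample text from the question. The variable names, the "<img-replaced>" placeholder, and the two extra test tags (hotdog, catalog) are assumptions made for illustration, and the word-boundary variant is just one way to apply the \b suggestion above.

```python
import re

# Sample text from the question; the first tag spans two lines.
TEXT = "\n".join([
    "<img black ",
    "dog>",
    "<img dog white cat>",
    "<img black dog>",
    "<img cat and dog>",
    "<img red fox>",
    "<img black dog>",
])

# Pattern from the answer. No flags are needed: [^<>] also matches newlines,
# and it can never run past the closing > of the current tag.
PATTERN = re.compile(r"<img (?![^<>]*cat)[^<>]*dog[^<>]*>")

# Variant with word boundaries, so that "hotdog" does not count as dog
# and "catalog" does not count as cat.
PATTERN_WORDS = re.compile(r"<img (?![^<>]*\bcat\b)[^<>]*\bdog\b[^<>]*>")

# Collect the matching tags.
matches = PATTERN.findall(TEXT)
for tag in matches:
    print(repr(tag))

# Replace only the matching tags (the placeholder is hypothetical).
replaced = PATTERN.sub("<img-replaced>", TEXT)
print(replaced)
```

With the sample text, only the three dog tags without cat are matched; the cat tags and the red fox tag are left untouched by the substitution.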
