Regex to match everything until pattern with an exception in it

Question

This is my regex:

\d+%[^\.][^0-9]*?((?!original).)percentage*

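I want it to match from a percentage (e.g. 10%) up to the word "percentage":

10% whatever percentage

unless the text in between contains the word "original":

10% original percentage

So "whatever" can be any words before "percentage", unless they include the word "original".

I got my regex partly working, but it only behaves correctly when "percentage" is at the start of a new line: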

In some cases, 10% of the sales starts with the original percentage --> my regex matches this string, but I don't want it to because it contains the word "original"

The 10% of the sales starts with a certain percentage --> my regex matches this string, which is fine because it doesn't contain the word "original"

The 10% of the original
percentage of the sale is higher--> my regex doesn't match this string, which is fine because it contains the word "original" (maybe because the new line starts with percentage?)

The 10% of the original sale
is the percentage of that --> my regex matches this string, but I don't want it to because it contains the word "original"

Sorry if my explanation is a bit odd; English is not my native language.

Thanks!!!

Answer 1

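You have to repeat this part ((?!original).), because without a quantifier it matches exactly one character, and you should drop the * after percentage*, because that * only makes the final e character repeat zero or more times.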
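The two problems described above can be checked in isolation. This is a minimal sketch using Python's `re` module; the variable names are illustrative and do not come from the question or the answer.

```python
import re

# Illustrative names; none of these come from the question or the answer.
single = re.compile(r"(?:(?!original).)")     # tempered dot, not repeated
repeated = re.compile(r"(?:(?!original).)*")  # tempered dot, repeated
loose = re.compile(r"percentage*")            # the * binds to the final "e" only

one_char_only = single.fullmatch("a certain ") is None
spans_clean_text = repeated.fullmatch("a certain ") is not None
stops_at_original = repeated.fullmatch("the original ") is None
accepts_truncated = loose.fullmatch("percentag") is not None
accepts_extra_e = loose.fullmatch("percentageee") is not None

print(one_char_only, spans_clean_text, stops_at_original)  # True True True
print(accepts_truncated, accepts_extra_e)                  # True True
```

The unrepeated group consumes a single character, so it cannot span the words between the percentage and "percentage". The repeated group stops as soon as "original" is directly ahead.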

Then, if you don't want to match digits in between, you can use [^\d\r\n] instead of . to match any character except a newline or a digit:

\d+%[^.](?:(?!original\b)[^\d\r\n])*\bpercentage\b

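The pattern matches:

\d+% Match 1+ digits and %
[^.] Match any character except a dot (note that this is a broad match; you could also use a space instead)
(?: Non-capturing group
- (?!original\b)[^\d\r\n] Match any character except a digit or a newline, asserting that what is directly to the right is not original
)* Close the group and repeat it 0+ times
\bpercentage\b Match percentage between word boundaries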
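To see how the breakdown above plays out, here is a small sketch that runs the suggested pattern against the four sample strings from the question with Python's `re` module. The names `PATTERN`, `samples`, `find_match` and `results` are made up for this illustration.

```python
import re

# Pattern from the answer; the helper and variable names are illustrative.
PATTERN = re.compile(r"\d+%[^.](?:(?!original\b)[^\d\r\n])*\bpercentage\b")

samples = [
    "In some cases, 10% of the sales starts with the original percentage",
    "The 10% of the sales starts with a certain percentage",
    "The 10% of the original\npercentage of the sale is higher",
    "The 10% of the original sale\nis the percentage of that",
]

def find_match(text):
    """Return the matched substring, or None when the pattern does not match."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

results = [find_match(s) for s in samples]
for sample, result in zip(samples, results):
    print(repr(sample), "->", repr(result))
```

Only the second sample should produce a match ("10% of the sales starts with a certain percentage"). Note that because [^\d\r\n] excludes line breaks, this pattern never matches across lines, and because of the \b after original, a longer word such as "originally" would not block a match.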

python

regex

regex-negation

regex-lookarounds