Negative lookahead is ignored

Question

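Consider the following string:

Laptop HP Chromebook 14 G3 NVIDIA Tegra SOC 4GB DDRL 32GB FLASH 14inch 1366X768 Webcam Chrome OS

I am trying to match the memory amount and type (4GB DDRL), but only when it is not followed by something else that looks like a memory amount. I thought negative lookahead existed for exactly this purpose: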

(?!\d+\s?(gb|tb)) is my negative lookahead.

Now applied:

/(?:\d+)\s?(?:gb|tb)\s?(?:ddrl|ddr2)\s?(?!\d+\s?(gb|tb))/i

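The 4gb ddrl part of my string still matches, even though it is followed by the 32gb part, which my negative lookahead is supposed to rule out. If I change the negative lookahead to a simple capturing group, my regex correctly captures the whole 4gb ddrl 32gb part of the string.

What am I doing wrong?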

Answer 1

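Since you declared the space as optional, the regex engine will also try to match without consuming it. At that position, 4GB DDRL is not directly followed by 32GB FLASH (a space comes first), so the lookahead passes and 4GB DDRL is matched.

To fix it, put the optional space inside your lookahead: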

(?:\d+)\s?(?:gb|tb)\s?(?:ddrl|ddr2)(?!\s?\d+\s?(gb|tb))

See the demo.
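
In case the demo link is unavailable, here is a minimal sketch in Python's re module that compares the two patterns on the string from the question. The choice of Python and the variable names are my own assumptions; the question does not say which regex flavor is in use, though this behavior should be the same in most backtracking engines.

```python
import re

text = ("Laptop HP Chromebook 14 G3 NVIDIA Tegra SOC 4GB DDRL 32GB FLASH "
        "14inch 1366X768 Webcam Chrome OS")

# Pattern from the question: the optional space sits outside the lookahead.
original = re.compile(
    r"(?:\d+)\s?(?:gb|tb)\s?(?:ddrl|ddr2)\s?(?!\d+\s?(gb|tb))", re.I)

# Pattern from this answer: the optional space sits inside the lookahead.
fixed = re.compile(
    r"(?:\d+)\s?(?:gb|tb)\s?(?:ddrl|ddr2)(?!\s?\d+\s?(gb|tb))", re.I)

original_match = original.search(text)  # still finds "4GB DDRL"
fixed_match = fixed.search(text)        # finds nothing: " 32GB" follows

# When no second memory amount follows, the fixed pattern still matches.
other_match = fixed.search("NVIDIA Tegra SOC 4GB DDRL 14inch")

print(original_match.group(0) if original_match else None)
print(fixed_match.group(0) if fixed_match else None)
print(other_match.group(0) if other_match else None)
```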

Answer 2

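Backtracking is the keyword here.

When \s? matches a whitespace character, the lookahead (?!\d+\s?(gb|tb)) runs into 32GB and fails. The engine then "rolls back" to the position before that whitespace, where \s? matches nothing. From there the text ahead is a whitespace followed by 32GB, which \d+ cannot match, so the negative lookahead succeeds and seals the deal.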
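
One way to see the rollback is to look at where the match ends. The sketch below assumes Python's re module and a shortened test string of my own. It shows that the match stops right before the space, and that making the trailing whitespace mandatory (a variation used here only for illustration, not a recommended fix) removes the escape route.

```python
import re

text = "4GB DDRL 32GB FLASH"

# The question's pattern, with non-capturing groups for brevity.
optional_space = re.compile(
    r"\d+\s?(?:gb|tb)\s?(?:ddrl|ddr2)\s?(?!\d+\s?(?:gb|tb))", re.I)

m = optional_space.search(text)
matched = m.group(0)        # the trailing \s? ended up matching nothing
next_char = text[m.end()]   # the space that \s? gave back

# Illustrative variation: a mandatory \s cannot be given back, so the
# lookahead is only ever tested right in front of "32GB" and the match fails.
mandatory_space = re.compile(
    r"\d+\s?(?:gb|tb)\s?(?:ddrl|ddr2)\s(?!\d+\s?(?:gb|tb))", re.I)
strict_match = mandatory_space.search(text)

print(repr(matched), repr(next_char), strict_match)
```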

Use

\d+\s*[gt]b\s*ddr[l2](?!\s*\d+\s*[gt]b)

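See the regex proof. The lookahead cannot be retried at an earlier position here, because the l and 2 in [l2] have no quantifier, so there is nothing optional in front of the lookahead for the engine to give back.

Explanation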

--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  [gt]                     any character of: 'g', 't'
--------------------------------------------------------------------------------
  b                        'b'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  ddr                      'ddr'
--------------------------------------------------------------------------------
  [l2]                     any character of: 'l', '2'
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    [gt]                     any character of: 'g', 't'
--------------------------------------------------------------------------------
    b                        'b'
--------------------------------------------------------------------------------
  )                        end of look-ahead
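
The same breakdown can be kept next to the pattern itself. This is a sketch assuming Python's re module with re.VERBOSE; the helper name find_memory and the sample strings are hypothetical and only there for illustration.

```python
import re

# Same pattern as above, written in verbose mode so each part is commented.
memory = re.compile(
    r"""
    \d+ \s* [gt]b      # amount and unit, e.g. "4GB" or "2 TB"
    \s* ddr[l2]        # memory type: "ddrl" or "ddr2"
    (?!                # reject the match if what follows is ...
        \s* \d+ \s* [gt]b   # ... another amount and unit
    )
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_memory(text):
    """Return the first memory amount and type, or None if there is none."""
    m = memory.search(text)
    return m.group(0) if m else None


print(find_memory("Tegra SOC 4GB DDRL 32GB FLASH"))  # followed by 32GB
print(find_memory("Tegra SOC 4GB DDRL 14inch"))      # not followed by memory
```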

regex

regex-lookarounds