Remove a sequence of pipe-separated numbers using regex

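I am trying to match a sequence of four numbers separated by pipes in a string. The numbers can be negative, floating-point, or double-digit, for example: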

13|5|-1|35|5|0|313|4|1.5|1

The string may also contain other numbers and words; a full example looks like this:

SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE

How can I use a regex to identify those numbers to the left and right of the pipes?

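I tried [\d\-.\|], which matches all digits, decimal points, pipes, and minus signs, but it also matches the other number/decimal content in the string. Any help with selecting only that section would be greatly appreciated!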
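
The over-matching is easy to reproduce. Below is a minimal sketch, assuming Python (the question does not name a language); the `+` quantifier is added here only to group adjacent characters and is not part of the original attempt:

```python
import re

text = "SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE"

# The character class accepts any run of digits, '-', '.', and '|',
# so it cannot tell the pipe-separated group apart from a lone number.
naive_hits = re.findall(r"[\d\-.\|]+", text)
print(naive_hits)  # ['13|5|-1|3', '1.6']
```

The stray `1.6` is the extra content that should be left alone.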

You can use

-?\b\d+(?:\.\d+)?(?:\|\-?\d+(?:\.\d+)?){3}\b

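The pattern matches:

  • -? matches an optional -
  • \b is a word boundary that prevents partial matches
  • \d+(?:\.\d+)? matches one or more digits with an optional fractional part
  • (?:\|\-?\d+(?:\.\d+)?){3} matches three repetitions of a pipe followed by an optionally negative number of the same form as above
  • \b is a word boundary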

Regex demo
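
As a rough illustration of how this pattern could be applied to actually remove the sequence, here is a small sketch assuming Python's `re` module. The name `PIPE_QUAD` and the whitespace clean-up step are illustrative choices, not part of the answer:

```python
import re

# Pattern copied verbatim from the answer above.
PIPE_QUAD = re.compile(r"-?\b\d+(?:\.\d+)?(?:\|\-?\d+(?:\.\d+)?){3}\b")

text = "SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE"

# Only the four-number group is found; the lone 1.6 is ignored.
matches = PIPE_QUAD.findall(text)
print(matches)  # ['13|5|-1|3']

# Delete the match, then collapse the double space it leaves behind.
cleaned = " ".join(PIPE_QUAD.sub("", text).split())
print(cleaned)  # SOME STRING CONTENT MORE 1.6 CONTENT HERE
```

Because the pattern is anchored only by word boundaries, it also matches four-number slices inside a longer pipe-separated run, which may or may not be what you want.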

Alternatively, use

(?<!\S)-?\d*\.?\d+(?:\|-?\d*\.?\d+){3}(?!\S)

See the proof.

Explanation

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  -?                       '-' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \d*                      digits (0-9) (0 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  \.?                      '.' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (3 times):
--------------------------------------------------------------------------------
    \|                       '|'
--------------------------------------------------------------------------------
    -?                       '-' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d*                      digits (0-9) (0 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    \.?                      '.' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  ){3}                     end of grouping
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-ahead
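
The whitespace lookarounds in this pattern behave differently from the word boundaries in the first one. The sketch below, assuming Python, shows the removal plus two edge cases; `STRICT` and `remove_quads` are hypothetical names chosen for illustration:

```python
import re

# Second pattern from the answer: the match must be delimited by
# whitespace (or the string edges) rather than by word boundaries.
STRICT = re.compile(r"(?<!\S)-?\d*\.?\d+(?:\|-?\d*\.?\d+){3}(?!\S)")

def remove_quads(text: str) -> str:
    """Delete whitespace-delimited groups of four pipe-separated numbers."""
    stripped = STRICT.sub("", text)
    # Collapse the extra space left where the group used to be.
    return " ".join(stripped.split())

text = "SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE"
print(remove_quads(text))  # SOME STRING CONTENT MORE 1.6 CONTENT HERE

# \d*\.?\d+ also accepts numbers written without a leading digit.
print(STRICT.findall("x .5|1|2|3 y"))  # ['.5|1|2|3']

# A longer run is rejected, since no four-number slice is followed by whitespace.
print(STRICT.findall("13|5|-1|35|5|0|313|4|1.5|1"))  # []
```

Choosing between the two patterns comes down to whether a four-number group embedded in a longer pipe-separated run should count as a match.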