Regex for excluding a certain phrase from a txt file

Question

I need to retrieve the numbers from a txt file that looks more or less like this:

[  Index 1  ]
1628 5704
32801 61605
71508 90612
1026061

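I need to ignore the number in the Index line.

[0-9]+ retrieves all the numbers, including the one in the index.

I tried something called a negative lookahead, like (?![(Index 1)])([0-9]+). It does ignore the index 1, but it also drops leading 1s everywhere else: for example, 1628 becomes 628. Thanks for your help, I have always been weak at regex syntax :/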

Answer 1

Use

\b(?<!Index )\d+

See the proof.

Explanation

--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    Index                    'Index '
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
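
As a rough illustration, the pattern can be tried in Python with the standard re module. This is only a sketch: SAMPLE is the text from the question pasted in as a string, and the variable names are made up for the example. For contrast it also runs the attempt from the question, where [(Index 1)] is a character class (any one of the characters "(", "I", "n", "d", "e", "x", space, "1" or ")"), so that lookahead rejects every position where the next character is 1.

```python
import re

# Sample text copied from the question (an assumption about the file's contents).
SAMPLE = """[  Index 1  ]
1628 5704
32801 61605
71508 90612
1026061
"""

# \b pins the match to the start of a number; the negative lookbehind
# then rejects a number that directly follows "Index ".
numbers = re.findall(r"\b(?<!Index )\d+", SAMPLE)
print(numbers)
# ['1628', '5704', '32801', '61605', '71508', '90612', '1026061']

# The attempt from the question: the lookahead refuses to start a match
# on any character in the class, which includes the digit 1.
broken = re.findall(r"(?![(Index 1)])([0-9]+)", SAMPLE)
print(broken)
# ['628', '5704', '32801', '61605', '71508', '90612', '026061']
```

The \b matters once the index has more than one digit: without it, the lookbehind only fails at the first digit, so for "Index 12" the trailing 2 would still be matched.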

Answer 2

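This pattern matches only the numbers. It looks for a run of one or more digits at the start of the string, or a run of one or more digits at the end of the string. For the multi-line sample above this relies on multiline mode, so that ^ and $ anchor at every line.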

^\d+|\d+$

https://regex101.com/r/ZNTxQ7/1
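
A small Python sketch of this pattern, assuming the file contents are held in a string called SAMPLE (a name chosen for the example). In Python the multiline behaviour has to be requested with re.MULTILINE; the second call shows what happens without it.

```python
import re

# Sample text copied from the question (an assumption about the file's contents).
SAMPLE = """[  Index 1  ]
1628 5704
32801 61605
71508 90612
1026061
"""

# With re.MULTILINE, ^ and $ match at the start and end of every line.
per_line = re.findall(r"^\d+|\d+$", SAMPLE, flags=re.MULTILINE)
print(per_line)
# ['1628', '5704', '32801', '61605', '71508', '90612', '1026061']

# Without the flag, only the very start and very end of the text are anchors.
whole_string = re.findall(r"^\d+|\d+$", SAMPLE)
print(whole_string)
# ['1026061']
```

This approach depends on the layout: a line holding three or more numbers would lose the ones in the middle, since they touch neither end of the line.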

Another approach is to look for runs of two or more digits in the string.

\d{2,}

https://regex101.com/r/Di75KT/1
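
The same kind of sketch for this variant, again assuming the text is available as a string named SAMPLE. It works on the sample only because the index is a single digit and every wanted number has at least two; the second call uses a made-up input to show where that assumption breaks.

```python
import re

# Sample text copied from the question (an assumption about the file's contents).
SAMPLE = """[  Index 1  ]
1628 5704
32801 61605
71508 90612
1026061
"""

# Any run of two or more digits; the single-digit index is skipped.
numbers = re.findall(r"\d{2,}", SAMPLE)
print(numbers)
# ['1628', '5704', '32801', '61605', '71508', '90612', '1026061']

# Hypothetical input: a two-digit index is picked up, a one-digit value is lost.
edge_case = re.findall(r"\d{2,}", "[  Index 12  ]\n7 1628")
print(edge_case)
# ['12', '1628']
```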

python

regex

python-re