Ignore empty lines [\t\s] spaces or tabs

Question

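So I have this regex:

Basically, I'm trying to ignore all line numbers that are followed by a space or by nothing at all. My current regex: ^[\d\s].+(?:[A-Z\s]*)*$

Line numbers with nothing after them are not actually being ignored.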

Answer 1

Your regex matches only 1 digit; change it to this simplified version:

^\d+\b.+$
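
To see how this pattern behaves, here is a minimal sketch. The sample lines are made up for illustration and are not from the question. It suggests that bare line numbers are skipped, but a line number followed only by a space still matches, because .+ accepts the trailing space.

```python
import re

# Pattern from this answer: 1+ digits, a word boundary, then 1+ characters.
pattern = re.compile(r"^\d+\b.+$", re.MULTILINE)

# Hypothetical sample input (an assumption, not taken from the question).
sample = "1 first line\n2\n10\n11 second line\n12 "

matches = pattern.findall(sample)
print(matches)
# "2" and "10" are skipped; "12 " (number plus a trailing space) is still matched.
```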

Answer 2

You can use a negative lookahead to assert that what follows is not 1+ digits followed by 0+ whitespace characters:

^(?!\d+\s*$)\d+.+$

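^ Start of string (start of each line with re.MULTILINE)
(?!\d+\s*$) Negative lookahead asserting that what is on the right is not 1+ digits followed by 0+ whitespace characters and the end of the string
\d+.+ Match 1+ digits and then 1+ times any character except a newline
$ End of string (end of each line with re.MULTILINE)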
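
As a quick check of the lookahead, the sketch below runs the pattern on a few made-up lines (the sample text is an assumption, not from the question), including a line number followed only by a space and a tab.

```python
import re

# Pattern from this answer, compiled with MULTILINE so ^ and $ work per line.
pattern = re.compile(r"^(?!\d+\s*$)\d+.+$", re.MULTILINE)

# Hypothetical sample: "2" and "10" are bare numbers, "3 \t" has only whitespace after it.
sample = "1 first line\n2\n3 \t\n10\n11 second line"

kept = pattern.findall(sample)
print(kept)
# Only the numbered lines that carry text after the number are returned.
```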

See the regex demo | Python demo

Example using findall:

import re
regex = r"^(?!\d+\s*$)\d+.+$"
test_str = ("Here goes some text. {tag} A wonderful day. It's soon cristmas.\n"
    "2 Happy 2019, soon. {Some useful tag!} Something else goes here.\n"
    "3 Happy ending. Yeppe! See you.\n"
    "4\n"
    "5 Happy KKK!\n"
    "6 Happy B-Day!\n"
    "7\n"
    "8 Universe is cool!\n"
    "9\n"
    "10 {Tagish}.\n"
    "11\n"
    "12 {Slugish}. Here goes another line. {Slugish} since this is a new sentence.\n"
    "13\n"
    "14 endline.")
# re.MULTILINE makes ^ and $ match at the start and end of every line
print(re.findall(regex, test_str, re.MULTILINE))

When the number is followed by a dot, you can use:

^(?!\d+\.\s*$)\d+.+$
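
A small sketch of the dot variant on made-up input (the sample lines are assumptions). Note that this variant requires the dot in the lookahead, so a bare multi-digit number such as "10" is no longer rejected. The second pattern, with an optional dot (\.?), is a hypothetical combination that is not part of the original answer and is shown only as one possible way to cover both cases.

```python
import re

# Pattern from this answer: reject "number + dot + optional whitespace" lines.
dotted = re.compile(r"^(?!\d+\.\s*$)\d+.+$", re.MULTILINE)

# Hypothetical variation (an assumption): the dot is optional in the lookahead.
optional_dot = re.compile(r"^(?!\d+\.?\s*$)\d+.+$", re.MULTILINE)

# Made-up sample inputs for illustration.
dotted_sample = "1. first\n2.\n3. \n4. last"
mixed_sample = "1. first\n2.\n3\n10\n4 last"

dotted_kept = dotted.findall(dotted_sample)
mixed_kept = optional_dot.findall(mixed_sample)
bare_number = dotted.findall("10")

print(dotted_kept)
print(mixed_kept)
print(bare_number)  # the dot-only variant still matches a bare "10"
```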
