Regex - negative lookbehind for any character excluding pure whitespace

Question

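I'm trying to write a regex pattern that fails to match if the text preceding it on the line contains any character other than pure whitespace, for example: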

--hello (match)
--goodbye (match)
ROW_NUMBER() OVER (ORDER BY DATE) --date (fail)
  --comment with some indentation (match)
    --another comment with some indentation (match)

The closest I've got is this pattern, (?<!.)--.*\n, which gives me this result:

--hello (match)
--goodbye (match)
ROW_NUMBER() OVER (ORDER BY DATE) --date (fail)
  --comment with some indentation (fail)
    --another comment with some indentation (fail)

I've also tried (?<!\s)--.*\n and (?<=\S)--.*\n, but both return no matches at all.

Edit: regexr.com illustrates the problem more clearly: regexr.com/6j0mt
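
For reference, the behaviour of the first attempt can be reproduced with Python's built-in re module. This is only a minimal sketch: the sample string and the variable names are assumptions made for illustration. Because . does not match a newline by default, (?<!.) only inspects the single character immediately before --, so an indented comment is rejected as soon as a space precedes it.

```python
import re

# Sample input, assumed for illustration (each line ends with a newline).
sample = (
    "--hello\n"
    "--goodbye\n"
    "ROW_NUMBER() OVER (ORDER BY DATE) --date\n"
    "  --comment with some indentation\n"
)

# (?<!.) asserts only that the ONE character before "--" is not matched by ".",
# i.e. that "--" sits at the very start of the string or right after a newline.
attempt = re.findall(r"(?<!.)--.*\n", sample)
print(attempt)
```

Only the two unindented comments are returned, which matches the result listed in the question.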

Answer 1

With the PyPI regex module, you can use

import regex

text = r"""--hello
--goodbye
ROW_NUMBER() OVER (ORDER BY DATE) --date
  --comment with some indentation
    --another comment with some indentation"""

print( regex.findall(r'(?<=^[^\S\r\n]*)--.*', text, regex.M) )
# => ['--hello', '--goodbye', '--comment with some indentation', '--another comment with some indentation']

See this Python demo online.

Alternatively, with Python's built-in re module:

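import re

text = r"""--hello
--goodbye
ROW_NUMBER() OVER (ORDER BY DATE) --date
  --comment with some indentation
    --another comment with some indentation"""

# With a single capturing group, findall returns only the group 1 values
print( re.findall(r'^[^\S\r\n]*(--.*)', text, re.M) )
# => ['--hello', '--goodbye', '--comment with some indentation', '--another comment with some indentation']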

See this Python demo.
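
The capturing group is used here because the built-in re module only accepts fixed-width lookbehinds, so the lookbehind version cannot be compiled with it. A short sketch of that limitation follows; the helper name compiles is hypothetical and exists only for this illustration.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if the built-in re module accepts the pattern."""
    try:
        re.compile(pattern, re.M)
        return True
    except re.error:
        # re rejects lookbehinds whose width is not fixed
        return False

# Variable-width lookbehind (works with the PyPI regex module, not with re)
lookbehind_ok = compiles(r"(?<=^[^\S\r\n]*)--.*")
# Equivalent approach for re: consume the indentation, capture the comment
group_ok = compiles(r"^[^\S\r\n]*(--.*)")
print(lookbehind_ok, group_ok)
```

With re, the leading whitespace becomes part of the overall match, so the comment text has to be read from group 1 rather than from the whole match.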

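Pattern details

(?<=^[^\S\r\n]*) - a positive lookbehind that matches a location immediately preceded by the start of the string/line and zero or more horizontal whitespace characters
^ - start of the string (here, of a line, because the re.M / regex.M option is used)
[^\S\r\n]* - zero or more characters other than non-whitespace, CR and LF characters (that is, any whitespace except carriage returns and line feeds)
(--.*) - Group 1: -- and the rest of the line (.* matches zero or more characters other than line break characters).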
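
To illustrate why [^\S\r\n]* is used instead of \s*, here is a small sketch comparing the two when a blank line precedes an indented comment. The sample string and variable names are assumptions made for this example.

```python
import re

# A blank line followed by an indented comment (assumed sample input)
sample = "\n  --note"

# \s* also matches line breaks, so the match can start on the blank line
loose = re.search(r"^\s*(--.*)", sample, re.M)
# [^\S\r\n]* stays on one line: it matches spaces and tabs but not CR or LF
strict = re.search(r"^[^\S\r\n]*(--.*)", sample, re.M)

print(repr(loose.group(0)))   # whole match, including the leading newline
print(repr(strict.group(0)))  # whole match, confined to the comment's line
print(strict.group(1))        # group 1: the comment itself
```

Both patterns capture the same text in group 1; the difference is that the \s* version lets the overall match spill across line boundaries, which matters if the whole match is later replaced or removed.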

python

regex

python-regex