Regex applying to way more text than intended

Question

Here is a sample of the text I am trying to build a regex for (using 1.1.1.1 as the example address for this post):

Nmap scan report for 1.1.1.1
kjerhtehrkererjh
kjhertkjherjtherjkhteter
kjehrjktherther
Nmap scan report for 1.1.1.1
Host is up (0.0011s latency).

PORT     STATE SERVICE
4786/tcp open  smart-install
| cisco-siet: 
|   Host: 1.1.1.1
|_  Status: VULNERABLE
MAC Address: XX:XX:XX:XX:XX:XX (Cisco Systems)

My goal is to capture only:

Nmap scan report for 1.1.1.1
Host is up (0.0011s latency).

PORT     STATE SERVICE
4786/tcp open  smart-install
| cisco-siet: 
|   Host: 1.1.1.1
|_  Status: VULNERABLE
MAC Address: XX:XX:XX:XX:XX:XX (Cisco Systems)

Currently, my regex looks like this: variable_containing_string.scan(/Nmap scan report.*?cisco-siet:.*?Status: VULNERABLE/m)

And this is my output:

irb(main):047:0> d.scan(/Nmap scan report.*?cisco-siet:.*?Status: VULNERABLE/m)
=> ["Nmap scan report for 1.1.1.1\nkjerhtehrkererjh\nkjhertkjherjtherjkhteter\nkjehrjktherther\nNmap scan report for 1.1.1.1\nHost is up (0.0011s latency).\n\nPORT     STATE SERVICE\n4786/tcp open  smart-install\n| cisco-siet: \n|   Host: 1.1.1.1\n|_  Status: VULNERABLE"]

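While this does capture my intended target, it also captures the earlier "Nmap scan report" line that sits above the block I want, which is not part of the target. The text I am trying to capture may be surrounded by a lot of other text, so I want a way to make sure each captured match contains only one instance of "Nmap scan report", while still capturing every such "group" in the input.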

This is basically what I am looking for:

Answer 1

You can use

/Nmap scan report(?:(?!Nmap scan report).)*?cisco-siet:(?:(?!Nmap scan report).)*?Status: VULNERABLE/m

See the regex demo.

Details:

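Nmap scan report - a fixed string (the left-hand delimiter)
(?:(?!Nmap scan report).)*? - any character, zero or more times but as few as possible, that does not start the left-hand delimiter text
cisco-siet: - a fixed string
(?:(?!Nmap scan report).)*? - again, any character, zero or more times but as few as possible, that does not start the left-hand delimiter text
Status: VULNERABLE - a fixed string (the right-hand delimiter).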
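To see the pattern at work, here is a minimal Ruby sketch. The input is modeled on the sample in the question; the "unrelated line" filler and the second report block for 2.2.2.2 are made up for illustration, to show that a stray header is skipped and that several blocks are still captured separately.

```ruby
# Sample input modeled on the question. The 2.2.2.2 block is a made-up
# second report, added only to show that multiple blocks are captured.
text = <<~TEXT
  Nmap scan report for 1.1.1.1
  unrelated line
  Nmap scan report for 1.1.1.1
  Host is up (0.0011s latency).

  PORT     STATE SERVICE
  4786/tcp open  smart-install
  | cisco-siet:
  |   Host: 1.1.1.1
  |_  Status: VULNERABLE
  Nmap scan report for 2.2.2.2
  Host is up (0.0020s latency).
  | cisco-siet:
  |   Host: 2.2.2.2
  |_  Status: VULNERABLE
TEXT

# The negative lookahead is checked before every consumed character, so a
# match can never run across another "Nmap scan report" header.
pattern = /Nmap scan report(?:(?!Nmap scan report).)*?cisco-siet:(?:(?!Nmap scan report).)*?Status: VULNERABLE/m

matches = text.scan(pattern)

puts matches.size   # => 2
puts matches.first  # starts at the second header, not at the stray first one
```

Because the pattern has no capturing groups, scan returns an array of whole-match strings, one per report block.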

Note that with the Onigmo regex engine (the one Ruby uses), the m flag is required for the . pattern to match line break characters.
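A quick way to check the effect of the flag, as a small sketch on a made-up two-line string:

```ruby
# A made-up two-line string, used only to show what the m flag changes.
two_lines = "first\nsecond"

# Without m, `.` does not match "\n", so the pattern cannot span both lines.
without_m = two_lines[/first.*second/]

# With m, `.` also matches "\n" and the whole string is matched.
with_m = two_lines[/first.*second/m]

p without_m  # => nil
p with_m     # => "first\nsecond"
```

In Ruby, m plays the role that the s ("dotall") flag plays in many other regex flavors.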
