Non-greedy search in lpeg without consuming the end match

Question

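This follows from the comments on this question.

As I understand it, in a PEG grammar a non-greedy search can be implemented by writing S <- E2 / E1 S (that is, S matches pattern E2 if possible, otherwise pattern E1 followed by S again).

However, I don't want to capture E2 in the final pattern; I want to capture everything up to E2. When trying to implement this in LPeg I ran into several problems, including an 'Empty loop in rule' error when building it into a grammar.

How would we implement the following search in an LPeg grammar: [tag] foo [/tag], where we want to capture the tag contents (' foo ' in the example) in a capture table, but stop before the end tag? From what I understand from the comments on the other question this should be possible, but I can't find an example in LPeg.

Here is a snippet of the test grammar: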

local tag_start = P"[tag]"
local tag_end = P"[/tag]"

G = P{'Pandoc', 
  ...
  NotTag = #tag_end + P"1" * V"NotTag"^0;
  ...
  tag = tag_start * Ct(V"NotTag"^0) * tag_end;
}

Answer 1

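It's me again. I think you need a better understanding of LPeg captures. A table capture (lpeg.Ct) collects the captures made inside it into a table. Since the NotTag rule does not specify any simple capture (lpeg.C), there is nothing to collect, so the final capture is an empty table {}.

Once again, I suggest you start with lpeg.re, because it is more intuitive.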
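To make the difference concrete, here is a minimal sketch in plain LPeg. The helper `build` and its `wrap` parameter are hypothetical names introduced only for this illustration; the grammar uses the recursive form S <- E2 / E1 S with any single character (`P(1)`) as E1, which is an assumption about what the question intended.

```lua
local lpeg = require('lpeg')
local P, C, Ct, V = lpeg.P, lpeg.C, lpeg.Ct, lpeg.V

local tag_start = P'[tag]'
local tag_end = P'[/tag]'

-- build() is a hypothetical helper: it creates the same grammar twice,
-- differing only in how the NotTag call is wrapped inside Ct.
local function build(wrap)
  return P{
    'tag',
    tag = tag_start * Ct(wrap(V'NotTag')) * tag_end,
    -- Stop (without consuming) when the end tag is next,
    -- otherwise consume one character and try again.
    NotTag = #tag_end + P(1) * V'NotTag',
  }
end

local without_c = build(function(p) return p end) -- no simple capture
local with_c = build(C)                           -- NotTag wrapped in lpeg.C

local empty = without_c:match('[tag] foo [/tag]')
local filled = with_c:match('[tag] foo [/tag]')

print(#empty)     -- number of entries collected without lpeg.C
print(filled[1])  -- text collected with lpeg.C
```

Without `lpeg.C` the table returned by `Ct` has no entries; with it, the first entry is the text between the two tags.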

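-- re ships with LPeg and is normally loaded as 're';
-- the original answer loaded it as 'lpeg.re', which some setups may expose.
local re = require('re')
-- inspect is the third-party inspect.lua pretty-printer.
local inspect = require('inspect')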

local g = re.compile[=[--lpeg
  tag       <- tag_start {| {NotTag} |} tag_end
  NotTag    <- &tag_end / . NotTag
  
  tag_start <- '[tag]'
  tag_end   <- '[/tag]'
]=]

print(inspect(g:match('[tag] foo [/tag]')))
-- output: { " foo " }

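Also, S <- E2 / E1 S is not the same as S <- E2 / E1 S*; the two are not equivalent.

However, if I were doing the same task, I would not try to use a non-greedy match, because a non-greedy match is typically slower than a greedy one.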
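The difference shows up as soon as the grammar is built. The sketch below assumes the question's NotTag rule with `P(1)` as E1; `try_build` is a hypothetical helper. In the starred form the loop body is a rule that can succeed without consuming input (through `#tag_end`), and LPeg rejects such a loop when it checks the grammar.

```lua
local lpeg = require('lpeg')
local P, V = lpeg.P, lpeg.V

local tag_end = P'[/tag]'

-- try_build() is a hypothetical helper: it runs a grammar constructor
-- under pcall so that a construction error is returned, not raised.
local function try_build(make)
  return pcall(make)
end

-- S <- E2 / E1 S : plain right recursion.
local ok_plain = try_build(function()
  return P{ 'NotTag', NotTag = #tag_end + P(1) * V'NotTag' }
end)

-- S <- E2 / E1 S* : a loop over a rule that may match the empty string.
local ok_starred, err = try_build(function()
  return P{ 'NotTag', NotTag = #tag_end + P(1) * V'NotTag'^0 }
end)

print(ok_plain)    -- whether the plain form was accepted
print(ok_starred)  -- whether the starred form was accepted
print(err)         -- LPeg's error message for the starred form
```

The plain form is accepted, while the starred form fails at construction time, which is most likely the source of the 'Empty loop in rule' error mentioned in the question.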

tag <- tag_start {| {( !tag_end . (!'[' .)* )*} |} tag_end

Combining a not-predicate with greedy matching is enough.
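For readers who prefer the plain LPeg API over re, the same idea can be sketched as follows. This is a simplified variant, not a translation of the re pattern above: it drops the inner `(!'[' .)*` fast path and relies only on the not-predicate plus a greedy loop. The variable names are illustrative.

```lua
local lpeg = require('lpeg')
local P, C, Ct = lpeg.P, lpeg.C, lpeg.Ct

local tag_start = P'[tag]'
local tag_end = P'[/tag]'

-- (P(1) - tag_end) is LPeg's way of writing "!tag_end ." :
-- one character, provided the end tag does not start here.
-- The ^0 loop is greedy and stops right before the end tag.
local content = C((P(1) - tag_end)^0)

-- Ct collects the single capture made by content into a table.
local tag = tag_start * Ct(content) * tag_end

local result = tag:match('[tag] foo [/tag]')
local nested = tag:match('[tag] a [b] c [/tag]')

print(result[1])
print(nested[1])
```

Because the end tag is matched outside the `C(...)` capture, it is consumed by the overall pattern but never appears in the captured text.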

Tags: lua, peg, lpeg