Suppress empty result from 'many' in 'seq(p, many(p))' construct with parser combinators
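I'm trying to build up parser combinators following Hutton and Meijer's "Monadic Parser Combinators". My implementation is in PostScript, but I think my issue is general to combinator parsers and not specific to my implementation.
As a small exercise, I'm using the parsers to recognize regular expressions.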
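(pc9.ps)run                                        % load the combinator library
/Dot (.) char def                                  % a literal '.'
/Meta (*+?) anyof def                              % one of the operators * + ?
/Character (*+?.|()) noneof def                    % any character that is not special
/Atom //Dot //Character plus def                   % Dot or Character
/Factor //Atom //Meta maybe seq def                % Atom, then an optional Meta
/Term //Factor //Factor many seq def               % one Factor, then zero or more Factors
/Expression //Term (|) char //Term xthen many seq def  % Term, then zero or more '|' Term
/regex { string-input //Expression exec ps } def   % parse a string, then show the stack
(abc|def|ghi) regex
quit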
It's working, but the output has a lot of [] empty arrays, and they really get in my way when I try to bind handlers to process the values.
$ gsnd -q -dNOSAFER pc9re2.ps
stack:
[[[[[97 []] [[98 []] [[99 []] []]]] [[[100 []] [[101 []] [[102 []]
[]]]] [[[103 []] [[104 []] [[105 []] []]]] []]]] null]]
These appear whenever the 'seq' sequencing combinator accepts the result of a 'maybe' or 'many' (which uses 'maybe') that matched zero occurrences.
What's the normal way to exclude this extra noise from the output with parser combinators?
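Sigh. It appears I can just implement around it. I added special code in 'seq' to detect the empty right-hand side and discard it. On to the other problems...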
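To make the source of the noise and the workaround concrete, here is a minimal sketch in Python rather than PostScript, since the pc9.ps library itself is not shown. Everything in it is an assumption for illustration: the names seq_naive, seq_drop_empty and term_parser are hypothetical, parsers are simplified to deterministic functions from (string, position) to (value, new position) or None, and Atom is reduced to Character only. It assumes that 'seq' pairs its two results and that 'maybe' yields an empty array on zero occurrences, which is what the output above suggests.

```python
def anyof(chars):
    """Match one character that is in chars; the result is its character code."""
    def parse(s, i):
        if i < len(s) and s[i] in chars:
            return ord(s[i]), i + 1
        return None
    return parse

def noneof(chars):
    """Match one character that is not in chars."""
    def parse(s, i):
        if i < len(s) and s[i] not in chars:
            return ord(s[i]), i + 1
        return None
    return parse

def maybe(p):
    """Zero or one p; zero occurrences yield an empty list."""
    def parse(s, i):
        r = p(s, i)
        return r if r is not None else ([], i)
    return parse

def seq_naive(p, q):
    """Always pair the two results, even when the right one is empty."""
    def parse(s, i):
        r1 = p(s, i)
        if r1 is None:
            return None
        a, j = r1
        r2 = q(s, j)
        if r2 is None:
            return None
        b, k = r2
        return [a, b], k
    return parse

def seq_drop_empty(p, q):
    """Like seq_naive, but discard an empty right-hand result."""
    def parse(s, i):
        r1 = p(s, i)
        if r1 is None:
            return None
        a, j = r1
        r2 = q(s, j)
        if r2 is None:
            return None
        b, k = r2
        if isinstance(b, list) and not b:
            return a, k          # special case: nothing on the right
        return [a, b], k
    return parse

def term_parser(seq):
    """Term = Factor Factor*, Factor = Atom Meta?, built on the given seq."""
    def many(p):
        def parse(s, i):
            return maybe(seq(p, parse))(s, i)
        return parse
    atom = noneof("*+?.|()")
    meta = anyof("*+?")
    factor = seq(atom, maybe(meta))
    return seq(factor, many(factor))

print(term_parser(seq_naive)("abc", 0))
# ([[97, []], [[98, []], [[99, []], []]]], 3)
print(term_parser(seq_drop_empty)("abc", 0))
# ([97, [98, 99]], 3)
```

The special case removes the empty arrays, but the result is still nested, and a lone match comes back as a bare value instead of a one-element array, so handlers have to cope with both shapes.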
Edit: I ran into the same problem again in version 11 (and a half). Now I have a better solution IMO:
https://groups.google.com/g/comp.lang.functional/c/MbJxrJSk8Mw/m/MoT3Dr0IAwAJ
Ugh. I think it wasn't even an X/Y problem. It was a "doctor it hurts
when I move my arm like this; ... so don't move your arm like that"
problem.
I want the "result" part of the "reply" structure (using new terms
following usage from the Parsec document) to be any of the /usual/
PostScript types: integer, real, string, boolean, array, dictionary.
But I also need some way to arbitrarily combine or concatenate two
objects regardless of type. My 'then' (aka 'seq') combinator needs to
do this. So I made a hack-y function that does the combining. If it
has two arrays, it composes the contents into a longer array. If it
has one array and some other object it extends the array by one and
stuffs the object in the front or back as appropriate. If it has two
non-array objects it makes a new 2-element array to contain them.
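
The combining rules just described can be sketched as follows. This is a Python illustration under the assumption that Python lists stand in for PostScript arrays and every other value counts as a non-array; it is not the PostScript code itself.

```python
from functools import reduce

def append(a, b):
    """Combine two results of any type into one flat list."""
    a_is_list = isinstance(a, list)
    b_is_list = isinstance(b, list)
    if a_is_list and b_is_list:
        return a + b       # two arrays: one longer array
    if a_is_list:
        return a + [b]     # array, then object: object goes on the back
    if b_is_list:
        return [a] + b     # object, then array: object goes on the front
    return [a, b]          # two non-arrays: a new 2-element array

print(append([1, 2], [3, 4]))   # [1, 2, 3, 4]
print(append([1, 2], "x"))      # [1, 2, 'x']
print(append(1.5, [True]))      # [1.5, True]
print(append(97, 98))           # [97, 98]

# Folding a run of results stays flat instead of nesting pairs.
print(reduce(append, [97, 98, 99]))   # [97, 98, 99]
```

One consequence of this design is that a result which is itself meant to be an array loses its boundary when it is combined with another array, so grouping has to be reintroduced explicitly where it matters.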
So, instead of building 'xthen' and 'thenx' off of 'then' and needing
to cons, car, and cdr the stuff, I can write all 3 of these as a more
general parameterized function.
sequence{ p q u }{
    { /p exec +is-ok {
        next x-xs force /q exec +is-ok {
            next x-xs 3 1 roll /u exec exch consok
        }{
            x-xs 3 2 roll ( after ) exch cons exch cons cons
        } ifelse
    } if } ll } @func
then { {append} sequence }
xthen { {exch pop} sequence }
thenx { {pop} sequence }
append { 1 index zero eq { exch pop }{
    dup zero eq { pop }{
        1 index type /arraytype eq {
            dup type /arraytype eq { compose }{ one compose } ifelse
        }{ dup type /arraytype eq { curry }{ cons } ifelse } ifelse } ifelse } ifelse }
('@func' is my own non-standard extension to PostScript that takes a
procedure body and list of parameters and wraps the procedure with
code that defines the arguments in a local dictionary. 'll' is my
hack-y PostScript way of making lambdas with hard-patched parameters,
it's short for "load all literals".)
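
For readers who don't read PostScript, here is a rough Python rendering of the same idea: one 'sequence' combinator parameterized by a function u that combines the two results. It is a sketch under stated assumptions, not a translation. ZERO is a hypothetical stand-in for 'zero', parsers are simplified to functions from (string, position) to (value, new position) or None, and the error-message branch (the part that builds the "after" message) and the lazy input handling ('force') are left out.

```python
ZERO = object()   # hypothetical stand-in for 'zero', the "no result" value

def append(a, b):
    """Combine two results; ZERO vanishes, lists are concatenated."""
    if a is ZERO:
        return b
    if b is ZERO:
        return a
    left = a if isinstance(a, list) else [a]
    right = b if isinstance(b, list) else [b]
    return left + right

def sequence(p, q, u):
    """Run p, then q, and combine their two results with u."""
    def parse(s, i):
        r1 = p(s, i)
        if r1 is None:
            return None
        a, j = r1
        r2 = q(s, j)
        if r2 is None:
            return None
        b, k = r2
        return u(a, b), k
    return parse

def then(p, q):
    return sequence(p, q, append)             # keep both results

def xthen(p, q):
    return sequence(p, q, lambda a, b: b)     # like {exch pop}: keep the right

def thenx(p, q):
    return sequence(p, q, lambda a, b: a)     # like {pop}: keep the left

def char(c):
    """Match exactly the character c; the result is its character code."""
    def parse(s, i):
        if i < len(s) and s[i] == c:
            return ord(c), i + 1
        return None
    return parse

def noneof(chars):
    """Match one character that is not in chars."""
    def parse(s, i):
        if i < len(s) and s[i] not in chars:
            return ord(s[i]), i + 1
        return None
    return parse

def maybe(p):
    """Zero or one p; zero occurrences yield ZERO instead of an empty array."""
    def parse(s, i):
        r = p(s, i)
        return r if r is not None else (ZERO, i)
    return parse

def many(p):
    """Zero or more p, built from maybe and then."""
    def parse(s, i):
        return maybe(then(p, parse))(s, i)
    return parse

letter = noneof("*+?.|()")
print(many(letter)("abc", 0))                  # ([97, 98, 99], 3)
print(xthen(char("|"), letter)("|a", 0))       # (97, 2)
print(thenx(letter, char("|"))("a|", 0))       # (97, 2)
```

In this sketch the zero-occurrence case never reaches the output, because append drops ZERO before any array is built. The trade-off is that a single match stays a bare value and term boundaries are flattened away unless a handler wraps them.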
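The code also treats executable arrays (i.e. PostScript procedures) as non-arrays for the purpose of combining result sequences. This allows the parsers to be used as syntax-directed compilers that produce programs as output.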