How to parse a sequence of words separated by double spaces using FParsec?
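Given the input:
alpha beta gamma  one two three
how can I parse it into the following?
[["alpha"; "beta"; "gamma"]; ["one"; "two"; "three"]]
With a better separator (e.g. __) I could write this, because then
sepBy (sepBy word (pchar ' ')) (pstring "__")
works. With a double space, however, the pchar in the first sepBy consumes the first space, and then the parser fails.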
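I suggest replacing sepBy word (pchar ' ') with something like this:
// attempt rewinds the consumed space when the lookahead fails,
// so the inner sepBy can stop cleanly in front of a double space
let pOneSpace = attempt (pchar ' ' .>> notFollowedBy (pchar ' '))
let pTwoSpaces = pstring "  "
// Or, if two spaces are allowed as a separator but *not* three spaces:
// let pTwoSpaces = attempt (pstring "  " .>> notFollowedBy (pchar ' '))
sepBy (sepBy word pOneSpace) pTwoSpaces
Note: untested (I don't have time at the moment), just typed into the answer box. So test it, in case I got something wrong somewhere.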
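Since the suggestion above is marked as untested, here is a minimal sketch of how one might check it as an F# script. It assumes the script is run with dotnet fsi, that FParsec is pulled from NuGet, that the separators are wrapped in attempt, and that word is a plain ASCII-letter parser. tryParse and pGroups are hypothetical names used only for this illustration, and the trailing eof is my addition so that leftover input counts as a failure.

```fsharp
#r "nuget: FParsec"
open FParsec

// Assumed word parser: one or more ASCII letters.
let word : Parser<string, unit> = many1Satisfy isAsciiLetter

// A single space that is not followed by another space.
// attempt rewinds the consumed space when the lookahead fails.
let pOneSpace : Parser<char, unit> =
    attempt (pchar ' ' .>> notFollowedBy (pchar ' '))

// Strict variant: exactly two spaces, never three.
let pTwoSpaces : Parser<string, unit> =
    attempt (pstring "  " .>> notFollowedBy (pchar ' '))

// eof makes leftover input (such as a third space) count as a failure.
let pGroups = sepBy (sepBy word pOneSpace) pTwoSpaces .>> eof

// Hypothetical helper: Some groups on success, None on any parse failure.
let tryParse (input: string) =
    match run pGroups input with
    | Success (groups, _, _) -> Some groups
    | Failure (_, _, _) -> None

printfn "%A" (tryParse "alpha beta gamma  one two three")
printfn "%A" (tryParse "alpha   beta")
```

With these assumptions, the first call should give Some with the two word groups, and the second (three spaces) should give None, because neither separator matches and eof then rejects the rest.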
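The FParsec manual says of sepBy p sep that if sep succeeds and the subsequent p fails (without changing the state), the entire sepBy fails as well. Hence, your goals are:
- to make the separator fail if it encounters more than one space character;
- to backtrack, so that the "inner" sepBy loop ends happily and passes control to the "outer" sepBy loop.
Here's how to do both: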
// this is your word parser; it can be different of course,
// I just made it as simple as possible
let pWord = many1Satisfy isAsciiLetter
// this is the Inner separator to separate individual words
let pSepInner =
    pchar ' '
    .>> notFollowedBy (pchar ' ') // guard rule to prevent 2nd space
    |> attempt // a wrapper that would fail NON-fatally
// this is the Outer separator
let pSepOuter =
    pchar ' '
    |> many1 // loop
// this is the parser that would return string list list
let pMain =
    pWord
    |> sepBy <| pSepInner // the Inner loop
    |> sepBy <| pSepOuter // the Outer loop
Usage:
run pMain "alpha beta gamma  one two three"
Success: [["alpha"; "beta"; "gamma"]; ["one"; "two"; "three"]]
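To see why the attempt wrapper matters, the following sketch runs the same nested sepBy twice, once with a non-backtracking inner separator and once with the backtracking one. It assumes an F# script run with dotnet fsi and FParsec from NuGet. The alias P, the helper parseWith and the names pSepNoBacktrack / pSepBacktrack are hypothetical, and skipMany1 and the trailing eof are my substitutions, not part of the answer above.

```fsharp
#r "nuget: FParsec"
open FParsec

// Hypothetical alias: pins the user state to unit so top-level parsers are not generic.
type P<'a> = Parser<'a, unit>

let pWord : P<string> = many1Satisfy isAsciiLetter

// Consumes one space, then the guard fails in front of a second space.
// The failure happens AFTER input was consumed, so sepBy cannot stop cleanly.
let pSepNoBacktrack : P<char> = pchar ' ' .>> notFollowedBy (pchar ' ')

// Same separator, but attempt restores the input position on failure.
let pSepBacktrack : P<char> = attempt pSepNoBacktrack

// Outer separator: one or more spaces, result discarded.
let pSepOuter : P<unit> = skipMany1 (pchar ' ')

// Hypothetical helper: builds the nested parser around the given inner separator.
let parseWith (pSepInner: P<char>) (input: string) =
    match run (sepBy (sepBy pWord pSepInner) pSepOuter .>> eof) input with
    | Success (groups, _, _) -> Some groups
    | Failure (_, _, _) -> None

let input = "alpha beta gamma  one two three"
printfn "%A" (parseWith pSepNoBacktrack input)
printfn "%A" (parseWith pSepBacktrack input)
```

Under these assumptions the first call should return None, because the inner separator fails after consuming a space and the error propagates through both sepBy loops. The second should return Some with the two word groups, since attempt turns that failure into one that leaves the state unchanged.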