How to express branch in Rebol PARSE dialect?

Question

I have a MySQL schema like this:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}

Now I want to extract some information from it: the field name, the type, and the comment (if any). The result should look like this:

["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ]

My code is:

parse data [
    any [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

But I get this result:

["id" "int" "the name" "content" "text" "something"]

I know the opt .. line is wrong.

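What I want to express is: if the COMMENT keyword is found first, extract the comment; if a line feed (lf) is found first, move on to the next iteration of the loop. But I don't know how to express that. Can anyone help?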

Answer 1

I figured out an alternative: get the data as a block! (one string per line), not as a single string!

data: read/lines %data.txt
probe data
temp: copy []

foreach d data [
    parse d [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

probe temp

Answer 2

I think this gets closer to what you are after.

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
temp: []
parse data [
  any [ 
    thru {`} copy field to {`} {`}
    some space copy field-type to [ {(} | space]
    (comm: copy "")
    opt [ thru {COMMENT} some space thru {'} copy comm to {'}]
    (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
  ]
]
probe temp

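Breaking down the differences:
1. Added an initialization for temp (temp: []).
2. Changed thru some space to some space, because this moves forward through the series in the same way. Note that the following is false:
```
parse "   " [ thru some space ]
```
3. Changed comm: "" to comm: copy "" to make sure you get a new string each time a comment is extracted (it does not appear to affect the output, but it is good practice).
4. Changed {COMMENT} thru some space to {COMMENT} some space, as per point 2.
5. Added a probe at the end for debugging output.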
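
Why comm: copy "" is the safer habit can be seen in a small sketch. The function names shared and fresh are made up for illustration, and the behaviour described assumes Rebol 2 (Rebol 3 alpha is believed to behave the same way), where a literal string inside a code block is a single series that is reused on every evaluation:

```rebol
; Without copy: s refers to the literal string stored in the function body,
; so every call appends to the same series.
shared: func [/local s] [s: "" append s "x"]

; With copy: every call starts from a fresh, empty string.
fresh: func [/local s] [s: copy "" append s "x"]

shared shared
probe shared   ; third call on the same string: "xxx" in Rebol 2

fresh fresh
probe fresh    ; always starts empty: "x"
```

In the rule above, copy comm to {'} rebinds comm to a new string rather than modifying the literal, which is why the output does not change. Without the copy, however, every row that has no comment ends up holding a reference to the same "" value inside temp.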

Note that you can use ?? (almost) anywhere in the parse rules to help with debugging; it shows where you currently are in the input.
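
A minimal sketch of that debugging aid, assuming Rebol 3 (the ?? keyword is not part of the Rebol 2 parse dialect). The input string is a shortened, made-up version of the first schema line, and the exact text printed depends on the interpreter build, so no output is shown here:

```rebol
; Each ?? prints the rule that follows it together with the current
; input position, then parsing continues as if it were not there.
parse {`id` int(10)} [
    ?? thru {`}
    ?? copy field to {`}
    ?? skip
    ?? to end
]
probe field
```

Once the rule behaves as expected, the ?? markers can simply be deleted, since they do not consume input.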

Answer 3

parse/all for string parsing

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
nodata:   charset { ()'}
dat: complement nodata

collect [   
    parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  copy "" )  
            copy rest thru "," (
                parse/all rest [
                    some [
                        [","   (keep comm) ]  
                     |  ["COMMENT"   some nodata copy comm to "'"  ]
                     |  skip                        
                    ]
                ]
            )
        ]
    ]
]
== ["id" "int" "" "name" "varchar" "the name" "content" "text" "something"]

Another (better) pure parse solution

collect [   
    probe parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  ""  further: [])  
            some [ 
            ","   (keep comm  further:  [ to end  skip]) 
            |  ["COMMENT"   some nodata copy comm to "'"  ]
            |  skip  further                     
            ]
        ]
    ]
]
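
The branching in the rule above comes from ordered alternatives (|) inside a loop: at each position the first alternative that matches wins. The following is a stripped-down sketch of the same idea, not the answer's own code. It assumes the data string from the question, and the names letter and not-eol are introduced here purely for illustration. Instead of swapping in a failing rule to leave the loop, it stops the loop with a character class that excludes the line feed:

```rebol
data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}

letter:  charset [#"a" - #"z" #"A" - #"Z" "_"]
not-eol: complement charset "^/"    ; any character except line feed

temp: copy []

parse/all data [
    any [
        thru {`} copy field to {`} skip
        some " " copy field-type some letter
        (comm: copy "")
        ; Walk the rest of the line one step at a time.
        any [
            ; Branch 1: a COMMENT clause, so capture the quoted text.
            "COMMENT" some " " "'" copy comm to "'" skip
            ; Branch 2: any other character on this line is skipped.
            | not-eol
            ; At the line feed neither branch matches, so the inner loop
            ; ends and the outer loop moves on to the next field.
        ]
        (repend temp [field field-type comm])
    ]
]

probe temp
```

Because the alternatives are tried in order, the COMMENT branch must come before the catch-all; otherwise the catch-all would consume the "C" of COMMENT and the comment would never be captured.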

Answer 4

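I'm very much in favour of (where possible) building up a set of grammar rules with positive terms to match the target input. I find it more literate, more precise, more flexible, and easier to debug. In your snippet above, we can identify five core components: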

space: use [space][
    space: charset "^-^/ "
    [some space]
]

word: use [letter][
    letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
    [some letter]
]

id: use [letter][
    letter: complement charset "`"
    [some letter]
]

number: use [digit][
    digit: charset "0123456789"
    [some digit]
]

string: use [char][
    char: complement charset "'"
    [any [some char | "''"]]
]

Having defined our terms, writing a rule that describes the grammar of the input is relatively straightforward:

result: collect [
    parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
        opt space
        some [
            (field: type: none comment: copy "")
            "`" copy field id "`"
            space 
            copy type word opt ["(" number ")"]
            any [
                space [
                    "COMMENT" space "'" copy comment string "'"
                    | word | "'" string "'" | number
                ]
            ]
            opt space "," (keep reduce [field type comment])
            opt space
        ]
    ]
]

As an added bonus, we can validate the input.

if parsed? [new-line/all/skip result true 3]

Applying a little new-line to smarten things up should yield:

== [
    "id" "int" "" 
    "name" "varchar" "the name" 
    "content" "text" "something"
]
