How do I express a branch in the Rebol PARSE dialect?

I have a MySQL schema like the one below:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}

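Now I want to extract some information from it: the field name, the type, and the comment (if there is one). The result should look like this: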

["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ]

My code is:

parse data [
    any [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

But I get this result instead:

["id" "int" "the name" "content" "text" "something"]

I know the opt .. line is wrong.

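What I want to express is: if the COMMENT keyword is found first, extract the comment text; if a line feed (LF) is found first, move on to the next iteration of the loop. I don't know how to express that. Can anyone help?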

I have figured out an alternative that works when the data is a block of lines, but not when it is a single string:

data: read/lines %data.txt  ; file literals need a leading %
probe data
temp: copy []

foreach d data [
    parse d [ 
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]

probe temp

I think this is closer to what you are after.

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
temp: []
parse data [
  any [ 
    thru {`} copy field to {`} {`}
    some space copy field-type to [ {(} | space]
    (comm: copy "")
    opt [ thru {COMMENT} some space thru {'} copy comm to {'}]
    (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
  ]
]
probe temp

Breaking down the differences:

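  1. temp: [] sets up a word with an empty block.
  2. Changed thru some space to some space, since this advances the series in the same way. Note that the following returns false:

    parse "   " [ thru some space ]
    
  3. Changed comm: "" to comm: copy "" to make sure you get a new string each time you extract a comment (it does not seem to affect the output, but it is good practice).

  4. Changed {COMMENT} thru some space to {COMMENT} some space, as per point 2.
  5. Added a probe at the end for debugging.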
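
The reason comm: copy "" is good practice can be seen outside of parse. The following is a minimal sketch, assuming Rebol 2; the function names shared and fresh are hypothetical and exist only for this illustration. A literal string in code is a single series that is reused on every evaluation, so modifying it in place accumulates changes, whereas copy "" yields a new string each time.

```rebol
REBOL []

; Hypothetical helper: modifies the literal string in place.
shared: func [/local s] [
    s: ""            ; the same series on every call
    append s "x"
]

; Hypothetical helper: starts from a fresh string on every call.
fresh: func [/local s] [
    s: copy ""       ; a new series on every call
    append s "x"
]

shared
print shared         ; second call: the literal has accumulated, prints xx
fresh
print fresh          ; second call: still prints x
```

In the parse rule above the output is unaffected because parse's copy assigns a newly copied string to comm instead of modifying the literal, but the habit avoids surprises as soon as the value is appended to.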

Note that you can use ?? (almost) anywhere in the parse rules to help with debugging; it shows your current position.

Use parse/all for string parsing:

data: {
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `name` varchar(10) DEFAULT '' COMMENT 'the name',
    `content` text COMMENT 'something',
}
nodata:   charset { ()'}
dat: complement nodata

collect [   
    parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  copy "" )  
            copy rest thru "," (
                parse/all rest [
                    some [
                        [","   (keep comm) ]  
                     |  ["COMMENT"   some nodata copy comm to "'"  ]
                     |  skip                        
                    ]
                ]
            )
        ]
    ]
]
== ["id" "int" "" "name" "varchar" "the name" "content" "text" "something"]

Another (better) solution that uses parse alone:

collect [   
    probe parse/all data [
        some [
            thru {`} copy field to {`} (keep field) skip 
            some " " copy type some dat ( keep type   comm:  ""  further: [])  
            some [ 
            ","   (keep comm  further:  [ to end  skip]) 
            |  ["COMMENT"   some nodata copy comm to "'"  ]
            |  skip  further                     
            ]
        ]
    ]
]
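
The branch asked about in the question ("COMMENT first, or end of line first") can also be shown in isolation. This is a minimal sketch, assuming Rebol 2; not-newline and line-comment are hypothetical names that do not appear in the code above. The loop tries the COMMENT alternative first at every position and otherwise consumes one character that is not a newline, so the scan can never run past the end of the current line.

```rebol
REBOL []

; Any character except a line feed.
not-newline: complement charset "^/"

; Hypothetical helper: returns the COMMENT text of the first line of text,
; or an empty string when the line ends before COMMENT is seen.
line-comment: func [text [string!] /local comm] [
    comm: copy ""
    parse/all text [
        any [
            "COMMENT '" copy comm to "'" skip   ; branch 1: keyword found first
          | not-newline                         ; branch 2: keep scanning this line
        ]
        to end                                  ; stopped at a newline (or the end)
    ]
    any [comm copy ""]   ; guard against comm being none after an empty match
]

probe line-comment {`id` int(10) NOT NULL,^/`name` varchar(10) COMMENT 'the name',^/}
probe line-comment {`name` varchar(10) DEFAULT '' COMMENT 'the name',^/}
```

The first call should give an empty string, because the newline is reached before any COMMENT, and the second should give "the name". The order of the alternatives matters: putting not-newline first would consume the C of COMMENT before the keyword could match.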

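I'm very much in favour of (where possible) building up a set of grammar rules with positive terms to match the target input. I find this more literate, more precise, more flexible and easier to debug. In your snippet above, we can identify five core components: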

space: use [space][
    space: charset "^-^/ "
    [some space]
]

word: use [letter][
    letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
    [some letter]
]

id: use [letter][
    letter: complement charset "`"
    [some letter]
]

number: use [digit][
    digit: charset "0123456789"
    [some digit]
]

string: use [char][
    char: complement charset "'"
    [any [some char | "''"]]
]
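
One practical benefit of positive terms is that each can be checked on its own before being combined. The following is a small sketch, assuming Rebol 2; the names digit, letter, number-rule, word-rule and type-rule are hypothetical stand-ins chosen so the example does not redefine the words above.

```rebol
REBOL []

digit: charset "0123456789"
letter: charset [#"a" - #"z" #"A" - #"Z" "_"]

number-rule: [some digit]
word-rule: [some letter]

; A column type: a word, optionally followed by a size in parentheses.
type-rule: [word-rule opt ["(" number-rule ")"]]

print parse/all "varchar(10)" type-rule   ; true
print parse/all "text" type-rule          ; true
print parse/all "varchar(" type-rule      ; false: input left over after the word
```

Because parse only returns true when the rule consumes the whole input, a false result points directly at the term that needs attention.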

With the terms defined, writing a rule that describes the grammar of the input is relatively straightforward:

result: collect [
    parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
        opt space
        some [
            (field: type: none comment: copy "")
            "`" copy field id "`"
            space 
            copy type word opt ["(" number ")"]
            any [
                space [
                    "COMMENT" space "'" copy comment string "'"
                    | word | "'" string "'" | number
                ]
            ]
            opt space "," (keep reduce [field type comment])
            opt space
        ]
    ]
]

As an added bonus, we can validate the input:

if parsed? [new-line/all/skip result true 3]

Applying new-line to smarten things up a little should yield:

== [
    "id" "int" "" 
    "name" "varchar" "the name" 
    "content" "text" "something"
]