How to express a branch in the Rebol PARSE dialect?
I have a MySQL schema like this:
data: {
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT '' COMMENT 'the name',
`content` text COMMENT 'something',
}
Now I want to extract some information from it: the field name, the type, and the comment (if any). See below:
["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ]
My code is:
parse data [
any [
thru {`} copy field to {`} {`}
thru some space copy field-type to [ {(} | space]
(comm: "")
opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
(repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
]
]
But I get this result:
["id" "int" "the name" "content" "text" "something"]
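I know the opt .. line is not right.
I want to express this: if the COMMENT keyword is found first, extract the comment; if a line feed (lf) is found first, continue with the next iteration of the loop. But I don't know how to express it. Can anyone help?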
I found an alternative: read the data as a block! of lines rather than as a single string!
data: read/lines %data.txt ; file literals need the leading %
probe data
temp: copy []
foreach d data [
parse d [
thru {`} copy field to {`} {`}
thru some space copy field-type to [ {(} | space]
(comm: "")
opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
(repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
]
]
probe temp
I think this is closer to what you are after.
data: {
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT '' COMMENT 'the name',
`content` text COMMENT 'something',
}
temp: []
parse data [
any [
thru {`} copy field to {`} {`}
some space copy field-type to [ {(} | space]
(comm: copy "")
opt [ thru {COMMENT} some space thru {'} copy comm to {'}]
(repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
]
]
probe temp
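Breaking down the differences:
- Set the word temp to an empty block.
- Changed thru some space to some space, since that advances the series in the same way. Note that the following returns false:
parse " " [ thru some space ]
- Changed comm: "" to comm: copy "" to make sure you get a new string each time you extract a comment (it doesn't seem to affect the output, but it is good practice).
- As per comment 2, changed {COMMENT} thru some space to {COMMENT} some space.
- Added a probe at the end for debugging.
Note that you can use ?? (almost) anywhere in parse rules to help with debugging; it shows your current position.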
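The reason for preferring comm: copy "" can be seen outside of parse. The following is a minimal sketch, not part of the answer's code; the function names shared and fresh are made up for illustration. A literal string in Rebol belongs to the block it was written in, so modifying it in place changes what the next evaluation sees:

```rebol
; Hypothetical helper names, used only for this illustration.
shared: func [/local s][
    s: ""            ; the same literal string is reused on every call
    append s "x"     ; modifies that literal in place
]
fresh: func [/local s][
    s: copy ""       ; a new empty string on every call
    append s "x"
]
probe shared   ; "x"
probe shared   ; "xx"
probe fresh    ; "x"
probe fresh    ; "x"
```

In the parse rule above, copy comm to {'} points comm at a newly copied string instead of modifying the literal, which would explain why the output is the same either way. The copy only starts to matter once something appends to or otherwise changes comm in place.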
Use parse/all for string parsing:
data: {
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT '' COMMENT 'the name',
`content` text COMMENT 'something',
}
nodata: charset { ()'}
dat: complement nodata
collect [
parse/all data [
some [
thru {`} copy field to {`} (keep field) skip
some " " copy type some dat ( keep type comm: copy "" )
copy rest thru "," (
parse/all rest [
some [
["," (keep comm) ]
| ["COMMENT" some nodata copy comm to "'" ]
| skip
]
]
)
]
]
]
== ["id" "int" "" "name" "varchar" "the name" "content" "text" "something"]
Another (better) pure parse solution:
collect [
probe parse/all data [
some [
thru {`} copy field to {`} (keep field) skip
some " " copy type some dat ( keep type comm: "" further: [])
some [
"," (keep comm further: [ to end skip])
| ["COMMENT" some nodata copy comm to "'" ]
| skip further
]
]
]
]
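I'm very much in favor (where possible) of building a set of grammar rules with positive terms to match the target input. I find it more literate, more precise, more flexible, and easier to debug. In your snippet above, we can identify five core components: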
space: use [space][
space: charset "^-^/ "
[some space]
]
word: use [letter][
letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
[some letter]
]
id: use [letter][
letter: complement charset "`"
[some letter]
]
number: use [digit][
digit: charset "0123456789"
[some digit]
]
string: use [char][
char: complement charset "'"
[any [some char | "''"]]
]
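One practical benefit of named terms is that each can be tried on its own before being combined. The sketch below repeats two of the definitions above so that it runs standalone, and assumes Rebol 2 (hence parse/all); the sample inputs are invented for illustration:

```rebol
number: use [digit][
    digit: charset "0123456789"
    [some digit]
]
string: use [char][
    char: complement charset "'"
    [any [some char | "''"]]   ; '' is an escaped quote inside a string
]
probe parse/all "123" number     ; true
probe parse/all "12a" number     ; false: "a" is not a digit
probe parse/all "it''s" string   ; true: the doubled quote is accepted
probe parse/all "a'b" string     ; false: a lone quote ends the string
```

Because every term only accepts what it describes, a false result points directly at the term that does not fit the input.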
With the terms defined, writing a rule that describes the grammar of the input is relatively straightforward:
result: collect [
parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
opt space
some [
(field: type: none comment: copy "")
"`" copy field id "`"
space
copy type word opt ["(" number ")"]
any [
space [
"COMMENT" space "'" copy comment string "'"
| word | "'" string "'" | number
]
]
opt space "," (keep reduce [field type comment])
opt space
]
]
]
As an added bonus, we can validate the input:
if parsed? [new-line/all/skip result true 3]
Applying a little new-line to tidy up the result should yield:
== [
"id" "int" ""
"name" "varchar" "the name"
"content" "text" "something"
]