Peg.js: distinguishing between missing values and whitespace
I have the following peg.js script:
start = name*
name = '** name ' var ws 'var:' vr:var ws 'len:' n:num? ws 'label:' lb:label? 'type:' ws t:type? '**\n'
{return {NAME: vr,
LENGTH: n,
LABEL:lb,
TYPE: t
}}
type = 'CHAR'/'NUM'
var = $([a-zA-Z_][a-zA-Z0-9_]*)
label = p:labChar* { return p.join('')}
labChar = [^'"<>|\*\/]
ws = [\t\r ]
num = n:[0-9]+ {return n.join('')}
It is meant to parse:
** name a1 var:a1 len:9 label:The is the label for a1 type:NUM **
** name a2 var:a2 len: label:The is the label for a2 type:CHAR **
** name a3 var:a3 len:67 label: type: **
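I have run into a couple of problems.
First, the text I am parsing contains certain value labels, namely 'var:', 'len:', 'label:' and 'type:'. I would like to use these labels (since I know they are fixed) to delimit the values.
Second, I need to allow for missing values.
Am I going about this the right way? Currently my script merges the label's value with the type, and then I get this error: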
Line 1, column 64: Expected "type:" or [^'"<>|*/] but "*" found.
Also, can I do this for blocks of text as well? I tried to parse:
** name a1 var:a1 len:9 label:The is the label for a1 type:NUM **
** name a2 var:a2 len: label:The is the label for a2 type:CHAR **
randomly created text ()= that I would like to keep
** name b1 var:b1 len:9 label:This is the label for b1 type:NUM **
** name b2 var:b2 len: label:This is the label for b2 type:CHAR **
more text
by changing the first line and adding the following:
start = (name/random)*
random = r:.+ (!'** name')
{return {RANDOM: r.join('')}}
The final result I am looking for is:
[
[{
"NAME": "a1",
"LENGTH": "9",
"LABEL": "The is the label for a1",
"TYPE": "NUM"
},
{
"NAME": "a2",
"LENGTH": null,
"LABEL": "The is the label for a2",
"TYPE": "CHAR"
},
{"RANDOM":"randomly created text ()= that I would like to keep"}]
[{
"NAME": "b1",
"LENGTH": "9",
"LABEL": "This is the label for b1",
"TYPE": "NUM"
},
{
"NAME": "b2",
"LENGTH": null,
"LABEL": "This is the label for b2",
"TYPE": "CHAR"
},
{"RANDOM":"more text "}]
]
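You need a negative lookahead, !(ws 'type:'). Otherwise the label rule is too greedy: it consumes everything up to the closing '**', including the 'type:' field, which is why the parser then fails to find 'type:'.
As a side note, you can use the $() syntax to get the matched text of an expression instead of writing {return n.join('')}.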
start = name*
name = '** name ' var ws 'var:' vr:var ws 'len:' n:num? ws 'label:' lb:label? ws 'type:' t:type? ws '**' '\n'?
{return {NAME: vr,
LENGTH: n,
LABEL:lb,
TYPE: t
}}
var = $([a-zA-Z_][a-zA-Z0-9_]*)
num = $([0-9]+)
label = $((!(ws 'type:') [^'"<>|\*\/])*)
type = 'CHAR'/'NUM'
ws = [\t\r ]
Output:
[
{
"NAME": "a1",
"LENGTH": "9",
"LABEL": "The is the label for a1",
"TYPE": "NUM"
},
{
"NAME": "a2",
"LENGTH": null,
"LABEL": "The is the label for a2",
"TYPE": "CHAR"
},
{
"NAME": "a3",
"LENGTH": "67",
"LABEL": "",
"TYPE": null
}
]
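To make the lookahead in the label rule more concrete, here is a rough plain-JavaScript sketch of what (!(ws 'type:') [^'"<>|\*\/])* does at each position. The helper matchLabel is hypothetical and written only for illustration; it is not part of PEG.js and not how the generated parser is actually implemented.

```javascript
// Hand-written approximation of: label = $((!(ws 'type:') [^'"<>|\*\/])*)
// matchLabel is a hypothetical helper, not a PEG.js API.
function matchLabel(input, pos) {
  const labelChar = /[^'"<>|*\/]/; // same character class as the grammar
  const stop = /^[\t\r ]type:/;    // ws 'type:', the lookahead target
  let end = pos;
  while (end < input.length) {
    // Negative lookahead: peek ahead without consuming anything,
    // and stop repeating as soon as ws 'type:' starts here.
    if (stop.test(input.slice(end))) break;
    // Otherwise consume one character if it is allowed in a label.
    if (!labelChar.test(input[end])) break;
    end++;
  }
  // $() yields the matched text instead of an array of characters.
  return { text: input.slice(pos, end), next: end };
}

const line = "The is the label for a1 type:NUM **";
console.log(matchLabel(line, 0).text);
// Without the lookahead check, the loop would run on to the first "*".
```

An empty label (as in the a3 line) simply produces an empty string, because the lookahead stops the loop before any character is consumed.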
I finally got the following to work:
random = r: $(!('** name').)+ {return {"RANDOM": r}}
I am not sure I fully understand the syntax, but it does work.
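For what it is worth, the reason the earlier attempt, r:.+ (!'** name'), does not work is that .+ greedily consumes all remaining input before the lookahead is ever checked, and a PEG repetition never gives characters back. In the working version the check sits inside the repetition, so it runs before every single character. A rough plain-JavaScript sketch of that behaviour follows; matchRandom is a hypothetical helper for illustration only, not part of PEG.js.

```javascript
// Hand-written approximation of:
//   random = r:$(!('** name') .)+ { return {"RANDOM": r} }
// matchRandom is a hypothetical helper, not a PEG.js API.
function matchRandom(input, pos) {
  let end = pos;
  // Each iteration: !('** name') verifies that no record starts here,
  // then . consumes exactly one character (newlines included).
  while (end < input.length && !input.startsWith("** name", end)) {
    end++;
  }
  // + requires at least one character, otherwise the rule fails.
  if (end === pos) return null;
  return { value: { RANDOM: input.slice(pos, end) }, next: end };
}

const text = "more text\n** name b1 var:b1 len:9 label:x type:NUM **\n";
console.log(matchRandom(text, 0));  // stops right before "** name"
console.log(matchRandom(text, 10)); // null, because a record starts here
```

Note that the captured text keeps the trailing newline, since . matches any character; trimming it would have to happen in the action.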