解析 JSON RFC 语法
Parsing JSON RFC Grammar
我正在学习构建解析器,我想也许我应该从一个简单的开始,例如 JSON。我在看这篇 RFC document.
中发表的 JSON 语法
此处显示的 JSON 语法如下所示。这是什么类型的表示法,我如何 generate/write 解析器?
JSON-text = ws value ws
begin-array = ws %x5B ws ; [ left square bracket
begin-object = ws %x7B ws ; { left curly bracket
end-array = ws %x5D ws ; ] right square bracket
end-object = ws %x7D ws ; } right curly bracket
name-separator = ws %x3A ws ; : colon
value-separator = ws %x2C ws ; , comma
ws = *(
%x20 / ; Space
%x09 / ; Horizontal tab
%x0A / ; Line feed or New line
%x0D ) ; Carriage return
value = false / null / true / object / array / number / string
false = %x66.61.6c.73.65 ; false
null = %x6e.75.6c.6c ; null
true = %x74.72.75.65 ; true
object = begin-object [ member *( value-separator member ) ]
end-object
member = string name-separator value
array = begin-array [ value *( value-separator value ) ] end-array
number = [ minus ] int [ frac ] [ exp ]
decimal-point = %x2E ; .
digit1-9 = %x31-39 ; 1-9
e = %x65 / %x45 ; e E
exp = e [ minus / plus ] 1*DIGIT
frac = decimal-point 1*DIGIT
int = zero / ( digit1-9 *DIGIT )
minus = %x2D ; -
plus = %x2B ; +
zero = %x30 ; 0
string = quotation-mark *char quotation-mark
char = unescaped /
escape (
%x22 / ; " quotation mark U+0022
%x5C / ; \ reverse solidus U+005C
%x2F / ; / solidus U+002F
%x62 / ; b backspace U+0008
%x66 / ; f form feed U+000C
%x6E / ; n line feed U+000A
%x72 / ; r carriage return U+000D
%x74 / ; t tab U+0009
%x75 4HEXDIG ) ; uXXXX U+XXXX
escape = %x5C ; \
quotation-mark = %x22 ; "
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
如链接的 RFC 第 1.1 节中所述,形式主义是 RFC 5234 的形式主义,语法规范的增强 BNF (ABNF):
The grammatical rules in this document are to be interpreted as described in RFC5234.
要构建解析器,您可以将其用作规范并以您选择的语言以临时方式实现它,或者您可以将其转换为可用 lexer/parser 生成器识别的格式。
我正在学习构建解析器,我想也许我应该从一个简单的开始,例如 JSON。我在看这篇 RFC document.
中发表的 JSON 语法此处显示的 JSON 语法如下所示。这是什么类型的表示法,我如何 generate/write 解析器?
JSON-text = ws value ws
begin-array = ws %x5B ws ; [ left square bracket
begin-object = ws %x7B ws ; { left curly bracket
end-array = ws %x5D ws ; ] right square bracket
end-object = ws %x7D ws ; } right curly bracket
name-separator = ws %x3A ws ; : colon
value-separator = ws %x2C ws ; , comma
ws = *(
%x20 / ; Space
%x09 / ; Horizontal tab
%x0A / ; Line feed or New line
%x0D ) ; Carriage return
value = false / null / true / object / array / number / string
false = %x66.61.6c.73.65 ; false
null = %x6e.75.6c.6c ; null
true = %x74.72.75.65 ; true
object = begin-object [ member *( value-separator member ) ]
end-object
member = string name-separator value
array = begin-array [ value *( value-separator value ) ] end-array
number = [ minus ] int [ frac ] [ exp ]
decimal-point = %x2E ; .
digit1-9 = %x31-39 ; 1-9
e = %x65 / %x45 ; e E
exp = e [ minus / plus ] 1*DIGIT
frac = decimal-point 1*DIGIT
int = zero / ( digit1-9 *DIGIT )
minus = %x2D ; -
plus = %x2B ; +
zero = %x30 ; 0
string = quotation-mark *char quotation-mark
char = unescaped /
escape (
%x22 / ; " quotation mark U+0022
%x5C / ; \ reverse solidus U+005C
%x2F / ; / solidus U+002F
%x62 / ; b backspace U+0008
%x66 / ; f form feed U+000C
%x6E / ; n line feed U+000A
%x72 / ; r carriage return U+000D
%x74 / ; t tab U+0009
%x75 4HEXDIG ) ; uXXXX U+XXXX
escape = %x5C ; \
quotation-mark = %x22 ; "
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
如链接的 RFC 第 1.1 节中所述,形式主义是 RFC 5234 的形式主义,语法规范的增强 BNF (ABNF):
The grammatical rules in this document are to be interpreted as described in RFC5234.
要构建解析器,您可以将其用作规范并以您选择的语言以临时方式实现它,或者您可以将其转换为可用 lexer/parser 生成器识别的格式。