How to match an optional token in lex
I have a file containing lines like the following.
PORT = en, PIN = P3;
PORT = dummy[9], PIN = P41;
PORT = dummy[8], PIN = P42;
PORT = dummy[7], PIN = P43;
PORT = dummy[6], PIN = P44;
PORT = dummy[5], PIN = P45;
PORT = dummy[4], PIN = P46;
PORT = dummy[3], PIN = P47;
PORT = dummy[2], PIN = P48;
PORT = dummy[1], PIN = P49;
PORT = dummy[0], PIN = P50;
PORT = out1, PIN = P6;
I am trying to extract PORT and PIN using lex, as shown below.
The lex specification:
%%
"=" { return EQUALS; }
"," { return COMMA; }
";" { return SEMICOLON; }
PORT { return PORT; }
PIN { return PIN; }
[\[0-9\]]* {yylval.str =strdup(yytext);return BUS_PORT;}
[a-zA-Z_][a-zA-Z0-9_]* {yylval.str =strdup(yytext);return ALPHANUMERIC;}
"//".* | [\t] {; }
"/*"[.\n]*"*/" {; }
\n {; }
. {; }
%%
And the corresponding yacc file:
%token EQUALS
%token COMMA
%token SEMICOLON
%token PIN
%token PORT
%token <str> ALPHANUMERIC
%token <str> BUS_PORT
%type <str> port_name
%type <str> pin_name
%%
physical_command : sub_command
                 | physical_command sub_command
                 ;
sub_command : port_command
            ;
port_command : PORT EQUALS port_name COMMA PIN EQUALS pin_name SEMICOLON
    {
        pm->addPortAndPin(std::string($3), std::string($7));
    }
    ;
port_name : ALPHANUMERIC
          | ALPHANUMERIC BUS_PORT
    {
        $$ = $1;
    }
    ;
pin_name : ALPHANUMERIC
    {
        $$ = $1;
    }
    ;
%%
As you can see, a port name can be either an array type (dummy[10], dummy[9], etc.) or a plain type.
To parse both, I wrote the following rule:
port_name : ALPHANUMERIC            //for normal type
          | ALPHANUMERIC BUS_PORT   //for array type
    {
        $$ = $1;
    }
and the corresponding lex rules are:
[\[0-9\]]* {yylval.str =strdup(yytext);return BUS_PORT;}
[a-zA-Z_][a-zA-Z0-9_]* {yylval.str =strdup(yytext);return ALPHANUMERIC;}
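My problem:
I cannot parse the array type with the rules above; my output is shown below. Please help me write rules that parse both the plain type and the array type.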
Port = en Pin = P3
Port = dummy Pin = P41 //should have been dummy[9]
Port = dummy Pin = P42 //should have been dummy[8]
Port = dummy Pin = P43
Port = dummy Pin = P44
Port = dummy Pin = P45
Port = dummy Pin = P46
Port = dummy Pin = P47
Port = dummy Pin = P48
Port = dummy Pin = P49
Port = dummy Pin = P50
Port = out1 Pin = P6
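Array element access, where each opening "[" must have a matching closing "]", should be handled at the parser level.
In the lex specification, replace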
[\[0-9\]]* {yylval.str =strdup(yytext);return BUS_PORT;}
with
[0-9]+ { yylval.str = strdup(yytext); return BUS_PORT; }
"[" { return *yytext; }
"]" { return *yytext; }
In the parser specification, replace
port_name : ALPHANUMERIC
          | ALPHANUMERIC BUS_PORT { $$ = $1; }
with
port_name : ALPHANUMERIC
| ALPHANUMERIC '[' BUS_PORT ']' { $$ = ;}
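The action for the array alternative has to produce the full name, such as dummy[9], from the identifier and the index. A minimal sketch of one way to do that in C follows. The helper make_bus_port_name is hypothetical (it does not appear in the post), and it assumes both semantic values are NUL-terminated strings, as they would be if the lexer fills yylval.str with strdup(yytext).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the original post: joins a port name and
 * its numeric index into a newly allocated "name[index]" string.
 * Returns NULL if allocation fails; the caller owns the result and must
 * free() it. */
char *make_bus_port_name(const char *name, const char *index)
{
    /* name + '[' + index + ']' + terminating NUL */
    size_t len = strlen(name) + strlen(index) + 3;
    char *out = malloc(len);
    if (out != NULL)
        snprintf(out, len, "%s[%s]", name, index);
    return out;
}
```

With such a helper declared in the prologue of the yacc file, the action could read something like { $$ = make_bus_port_name($1, $3); free($1); free($3); }, where $1 is the identifier and $3 is the digit string between the brackets. Freeing the two strdup'ed inputs there is a design choice of this sketch, not something the post prescribes.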
Another solution is to change the BUS_PORT token rule to
\[[0-9]+\] {yylval.str =strdup(yytext);return BUS_PORT;}
It recognizes "[" followed by one or more digits followed by "]", and it only works for a fixed number of dimensions.
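To make the difference between the two patterns concrete, here is a rough sketch in plain C of what each one accepts. The function names are made up for illustration, and each function tests a whole string, whereas lex itself matches the longest prefix of the remaining input. The original [\[0-9\]]* is a single character class repeated zero or more times, so it accepts any mix of brackets and digits, including the empty string; \[[0-9]+\] requires the bracketed shape.

```c
#include <ctype.h>
#include <stdbool.h>

/* Mirrors the character class [\[0-9\]]* :
 * any run (even an empty one) of '[', ']' or digits. */
bool matches_char_class(const char *s)
{
    for (; *s != '\0'; ++s)
        if (*s != '[' && *s != ']' && !isdigit((unsigned char)*s))
            return false;
    return true;
}

/* Mirrors \[[0-9]+\] :
 * '[', one or more digits, then ']' and nothing else. */
bool matches_bus_index(const char *s)
{
    if (*s++ != '[')
        return false;
    if (!isdigit((unsigned char)*s))
        return false;
    while (isdigit((unsigned char)*s))
        ++s;
    return s[0] == ']' && s[1] == '\0';
}
```

Under these definitions, "[9]" satisfies both checks, "]]9[" satisfies only the character-class check, and "[1][2]" fails the stricter check, which is the fixed-number-of-dimensions limitation mentioned above.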