How to match an optional token in lex

I have a file containing lines like the following.

PORT = en, PIN = P3; 
PORT = dummy[9], PIN = P41;
PORT = dummy[8], PIN = P42;
PORT = dummy[7], PIN = P43;
PORT = dummy[6], PIN = P44;
PORT = dummy[5], PIN = P45;
PORT = dummy[4], PIN = P46;
PORT = dummy[3], PIN = P47;
PORT = dummy[2], PIN = P48;
PORT = dummy[1], PIN = P49;
PORT = dummy[0], PIN = P50;
PORT = out1, PIN = P6; 

I am trying to extract the PORT and PIN values using lex, as shown below.

The lex specification:

%%
"="                    { return EQUALS; }
","                    { return COMMA; }
";"                    { return SEMICOLON; }
PORT                   { return PORT; }
PIN                    { return PIN; }
[\[0-9\]]* {yylval.str =strdup(yytext);return BUS_PORT;}
[a-zA-Z_][a-zA-Z0-9_]* {yylval.str =strdup(yytext);return ALPHANUMERIC;}
"//".* | [\t]          {; }
"/*"[.\n]*"*/"         {; }
\n                     {; }
.                      {; }
%%

And the corresponding yacc file:

%token EQUALS
%token COMMA
%token SEMICOLON
%token PIN
%token PORT
%token <str> ALPHANUMERIC
%token <str> BUS_PORT
%type <str> port_name
%type <str> pin_name

%%

physical_command : sub_command
                 | physical_command sub_command
                 ;
sub_command      : port_command
                 ;

port_command     : PORT EQUALS port_name COMMA PIN EQUALS pin_name SEMICOLON
                 {
                   pm->addPortAndPin(std::string(),std::string());
                 }
                 ;

port_name        : ALPHANUMERIC
                 | ALPHANUMERIC BUS_PORT
                 {
                   $$ = ;
                 }
                 ;
pin_name         : ALPHANUMERIC
                 {
                   $$ = ;
                 }
                 ;
%%

As you can see, a port name can be either an array type (dummy[10], dummy[9], etc.) or a normal type. To parse it, I wrote the following rule.

port_name        : ALPHANUMERIC //for normal type
                 | ALPHANUMERIC BUS_PORT  //for array type
                 {
                   $$ = ;
                 }

The lex rules are:

[\[0-9\]]* {yylval.str =strdup(yytext);return BUS_PORT;}
[a-zA-Z_][a-zA-Z0-9_]* {yylval.str =strdup(yytext);return ALPHANUMERIC;}

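My question:

I cannot parse the array type with the rules above; my output is shown below. Please help me write rules that parse both the normal type and the array type.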

Port = en       Pin = P3
Port = dummy    Pin = P41 //should have been dummy[9]
Port = dummy    Pin = P42 //should have been dummy[8]
Port = dummy    Pin = P43
Port = dummy    Pin = P44
Port = dummy    Pin = P45
Port = dummy    Pin = P46
Port = dummy    Pin = P47
Port = dummy    Pin = P48
Port = dummy    Pin = P49
Port = dummy    Pin = P50
Port = out1     Pin = P6

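Array element access, where every opening "[" must have a matching closing "]", should be handled at the parser level.

In the lex specification, replace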

[\[0-9\]]* {yylval.str =strdup(yytext);return BUS_PORT;}

with

  [0-9]+   { yylval.str = strdup(yytext); return BUS_PORT; }
   "["     { return *yytext;} 
   "]"     { return *yytext;}

In the parser specification, replace

  port_name  : ALPHANUMERIC
             | ALPHANUMERIC BUS_PORT { $$ = ;}

with

  port_name  : ALPHANUMERIC
             | ALPHANUMERIC '[' BUS_PORT ']' { $$ = ;}
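
To see what the parser-level rule has to do, here is a minimal hand-written sketch in C++ of the same idea: the scanner yields an identifier, "[", a number and "]" as separate tokens, and the port_name rule glues them back together. This is only an illustration, not generated yacc code; the names Tok, Token, tokenize and parse_port_name are made up for this example.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Token kinds mirroring the lexer rules above: an identifier, a run of
// digits, and the two bracket characters. (Hypothetical names.)
enum class Tok { Ident, Number, LBracket, RBracket };

struct Token {
    Tok kind;
    std::string text;
};

// Split a port name such as "dummy[9]" into tokens.
// Returns false if an unexpected character is found.
bool tokenize(const std::string& s, std::vector<Token>& out) {
    out.clear();
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {
            // [a-zA-Z_][a-zA-Z0-9_]*
            std::size_t j = i + 1;
            while (j < s.size() &&
                   (std::isalnum(static_cast<unsigned char>(s[j])) || s[j] == '_')) ++j;
            out.push_back({Tok::Ident, s.substr(i, j - i)});
            i = j;
        } else if (std::isdigit(c)) {
            // [0-9]+
            std::size_t j = i + 1;
            while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) ++j;
            out.push_back({Tok::Number, s.substr(i, j - i)});
            i = j;
        } else if (c == '[') {
            out.push_back({Tok::LBracket, "["});
            ++i;
        } else if (c == ']') {
            out.push_back({Tok::RBracket, "]"});
            ++i;
        } else {
            return false;
        }
    }
    return true;
}

// port_name : Ident
//           | Ident '[' Number ']'
// On success, stores the full name (including the index) in `name`.
bool parse_port_name(const std::string& s, std::string& name) {
    std::vector<Token> t;
    if (!tokenize(s, t)) return false;
    if (t.size() == 1 && t[0].kind == Tok::Ident) {
        name = t[0].text;
        return true;
    }
    if (t.size() == 4 && t[0].kind == Tok::Ident && t[1].kind == Tok::LBracket &&
        t[2].kind == Tok::Number && t[3].kind == Tok::RBracket) {
        // Concatenate the pieces so the index is not lost.
        name = t[0].text + "[" + t[2].text + "]";
        return true;
    }
    return false;
}
```

Calling parse_port_name("dummy[9]", name) succeeds and leaves the whole port name, index included, in name, while an unbalanced input such as "dummy[9" is rejected. In the yacc action the equivalent step would be building the semantic value from the identifier and the index rather than from the identifier alone.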

Another solution is to change the BUS_PORT token rule to

  \[[0-9]+\] {yylval.str =strdup(yytext);return BUS_PORT;}

This recognizes a "[" followed by one or more digits followed by a "]". It only works for a fixed number of dimensions.
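
The difference between the original pattern and this one can be checked outside of lex. The sketch below uses std::regex (ECMAScript syntax) as a stand-in for the lex patterns, which is an assumption: the two syntaxes agree for these simple expressions but are not identical in general. The function names are made up for the example.

```cpp
#include <regex>
#include <string>

// Original rule: a single character class containing '[', digits and ']',
// repeated zero or more times. It accepts these characters in any order.
bool matches_char_class(const std::string& s) {
    static const std::regex re(R"([\[0-9\]]*)");
    return std::regex_match(s, re);
}

// Replacement rule: a literal '[', one or more digits, a literal ']'.
bool matches_bracketed_index(const std::string& s) {
    static const std::regex re(R"(\[[0-9]+\])");
    return std::regex_match(s, re);
}
```

With these helpers, matches_char_class accepts "[9]" but also "]]9[[" and even the empty string, whereas matches_bracketed_index accepts only well-formed indices such as "[9]" or "[10]". A two-dimensional index like "[3][4]" is not a single match for the replacement pattern, so the grammar would need one BUS_PORT per dimension; that is the fixed-dimension limitation mentioned above.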