Detecting ill-formed strings and comments in flex

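I am just beginning to learn flex, and I have written a flex program that detects whether a given word is a verb. It takes its input from a text file. I want to improve the code so that it detects any ill-formed or unfinished strings and comments in the input. "Unfinished" means the text begins with an opening symbol (" or /*) but has no closing symbol. "Ill-formed" means cases like ( "I am" a boy") or (/* this is a */ comment */). How would I detect these? My sample code is below: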

%%
[\t]+
is   |
am   |
are  |
was  |
were {printf("%s: is a verb",yytext);}
[a-zA-Z]+ {printf("%s: is not a verb",yytext);}
["][^"]*["] {printf("'%s': is a string\n", yytext); }
.|\n { /* ignore anything else */ }
%%

int main(int argc, char *argv[]){    
    yyin = fopen(argv[1], "r");    
    yylex();         
    fclose(yyin);
}

This is similar to the solution to a related question. I quote:

The flex manual section on using <<EOF>> is quite helpful as it has exactly your case as an example, and their code can also be copied verbatim into your flex program.

As it explains, <<EOF>> cannot be combined with a normal regular expression pattern. It can only be preceded by the name of a start state. In your code you are using a state to indicate that you are inside a string. This state is called STRING_MULTI. All you have to do is put that state name in front of the <<EOF>> marker and give the rule an action.

The special action function yyterminate() tells flex that you have recognised the <<EOF>> and that it marks the end-of-input for your program.

Combining strings and comments into a single flex program gives:

%option noyywrap
%x COMMENT_MULTI STRING_MULTI


%%

[\n\t\r ]+ { 
  /* ignore whitespace */ }


<INITIAL>"/*" { 
  /* begin of multi-line comment */
  yymore();
  BEGIN(COMMENT_MULTI); 
}

<INITIAL>["] { yymore(); BEGIN(STRING_MULTI);}

<STRING_MULTI>[^"]+ {yymore(); }

<STRING_MULTI>["]    {printf("String was : %s\n",yytext); BEGIN(INITIAL); }

<STRING_MULTI><<EOF>> {printf("Unterminated String: %s\n",yytext); yyterminate();}

<COMMENT_MULTI>"*/" { 
  /* end of multi-line comment */
  printf("'%s': was a multi-line comment\n", yytext);
  BEGIN(INITIAL); 
}

<COMMENT_MULTI>. { 
  yymore();
} 

<COMMENT_MULTI>\n { 
  yymore();
} 

<COMMENT_MULTI><<EOF>> {printf("Unterminated Comment: %s\n", yytext); yyterminate();}

%%

int main(int argc, char *argv[]){    
  yylex();         
}
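
If it helps to see what the three start states are doing, here is a minimal hand-written sketch of the same state machine in plain C. It is only an illustration under my own assumptions, not flex output: the function name check_delimiters and the SCAN_* result codes are hypothetical names I made up. It also flags a stray */ seen outside a comment, which is the (/* this is a */ comment */) case from the question; the flex rules above have no rule for that and would pass it through the default echo rule.

```c
#include <stddef.h>

/* Hypothetical result codes; these names are illustrative, not part of flex. */
typedef enum {
    SCAN_OK,
    SCAN_UNTERMINATED_STRING,
    SCAN_UNTERMINATED_COMMENT,
    SCAN_STRAY_COMMENT_END
} scan_result;

/* Mirrors the INITIAL / STRING_MULTI / COMMENT_MULTI start states. */
typedef enum { ST_INITIAL, ST_STRING, ST_COMMENT } scan_state;

/* Scan a NUL-terminated buffer and report the first problem found. */
scan_result check_delimiters(const char *s)
{
    scan_state state = ST_INITIAL;
    size_t i = 0;

    while (s[i] != '\0') {
        switch (state) {
        case ST_INITIAL:
            if (s[i] == '"') {                          /* string opens */
                state = ST_STRING;
                i += 1;
            } else if (s[i] == '/' && s[i + 1] == '*') { /* comment opens */
                state = ST_COMMENT;
                i += 2;
            } else if (s[i] == '*' && s[i + 1] == '/') { /* closer with no opener */
                return SCAN_STRAY_COMMENT_END;
            } else {
                i += 1;
            }
            break;
        case ST_STRING:
            if (s[i] == '"')                            /* string closes */
                state = ST_INITIAL;
            i += 1;
            break;
        case ST_COMMENT:
            if (s[i] == '*' && s[i + 1] == '/') {       /* comment closes */
                state = ST_INITIAL;
                i += 2;
            } else {
                i += 1;
            }
            break;
        }
    }

    /* End of input: the equivalent of the <<EOF>> rules in each state. */
    if (state == ST_STRING)
        return SCAN_UNTERMINATED_STRING;
    if (state == ST_COMMENT)
        return SCAN_UNTERMINATED_COMMENT;
    return SCAN_OK;
}
```

With this sketch, the input "I am" a boy" is reported as an unterminated string, because the third quote opens a string that never closes. Like the flex version, it does not handle escaped quotes inside strings.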