Is flex-lexer a performant (fast) converter for simple regex recognition and replacement?

Question

I am working on a project for my university that has to process fairly large files (>50 MB). I used flex to rewrite a file made up of lines like these

1 1:41 3:54 7:40 13:8
2 4:7 7:8 23:85

into

1 1 3 7 13
2 4 7 23

(basically, it turns each number_a:number_b into number_a and prints the output to a file as it goes)

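My question, given that I am writing the rest of the program in C++: was reaching for flex a good reflex, since flex is supposed to be fast (the f in flex stands for fast), or am I simply wrong and there is a simpler but still efficient way to do this in C++?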

I am new to C++, so I have a lot of C coding reflexes but know little about the available tools and how they perform.

Here is the piece of code I wrote in flex:

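%{
#include <stdio.h>
unsigned int nb_doc = 1;  /* starts at 1 to count the first document */
unsigned int i;
%}
%option noyywrap
couple_entiers      [0-9]+:[0-9]+
retour_chariot      \n[0-9]+
autre               .
%%
{couple_entiers}      {  /* print only the digits before the ':' */
                         i = 0;
                         while (yytext[i] != ':') {
                             putchar(yytext[i]);
                             i++;
                         }
                      }
{retour_chariot}      {  /* a newline followed by a number starts a new document */
                         nb_doc++; fputs(yytext, stdout);
                      }
{autre}               {  /* copy everything else unchanged */
                         ECHO;
                      }

%%
int main(void) {
    yylex();
    printf("\n\n%u", nb_doc);  /* nb_doc is unsigned, so %u rather than %d */
    return 0;
}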

Answer 1

Consider replacing your custom code with a generic solution:

system("sed -E 's/:[0-9]+//g'");

:)


c++

flex-lexer