How do I write a buffer for a compiler?

I want to write a compiler in C# for the first time, but I don't know how to handle its input buffering. My reference is Compilers: Principles, Techniques, and Tools, which says:

Because of the amount of time taken to process characters and the large number of characters that must be processed during the compilation of a large source program, specialized buffering techniques have been developed to reduce the amount of overhead required to process a single input character. An important scheme involves two buffers that are alternately reloaded. Each buffer is of the same size N, and N is usually the size of a disk block, e.g., 4096 bytes. Using one system read command we can read N characters into a buffer, rather than using one system call per character. If fewer than N characters remain in the input file, then a special character, represented by eof, marks the end of the source file and is different from any possible character of the source program.

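The book also says that an eof is placed at the end of each buffer so that we can tell when we have reached the end of that buffer, and that there are two pointers, forward and lexemeBegin, which delimit the current lexeme in the buffer. My problem is that I don't know how to create this buffer. Should I create an array of size N in a sourceBuffer class? If so, how do I read the file through a StreamReader and put N characters of the source file into the array? And what would the problem be if I simply read characters from the source file one at a time instead?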

You seem to be citing the 1986 edition of "Compilers: Principles, Techniques, and Tools". (Even at that time, though, the part you quote was already outdated.)

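In a modern programming language like C# (or, more precisely, in its I/O library), this kind of buffering is already implemented, in a robust, well-tested, high-performance way.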
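If you still want to see what "read N characters into an array with one call" looks like on top of the library, here is a minimal sketch. The class name BlockReadDemo and the method FillBuffer are made up for illustration; TextReader.ReadBlock is the real library call, and StreamReader inherits from TextReader, so a StreamReader can be passed in directly.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of any library.
public static class BlockReadDemo
{
    // N from the quoted passage; 4096 is only the book's example value.
    public const int N = 4096;

    // Fills the buffer with as many characters as are available, up to
    // buffer.Length, and returns how many were actually read.
    // A return value smaller than buffer.Length means the input ended,
    // which plays the role of the book's eof sentinel.
    public static int FillBuffer(TextReader reader, char[] buffer)
    {
        // ReadBlock keeps reading until the requested count is reached
        // or the end of the input is hit.
        return reader.ReadBlock(buffer, 0, buffer.Length);
    }
}
```

With a file you would call it as FillBuffer(new StreamReader(path), new char[BlockReadDemo.N]), where path is whatever source file you are compiling. Note that StreamReader already keeps an internal buffer of its own, so this extra layer is rarely needed.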

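Just use StreamReader and it will do all of this for you. Then read character by character until you have found a complete token, and process your tokens as described in that excellent book.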
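A minimal sketch of that approach, under some assumptions of mine: the class name SimpleLexer, the method Tokenize, and the toy token rules (runs of letters, digits and underscores form one token, every other non-whitespace character is a token on its own) are all invented for illustration and are not from the book. Peek and Read are the real TextReader methods; Peek plays the role of the book's one-character lookahead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical toy lexer, only meant to show character-by-character reading.
public static class SimpleLexer
{
    private static bool IsWordChar(int c)
    {
        return c != -1 && (char.IsLetterOrDigit((char)c) || (char)c == '_');
    }

    public static List<string> Tokenize(TextReader reader)
    {
        var tokens = new List<string>();
        var lexeme = new StringBuilder();
        int c;

        // Peek returns -1 when there is nothing more to read.
        while ((c = reader.Peek()) != -1)
        {
            if (char.IsWhiteSpace((char)c))
            {
                reader.Read(); // skip whitespace between tokens
            }
            else if (IsWordChar(c))
            {
                // Collect a run of word characters into one lexeme.
                lexeme.Clear();
                while (IsWordChar(reader.Peek()))
                {
                    lexeme.Append((char)reader.Read());
                }
                tokens.Add(lexeme.ToString());
            }
            else
            {
                // Any other character is treated as a one-character token.
                tokens.Add(((char)reader.Read()).ToString());
            }
        }
        return tokens;
    }
}
```

To run it on a file, pass a StreamReader, for example SimpleLexer.Tokenize(new StreamReader("program.src")), where the file name is just a placeholder. Each Read and Peek call is served from StreamReader's internal buffer, so reading one character at a time does not mean one system call per character.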