Is it safe to run the C preprocessor several times on the same source?

Question

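In my experience, the C preprocessor behaves like a no-op when it is run on source code that has already been preprocessed. But does the standard guarantee this behavior? Or could an implementation have a preprocessor that modifies previously preprocessed code, for example by removing or modifying line directives, or by making other changes that could confuse the compiler?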

Answer 1

In general, preprocessing with cpp is not guaranteed to be idempotent (a no-op after the first run). A simple counterexample:

#define X #define Y z
X
Y

The first invocation will produce:

 #define Y z
Y

The second:

z

That said, valid C code should not do anything like that (because the output is not valid input for the next stage of the compiler).

Also, depending on what you are trying to do, cpp has options such as -fpreprocessed that may help.

Answer 2

The standard does not define a "preprocessor" as a separate component. The closest it comes is the description of translation phase 4 in §5.1.1.2:

Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.

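However, the translation phases defined in that section are not separable, nor are they guaranteed to be independent of one another: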

Implementations shall behave as if these separate phases occur, even though many are typically folded together in practice. Source files, translation units, and translated translation units need not necessarily be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation. The description is conceptual only, and does not specify any particular implementation. (Footnote 6 from the same section.)

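So there is no intended mechanism for extracting the result of translation phases 1-4 in any form, let alone as a text file. In fact, if the translation phases were implemented exactly as described, the output of phase 4 would be a sequence of tokens. Nor is there a mechanism for feeding that output back into the translator.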

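In other words, you may have some tool that calls itself a preprocessor, and it may even be part of the compiler suite. But the behavior of that tool is outside the scope of the C standard, so the standard gives no guarantees about it.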

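As an aside, if the token stream coming out of phase 4 were naively converted to text, it might not preserve token boundaries correctly. Most preprocessor tools inject extra whitespace at the points where this could happen. That allows the tool's output to be fed into the compiler, at least in most cases. (There are cases where this does not work.) But the standard neither requires nor specifies this behavior.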
