Why does GCC remove the whitespace between the preprocessing tokens?

Sample code:

#define X(x,y)  x y
#define STR_(x) #x
#define STR(x)  STR_(x)
STR(X(Y,Y))

Invocations:

$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P
"Y Y"

$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"YY"

Why does GCC remove the whitespace between the preprocessing tokens?

Clang, for example, does not:

$ clang t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"Y Y"

UPD1. GCC somehow takes into account the whitespace between , and Y:

$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()" -D"Z=STR(X(Y,Y))"
"YY"

$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()" -D"Z=STR(X(Y, Y))"
"Y Y"

UPD2. This:

STR(X(Y,
Y))

leads to:

$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"Y Y"

However, this:

STR(X(Y
,Y))

leads to:

$ gcc t222.c -std=c11 -pedantic -Wall -Wextra -E -P -D"Y()"
"YY"

UPD3. Reported: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104147.

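This is a bug in GCC. C 2018 6.10.3.2 specifies the behavior of the # operator. Paragraph 1 says "Each # preprocessing token in the replacement list for a function-like macro shall be followed by a parameter as the next preprocessing token in the replacement list." We see this in the #x of #define STR_(x) #x.

Paragraph 2 says: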

If, in the replacement list, a parameter is immediately preceded by a # preprocessing token, both are replaced by a single character string literal preprocessing token that contains the spelling of the preprocessing token sequence for the corresponding argument. Each occurrence of white space between the argument’s preprocessing tokens becomes a single space character in the character string literal. White space before the first preprocessing token and after the last preprocessing token composing the argument is deleted…
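
The white-space rules in that paragraph can be checked directly with the STR_ macro from the question. The following is a minimal sketch, not part of the original report; check_stringify is a hypothetical helper name, and the arguments a + b are arbitrary tokens chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Same stringifying macro as in the question. */
#define STR_(x) #x

/* check_stringify is a hypothetical name used only for this sketch. */
void check_stringify(void)
{
    /* Each run of white space between the argument's tokens
       becomes a single space in the string literal. */
    const char *collapsed = STR_(a    +    b);
    assert(strcmp(collapsed, "a + b") == 0);

    /* White space before the first token and after the last token
       of the argument is deleted. */
    const char *trimmed = STR_(   a + b   );
    assert(strcmp(trimmed, "a + b") == 0);

    /* Where the argument has no white space between tokens,
       none appears in the string literal. */
    const char *tight = STR_(a+b);
    assert(strcmp(tight, "a+b") == 0);
}
```

Calling check_stringify() from main should pass silently on a conforming preprocessor.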

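The X(Y,Y) macro invocation must have resulted in the tokens Y and Y, and we see in #define X(x,y) x y that they would have white space between them.

White space in a macro replacement list is significant, per 6.10.3 1, which says: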

Two replacement lists are identical if and only if the preprocessing tokens in both have the same number, ordering, spelling, and white-space separation, where all white-space separations are considered identical.

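Thus, in #define X(x,y) x y, the replacement list should not be considered to be just the two tokens x and y, with white space disregarded. The replacement list is x, white space, and y.

Further, when the macro is replaced, it is replaced by the replacement list (thus including white space), not merely by the tokens in the replacement list, per 6.10.3 10: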
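
As an illustration of that identity rule, here is a small sketch. SUM and check_sum are hypothetical names that do not appear in the question; the point is only that the amount of white space may change in a redefinition, while its presence between tokens may not.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x)  STR_(x)

/* SUM is a hypothetical macro for this sketch. */
#define SUM a + b
/* Accepted redefinition: same tokens, and white space separates them
   in both lists; how much white space is used does not matter. */
#define SUM a     +     b
/* By contrast, "#define SUM a+b" would not be an identical list,
   because the white-space separation differs, so a conforming
   implementation would have to diagnose it. */

/* check_sum is a hypothetical helper name. */
void check_sum(void)
{
    /* The white space of the replacement list shows up, collapsed
       to single spaces, when the expansion is stringified. */
    const char *s = STR(SUM);
    assert(strcmp(s, "a + b") == 0);
}
```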

… Each subsequent instance of the function-like macro name followed by a ( as the next preprocessing token introduces the sequence of preprocessing tokens that is replaced by the replacement list in the definition (an invocation of the macro)… Within the sequence of preprocessing tokens making up an invocation of a function-like macro, new-line is considered a normal white-space character.
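
Putting these paragraphs together, the result a conforming preprocessor is expected to produce for the macros in the question can be sketched as follows. This sketch deliberately leaves Y undefined, so it does not exercise the -D"Y()" case that triggers the GCC behavior; check_replacement_list is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Macros exactly as in the question. */
#define X(x,y)  x y
#define STR_(x) #x
#define STR(x)  STR_(x)

/* check_replacement_list is a hypothetical name used only here. */
void check_replacement_list(void)
{
    /* X(Y,Y) is replaced by its replacement list: Y, white space, Y.
       Stringifying that yields a single space between the tokens. */
    const char *same_line = STR(X(Y,Y));
    assert(strcmp(same_line, "Y Y") == 0);

    /* A new-line inside the invocation counts as ordinary white space,
       so splitting the arguments across lines gives the same string. */
    const char *split = STR(X(Y,
                              Y));
    assert(strcmp(split, "Y Y") == 0);

    /* Without the extra STR level, the operand of # is not
       macro-expanded, so the invocation itself is stringified. */
    const char *unexpanded = STR_(X(Y,Y));
    assert(strcmp(unexpanded, "X(Y,Y)") == 0);
}
```

The question's output shows GCC and Clang agreeing on "Y Y" when Y is not defined as a macro; the disagreement appears only once Y is defined as a function-like macro with -D"Y()".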