Why does var$ compile but var@ doesn't?

Take this code:

#include <stdio.h>

int main(void)
{
    int var$ = 3;
    printf("%d\n",var$);
}

This compiles fine (GCC, Clang, MSVC) and, as expected, prints 3 when run.

However, this code:

#include <stdio.h>

int main(void)
{
    int var@ = 8;
    printf("%d\n",var@);
}

fails to compile (GCC, Clang, MSVC), with the error stray '@' in program.

Looking at the C/C++ Operator List (Ctrl+F for @ and $), neither of them is an operator.

Why is var$ valid while var@ is not?

Well, strictly speaking, var$ isn't valid either.

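Some compilers allow $ as a character in identifiers, in addition to a-z, A-Z, 0-9 and _. This is a non-standard extension. It is useful because compilers for some other languages (FORTRAN among them, I think) allow $ in identifiers, and you may be trying to write C code that interoperates with those languages. Evidently gcc is one of the compilers that permits this extension.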
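To make the character-set rule concrete, here is a minimal sketch of an identifier check with the $ extension as a switch. The names is_identifier, ident_char_ok and allow_dollar are hypothetical, not part of any compiler, and the sketch deliberately ignores keywords and universal character names.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: may character c appear at index i of an identifier?
   allow_dollar models the non-standard '$' extension, which treats '$'
   like a letter. Assumes the default "C" locale, where isalpha() accepts
   only a-z and A-Z. */
static bool ident_char_ok(unsigned char c, size_t i, bool allow_dollar)
{
    if (c == '_' || isalpha(c))
        return true;
    if (c == '$')
        return allow_dollar;
    /* Digits are fine anywhere except the first position. */
    return i > 0 && isdigit(c);
}

/* Returns true if s is made only of identifier characters. */
bool is_identifier(const char *s, bool allow_dollar)
{
    if (s == NULL || *s == '\0')
        return false;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (!ident_char_ok((unsigned char)s[i], i, allow_dollar))
            return false;
    }
    return true;
}
```

With this sketch, is_identifier("var$", true) accepts the name and is_identifier("var$", false) rejects it, while "var@" is rejected either way, which mirrors what the compilers in the question do.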

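An extension that allowed @ would be more unusual. I may once have seen an odd compiler that let you use it, but since I know of no use for it, and there is less demand for it, I'd guess nobody offers it.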

Blame VMS:

As an extension, GCC treats ‘$’ as a letter. This is for compatibility with some systems, such as VMS, where ‘$’ is commonly used in system-defined function and object names. ‘$’ is not a letter in strictly conforming mode, or if you specify the -$ option. See Invocation.

I slightly modified your var$ snippet (to get rid of warnings we don't care about) and added the strict-conformance flags here:

$ gcc -Wall -std=gnu99  -ansi -pedantic -Werror  -O2 -o a.out source_file.c

Error(s):
source_file.c: In function ‘main’:
source_file.c:4:13: error: '$' in identifier or number [-Werror]
         int var$ = 3;
             ^
cc1: all warnings being treated as errors

So if gcc is given the right flags, $ won't compile either.

Looking at the C11 Specification, section 6.4.2 on identifiers:

Semantics

2 An identifier is a sequence of nondigit characters (including the underscore _, the lowercase and uppercase Latin letters, and other characters) and digits, which designates one or more entities as described in 6.2.1. Lowercase and uppercase letters are distinct. There is no specific limit on the maximum length of an identifier.

3 Each universal character name in an identifier shall designate a character whose encoding in ISO/IEC 10646 falls into one of the ranges specified in D.1. The initial character shall not be a universal character name designating a character whose encoding falls into one of the ranges specified in D.2. An implementation may allow multibyte characters that are not part of the basic source character set to appear in identifiers; which characters and their correspondence to universal character names is implementation-defined.

(emphasis mine)
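As an illustration of paragraph 3, here is a small sketch that spells identifiers with universal character names. It assumes a compiler in C99-or-later mode that accepts UCNs in identifiers (recent GCC and Clang do by default, as far as I know); the names themselves are made up for the example.

```c
/* Sketch: universal character names (UCNs) inside identifiers.
   U+00E9 (e with acute accent) falls in one of the Annex D.1 ranges and
   not in any D.2 range, so it may also be the first character.
   Both names below are hypothetical. */
int caf\u00E9_count(void)
{
    int \u00E9t\u00E9 = 3;   /* identifier that begins with a UCN */
    return \u00E9t\u00E9;
}
```

Unlike $, this is covered by the standard itself rather than by a compiler extension, so it should not trip -pedantic in C99 or later modes.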

And according to the GCC Manual on implementation-defined behavior:

  • Identifier characters.

  The C and C++ standards allow identifiers to be composed of ‘_’ and the alphanumeric characters. C++ also allows universal character names. C99 and later C standards permit both universal character names and implementation-defined characters.

  GCC allows the ‘$’ character in identifiers as an extension for most targets. This is true regardless of the std= switch, since this extension cannot conflict with standards-conforming programs. When preprocessing assembler, however, dollars are not identifier characters by default.

(emphasis mine)

And then in Tokenization, as also quoted above:

As an extension, GCC treats ‘$’ as a letter. This is for compatibility with some systems, such as VMS, where ‘$’ is commonly used in system-defined function and object names. ‘$’ is not a letter in strictly conforming mode, or if you specify the -$ option.

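Because VMS uses $ in the names of many system-defined functions and objects, GCC allows $ as an implementation-defined identifier character, for compatibility with such systems.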

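The C specification does not explicitly allow special characters such as $ or @ in identifiers, but it lets an implementation permit additional characters (such as $ here). For example, GCC allows $ in identifiers for most targets. So does Clang (most of its implementation-defined behavior matches GCC's), and so does MSVC.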