Why is sizeof a char literal not the same as sizeof(char)?

This program:

#include <stdio.h>
int main(void) {
    printf("sizeof( char ) = %zu, sizeof 'a' = %zu.\n", sizeof( char ), sizeof 'a' );
    return 0;
}

produces the following output:

sizeof( char ) = 1, sizeof 'a' = 4.

I am compiling with gcc (clang gives the same result), using these flags:

gcc -Wall -Wextra -Wswitch -pedantic -ansi -std=c11 -DDEBUG -ggdb3 -o

Section 6.5.3.4, paragraph 4 of the standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) says:

4 When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.

So I would expect the size of the operand 'a' to be 1, since 'a' has type char. Or is it automatically "promoted" to int or something similar? (I noticed that if I cast 'a' to char, then sizeof( (char)'a' ) is 1.)

Or am I misreading the standard?

'a' has type int, so sizeof('a') gives the size of an int.

In C, unlike in C++, integer character constants (literals) have type int.

So the value of the expression sizeof( 'a' ) is equal to the value of the expression sizeof( int ), whereas sizeof( char ) is always equal to 1.

From the C Standard (6.5.3.4 The sizeof and alignof operators)

4 When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1...

and (6.4.4.4 Character constants)

10 An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

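Note that objects of type char used as operands in expressions are usually converted to type int because of the integer promotions. The operand of sizeof itself is not promoted, which is why sizeof applied to a char object still yields 1.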

In C, a character literal does not have type char. C treats character literals as integers of type int. So there is no difference between sizeof('a') and sizeof(1).

The size of a character literal is equal to the size of an int.

A character constant (or, more precisely, an integer character constant) has type int.

Section 6.4.4.4p2 of the C standard describes character constants:

An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'. A wide character constant is the same, except prefixed by the letter L, u, or U. With a few exceptions detailed later, the elements of the sequence are any members of the source character set; they are mapped in an implementation-defined manner to members of the execution character set

Paragraph 10 then begins the description of the semantics, which includes the type:

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int