Which sections of the C standard prove the relation between the integer type sizes?
In the late drafts of C11 [C11_N1570] and C17 [C17_N2176] I cannot find a proof of the following (which I believe is common knowledge):
sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
Can anyone refer me to the specific sections?
I am aware of an answer that applies to C++11. The second part of that reply talks about C, but only about the ranges of values; it does not prove the relation between the type sizes.
From a preliminary check of ISO/IEC 9899:2017 (your C17 C17_N2176 link):
- The section “5.2.4.2.1 Sizes of integer types <limits.h>” has the ±(2^n - 1) range information (which indicates the number of bits of the types).
- The section “6.2.5 Types”, point 5, says '... a "plain" int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>).'
This gives me the impression that the ranges specify the minimum bit size a type can have. Maybe some architectures allocate a size larger than this minimum.
Section 6.2.6.2 Integer types first defines value and padding bits for the unsigned subtypes (except unsigned char).
Not much is said about the padding bits, except that there do not need to be any at all. But there can be more than one, unlike the single sign bit of the signed types.
There is no common-sense rule against over-padding a short until it gets longer than a long, whether or not the long has more value bits.
The directly implied relation between the number of (value) bits and the maximum values also shows up in the title of 5.2.4.2.1 Sizes of integer types <limits.h>. It defines minimum maximum values, not object sizes (except for CHAR_BIT).
What remains lies in the names themselves and in the implementation: short and long, not small and large. It is nicer to say "I am a space-saving integer" than "I am an integer of reduced maximum value."
The relevant parts are:
Environmental limits and limits.h, from C17 5.2.4.2.1 "Sizes of integer types <limits.h>". If we only look at the unsigned types, then the minimum values that an implementation is at least required to support are:
UCHAR_MAX 255
USHRT_MAX 65535
UINT_MAX 65535
ULONG_MAX 4294967295
ULLONG_MAX 18446744073709551615
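A quick compile-time sanity check of these minimums (a sketch of mine, assuming a C11-or-later compiler for _Static_assert):

#include <limits.h>

/* A conforming implementation must meet or exceed the minimum
   magnitudes quoted above (C17 5.2.4.2.1/1). */
_Static_assert(UCHAR_MAX >= 255, "UCHAR_MAX below C17 minimum");
_Static_assert(USHRT_MAX >= 65535, "USHRT_MAX below C17 minimum");
_Static_assert(UINT_MAX >= 65535, "UINT_MAX below C17 minimum");
_Static_assert(ULONG_MAX >= 4294967295UL, "ULONG_MAX below C17 minimum");
_Static_assert(ULLONG_MAX >= 18446744073709551615ULL, "ULLONG_MAX below C17 minimum");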
Then check C17 section 6.3.1.1 about integer conversion rank:
- The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
- The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
Finally, C17 6.2.5/8 is the normative text stating that each type with lower conversion rank covers a subset of the range of the type with higher rank:
For any two integer types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.
To fulfill this requirement, we must have:
sizeof(unsigned char) <=
sizeof(unsigned short) <=
sizeof(unsigned int) <=
sizeof(unsigned long) <=
sizeof(unsigned long long)
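The actual sizes are implementation-defined; a minimal sketch to observe them on a concrete implementation (any hosted C99-or-later compiler):

#include <stdio.h>

int main(void)
{
    /* Print the byte sizes so the chain above can be checked
       for this particular implementation. */
    printf("unsigned char:      %zu\n", sizeof(unsigned char));
    printf("unsigned short:     %zu\n", sizeof(unsigned short));
    printf("unsigned int:       %zu\n", sizeof(unsigned int));
    printf("unsigned long:      %zu\n", sizeof(unsigned long));
    printf("unsigned long long: %zu\n", sizeof(unsigned long long));
    return 0;
}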
Thank you very much to everybody who participated in the search for the answer.
Most of the replies shared what I had already learned, but some of the comments provided very interesting insight.
Below I will summarize what I learned so far (for my own future reference).
Conclusion
It looks like C (as of the late draft of C17 [C17_N2176]) does not guarantee that
sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
(as opposed to C++).
What is Guaranteed
Below is my own interpretation/summary of what C does guarantee regarding the integer types (sorry if my terminology is not strict enough).
Multiple Aliases For the Same Type
The parenthesized sentence in [C17_N2176] 6.2.5/4, referring to 6.7.2/2, moves the multiple aliases for the same type out of my way (thanks @M.M for the reference).
The Number of Bits in a Byte
The number of bits in a byte is implementation-specific and is >= 8. It is reported by the CHAR_BIT macro.
5.2.4.2.1/1 Sizes of integer types <limits.h>
Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
The text below assumes that the byte is 8 bits (keep that in mind on the implementations where byte has a different number of bits).
The sizeof([[un]signed] char)
sizeof(char), sizeof(unsigned char), sizeof(signed char) are 1 byte.
6.5.3.4/2 The sizeof and _Alignof operators
The sizeof operator yields the size (in bytes) of its operand
6.5.3.4/4:
When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.
The Range of the Values and the Size of the Type
Objects may not use all of their bits to store a value
The object representation has value bits, may have padding bits, and for the signed types has exactly one sign bit (6.2.6.2/1/2 Integer types).
E.g. a variable can have a size of 4 bytes, but only the 2 least significant bytes may be used to store a value (the object representation has only 16 value bits), similar to how the bool type has at least 1 value bit while all the other bits are padding bits.
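The padding bits themselves cannot be portably identified, but the object representation that contains the value, padding, and sign bits can be inspected byte by byte through an unsigned char pointer. A small sketch (the output is implementation-specific):

#include <stdio.h>

int main(void)
{
    int x = 1;
    /* 6.2.6.1: the object representation of x occupies sizeof(int)
       bytes; which of its bits are value bits and which (if any)
       are padding bits is not visible from here. */
    const unsigned char *p = (const unsigned char *)&x;
    for (size_t i = 0; i < sizeof x; i++)
        printf("byte %zu: 0x%02X\n", i, (unsigned)p[i]);
    return 0;
}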
The correspondence between the range of the values and the size of the type (or the number of value bits) is arguable.
On the one hand @eric-postpischil refers to 3.19/1:
value
precise meaning of the contents of an object when interpreted as having a specific type
This gives the impression that every value has a unique bit representation (bit pattern).
On the other hand @language-lawyer states
different values don't have to be represented by different bit patterns. Thus, there can be more values than possible bit patterns.
when there is contradiction between the standard and a committee response (CR), committee response is chosen by implementors.
from DR260 Committee Response follows that a bit pattern in object representation doesn't uniquely determine the value. Different values may be represented by the same bit pattern. So I think an implementation with CHAR_BIT == 8 and sizeof(int) == 1 is possible.
I didn't claim that an object has multiple values at the same time
@language-lawyer's statements give the impression that multiple values (e.g. 5, 23, -1), probably at different moments of time, can correspond to the same bit pattern (e.g. 0xFFFF) of the value bits of a variable. If that's true, then the integer types other than [[un]signed] char (see "The sizeof([[un]signed] char)" section above) can have any byte size >= 1 (they must have at least one value bit, which prevents them from having byte size 0 (paranoidly strictly speaking), which results in a size of at least one byte), and the whole range of values (mandated by <limits.h>, see below) can correspond to that "at least one value bit".
To summarize, the relation between sizeof(short), sizeof(int), sizeof(long), sizeof(long long) can be anything (any of these, in byte size, can be greater than or equal to any of the others; again, somewhat paranoidly strictly speaking).
What Does Not Seem Arguable
What has not been mentioned is 6.2.6.2/1/2 Integer types:
For unsigned integer types .. If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation ..
For signed integer types .. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type ..
This makes me believe that each value bit adds a unique value to the overall value of the object. E.g. the least significant value bit (I'll call it value bit number 0), regardless of where in the byte(s) it is located, adds a value of 2^0 == 1, and no other value bit adds that value, i.e. the value is added uniquely.
Value bit number 1 (again, regardless of its position in the byte(s), as long as that position differs from the position of any other value bit) uniquely adds a value of 2^1 == 2.
These two value bits together sum up to the overall absolute value of 1 + 2 == 3.
Here I won't dig into whether they add a value when set to 1, when cleared to 0, or some combination of those. In the text below I assume that they add value when set to 1.
Just in case, I'll also quote 6.2.6.2/2 Integer types:
If the sign bit is one, the value shall be modified in one of the following ways:
...
— the sign bit has the value -(2^M) (two's complement);
Earlier in 6.2.6.2/2 it is mentioned that M is the number of value bits in the signed type.
Thus, if we are talking about an 8-bit signed value with 7 value bits and 1 sign bit, then the sign bit, if set to 1, adds the value of -(2^M) == -(2^7) == -128.
Earlier I considered an example where the two least significant value bits sum up to the overall absolute value of 3.
Together with the sign bit set to 1, for the 8-bit signed value with 7 value bits, the overall signed value will be -128 + 3 == -125.
As an example, that value can have the bit pattern 0x83 (the sign bit is set to 1 (0x80), the two least significant value bits are set to 1 (0x03), and both value bits add to the overall value when set to 1, rather than when cleared to 0, in the two's complement representation).
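This arithmetic can be reproduced in code. A sketch that assumes the implementation provides int8_t (which, where it exists, is required to be two's complement with exactly 8 bits and no padding), so the byte 0x83 reinterpreted as int8_t must read back as -125:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    unsigned char pattern = 0x83; /* sign bit 0x80 plus value bits 0x03 */
    int8_t v;
    memcpy(&v, &pattern, 1);      /* reinterpret the object representation */
    printf("%d\n", v);            /* -(2^7) + 2 + 1 == -125 */
    return 0;
}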
These observations make me think that, very likely, there is a one-to-one correspondence between the range of values and the number of value bits in an object: every value has a unique pattern of value bits and every pattern of value bits maps to a single value.
(I realize that this intermediate conclusion may still be not strict enough, or wrong, or not cover certain cases.)
Minimum Number of Value Bits and Bytes
5.2.4.2.1/1 Sizes of integer types <limits.h>:
Important sentence:
Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
Then:
SHRT_MIN -32767 // -(2^15 - 1)
SHRT_MAX +32767 // 2^15 - 1
USHRT_MAX 65535 // 2^16 - 1
This tells me that:
short int has at least 15 value bits plus 1 sign bit (see SHRT_MIN, SHRT_MAX above), i.e. at least 2 bytes (if the byte is 8 bits, see "The Number of Bits in a Byte" above).
unsigned short int has at least 16 value bits (see USHRT_MAX above), i.e. at least 2 bytes.
Continuing that logic (see 5.2.4.2.1/1; each signed type again has its sign bit on top of the value bits):
int has at least 15 value bits (see INT_MIN, INT_MAX), i.e. at least 2 bytes.
unsigned int has at least 16 value bits (see UINT_MAX), i.e. at least 2 bytes.
long int has at least 31 value bits (see LONG_MIN, LONG_MAX), i.e. at least 4 bytes.
unsigned long int has at least 32 value bits (see ULONG_MAX), i.e. at least 4 bytes.
long long int has at least 63 value bits (see LLONG_MIN, LLONG_MAX), i.e. at least 8 bytes.
unsigned long long int has at least 64 value bits (see ULLONG_MAX), i.e. at least 8 bytes.
This proves to me that:
1 == sizeof(char) < any of { sizeof(short), sizeof(int), sizeof(long), sizeof(long long) }.
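These minimums can also be stated as compile-time checks: the representation of a type occupies exactly sizeof(T) * CHAR_BIT bits (value bits, plus the sign bit for signed types, plus any padding bits), so the bit counts above bound the byte sizes from below. A sketch (C11 or later):

#include <limits.h>

_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
_Static_assert(sizeof(short) * CHAR_BIT >= 16, "short needs >= 16 bits");
_Static_assert(sizeof(int) * CHAR_BIT >= 16, "int needs >= 16 bits");
_Static_assert(sizeof(long) * CHAR_BIT >= 32, "long needs >= 32 bits");
_Static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long needs >= 64 bits");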
The sizeof(int)
6.2.5/5 Types
A "plain" int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>).
This suggests to me that:
sizeof(int) == 4 on a 32-bit architecture (if the byte is 8 bits),
sizeof(int) == 8 on a 64-bit architecture (if the byte is 8 bits).
The sizeof(unsigned T)
6.2.5/6 Types
For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
This proves to me that:
sizeof(unsigned T) == sizeof(signed T).
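A compile-time restatement of this guarantee (C11 or later):

_Static_assert(sizeof(signed char) == sizeof(unsigned char), "char sizes differ");
_Static_assert(sizeof(short) == sizeof(unsigned short), "short sizes differ");
_Static_assert(sizeof(int) == sizeof(unsigned int), "int sizes differ");
_Static_assert(sizeof(long) == sizeof(unsigned long), "long sizes differ");
_Static_assert(sizeof(long long) == sizeof(unsigned long long), "long long sizes differ");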
The Ranges of Values
6.2.5/8 Types
For any two integer types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.
(See the discussion of 6.3.1.1 below)
I assume that a subrange of values can contain the same or a smaller number of values than the range.
I.e. the type with the smaller conversion rank can have the same or a smaller number of values than the type with the greater conversion rank.
6.3.1.1/1 Boolean, characters, and integers
— The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
— The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
— The rank of _Bool shall be less than the rank of all other standard integer types.
— The rank of any enumerated type shall equal the rank of the compatible integer type (see 6.7.2.2).
This tells me that:
range_of_values(bool) <= range_of_values(signed char) <= range_of_values(short int) <= range_of_values(int) <= range_of_values(long int) <= range_of_values(long long int).
For the unsigned types the relation between the ranges of values is the same.
This establishes the same relation for the number of value bits in the types.
But still does not prove the same relation between the sizes in bytes of objects of those types.
I.e. C (as of [C17_N2176]) does not guarantee the following statement (as opposed to C++):
sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
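So what can be asserted portably is the chain of ranges, not the chain of sizes. A closing sketch (C11 or later) of checks that must pass on every conforming implementation, while the analogous sizeof chain might not:

#include <limits.h>

/* Guaranteed by 6.2.5/8 together with the rank ordering of 6.3.1.1. */
_Static_assert(SCHAR_MAX <= SHRT_MAX, "range ordering");
_Static_assert(SHRT_MAX <= INT_MAX, "range ordering");
_Static_assert(INT_MAX <= LONG_MAX, "range ordering");
_Static_assert(LONG_MAX <= LLONG_MAX, "range ordering");
/* The corresponding sizeof chain is NOT guaranteed and could fail
   on a (hypothetical) implementation with heavy padding. */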