Exact meaning of == in C
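I'm a bit confused about the exact meaning of the == operator in C.
Does it compare the mathematical values the variables represent (depending on their types), or the bit patterns behind the variables? Specifically: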
int x = 0x80000000;
unsigned y = x;
x==y // true
So although x is a large negative value and y is a large positive value, they compare equal (I guess because they have the same bit pattern).
int64_t x = 0x8000000000000000;
int y = x;
x==y // false
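Here, it doesn't matter that the first (least significant) 32 bits of x and y are the same. So in this case C appears to be looking at the values the variables represent.
What are the official rules, and is there an authoritative reference (I didn't find anything useful in K&R)?
I used the gcc compiler for the examples above.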
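First, let's look at the initialization of x. Assuming a 32-bit int, the constant 0x80000000 has type unsigned int and value 2^31, which does not fit in an int. So this constant must be converted to type int. Section 6.3.1.3p3 of the C standard specifies how this happens: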
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
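So an implementation-defined conversion occurs. On a two's complement system, this is typically implemented by storing the low-order 32 bits of the value directly into the object being assigned to. This results in x having the value -2^31.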
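The claim about the constant's type can be checked with a small sketch using C11 _Generic. The macro TYPE_NAME and both function names are made up for this illustration, and the result assumes a 32-bit int:

```c
/* Map an expression to the name of its type (C11 _Generic).
   TYPE_NAME is a hypothetical helper, not a standard macro. */
#define TYPE_NAME(expr) _Generic((expr),        \
    int: "int",                                 \
    unsigned int: "unsigned int",               \
    long: "long",                               \
    unsigned long: "unsigned long",             \
    long long: "long long",                     \
    unsigned long long: "unsigned long long",   \
    default: "other")

/* With a 32-bit int, 0x80000000 does not fit in int, so the hexadecimal
   constant takes the next candidate type, unsigned int. */
const char *hex_constant_type(void) { return TYPE_NAME(0x80000000); }

/* 0x7FFFFFFF still fits in a 32-bit int. */
const char *small_hex_constant_type(void) { return TYPE_NAME(0x7FFFFFFF); }
```

The controlling expression of _Generic is not evaluated; only its type is inspected.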
Now x is assigned to y. This means the value is converted from int to unsigned int, and the value in question is negative. So the conversion is governed by section 6.3.1.3p2:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type
The maximum value of unsigned int (assuming 32 bits) is 2^32 - 1, so one more than this is 2^32. Adding 2^32 to -2^31 gives 2^31, and that is what is stored in y.
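As a rough illustration of that arithmetic, the sketch below performs the conversion once with a cast and once by hand. Both function names are invented for the example, and the by-hand version assumes long long is wider than unsigned int:

```c
#include <limits.h>

/* The conversion the compiler performs for `unsigned y = x;` (6.3.1.3p2). */
unsigned int convert_to_unsigned(int value) {
    return (unsigned int)value;
}

/* The same result computed by hand: add UINT_MAX + 1 to a negative value.
   Assumes long long can hold UINT_MAX + 1. */
long long convert_by_hand(int value) {
    long long modulus = (long long)UINT_MAX + 1;  /* 2^32 for a 32-bit unsigned int */
    return value < 0 ? (long long)value + modulus : (long long)value;
}
```

For any int argument, both functions should produce the same number; for INT_MIN with a 32-bit int that number is 2^31.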
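Now for the comparison.
When two different arithmetic types are compared with the == operator, they undergo the usual arithmetic conversions.
Section 6.3.1.8p1 of the C standard specifies the following regarding how two integer types are converted: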
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser
integer conversion rank is converted to the type of the operand
with greater rank.
Otherwise, if the operand that has unsigned integer type has
rank greater or equal to the rank of the type of the other
operand, then the operand with signed integer type is
converted to the type of the operand with unsigned integer
type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is
converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned
integer type corresponding to the type of the operand with signed
integer type.
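The third rule above (the one beginning "Otherwise, if the operand that has unsigned integer type has rank greater or equal") applies here, because we are comparing an int with an unsigned int. So the value of x is converted to type unsigned int.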
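One way to see which common type the usual arithmetic conversions pick is to ask _Generic for the type of a + b, since + applies the same conversions as ==. This is only a sketch: COMMON_TYPE and the two functions are hypothetical names, and the second result assumes a 64-bit long long and a 32-bit int:

```c
/* Name of the type that the usual arithmetic conversions choose for a and b.
   COMMON_TYPE is a hypothetical helper built on C11 _Generic. */
#define COMMON_TYPE(a, b) _Generic((a) + (b),   \
    int: "int",                                 \
    unsigned int: "unsigned int",               \
    long: "long",                               \
    unsigned long: "unsigned long",             \
    long long: "long long",                     \
    unsigned long long: "unsigned long long",   \
    default: "other")

/* Same rank, one operand unsigned: the signed operand is converted. */
const char *int_vs_unsigned(void) {
    int x = 0;
    unsigned int y = 0;
    return COMMON_TYPE(x, y);
}

/* The signed type has greater rank and can represent every unsigned int
   value, so the unsigned operand is converted instead. */
const char *long_long_vs_unsigned(void) {
    long long x = 0;
    unsigned int y = 0;
    return COMMON_TYPE(x, y);
}
```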
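Going back to the conversion rule in section 6.3.1.3p2, the maximum value of unsigned int (assuming 32 bits) is 2^32 - 1, so one more than this is 2^32. Adding 2^32 to the value of x, which is -2^31, gives 2^31. This is the same as the value of y, so the comparison is true.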
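In the second example, x has type int64_t and y has type int. The implementation-defined conversion from int64_t to int when x is assigned to y will likely result in y being 0, because the low-order 32 bits of x are all 0. In the comparison, y is then converted to int64_t, which does not change its value, and 0 is not equal to the (nonzero) value of x, so the result is false.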
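The second comparison can be sketched the same way. The function names are illustrative, and the sketch deliberately avoids the implementation-defined narrowing step by taking y as a parameter:

```c
#include <stdint.h>

/* x == y with an int64_t and an int: y is converted to int64_t,
   which preserves its value. */
int compare_wide(int64_t x, int y) {
    return x == y;
}

/* The same comparison with the widening conversion written out. */
int compare_wide_explicit(int64_t x, int y) {
    return x == (int64_t)y;
}
```

Calling compare_wide(INT64_MIN, 0) mirrors the situation in the question once y has ended up as 0.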
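You are jumping to a conclusion when you ask whether the comparison is based on values or bit patterns, because there is an important step first. Before the comparison, the operands of == are converted to a common type.
For example, when you use x == y to compare a 32-bit two's complement int x with bit pattern 1000…0000 in binary (representing −2,147,483,648) to an unsigned int y with the same bit pattern (representing +2,147,483,648), x is first converted to unsigned int, yielding +2,147,483,648. Then +2,147,483,648 is compared with +2,147,483,648, so == reports that they are equal.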
C 2018 6.5.9 ("Equality operators") paragraph 4 says:
If both of the operands have arithmetic type, the usual arithmetic conversions are performed…
The usual arithmetic conversions are specified in 6.3.1.8. Paragraph 1 begins:
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted… to a type whose corresponding real type is the common real type.
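The rules involve some technical details but, for the most part, when you compare two integer types, each is first promoted to at least int, and then the narrower type is converted to the wider type. If they have the same width but one is unsigned, the signed type is converted to the unsigned type. This may change the value.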
Once the actual values to be compared are determined, the result of == is defined in terms of values, not bit patterns.
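(The most common case where this distinction shows is floating-point +0 and −0, which represent the same real number and compare equal but have different representations. In most modern environments, all bit patterns in integer types represent distinct values, and all bit patterns in binary floating-point types represent distinct values or NaNs, except for +0 and −0. Some less commonly used floating-point types have multiple representations for some values, similar to the way 3.5•10^7 and 35•10^6 represent the same number.)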
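The +0 and −0 case is easy to try out. This is a small sketch assuming IEEE 754 doubles; the two helper names are invented for the example:

```c
#include <string.h>

/* Comparison by value, which is what == does. */
int same_value(double a, double b) {
    return a == b;
}

/* Comparison of the object representations, byte by byte. */
int same_representation(double a, double b) {
    return memcmp(&a, &b, sizeof a) == 0;
}
```

With IEEE 754, same_value(0.0, -0.0) is true while same_representation(0.0, -0.0) is false, because only the sign bit differs.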
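Any time you compare a negative value of a signed integer type with an unsigned type of the same or greater width (after promotion), the value of the signed type will be changed before the comparison. So you risk getting a "mathematically wrong" result.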
The exact meaning, according to the language definition:
6.2.6 Representations of types
6.2.6.1 General
4 Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT
bits, where n is the size of an object of that type, in bytes. The value may be copied into
an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
called the object representation of the value. Values stored in bit-fields consist of m bits,
where m is the size specified for the bit-field. The object representation is the set of m
bits the bit-field comprises in the addressable storage unit holding it. Two values (other
than NaNs) with the same object representation compare equal, but values that compare
equal may have different object representations.
...
6.5.9 Equality operators
...
4 If both of the operands have arithmetic type, the usual arithmetic conversions are
performed. Values of complex types are equal if and only if both their real parts are equal
and also their imaginary parts are equal. Any two values of arithmetic types from
different type domains are equal if and only if the results of their conversions to the
(complex) result type determined by the usual arithmetic conversions are equal.
5 Otherwise, at least one operand is a pointer. If one operand is a pointer and the other is a
null pointer constant, the null pointer constant is converted to the type of the pointer. If
one operand is a pointer to an object type and the other is a pointer to a qualified or
unqualified version of void, the former is converted to the type of the latter.
6 Two pointers compare equal if and only if both are null pointers, both are pointers to the
same object (including a pointer to an object and a subobject at its beginning) or function,
both are pointers to one past the last element of the same array object, or one is a pointer
to one past the end of one array object and the other is a pointer to the start of a different
array object that happens to immediately follow the first array object in the address
space.109)
109) Two objects may be adjacent in memory because they are adjacent elements of a larger array or
adjacent members of a structure with no padding between them, or because the implementation chose
to place them so, even though they are unrelated. If prior invalid pointer operations (such as accesses
outside array bounds) produced undefined behavior, subsequent comparisons also produce undefined
behavior.
C 2011 Online Draft
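Semantically, the == operator compares values, not bits: 1.0 == 1 evaluates to true, even though the two operands have completely different bitwise representations.
However, as part of the comparison, the integer 1 is first converted to the floating-point value 1.0, so that a bitwise comparison can be made.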
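A short sketch of that mixed comparison, with illustrative function names: the int operand is converted to double, so the implicit and explicit forms should give the same answer.

```c
/* i is converted to double by the usual arithmetic conversions. */
int compare_int_double(int i, double d) {
    return i == d;
}

/* The same comparison with the conversion written out. */
int compare_converted(int i, double d) {
    return (double)i == d;
}
```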