What is the difference between literals and variables in C (signed vs unsigned short ints)?

Question

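I came across the following code in Computer Systems: A Programmer's Perspective, 2/E. It works well and produces the desired output. The output can be explained by the difference between signed and unsigned representations.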

#include<stdio.h>
int main() {
    if (-1 < 0u) {
        printf("-1 < 0u\n");
    }
    else {
        printf("-1 >= 0u\n");
    }
    return 0;
}

The code above yields -1 >= 0u. However, the following code, which should behave the same, does not! That is,

#include <stdio.h>

int main() {

    unsigned short u = 0u;
    short x = -1;
    if (x < u)
        printf("-1 < 0u\n");
    else
        printf("-1 >= 0u\n");
    return 0;
}

yields -1 < 0u. Why does this happen? I cannot explain it.

Note that I have seen similar questions like this, but they do not help.

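PS. As @Abhineet said, the dilemma can be resolved by changing short to int. However, how can one explain this phenomenon? In other words, -1 is 0xff ff ff ff in 4 bytes and 0xff ff in 2 bytes. Interpreting these two's complement bit patterns as unsigned gives the values 4294967295 and 65535 respectively. Neither is less than 0, so I think the output should be -1 >= 0u in both cases, i.e. x >= u.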

Sample output on a little endian Intel system:

For short:

-1 < 0u
u =
 00 00
x =
 ff ff

For int:

-1 >= 0u
u =
 00 00 00 00
x =
 ff ff ff ff

Answer 1

0u is not unsigned short, it's unsigned int.

Edit: an explanation of the behavior. How is the comparison performed?

As Jens Gustedt answered,

This is called "usual arithmetic conversions" by the standard and applies whenever two different integer types occur as operands of the same operator.

In essence, what it does:

- if the types have different width (more precisely, what the standard calls conversion rank), then it converts to the wider type
- if both types are of the same width, besides really weird architectures, the unsigned of them wins

Signed to unsigned conversion of the value -1, with whatever type, always results in the highest representable value of the unsigned type.

A more explanatory blog post written by him can be found here.

Answer 2

You're running afoul of C's integer promotion rules.

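Operators applied to types smaller than int automatically promote their operands to int or unsigned int. See the comments for more detailed explanations. For binary (two-operand) operators there is a further step if the types still don't match after that (e.g. unsigned int vs. int). I won't try to summarize the rules in more detail than that. See Lundin's answer.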

This blog post covers this in more detail, with an example similar to yours: signed and unsigned char. It quotes the C99 spec:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

You can play around with this more easily on something like Godbolt, with a function that returns one or zero. Just look at the compiler output to see what ends up happening.

#define mytype short

int main() {
    unsigned mytype u = 0u;
    mytype x = -1;
    return (x < u);
}

Answer 3

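Contrary to what you seem to assume, this is not a property of the particular width of the types, here 2 bytes versus 4 bytes, but a question of which rules are applied. The integer promotion rules state that short and unsigned short are converted to int on all platforms where the corresponding range of values fits into int. Since that is the case here, both values are preserved and obtain the type int. -1 is perfectly representable in int, as is 0. So the test finds that -1 is smaller than 0.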

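In the case of testing -1 against 0u, the common conversion chooses unsigned as the common type to which both operands are converted. -1 converted to unsigned is the value UINT_MAX, which is larger than 0u.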

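This is a good example of why you should never use "narrow" types to do arithmetic or comparisons. Only use them if you have a severe size constraint. That will rarely be the case for simple variables; it mostly applies to large arrays, where you can really gain from storing values in a narrow type.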

Answer 4

The code above yields -1 >= 0u

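All integer literals (numeric constants) have a type and therefore also a signedness. By default, a decimal literal that fits in int has the type int, which is signed. When you append the u suffix, you turn the literal into an unsigned int.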

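For any C expression with one signed operand and one unsigned operand of the same rank after integer promotion (as with int and unsigned int), the balancing rule (formally: the usual arithmetic conversions) implicitly converts the signed type to the unsigned one.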

Conversion from signed to unsigned is well-defined (6.3.1.3):

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

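For example, for 32-bit integers on a standard two's complement system, the maximum value of an unsigned integer is 2^32 - 1 (4294967295, UINT_MAX in limits.h). One more than the maximum value is 2^32. And -1 + 2^32 = 4294967295, so the value -1 is converted to an unsigned int with the value 4294967295, which is larger than 0.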

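However, when you switch the types to short, you end up with a small integer type. This is the difference between the two examples. Whenever a small integer type is part of an expression, the integer promotion rule implicitly converts it to a larger int (6.3.1.1):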

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

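If short is smaller than int on the given platform (as is the case on 32- and 64-bit systems), any short or unsigned short will therefore always be converted to int, because their values fit inside one.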

So for the expression if (x < u), you actually end up with if((int)x < (int)u), which behaves as expected (-1 is less than 0).

c

bit-manipulation

twos-complement

unsigned-integer

integer-promotion