Bit-shifting left and discarding bits

Question

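Consider a function that zeroes the rightmost N bits of an unsigned short value (or of any other unsigned integer type). One possible implementation looks like this: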

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
  using type = unsigned short;

  constexpr type mask = ~(type(0));
  constexpr type right_zeros = mask << shift; // <-- error here
  return arg & right_zeros;
}

int check() {
  return zero_right<4>(16);
}

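Every compiler I have access to complains about a possible overflow in this code, one way or another. Clang is the most explicit, with the following message: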

error: implicit conversion from 'int' to 'const type' (aka 'const unsigned short') changes value from 1048560 to 65520 [-Werror,-Wconstant-conversion]

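This code looks well defined and perfectly clear to me, but when 3 compilers complain I get very nervous. Am I missing something here? Is there really a chance that something fishy is happening?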

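P.S. While alternative implementations of zeroing out the right X bits might be welcome and interesting, the primary focus of this question is the validity of the code as posted.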

Answer 1

From the C++11 standard:

5.8 Shift operators [expr.shift]

1 ...

The operands shall be of integral or unscoped enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand.

The expression

mask << shift;

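is evaluated after integral promotion has been applied to mask. Hence, if sizeof(unsigned short) is 2 (and int is wider), mask is promoted to the int value 65535 and the shift evaluates to 1048560, which explains the message from clang.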
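The promotion can be observed directly at compile time. The following is a minimal sketch, assuming a platform with a 16-bit unsigned short and a 32-bit int; the names promo_mask and promoted_shift_value are made up for this illustration.

```cpp
#include <type_traits>

// Assumption: 16-bit unsigned short and 32-bit int (typical desktop targets).
constexpr unsigned short promo_mask = static_cast<unsigned short>(~0u);

// The left operand is promoted first, so the shift expression has type int,
// not unsigned short.
static_assert(std::is_same<decltype(promo_mask << 4), int>::value,
              "the result type is the promoted type of the left operand");

// Hypothetical helper that returns the untruncated result of the shift.
constexpr int promoted_shift_value() {
    return promo_mask << 4;
}

static_assert(promoted_shift_value() == 1048560,
              "65535 << 4 computed in int");
```

The value 1048560 is exactly the one quoted in the clang diagnostic.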

One way to avoid the overflow problem is to shift right first, before shifting left, and to move that logic into a function of its own.

template <typename T, unsigned int shift>
constexpr T right_zero_bits()
{
   // ~(T(0)) performs integral promotion, if needed
   // T(~(T(0))) truncates the number to T, if needed.
   return (T(~(T(0))) >> shift ) << shift;
}

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
   return arg & right_zero_bits<unsigned short, shift>();
}
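The reason the right-then-left order helps is that the intermediate value never exceeds the maximum of T, so the final narrowing does not change the value. A small compile-time check, written as a sketch with the hypothetical name keep_high_bits_mask and assuming an 8-bit unsigned char and a 16-bit unsigned short:

```cpp
// Hypothetical helper mirroring the right-then-left idea from this answer.
template <typename T, unsigned int shift>
constexpr T keep_high_bits_mask() {
    // T(~T(0)) is the all-ones value of T. Shifting right drops the low bits,
    // shifting left restores the position, so the result always fits in T.
    return static_cast<T>((static_cast<T>(~T(0)) >> shift) << shift);
}

// Assumption: 16-bit unsigned short, 8-bit unsigned char.
static_assert(keep_high_bits_mask<unsigned short, 4>() == 0xFFF0,
              "low 4 bits cleared");
static_assert(keep_high_bits_mask<unsigned char, 3>() == 0xF8,
              "low 3 bits cleared");
```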

Answer 2

I don't know whether this is exactly what you want, but it compiles:

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
  using type = unsigned short;

  //constexpr type mask = ~(type(0));
  type right_zeros = ~(type(0));
  right_zeros <<= shift;
  return arg & right_zeros;
}

int check() {
  return zero_right<4>(16);
}

Update:

Seems like you simply hushed the compiler by making sure it has no idea what is going on with the types.

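No.

First you get right_zeros with the value FFFF (from ~0). Normally, ~0 is FFFFFFFFFFFFFF... but because you are using a u16, you get FFFF.

Then the shift by 4 yields FFFF0 [the calculation is extended to 32 bits], but when the result is stored back only the rightmost 16 bits remain, so the value is FFF0.

This is perfectly legal and defined behavior, and you are taking advantage of the truncation. The compiler is not "being fooled". In fact, it works fine with or without the truncation.

If you wanted, you could make right_zeros a u32 or u64, but then you would need to add right_zeros &= 0xFFFF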
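As an illustration of that last remark, here is a sketch of a variant that builds the mask in a 32-bit unsigned type and masks it back down explicitly. The name zero_right_wide and the use of the <cstdint> fixed-width types are assumptions made for this example; it is only meant for shift counts below 32.

```cpp
#include <cstdint>

// Hypothetical variant: build the mask in 32 bits, then keep the low 16 bits
// explicitly instead of relying on truncation when storing to a 16-bit type.
template <unsigned int shift>
constexpr std::uint16_t zero_right_wide(std::uint16_t arg) {
    return static_cast<std::uint16_t>(
        arg & ((std::uint32_t{0xFFFF} << shift) & 0xFFFFu));
}

static_assert(zero_right_wide<4>(0xFFFF) == 0xFFF0, "FFFF0 masked to FFF0");
```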

If there is an undefined behavior (the very essence of my question!) you simply made it undetectable.

No matter what the compiler says, there is no UB, based on the totality of your code.

Actually, Tavian got it. Use an explicit cast:

constexpr type right_zeros = (type) (mask << shift); // now clean

Among other things, this tells the compiler that you want the truncation to 16 bits.
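Put back into the original function, the explicit cast might look like the sketch below. The name zero_right_cast is made up to keep it apart from the question's function, and the use of static_cast instead of a C-style cast is a stylistic choice of this example. It is only meant for shift counts where 65535 << shift still fits in int.

```cpp
// Hypothetical variant of the question's function with explicit truncation.
template <unsigned int shift>
unsigned short zero_right_cast(unsigned short arg) {
    using type = unsigned short;

    constexpr type mask = static_cast<type>(~type(0));
    // The shift is still computed in int; the cast states that dropping
    // the high bits is intended.
    constexpr type right_zeros = static_cast<type>(mask << shift);
    return arg & right_zeros;
}
```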

If there were UB, the compiler should still complain.

Answer 3

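Yes, as you suspect, your code is strictly speaking not fully portable even after the compiler diagnostics are suppressed, because unsigned short is promoted to signed int, the bitwise arithmetic is done in signed int, and the signed int is then converted back to unsigned short. You have managed to avoid undefined behavior (I think, after a quick look), but the result is not guaranteed to be what you hoped for. (type)~(type)0 is not required to correspond to "all bits one" in type type; it is already iffy before the shift.

To get something fully portable, just make sure all your arithmetic is done in at least unsigned int (wider types if necessary, but never narrower). Then there are no promotions to signed types to worry about.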

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
  using type = unsigned short;

  constexpr auto mask = ~(type(0) + 0U);
  constexpr auto right_zeros = mask << shift;
  return arg & right_zeros;
}

int check() {
  return zero_right<4>(16);
}
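The effect of adding 0U can be checked at compile time. This is a minimal sketch; the helper name all_ones_unsigned is made up, and the second assertion assumes that int can represent every unsigned short value, as on common platforms.

```cpp
#include <limits>
#include <type_traits>

// Adding 0U makes the usual arithmetic conversions pick unsigned int,
// so everything after it stays in an unsigned type.
static_assert(
    std::is_same<decltype(static_cast<unsigned short>(0) + 0U),
                 unsigned int>::value,
    "arithmetic happens in unsigned int");

// Without it, ~ operates on the promoted operand, which is a signed int
// (assumption: int can represent all unsigned short values).
static_assert(
    std::is_same<decltype(~static_cast<unsigned short>(0)), int>::value,
    "plain ~ works on the promoted int");

// Hypothetical helper returning the mask used in this answer.
constexpr unsigned int all_ones_unsigned() {
    return ~(static_cast<unsigned short>(0) + 0U);
}

static_assert(all_ones_unsigned() == std::numeric_limits<unsigned int>::max(),
              "all bits set in unsigned int");
```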

Answer 4

The message looks straightforward:

error: implicit conversion from 'int' to 'const type' (aka 'const unsigned short') changes value from 1048560 to 65520 [-Werror,-Wconstant-conversion]

The value of mask << shift is 1048560 (derived from 65535 << 4), and you assign it to an unsigned short, which is defined to adjust the value mod 65536, giving 65520.
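The arithmetic can be confirmed with a couple of static assertions. This sketch uses std::uint16_t so that the 16-bit width is explicit; the helper name narrow_to_u16 is made up for the example.

```cpp
#include <cstdint>

// Hypothetical helper: convert a wider unsigned value to 16 bits.
// Conversion to an unsigned type reduces the value modulo 2^16.
constexpr std::uint16_t narrow_to_u16(std::uint32_t wide) {
    return static_cast<std::uint16_t>(wide);
}

static_assert(1048560u % 65536u == 65520u, "1048560 mod 65536");
static_assert(narrow_to_u16(1048560u) == 65520u,
              "the conversion gives the same result");
```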

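This last conversion is well defined. The error message appears because you passed the compiler flags -Werror,-Wconstant-conversion, which request an error message in this situation anyway. If you don't want this error, don't pass those flags.

Although this particular usage is well defined, there could be undefined behavior for certain inputs (namely, if shift is 16 or greater on a system with 32-bit int). So you should fix the function.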
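One possible way to keep problematic shift counts out entirely is to reject them at compile time. The following is only a sketch under that assumption: the name zero_right_checked and the choice to reject counts of 16 and above (rather than return 0 for them) are not from the original post.

```cpp
#include <limits>

// Hypothetical guarded variant: shift counts that reach the bit width of
// unsigned short are rejected when the template is instantiated.
template <unsigned int shift>
constexpr unsigned short zero_right_checked(unsigned short arg) {
    static_assert(shift < std::numeric_limits<unsigned short>::digits,
                  "shift must be smaller than the bit width of unsigned short");
    // ~0u is unsigned int, so the shift never happens in a signed type.
    return static_cast<unsigned short>(arg & (~0u << shift));
}

static_assert(zero_right_checked<4>(16) == 16, "bit 4 survives");
```

With this guard, zero_right_checked<16> fails to compile instead of reaching a shift that could misbehave.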

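To fix the function, you need to be more careful in the unsigned short case, because the rules that promote unsigned short to signed int are extremely annoying.

Here is a solution that differs a bit from the others offered. It avoids the shift problem entirely and works for any shift size: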

template<unsigned int shift, typename T>
constexpr T zero_right(T arg)
{
    T mask = -1;
    for (int s = shift; s--; ) mask *= 2u;
    return mask & arg;
}

// Demo
auto f() { return zero_right<15>((unsigned short)65535); }  //  mov eax, 32768


c++

integer-overflow

bit-shift

language-lawyer