Why is clang dereferencing a parameter on every use?

Question

I have been performance-optimizing some code at work and stumbled upon some strange behavior, which I have boiled down to the simple C++ snippet below:

#include <stdint.h>

void Foo(uint8_t*& out)
{
    out[0] = 1;
    out[1] = 2;
    out[2] = 3;
    out[3] = 4;
}

I then compiled it with clang (on Windows) using the following command: clang -S -O3 -masm=intel test.cpp. This produces the following assembly:

        mov     rax, qword ptr [rcx]
        mov     byte ptr [rax], 1
        mov     rax, qword ptr [rcx]
        mov     byte ptr [rax + 1], 2
        mov     rax, qword ptr [rcx]
        mov     byte ptr [rax + 2], 3
        mov     rax, qword ptr [rcx]
        mov     byte ptr [rax + 3], 4
        ret

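Why does clang generate code that repeatedly dereferences the out parameter into the rax register? This looks like a very obvious optimization that it is deliberately not making, so the question is: why?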

Interestingly, I tried changing uint8_t to uint16_t, and much better machine code is generated as a result:

        mov     rax, qword ptr [rcx]
        movabs  rcx, 1125912791875585
        mov     qword ptr [rax], rcx
        ret
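For reference, the movabs immediate is simply the four 16-bit values 1, 2, 3, 4 packed into a single 64-bit word in little-endian order, i.e. 0x0004000300020001. A minimal sketch that checks the arithmetic at compile time (PackLE is a hypothetical helper name introduced here for illustration, not something from the original code):

```cpp
#include <cstdint>

// Hypothetical helper: packs four 16-bit values into one 64-bit word in the
// order they would occupy memory on a little-endian target such as x86-64.
constexpr std::uint64_t PackLE(std::uint16_t a, std::uint16_t b,
                               std::uint16_t c, std::uint16_t d)
{
    return std::uint64_t(a)
         | (std::uint64_t(b) << 16)
         | (std::uint64_t(c) << 32)
         | (std::uint64_t(d) << 48);
}

// The decimal immediate from the listing and its hexadecimal form agree.
static_assert(PackLE(1, 2, 3, 4) == 1125912791875585ULL,
              "matches the movabs immediate");
static_assert(PackLE(1, 2, 3, 4) == 0x0004000300020001ULL,
              "same value written in hexadecimal");
```

So in the uint16_t case the four stores were merged into a single 8-byte store after a single load of out.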

Answer 1

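The compiler cannot perform this optimization because of the strict aliasing rule: uint8_t is always* defined as unsigned char, and a pointer to a character type is allowed to alias any object. It can therefore point to any memory location, including the pointer out itself, and because you pass that pointer by reference, every write inside the function may have the side effect of changing out. The compiler must therefore reload out after each store.

Here is an obscure, yet correct, usage that depends on non-cached reads: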

#include <cassert>
#include <stdint.h>
void Foo(uint8_t*& out)
{
    uint8_t local;
    // CANNOT be used as a cached value further down in the code.
    uint8_t* tmp = out;
    // Recover the stored pointer.
    uint8_t **orig =reinterpret_cast<uint8_t**>(out);
    // CHANGES `out` itself;
    *orig=&local;

    **orig=5;
    assert(local==5);
    // IS NOT EQUAL even though we did not touch `out` at all;
    assert(tmp!=out);
    assert(out==&local);
    assert(*out==5);
}

int main(){
   // True type of the stored ptr is uint8_t**
   uint8_t* ptr = reinterpret_cast<uint8_t*>(&ptr);

   Foo(ptr);
}

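This also explains why uint16_t produces the "optimized" code: uint16_t can never* be (unsigned) char, so the compiler is free to assume that it does not alias objects of other types, such as the pointer itself.

*Except perhaps on some obscure platforms with differently sized bytes, which are irrelevant here. That is beside the point.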
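If the self-aliasing case does not matter for your code, one common way around the reloads is to read the reference once into a local pointer and write through that copy. The sketch below assumes this is acceptable for the caller; FooCached is a hypothetical name used only for illustration. Because the address of the local p is never taken, no store through it can modify p, so the compiler should be free to keep it in a register and merge the stores (exact code generation may vary by compiler and version).

```cpp
#include <stdint.h>

// Hypothetical variant of Foo: reads the reference exactly once.
// Note the semantic difference: if `out` happened to point at itself,
// this version would keep writing through the ORIGINAL pointer value,
// whereas Foo would follow the modified one.
void FooCached(uint8_t*& out)
{
    uint8_t* p = out;  // single load of the referenced pointer
    p[0] = 1;
    p[1] = 2;
    p[2] = 3;
    p[3] = 4;
}
```

Compiling this with the same command as in the question (clang -S -O3 -masm=intel) is a quick way to check whether the repeated loads disappear on your toolchain.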

c++

windows

clang++