Am I invoking undefined behavior when casting back to my pointer?

Question

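I have been trying to figure out whether this is an optimization bug, since it only seems to affect stack variables, and I am wondering whether some incorrect assumptions are being made. I have a type that converts a pointer to and from a relative offset. It worked fine with reinterpret_cast, but now that I am moving to static_cast it has started causing problems in optimized builds. I need to get away from reinterpret_cast for safety certification reasons, so keeping it as it was is not an option.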

#include <cstddef>
#include <iostream>

template <typename T>
class Ptr
{
public:
    Ptr(const T* ptr = nullptr) : m_offset(GetOffset(ptr)) {}
    T& operator*() const noexcept { return *GetPtr(); } 
    T* get() const noexcept { return GetPtr(); }

    bool operator==(const T *ptr) const {
        // comment this back in and it stops failing
        //std::cout << "{op==" << get() << " == " << ptr << "}";
        return get() == ptr;
    }

  private:
    std::ptrdiff_t m_offset = 0;

    inline T* GetPtr() const
    {
        auto offset = m_offset;
        auto const_void_address = static_cast<const void*>(&m_offset);
        auto const_char_address = static_cast<const char*>(const_void_address);
        auto offset_address = const_cast<char*>(const_char_address);
        auto final_address  = static_cast<void*>(offset_address - offset);
        return static_cast<T*>(final_address);
    }

    std::ptrdiff_t GetOffset(const void* ptr) const
    {
        auto void_address = static_cast<const void*>(&m_offset);
        auto offset_address = static_cast<const char*>(void_address);
        auto ptr_address = static_cast<const char*>(ptr);
        return offset_address - ptr_address;
    }
};

std::ostream& operator<<(std::ostream &stream, const Ptr<int> &rp) {
    stream << rp.get();
    return stream;
}

int main() {
    int data = 123;
    Ptr<int> rp(&data);
    std::cout << "data " << data << " @ " << &data << std::endl;
    std::cout << "rp " << *rp << " get " << rp.get() << std::endl;
    std::cout << (rp == &data) << std::endl;
    std::cout << "(rp.get() == &data) = " << (rp.get() == &data) << std::endl;
    std::cout << "(rp == &data) = " << (rp == &data) << std::endl;
    return 0;
}

With optimization turned on, I get output like this:

data 123 @ 0x7ffe79544a34
rp 123 get 0x7ffe79544a34
0
(rp.get() == &data) = 0
(rp == &data) = 0

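The output is clearly inconsistent with itself: rp.get() prints the same address as &data, yet every comparison against &data yields 0.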

I have tested this on GCC 8, 9, and 11.2.

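It is fine as soon as I go back to -O0.
It is fine if I uncomment the std::cout in operator==.
It is fine if I go back to reinterpret_cast (return reinterpret_cast<T*>(reinterpret_cast<std::ptrdiff_t>(&m_offset) - offset);).
It is also fine if I allocate data through a pointer and initialize rp from that.
It seems to behave as I expect under clang (8 through 13 look fine).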

It feels like either I do not understand where the UB comes from, or there is a compiler optimization bug here.

Edit/update:

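After looking at more of the details, I think the only solution is to do the type punning in a semi-safe way, so I tried the following and it seems to work. (Apparently what I am doing here is part of C++20 under the name bit_cast, so maybe this is valid?)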

    // Requires <cstdint> (std::intptr_t) and <cstring> (std::memcpy).
    inline T* GetPtr() const
    {
        // Copy the address of m_offset into an integer, byte for byte.
        auto address_of_m_offset = &m_offset;
        std::intptr_t realAddress;
        std::memcpy(&realAddress, &address_of_m_offset, sizeof(realAddress));
        // Plain integer arithmetic, not pointer arithmetic.
        realAddress -= m_offset;
        // Copy the adjusted integer back into a pointer.
        T* outValue;
        std::memcpy(&outValue, &realAddress, sizeof(outValue));
        return outValue;
    }

    std::ptrdiff_t GetOffset(const void* ptr) const
    {
        // Both addresses are turned into integers before subtracting.
        auto address_of_m_offset = &m_offset;
        std::intptr_t myAddress;
        std::memcpy(&myAddress, &address_of_m_offset, sizeof(myAddress));
        std::intptr_t realAddress;
        std::memcpy(&realAddress, &ptr, sizeof(realAddress));
        return static_cast<std::ptrdiff_t>(myAddress - realAddress);
    }

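This no longer seems to cause problems with GCC. I have heard that std::memcpy is what to use between objects of the same size in places where we would otherwise use reinterpret_cast, so this makes sense to me.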
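For reference, the memcpy idea can be sketched as two free functions. The names pointer_to_bits and bits_to_pointer are hypothetical and not part of the class above. The sketch assumes std::intptr_t exists and has the same size as a pointer. It only demonstrates the round trip without arithmetic. Whether a pointer rebuilt after integer arithmetic is usable depends on the implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: pointers and std::intptr_t have the same size on this platform.
static_assert(sizeof(std::intptr_t) == sizeof(void*),
              "sketch assumes pointer-sized intptr_t");

// Copies the bytes of a pointer into an integer (hypothetical helper).
std::intptr_t pointer_to_bits(const void* ptr)
{
    std::intptr_t bits;
    std::memcpy(&bits, &ptr, sizeof(bits));
    return bits;
}

// Copies the bytes of an integer back into a pointer (hypothetical helper).
template <typename T>
T* bits_to_pointer(std::intptr_t bits)
{
    static_assert(sizeof(T*) == sizeof(std::intptr_t),
                  "sketch assumes pointer-sized intptr_t");
    T* ptr;
    std::memcpy(&ptr, &bits, sizeof(ptr));
    return ptr;
}

// Small self-check: an unmodified round trip yields the original pointer.
void check_round_trip()
{
    int value = 7;
    assert(bits_to_pointer<int>(pointer_to_bits(&value)) == &value);
}
```

In C++20, std::bit_cast from the <bit> header performs the same kind of byte-wise copy between two types of equal size.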

Answer 1

Your undefined behavior is in GetOffset.

The standard defines pointer subtraction like this:

When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header ([support.types.layout]).

If P and Q both evaluate to null pointer values, the result is 0.

Otherwise, if P and Q both point to, respectively, array elements i and j of the same array object x, the expression P - Q has the value i - j.

Otherwise, the behavior is undefined.

Here, P (the address of m_offset) and Q (the address of data) are not elements of the same array, so this is undefined behavior.
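As a minimal sketch of that rule (the function names are made up for illustration), the subtraction below is defined because both pointers point into the same array. The commented-out line mirrors what GetOffset does with two unrelated objects.

```cpp
#include <cassert>
#include <cstddef>

// Defined only when p and q point to elements of the same array
// (or one past its end); the caller has to guarantee that.
std::ptrdiff_t element_distance(const int* p, const int* q)
{
    return p - q;
}

std::ptrdiff_t distance_within_array()
{
    int values[8] = {};
    // Elements i = 6 and j = 2 of the same array: the result is i - j.
    const std::ptrdiff_t distance = element_distance(&values[6], &values[2]);
    assert(distance == 4);
    return distance;
}

// Undefined: a and b are separate objects, not elements of one array.
// int a = 0, b = 0;
// std::ptrdiff_t bad = &a - &b;
```

Going through char* does not change this: the two char pointers in GetOffset still point into different objects.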

Addition and subtraction of a pointer and an integer are also defined in terms of array elements:

When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.

If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.

Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i-j of x if 0≤i-j≤n.

Otherwise, the behavior is undefined.

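That subtraction happens at offset_address - offset, where P is the address of m_offset and offset is probably some positive number. m_offset is the first element of its array (i = 0), so i - j < 0, which is outside the allowed range 0≤i-j≤n.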
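The range check can be written out as a small sketch. The helper subtraction_is_defined is hypothetical, and the values 8 (object size in bytes) and 16 (offset) are assumed purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// True when P - J is defined for a pointer P to element i of an
// n-element array: the resulting index must satisfy 0 <= i - j <= n.
constexpr bool subtraction_is_defined(std::ptrdiff_t i, std::ptrdiff_t j,
                                      std::ptrdiff_t n)
{
    return 0 <= i - j && i - j <= n;
}

// Stepping back from the one-past-the-end pointer stays inside the array.
static_assert(subtraction_is_defined(4, 2, 4), "end - 2 is element 2");
// The situation in GetPtr: P is element 0 of an (assumed) 8-byte object
// and the offset is positive, so the index would be negative.
static_assert(!subtraction_is_defined(0, 16, 8), "falls before the array");

int third_element()
{
    int values[4] = {10, 20, 30, 40};
    int* end = values + 4;  // may be formed, must not be dereferenced
    int* p = end - 2;       // points to values[2]
    assert(p == &values[2]);
    return *p;
}
```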

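So the compiler can see that GetPtr returns a pointer derived from m_offset (within the char[sizeof(Ptr<int>)] array that aliases the object). Without UB, such a pointer cannot equal the address of data, so the optimizer is allowed to replace (rp.get() == &data) with false.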

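When you do the arithmetic on ptrdiff_t values, there are no such restrictions on addition and subtraction. The standard does not guarantee that reinterpret_cast<char*>(reinterpret_cast<intptr_t>(char_pointer) + n) == char_pointer + n (that is, that pointers map linearly to integers, as you would expect), but that is what happens when compiling with gcc on common architectures.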
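A minimal sketch of that integer-based arithmetic follows. The function name advance_via_integer is made up. The sketch assumes std::intptr_t is available and that the implementation maps pointers to integers linearly, which the standard does not promise.

```cpp
#include <cassert>
#include <cstdint>

// Moves p by n bytes using integer arithmetic instead of pointer arithmetic.
// The pointer <-> integer mapping is implementation-defined; the result
// matches p + n only on implementations with a linear mapping.
char* advance_via_integer(char* p, std::intptr_t n)
{
    assert(p != nullptr);
    const auto address = reinterpret_cast<std::intptr_t>(p);
    return reinterpret_cast<char*>(address + n);
}
```

On such an implementation, advance_via_integer(buffer, 5) compares equal to buffer + 5 for a char buffer[16].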

c++

optimization

undefined-behavior

language-lawyer