Am I invoking undefined behavior when casting back to my pointer?

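I've been trying to work out whether this is an optimizer bug, since it only seems to affect stack variables, or whether I'm making an incorrect assumption somewhere. I have this type that stores a pointer as a relative offset. It worked fine while it used reinterpret_cast, but now that I'm moving to static_cast it has started causing problems in optimized builds. I need to get away from reinterpret_cast for safety certification reasons, so leaving it as-is is not an option.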

#include <cstddef>   // std::ptrdiff_t
#include <iostream>

template <typename T>
class Ptr
{
public:
    Ptr(const T* ptr = nullptr) : m_offset(GetOffset(ptr)) {}
    T& operator*() const noexcept { return *GetPtr(); } 
    T* get() const noexcept { return GetPtr(); }

    bool operator==(const T *ptr) const {
        // comment this back in and it stops failing
        //std::cout << "{op==" << get() << " == " << ptr << "}";
        return get() == ptr;
    }

  private:
    std::ptrdiff_t m_offset = 0;

    inline T* GetPtr() const
    {
        auto offset = m_offset;
        auto const_void_address = static_cast<const void*>(&m_offset);
        auto const_char_address = static_cast<const char*>(const_void_address);
        auto offset_address = const_cast<char*>(const_char_address);
        auto final_address  = static_cast<void*>(offset_address - offset);
        return static_cast<T*>(final_address);
    }

    std::ptrdiff_t GetOffset(const void* ptr) const
    {
        auto void_address = static_cast<const void*>(&m_offset);
        auto offset_address = static_cast<const char*>(void_address);
        auto ptr_address = static_cast<const char*>(ptr);
        return offset_address - ptr_address;
    }
};

std::ostream& operator<<(std::ostream &stream, const Ptr<int> &rp) {
    stream << rp.get();
    return stream;
}

int main() {
    int data = 123;
    Ptr<int> rp(&data);
    std::cout << "data " << data << " @ " << &data << std::endl;
    std::cout << "rp " << *rp << " get " << rp.get() << std::endl;
    std::cout << (rp == &data) << std::endl;
    std::cout << "(rp.get() == &data) = " << (rp.get() == &data) << std::endl;
    std::cout << "(rp == &data) = " << (rp == &data) << std::endl;
    return 0;
}

With optimizations turned on, I get output like this:

data 123 @ 0x7ffe79544a34
rp 123 get 0x7ffe79544a34
0
(rp.get() == &data) = 0
(rp == &data) = 0

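That output is clearly inconsistent with itself: the printed addresses match, yet every comparison yields 0.

I've tested this on GCC 8, 9 and 11.2.

It feels like either there is a source of UB here that I don't understand, or this is a compiler optimization bug.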


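Edit/Update:

After looking into this in more detail, I think the only solution is to do the type punning in a semi-safe way, so I tried the following and it seems to work. (Apparently what I'm doing now is essentially what C++20 provides as std::bit_cast, so maybe that makes it valid?)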

    inline T* GetPtr() const
    {
        auto offset = m_offset;
        intptr_t realAddress;
        auto address_of_m_offset = &m_offset;
        std::memcpy(&realAddress, &address_of_m_offset, sizeof( realAddress));
        realAddress -= m_offset;
        T *outValue;
        std::memcpy(&outValue, &realAddress, sizeof( outValue));
        return outValue;
    }

    std::ptrdiff_t GetOffset(const void* ptr) const
    {
        auto address_of_m_offset = &m_offset;
        intptr_t myAddress;
        std::memcpy(&myAddress, &address_of_m_offset, sizeof(myAddress));
        intptr_t realAddress;
        std::memcpy(&realAddress, &ptr, sizeof(realAddress));
        return static_cast<ptrdiff_t>(myAddress - realAddress);
    }

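This no longer seems to cause problems with GCC. I've heard that std::memcpy is the tool for the kind of reinterpretation we would otherwise use reinterpret_cast for, so this makes sense to me.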

Your undefined behavior is in GetOffset.

The standard defines pointer subtraction like this:

When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header ([support.types.layout]).

  • If P and Q both evaluate to null pointer values, the result is 0.
  • Otherwise, if P and Q both point to, respectively, array elements i and j of the same array object x, the expression P - Q has the value i - j.
  • Otherwise, the behavior is undefined.

Here, P (the address of m_offset) and Q (the address of data) are not elements of the same array, so this is undefined behavior.
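To make the rule concrete, here is a minimal sketch. The function name same_array_example is made up for this illustration and is not part of your code. It shows the one case where the subtraction is defined, with the undefined case left as a comment so the snippet itself stays valid.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: both pointers point into the same array,
// so P - Q is defined and equals the index difference i - j.
inline std::ptrdiff_t same_array_example()
{
    int values[5] = {10, 20, 30, 40, 50};
    const int* p = &values[4];   // element i = 4
    const int* q = &values[1];   // element j = 1
    std::ptrdiff_t d = p - q;    // defined: 4 - 1
    assert(q + d == p);          // adding the difference back recovers p
    return d;
}

// By contrast, this is what GetOffset effectively does, and it is undefined
// because the two objects are not elements of one array:
//     int a = 0;
//     int b = 0;
//     std::ptrdiff_t d = &a - &b;   // UB
```

Casting both pointers to const char* first, as GetOffset does, does not change this: the two char pointers still refer to bytes of two unrelated objects.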

Adding an integer to a pointer, or subtracting one from it, is also defined in terms of array elements:

When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.

  • If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
  • Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0 ≤ i+j ≤ n and the expression P - J points to the (possibly-hypothetical) array element i-j of x if 0 ≤ i-j ≤ n.
  • Otherwise, the behavior is undefined.
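The bounds in that wording can be written down as a small check. This is only a sketch: offset_is_defined and last_element are hypothetical names, and the check models the standard's condition on indices rather than inspecting real pointers. It assumes i + j does not overflow.

```cpp
#include <cstddef>

// Hypothetical helper: for a pointer to element i of an n-element array,
// P + j is defined only if the resulting index stays within [0, n].
// Index n is the one-past-the-end position, which may be formed
// but must not be dereferenced.
constexpr bool offset_is_defined(std::ptrdiff_t i, std::ptrdiff_t j, std::ptrdiff_t n)
{
    return 0 <= i + j && i + j <= n;
}

// Pointer arithmetic that stays inside the permitted range.
inline int last_element()
{
    int values[4] = {1, 2, 3, 4};
    int* first = values;     // element 0
    int* end = first + 4;    // one past the end: allowed
    return *(end - 1);       // element 3
}
```

Applied to the question: the address of m_offset is element 0 of the bytes making up the Ptr object, so stepping back by any positive amount corresponds to offset_is_defined(0, -j, n), which is false.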

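In GetPtr, an integer is subtracted from a pointer at offset_address - offset, where P is the address of m_offset and offset is probably some positive number. m_offset is the first element of its array (i = 0), so i - j < 0, which is outside the permitted range.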

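So the compiler can see that GetPtr returns a pointer derived from the address of m_offset (within the char[sizeof(Ptr<int>)] array that aliases the object). Without UB, that pointer cannot be equal to the address of data, so the optimizer is free to replace (rp.get() == &data) with false.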

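When you do the arithmetic on integers (intptr_t for the addresses, ptrdiff_t for the offset), there are no such restrictions on addition and subtraction. The standard doesn't guarantee that reinterpret_cast<char*>(reinterpret_cast<intptr_t>(char_pointer) + n) == char_pointer + n (that is, that pointers map linearly onto integers as you would expect), but this is what happens when compiling with gcc on common architectures.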
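As a rough sketch of that integer-based approach, the helpers below do the address arithmetic entirely on intptr_t and only convert to and from pointers with std::memcpy. The names to_address, from_address, relative_offset and resolve are hypothetical, and the code assumes intptr_t has the same size as a pointer and that addresses map linearly to integers. Neither assumption is guaranteed by the standard.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the object representation of a pointer into an integer.
inline std::intptr_t to_address(const void* p)
{
    static_assert(sizeof(std::intptr_t) == sizeof(const void*),
                  "sketch assumes intptr_t is pointer-sized");
    std::intptr_t address;
    std::memcpy(&address, &p, sizeof(address));
    return address;
}

// Hypothetical helper: the reverse conversion, integer back to pointer.
inline void* from_address(std::intptr_t address)
{
    void* p;
    std::memcpy(&p, &address, sizeof(p));
    return p;
}

// Offset of `anchor` relative to `target`, computed on integers,
// so no pointer subtraction between unrelated objects takes place.
inline std::intptr_t relative_offset(const void* anchor, const void* target)
{
    return to_address(anchor) - to_address(target);
}

// Recover the target address from the anchor and the stored offset.
inline void* resolve(const void* anchor, std::intptr_t offset)
{
    return from_address(to_address(anchor) - offset);
}
```

In a class like Ptr, anchor would be &m_offset, so GetOffset corresponds to relative_offset(&m_offset, ptr) and GetPtr to resolve(&m_offset, m_offset).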