memmove in-place change of effective type (type-punning)

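In the following question: What's a proper way of type-punning a float to an int and vice-versa?, the conclusion is that the way to construct doubles from integer bits is via memcpy.

OK, and the pseudo_cast conversion method found there is: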

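#include <cstring>  // std::memcpy

template <typename T, typename U>
inline T pseudo_cast(const U &x)
{
    static_assert(sizeof(T) == sizeof(U));
    T to;                             // creates the destination object
    std::memcpy(&to, &x, sizeof(T));  // copies the object representation
    return to;
}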

I would use it like this:

#include <cstdint>
#include <iostream>
#include <limits>

int main(){
  static_assert(std::numeric_limits<double>::is_iec559);
  static_assert(sizeof(double)==sizeof(std::uint64_t));
  std::uint64_t someMem = 4614253070214989087ULL;
  std::cout << pseudo_cast<double>(someMem) << std::endl; // 3.14
}
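As a quick sanity check of the memcpy approach, the sketch below round-trips a double through its bit pattern. The wrappers `to_bits`, `from_bits` and `check_round_trip` are hypothetical names introduced here for illustration only, and the expected bit pattern assumes IEEE 754 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Same memcpy-based idea as pseudo_cast above, repeated so the sketch is self-contained.
template <typename T, typename U>
inline T pseudo_cast(const U& x)
{
    static_assert(sizeof(T) == sizeof(U), "sizes must match");
    T to;
    std::memcpy(&to, &x, sizeof(T));
    return to;
}

static_assert(std::numeric_limits<double>::is_iec559, "sketch assumes IEEE 754 doubles");

// Hypothetical convenience wrappers (not part of the original question).
inline std::uint64_t to_bits(double d)   { return pseudo_cast<std::uint64_t>(d); }
inline double from_bits(std::uint64_t b) { return pseudo_cast<double>(b); }

// Copying the bytes out and back in must reproduce the original value.
inline void check_round_trip(double d)
{
    assert(from_bits(to_bits(d)) == d);
}
```

Both directions create a fresh local object before copying into it, which is the property the answers below rely on.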

My interpretation from just reading the standard and cppreference is/was that it should also be possible to use memmove to change the effective type in-place, like this:

template <typename T, typename U>
inline T& pseudo_cast_inplace(U& x)
{
    static_assert(sizeof(T) == sizeof(U));
    T* toP = reinterpret_cast<T*>(&x);
    std::memmove(toP, &x, sizeof(T));
    return *toP;
}

template <typename T, typename U>
inline T pseudo_cast2(U& x)
{
    return pseudo_cast_inplace<T>(x); // return by value
}

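The reinterpret cast in itself is legal for any pointer (as long as cv is not violated, item 5 at cppreference/reinterpret_cast). Dereferencing, however, requires memcpy/memmove (§6.9.2), and T and U must be trivially copyable.

Is this legal? It compiles and does the right thing with gcc and clang. The source and destination of memmove are explicitly allowed to overlap, according to cppreference std::memmove and memmove: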

The objects may overlap: copying takes place as if the characters were copied to a temporary character array and then the characters were copied from the array to dest.
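To make the overlap guarantee concrete, here is a minimal sketch with a plain character buffer. The function name `shift_right_by_two` is made up for this illustration; it only demonstrates overlapping ranges of the same type and says nothing about changing an object's type.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Copy the first four characters two places to the right inside one buffer.
// Source [0, 4) and destination [2, 6) overlap, which memmove permits.
inline std::string shift_right_by_two()
{
    char buf[] = "abcdef";
    std::memmove(buf + 2, buf, 4);
    assert(buf[6] == '\0');  // the terminator is outside the copied range
    return std::string(buf);
}
```

The result is "ababcd": the bytes behave as if they had been copied through a temporary array, exactly as the quoted wording describes.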


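Edit: originally the question had a trivial error (causing a segfault) discovered by @hvd. Thank you! The question remains the same: is this legal?

C++ does not allow a double to be constructed merely by copying the bytes. An object first needs to be constructed (which may leave its value uninitialised), and only after that can you fill in its bytes to produce a value. This was underspecified up to C++14, but the current draft of C++17 includes in [intro.object]: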

An object is created by a definition (6.1), by a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).

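Although constructing a double with default initialisation does not perform any initialisation, the construction does still need to happen. Your first version includes this construction by declaring the local variable T to;. Your second version does not.

You could modify your second version to use placement new to construct a T in the same location that previously held a U object. In that case, however, when you pass &x to memmove, it is no longer required to read the bytes that made up x's value, because the object x has already been destroyed by the earlier placement new.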
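One way around that ordering problem is to save the bytes before the placement new and restore them afterwards. The following is only a sketch under my reading of the lifetime rules, not something the answer itself proposes; `pseudo_cast_placement` and `demo_placement` are hypothetical names, and the static_asserts are assumptions about what the technique needs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical helper: save x's bytes, create a T in x's storage,
// then copy the saved bytes into the new object.
template <typename T, typename U>
inline T& pseudo_cast_placement(U& x)
{
    static_assert(sizeof(T) == sizeof(U), "sizes must match");
    static_assert(alignof(T) <= alignof(U), "storage must be aligned for T");
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_copyable<U>::value,
                  "memcpy needs trivially copyable types");
    static_assert(std::is_trivially_destructible<U>::value,
                  "U's destructor is never run");

    void* storage = std::addressof(x);
    unsigned char saved[sizeof(U)];
    std::memcpy(saved, storage, sizeof(U));  // read x while it is still alive
    T* p = ::new (storage) T;                // ends x's lifetime, starts a T's
    assert(static_cast<void*>(p) == storage);
    std::memcpy(p, saved, sizeof(T));        // give the new object its value
    return *p;
}

inline double demo_placement()
{
    std::uint64_t bits = 4614253070214989087ULL;  // pattern from the question
    double& d = pseudo_cast_placement<double>(bits);
    // From here on only d may be used; bits no longer names a live object.
    return d;
}
```

On an IEEE 754 platform `demo_placement()` yields 3.14, but the caller must stop using the original name after the call, so this is rarely preferable to returning a copy.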

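Accessing a double when the actual type is uint64_t is undefined behaviour, because the compiler will never consider that an object of type double can share the address of an object of type uint64_t. From [intro.object]: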

Unless an object is a bit-field or a base class subobject of zero size, the address of that object is the address of the first byte it occupies. Two objects a and b with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they have distinct addresses.

My reading of the standard suggests that both functions will result in UB.

Consider:

int main()
{
    long x = 10;
    something_with_x(x*10);
    double& y = pseudo_cast_inplace<double>(x);
    y = 20;
    something_with_y(y*10);
}

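Because of the strict aliasing rule, it seems to me that nothing stops the compiler from reordering the instructions to produce code like this: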

int main()
{
    long x = 10;
    double& y = pseudo_cast_inplace<double>(x);
    y = 20;
    something_with_x(x*10);   // uh-oh!
    something_with_y(y*10);
}

I think the only legal way to write this is:

template <typename T, typename U>
inline T pseudo_cast(U&& x)
{
    static_assert(sizeof(T) == sizeof(U));
    T result;
    std::memcpy(std::addressof(result), std::addressof(x), sizeof(T));
    return result;
}

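In practice this produces exactly the same assembler output (that is, none at all: the entire function is elided, as are the variables themselves), at least on gcc with -O2.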
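The question also notes that T and U must be trivially copyable, which the version above does not check. A hedged variant with those checks spelled out might look like the following; `checked_pseudo_cast` and `one_as_bits` are hypothetical names, and the float example assumes a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <memory>
#include <type_traits>

// Hypothetical stricter variant: same copy-out technique, more compile-time checks.
template <typename T, typename U>
inline T checked_pseudo_cast(const U& x)
{
    static_assert(sizeof(T) == sizeof(U), "sizes must match");
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    static_assert(std::is_trivially_copyable<U>::value, "U must be trivially copyable");
    T result;  // a real T object exists before any bytes are copied
    std::memcpy(std::addressof(result), std::addressof(x), sizeof(T));
    return result;
}

inline std::uint32_t one_as_bits()
{
    static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 floats");
    const std::uint32_t bits = checked_pseudo_cast<std::uint32_t>(1.0f);
    assert(checked_pseudo_cast<float>(bits) == 1.0f);  // round trip is lossless
    return bits;
}
```

In C++20, `std::bit_cast` from `<bit>` covers the same use case with these constraints built in.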

This should be legal in C++20. Example in godbolt.

template <typename T, typename U>
requires (
    sizeof(U) >= sizeof(T) and 
    std::alignment_of_v<T> <= std::alignment_of_v<U> and 
    std::is_trivially_copyable_v<T> and
    std::is_trivially_destructible_v<U>
)
[[nodiscard]] T& reinterpret_object(U& obj)
{
    // Get access to object representation
    std::byte* bytes = reinterpret_cast<std::byte*>(&obj); 
    
    // Copy object representation to temporary buffer.
    // Implicitly create a T object in the destination storage. The lifetime of U object ends.
    // Copy temporary buffer back.
    void* storage = std::memmove(bytes, bytes, sizeof(T));
    
    // Storage pointer value is 'pointer to T object', so we are allowed to cast it to the proper pointer type.
    return *static_cast<T*>(storage);
}
  • reinterpret_cast to a different pointer type is allowed (7.6.1.10)

    An object pointer can be explicitly converted to an object pointer of a different type.

  • Accessing the object representation through a std::byte* pointer is allowed (7.2.1)

    If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined

    • a char, unsigned char, or std::byte type.
  • std::memmove behaves as if it copied to a temporary buffer, and it can implicitly create objects (21.5.3)

    The functions memcpy and memmove are signal-safe. Both functions implicitly create objects ([intro.object]) in the destination region of storage immediately prior to copying the sequence of characters to the destination.

    Implicit object creation is described in (6.7.2)

    Some operations are described as implicitly creating objects within a specified region of storage. For each operation that is specified as implicitly creating objects, that operation implicitly creates and starts the lifetime of zero or more objects of implicit-lifetime types ([basic.types]) in its specified region of storage if doing so would result in the program having defined behavior. If no such set of objects would give the program defined behavior, the behavior of the program is undefined. If multiple such sets of objects would give the program defined behavior, it is unspecified which such set of objects is created. [Note 4: Such operations do not start the lifetimes of subobjects of such objects that are not themselves of implicit-lifetime types. — end note]

    Further, after implicitly creating objects within a specified region of storage, some operations are described as producing a pointer to a suitable created object. These operations select one of the implicitly-created objects whose address is the address of the start of the region of storage, and produce a pointer value that points to that object, if that value would result in the program having defined behavior. If no such pointer value would give the program defined behavior, the behavior of the program is undefined. If multiple such pointer values would give the program defined behavior, it is unspecified which such pointer value is produced.

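    It is not specified that std::memmove is such a function, that is, that its returned pointer value will be a pointer to the implicitly created object. It would make sense for it to be, though.

  • Returning a pointer to the new object is allowed by (7.6.1.9)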

    A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.

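    If std::memmove does not return a usable pointer value, std::launder<T>(reinterpret_cast<T*>(bytes)) (17.6.5) should be able to produce such a pointer value.

Additional notes:

  • I am not 100% sure whether all the requires conditions are correct or whether some condition is missing.

  • To get zero overhead, the compiler has to optimize the std::memmove away (gcc and clang seem to do this).

  • The lifetime of the original object ends (6.7.3)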

    A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling a destructor or pseudo-destructor ([expr.prim.id.dtor]) for the object.

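    This means that using the original name, or a pointer or reference to the original object, results in undefined behavior.

    The object can be "revived" by reinterpreting it back, reinterpret_object<U>(reinterpret_object<T>(obj)), after which using the old references should be allowed (6.7.3)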

    If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o1 is transparently replaceable by an object o2 if:

    • the storage that o2 occupies exactly overlays the storage that o1 occupied, and
    • o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
    • o1 is not a complete const object, and
    • neither o1 nor o2 is a potentially-overlapping subobject ([intro.object]), and
    • either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2, respectively, and p1 is transparently replaceable by p2.
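  • The object representations should be "compatible": interpreting the bytes of the original object as the bytes of the new object can otherwise produce "garbage" or even trap representations.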
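As a small illustration of the last note, not every 64-bit integer pattern is an ordinary number when read as a double. The sketch below uses the safe copy-out approach rather than reinterpret_object, assumes IEEE 754 doubles, and uses the hypothetical helper names `bits_to_double` and `check_special_patterns`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: reinterpret a 64-bit pattern as a double by copying its bytes.
inline double bits_to_double(std::uint64_t bits)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "sizes must match");
    static_assert(std::numeric_limits<double>::is_iec559, "sketch assumes IEEE 754 doubles");
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

inline void check_special_patterns()
{
    // Exponent bits all ones with a non-zero mantissa: a NaN, not a usable number.
    assert(std::isnan(bits_to_double(0x7FF8000000000000ULL)));
    // Exponent bits all ones with a zero mantissa: positive infinity.
    assert(std::isinf(bits_to_double(0x7FF0000000000000ULL)));
}
```

For double these patterns are still valid values, but for types such as bool or enumerations an arbitrary byte pattern may not correspond to any valid value at all.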