Endian-independent way of using memcpy() from smaller to larger integer pointer

Question

Suppose I have two arrays.

uint8_t src[SIZE] = { 0 };
uint32_t dst[SIZE] = { 0 };

uint8_t* srcPtr;  // Points to current src value
uint32_t* dstPtr; // Points to current dst value

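src contains values that sometimes need to be put into dst. Importantly, the values from src may be 8-bit, 16-bit, or 32-bit, and aren't necessarily properly aligned. So, suppose I wanted to use memcpy() like below to copy a 16-bit value: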

memcpy(dstPtr, srcPtr, 2);

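Will I run into an endianness issue here? This works fine on little-endian systems, since if I want to copy 8, then srcPtr has 08 then 00, the bytes at dstPtr will be 08 00 00 00, and the value will be 8, as expected.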

But if I were on a big-endian system, srcPtr would be 00 then 08, and the bytes at dstPtr would be 00 08 00 00 (I presume), which would take on a value of 524288.

What would be an endian-independent way to write this copy?

Answer 1

Will I run into an endianness issue here?

Yes. You're not copying, you're converting from one format to another (packing several unsigned integers into a single larger unsigned integer).

What would be an endian-independent way to write this copy?

The simple way is to make the conversion explicit, like:

    for(int i = 0; i < something; i++) {
        dest[i] = (uint32_t)src[i*4] | ((uint32_t)src[i*4+1] <<  8) |
                    ((uint32_t)src[i*4+2] <<  16) | ((uint32_t)src[i*4+3] <<  24);
    }

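However, in cases where memcpy() works it is likely to be faster, and whether it works does not change after the code is compiled; so you could do something like: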

#ifdef BIG_ENDIAN
    for(int i = 0; i < something; i++) {
        dest[i] = (uint32_t)src[i*4] | ((uint32_t)src[i*4+1] <<  8) |
                    ((uint32_t)src[i*4+2] <<  16) | ((uint32_t)src[i*4+3] <<  24);
    }
#else
    memcpy(dest, src, something*4);
#endif

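Note: You'd also have to define the BIG_ENDIAN macro when appropriate, for example with a -D BIG_ENDIAN command line argument when invoking the compiler, once you know the target architecture needs it.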
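With GCC or Clang, the byte order may also be detectable without a hand-defined macro. The sketch below assumes the predefined macros __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ that those compilers provide; the names HOST_IS_LITTLE_ENDIAN and unpack_le32 are hypothetical and not part of the answer. It takes the memcpy() shortcut only when the target is known to be little-endian and otherwise falls back to the explicit conversion, so an unrecognized compiler still produces correct results. Choosing a private macro name also sidesteps a possible clash: some system headers (glibc's <endian.h>, for instance) define a BIG_ENDIAN constant on every target, which would make a plain #ifdef BIG_ENDIAN true even on little-endian machines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang-style predefined byte-order macros.
 * If they are missing, we conservatively treat the host as "unknown". */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define HOST_IS_LITTLE_ENDIAN 1
#else
#define HOST_IS_LITTLE_ENDIAN 0
#endif

/* Hypothetical helper: unpack `count` 32-bit values that are stored
 * least-significant byte first in `src` into `dest`. */
static void unpack_le32(uint32_t *dest, const uint8_t *src, size_t count)
{
#if HOST_IS_LITTLE_ENDIAN
    /* Stored layout already matches the host layout: a plain copy suffices. */
    memcpy(dest, src, count * 4);
#else
    /* Big-endian or unknown host: assemble each value from its bytes. */
    for (size_t i = 0; i < count; i++) {
        dest[i] = (uint32_t)src[i * 4]
                | ((uint32_t)src[i * 4 + 1] << 8)
                | ((uint32_t)src[i * 4 + 2] << 16)
                | ((uint32_t)src[i * 4 + 3] << 24);
    }
#endif
}
```

Both branches yield the same values; only the cost differs. Calling unpack_le32(dest, src, 2) on the bytes 08 00 00 00 78 56 34 12 gives dest[0] == 8 and dest[1] == 0x12345678 on either kind of host.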

I'm storing 16-bit values in src which aren't 16-bit-aligned which then need to be put into a 64-bit integer

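That adds another problem: some architectures do not allow misaligned accesses. You'd need to use explicit conversion (read 2 separate uint8_t, not a misaligned uint16_t) to avoid this problem.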
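As a minimal sketch of that byte-by-byte read, the helper below assembles an 8-, 16-, or 32-bit value from individual uint8_t reads, so no misaligned wide access ever occurs. The name load_le and the assumption that src stores values least-significant byte first are illustrative; neither comes from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a `width`-byte unsigned value (width is 1, 2,
 * or 4) stored least-significant byte first at `p`. Only single bytes are
 * read, so `p` may have any alignment and the result is the same on
 * little-endian and big-endian hosts. */
static uint32_t load_le(const uint8_t *p, size_t width)
{
    uint32_t value = 0;
    for (size_t i = 0; i < width; i++) {
        value |= (uint32_t)p[i] << (8 * i);
    }
    return value;
}
```

With this, the copy from the question could be written as *dstPtr = load_le(srcPtr, 2); for a 16-bit value at any offset in src.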

Answer 2

Will I run into an endianness issue here?

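Not necessarily an endianness issue per se, but yes, the specific approach you describe will run into an issue with integer representation.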

This works fine on little-endian systems, since if I want to copy 8, then srcPtr has 08 then 00 the bytes at dstPtr will be 08 00 00 00 and the value will be 8, as expected.

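You seem to be making an assumption, either that

more bytes of the destination will be modified than you actually copy, or perhaps that
the relevant parts of the destination are pre-set to all zero bytes.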

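But you need to understand that memcpy() will copy exactly the number of bytes requested. No more than that will be read from the specified source, and no more than that will be modified in the destination. In particular, the data types of the objects to which the source and destination pointers point have no effect on the operation of memcpy().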
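A small sketch can make this concrete. It copies two bytes over a uint32_t that was deliberately filled with 0xFF rather than zero, then exposes the resulting bytes; the function name copy_two_bytes_demo is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical demo: write the four bytes of `dst`, in memory order,
 * into `out` after a 2-byte memcpy() onto a non-zero destination. */
static void copy_two_bytes_demo(uint8_t out[4])
{
    const uint8_t src[2] = { 0x08, 0x00 };
    uint32_t dst = 0xFFFFFFFFu;     /* deliberately not zero */

    memcpy(&dst, src, 2);           /* touches only the first two bytes in memory */
    memcpy(out, &dst, sizeof dst);  /* expose the object representation */
}
```

On any host, out ends up as 08 00 FF FF: the last two bytes of the destination are left exactly as they were. Which numeric value those four bytes represent is what depends on the host's byte order.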

What would be an endian-independent way to write this copy?

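The most natural way to go about it would be via simple assignment, relying on the compiler to perform the necessary conversion: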

*dstPtr = *srcPtr;

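However, I take your emphasis on the prospect that the arrays might not be aligned as a concern that it may be unsafe to dereference the source and/or destination pointer. That will not, in fact, be the case for pointers to char, but it might be the case for pointers to other types. For cases where you take memcpy as the only safe way to read from the arrays, the most portable method for converting value representations is still to rely on the implementation. For example: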

uint8_t* srcPtr = /* ... */;
uint32_t* dstPtr = /* ... */;

uint16_t srcVal;
uint32_t dstVal;

memcpy(&srcVal, srcPtr, sizeof(srcVal));
dstVal = srcVal;  // conversion is automatically performed
memcpy(dstPtr, &dstVal, sizeof(dstVal));
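The read-and-widen half of this pattern could be wrapped in a helper, sketched below under the assumption that the 16-bit value in src was written in the host's native byte order (for example by the same program). The name widen_u16_native is hypothetical. If src instead holds data in a fixed byte order, such as a file or network format, the bytes would have to be combined with shifts rather than read through a uint16_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 16-bit value stored in the host's native
 * byte order at a possibly unaligned address and widen it to 32 bits. */
static uint32_t widen_u16_native(const uint8_t *srcPtr)
{
    uint16_t srcVal;

    memcpy(&srcVal, srcPtr, sizeof(srcVal)); /* safe for any alignment */
    return srcVal;                           /* integer conversion preserves the value */
}
```

The result can then be stored with an ordinary assignment when dstPtr is suitably aligned, or with a further memcpy() of sizeof(uint32_t) bytes when it is not.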


Tags: c, memory, endianness, memcpy