Function for unaligned memory access on ARM

Question

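I am working on a project where data is read from memory. Some of this data consists of integers, and accessing them at unaligned addresses caused problems. My idea was to use memcpy for that, i.e.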

#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

uint32_t readU32(const void* ptr)
{
    uint32_t n;
    memcpy(&n, ptr, sizeof(n));
    return n;
}

The solution I found in the project source is similar to this code:

uint32_t readU32(const uint32_t* ptr)
{
    union {
        uint32_t n;
        char data[4];
    } tmp;
    const char* cp=(const char*)ptr;
    tmp.data[0] = *cp++;
    tmp.data[1] = *cp++;
    tmp.data[2] = *cp++;
    tmp.data[3] = *cp;
    return tmp.n;
}

So my questions:

Isn't the second version undefined behavior? The C standard says in 6.3.2.3 Pointers, paragraph 7:

A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned 57) for the pointed-to type, the behavior is undefined.

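Since the calling code has, at some point, used a char* to handle the memory, there must be a conversion from char* to uint32_t* somewhere. Isn't that conversion itself undefined behavior if the uint32_t* is not correctly aligned? And if it is, the function is pointless, since you could just write *(uint32_t*) to fetch the memory. Additionally, I think I have read somewhere that the compiler may expect an int* to be correctly aligned, and that any unaligned int* would also mean undefined behavior, so the code generated for this function might take shortcuts because it can assume the function argument is properly aligned.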

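The original code has volatile on the argument and all variables because the memory contents could change (it is a data buffer, not registers, inside a driver). Maybe that is why it does not use memcpy, since memcpy does not accept volatile data. But in which world would that make sense? If the underlying data can change at any time, all bets are off. The data could even change between those byte copy operations. So you would need some kind of mutex to synchronize access to this data. But if you have such synchronization, why would you need volatile?
Is there a canonical/accepted/better solution to this memory access problem? After some searching, I came to the conclusion that you need a mutex, do not need volatile, and can use memcpy.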

P.S.:

# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 10 (v7l)
BogoMIPS        : 1581.05
Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0xc09
CPU revision    : 10

Answer 1

This code

uint32_t readU32(const uint32_t* ptr)
{
    union {
        uint32_t n;
        char data[4];
    } tmp;
    const char* cp=(const char*)ptr;
    tmp.data[0] = *cp++;
    tmp.data[1] = *cp++;
    tmp.data[2] = *cp++;
    tmp.data[3] = *cp;
    return tmp.n;
}

passes the pointer as a uint32_t *. If it does not actually point to a uint32_t, that is UB. The parameter should probably be a const void *.
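A minimal sketch of that change, assuming nothing else about the project: the same byte-wise copy, but taking const void * so that no misaligned uint32_t * ever has to exist. The name readU32_bytes is hypothetical, and unsigned char is swapped in for char as the byte type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of readU32: same byte-wise copy, but the parameter is
   const void *, so the caller never has to form a misaligned uint32_t *. */
static uint32_t readU32_bytes(const void *ptr)
{
    union {
        uint32_t n;
        unsigned char data[sizeof(uint32_t)];
    } tmp;
    const unsigned char *cp = ptr;   /* void * converts implicitly in C */

    for (size_t i = 0; i < sizeof tmp.data; i++)
        tmp.data[i] = cp[i];         /* byte accesses have no alignment requirement */
    return tmp.n;                    /* value is in host byte order */
}
```

A caller can pass any address inside a byte buffer, for example readU32_bytes(buf + 1), without casting to uint32_t * first.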

The use of const char * in the conversion itself is not undefined behavior. Per 6.3.2.3 Pointers, paragraph 7 of the C Standard (emphasis mine):

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

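As for the proper way to use volatile to access memory/registers directly on your particular hardware, there is no canonical/accepted/best solution. Any solution would be specific to your system and beyond the scope of standard C.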
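Purely as an illustration, and not as a recommended or portable solution, a byte-wise read through a volatile-qualified pointer might look like the sketch below. The name readU32_volatile is hypothetical. memcpy cannot be applied to the source directly because its parameters are not volatile-qualified, so the bytes go through a local non-volatile buffer first. The four reads are not atomic as a group, so the buffer may still change between them unless access is synchronized some other way.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads four bytes one at a time through a
   volatile-qualified pointer, then assembles them in host byte order. */
static uint32_t readU32_volatile(const volatile void *ptr)
{
    const volatile unsigned char *src = ptr;
    unsigned char tmp[sizeof(uint32_t)];
    uint32_t n;

    for (size_t i = 0; i < sizeof tmp; i++)
        tmp[i] = src[i];             /* each iteration is a separate volatile read */

    memcpy(&n, tmp, sizeof n);       /* tmp is an ordinary local, so memcpy is fine */
    return n;
}
```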

Answer 2

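Implementations are allowed to define behavior in cases where the Standard does not, and some implementations may specify that all pointer types have the same representation and may be freely converted to one another regardless of alignment, provided that the pointers actually used to access things are suitably aligned.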

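Unfortunately, because some obtuse compilers force the use of "memcpy" as an escape valve for aliasing issues even when pointers are known to be aligned, the only way compilers can efficiently process code that needs to make type-agnostic accesses to aligned storage is to assume that any pointer of a type requiring alignment will always be suitably aligned for that type. As a result, your instinct that the approach using uint32_t* is dangerous is spot on. It may be desirable to have compile-time checking to ensure that a function is passed either a void* or a uint32_t*, and not something like a uint16_t* or a double*, but there is no way to declare a function that way without allowing a compiler to "optimize" the function by consolidating the byte accesses into a single 32-bit load that will fail if the pointer is not aligned.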
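For the compile-time check alone, one possible sketch (assuming a C11 compiler) is a _Generic macro wrapped around a const void * function. The names READ_U32 and readU32_any are hypothetical. This is a macro, not a function declaration, and it does not make a misaligned uint32_t* valid: if the caller has already formed such a pointer, the concern described above still applies.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical worker: takes const void *, so it assumes nothing about alignment. */
static uint32_t readU32_any(const void *ptr)
{
    uint32_t n;
    memcpy(&n, ptr, sizeof n);
    return n;
}

/* Hypothetical wrapper: compiles only when the argument has one of the listed
   pointer types; a uint16_t * or double * argument is rejected at compile time
   because _Generic has no matching association and no default. */
#define READ_U32(p) _Generic((p),          \
        void *:           readU32_any,     \
        const void *:     readU32_any,     \
        uint32_t *:       readU32_any,     \
        const uint32_t *: readU32_any)(p)
```

With this, READ_U32((const void *)(buf + 1)) compiles, while READ_U32((uint16_t *)buf) is a compile-time error.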

linux

gcc

arm

memory-alignment

undefined-behavior