Using type punning to decompose an object into words

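Consider the following code, which decomposes an object into words so that a (word-aligned) object can be written to memory through an API that accepts only one word at a time: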

void func(some_type obj /*obj is word aligned*/, unsigned int size_of_obj_in_words)
{
    union punning{
        unsigned char bytes[4]; /* assume 4 bytes in word in my system */
        uint32_t      word;
    };
    union punning pun;
    unsigned char *legal_aliasing_by_char_pointer;
    for (unsigned int i = 0; i < size_of_obj_in_words; i++)
    {
        for (int j = 0; j < 4; j++)
        {
            legal_aliasing_by_char_pointer = (unsigned char *)&obj + j + i*4;
            pun.bytes[j] = *legal_aliasing_by_char_pointer;
        }
        /* finally, using word aliasing to decompose object to words */
        /* endianness is not important */
        write_word_to_hw_by_word(pun.word);
    }
}

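I am trying to do this in a way that conforms to the C standard, so that the strict aliasing rule is not violated. Does this code achieve that goal?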

It looks fine, but you can simplify it a lot:

void func(some_type obj)
{
    uint32_t word;

    for (size_t i = 0; i < sizeof obj / sizeof word; i++)
    {
        memcpy(&word, (char *)&obj + i * sizeof word, sizeof word);
        write_word(word);
    }
}

The alignment of obj does not matter. You also do not need to pass the size, because sizeof does the job.

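I suspect it will perform better if you change the function to take the address of the object, in which case you may also want to pass an array length.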
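A minimal sketch of that pointer-plus-length variant follows. It reuses the some_type layout shown elsewhere in this thread; the name write_objects and the logging stub that stands in for write_word are assumptions made only so the snippet can be run on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Example type, same layout as the some_type used elsewhere in this thread. */
typedef struct some_type {
    uint32_t a;
    uint32_t b;
} some_type;

_Static_assert(sizeof(some_type) % sizeof(uint32_t) == 0,
               "some_type must be a whole number of words");

/* Hypothetical stand-in for the hardware write: it only logs the words. */
enum { LOG_CAPACITY = 16 };
static uint32_t word_log[LOG_CAPACITY];
static size_t word_log_len = 0;

static void write_word(uint32_t word)
{
    assert(word_log_len < LOG_CAPACITY);
    word_log[word_log_len++] = word;
}

/* Writes `count` objects starting at `objs`, one 32-bit word at a time.
   memcpy copies the bytes, so no uint32_t lvalue ever aliases the objects. */
static void write_objects(const some_type *objs, size_t count)
{
    const unsigned char *bytes = (const unsigned char *)objs;
    size_t total_words = count * (sizeof *objs / sizeof(uint32_t));

    for (size_t i = 0; i < total_words; i++) {
        uint32_t word;
        memcpy(&word, bytes + i * sizeof word, sizeof word);
        write_word(word);
    }
}
```

A caller would pass an array and its element count, for example write_objects(table, sizeof table / sizeof table[0]).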

First of all, if the type is large, you should probably pass it by pointer instead.

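If (and only if) the structure is already properly aligned, as you say, you can cast the pointer to a pointer to a union of the structure and an array of uint32_t:
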
#include <stdlib.h>
#include <stdint.h>

typedef struct some_type {
    uint32_t a;
    uint32_t b;
} some_type;

void write_word_to_hw_by_word(uint32_t word);

void func(some_type *obj)
{
    union punning {
        uint32_t words[sizeof (some_type) / sizeof (uint32_t)];
        some_type obj;
    } *pun_ptr;

    pun_ptr = (union punning *)obj;
    for (size_t i = 0; i < sizeof (some_type) / sizeof (uint32_t); i++)
    {
        write_word_to_hw_by_word(pun_ptr->words[i]);
    }
}

Here we use the lvalue expression *pun_ptr to access the words in obj, relying on C11 6.5p7:

  7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

    • a type compatible with the effective type of the object,
    • a qualified version of a type compatible with the effective type of the object,
    • a type that is the signed or unsigned type corresponding to the effective type of the object,
    • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
    • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
    • a character type.

You may be overthinking this. The following code does not violate the strict aliasing rule.

void func(void *obj, size_t size_of_obj) // Passing address and size of object
{
    for (size_t i = 0; i < size_of_obj; i += sizeof(word_type))
    {
        word_type word;
        memcpy(&word, obj, sizeof(word_type));
        write_word_to_hw_by_word(word);
        obj = (char *)obj + sizeof(word_type);
    }
}

It is also independent of the object's type and does not make any large stack allocations.
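The loop above assumes that size_of_obj is a multiple of sizeof(word_type); otherwise the last memcpy reads past the end of the object. The sketch below is one possible variant that zero-pads a trailing partial word instead. It assumes word_type is uint32_t and that zero padding is acceptable to the hardware. The name func_padded and the logging stub for write_word_to_hw_by_word are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t word_type;   /* assumption: the hardware word is 32 bits */

/* Hypothetical stand-in for the real hardware write: it only logs the words. */
enum { MAX_LOGGED_WORDS = 8 };
static word_type last_words[MAX_LOGGED_WORDS];
static size_t words_written = 0;

static void write_word_to_hw_by_word(word_type word)
{
    assert(words_written < MAX_LOGGED_WORDS);
    last_words[words_written++] = word;
}

/* Same idea as the loop above, except that a trailing partial word is
   zero-padded rather than read past the end of the object. */
static void func_padded(const void *obj, size_t size_of_obj)
{
    const unsigned char *src = obj;

    while (size_of_obj > 0) {
        size_t chunk = size_of_obj < sizeof(word_type) ? size_of_obj
                                                       : sizeof(word_type);
        word_type word = 0;          /* unused tail bytes stay zero */
        memcpy(&word, src, chunk);   /* copy only the bytes that exist */
        write_word_to_hw_by_word(word);
        src += chunk;
        size_of_obj -= chunk;
    }
}
```

Because the function takes a const void * and a byte count, it can be called as func_padded(&obj, sizeof obj) for any object type.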

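The only thing that matters is the actual type of "some_type" and its members. Since it is unlikely to be an array of uint32_t, you have to resort to some tricks to avoid strict aliasing violations. (As already mentioned, you should probably pass the struct by pointer, but that changes nothing with regard to aliasing.)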

The simplest way:

void func(some_type obj /*obj is word aligned*/, size_t size_of_obj_in_words)
{
  _Static_assert( (_Alignof(some_type) % _Alignof(uint32_t) )==0 , "Incorrect alignment");

  typedef union 
  {
    some_type  st;
    uint32_t   word[sizeof(some_type) / sizeof(uint32_t)];
  } pun_intended_t;

  pun_intended_t* pi = (pun_intended_t*) &obj;
  for(size_t i=0; i<size_of_obj_in_words; i++)
  {
    write_word_to_hw_by_word(pi->word[i]);
  }
}  

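This involves no intermediate copy buffer. That is an advantage, because write_word_to_hw_by_word probably involves volatile-qualified access and may (?) live in another translation unit, so the compiler is unlikely to optimize away an intermediate memcpy into a temporary buffer.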

Why the above works:

C17 6.7.2.1/16

A pointer to a union object, suitably converted, points to each of its members and vice versa.

The exception to strict aliasing, C17 6.5/7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
/--/
- an aggregate or union type that includes one of the aforementioned types among its members
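For comparison, here is a sketch of a more conservative variant in which the object really is stored in a union, so the words are read through a member of an actual union object rather than through a converted pointer. It costs one copy of the object. The name func_by_value, the two-word some_type layout, and the logging stub for write_word_to_hw_by_word are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Example type; assumed layout, two 32-bit members and no padding. */
typedef struct some_type {
    uint32_t a;
    uint32_t b;
} some_type;

#define WORDS_IN_OBJ (sizeof(some_type) / sizeof(uint32_t))

_Static_assert(sizeof(some_type) % sizeof(uint32_t) == 0,
               "some_type must be a whole number of words");

typedef union {
    some_type st;
    uint32_t  word[WORDS_IN_OBJ];
} pun_intended_t;

/* Hypothetical stand-in for the real hardware write: it only logs the words. */
static uint32_t written[WORDS_IN_OBJ];
static size_t written_len = 0;

static void write_word_to_hw_by_word(uint32_t word)
{
    assert(written_len < WORDS_IN_OBJ);
    written[written_len++] = word;
}

static void func_by_value(some_type obj)
{
    pun_intended_t pun;

    pun.st = obj;   /* store through one member of a real union object ... */
    for (size_t i = 0; i < WORDS_IN_OBJ; i++) {
        write_word_to_hw_by_word(pun.word[i]);   /* ... read through another */
    }
}
```

The trade-off is the extra copy into pun, which the pointer-cast version avoids.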

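Whether various constructs are supported is a quality-of-implementation issue that lies outside the Standard's jurisdiction. Quality implementations designed and configured for various tasks will uphold the Spirit of C principle "Don't prevent the programmer from doing what needs to be done" in a fashion appropriate to those tasks.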

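Unless you are forced to perform a task with an implementation or configuration that is not particularly suitable but is the best available, you should target implementations or configurations that allow simple code to be written, and then write simple code.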

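When configured for -fstrict-aliasing, neither clang nor gcc correctly handles all of the corner cases defined by the Standard, and trying to exploit the Standard's corner cases to accomplish things that cannot otherwise be done efficiently is a fool's errand. The Standard makes no effort to require that all implementations be suitable for purposes that need low-level semantics; it supports things that look like low-level semantics only to accommodate code that needs to do nothing but high-level things. If you need to do low-level things, write your code in such a way that any compiler making a reasonable effort to support low-level semantics will have no trouble with it, and do not worry about whether the Standard would mandate support even from compilers that are not suited to tasks requiring low-level semantics.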