Use structs inside union to encode base64

Question

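I have seen many examples of how to implement a Base64 encoder, but none of them uses a struct inside a union to convert three 8-bit blocks into four 6-bit blocks. I wonder why nobody uses this approach, because to me it looks like a simple and fast method.

I wrote an example using structs inside a union.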

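#include <cstdint>

namespace Base64
{
    // Two bit-field views over the same 32-bit storage.
    typedef union
    {
        // View 1: three 8-bit input bytes plus 8 unused bits.
        struct
        {
            uint32_t b2     : 0x08;
            uint32_t b1     : 0x08;
            uint32_t b0     : 0x08;
            uint32_t pad    : 0x08;
        } decoded;
        // View 2: four 6-bit output values plus 8 unused bits.
        struct
        {
            uint32_t b3     : 0x06;
            uint32_t b2     : 0x06;
            uint32_t b1     : 0x06;
            uint32_t b0     : 0x06;
            uint32_t pad    : 0x08;
        } encoded;
        uint32_t raw;
    } base64c_t;
}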

I have tested this approach by translating 0xFC0FC0, or binary 111111000000111111000000, into four 6-bit blocks, and it seems to work.

// Requires <iostream> and <iomanip>.
Base64::base64c_t b64;

b64.decoded.b0  = 0xFC;
b64.decoded.b1  = 0x0F;
b64.decoded.b2  = 0xC0;

std::cout.fill ( '0' );

std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b0 << std::endl;
std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b1 << std::endl;
std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b2 << std::endl;
std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b3 << std::endl;

Output:

0x3f
0x00
0x3f
0x00

Does this way of converting 8-bit blocks into 6-bit blocks have any drawbacks? Or has nobody thought of it before?

Answer 1

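The order in which bit-fields are packed within a struct is implementation-defined. So although you get the correct base64 result on your machine, that may no longer hold once you port this code to a different architecture or compiler (for example, big-endian PowerPC). Borrowing from this answer: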

Unspecified behavior

The alignment of the addressable storage unit allocated to hold a bit-field (6.7.2.1).

Implementation-defined behavior

Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).

The order of allocation of bit-fields within a unit (6.7.2.1).

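You are therefore better off using bit-shifting code (which is essentially what every base64 implementation uses), because it is guaranteed to behave the same across platforms.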
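
For comparison, here is a minimal sketch of the shift-based approach. The names `Base64Sketch`, `split_group` and `encode` are made up for illustration and do not come from the question or from any library; the sketch assumes the standard base64 alphabet with `=` padding. Because it uses only shifts and masks on integer values, the result does not depend on how a compiler lays out bit-fields.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

namespace Base64Sketch  // hypothetical namespace, for illustration only
{
    // Split three 8-bit bytes into four 6-bit values using shifts and masks.
    inline std::array<std::uint8_t, 4> split_group(std::uint8_t b0,
                                                   std::uint8_t b1,
                                                   std::uint8_t b2)
    {
        // Pack the bytes into the low 24 bits, first byte most significant.
        const std::uint32_t raw = (std::uint32_t{b0} << 16)
                                | (std::uint32_t{b1} << 8)
                                |  std::uint32_t{b2};
        return {{
            static_cast<std::uint8_t>((raw >> 18) & 0x3F),
            static_cast<std::uint8_t>((raw >> 12) & 0x3F),
            static_cast<std::uint8_t>((raw >> 6) & 0x3F),
            static_cast<std::uint8_t>(raw & 0x3F)
        }};
    }

    // Encode a byte string, padding the final group with '=' if needed.
    inline std::string encode(const std::string& input)
    {
        static const char alphabet[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

        std::string out;
        out.reserve(((input.size() + 2) / 3) * 4);

        for (std::size_t i = 0; i < input.size(); i += 3)
        {
            const std::size_t n = input.size() - i;  // bytes left, at least 1
            const std::uint8_t b0 = static_cast<std::uint8_t>(input[i]);
            const std::uint8_t b1 =
                (n > 1) ? static_cast<std::uint8_t>(input[i + 1]) : std::uint8_t{0};
            const std::uint8_t b2 =
                (n > 2) ? static_cast<std::uint8_t>(input[i + 2]) : std::uint8_t{0};

            const std::array<std::uint8_t, 4> s = split_group(b0, b1, b2);
            out += alphabet[s[0]];
            out += alphabet[s[1]];
            out += (n > 1) ? alphabet[s[2]] : '=';
            out += (n > 2) ? alphabet[s[3]] : '=';
        }
        return out;
    }
}
```

Calling `Base64Sketch::split_group(0xFC, 0x0F, 0xC0)` yields the same four values the question prints (0x3f, 0x00, 0x3f, 0x00), and it does so on any conforming compiler, regardless of byte order.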

Tags: c++, base64, struct, unions