Is there any way to get a non 8-bit multiple data type?

Question

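I know this is a strange question, but I have been thinking about it and it interests me.

Example: how could I get a 31-bit data type, or something like that?

Right now I think the answer is no, you can't. Maybe I'm wrong?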

Answer 1

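You can always implement wrapping to any width manually, e.g. a++; a&=0x7fffffff; masks the result to 31 bits and implements an unsigned 31-bit type. Redoing sign extension to a wider type is more expensive: usually a left shift followed by an arithmetic right shift, unless the language and/or the hardware specifically supports the source width. (For example, ARM has a signed bitfield-extract instruction that can extract an arbitrary bitfield and sign-extend it into a full integer register.)

Some CPUs have words and/or bytes that are not a multiple of 8 bits, e.g. the PDP-10 has 36-bit words: https://en.wikipedia.org/wiki/36-bit. On that system the natural size is 36 bits, and 32-bit would be the non-standard type that requires extra instructions.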
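As a minimal sketch of the two operations described above (masking for an unsigned 31-bit wrap, and shift-left then arithmetic-shift-right for sign extension), something like the following could work. The names `U31_MASK`, `u31_inc`, and `s31_extend` are made up for illustration. The sign-extension helper assumes a two's complement target where right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in ISO C but true for mainstream compilers such as GCC and Clang.

```c
#include <stdint.h>

#define U31_MASK 0x7fffffffu   /* low 31 bits set (hypothetical name) */

/* Increment an unsigned 31-bit value held in a 32-bit container.
   The mask makes 0x7fffffff wrap around to 0. */
static inline uint32_t u31_inc(uint32_t a)
{
    return (a + 1u) & U31_MASK;
}

/* Sign-extend the low 31 bits of v to a 32-bit signed integer.
   Bit 31 of v is ignored. */
static inline int32_t s31_extend(uint32_t v)
{
    /* The left shift moves bit 30 (the 31-bit sign bit) into bit 31.
       The right shift then copies it back down.
       Assumption: the unsigned-to-signed conversion wraps and >> on a
       negative int32_t is an arithmetic shift. */
    return (int32_t)(v << 1) >> 1;
}
```

With these helpers, `u31_inc(0x7fffffffu)` yields 0, and `s31_extend(0x7fffffffu)` yields -1 because all 31 bits are set.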

could I have some kind of data structure that will be stored in memory like 31 bit -> 31 bit -> 31 bit, and can I make the CPU work with them as 31 bit?

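No, you can't do that. As far as I know, there are no CPUs with bit-addressable memory. Any load/store has to be aligned to at least a byte boundary. (Byte-addressable memory is nearly universal these days, but some DSPs have only word-addressable memory, and some older CPUs, such as early DEC Alpha, had no byte load/store instructions and could only access memory in whole words.)

C with bitfields will emulate narrower types, but with padding; you can't avoid having the compiler-generated asm touch the padding.

For example: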
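The CPU cannot address such a layout directly, but software can still emulate a tightly packed 31 bit -> 31 bit -> 31 bit stream on top of byte-aligned loads and stores, at the cost of extra shift and mask work per access. The following is only a sketch under stated assumptions: the function names are hypothetical, fields are packed least-significant-bit first, and the caller guarantees the buffer holds at least ceil(31 * n / 8) bytes for n fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the idx-th 31-bit field from a tightly packed bit stream.
   Every memory access is a whole byte; the field is rebuilt with shifts. */
static uint32_t packed31_get(const uint8_t *buf, size_t idx)
{
    size_t bitpos = idx * 31u;
    size_t byte = bitpos / 8u;                  /* first byte touched */
    unsigned shift = (unsigned)(bitpos % 8u);   /* bit offset inside it */
    unsigned nbytes = (shift + 31u + 7u) / 8u;  /* 4 or 5 bytes */
    uint64_t window = 0;

    for (unsigned i = 0; i < nbytes; i++)
        window |= (uint64_t)buf[byte + i] << (8u * i);
    return (uint32_t)(window >> shift) & 0x7fffffffu;
}

/* Write the low 31 bits of value into the idx-th field,
   preserving the neighbouring fields that share the same bytes. */
static void packed31_set(uint8_t *buf, size_t idx, uint32_t value)
{
    size_t bitpos = idx * 31u;
    size_t byte = bitpos / 8u;
    unsigned shift = (unsigned)(bitpos % 8u);
    unsigned nbytes = (shift + 31u + 7u) / 8u;
    uint64_t mask = (uint64_t)0x7fffffffu << shift;
    uint64_t bits = (uint64_t)(value & 0x7fffffffu) << shift;

    for (unsigned i = 0; i < nbytes; i++) {
        uint8_t m = (uint8_t)(mask >> (8u * i));
        uint8_t b = (uint8_t)(bits >> (8u * i));
        buf[byte + i] = (uint8_t)((buf[byte + i] & ~m) | b);
    }
}
```

Three fields occupy 93 bits, so a 12-byte buffer is enough. Each access costs several byte loads plus shifts and masks, which is the overhead that the padded 32-bit representation avoids.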

struct i31 {
    int i:31;   // note *signed* int
    // 1 bit of padding is implicit on targets with 32-bit int
};

struct i31 inc(struct i31 x) {
    x.i++;
    return x;
}

int extend_to_int(struct i31 x) {
    return x.i;
}

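This compiles for x86-64 as follows (on the Godbolt compiler explorer).

I probably should have used gcc -fwrapv to define the behaviour of signed overflow as 2's complement wraparound. I'm not sure what the C rules are for bitfields: whether assigning a signed result to a signed bitfield still triggers signed-overflow undefined behaviour in ISO C and C++.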

# gcc8.2 -O3
inc(i31):
    lea     eax, [rdi+1]
    and     edi, -2147483648   # keep the top bit of the input
    and     eax, 2147483647    # keep the low 31 bits of i++
    or      eax, edi           # merge.
          #   IDK why it can't / doesn't just leave the carry-out in the padding
    ret
extend_to_int(i31):
    lea     eax, [rdi+rdi]     # left shift by 1 (and copy)
    sar     eax                # shift arithmetic right (by 1)
    ret

But ARM is neat, and has better bitfield instructions than x86. (Almost everything has better bitfield instructions than x86.)

# ARM gcc7.2 -march=armv8-a -O3
inc(i31):
    add     r3, r0, #1
    bfi     r0, r3, #0, #31    # bitfield insert to preserve the high bit of the struct
    bx      lr
extend_to_int(i31):
    sbfx    r0, r0, #0, #31    # signed bitfield extract
    bx      lr


memory

assembly

types

cpu-architecture

low-level