C: Bit fields and bitwise operators

Question

My professor has assigned us some homework that uses bit fields, and has given us three macros:

# define SETBIT(A, k) { A[k >> 3] |= (01 << (k & 07)); }
# define CLRBIT(A, k) { A[k >> 3] &= ~(01 << (k & 07)); }
# define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)

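We are not yet required or expected to understand them fully, only to use them to complete the assignment. Each of these macros takes an unsigned int and an index (0-8) and sets, gets, or clears the bit at that index. I get that, and I get how to use them.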

What I want to know is exactly what each of these macros does. Can someone explain it to me like I'm five?

Answer 1

What the macros do

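Ignoring the problems outlined in the next section, the macros treat an array of an integer type as an array of 8-bit values and, when asked to work on bit k, operate on bit number k % 8 of element number k / 8 of the array.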

However, rather than using k % 8 or k / 8, they use shifts and masks.

# define SETBIT(A, k) { A[k >> 3] |= (01 << (k & 07)); }
# define CLRBIT(A, k) { A[k >> 3] &= ~(01 << (k & 07)); }
# define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)

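k >> 3 shifts the value right by 3 bit positions, effectively dividing it by 8.
k & 07 extracts the 3 least significant bits (because 07 octal, or 7 decimal, is 111 in binary) and ignores the rest.
01 << (k & 07) shifts the value 1 left by 0..7 bits, depending on the value of k & 07, producing one of these binary values: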
```
0000 0001
0000 0010
0000 0100
0000 1000
0001 0000
0010 0000
0100 0000
1000 0000
```
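Formally, the result is an int value, so it probably has 32 bits, but the high-order bits are all zero.
The ~ operator converts each 0 bit to 1 and each 1 bit to 0.
The & operator combines two values, yielding a 1 bit where both bits are 1 and a 0 bit where one or both bits are 0.
The | operator combines two values, yielding a 0 bit where both bits are 0 and a 1 bit where one or both bits are 1.
The assignment operators |= and &= apply the operand on the RHS to the variable on the LHS. The notation a |= b; is equivalent to a = a | b; except that a is evaluated just once. That detail doesn't matter here; it would matter if the expression a had side effects, such as an increment.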

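Putting it together:

SETBIT sets the k^th bit (meaning sets it to 1) in the array of 8-bit values represented by A.
CLRBIT resets the k^th bit (meaning sets it to 0) in the array of 8-bit values represented by A.
GETBIT finds the value of the k^th bit in the array of 8-bit values represented by A and returns it as 0 or 1; that is what the final >> (k & 07) does.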

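Nominally, the array elements should be unsigned char, to avoid problems with values and wasted space, but any integer type could be used, more or less wastefully. You'll get interesting results if the type is signed char and the high bit is set on values, or if the type is plain char and plain char is a signed type. You might also get interesting results from GETBIT if the type of A is an integer type bigger than char and the values in the array have bits set outside the last (least significant) 8 bits.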

What the macros don't do

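The macros provided by the professor are an object lesson in how not to write C preprocessor macros. They don't teach you how to write good C; they teach you how to write appalling C.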

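Each of these macros is dangerously broken because the argument k is not wrapped in parentheses where it is used. It isn't hard to argue that the same applies to A. The use of 01 and 07 isn't exactly wrong, but octal 01 and 07 are the same as decimal 1 and 7, so the octal notation adds nothing.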

The GETBIT macro also needs an extra level of parentheses around its whole body. Given

int y = 2;
unsigned char array[32] = "abcdefghijklmnopqrstuvwxyz01234";

then this doesn't compile:

int x = GETBIT(array + 3, y + 2) + 13;

This does compile (with warnings) if your compiler options are lax enough, but it produces an eccentric result:

int x = GETBIT(3 + array, y + 2) + 13;

And that's before we try to discuss:

int x = GETBIT(3 + array, y++) + 13;

The CLRBIT and SETBIT macros use braces, which means that you can't write:

if (GETBIT(array, 13))
    SETBIT(array, 27);
else
    CLRBIT(array, 19);

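because the semicolon after SETBIT is a null statement following the close brace of the statement block introduced by SETBIT, so the else clause has no if to attach to and the code is simply syntactically incorrect.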

The macros could be written like this (preserving the statement-block structure of the SETBIT and CLRBIT macros):

#define SETBIT(A, k) do { (A)[(k) >> 3] |= (1 << ((k) & 7)); } while (0)
#define CLRBIT(A, k) do { (A)[(k) >> 3] &= ~(1 << ((k) & 7)); } while (0)
#define GETBIT(A, k) (((A)[(k) >> 3] & (1 << ((k) & 7))) >> ((k) & 7))

The do { … } while (0) notation is a standard technique in macros that gets around the problem of breaking if / else statements.

The macros could also be rewritten like this, because assignments are expressions:

#define SETBIT(A, k) ( (A)[(k) >> 3] |=  (1 << ((k) & 7)))
#define CLRBIT(A, k) ( (A)[(k) >> 3] &= ~(1 << ((k) & 7)))
#define GETBIT(A, k) (((A)[(k) >> 3] &   (1 << ((k) & 7))) >> ((k) & 7))

Or, even better, as static inline functions like this:

static inline void SETBIT(unsigned char *A, int k) { A[k >> 3] |=  (1 << (k & 7)); }
static inline void CLRBIT(unsigned char *A, int k) { A[k >> 3] &= ~(1 << (k & 7)); }
static inline int  GETBIT(unsigned char *A, int k) { return (A[k >> 3] & (1 << (k & 7))) >> (k & 7); }

The whole can be assembled into a simple test program:

#if MODE == 1

/* As provided */
#define SETBIT(A, k) { A[k >> 3] |= (01 << (k & 07)); }
#define CLRBIT(A, k) { A[k >> 3] &= ~(01 << (k & 07)); }
#define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)

#elif MODE == 2

/* As rewritten */
#define SETBIT(A, k) do { (A)[(k) >> 3] |= (1 << ((k) & 7)); } while (0)
#define CLRBIT(A, k) do { (A)[(k) >> 3] &= ~(1 << ((k) & 7)); } while (0)
#define GETBIT(A, k) (((A)[(k) >> 3] & (1 << ((k) & 7))) >> ((k) & 7))

#else

/* As rewritten, using static inline functions */
static inline void SETBIT(unsigned char *A, int k) { A[k >> 3] |=  (1 << (k & 7)); }
static inline void CLRBIT(unsigned char *A, int k) { A[k >> 3] &= ~(1 << (k & 7)); }
static inline int  GETBIT(unsigned char *A, int k) { return (A[k >> 3] & (1 << (k & 7))) >> (k & 7); }

#endif

int main(void)
{
    int y = 2;
    unsigned char array[32] = "abcdefghijklmnopqrstuvwxyz01234";
    int x = GETBIT(array + 3, y + 2) + 13;
    int z = GETBIT(3 + array, y + 2) + 13;

    if (GETBIT(array, 3))
        SETBIT(array, 22);
    else
        CLRBIT(array, 27);

    return x + z;
}

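When compiled with -DMODE=2 or -DMODE=0, or without any -DMODE setting, it is clean. When compiled with -DMODE=1, there are a lot of warnings (errors for me, since I use GCC and compile with -Werror, which turns any warning into an error).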

$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DMODE=0 bits23.c -o bits23 
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DMODE=2 bits23.c -o bits23
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DMODE=1 bits23.c -o bits23
bits23.c: In function ‘main’:
bits23.c:28:33: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
     int x = GETBIT(array + 3, y + 2) + 13;
                                 ^
bits23.c:6:25: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                         ^
bits23.c:6:24: error: subscripted value is neither array nor pointer nor vector
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                        ^
bits23.c:28:13: note: in expansion of macro ‘GETBIT’
     int x = GETBIT(array + 3, y + 2) + 13;
             ^
bits23.c:28:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int x = GETBIT(array + 3, y + 2) + 13;
                                 ^
bits23.c:6:43: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                           ^
bits23.c:28:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int x = GETBIT(array + 3, y + 2) + 13;
                                 ^
bits23.c:6:57: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                                         ^
bits23.c:29:33: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                 ^
bits23.c:6:25: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                         ^
bits23.c:29:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                 ^
bits23.c:6:43: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                           ^
bits23.c:29:22: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                      ^
bits23.c:6:23: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                       ^
bits23.c:29:33: error: suggest parentheses around ‘+’ in operand of ‘&’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                 ^
bits23.c:6:57: note: in definition of macro ‘GETBIT’
 #define GETBIT(A, k) (A[k >> 3] & (01 << (k & 07))) >> (k & 07)
                                                         ^
bits23.c:29:38: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
     int z = GETBIT(3 + array, y + 2) + 13;
                                      ^
bits23.c:33:5: error: ‘else’ without a previous ‘if’
     else
     ^
cc1: all warnings being treated as errors
$
