C - binary reading, fread is inverting the order

Question

fread(cur, 2, 1, fin)

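I'm sure I'll feel stupid when I get the answer to this, but what is happening here?

cur is a pointer to code_cur, a short (2 bytes), and fin is a stream opened for binary reading.

If my file is 00101000 01000000

what I end up with is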

code_cur = 01000000 00101000

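Why is that? I haven't given any more context because the problem really boils down to this (to me, at least) unexpected behaviour.

And, if this is the norm, how can I get the desired effect?

P.S.

I should add that, in order to 'view' the bytes, I am printing their integer value.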

printf("%d\n",code_cur)

I tried it several times and the result seemed reliable.

Answer 1

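This is why htonl and htons (and friends) exist. They are not part of the C standard library, but they are available on pretty much every platform that does networking.

"htonl" means "host to network, long"; "htons" means "host to network, short". In this context, "long" means 32 bits and "short" means 16 bits (even if the platform declares "long" to be 64 bits). Basically, whenever you read something from the "network" (or, in your case, the stream you are reading from), you pass it through "ntoh*". When you write it out, you pass it through "hton*".

You can permute these function names any way you want, except for the silly ones (no, there is no ntons, and no stonl either).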
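To make the read/write rule above concrete, here is a minimal sketch of a 16-bit round trip through a stream. The helper names write_u16_net and read_u16_net are made up for illustration (they are not part of any library), and the sketch assumes a POSIX system where arpa/inet.h provides htons and ntohs.

```c
#include <arpa/inet.h> /* htons, ntohs: POSIX, not ISO C */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes v to f in network (big-endian) byte order.
   Returns 1 on success, 0 on failure. */
static int write_u16_net(FILE *f, uint16_t v) {
    uint16_t wire = htons(v); /* host order -> network order */
    return fwrite(&wire, sizeof wire, 1, f) == 1;
}

/* Hypothetical helper: reads a 16-bit value stored in network byte order
   and converts it to host order. Returns 1 on success, 0 on failure. */
static int read_u16_net(FILE *f, uint16_t *out) {
    uint16_t wire;
    if (fread(&wire, sizeof wire, 1, f) != 1)
        return 0;
    *out = ntohs(wire); /* network order -> host order */
    return 1;
}
```

With these, writing 0x2840 should put the bytes 0x28 0x40 in the file on any host, and reading them back should give 0x2840 again, whatever the machine's own byte order is.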

Answer 2

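As others have pointed out, you need to learn more about endianness.

You don't know it, but your file is (luckily) in network byte order (that is, big endian). Your machine is little endian, so a correction is needed. Needed or not, making this correction is always recommended, because it guarantees that your program works everywhere.

Do something like this: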

{
    uint16_t tmp;

    if (1 == fread(&tmp, 2, 1, fin)) { /* Check fread finished well */
        code_cur = ntohs(tmp);
    } else {
        /* Treat error however you see fit */
        perror("Error reading file");
        exit(EXIT_FAILURE); // requires #include <stdlib.h>
    }
}

ntohs() converts your value from the file's order to your machine's order, whether the machine is big endian or little endian.

Answer 3

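As others have pointed out, this is an endianness issue.

The position of the most significant byte differs between your file and your machine. Your file is big endian (MSB first), while your machine is little endian (MSB last, or LSB first).

To understand what is going on, let's create a file containing some binary data: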

    uint8_t buffer[2] = {0x28, 0x40}; // hexadecimal for 00101000 01000000
    FILE * fp = fopen("file.bin", "wb"); // opens or creates file as binary
    fwrite(buffer, 1, 2, fp); // write two bytes to file
    fclose(fp);

That creates file.bin holding the binary value 00101000 01000000. Now let's read it:

    uint8_t buffer[2] = {0, 0};
    FILE * fp = fopen("file.bin", "rb");
    fread(buffer, 1, 2, fp); // read two bytes from file
    fclose(fp);
    printf("0x%02x, 0x%02x\n", buffer[0], buffer[1]);
    // The above prints 0x28, 0x40, as expected and in the order we wrote previously

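So everything works, because we are reading byte by byte, and a single byte has no byte order (technically the bits inside it do have an order, always most significant bit first regardless of your machine, but you can think of bytes as having none to keep things simple).

Anyway, as you noticed, here is what happens when you try to read the short directly: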

    FILE * fp_alt = fopen("file.bin", "rb");
    short incorrect_short = 0;
    fread(&incorrect_short, 1, 2, fp_alt);
    fclose(fp_alt);
    printf("Read short as machine endianness: %hu\n", incorrect_short);
    printf("In hex, that is 0x%04x\n", incorrect_short);
    // On a little-endian machine we get the incorrect decimal 16424 and hex 0x4028!
    // The bytes of our short come out swapped because the machine stores the least significant byte first

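Worst of all, on a big-endian machine the code above would not return an incorrect number, leaving you unaware that your code is endian-specific and not portable between processors!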
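If you want to see which kind of machine you are on, a small probe like the one below can help. This is only an illustrative sketch: host_is_little_endian and load_u16_host are hypothetical names, and load_u16_host deliberately reproduces the non-portable "copy the raw bytes into a short" behaviour described above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise.
   It inspects the byte stored at the lowest address of a known 16-bit value. */
static int host_is_little_endian(void) {
    const uint16_t probe = 0x0102;
    uint8_t first;
    memcpy(&first, &probe, 1); /* copy the byte at the lowest address */
    return first == 0x02;      /* LSB first means little endian */
}

/* Hypothetical helper: interprets two raw bytes in the host's own order,
   which is what fread into a short effectively does (NOT portable). */
static uint16_t load_u16_host(const uint8_t bytes[2]) {
    uint16_t v;
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

Feeding the bytes 0x28, 0x40 to load_u16_host should give 0x4028 when host_is_little_endian() returns 1 and 0x2840 otherwise, which is exactly the discrepancy from the question.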

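Using ntohs from arpa/inet.h to convert the byte order works, but I find it odd: it pulls in a whole (non-standard) library made for network communication to solve a problem that comes from reading files, and it solves it by reading the value incorrectly from the file and then 'translating' the incorrect value, instead of reading it correctly in the first place.

In higher-level languages we often see functions that handle the byte order while reading from the file, rather than converting the value afterwards, because we (usually) know the file's structure and its endianness. Just look at the readInt16BE method of the JavaScript Buffer: straight to the point and easy to use.

Inspired by this simplicity, I wrote the function below to read a 16-bit integer (it can easily be changed to 8, 32 or 64 bits if needed):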

#include <stdint.h> // necessary for specific int types
#include <stdio.h>  // necessary for FILE and fread

// Reads a single signed 16-bit integer from the file stream as Big Endian, advancing its position
// Writes the value to 'result' pointer
// Returns 1 if succeeds or 0 if it fails
int8_t freadInt16BE(int16_t * result, FILE * f) {
    uint8_t buffer[sizeof(int16_t)];
    if (!result || !f || sizeof(int16_t) != fread((void *) buffer, 1, sizeof(int16_t), f))
        return 0;
    // The parentheses matter: '+' binds tighter than '<<', so the shift must be grouped explicitly
    *result = (int16_t) ((buffer[0] << 8) | buffer[1]);
    return 1;
}

Usage is simple (error handling omitted for brevity):

    FILE * fp = fopen("file.bin", "rb"); // Open file as binary
    short code_cur = 0;
    freadInt16BE(&code_cur, fp);
    fclose(fp);
    printf("Read Big-Endian (MSB first) short: %hu\n", code_cur);
    printf("In hex, that is 0x%04x\n", code_cur);
    // The above code prints 0x2840 correctly (decimal: 10304)

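The function fails (returns 0) if the file does not exist, could not be opened, or does not contain 2 bytes to read at the current position.

As a bonus, if you happen to come across a little-endian file, you can use this function: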

// Reads a single signed 16-bit integer from the file stream as Little Endian, advancing its position
// Writes the value to 'result' pointer
// Returns 1 if succeeds or 0 if it fails
int8_t freadInt16LE(int16_t * result, FILE * f) {
    uint8_t buffer[sizeof(int16_t)];
    if (!result || !f || sizeof(int16_t) != fread((void *) buffer, 1, sizeof(int16_t), f))
        return 0;
    // The parentheses matter: '+' binds tighter than '<<', so the shift must be grouped explicitly
    *result = (int16_t) ((buffer[1] << 8) | buffer[0]);
    return 1;
}
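As mentioned, the same idea extends to other widths. Here is a rough sketch of what a 32-bit big-endian reader could look like. The name freadUInt32BE is my own invention to match the style of the 16-bit readers, and I use an unsigned result to keep the bit arithmetic simple.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 32-bit variant of the 16-bit readers.
   Reads four bytes from the stream and assembles them MSB first.
   Returns 1 on success or 0 on failure. */
static int freadUInt32BE(uint32_t *result, FILE *f) {
    uint8_t b[4];
    if (!result || !f || fread(b, 1, sizeof b, f) != sizeof b)
        return 0;
    /* Cast each byte to uint32_t before shifting: shifting a promoted int
       left by 24 could overflow into the sign bit. */
    *result = ((uint32_t) b[0] << 24) | ((uint32_t) b[1] << 16) |
              ((uint32_t) b[2] << 8)  |  (uint32_t) b[3];
    return 1;
}
```

For a little-endian file you would reverse the indices (b[3] down to b[0]). A file containing the bytes 0x12 0x34 0x56 0x78 should read back as 0x12345678 with the version shown here.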
