Why do we always have to use fgetc in C instead of fscanf, which does the same thing but prints strange results?

Question

In C, whenever I use fgetc(file) to read every character until the end of the file, it works. But when I use the similar call fscanf(file, "%c"), it prints strange characters. Code:

#include <stdio.h>
#include <stdlib.h>

int main() {
    char c;
    FILE * file = fopen("D\filename.txt", "r");
    while (c != EOF) {
        fscanf(file, "%c", &c);
        printf("%c", c);
    }
    return 0;
}

But when I use fgetc instead of fscanf, it works: it prints every character in the file.

Can anyone explain why this happens?

Answer 1

Notice that

c=fscanf(file,"%c");

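is undefined behavior (here I am explaining why you should be afraid of it, even when a program apparently seems to "work"): the %c conversion expects a char * argument and none is passed, and fscanf returns the number of items converted, not the character read. Every good C compiler (e.g. GCC, invoked as gcc -Wall -Wextra -g) should warn you about it if you enable all warnings. When coding in C, you should also learn how to use a debugger (e.g. gdb).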

You should read the documentation of fscanf(3). You probably want to code

char c = '\0';
if (fscanf(file, "%c", &c) <= 0) break;

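You had better get into the habit of initializing every variable; a good optimizing compiler will remove useless initializations, and will otherwise often warn you about uninitialized variables.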

Notice that in your case using fgetc(3) is probably preferable. You then need to declare c as an int, not a char, and code:

do {
  int c=fgetc(file);
  if (c==EOF) break;
} while (!feof(file));

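Notice that in the loop above feof(file) is never true when it is tested (because fgetc would have returned EOF first, triggering the break), so you had better replace while(!feof(file)) with while(true) (which needs <stdbool.h>; while (1) works without it).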

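The same code using fgetc is easier to read (for other developers, or even for yourself a few months later) and is quite probably faster. Most implementations of fscanf are in some way based on fgetc or something closely related.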

Also, get into the good habit of testing your input. The input file might not be what you expect.

On most recent systems the encoding today is UTF-8. Be aware that some (human language) characters are encoded in several bytes (e.g. the French accented letter é, the Russian letter yery Ы, the Euro sign €, the mathematical "for all" sign ∀, and letters or glyphs in other languages). You should probably consider using some UTF-8 library (e.g. libunistring) if you care about it (and you should care about UTF-8 in serious software!).

Nota Bene: if you are young and learning to program, it is better (IMHO) to learn Scheme with SICP, using e.g. Racket, before learning C or Java. C is really not for beginners, IMHO.

PS. The character type (usually one byte) is char, in lowercase.


c

file-io

scanf

fgetc