Using fscanf, scanning a file into a struct in C, but the first argument is failing already
I have a file, and I am trying to read each of its lines into a struct in C so I can work with the data further.
The file looks like this:
Bread,212,2.7,36,6,9.8,0.01,0.01,10,500
Pasta,347,2.5,64,13,7,0.01,0.01,6,500
Honey,340,0.01,83,0.01,0.01,0.01,0.01,22.7,425
Olive-oil,824,92,0.01,0.01,0.01,0.01,13.8,35,500
White-beans,320,2.7,44,21,18,0.01,0.01,11,400
Flaxseed-oil,828,92,0.01,0.01,0.01,52,14,100,100
Cereal,363,6.5,58,13,9.9,0.01,0.01,11,1000
Hazelnuts,644,61.6,10.5,12,0.01,0.09,7.83,16.74,252
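So I wrote a for loop to iterate over the lines of the file, trying to store each value into a field of the struct. I then printed the fields of the struct, but the very first argument, the string, already comes out wrong.
This is what gets printed:
scanres: 1, name: ■B, kcal: 0.00, omega 3: 0.00, omega 6: 0.00, carb: 0.00, protein: 0.00, fib: 0.00, price: 0.00, weight: 0.00g
scanres should be 10, not 1, and the values should match those in the first line of the file.
I have tried the format string with and without a space in front of the conversion specifiers. I also compiled with the warning flags -Wall and -pedantic; no problems were reported.
What else could be causing this?
The code looks like this: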
#include <stdio.h>
#define MAX_CHAR 100
#define SIZE_OF_SHELF 8
typedef struct {
    char name[MAX_CHAR];
    double kcal, fat, omega_3, omega_6, carb, protein, fib, price, weight;
} Food;
int main(void) {
    int i = 0, scanres;   /* scanres holds the number of fields fscanf converted */
    Food Shelf[SIZE_OF_SHELF];
    FILE *fp;
    fp = fopen("foods.txt", "r");
    if (! fp) {
        printf("error loading file. bye.\n");
        return 0;
    }
    for (i = 0; !feof(fp); i++) {
        /* one record per line: a name followed by nine comma-separated numbers */
        scanres = fscanf(fp, " %[^,],%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf ",
                         Shelf[i].name,
                         &Shelf[i].kcal, &Shelf[i].fat,
                         &Shelf[i].carb, &Shelf[i].protein,
                         &Shelf[i].fib, &Shelf[i].omega_3,
                         &Shelf[i].omega_6, &Shelf[i].price,
                         &Shelf[i].weight);
        printf("scanres: %d, name: %s, kcal: %.2f, omega 3: %.2f, omega 6: %.2f, carb: %.2f, protein: %.2f, fib: %.2f, price: %.2f, weight: %.2fg\n",
               scanres, Shelf[i].name, Shelf[i].kcal,
               Shelf[i].omega_3, Shelf[i].omega_6, Shelf[i].carb,
               Shelf[i].protein, Shelf[i].fib, Shelf[i].price,
               Shelf[i].weight);
    }
    fclose(fp);
    return 0;
}
If anyone spots what I am doing wrong, please let me know.
Check whether the first three bytes of the file are a byte order mark (BOM). You can use hexdump (or any binary editor) to check.
File with a BOM:
00000000 ef bb bf 42 72 65 61 64 2c 32 31 32 2c 32 2e 37 |...Bread,212,2.7|
00000010 2c 33 36 2c 36 2c 39 2e 38 2c 30 2e 30 31 2c 30 |,36,6,9.8,0.01,0|
00000020 2e 30 31 2c 31 30 2c 35 30 30 20 0a 50 61 73 74 |.01,10,500 .Past|
00000030 61 2c 33 34 37 2c 32 2e 35 2c 36 34 2c 31 33 2c |a,347,2.5,64,13,|
...
File without a BOM:
00000000 42 72 65 61 64 2c 32 31 32 2c 32 2e 37 2c 33 36 |Bread,212,2.7,36|
00000010 2c 36 2c 39 2e 38 2c 30 2e 30 31 2c 30 2e 30 31 |,6,9.8,0.01,0.01|
00000020 2c 31 30 2c 35 30 30 20 0a 50 61 73 74 61 2c 33 |,10,500 .Pasta,3|
00000030 34 37 2c 32 2e 35 2c 36 34 2c 31 33 2c 37 2c 30 |47,2.5,64,13,7,0|
...
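If hexdump is not at hand, the same check can be done from C by looking at the first few bytes of the file. The following is only a minimal sketch: the names BomKind, detect_bom and detect_file_bom are made up for this illustration, and it recognizes only the UTF-8 and UTF-16 marks (it does not try to tell UTF-32 apart from UTF-16).

```c
#include <stdio.h>

/* Hypothetical type for this sketch: which BOM, if any, was found. */
typedef enum { BOM_NONE, BOM_UTF8, BOM_UTF16_LE, BOM_UTF16_BE } BomKind;

/* Hypothetical helper: classify the BOM at the start of a byte buffer. */
BomKind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;        /* ef bb bf */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;    /* ff fe */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;    /* fe ff */
    return BOM_NONE;
}

/* Hypothetical helper: read the first bytes of a file and classify them.
   Returns BOM_NONE if the file cannot be opened or is too short. */
BomKind detect_file_bom(const char *path)
{
    unsigned char buf[3];
    size_t n;
    FILE *fp = fopen(path, "rb");   /* binary mode: bytes are read unchanged */
    if (!fp)
        return BOM_NONE;
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return detect_bom(buf, n);
}
```

Calling detect_file_bom("foods.txt") before the parsing loop and bailing out on anything other than BOM_NONE would turn the silent garbage into an explicit error.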
Most likely, in addition to having a Byte Order Mark (BOM), the original copy of the foods.txt file was encoded using UTF-16, instead of ASCII or the more popular and compatible UTF-8. Taking a cue from wildplasser's answer, here is a hex dump of the first portion of the file in the little-endian variant of that encoding:
00000000 ff fe 42 00 72 00 65 00 61 00 64 00 2c 00 32 00 |..B.r.e.a.d.,.2.|
00000010 31 00 32 00 2c 00 32 00 2e 00 37 00 2c 00 33 00 |1.2.,.2...7.,.3.|
00000020 36 00 2c 00 36 00 2c 00 39 00 2e 00 38 00 2c 00 |6.,.6.,.9...8.,.|
00000030 30 00 2e 00 30 00 31 00 2c 00 30 00 2e 00 30 00 |0...0.1.,.0...0.|
00000040 31 00 2c 00 31 00 30 00 2c 00 35 00 30 00 30 00 |1.,.1.0.,.5.0.0.|
00000050 20 00 0a 00 50 00 61 00 73 00 74 00 61 00 2c 00 | ...P.a.s.t.a.,.|
00000060 33 00 34 00 37 00 2c 00 32 00 2e 00 35 00 2c 00 |3.4.7.,.2...5.,.|
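The leading ff fe is the byte order mark, and it would explain the mysterious ■ that appears in the output name: ■B. After that, every other byte is 0, which is why "Bread" appears truncated to "B": printing with %s stops at the first zero byte. The first %lf of fscanf then runs into those zero bytes (the input looks like "r\0e\0a\0d..." rather than plain ASCII text) and cannot parse a double, which is why fscanf returns 1 instead of 10.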
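The effect can be reproduced without the original file. The sketch below is an illustration only: scan_utf16_sample is a made-up helper that writes the bytes of "Bread,212" in UTF-16LE (with the ff fe mark) to a temporary file and then parses them with a shortened version of the format string from the question. It should return 1, with name holding ff fe 'B' followed by a zero byte.

```c
#include <stdio.h>

/* Hypothetical helper: parse a UTF-16LE sample with a byte-oriented format.
   Returns the fscanf result, or -2 if no temporary file could be created.
   `name` must have room for at least 100 bytes. */
int scan_utf16_sample(char *name, double *kcal)
{
    /* "Bread,212\n" encoded as UTF-16LE, preceded by the ff fe BOM */
    static const unsigned char sample[] = {
        0xff, 0xfe, 'B', 0, 'r', 0, 'e', 0, 'a', 0, 'd', 0,
        ',', 0, '2', 0, '1', 0, '2', 0, '\n', 0
    };
    int result;
    FILE *fp = tmpfile();
    if (!fp)
        return -2;
    fwrite(sample, 1, sizeof sample, fp);
    rewind(fp);
    /* %99[^,] accepts every byte up to the first comma (2c);
       %lf then meets a zero byte instead of a digit and fails */
    result = fscanf(fp, " %99[^,],%lf", name, kcal);
    fclose(fp);
    return result;
}
```

Printing name with %s after this call shows only the two BOM bytes and the letter B, which matches the ■B in the question's output.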
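Copying the contents of the .txt file into a new .txt file solved the problem. The file originated from an .xls file, and my guess is that this is where the strange BOM stuff some of you mentioned came from.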
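Independently of the encoding, the read loop can be made to fail loudly instead of printing garbage. What follows is a sketch under assumptions, not the code from the question: read_foods is a hypothetical helper that reuses the Food struct and field order from the question, stops at the first record that does not yield all 10 fields, and never writes past the array. The width 99 in the format is chosen to match MAX_CHAR being 100.

```c
#include <stdio.h>

#define MAX_CHAR 100

typedef struct {
    char name[MAX_CHAR];
    double kcal, fat, omega_3, omega_6, carb, protein, fib, price, weight;
} Food;

/* Hypothetical helper: read up to `capacity` records from `fp` into `shelf`.
   Stops at the first record that does not convert all 10 fields, so a
   mis-encoded file yields 0 records rather than an endless loop.
   Returns the number of complete records read. */
size_t read_foods(FILE *fp, Food *shelf, size_t capacity)
{
    size_t count = 0;
    while (count < capacity) {
        Food *f = &shelf[count];
        /* %99[^,] leaves room for the terminator in name[MAX_CHAR] */
        int fields = fscanf(fp, " %99[^,],%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf,%lf",
                            f->name, &f->kcal, &f->fat, &f->carb, &f->protein,
                            &f->fib, &f->omega_3, &f->omega_6, &f->price,
                            &f->weight);
        if (fields != 10)
            break;              /* end of file or malformed record */
        count++;
    }
    return count;
}
```

With this shape, the caller can compare the returned count against the number of records it expects and report an error when a file like the UTF-16 one produces 0.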