Weird malloc behavior in C
I have the following ANSI C code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *buffer = 0;
    int length = 0;
    FILE *f = fopen("text.txt", "r");
    if (f) {
        fseek(f, 0, SEEK_END);
        length = ftell(f);
        fseek(f, 0, SEEK_SET);
        buffer = malloc(length);
        fread(buffer, 1, length, f);
        fclose(f);
    }
    printf("File size: %d\nBuffer size: %d\nContent: %s\n=END=", length, strlen(buffer), buffer);
    return 0;
}
For some reason, malloc seems to allocate more memory than needed, and the output contains extra garbage from memory. For example:
First run:
File size: 12
Buffer size: 22
Content: 123456789012les=$#▬rW|
=END=
Second run:
File size: 12
Buffer size: 22
Content: 123456789012les↔1↕.'
=END=
Third run:
File size: 12
Buffer size: 22
Content: 123456789012les=▬kπà
=END=
Can someone help me fix this and explain why my version behaves so strangely?
I compile with MinGW TDM-GCC 4.9.2 32-bit (gcc).
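You have undefined behavior (this explains why you should be afraid of UB) because of a buffer overflow. You forgot to add the terminating null byte, so strlen and printf's %s keep reading past the end of the 12 bytes you allocated until they happen to hit a zero byte. malloc did not allocate "too much"; the extra characters are whatever lies in memory after your buffer.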
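To make the effect visible without actually overrunning anything, here is a small sketch that stays inside a 16-byte array. The helper names `reported_length` and `format_prefix` are made up for this illustration, and the 'X' bytes only stand in for the leftover memory seen in the question's output.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns what strlen() reports for a 16-byte array
   holding the 12 data bytes "123456789012". The rest of the array is filled
   with 'X' to simulate leftover memory; only the last slot is '\0', which
   keeps the demonstration within bounds. */
size_t reported_length(int terminate_after_data)
{
    char buf[16];

    memset(buf, 'X', sizeof buf - 1);   /* simulated garbage */
    buf[sizeof buf - 1] = '\0';         /* safety terminator at the very end */
    memcpy(buf, "123456789012", 12);    /* 12 data bytes, no terminator copied */
    if (terminate_after_data)
        buf[12] = '\0';                 /* the byte the question's code never writes */
    return strlen(buf);
}

/* Hypothetical helper: formats at most n bytes of data. With an explicit
   precision, %s does not require data to be null-terminated. */
int format_prefix(char *out, size_t out_size, const char *data, int n)
{
    return snprintf(out, out_size, "%.*s", n, data);
}
```

Without the terminator, `reported_length(0)` counts the filler bytes as part of the string; with it, `reported_length(1)` gives 12. The second helper shows an alternative for printing only: passing the length as a precision (`"%.*s"`) limits how many bytes are read, although strlen on the same buffer would still be wrong.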
Replace the wrong lines:
// WRONG CODE:
buffer = malloc(length);
fread(buffer, 1, length, f);
with
buffer = malloc(length+1);   /* one extra byte for the terminating '\0' */
if (!buffer) {
    perror("malloc");
    exit(EXIT_FAILURE);
}
memset(buffer, 0, length+1);
if (fread(buffer, 1, length, f) < (size_t)length) {
    perror("fread");
    exit(EXIT_FAILURE);
}
(You could zero only the final byte; I prefer to clear the whole buffer with memset.)
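Put together, the fix might look like the sketch below. The function name `read_text_file` and its `out_len` parameter are assumptions made for this example, not part of the original answer. It keeps the fseek/ftell size probe from the question, but places the terminator after the number of bytes fread actually returned, so a short read in text mode still yields a valid string.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the file at path into a newly allocated,
   null-terminated buffer. Returns NULL on failure. The caller frees the
   result. If out_len is not NULL, it receives the number of bytes read. */
char *read_text_file(const char *path, size_t *out_len)
{
    FILE *f = fopen(path, "r");
    char *buffer;
    long length;
    size_t got;

    if (!f)
        return NULL;

    /* Size probe as in the question, but with every call checked. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    length = ftell(f);
    if (length < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }

    buffer = malloc((size_t)length + 1);    /* +1 for the terminator */
    if (!buffer) { fclose(f); return NULL; }

    got = fread(buffer, 1, (size_t)length, f);
    if (ferror(f)) { free(buffer); fclose(f); return NULL; }
    fclose(f);

    buffer[got] = '\0';     /* terminate after the bytes actually read */
    if (out_len)
        *out_len = got;
    return buffer;
}
```

A caller would then print the result with something like `printf("%s\n", text)` and release it with `free(text)`.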
By the way, ANSI C is outdated. You should use a C11-compliant compiler (e.g. a recent GCC invoked as gcc -std=c11 -Wall -Wextra -g) and target C11 compliance (or at least C99). Learn to use the debugger (e.g. gdb).
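Using fseek(f, 0, SEEK_END); to find the file size is also a problem. First, you are not reading in binary mode, so the number of bytes in the file is not necessarily the number of bytes you will read (on Windows, for example, text mode translates each CR LF pair into a single newline character).
But if you switch to a binary stream, according to 7.19.9.2 of the C Standard: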
A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
and
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.
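One way to sidestep the size probe entirely is to read until end-of-file and grow the buffer as needed. The following is only a sketch under that assumption: the helper name `read_stream` and the 4096-byte starting capacity are arbitrary choices for illustration, not something from the standard or from the answers above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads an already opened stream to end-of-file
   without fseek/ftell. Returns a newly allocated, null-terminated buffer
   (caller frees), or NULL on read or allocation failure. If out_len is
   not NULL, it receives the number of bytes read. */
char *read_stream(FILE *f, size_t *out_len)
{
    size_t cap = 4096;      /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);

    if (!buf)
        return NULL;

    for (;;) {
        /* Always keep one byte in reserve for the terminator. */
        len += fread(buf + len, 1, cap - len - 1, f);
        if (len < cap - 1)
            break;          /* short read: end-of-file or error */

        /* Buffer is full: double it, guarding against overflow. */
        if (cap > (size_t)-1 / 2) { free(buf); return NULL; }
        char *tmp = realloc(buf, cap * 2);
        if (!tmp) { free(buf); return NULL; }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(f)) { free(buf); return NULL; }

    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```

Because the length comes from what fread returned rather than from ftell, this works the same way for text and binary streams, and also for inputs that cannot be seeked at all, such as stdin.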