basename on buffer goes into segmentation fault
I'm experimenting with basename right now, and I've run into a really weird case (at least for me). Here's the code:
char buffer[300];
char* p;
strcpy(buffer, "../src/test/resources/constraints_0020_000");
printf("%d\n", strcmp(basename("../src/test/resources/constraints_0020_000"), "constraints_0020_000")); //works as expected
printf("assert testBasename02");
printf("%d\n", strcmp(basename(buffer), "constraints_0020_000") == 0);
printf("done 1\n"); //goes in segmentation fault
printf("%d\n", strcmp(basename(&buffer), "constraints_0020_000") == 0);
printf("done 2\n"); //goes in segmentation fault
printf("%d\n", strcmp(basename(&buffer[0]), "constraints_0020_000") == 0);
printf("done 3\n"); //goes in segmentation fault
p = malloc(strlen("../src/test/resources/constraints_0020_000") +1);
strcpy(p, "../src/test/resources/constraints_0020_000");
printf("%d\n", strcmp(basename(p), "constraints_0020_000") == 0); //works as expected
free(p);
printf("all done\n");
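The first strcmp works totally fine; it's the second one that puzzles me: why would the buffer go into a segmentation fault? I tried passing the buffer in different ways, but the result is the same.
I can of course live with this behaviour, but... I really don't understand what difference it makes to basename whether I feed it a const char* or a buffer (which in the end is also a char*).
Is there documentation that explains this behaviour? Is it just me? I tried to look for an explanation but couldn't find any.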
Here are the specs of my computer (in case you need them):
- OS: Ubuntu 16.4 (64-bit, virtualized on Windows 10 64-bit);
- CPU (I think it's irrelevant): Intel® Core™ i5-3230M CPU @ 2.60GHz × 2;
According to the man page,
Bugs
In the glibc implementation of the POSIX versions of these functions they modify their argument, and segfault when called with a static string like "/usr/". [...]
Basically,
basename("../src/test/resources/constraints_0020_000")
invokes undefined behavior, because it is an attempt to modify a string literal.
Note: the wording in the man page needs a change. It should read:
In the glibc implementation of the POSIX versions of these functions they modify their argument, and invoke undefined behavior when called with a static string like "/usr/". [...]
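A segmentation fault is one of the possible side effects of UB, but not the only one.
FWIW, attempting to modify a string literal itself invokes UB. Quoting C11, chapter §6.4.5, String literals:
[...] If the program attempts to modify such an array, the behavior is undefined.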
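EDIT:
As discussed in the follow-up comments, an additional problem was the missing header file. You need to add
#include <libgen.h>
so that the declaration of basename() is available.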
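To make the fix from the edit above concrete, here is a minimal sketch, assuming a POSIX system that provides <libgen.h>. The helper name basename_equals and the 300-byte limit (borrowed from the question's buffer) are assumptions for illustration only. With the header included, the compiler knows that basename() returns a char*; without it, an older compiler may assume an int return value, which on a 64-bit platform can truncate the returned pointer.

```c
#include <libgen.h>   /* declares char *basename(char *path) */
#include <string.h>

/* Hypothetical helper: returns 1 if the last component of `path`
 * equals `expected`, 0 otherwise (or if `path` does not fit). */
int basename_equals(const char *path, const char *expected)
{
    char buffer[300];               /* writable copy; size taken from the question */

    if (strlen(path) >= sizeof buffer)
        return 0;                   /* too long for the local buffer */
    strcpy(buffer, path);

    /* basename() may modify `buffer`, which is fine: it is a plain
     * writable array, not a string literal. */
    return strcmp(basename(buffer), expected) == 0;
}
```

Calling basename_equals("../src/test/resources/constraints_0020_000", "constraints_0020_000") never hands a string literal to basename(), so the call stays well-defined whether or not the implementation modifies its argument.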
The basename() function may modify the string pointed to by path, and may return a pointer to internal storage. The returned pointer might be invalidated or the storage might be overwritten by a subsequent call to basename(). The returned pointer might also be invalidated if the calling thread is terminated.
Both dirname() and basename() may modify the contents of path, so it may be desirable to pass a copy when calling one of these functions.
You are calling basename() with a static string, which may be read-only, so it results in a SEGV when basename() tries to modify the string.
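The advice quoted above (pass a copy, and do not rely on the returned pointer staying valid) could be sketched roughly as follows. This is only an illustration under stated assumptions: copy_basename is a hypothetical helper name, and the -1 error convention is a choice made here, not part of any library.

```c
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes the last component of `path` into `out`.
 * Returns 0 on success, -1 on allocation failure or if `out` is too small. */
int copy_basename(const char *path, char *out, size_t out_size)
{
    /* Work on a private, writable copy so the caller's string
     * (possibly a read-only literal) is never modified. */
    char *copy = malloc(strlen(path) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, path);

    /* The result may point into `copy` or into internal storage,
     * so it is copied out before `copy` is released. */
    const char *base = basename(copy);
    size_t len = strlen(base);
    int status = -1;
    if (len < out_size) {
        memcpy(out, base, len + 1);   /* include the terminating '\0' */
        status = 0;
    }

    free(copy);
    return status;
}
```

With this helper a string literal can be passed directly, for example copy_basename("/usr/", out, sizeof out), because only the heap copy is ever handed to basename().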