Clang 13 -O2 produces weird output while gcc does not
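Can someone explain to me why the following code gets optimized strangely by clang 13 with the -O2 flag? With lower optimization settings in clang, and with every optimization setting in gcc, I get the expected output "John: 5". With clang -O2 or higher, however, I get ": 5". Does my code have undefined behavior that I am not aware of? Strangely, if I compile with -fsanitize=undefined, the code works as expected. How should I even go about diagnosing a problem like this? Any help is greatly appreciated.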
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef size_t usize;

typedef struct String {
    char *s;
    usize len;
} String;

String string_new(void) {
    String string;
    char *temp = malloc(1);
    if (temp == NULL) {
        printf("Failed to allocate memory in \"string_new()\".\n");
        exit(-1);
    }
    string.s = temp;
    string.s[0] = 0;
    string.len = 1;
    return string;
}

String string_from(char *s) {
    String string = string_new();
    string.s = s;
    string.len = strlen(s);
    return string;
}

void string_push_char(String *self, char c) {
    self->len = self->len + 1;
    char *temp = realloc(self->s, self->len);
    if (temp == NULL) {
        printf("Failed to allocate memory in \"string_push_char()\".\n");
        exit(-1);
    }
    self->s[self->len - 2] = c;
    self->s[self->len - 1] = 0;
}

void string_free(String *self) {
    free(self->s);
}

int main(void) {
    String name = string_new();
    string_push_char(&name, 'J');
    string_push_char(&name, 'o');
    string_push_char(&name, 'h');
    string_push_char(&name, 'n');
    printf("%s: %lu\n", name.s, name.len);
    string_free(&name);
    return 0;
}
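Your string_push_char calls realloc but then keeps using the old pointer. This usually goes fine if the block is reallocated in place, but it is of course undefined behavior if the memory block gets moved.
However, Clang has a (controversial) optimization that assumes the pointer passed to realloc always becomes invalid, since you are supposed to use the returned pointer instead.
The solution is to assign temp back to self->s after the null check.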
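A minimal sketch of that fix, reusing the String struct and the function names from the question. The only essential change is storing the result of realloc back into self->s before touching the buffer; the assert, the stderr messages and EXIT_FAILURE are additions made here for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef size_t usize;

typedef struct String {
    char *s;    /* heap buffer, always NUL-terminated */
    usize len;  /* buffer size in bytes, including the terminator */
} String;

/* Creates an empty string: a one-byte buffer holding only the terminator. */
String string_new(void) {
    String string;
    string.s = malloc(1);
    if (string.s == NULL) {
        fprintf(stderr, "Failed to allocate memory in \"string_new()\".\n");
        exit(EXIT_FAILURE);
    }
    string.s[0] = 0;
    string.len = 1;
    return string;
}

/* Appends c, growing the buffer by one byte. */
void string_push_char(String *self, char c) {
    assert(self != NULL && self->s != NULL);
    char *temp = realloc(self->s, self->len + 1);
    if (temp == NULL) {
        fprintf(stderr, "Failed to allocate memory in \"string_push_char()\".\n");
        exit(EXIT_FAILURE);
    }
    /* The old value of self->s is invalid from here on; use only temp. */
    self->s = temp;
    self->len += 1;
    self->s[self->len - 2] = c;
    self->s[self->len - 1] = 0;
}

/* Releases the buffer owned by the string. */
void string_free(String *self) {
    free(self->s);
}
```

With these definitions, the main function from the question should print the name followed by a length of 5 at every optimization level, because no pointer is used after it has been handed to realloc.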
As a side note, your string_from is completely broken; you should delete it and rethink it from scratch.
I would do it differently.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef size_t usize;

typedef struct String
{
    usize len;   /* length including the null character */
    char str[];  /* flexible array member holding the characters */
} String;

String *string_from(char *s)
{
    usize size = strlen(s);
    String *string = malloc(sizeof(*string) + size + 1);
    if(string)
    {
        string -> len = size + 1; //including null character
        strcpy(string -> str, s);
    }
    return string;
}

String *string_push_char(String *self, char c) {
    usize len = self ? self->len : 1;
    /* the allocation has to cover the struct header as well as the characters */
    self = realloc(self, sizeof(*self) + len + 1);
    if(self)
    {
        self -> len = len + 1;
        self -> str[self -> len - 2] = c;
        self -> str[self -> len - 1] = 0;
    }
    return self;
}

void string_free(String *self) {
    free(self);
}

int main(void) {
    String *str = NULL;
    /* add some allocation checks same as with realloc function (temp pointer etc) */
    str = string_push_char(str, 'J');
    str = string_push_char(str, 'o');
    str = string_push_char(str, 'h');
    str = string_push_char(str, 'n');
    printf("%s: %zu\n", str -> str, str -> len);
    string_free(str);
    return 0;
}
https://godbolt.org/z/4ardvGcxa
There are many problems in your code:
String string_from(char *s) {
    String string = string_new();
    string.s = s;
    string.len = strlen(s);
    return string;
}
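This function immediately leaks the buffer that string_new allocated, and it stores in the struct a block that is (most likely) not reallocatable, and possibly not modifiable, which you may later try to reallocate.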
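One way to avoid both problems is to have string_from copy the caller's text into a buffer that the String owns. The sketch below assumes the pointer-plus-length struct from the question and assumes that len counts the terminator, as string_new does; the const qualifier on the parameter, the assert and the error message are choices made here, not something taken from the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef size_t usize;

typedef struct String {
    char *s;    /* heap buffer owned by the String */
    usize len;  /* buffer size in bytes, including the terminator (assumed) */
} String;

/* Builds a String holding its own heap copy of s, so that the result
   can later be passed to realloc and free. */
String string_from(const char *s) {
    assert(s != NULL);
    String string;
    string.len = strlen(s) + 1;
    string.s = malloc(string.len);
    if (string.s == NULL) {
        fprintf(stderr, "Failed to allocate memory in \"string_from()\".\n");
        exit(EXIT_FAILURE);
    }
    memcpy(string.s, s, string.len);  /* copies the terminator as well */
    return string;
}

/* Releases the buffer owned by the string. */
void string_free(String *self) {
    free(self->s);
}
```

Because the buffer comes from malloc, passing a string literal is safe: the String never points into read-only storage, and nothing is allocated and then discarded.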
In addition to @Sebastian Redl's answer, I can add that the code has undefined behavior according to C17 7.22.3.5:
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size.
This is one of those things that was poorly specified in C90 and silently clarified in C99. From the C99 Rationale V5.10, 7.20.3.4:
A new feature of C99: the realloc function was changed to make it clear that the pointed-to object is deallocated, a new object is allocated, and the content of the new object is the same as that of the old object up to the lesser of the two sizes. C89 attempted to specify that the new object was the same object as the old object but might have a different address. This conflicts with other parts of the Standard that assume that the address of an object is constant during its lifetime. Also, implementations that support an actual allocation when the size is zero do not necessarily return a null pointer for this case. C89 appeared to require a null return value, and the Committee felt that this was too restrictive.
Notably, clang -O3 -std=c90 -pedantic-errors still breaks, so this code will not work correctly in clang under any C version.