Why are compound literals in C modifiable?
One usually associates 'unmodifiable' with the term literal:
char* str = "Hello World!";
*str = 'B'; // Bus Error!
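However, when using compound literals, I quickly discovered that they are completely modifiable (looking at the generated machine code, you can see they are pushed onto the stack):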
char* str = (char[]){"Hello World"};
*str = 'B'; // A-Okay!
I'm compiling with clang-703.0.29. Shouldn't those two examples generate exactly the same machine code? And if a compound literal is modifiable, is it really a literal?
EDIT: An even shorter example is:
"Hello World"[0] = 'B'; // Bus Error!
(char[]){"Hello World"}[0] = 'B'; // Okay!
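A compound literal is an lvalue, and the values of its elements are modifiable. In the case of
char* str = (char[]){"Hello World"};
*str = 'B'; // A-Okay!
you are modifying a compound literal, which is legal.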
C11-§6.5.2.5/4:
If the type name specifies an array of unknown size, the size is determined by the initializer list as specified in 6.7.9, and the type of the compound literal is that of the completed array type. Otherwise (when the type name specifies an object type), the type of the compound literal is that specified by the type name. In either case, the result is an lvalue.
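As can be seen, the type of the compound literal is a complete array type and the result is an lvalue; since that type is not const-qualified here, it is modifiable, unlike a string literal.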
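A minimal sketch of what the quoted paragraph implies, assuming a C99 or later compiler. The helper names literal_size and first_char_after_write are made up for illustration and do not come from the standard or from the question.

```c
#include <stddef.h>

/* Hypothetical helper: the array type is completed from the initializer,
   so sizeof can be applied to the compound literal directly. */
size_t literal_size(void)
{
    return sizeof (char[]){"Hello World"};  /* 11 characters plus '\0' */
}

/* Hypothetical helper: a compound literal is an lvalue, so its address can
   be taken and its elements can be assigned. */
char first_char_after_write(void)
{
    char (*whole)[12] = &(char[]){"Hello World"};  /* pointer to the whole array */
    char *str = *whole;                            /* pointer to its first element */

    *str = 'B';          /* well defined: the unnamed array is not const */
    return (*whole)[0];  /* reads back the character just written */
}
```

Taking the address with & would not compile if the expression were not an lvalue, which is the property the standard text spells out.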
The standard also mentions
§6.5.2.5/7:
String literals, and compound literals with const-qualified types, need not designate distinct objects.
It goes on to say:
11 EXAMPLE 4 A read-only compound literal can be specified through constructions like:
(const float []){1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6}
12 EXAMPLE 5 The following three expressions have different meanings:
"/tmp/fileXXXXXX"
(char []){"/tmp/fileXXXXXX"}
(const char []){"/tmp/fileXXXXXX"}
The first always has static storage duration and has type array of char, but need not be modifiable; the last two have automatic storage duration when they occur within the body of a function, and the first of these two is modifiable.
13 EXAMPLE 6 Like string literals, const-qualified compound literals can be placed into read-only memory and can even be shared. For example,
(const char []){"abc"} == "abc"
might yield 1 if the literals’ storage is shared.
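EXAMPLE 5 can be turned into a small sketch: only the (char []) form is a buffer that may legally be overwritten in place. The helpers fill_placeholders and make_name are hypothetical names used for illustration, and the '0' fill character is an arbitrary choice.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrites every 'X' in buf with fill and returns
   how many characters were replaced. It writes to buf, so it takes char *. */
int fill_placeholders(char *buf, char fill)
{
    int replaced = 0;

    for (; *buf != '\0'; ++buf) {
        if (*buf == 'X') {
            *buf = fill;
            ++replaced;
        }
    }
    return replaced;
}

/* Hypothetical helper: fills the template held in a compound literal and
   copies the result to out. Returns the number of replaced characters,
   or -1 if out is too small. */
int make_name(char *out, size_t out_size)
{
    /* Modifiable, automatic storage: safe to pass to fill_placeholders.
       Passing "/tmp/fileXXXXXX" itself would be undefined behavior, and
       the (const char []) form would not convert to char * without a
       diagnostic. */
    char *name = (char[]){"/tmp/fileXXXXXX"};
    int replaced = fill_placeholders(name, '0');

    if (strlen(name) + 1 > out_size)
        return -1;
    memcpy(out, name, strlen(name) + 1);
    return replaced;
}
```

The result is copied out because the compound literal's lifetime ends with the enclosing block, so a pointer to it must not be returned from make_name.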
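The compound literal syntax is a shorthand expression, equivalent to a local declaration with an initializer followed by a reference to the unnamed object thus declared: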
char *str = (char[]){ "Hello World" };
is equivalent to:
char __unnamed__[] = { "Hello World" };
char *str = __unnamed__;
__unnamed__ has automatic storage and is defined as modifiable, so it can be modified through the pointer str, which is initialized to point to it.
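The equivalence can be checked side by side. This is a rough sketch, assuming both forms appear inside a function body; the function name literal_behaves_like_named_array and the variable named are invented for the example.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the compound literal behaves like an
   explicitly declared local array. */
int literal_behaves_like_named_array(void)
{
    char named[] = { "Hello World" };      /* explicit local array */
    char *a = named;
    char *b = (char[]){ "Hello World" };   /* unnamed local array */

    *a = 'B';   /* both writes are well defined */
    *b = 'B';

    /* Two distinct objects with the same contents after the same write. */
    return a != b
        && strcmp(a, b) == 0
        && strcmp(a, "Bello World") == 0;
}
```

Both arrays live only until the end of the enclosing block, so neither pointer should outlive the function.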
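In the case of char *str = "Hello World!"; the object pointed to by str is not supposed to be modified. Attempting to modify it has undefined behavior, and the bus error in the question is one possible outcome.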
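The C Standard could have defined such string literals as having type const char[] instead of char[], but this would have generated many warnings and errors in legacy code.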
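It is nevertheless advisable to pass the compiler a flag that makes such string literals implicitly const, and to make the whole project const-correct, i.e. to declare as const every pointer parameter that is not used to modify the object it points to. For gcc and clang, the command-line option is -Wwrite-strings. I also strongly advise enabling many more warnings and making them fatal with -Wall -W -Werror.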
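As an illustration of what const-correct code looks like under these flags, here is a small sketch. The function names count_char, capitalize_first and demo_const_correct are hypothetical, and the compile command in the comment just combines the options named above.

```c
#include <ctype.h>
#include <stddef.h>

/* Intended to compile cleanly with:
   cc -Wall -W -Werror -Wwrite-strings file.c */

/* Hypothetical read-only helper: it never writes through s, so the
   parameter is declared const. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (*s == c)
            ++n;
    }
    return n;
}

/* Hypothetical mutating helper: it writes through s, so the parameter
   stays a plain char *. */
void capitalize_first(char *s)
{
    *s = (char)toupper((unsigned char)*s);
}

/* Hypothetical demo: returns 1 when both helpers behave as expected. */
int demo_const_correct(void)
{
    const char *greeting = "hello world";  /* accepted under -Wwrite-strings */
    char copy[] = "hello world";           /* array copy, modifiable */

    capitalize_first(copy);
    /* capitalize_first(greeting); would be diagnosed: it discards const. */

    return copy[0] == 'H' && count_char(greeting, 'o') == 2;
}
```

With -Wwrite-strings, writing char *greeting = "hello world"; instead would draw a warning, which -Werror turns into a build failure.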