Where is the null character in a fixed-length empty string?
I was reading some C code and got curious. Say we have the following code:
char text[10] = "";
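Where does the C compiler put the null character?
I can think of 3 possible cases:
- At the beginning, followed by 9 characters of whatever used to be in memory
- At the end, so 9 characters of garbage and then a trailing '\0'
- It fills the array completely with 10 '\0' characters
The question is whether, depending on the case, it is necessary to add the trailing '\0' when doing a strncpy. In cases 2 and 3 it is not strictly necessary, but a good idea; in case 1 it is absolutely necessary.
Which is it?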
The string literal "" has the character array type char[1] in C and const char[1] in C++.
You can picture it this way. In C:
char no_name[] = { '\0' };
or, in C++:
const char no_name[] = { '\0' };
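When a string literal is used to initialize a character array, all of its characters are used as initializers. So for this declaration
char text[10] = "";
you in fact have
char text[10] = { '\0' };
All other elements of the array that have no corresponding initializer (that is, everything except the first element, text[0]) are then initialized to zero.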
From the C Standard (6.7.9 Initialization):
14 An array of character type may be initialized by a character string
literal or UTF−8 string literal, optionally enclosed in braces.
Successive bytes of the string literal (including the terminating null
character if there is room or if the array is of unknown size)
initialize the elements of the array.
and
21 If there are fewer initializers in a brace-enclosed list than there
are elements or members of an aggregate, or fewer characters in a
string literal used to initialize an array of known size than there
are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage
duration
and finally
10 If an object that has automatic storage duration is not initialized
explicitly, its value is indeterminate. If an object that has static
or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively)
according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized
(recursively) according to these rules, and any padding is initialized
to zero bits;
The C++ Standard contains similar wording.
Take into account that in C you may also write, for example,
char text[5] = "Hello";
         ^^^
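In this case the character array will not have a terminating zero, because there is no room for it. :) It is the same as if you had defined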
char text[5] = { 'H', 'e', 'l', 'l', 'o' };
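In your initialization, the text array is filled with null bytes (i.e. option #3).
char text[10] = "";
is equivalent to:
char text[10] = { '\0' };
because the first element of text is explicitly initialized to zero and the remaining elements are implicitly initialized to zero, as required by C11, 6.7.9 Initialization, paragraph 21: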
If there are fewer initializers in a brace-enclosed list than there
are elements or members of an aggregate, or fewer characters in a
string literal used to initialize an array of known size than there
are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage
duration.
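Quoting N1256 (roughly C99), since there are no relevant changes in the language before or after it:
6.7.8 Initialization
14 An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
"" is a string literal consisting of one character (its terminating null character), and this paragraph says that this one character is used to initialize the elements of the array, which means the first element is initialized to zero. Nothing here says what happens to the rest of the array, but there is: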
21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
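This paragraph says that the remaining elements are initialized the same as objects with static storage duration, which means the rest of the array is initialized to zero as well.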
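Also worth mentioning here is the "if there is room" in p14:
In C, char a[5] = "hello";
is perfectly valid too, and for this case you might also want to ask where the compiler puts the null character. The answer here is: it doesn't.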