ISO C95 array initialization guarantee

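I am trying to find documentation that confirms or refutes the claim that the declaration

char test[5]="";

initializes the buffer to all null characters, the same as

memset(test,'\0',sizeof(test));

but I have not been able to find (or understand/decipher) anything. I am specifically looking for details in the old specifications, though a C99 reference would also be fine. Thanks.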

From the C Standard (6.7.9 Initialization)

10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:

— if it has pointer type, it is initialized to a null pointer;

— if it has arithmetic type, it is initialized to (positive or unsigned) zero;

...

21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

This means that in the declaration

char test[5] = "";

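all five elements of the array are initialized to zero. The first element is initialized explicitly by the terminating zero of the string literal, and all the other elements are initialized implicitly, the same way as objects that have static storage duration.

This holds at least since the C99 standard.

Here is a demonstration program that shows different ways to zero-initialize a character array.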

#include <stdio.h>

int main(void) 
{
    enum { N = 5 };

    char s1[N] = "";
    char s2[N] = { "" };
    char s3[N] = { 0 };
    char s4[N] = { [0] = 0 };
    char s5[N] = { [N-1] = 0 };

    char * s[] = { s1, s2, s3, s4, s5 };

    for ( size_t i = 0; i < sizeof( s ) / sizeof( *s ); i++ )
    {
        for ( size_t j = 0; j < N; j++ )
        {
            printf( "%d", s[i][j] );
        }
        putchar( '\n' );
    }
    return 0;
}

The program output is

00000
00000
00000
00000
00000

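From a code-clarity point of view, if the array is meant to hold a string, then

char test[5] = "";

initializes the array with a zero-length string, and the remaining bytes do not matter. If they do matter, then the array is not really a string, and you should use

char test[5] = {0};

to make that clear.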

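This is not a complete answer to your question, but it can help clear up misinformation in the other answers.

In ANSI C89, the relevant standard text is (section 3.5.7):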

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the members of the array.

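This only specifies the initialization of the array elements that correspond to the string literal. The trailing array elements are therefore not explicitly initialized, and so have indeterminate values.

There is also this paragraph: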

If there are fewer initializers in a list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

But this does not apply, because we are not initializing from a list ("list" means a brace-enclosed list, not a string literal).


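In C90 (I'm not sure whether I can legally link to it), the sections were renumbered, so the section containing these paragraphs became 6.5.7. The wording of the latter paragraph also changed: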

If there are fewer initializers in a brace-enclosed list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.


In C90 TC1 (HTML, PDF), the above is unchanged.

However, Defect Report 60 raised the key question:

When an array of char (or wchar_t) is initialized with a string literal that contains fewer characters than the array, are the remaining elements of the array initialized?

Subclause 6.5.7 Initialization, page 72, only says (emphasis mine):

If there are fewer initializers in a brace-enclosed list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

Correction

In subclause 6.5.7, page 72, the penultimate paragraph of Semantics (before Examples), add after the comma:

or fewer characters in a string literal or wide string literal used to initialize an array of known size, and elements of character or wchar_t type

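It seems that the zero-initialization in the proposed fix was indeed what the standard's authors intended, because C90 TC2 contains the same key change: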

Page 72

In subclause 6.5.7, page 72, the penultimate paragraph of Semantics (before Examples), add after the comma:

or fewer characters in a string literal or wide string literal used to initialize an array of known size, and elements of character or wchar_t type

giving us:

If there are fewer initializers in a brace-enclosed list than there are members of an aggregate, or fewer characters in a string literal or wide string literal used to initialize an array of known size, and elements of character or wchar_t type the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

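Note that TC1 is dated 1994, although it was published in 1995. TC2 is dated 1996. Puzzlingly, DR60 is dated 16 July 1993 and so predates TC1. Perhaps work on TC1 was already too far advanced by then to take on new defect reports, and there was a backlog? In any case, TC2 is mostly just a set of corrections in response to defect reports, which suggests that the change first appeared there rather than in C95, and that zero-initialization of the characters after the null terminator was what the C89 authors intended.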


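In ISO C99 (the original version, without technical corrigenda), the paragraph is renumbered as 6.7.8/21 and changes again. The mentions of "wide string literal" and "elements of character or wchar_t type" have been removed: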

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

which means that the trailing array elements are initialized to null bytes.

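(Note: the original C99 may be copyrighted material, so I cannot post a link to it above. That is what it says, though. Here is a link to the last freely available working draft. There were two more drafts after it, but the WG14 website has removed them. Nevertheless, this wording is in the N843 working draft, and it still appears in the later C99 that incorporates TC3.)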


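I could not find any free copy of C95 (ISO/IEC 9899:1990/AMD1:1995). I therefore cannot say exactly at which point between C89 and C99 the "wide string literal" and "wchar_t" changes were made. In addition, the C99 Rationale document does not mention the topic.

Of course, the C99 behaviour may have been what the C89 authors intended, with the missing text being an oversight. But since no document of any kind says so, we cannot conclude anything, and there may have been compilers at the time that did not initialize the trailing elements.

Hopefully someone else who has these documents (or is inclined to buy them from the ISO store!) can provide a precise answer.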

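Quick summary

C99 and later guarantee that the remaining characters are initialized to zero. C89/C90/C95 do not make this guarantee and do not specify the values of the remaining characters. This was probably an unintentional oversight, and I would guess that most or all pre-C99 compilers zero-initialize the remaining characters anyway. If you are using a conforming C99 or later compiler, zero-initialization is guaranteed.

Gory details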

char test[5]="";

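Because of a defect in the C89/C90 standard, this is only guaranteed to initialize test[0] to '\0'; the other elements of test are unspecified.

The C95 amendment did not address this issue.

The C99 standard corrected the defect, requiring test to be initialized to all zeros.

Another example: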

char foo[5] = "foo";

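In C89/C90 and C95, the language guarantees that foo[0]=='f', foo[1]=='o', foo[2]=='o', and foo[3]=='\0', but says nothing about the value of foo[4]. In C99 and later, it is guaranteed to be initialized as if you had written:

char foo[5] = { 'f', 'o', 'o', '\0' };

which, in all versions of the C standard, guarantees that foo[4]=='\0'.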

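References

The 1989 ANSI C standard and the 1990 ISO C standard are equivalent, differing only in non-normative introductory material and in the renumbering of sections. The 1995 amendment updated the standard but did not affect array initialization.

Section 6.5.7 of the 1990 ISO C standard says: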

An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

and later in the same section:

If there are fewer initializers in a brace-enclosed list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

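This specifies that the trailing members are initialized to zero for a brace-enclosed list, but it makes no equivalent statement for a string literal initializer. (I suspect this was an unintentional oversight, and that most compilers fill the remaining elements with zeros anyway, since they already have to do so in some cases.)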

Each edition of the C standard has a set of defect reports associated with it:

C90 Defect Report #060, submitted in 1993 by P.J. Plauger and/or Larry Jones, raised this question:

When an array of char (or wchar_t) is initialized with a string literal that contains fewer characters than the array, are the remaining elements of the array initialized?
Subclause 6.5.7 Initialization, page 72, only says (emphasis mine):

If there are fewer initializers in a brace-enclosed list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

The response to that defect report led to the revised wording in paragraph 21 of section 6.7.8 of the C99 standard (emphasis added):

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.