Is it a strict aliasing violation to alias a struct as its first member?
Sample code:
struct S { int x; };
int func()
{
S s{2};
return (int &)s; // Equivalent to *reinterpret_cast<int *>(&s)
}
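I believe this is common and considered acceptable. The standard does guarantee that there is no initial padding in the struct. However, this case is not listed in the strict aliasing rule (C++17 [basic.lval]/11):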
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
- (11.1) the dynamic type of the object,
- (11.2) a cv-qualified version of the dynamic type of the object,
- (11.3) a type similar (as defined in 7.5) to the dynamic type of the object,
- (11.4) a type that is the signed or unsigned type corresponding to the dynamic type of the object,
- (11.5) a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
- (11.6) an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
- (11.7) a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
- (11.8) a char, unsigned char, or std::byte type.
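It seems clear that the object s is having its stored value accessed.
The types listed in the bullet points are the type of the glvalue doing the access, not the type of the object being accessed. In this code the glvalue type is int, which is not an aggregate or union type, ruling out 11.6.
My question is: is this code correct, and if so, under which of the above bullet points is it allowed?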
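For contrast, here is a minimal sketch of two accesses that the quoted bullets do cover: reading an object's bytes through unsigned char (11.8) and reading an int through its unsigned counterpart (11.4). The helper names bytes_match_first_member and read_as_unsigned are made up for illustration and do not come from the question.

```cpp
#include <cstddef>
#include <cstring>

struct S { int x; };

// 11.8: any object may be inspected through an unsigned char glvalue.
// Returns true if the first sizeof(int) bytes of s equal the bytes of s.x.
bool bytes_match_first_member(const S& s)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&s);
    unsigned char expected[sizeof(int)];
    std::memcpy(expected, &s.x, sizeof(int)); // byte copy of the member
    for (std::size_t i = 0; i < sizeof(int); ++i)
        if (bytes[i] != expected[i])
            return false;
    return true;
}

// 11.4: an int object may be read through the corresponding unsigned type.
unsigned read_as_unsigned(const int& value)
{
    return *reinterpret_cast<const unsigned*>(&value);
}
```

Neither helper reads the S object through an int glvalue, which is the case the question is actually about.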
I think it's in expr.reinterpret.cast#11:
A glvalue expression of type T1, designating an object x, can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. The result is that of *reinterpret_cast<T2 *>(p) where p is a pointer to x of type “pointer to T1”. No temporary is created, no copy is made, and no constructors or conversion functions are called [1].
[1] This is sometimes referred to as a type pun when the result refers to the same object as the source glvalue.
In support of @M.M's answer about pointer-interconvertible objects:
From cppreference:
Assuming that alignment requirements are met, a reinterpret_cast does not change the value of a pointer outside of a few limited cases dealing with pointer-interconvertible objects:
struct S { int a; } s;
int* p = reinterpret_cast<int*>(&s); // value of p is "pointer to s.a" because s.a
// and s are pointer-interconvertible
*p = 2; // s.a is also 2
versus
struct S { int a; };
S s{2};
int i = (int &)s; // Equivalent to *reinterpret_cast<int *>(&s)
// i is a copy; modifying i does not change s.a
The behaviour of the cast comes down to [expr.static.cast]/13:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
The definition of pointer-interconvertible is:
Two objects a and b are pointer-interconvertible if:
- they are the same object, or
- one is a union object and the other is a non-static data member of that object, or
- one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object, or
- there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
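So in the original code, s and s.x are pointer-interconvertible, and it follows that (int &)s actually designates s.x.
So, in the strict aliasing rule, the object whose stored value is being accessed is s.x and not s. There is therefore no problem, and the code is correct.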
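The reasoning above can be sketched as follows. The static_asserts check the two preconditions the argument relies on (S is standard-layout and x sits at offset 0). The function names first_member_via_cast and cast_designates_member are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <type_traits>

struct S { int x; };

// Preconditions for s and s.x to be pointer-interconvertible.
static_assert(std::is_standard_layout<S>::value, "S must be standard-layout");
static_assert(offsetof(S, x) == 0, "x must be the first member");

// Modifies s.x through a reference obtained by casting s itself.
int first_member_via_cast()
{
    S s{2};
    int& r = reinterpret_cast<int&>(s); // designates s.x per the argument above
    r += 1;                             // writes to s.x
    return s.x;                         // returns 3
}

// The pointer produced by the cast compares equal to &s.x.
bool cast_designates_member()
{
    S s{2};
    return reinterpret_cast<int*>(&s) == &s.x;
}
```

If S stopped being standard-layout (for example by gaining a virtual function), the first static_assert would fail and the argument would no longer apply.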
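The cited rule is derived from a similar rule in C89 which would be nonsensical as written unless one stretches the meaning of the word "by", or recognizes what "Undefined Behavior" meant when C89 was written. Given something like struct S {unsigned dat[10];}s;, the statement s.dat[1]++; would clearly modify the stored value of s, but the only lvalue of type struct S in that expression is used solely to produce a value of type unsigned*. The only lvalue used to modify any object is of type unsigned.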
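As I see it, there are two related ways of resolving this issue: (1) recognize that the authors of the Standard wanted to allow situations where an lvalue of one type is visibly derived from one of another type, but did not want to get hung up on the details of which forms of visible derivation must be taken into account, especially since the range of cases compilers would need to recognize varies considerably with the styles of optimization they perform and the tasks for which they are being used; (2) recognize that the authors of the Standard had no reason to think it should matter whether the Standard actually required that a particular construct be processed usefully, as long as it was clear to everyone that there was no reason to do otherwise.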
I don't think there is a consensus among Committee members about whether a compiler given something like:
struct foo {int ct; int *dat;} it;
void test(void)
{
for (int i=0; i < it.ct; i++)
it.dat[i] = 0;
}
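should be required to ensure that, e.g., after it.ct = 1234; it.dat = &it.ct;, a call to test(); would zero it.ct and have no other effect. Parts of the Rationale suggest that at least some Committee members would have expected so, but the omission of any rule that would allow an object of structure type to be accessed using an arbitrary lvalue of member type suggests otherwise. The C Standard has never really resolved this issue, and the C++ Standard cleans things up somewhat but does not really solve it either.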
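To make the scenario concrete, here is a small C++ sketch of the self-aliasing setup. The function zero_entries corresponds to test() in the snippet above (renamed only for this illustration), and run_self_aliasing_case is a hypothetical driver. Under the pointer-interconvertibility reading given earlier in this thread, the store through it.dat[0] accesses the int object it.ct through an int lvalue, so one would expect the loop to stop after a single iteration. That expectation is an assumption about how the rules apply here, not something the Standard's text settles for C.

```cpp
struct foo { int ct; int* dat; };

foo it;

// Same loop as test() in the text: zero the first it.ct entries of it.dat.
void zero_entries()
{
    for (int i = 0; i < it.ct; i++)
        it.dat[i] = 0;
}

// Points it.dat at it.ct itself, then runs the loop.
int run_self_aliasing_case()
{
    it.ct = 1234;
    it.dat = &it.ct;  // dat[0] now aliases ct
    zero_entries();   // first store sets it.ct to 0, so the loop ends
    return it.ct;
}
```

A compiler that assumed the store to it.dat[i] could not change it.ct would hoist the load of it.ct out of the loop and write 1234 ints, which is exactly the behaviour the answer says the rules do not clearly rule in or out.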