Is it OK to access past the size of a structure via member address, with enough space allocated?
Specifically, is the marked line in the code below OK?
#include <stdlib.h>
struct S{
    int a;
};
int main(){
    struct S *p;
    p = malloc(sizeof(struct S) + 1000);
    // This line:
    *(&(p->a) + 1) = 0;
}
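People have argued about this here, but no one has given a convincing explanation or reference.
Their case rests on a slightly different basis, but is essentially the same: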
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
typedef struct _pack{
    int64_t c;
} pack;
int main(){
    pack *p;
    char str[9] = "aaaaaaaa"; // Input
    size_t len = offsetof(pack, c) + (strlen(str) + 1);
    p = malloc(len);
    // This line, with similar intention:
    strcpy((char*)&(p->c), str);
    //            ^^^^^^^
}
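This is undefined behavior, because you are accessing something that is not an array (int a inside struct S) as if it were an array, and out of bounds at that.
The correct way to achieve what you want is to use an array without a size as the last struct member: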
#include <stdlib.h>
typedef struct S {
    int foo; // avoid flexible array being the only member
    int a[];
} S;
int main(){
    S *p = malloc(sizeof(*p) + 2*sizeof(int));
    p->a[0] = 0;
    p->a[1] = 42; // Perfectly legal.
}
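At least since C was standardized in 1989, the intent has been that implementations are allowed to check array bounds on array accesses.
The member p->a is an object of type int. C11 6.5.6p7 says: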
7 For the purposes of [additive operators] a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
Thus &(p->a) is a pointer to an int; but it also behaves as if it were a pointer to the first element of an array of length 1 whose element type is int.
Now 6.5.6p8 allows one to calculate &(p->a) + 1, which is a pointer to just past the end of the array, so there is no undefined behaviour. However, dereferencing such a pointer is invalid. Annex J.2 spells it out; the behavior is undefined when:
Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated (6.5.6).
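In the expression above there is only one array, the one that (as if) has exactly 1 element. If &(p->a) + 1 is dereferenced, the array of length 1 is accessed out of bounds and undefined behaviour occurs, i.e.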
behavior [...], for which [The C11] Standard imposes no requirements
Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
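That the most common behaviour is to ignore the situation completely, i.e. to behave as if the pointer simply referred to the memory location just after, does not mean that other kinds of behaviour would be unacceptable from the standard's point of view; the standard allows every imaginable and unimaginable outcome.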
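There have been claims that the C11 standard text is vaguely worded, that the committee's intent must have been to allow this, and that it used to be fine in earlier versions. That is not true. Read the relevant part of the committee's response to [Defect Report #017, dated 10 Dec 1992, against C89].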
Question 16
[...]
Response
For an array of arrays, the permitted pointer arithmetic in subclause 6.3.6, page 47, lines 12-40 is to be understood by interpreting the use of the word object as denoting the specific object determined directly by the pointer's type and value, not other objects related to that one by contiguity. Therefore, if an expression exceeds these permissions, the behavior is undefined. For example, the following code has undefined behavior:
int a[4][5];
a[1][7] = 0; /* undefined */
Some conforming implementations may choose to diagnose an array bounds violation, while others may choose to interpret such attempted accesses successfully with the obvious extended semantics.
(bolded emphasis mine)
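There is no reason why the same would not carry over to scalar members of structures, especially when 6.5.6p7 says that a pointer to one of them behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.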
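If you want to address consecutive structs, you can always take the pointer to the first member, cast it to a pointer to the struct, and advance that instead: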
*(int *)((S *)&(p->a) + 1) = 0;
The C standard guarantees, in §6.7.2.1/15:
[...] A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
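&(p->a) is equivalent to (int *)p. &(p->a) + 1 would be the address of an element of the second struct. In this case the struct has only one member, so there will not be any padding in the structure and this will work; but where there is padding, this code will break and lead to undefined behavior.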