Clarification on behaviour of comparing invalid pointers

I am a bit confused by the language of the standard (specifically N4868). As I understand it, 6.8.3 [basic.compound] classifies every pointer value as one of:

(3.1) — a pointer to an object or function (the pointer is said to point to the object or function), or

(3.2) — a pointer past the end of an object (7.6.6), or

(3.3) — the null pointer value for that type, or

(3.4) — an invalid pointer value

Invalid pointers are elaborated on further in Note 2:

[Note 2 : A pointer past the end of an object (7.6.6) is not considered to point to an unrelated object of the object’s type that might be located at that address. A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration; see 6.7.5. — end note]

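My first question: in the next clause, a pointer 'past the end of the array' is considered valid. Am I to understand that this is exactly one past the end of the array or that all pointers past the end of the array are valid? The note above would have me believe that a pointer that does not point to an instantiated object is automatically invalid, since it is pointing to potentially unallocated memory. So perhaps only the pointer immediately past the end of the array is valid?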

For purposes of pointer arithmetic (7.6.6) and comparison (7.6.9, 7.6.10), a pointer past the end of the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n of x and an object of type T that is not an array element is considered to belong to an array with one element of type T. The value representation of pointer types is implementation-defined. Pointers to layout-compatible types shall have the same value representation and alignment requirements (6.7.6).

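The reason for all of this is that the standard is very specific about which relational comparisons of pointers it considers defined. Indeed, 7.6.9 [expr.rel] says: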

(4.1) — If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.

(4.2) — If two pointers point to different non-static data members of the same object, or to subobjects of such members, recursively, the pointer to the later declared member is required to compare greater provided the two members have the same access control (11.9), neither member is a subobject of zero size, and their class is not a union.

(4.3) — Otherwise, neither pointer is required to compare greater than the other.

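Which to me would suggest that if a pointer does not point to an array element or an object that has not reached the end of its lifetime, it is invalid. So, suppose I wanted to write a function that checks whether a pointer is within the bounds of an array, i.e.: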

template<class Type>
bool is_bounded(Type* arr_first, Type* arr_last, Type* elem) {
    return (arr_first <= elem) && (elem < arr_last);
}

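Would this be undefined behaviour if elem is not in the interval [arr_first, arr_last], since there is no guarantee elem points to anything? Which in turn invalidates the existence of this function, since I can't guarantee its (expected) false results are defined?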

P.S. I apologise in advance if the wording of the question is confusing; if anyone asks, I will do my best to elaborate or clarify.


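Edit: I want to clarify why the details here matter to me (thanks, everyone, for the help). I am currently learning how to write good-quality, well-defined containers, and in my container's iterators I want to do some debug-only checks to make sure users do not accidentally invalidate an iterator. The problem came up when considering the operator+= overload for a contiguous iterator.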

Container& operator+=(difference_type n) {
    assert(_check_valid(n));
    _ptr += n;
    return *this;
}

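Here I have two options. _check_valid(n) could use is_bounded(_first, _last, _curr + n), where _first, _last and _curr are pointers to the first element of the array, the last element of the array, and the value the iterator currently refers to; or it could do something along these lines: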

bool _check_valid(difference_type n) {
    difference_type size  = _last - _first;
    difference_type index = _curr - _first;

    return (index + n) >= 0 && (index + n) <= size;
}

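As far as I can tell, the latter has no undefined behaviour, assuming _curr is in the closed interval [_first, _last] (which is easier to enforce), whereas the former can become undefined if n is too large. However, I did not want to over-engineer things unnecessarily and would have preferred a simpler function like is_bounded. I now see that the former is indeed not the right approach. Thanks.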

Am I to understand that this is exactly one past the end of the array or that all pointers past the end of the array are valid.

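Only the pointer exactly one past the end of the array (or one past the object) is valid, although you cannot dereference such a pointer. Pointers beyond that cannot even be constructed, because pointer arithmetic has undefined behaviour past that point.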
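As a minimal sketch of that rule (the helper names past_the_end and count_by_walking are hypothetical, introduced only for illustration), the one-past-the-end pointer can be formed, compared and used as a loop bound, but never dereferenced, and nothing further out may be formed at all:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: forms the one-past-the-end pointer of a built-in array.
// Forming and comparing it is fine; dereferencing it is not.
template <class T, std::size_t N>
T* past_the_end(T (&arr)[N]) {
    return arr + N;
}

// Hypothetical helper: counts the elements by walking up to the
// past-the-end pointer, which only ever serves as a sentinel.
template <class T, std::size_t N>
std::size_t count_by_walking(T (&arr)[N]) {
    T* const end = past_the_end(arr);
    std::size_t count = 0;
    for (T* p = arr; p != end; ++p) {
        // p and end both belong to the same array (end being its
        // hypothetical element N), so this relational comparison is defined.
        assert(p < end);
        ++count;
    }
    return count;
}

// By contrast, an expression such as `arr + N + 1` has undefined behaviour
// as soon as it is evaluated, even if the result is never dereferenced.
```

For an array `int a[4]`, `past_the_end(a) - a` is 4 and `count_by_walking(a)` returns 4.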

The note above would have me believe that a pointer that does not point to an instantiated object is automatically invalid, since it is pointing to potentially unallocated memory.

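A pointer does not need to point to an actual object if it is a one-past-the-end pointer. Such a pointer cannot be dereferenced, however. Pointers to an array or object, including the one-past-the-end pointer, become invalid once the storage duration of that array or object ends.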

Which to me would suggest that if a pointer does not point to an array element or an object that has not reached the end of its lifetime, it is invalid.

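For the purposes of the quoted clause, a one-past-the-end pointer is treated as pointing to a hypothetical element of the (hypothetical) array; see the [basic.compound] text quoted in the question.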

Would this be undefined behaviour if elem is not in the interval [arr_first, arr_last] since there is no guarantee elem points to anything?

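Assuming arr_first points to the first element of the array and arr_last to the last element, the result of the comparison is unspecified if elem does not point into the range arr_first to arr_last+1 (inclusive).

That does not mean it has undefined behaviour, only that the function's return value may be completely arbitrary.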

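However, merely trying to form a pointer such as arr_last+2 in order to pass it to the function already has undefined behaviour by itself, because pointer arithmetic is only defined within the bounds of the array (or one past its end).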
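If a non-arbitrary answer is wanted for pointers that were validly formed but may not belong to the array, one option is to compare through std::less and std::less_equal, whose pointer specialisations are specified to follow an implementation-defined strict total order. The sketch below is a swapped-in technique rather than anything from the question: the name is_bounded_total is hypothetical, and last is assumed to be the one-past-the-end pointer, as the half-open test in is_bounded suggests. Two caveats apply. The standard does not promise that a pointer to an unrelated object sorts outside the array's range, although it does on common flat-memory implementations. And this does nothing for arr_last+2, which is undefined before the function is ever called.

```cpp
#include <cassert>
#include <functional>

// Hypothetical variant of is_bounded. `last` is assumed to be the
// one-past-the-end pointer, so the checked range is [first, last).
template <class Type>
bool is_bounded_total(const Type* first, const Type* last, const Type* elem) {
    // The pointer specialisations of these function objects use an
    // implementation-defined strict total order that is consistent with
    // the built-in operators wherever those are defined.
    const std::less_equal<const Type*> le{};
    const std::less<const Type*> lt{};

    assert(le(first, last));  // precondition: the range is not reversed
    return le(first, elem) && lt(elem, last);
}
```

For `int a[4]`, `is_bounded_total(a, a + 4, a + 2)` is true and `is_bounded_total(a, a + 4, a + 4)` is false; both results follow from the ordering within a single array.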

Which in turn invalidates the existence of this function since I can't guarantee its (expected) false results are defined?

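The function as written is technically useless, although I suppose it will more or less work as expected in practice most of the time. A better approach is to validate indices into the array rather than pointers.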
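A rough sketch of the index-based approach, under some assumptions: checked_iterator, can_advance and index are hypothetical names, and last is taken to be the one-past-the-end pointer so that last - first is the element count. The check compares n against the room left on each side instead of computing index + n, which also sidesteps signed overflow for extreme values of n.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical debug-checked contiguous iterator that stores an index
// instead of a raw current pointer.
template <class T>
class checked_iterator {
public:
    using difference_type = std::ptrdiff_t;

    // Assumes first <= curr <= last, all within (or one past) the same array.
    checked_iterator(T* first, T* last, T* curr)
        : _first(first), _size(last - first), _index(curr - first) {}

    // True when moving by n keeps the position inside [0, size].
    // Only integers are compared, so no out-of-range pointer is ever formed.
    bool can_advance(difference_type n) const {
        return n >= -_index && n <= _size - _index;
    }

    checked_iterator& operator+=(difference_type n) {
        assert(can_advance(n));  // debug-only check, compiled out under NDEBUG
        _index += n;
        return *this;
    }

    T& operator*() const {
        assert(_index < _size);  // the past-the-end position is not dereferenceable
        return _first[_index];
    }

    difference_type index() const { return _index; }

private:
    T* _first;
    difference_type _size;
    difference_type _index;
};
```

With `int a[4] = {10, 20, 30, 40}` and `checked_iterator<int> it(a, a + 4, a)`, `it.can_advance(4)` holds and `it.can_advance(5)` does not, and neither call forms a pointer outside the array.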