Does std::string need to store its characters in a contiguous piece of memory?

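I know that in C++98, neither std::basic_string<> nor std::vector<> was required to use contiguous storage. Once this was pointed out, it was seen as an oversight for std::vector<> and, if I remember correctly, it was fixed in C++03.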

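I seem to remember reading about discussions on requiring std::basic_string<> to use contiguous storage back when C++11 was still called C++0x, but I didn't follow those discussions closely at the time, and I am still restricted to C++03 at work, so I am not sure what became of it.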

So is std::basic_string<> required to use contiguous storage? (If so, which version of the standard required it first?)

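In case you wonder: This is important if you have code passing the result of &str[0] to a function expecting a contiguous piece of memory to write to. (I know about str.data(), but for obvious reasons old code doesn't use it.)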

From the C++11 standard, basic_string, 21.4.1/5:

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
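The identity in that paragraph can be checked directly. The following is only a minimal sketch: satisfies_contiguity_identity and demo_contiguity are hypothetical names, not part of the standard library, and a passing check on one implementation demonstrates nothing beyond that implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: tests the quoted identity
//   &*(s.begin() + n) == &*s.begin() + n
// for every n with 0 <= n < s.size().
bool satisfies_contiguity_identity(const std::string& s) {
    if (s.empty()) {
        return true;  // no valid n exists, and begin() must not be dereferenced
    }
    const char* first = &*s.begin();
    for (std::string::size_type n = 0; n < s.size(); ++n) {
        const auto offset = static_cast<std::string::difference_type>(n);
        if (&*(s.begin() + offset) != first + n) {
            return false;
        }
    }
    return true;
}

// Usage sketch: on a C++11-conforming implementation neither assert fires.
void demo_contiguity() {
    assert(satisfies_contiguity_identity("hello, world"));
    assert(satisfies_contiguity_identity(std::string(1000, 'x')));
}
```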

According to the draft standard N4527, 21.4/3 Class template basic_string [basic.string]:

A basic_string is a contiguous container (23.2.1).

In C++03 there was no guarantee that the elements of a string are stored contiguously. [basic.string] read:

  1. For a char-like type charT, the class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects (clause 21). The first element of the sequence is at position zero. Such a sequence is also called a “string” if the given char-like type is clear from context. In the rest of this clause, charT denotes such a given char-like type. Storage for the string is allocated and freed as necessary by the member functions of class basic_string, via the Allocator class passed as template parameter. Allocator::value_type shall be the same as charT.
  2. The class template basic_string conforms to the requirements of a Sequence, as specified in (23.1.1). Additionally, because the iterators supported by basic_string are random access iterators (24.1.5), basic_string conforms to the requirements of a Reversible Container, as specified in (23.1).
  3. In all cases, size() <= capacity().

Then in C++17 they changed it as well:

  1. The class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects with the first element of the sequence at position zero. Such a sequence is also called a “string” if the type of the char-like objects that it holds is clear from context. In the rest of this Clause, the type of the char-like objects held in a basic_string object is designated by charT.
  2. The member functions of basic_string use an object of the Allocator class passed as a template parameter to allocate and free storage for the contained char-like objects.
  3. A basic_string is a contiguous container (23.2.1).
  4. In all cases, size() <= capacity().

Emphasis mine (on item 3).

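So in C++03 it was not guaranteed. The [basic.string] paragraph only says so explicitly from C++17 on, but the C++11 requirement quoted above (21.4.1/5) already mandates contiguous storage.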

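In practice this lack of a guarantee was pretty much moot because of the constraints imposed by std::string::data: calling it gives you a contiguous array of the characters in the string. So unless an implementation built that array on demand, and did so in constant time, the string had to be contiguous anyway.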
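The pattern the question asks about, passing &str[0] to a function that writes into contiguous memory, might look like the sketch below. legacy_fill is a hypothetical stand-in for an old C-style API and read_via_legacy_api is an illustrative name. The code assumes a C++11 (or later) library, where contiguity and &str[0] == str.data() are guaranteed for a non-empty string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a legacy C-style API that writes `len`
// bytes into the buffer starting at `dst`.
void legacy_fill(char* dst, std::size_t len) {
    std::memset(dst, 'x', len);
}

// Sizes the string first, then lets the legacy function write into it.
std::string read_via_legacy_api(std::size_t len) {
    std::string str(len, '\0');  // str.size() == len writable characters
    if (!str.empty()) {          // skip &str[0] on an empty string (UB in C++03)
        // Only valid because the characters are stored contiguously.
        legacy_fill(&str[0], str.size());
        assert(&str[0] == str.data());
    }
    return str;
}
```

The string is resized before the call so that every byte the function writes lies inside [0, size()); writing past size() would be undefined behavior regardless of how the storage is laid out.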


In case you wonder: This is important if you have code passing the result of &str[0] to a function expecting a contiguous piece of memory to write to. (I know about str.data(), but for obvious reasons old code doesn't use it.)

The behavior of operator[] has also changed. In C++03 we had:

Returns: If pos < size(), returns data()[pos]. Otherwise, if pos == size(), the const version returns charT(). Otherwise, the behavior is undefined.

So if you tried &s[0] when s was empty, only the const version was guaranteed to have defined behavior. In C++11 they changed it to:

Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.

So now both the const and non-const versions have defined behavior if you try &s[0] when s is empty.
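A small sketch of the empty-string case, assuming a C++11 or later library. first_char_of_empty is a hypothetical name; the non-const overload is exactly the call that the C++03 wording left undefined.

```cpp
#include <cassert>
#include <string>

// Non-const overload: under the C++11 wording, s[0] on an empty string
// refers to an object of type char with value char().
char first_char_of_empty(std::string& s) {
    assert(s.empty());      // this sketch only covers the pos == size() case
    char* p = &s[0];        // defined since C++11, undefined in C++03
    return *p;              // reading is fine; modifying *p would be undefined
}

// Const overload: this case was already defined in C++03.
char first_char_of_empty(const std::string& s) {
    assert(s.empty());
    const char* p = &s[0];
    return *p;
}
```

Both overloads return '\0' for an empty std::string. The pointer they obtain may be read but must not be written through.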