c_str() vs. data() when it comes to return type

Question

After C++11, I thought of c_str() and data() as equivalent.

C++17 introduces an overload for the latter that returns a non-const pointer (reference; I am not sure whether it has been fully updated w.r.t. C++17):

const CharT* data() const;    (1)   
CharT* data();                (2)   (since C++17)

c_str() only returns a pointer to const:

const CharT* c_str() const;

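Why do these two methods differ in C++17, especially when C++11 was the standard that made them equivalent? In other words, why did only one of them get a new overload while the other did not?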

Answer 1

The reason the data() member was given a new overload is explained in this paper on open-std.org.

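TL;DR of the paper: a non-const .data() member function was added to std::string to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C library function whose C-string parameters lack const qualification.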

Some relevant passages from the paper:

Abstract
Is std::string's lack of a non-const .data() member function an oversight or an intentional design based on pre-C++11 std::string semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const .data() member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.

Use Cases
C libraries occasionally include routines that have char * parameters. One example is the lpCommandLine parameter of the CreateProcess function in the Windows API. Because the data() member of std::string is const, it cannot be used to make std::string objects work with the lpCommandLine parameter. Developers are tempted to use .front() instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
  // etc.
} else {
  // handle error
}
Note that when programName is empty, the programName.front() expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...

if( !programName.empty() ) { 
  char emptyString[] = {'\0'};
  if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
    // etc.
  } else {
    // handle error
  }
}
If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
  char emptyString[] = {'\0'};
  if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
    // etc.
  } else {
    // handle error
  }
}
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
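
To make the use case above concrete without depending on the Windows API, here is a minimal C++17 sketch. The function legacy_length is a hypothetical stand-in for a C-style routine that takes char * but only reads from it; it is not a real library function, and call_legacy is likewise just an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a legacy C-style routine whose parameter lacks
// const qualification even though the routine only reads the characters.
std::size_t legacy_length(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') {
        ++n;
    }
    return n;
}

// Passes a std::string to the legacy routine through the non-const data()
// overload (C++17). Unlike &s.front(), this is well-defined for an empty
// string, because data() always points at a null-terminated array.
std::size_t call_legacy(std::string& s) {
    return legacy_length(s.data());
}
```

With this shape, no special case for the empty string is needed: call_legacy on an empty std::string simply hands the routine a pointer to the terminator.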

Answer 2

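It just depends on the semantics of "what you want to do with it". Generally speaking, std::string is sometimes used as a buffer, i.e. as a replacement for std::vector<char>. This can often be seen in boost::asio. In other words, it is an array of characters.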

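c_str(): strictly speaking, you are asking for a null-terminated string. In that sense, you should never modify the data and should never need the string as non-const.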

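data(): you may need the contents of the string as buffer data, possibly even as non-const. You may or may not need to modify the data, as long as doing so does not involve changing the length of the string.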
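
As a rough illustration of the buffer-style use described for data(), the following C++17 sketch lets std::snprintf write into a std::string's storage. The function name format_id and the 32-character capacity are assumptions made for this example only. The characters are written through data(), while the length change goes through resize(), i.e. through the string's own interface rather than through the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats "id=<number>" into a std::string that is used as a writable
// character buffer (requires the non-const data() overload from C++17).
std::string format_id(int id) {
    std::string buffer(32, '\0');  // 32 writable characters, ample for an int

    // snprintf writes at most buffer.size() bytes, terminator included,
    // so it never touches anything beyond the string's own characters.
    int written = std::snprintf(buffer.data(), buffer.size(), "id=%d", id);
    if (written < 0) {
        return std::string();      // formatting failed
    }

    // Shrink to the characters actually produced; the length is changed
    // through the string interface, not through the raw pointer.
    buffer.resize(static_cast<std::size_t>(written));
    return buffer;
}
```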

Answer 3

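The new overload was added by P0272R1 for C++17. Neither the paper itself nor the links in it discuss why only data was given a new overload while c_str was not. We can only speculate at this point (unless someone involved in the discussion chimes in), but I would like to offer the following points for consideration: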

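Even just adding the overload to data broke some code; keeping the change conservative was a way to minimize the negative impact.
The c_str function had so far been entirely identical to data and is effectively a "legacy" facility for interfacing with code that takes a "C string", i.e. an immutable, null-terminated char array. Since you can always replace c_str with data, there is no particular reason to add to this legacy interface.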

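I realize that the very motivation for P0272R1 was that there do exist legacy APIs that, erroneously or for C reasons, take only mutable pointers even though they do not mutate anything. Nonetheless, I suppose we do not want to add more to string's already massive API than is absolutely necessary.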

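One more point: as of C++17 you are now allowed to write to the null terminator, as long as you write the value zero. (Previously, it was UB to write anything to the null terminator.) A mutable c_str would create yet another entry point into this particular subtlety, and the fewer subtleties we have, the better.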
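
The terminator rule mentioned above can be sketched as follows. This is only an illustration, assuming a C++17 compiler; rewrite_terminator is a made-up name, and the write goes through the non-const operator[], whose C++17 wording permits modifying the element at size() only to the value charT().

```cpp
#include <cassert>
#include <string>

// Overwrites the terminator position with zero and reports whether the
// string is still null-terminated afterwards.
bool rewrite_terminator(std::string& s) {
    // Permitted since C++17 because the value written is char(), i.e. zero.
    s[s.size()] = '\0';

    // Writing any other value here, e.g. s[s.size()] = 'x';, would be
    // undefined behavior, so it is deliberately left out.
    return s.c_str()[s.size()] == '\0';
}
```

The contents and the length of the string are unchanged by this call; it only shows which write is permitted.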

Answer 4

The two member functions c_str and data of std::string exist because of the history of the std::string class.

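Until C++11, a std::string could be implemented as copy-on-write. The internal representation did not need to store any null terminator for the string. The member function c_str made sure that the returned string was null-terminated. The member function data simply returned a pointer to the stored string, which was not necessarily null-terminated. To make sure that changes to the string were noticed so that copy-on-write could work, both functions had to return a pointer to const data.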

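This all changed with C++11, when copy-on-write was no longer allowed for std::string. Since c_str was still required to deliver a null-terminated string, the null character is now always appended to the stored string. Otherwise, a call to c_str might need to change the stored data to make the string null-terminated, which would make c_str a non-const function. Since data delivers a pointer to the stored string, it usually has the same implementation as c_str. Both functions still exist for backward compatibility.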
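
The post-C++11 equivalence and the C++17 difference in return types can be checked with a short sketch like the one below. It assumes a C++17 compiler, and same_storage is a helper name invented for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// C++17: only data() has a non-const overload; c_str() always returns
// a pointer to const, whether or not the string itself is const.
static_assert(std::is_same_v<
    decltype(std::declval<std::string&>().data()), char*>);
static_assert(std::is_same_v<
    decltype(std::declval<const std::string&>().data()), const char*>);
static_assert(std::is_same_v<
    decltype(std::declval<std::string&>().c_str()), const char*>);

// Since C++11, both accessors point at the same null-terminated storage.
bool same_storage(const std::string& s) {
    return s.data() == s.c_str() && s.data()[s.size()] == '\0';
}
```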
