Why does calling std::string.c_str() on a function that returns a string not work?
I have the following code:
#include <iostream>
#include <string>

std::string getString() {
    std::string str("hello");
    return str;
}

int main() {
    const char* cStr = getString().c_str();
    std::cout << cStr << std::endl; // this prints garbage
}
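I thought that getString() would return a copy of str (getString() returns by value); thus, the copy of str would stay "alive" in main() until main() returns. That would make cStr point to a valid memory location: the underlying char[] or char* (or whatever) of the returned copy of str, which stays in main().
However, this is obviously not the case, as the program outputs garbage. So the question is: when is str destroyed, and why?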
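"getString() would return a copy of str (getString() returns by value);"
That's right.
"thus, the copy of str would stay 'alive' in main() until main() returns."
No. The returned copy is a temporary std::string, which is destroyed at the end of the full expression that created it (here, the declaration of cStr), i.e. before std::cout << cStr << std::endl; runs. cStr then becomes dangling, and dereferencing it is undefined behavior (UB); anything can happen.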
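The destruction order described above can be observed with a small tracing type. This is only a sketch: Tracer, makeTracer and the two demo functions are hypothetical names introduced for illustration, and it assumes a compiler that elides the copy on return (guaranteed from C++17 on). The first function mirrors the pattern in the question; the second binds the temporary to a const reference, which extends its lifetime to the end of the enclosing block.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): records its own
// construction and destruction in a log owned by the caller.
struct Tracer {
    std::vector<std::string>* log;
    explicit Tracer(std::vector<std::string>* l) : log(l) { log->push_back("construct"); }
    ~Tracer() { log->push_back("destroy"); }
    int value() const { return 42; }
};

// Returns by value, like getString() in the question.
Tracer makeTracer(std::vector<std::string>* log) { return Tracer(log); }

// Same shape as `getString().c_str()`: the temporary is destroyed at the
// end of the full expression, before the next statement runs.
std::vector<std::string> temporaryDiesAtEndOfStatement() {
    std::vector<std::string> log;
    int v = makeTracer(&log).value();  // temporary destroyed at the ';'
    (void)v;
    log.push_back("next statement");
    return log;  // construct, destroy, next statement
}

// Binding the temporary to a const reference keeps it alive until the
// reference goes out of scope.
std::vector<std::string> referenceExtendsLifetime() {
    std::vector<std::string> log;
    {
        const Tracer& t = makeTracer(&log);  // lifetime extended to end of block
        (void)t;
        log.push_back("next statement");
    }  // temporary destroyed here
    return log;  // construct, next statement, destroy
}
```

Calling each function and printing the returned log shows "destroy" moving from before "next statement" to after it.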
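You can copy the returned temporary into a named variable, or bind it to a const lvalue reference or an rvalue reference (the lifetime of the temporary is then extended until the reference goes out of scope). For example: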
std::string s1 = getString(); // s1 will be copy initialized from the temporary
const char* cStr1 = s1.c_str();
std::cout << cStr1 << std::endl; // safe
const std::string& s2 = getString(); // lifetime of temporary will be extended when bound to a const lvalue-reference
const char* cStr2 = s2.c_str();
std::cout << cStr2 << std::endl; // safe
std::string&& s3 = getString(); // similar to the above: binding to an rvalue reference also extends the lifetime
const char* cStr3 = s3.c_str();
std::cout << cStr3 << std::endl; // safe
Or use the pointer before the temporary object is destroyed. For example:
std::cout << getString().c_str() << std::endl; // temporary gets destroyed after the full expression
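The same rule covers any use of the pointer that finishes inside the full expression, not only streaming it to std::cout. A minimal sketch, assuming a getString() like the one in the question; lengthWithinFullExpression and copyWithinFullExpression are hypothetical names used only for illustration:

```cpp
#include <cstring>
#include <string>

std::string getString() { return std::string("hello"); }

// Safe: std::strlen reads the buffer while the temporary still exists;
// the temporary is destroyed only after the full expression completes.
std::size_t lengthWithinFullExpression() {
    return std::strlen(getString().c_str());
}

// Safe: the characters are copied into a new owning std::string before
// the temporary is destroyed, so no pointer into the temporary escapes.
std::string copyWithinFullExpression() {
    return std::string(getString().c_str());
}
```

What remains unsafe is storing the raw pointer in a variable and using it in a later statement.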
Here is the explanation from [The.C++.Programming.Language.Special.Edition] 10.4.10 Temporary Objects [class.temp]:
Unless bound to a reference or used to initialize a named object, a
temporary object is destroyed at the end of the full expression in
which it was created. A full expression is an expression that is
not a subexpression of some other expression.
The standard string class has a member function c_str() that
returns a C-style, zero-terminated array of characters (§3.5.1, §20.4.1). Also, the operator + is defined to mean string concatenation.
These are very useful facilities for strings. However, in combination they can cause obscure problems.
For example:
void f(string& s1, string& s2, string& s3)
{
    const char* cs = (s1 + s2).c_str();
    cout << cs;
    if (strlen(cs=(s2+s3).c_str())<8 && cs[0]=='a') {
        // cs used here
    }
}
Probably, your first reaction is "but don’t do that," and I agree.
However, such code does get written, so it is worth knowing how it is
interpreted.
A temporary object of class string is created to hold s1 + s2.
Next, a pointer to a C-style string is extracted from that object. Then
– at the end of the expression – the temporary object is deleted. Now,
where was the C-style string allocated? Probably as part of the
temporary object holding s1 + s2, and that storage is not guaranteed
to exist after that temporary is destroyed. Consequently, cs points
to deallocated storage. The output operation cout << cs might work
as expected, but that would be sheer luck. A compiler can detect and
warn against many variants of this problem.
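One way to avoid the problem in the quoted f is to give each concatenation a name, so the buffer that cs points into outlives every use of cs. The following is a sketch, not the book's code: fSafe is a hypothetical name, the parameters are taken by const reference, and it returns whether the condition held so the result can be checked.

```cpp
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical safe variant of f from the quoted passage.
bool fSafe(const std::string& s1, const std::string& s2, const std::string& s3) {
    const std::string s12 = s1 + s2;  // named object: lives until fSafe returns
    const char* cs = s12.c_str();
    std::cout << cs;                  // valid: s12 is still alive

    const std::string s23 = s2 + s3;  // second concatenation, also named
    cs = s23.c_str();
    if (std::strlen(cs) < 8 && cs[0] == 'a') {
        // cs can be used here: it points into s23, which is still alive
        return true;
    }
    return false;
}
```

For example, fSafe("x", "ab", "cd") writes "xab" to std::cout and returns true, because "abcd" is shorter than 8 characters and starts with 'a'.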
The problem here is that you are returning a temporary, and the pointer obtained from c_str() outlives that temporary.
As the reference for c_str() puts it: "Returns a pointer to an array that contains a null-terminated sequence of characters (i.e., a C-string) representing the current value of the string object" (http://www.cplusplus.com/reference/string/string/c_str/).
In this case, your pointer points to a memory location that no longer exists.
std::string getString() {
    std::string str("hello");
    return str; // creates a temporary object, since it is returned by value
}
int main() {
    const char* cStr = getString().c_str(); // the temporary is destroyed at the end of this statement
    std::cout << cStr << std::endl; // this prints garbage
}
The solution is to copy the temporary into a named local object and then call c_str() on that object.
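As others have mentioned, you are using a pointer into a temporary after the temporary has been destroyed - this is a classic example of a heap use-after-free.
What I can add to the other answers is that you can easily detect this kind of misuse with gcc's or clang's AddressSanitizer (enabled with -fsanitize=address).
Example: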
#include <string>
#include <iostream>
std::string get()
{
return "hello";
}
int main()
{
const char* c = get().c_str();
std::cout << c << std::endl;
}
Sanitizer output:
=================================================================
==2951==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300000eff8 at pc 0x7f78e27869bb bp 0x7fffc483e670 sp 0x7fffc483de20
READ of size 6 at 0x60300000eff8 thread T0
#0 0x7f78e27869ba in strlen (/usr/lib64/libasan.so.2+0x6d9ba)
#1 0x39b4892ba0 in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) (/usr/lib64/libstdc++.so.6+0x39b4892ba0)
#2 0x400dd8 in main /tmp/tmep_string/main.cpp:12
#3 0x39aa41ed5c in __libc_start_main (/lib64/libc.so.6+0x39aa41ed5c)
#4 0x400c48 (/tmp/tmep_string/a.out+0x400c48)
0x60300000eff8 is located 24 bytes inside of 30-byte region [0x60300000efe0,0x60300000effe)
freed by thread T0 here:
#0 0x7f78e27ae6ea in operator delete(void*) (/usr/lib64/libasan.so.2+0x956ea)
#1 0x39b489d4c8 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (/usr/lib64/libstdc++.so.6+0x39b489d4c8)
#2 0x39aa41ed5c in __libc_start_main (/lib64/libc.so.6+0x39aa41ed5c)
previously allocated by thread T0 here:
#0 0x7f78e27ae1aa in operator new(unsigned long) (/usr/lib64/libasan.so.2+0x951aa)
#1 0x39b489c3c8 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (/usr/lib64/libstdc++.so.6+0x39b489c3c8)
#2 0x400c1f (/tmp/tmep_string/a.out+0x400c1f)
SUMMARY: AddressSanitizer: heap-use-after-free ??:0 strlen
Shadow bytes around the buggy address:
0x0c067fff9da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9de0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c067fff9df0: fa fa fa fa fa fa fa fa fa fa fa fa fd fd fd[fd]
0x0c067fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==2951==ABORTING