C++ getline()'s undocumented behavior
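In C++, when you use getline() with a delimiter on a stringstream, there are a few behaviors that I could not find documented, but that conveniently do not raise an error in the following cases:
- delimiter is not found => the whole string, or the rest of it, is returned
- there is a delimiter but nothing before it => an empty string is returned
- getting something that isn't really there => returns the last thing that could be read with it
Some test code (simplified):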
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
string test(const string &s, char delim, int parseIndex ){
    stringstream ss(s);
    string parsedStr = "";
    for( int i = 0; i < (parseIndex+1); i++ ) getline(ss, parsedStr, delim);
    return parsedStr;
}
int main() {
    stringstream ss("something without delimiter");
    string s1;
    getline(ss,s1,';');
    cout << "'" << s1 << "'" << endl; //no delim
    cout << endl;
    string s2 = "321;;123";
    cout << "'" << test(s2,';',0) << "'" << endl; //classic
    cout << "'" << test(s2,';',1) << "'" << endl; //nothing before
    cout << "'" << test(s2,';',2) << "'" << endl; //no delim at the end
    cout << "'" << test(s2,';',3) << "'" << endl; //this shouldn't be there
    cout << endl;
    return 0;
}
Test code output:
'something without delimiter'
'321'
''
'123'
'123'
Test code fiddle: http://ideone.com/ZAuydR
Question
The question is: is this reliable? And if so, is it documented anywhere?
Thanks for your answers and clarifications :)
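The behavior of getline is explicitly documented in the standard (C++11 §21.4.8.9 ¶7-10), which is the only normative document about C++.
The behavior you ask about in the first two cases is guaranteed, while the third is a consequence of how your test rig is built.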
template<class charT, class traits, class Allocator>
basic_istream<charT,traits>&
getline(basic_istream<charT,traits>& is,
basic_string<charT,traits,Allocator>& str,
charT delim);
template<class charT, class traits, class Allocator>
basic_istream<charT,traits>&
getline(basic_istream<charT,traits>&& is,
basic_string<charT,traits,Allocator>& str,
charT delim);
Effects: Behaves as an unformatted input function (27.7.2.3), except that it does not affect the value returned by subsequent calls to basic_istream<>::gcount(). After constructing a sentry object, if the sentry converts to true, calls str.erase() and then extracts characters from is and appends them to str as if by calling str.append(1, c) until any of the following occurs:
- end-of-file occurs on the input sequence (in which case, the getline function calls is.setstate(ios_base::eofbit)).
- traits::eq(c, delim) for the next available input character c (in which case, c is extracted but not appended) (27.5.5.4)
- str.max_size() characters are stored (in which case, the function calls is.setstate(ios_base::failbit)) (27.5.5.4)
The conditions are tested in the order shown. In any case, after the last character is extracted, the sentry object k is destroyed.
If the function extracts no characters, it calls is.setstate(ios_base::failbit) which may throw ios_base::failure (27.5.5.4).
Returns: is.
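To make the quoted Effects paragraph easier to follow, here is a rough sketch of the same steps for plain char streams. The names sketch_getline and check_sketch are hypothetical, and the sketch leaves out the exception handling a real library implementation performs, so treat it as a reading aid under those assumptions rather than as how any particular standard library implements getline.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical name: a simplified model of std::getline for char streams only.
std::istream& sketch_getline(std::istream& is, std::string& str, char delim) {
    using traits = std::char_traits<char>;
    std::ios_base::iostate state = std::ios_base::goodbit;
    bool extracted_any = false;

    // noskipws = true: unformatted input does not skip leading whitespace.
    // If the stream is not good(), the sentry sets failbit and converts to false.
    std::istream::sentry ok(is, true);
    if (ok) {
        std::streambuf* buf = is.rdbuf();
        assert(buf != nullptr);  // a good() stream normally has a buffer
        str.erase();
        for (;;) {
            const traits::int_type next = buf->sgetc();  // peek, do not extract
            if (traits::eq_int_type(next, traits::eof())) {  // condition 1: end-of-file
                state |= std::ios_base::eofbit;
                break;
            }
            const char c = traits::to_char_type(next);
            if (traits::eq(c, delim)) {  // condition 2: delimiter
                buf->sbumpc();           // extracted, but not appended
                extracted_any = true;
                break;
            }
            if (str.size() == str.max_size()) {  // condition 3: string is full
                state |= std::ios_base::failbit;
                break;
            }
            buf->sbumpc();
            extracted_any = true;
            str.append(1, c);
        }
    }
    if (!extracted_any) state |= std::ios_base::failbit;
    is.setstate(state);
    return is;
}

// Usage: the asserts document what the sketch does on the question's input.
void check_sketch() {
    std::istringstream in("321;;123");
    std::string s;
    sketch_getline(in, s, ';'); assert(s == "321");
    sketch_getline(in, s, ';'); assert(s.empty());
    sketch_getline(in, s, ';'); assert(s == "123" && in.eof() && !in.fail());
    sketch_getline(in, s, ';'); assert(in.fail() && s == "123");  // str untouched
}
```

Note that str.erase() sits inside the sentry check, which is why a call on a stream that is already at end-of-file never touches the string.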
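To answer your questions:
delimiter is not found => then simply whole string/rest of it is returned
This follows from the first exit condition: when the input string runs out, the string stream hits end-of-file, so extraction stops (after all the preceding characters have been appended to the output string).
there is delimiter but nothing before it => empty string is returned
This is just a special case of the second condition: extraction stops when the delimiter is found (traits::eq(c, delim) usually boils down to c==delim), even if no other character was extracted before it.
getting something that isn't really there => returns the last thing that could be read with it
This is not quite what happens. If the stream is in an error state (the sentry object does not convert to true, in the description above), which in your case is an EOF, getline leaves your string alone and returns. In your test code you see the last data read only because you are reusing the same string without clearing it between the successive getline calls.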
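One way to avoid the stale value in the third case is to check the stream after every getline call and report failure to the caller. The sketch below is a possible variant of the question's test() helper; nth_field, its bool return and its out parameter are hypothetical choices made for illustration, not something the original code or the standard prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: copies field number `index` (0-based) into `out`.
// Returns false, leaving `out` untouched, when that field does not exist.
bool nth_field(const std::string& s, char delim, std::size_t index, std::string& out) {
    std::istringstream ss(s);
    std::string field;
    for (std::size_t i = 0; i <= index; ++i) {
        // getline returns the stream; it tests false once failbit is set,
        // i.e. when the call extracted nothing at all.
        if (!std::getline(ss, field, delim)) {
            return false;
        }
    }
    out = field;
    return true;
}

// Usage: the asserts document the expected results for the question's input.
void check_nth_field() {
    std::string out = "untouched";
    assert(nth_field("321;;123", ';', 0, out) && out == "321");
    assert(nth_field("321;;123", ';', 1, out) && out.empty());
    assert(nth_field("321;;123", ';', 2, out) && out == "123");
    assert(!nth_field("321;;123", ';', 3, out));  // no fourth field
}
```

The read of "123" still counts as a success: it sets eofbit but not failbit, so the stream tests true for that call and only the following call fails.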
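The behavior of C++ facilities is described by the ISO C++ standard. It is not the most readable resource, though. In this case, cppreference.com has good coverage.
Here is what they have to say. The quoted blocks are copy-pasted; I have interspersed explanations that relate to your question.
Behaves as UnformattedInputFunction, except that input.gcount() is not affected. After constructing and checking the sentry object, performs the following:
"Constructing and checking the sentry" means that if an error condition has been detected on the stream, the function returns without doing anything. That is why, in #3, you observe the last valid input when "nothing should be there."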
1) Calls str.erase()
So if nothing is subsequently found before the delimiter, you get an empty string.
2) Extracts characters from input and appends them to str until one of the following occurs (checked in the order listed)
a) end-of-file condition on input, in which case, getline sets eofbit.
This is an error condition as far as the sentry is concerned (the stream is no longer good()), and it causes the local string variable to be left unchanged by a subsequent getline.
It also lets you observe the last segment of input before the end, so you can effectively treat end-of-file as a delimiter if you like.
b) the next available input character is delim, as tested by Traits::eq(c, delim), in which case the delimiter character is extracted from input, but is not appended to str.
c) str.max_size() characters have been stored, in which case getline sets failbit and returns.
3) If no characters were extracted for whatever reason (not even the discarded delimiter), getline sets failbit and returns.
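Taken together, these rules are what make the common "loop until getline fails" idiom work for splitting a string. The sketch below is one way to write it; split_fields and check_split_fields are hypothetical names, and the asserts record the results I would expect from the rules quoted above.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `s` on `delim` by reading until getline fails.
std::vector<std::string> split_fields(const std::string& s, char delim) {
    std::vector<std::string> fields;
    std::istringstream ss(s);
    std::string field;
    // The loop ends on the first call that extracts nothing (failbit set).
    while (std::getline(ss, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

void check_split_fields() {
    const std::vector<std::string> f = split_fields("321;;123", ';');
    assert(f.size() == 3);
    assert(f[0] == "321" && f[1].empty() && f[2] == "123");

    // No delimiter at all: end-of-file acts as the terminator.
    assert(split_fields("something without delimiter", ';').size() == 1);

    // Nothing to extract: the very first call sets failbit.
    assert(split_fields("", ';').empty());

    // A trailing delimiter does not produce an extra empty field.
    assert(split_fields("a;", ';').size() == 1);

    // eofbit alone is not a failure; failbit appears only on the next call.
    std::istringstream in("123");
    std::string s;
    std::getline(in, s, ';');
    assert(in.eof() && !in.fail() && s == "123");
    std::getline(in, s, ';');
    assert(in.fail() && s == "123");
}
```

One consequence worth keeping in mind: with this idiom an input such as "a;" yields a single field, so if a trailing empty field matters to you it has to be handled separately.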