Why can't string.find("</a>") find "</a>"?
My program works fine, except that it can't find </a>. It can find everything else, for example </b>, </i>, </head> and so on, but for some reason it can't find </a>.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string HTML_text;
getline(cin, HTML_text, '\t');
cout << endl << "Printing test!";
// replacing hyperlink
string sub_string;
int index_begin = HTML_text.find("<a href=") + 8;
string helper = HTML_text.substr(index_begin, HTML_text.size());
int index_end = helper.find(">");
helper.clear();
sub_string = HTML_text.substr(index_begin, index_end);
//substring is made
index_begin = HTML_text.find(">", index_begin) + 1;
index_end = HTML_text.find("</a>"); //HERE IS THE PROBLEM
helper = HTML_text.substr(index_begin, index_end);
cout << "\n\nPrinting helper!\n";
cout << helper << endl << endl;
HTML_text.erase(index_begin, index_end);
HTML_text.insert(index_begin, sub_string);
cout << endl << "Printing results!";
cout << endl << endl << HTML_text << endl << endl;
}
For example, the HTML text I use as input looks like this:
<html>
<head>
text to be deleted
</head>
<body>
Hi there!
<b>some bold text</b>
<i>italic</i>
<a href=www.abc.com>link text</a>
</body>
</html> //tab and then enter
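The problem is not where you assume it is. The line index_end = HTML_text.find("</a>"); works correctly and finds the position of </a> in the string. You can easily confirm this by watching the value of index_end in a debugger. If </a> had not been found, index_end would be equal to std::string::npos, but it is 123, while index_begin is 114.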
Let's look at the documentation for std::string::erase():
string& erase (size_t pos = 0, size_t len = npos);
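The signature of erase takes two parameters, a position and a length, whereas your code assumes that the second parameter is an end position (the same applies to std::string::substr()).
This is not a big problem and it is easy to fix, because we can simply compute the length: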
length = end_position - start_position;
So your fixed code is:
#include <iostream>
#include <string>
using namespace std;
int main()
{
string HTML_text;
getline(cin, HTML_text, '\t');
cout << endl << "Printing test!";
// replacing hyperlink
string sub_string;
int index_begin = HTML_text.find("<a href=") + 8;
string helper = HTML_text.substr(index_begin);
int index_end = helper.find(">");
helper.clear();
sub_string = HTML_text.substr(index_begin, index_end);
//substring is made
index_begin = HTML_text.find(">", index_begin) + 1;
index_end = HTML_text.find("</a>");
helper = HTML_text.substr(index_begin, index_end - index_begin);
cout << "\n\nPrinting helper!\n";
cout << helper << endl << endl;
HTML_text.erase(index_begin, index_end - index_begin);
HTML_text.insert(index_begin, sub_string);
cout << endl << "Printing results!";
cout << endl << endl << HTML_text << endl << endl;
}
As you would expect, the output is:
Printing test!
Printing helper!
link text
Printing results!
<html>
<head>
text to be deleted
</head>
<body>
Hi there!
<b>some bold text</b>
<i>italic</i>
<a href=www.abc.com>www.abc.com</a>
</body>
</html>