Regex to find mp3 links in an HTML file
I am trying to match all the mp3 links in an HTML string.
Expected output
http://mp3cofe.com/ariana-grande-weeknd-love-me-harder-andreevskiy-remix.mp3
http://mp3cofe.com/listen/52d-remix.mp3
Actual output
http://mp3cofe.com/ariana-grande-weeknd-love-me-harder-andreevskiy-remix.mp3" rel="nofollow" target="_blank" style="color:green;">Download</a></div><a href="http://mp3cofe.com/listen/52d-remix.mp3"
Code
#include <iostream>
#include <string>
#include <regex>

int main() {
    std::string subject("<a href=\"http://mp3cofe.com/ariana-grande-weeknd-love-me-harder-andreevskiy-remix.mp3\" rel=\"nofollow\" target=\"_blank\" style=\"color:green;\">Download</a></div><a href=\"http://mp3cofe.com/listen/52d-remix.mp3\" rel=\"nofollow\" target=\"_blank\" style=\"color:green;\">Download</a> ");
    // Match from http:// up to .mp3 followed by a quote and a space
    std::regex re("(http://)(.*)(\\.mp3\" )");
    std::sregex_iterator next(subject.begin(), subject.end(), re);
    std::sregex_iterator end;
    // Print every match found in the subject string
    while (next != end) {
        std::smatch match = *next;
        std::cout << match.str() << "\n";
        ++next;
    }
    return 0;
}
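This happens because .* is greedy by default: it matches as many characters as possible, so the match runs through to the last .mp3" in the input. Use the lazy form .*? instead, which stops at the first place the rest of the pattern can match: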
std::regex re("(http://)(.*?)([.]mp3\" )");
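If you don't want the trailing "<space> included in the match, use the following regex, which checks for it with a lookahead instead of consuming it.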
std::regex re("(http://)(.*?)[.]mp3(?=\" )");