Regexp conversion from Perl to C++
We have the following regexp for parsing the equation 5x+10x^3-10x^2:
[+-]?[\d(x)]*[\^\d]*
The following C++ code, taken from an example and adapted for the task, results in an infinite loop:
std::regex words_regex("[+-]?[\d(x)]*[\^\d]*");
auto words_begin =
    std::sregex_iterator(s.begin(), s.end(), words_regex);
auto words_end = std::sregex_iterator();
for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
    std::smatch match = *i;
    std::string match_str = match.str();
    std::cout << match_str << '\n';
}
It also produces warnings at compile time:
1.cpp:21:35: warning: unknown escape sequence '\d' [-Wunknown-escape-sequence]
std::regex words_regex("[+-]?[\d(x)]*[\^\d]*");
^~
1.cpp:21:43: warning: unknown escape sequence '\^' [-Wunknown-escape-sequence]
std::regex words_regex("[+-]?[\d(x)]*[\^\d]*");
^~
1.cpp:21:45: warning: unknown escape sequence '\d' [-Wunknown-escape-sequence]
std::regex words_regex("[+-]?[\d(x)]*[\^\d]*");
If we naively convert the regexp to [+-]?[d(x)]*[^d]*, the infinite loop goes away, of course.
How do we correctly convert the regexp for C++?
UPD: clang version:
Mac:concurrent macbook$ clang++ -v
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.5.0
Thread model: posix
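The problem is that the compiler tries to interpret \d as a string-literal escape sequence, so you have to escape the backslash itself, as in \\d.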
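As a minimal sketch of this first fix, the pattern can be written as an ordinary string literal with every backslash doubled. The helper names `escaped_pattern` and `make_term_regex` are hypothetical and are not part of the question's code; the default ECMAScript grammar of std::regex is assumed.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the question): the pattern written as an
// ordinary string literal. Each "\\" in the source becomes a single
// backslash in the string that the regex engine receives.
std::string escaped_pattern() {
    return "[+-]?[\\d(x)]*[\\^\\d]*";
}

// Hypothetical helper: builds the regex from the escaped literal using the
// default ECMAScript grammar.
std::regex make_term_regex() {
    return std::regex(escaped_pattern());
}
```

With the backslashes doubled, the compiler should no longer report the unknown escape sequence warnings, and the regex engine sees \d and \^ as intended.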
Another way is to use a raw string literal, like this:
std::regex words_regex(R"([+-]?[\d(x)]*[\^\d]*)");
See it in action here.
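Putting the raw string literal together with the iteration loop from the question gives roughly the following sketch. The function name `split_terms` is hypothetical, and skipping empty matches is an addition made here: every part of the pattern is optional, so it can match the empty string.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): collects the non-empty
// matches of the term pattern found in `s`.
std::vector<std::string> split_terms(const std::string& s) {
    // Raw string literal: backslashes reach the regex engine unchanged.
    static const std::regex words_regex(R"([+-]?[\d(x)]*[\^\d]*)");

    std::vector<std::string> terms;
    for (std::sregex_iterator i(s.begin(), s.end(), words_regex), end;
         i != end; ++i) {
        // The whole pattern is optional, so an empty match is possible
        // (for example at the end of the input); skip those.
        if (i->length() > 0) {
            terms.push_back(i->str());
        }
    }
    return terms;
}
```

On a conforming standard library, calling split_terms("5x+10x^3-10x^2") should yield the three terms 5x, +10x^3 and -10x^2. Whether this also avoids the infinite loop on the older toolchain from the question is not verified here.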