Python regex doesn't find certain pattern
I am trying to parse LaTeX code out of HTML code that looks like this:
string = " your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
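I want to replace every piece of LaTeX code with the output of a function that takes the LaTeX code as its argument. Since finding the right pattern is the problem, the function extract returns an empty string for now.
I tried: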
latex_end = "\)"
latex_start = "\("
string = re.sub(r'{}.*?{}'.format(latex_start, latex_end), extract, string)
Result:
your answer is wrong! Solution: based on \= 0 \) and \=0\) beeing ...
Expected:
your answer is wrong! Solution: based on and beeing ...
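Any idea why the pattern is not found? Is there a way to make this work?
This happens because the backslash is an escape character both in Python string literals and in regex syntax. The regex engine reads the two-character string \( as a literal parenthesis, so the pattern never matches the backslash itself. Here are two quick ways to get the job done: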
import re
extract = lambda a: ""
# Using no raw components
string = " your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
latex_bounds = ("\\\\\\(", "\\\\\\)")  # regex \\\( : a literal backslash, then a literal parenthesis
print(re.sub('{}.*?{}'.format(*latex_bounds), extract, string))
# Using raw strings for the pattern (only the regex-level escaping remains)
latex_bounds = (r"\\\(", r"\\\)")
print(re.sub(r'{}.*?{}'.format(*latex_bounds), extract, string))
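To see why the attempt in the question removed parentheses instead of the LaTeX delimiters, it can help to print the pattern that actually reaches the regex engine. The following is a minimal sketch; the shortened `sample` string is an assumption made for illustration and is not taken from the question.

```python
import re

# Same delimiters as in the question. "\\(" holds the same two characters
# as "\(" there: a backslash followed by a parenthesis.
latex_start = "\\("
latex_end = "\\)"

pattern = r'{}.*?{}'.format(latex_start, latex_end)
print(pattern)  # the regex engine receives: \(.*?\)

# In regex syntax \( means "a literal parenthesis", so the backslash in the
# text is never part of the match.
sample = r"based on \(d(g,E)=0\) beeing"  # hypothetical shortened input
matches = re.findall(pattern, sample)
leftover = re.sub(pattern, "", sample)
print(matches)
print(leftover)
```

The match starts at the first parenthesis and stops at the first closing one, which leaves the stray backslash and the tail of the formula seen in the question's result.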
You should define string as a raw string, because otherwise \v is interpreted as a special character (a vertical tab).
import re
string = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
string = re.sub(r'\\\(.*?\\\)', '', string)
print(string)
Prints:
your answer is wrong! Solution: based on and beeing ...
If you need variables for the start and end delimiters:
latex_end = r"\\\)"
latex_start = r"\\\("
string = re.sub(r'{}.*?{}'.format(latex_start, latex_end), '', string)
print(string)
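An alternative that avoids counting backslashes by hand is to keep the delimiters as plain text and let `re.escape` produce the regex-safe form. The sketch below also passes a callback to `re.sub`, as the question intended. The body of `extract` is a hypothetical stand-in that wraps the LaTeX source in square brackets; the real function is not shown in the thread.

```python
import re

def extract(match):
    # Hypothetical placeholder: re.sub passes a match object, and group(1)
    # is the LaTeX source between the delimiters.
    latex = match.group(1)
    return "[{}]".format(latex.strip())

# Delimiters written as the literal text that appears in the input.
latex_start = r"\("
latex_end = r"\)"

# re.escape adds the regex-level escaping for both the backslash and the parenthesis.
pattern = re.compile(re.escape(latex_start) + r"(.*?)" + re.escape(latex_end))

string = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
result = pattern.sub(extract, string)
print(result)
```

Because the replacement is a function, its return value is inserted as-is, so backslashes in the returned text are not reinterpreted as replacement escapes.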