How to find a pattern inside a pattern when the start and end are known?
I have a pattern whose start and end delimiters are as follows:
start = '\n\[\n'
end = '\n\]\n'
My string is:
'The above mentioned formal formula is\nthat of\n\[\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)\n\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp'
I want to find:
'\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)'
If I use:
re.findall(r'\s*\+\n\[\n(.*?)\+\n\]\n', mystring)
r'\s*\+\[(.*?)\+\]' # did not work either
then it gives me an empty result. What am I doing wrong here?
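One likely reason for the empty result: in a regular expression `\+` matches a literal plus sign, and the string contains none, so neither pattern can match. A minimal sketch of a pattern built from the two delimiters is below. It assumes the backslashes in the sample string are doubled, so that `\bigoplus` keeps a real backslash instead of turning `\b` into a backspace; the names `pattern` and `matches` are only illustrative.

```python
import re

# Assumption: each delimiter is a newline, a literal "\[" or "\]", and a newline.
start = '\n\\[\n'
end = '\n\\]\n'

# Assumption: backslashes are doubled so every LaTeX command keeps its backslash.
mystring = ('The above mentioned formal formula is\nthat of\n\\[\n'
            '\\oplus \\bigoplus_{(5)} \\widehat{C_{(5)}} A_{5} G(2)'
            '\n\\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp')

# re.escape makes every character of the delimiters literal.
# re.DOTALL lets "." cross newlines in case the formula spans several lines.
pattern = re.escape(start) + r'(.*?)' + re.escape(end)
matches = re.findall(pattern, mystring, flags=re.DOTALL)
print(matches)
```

Because `start` ends with a newline, that newline is consumed by the delimiter and the captured text begins at `\oplus`. Dropping the trailing `'\n'` from `start` would keep the leading newline in the capture.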
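# start is a newline followed by one backslash (escaped so the literal is valid)
start = '\n\\'
end = '\n\]\n'
s = 'The above mentioned formal formula is\nthat of\n\[\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)\n\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp'
test_str = "\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)"
# the extra + 1 skips the '[' that follows the backslash
idx_start = s.find(start) + len(start) + 1
# rfind locates the last occurrence of the end delimiter
idx_end = s.rfind(end)
found = s[idx_start:idx_end]
found == test_str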
OUTPUT:
True
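Note that `str.find` returns -1 when a delimiter is missing, in which case the slice above silently returns the wrong text. A small sketch of the same slicing idea with that case handled is below; `extract_between` and `sample` are hypothetical names, not part of the answer above.

```python
def extract_between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None  # start delimiter not present
    i += len(start)
    j = text.find(end, i)  # search for the end only after the start
    if j == -1:
        return None  # end delimiter not present
    return text[i:j]


# Illustrative input with doubled backslashes, so "\[" and "\]" are literal.
sample = 'intro\n\\[\nx + y\n\\]\noutro'
print(repr(extract_between(sample, '\n\\[', '\n\\]\n')))
```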
This worked for me:
mystring = 'The above mentioned formal formula is\nthat of\n\[\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)\n\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp'
expected_result = '\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)'
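import codecs
import re
# repr() turns each newline into the two characters \n and each backslash into \\,
# so the pattern has to match those escaped forms
matches = re.findall(r'\\n\\\\\[(\\n.*?)\\n\\\\\]\\n', repr(mystring))
# decode the escape sequences in each match back into the original characters
results = [codecs.decode(match, 'unicode_escape') for match in matches]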
results
['\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)']
results[0] == expected_result
True