How to find a pattern inside a pattern when the start and end are known?

Question

I have a pattern whose start and end delimiters are as follows:

start = '\n\[\n'
end = '\n\]\n'

My string is:

'The above mentioned formal formula is\nthat of\n\[\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)\n\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp'

I want to find:

'\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)'

If I use:

re.findall(r'\s*\+\n\[\n(.*?)\+\n\]\n', mystring)

r'\s*\+\[(.*?)\+\]' # did not work either

then I get an empty result. What am I doing wrong here?

Answer 1

start = '\n\\'  # a newline followed by a backslash; the "+ 1" below skips the "["
end = '\n\]\n'


s = 'The above mentioned formal formula is\nthat of\n\[\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)\n\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp'
test_str = "\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)"


idx_start = s.find(start) + len(start) + 1
idx_end = s.rfind(end)


found = s[idx_start:idx_end]
found == test_str

OUTPUT:
True
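
The same slicing idea can be sketched as a small helper that takes the full delimiters and returns None when either one is missing. The name `between` is made up for illustration, and leaving the newline after `\[` in the result is an assumption based on the expected output in the question. The sample string is written with doubled backslashes so that `\bigoplus` keeps its backslash; in the question's literal, Python reads `\b` as a backspace character.

```python
def between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)          # move past the start delimiter
    j = text.find(end, i)    # look for the end delimiter only after the start
    if j == -1:
        return None
    return text[i:j]


s = ('The above mentioned formal formula is\nthat of\n\\[\n'
     '\\oplus \\bigoplus_{(5)} \\widehat{C_{(5)}} A_{5} G(2)\n\\]\n'
     'A. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp')

# The start delimiter stops before the newline that follows '\[',
# so that newline stays in the extracted text.
found = between(s, '\n\\[', '\n\\]\n')
print(repr(found))
```

Unlike `rfind`, searching for the end delimiter forward from the start position picks the nearest closing `\]` if the text contains more than one formula.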

Answer 2

This worked for me:

mystring = 'The above mentioned formal formula is\nthat of\n\[\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)\n\]\nA. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp'

expected_result = '\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)'

import codecs
import re

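# repr() turns each newline into the two characters \n and doubles each backslash,
# so the pattern has to match those escaped forms literally.
matches = re.findall(r'\\n\\\\\[(\\n.*)\\n\\\\\]\\n', repr(mystring))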

results = [codecs.decode(match, 'unicode_escape') for match in matches]

results
['\n\oplus \bigoplus_{(5)} \widehat{C_{(5)}} A_{5} G(2)']

results[0] == expected_result
True
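
For comparison, a regular expression can also be applied to the string directly, without going through `repr`. The patterns in the question return nothing because `\+` matches a literal plus sign and `\[` matches a bare bracket, while the text has a backslash in front of each bracket. What follows is a minimal sketch, assuming the delimiters are escaped with `re.escape` and that the newline after `\[` should be part of the result. The sample string uses doubled backslashes so that `\bigoplus` is not read as a backspace followed by `igoplus`.

```python
import re

mystring = ('The above mentioned formal formula is\nthat of\n\\[\n'
            '\\oplus \\bigoplus_{(5)} \\widehat{C_{(5)}} A_{5} G(2)\n\\]\n'
            'A. Tobacco\nB. Tulip\nc. soybean\nD. Sunhemp')

# re.escape makes the backslash and the brackets match literally.
start = re.escape('\n\\[')    # newline, backslash, '['
end = re.escape('\n\\]\n')    # newline, backslash, ']', newline

# DOTALL lets '.' cross newlines; the lazy '.*?' stops at the first end delimiter.
pattern = re.compile(start + r'(.*?)' + end, re.DOTALL)

matches = pattern.findall(mystring)
print(matches)
```

Because the group is lazy, a string with several `\[ ... \]` blocks would yield one entry per block.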

python

nlp

python-re