Split a string in Python by a specific line in text

I want to split a body of text wherever a line contains only "----". I'm using the re.split(..) method, but it isn't behaving as expected. What am I missing?

import re

s = """width:5
----
This is a test sentence to test the width thing"""

print(re.split('^----$', s))

This just prints:

['width:5\n----\nThis is a test sentence to test the width thing']

You're missing the MULTILINE flag:

print(re.split(r'^----$', s, flags=re.MULTILINE))

Without it, ^ and $ apply to the string s as a whole rather than to each line within it:

re.MULTILINE

When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline).

Demo:

>>> import re
>>> 
>>> s = """width:5
... ----
... This is a test sentence to test the width thing"""
>>> 
>>> print(re.split(r'^----$', s, flags=re.MULTILINE))
['width:5\n', '\nThis is a test sentence to test the width thing']
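
Both parts above still carry the newlines that surrounded the separator line. If those are unwanted, one option (a sketch, not part of the original answer) is to let the pattern optionally consume a newline on each side of the separator:

```python
import re

s = """width:5
----
This is a test sentence to test the width thing"""

# '\n?' on each side swallows the line break next to the separator, if any.
parts = re.split(r'\n?^----$\n?', s, flags=re.MULTILINE)
print(parts)
# ['width:5', 'This is a test sentence to test the width thing']
```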

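You can also skip ^ and $ entirely (without re.MULTILINE they anchor only at the start and end of the whole string) and instead use positive look-arounds, which require a \n on each side of ---- without consuming it:

>>> print(re.split('(?<=\n)----(?=\n)', s))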
['width:5\n', '\nThis is a test sentence to test the width thing']
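
One difference worth noting: the look-around version needs a real newline on both sides, so it will not split on a separator that sits on the very first or last line, whereas the re.MULTILINE version will. A small check, using a made-up input string:

```python
import re

# Hypothetical input where the separator is the first line of the text.
text = "----\nbody after a leading separator"

lookaround = re.split('(?<=\n)----(?=\n)', text)
multiline = re.split(r'^----$', text, flags=re.MULTILINE)

print(lookaround)  # ['----\nbody after a leading separator']
print(multiline)   # ['', '\nbody after a leading separator']
```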

Another way to split, without using regular expressions:

s.split("\n----\n")
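
The literal split assumes the separator is surrounded by exactly "\n" on both sides. If the text may use Windows line endings, a line-by-line approach is a possible alternative. The helper name split_on_line below is hypothetical, and note that it rejoins each section with "\n":

```python
def split_on_line(text, separator="----"):
    """Split text into sections at lines that equal `separator` exactly."""
    sections, current = [], []
    for line in text.splitlines():  # handles \n, \r\n and \r
        if line == separator:
            sections.append("\n".join(current))
            current = []
        else:
            current.append(line)
    sections.append("\n".join(current))
    return sections


s = """width:5
----
This is a test sentence to test the width thing"""

parts = split_on_line(s)
print(parts)
# ['width:5', 'This is a test sentence to test the width thing']
```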

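Less code that also gives the expected result for this input. Note that it splits on any run of newlines and hyphens, so hyphens inside the text are treated as separators too:

Input:

re.split('[\n-]+', s)

Output: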

['width:5', 'This is a test sentence to test the width thing']
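
To illustrate the caveat with the character-class pattern, here is a small comparison on a made-up string that contains hyphens in its text (the input is an assumption for demonstration only):

```python
import re

# Hypothetical input with hyphens inside the content itself.
text = "line-width:5\n----\nA well-known test"

char_class = re.split('[\n-]+', text)
whole_line = re.split(r'^----$', text, flags=re.MULTILINE)

print(char_class)  # ['line', 'width:5', 'A well', 'known test']
print(whole_line)  # ['line-width:5\n', '\nA well-known test']
```

The character class is fine when the content is known to contain no hyphens; otherwise matching the whole separator line is the safer choice.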

Have you tried this?

result = re.split(r"^----$", subject_text, flags=re.MULTILINE)