Split a string in Python by a specific line in text

Question

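I want to split a body of text wherever there is a line that contains only "----". I am using the re.split(..) method, but it is not behaving as expected. What am I missing?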

import re

s = """width:5
----
This is a test sentence to test the width thing"""

print re.split('^----$', s)

This just prints

['width:5\n----\nThis is a test sentence to test the width thing']

Answer 1

You are missing the MULTILINE flag:

print re.split(r'^----$', s, flags=re.MULTILINE)

Without it, ^ and $ apply to the whole s string rather than to each line in the string:

re.MULTILINE

When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline).

Demo:

>>> import re
>>> 
>>> s = """width:5
... ----
... This is a test sentence to test the width thing"""
>>> 
>>> print re.split(r'^----$', s, flags=re.MULTILINE)
['width:5\n', '\nThis is a test sentence to test the width thing']
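For readers on Python 3, here is a minimal sketch of the same comparison using print() as a function. The names no_flag, with_flag and clean are only illustrative, and stripping the leftover newlines is an optional extra step, not something the question asked for.

```python
import re

s = """width:5
----
This is a test sentence to test the width thing"""

# Without re.MULTILINE, ^ and $ anchor only at the start and end of the
# whole string, so the separator line is never matched.
no_flag = re.split(r'^----$', s)

# With re.MULTILINE, ^ and $ also match at the start and end of each line.
with_flag = re.split(r'^----$', s, flags=re.MULTILINE)

# The newlines around the separator stay attached to the parts;
# strip them if they are not wanted.
clean = [part.strip('\n') for part in with_flag]

print(no_flag)    # ['width:5\n----\nThis is a test sentence to test the width thing']
print(with_flag)  # ['width:5\n', '\nThis is a test sentence to test the width thing']
print(clean)      # ['width:5', 'This is a test sentence to test the width thing']
```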

Answer 2

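You can also avoid ^ and $ altogether, since without MULTILINE they tell the regex engine to match only at the start and end of the whole string. Instead, use positive look-arounds, which match the separator between two newlines and keep the \n characters: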

>>> print re.split('(?<=\n)----(?=\n)', s)
['width:5\n', '\nThis is a test sentence to test the width thing']
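One difference worth knowing, shown here as a rough Python 3 sketch: the look-around pattern needs a newline on both sides, so a separator on the very first or very last line is not matched, whereas the MULTILINE version does match it. The sample text below is made up for illustration.

```python
import re

# Hypothetical input with a separator on the first and the last line too.
text = "----\nfirst\n----\nsecond\n----"

# Look-arounds require a newline before and after the dashes,
# so only the middle separator is found.
lookaround = re.split('(?<=\n)----(?=\n)', text)

# ^ and $ with re.MULTILINE also match at the very start and end of the string.
multiline = re.split(r'^----$', text, flags=re.MULTILINE)

print(lookaround)  # ['----\nfirst\n', '\nsecond\n----']
print(multiline)   # ['', '\nfirst\n', '\nsecond\n', '']
```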

Answer 3

Another way to split, without using a regular expression:

s.split("\n----\n")
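A short Python 3 sketch of what this returns. Unlike the regex versions, the newlines are consumed along with the separator. The CRLF case at the end is an assumption about inputs you might meet (for example text from a Windows file), not something from the question.

```python
s = """width:5
----
This is a test sentence to test the width thing"""

# The newlines are part of the separator, so they do not end up in the parts.
parts = s.split("\n----\n")
print(parts)  # ['width:5', 'This is a test sentence to test the width thing']

# Assumed edge case: with \r\n line endings the literal "\n----\n" is not found.
crlf = "width:5\r\n----\r\nbody"
unsplit = crlf.split("\n----\n")
print(unsplit)  # ['width:5\r\n----\r\nbody']

# Normalizing the line endings first makes the plain split work again.
normalized = "\n".join(crlf.splitlines())
fixed = normalized.split("\n----\n")
print(fixed)  # ['width:5', 'body']
```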

Answer 4

Less code that gives the expected result for this input:

Input:

re.split('[\n-]+', s)

Output:

['width:5', 'This is a test sentence to test the width thing']
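A word of caution, as a small Python 3 sketch: the character class [\n-] matches any run of newlines or hyphens, not just a line of dashes, so it also splits on ordinary line breaks and on hyphens inside words. The sample text is invented to show the difference from the anchored pattern in Answer 1.

```python
import re

# Hypothetical input with a hyphenated key and a two-line body.
text = "line-width:5\n----\nfirst body line\nsecond body line"

# Splits on every hyphen and every newline, not only on the separator line.
by_char_class = re.split('[\n-]+', text)

# Splits only where a whole line consists of exactly four dashes.
by_separator_line = re.split(r'^----$', text, flags=re.MULTILINE)

print(by_char_class)
# ['line', 'width:5', 'first body line', 'second body line']
print(by_separator_line)
# ['line-width:5\n', '\nfirst body line\nsecond body line']
```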

Answer 5

Have you tried:

result = re.split("^----$", subject_text, 0, re.MULTILINE)
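In this call the third positional argument is maxsplit (0 means no limit) and the fourth is flags. Passing both by keyword makes that harder to mix up. Here is a hedged Python 3 sketch; the subject_text value is made up, and maxsplit=1 is only an illustration of splitting a header from everything after the first separator.

```python
import re

# Hypothetical input with two separator lines.
subject_text = "width:5\n----\nfirst part\n----\nsecond part"

# maxsplit=0 (the default) splits at every separator line.
all_parts = re.split(r"^----$", subject_text, maxsplit=0, flags=re.MULTILINE)

# maxsplit=1 splits only at the first separator and leaves the rest intact.
header_and_rest = re.split(r"^----$", subject_text, maxsplit=1, flags=re.MULTILINE)

print(all_parts)        # ['width:5\n', '\nfirst part\n', '\nsecond part']
print(header_and_rest)  # ['width:5\n', '\nfirst part\n----\nsecond part']
```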
