Python regex to match multiple headers and their paragraphs in a multi-line string

I am trying to write a regular expression in Python that captures each header and its corresponding text in a multi-line string. Sample string:

.Main Header
This is the main paragraph in the text. Also this is another sentence.
.Sub-Header
This is secondary header and text.
.Last Header
And this is the last header in the text.

Here .Main Header, .Sub-Header and .Last Header are the paragraph headers, and the lines that follow each one (the text up to the next ".Header" line) are the body text. So my expected output is:

Header1 - .Main Header, Text1 - This is the main paragraph in the text. Also this is another sentence.
Header2 - .Sub-Header, Text2 - This is secondary header and text.
Header3 - .Last Header, Text3 - And this is the last header in the text.

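I tried to put together a regex to meet this expectation, and it almost works. The only challenge I am facing is capturing text that contains a dot (.) between sentences (e.g. Text1). My regex's stop condition is a newline followed by a dot (.), because the next header starts with a dot (.). So I am looking for help distinguishing a regular dot from a dot at the start of a new line as my stop criterion.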



For Text1, this captures:

This is the main paragraph in the text

instead of:

This is the main paragraph in the text. Also this is another sentence.
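
One possible way to express the "dot at the start of a line" stop condition is sketched below. It is only a sketch under assumptions: the sample string is assumed to contain the .Sub-Header line that the expected output implies, and the names SECTION_RE and split_sections are made up for illustration. The idea is to use re.MULTILINE so that ^ matches at the start of every line, which lets the lookahead (?=^\.|\Z) stop only at a dot that begins a line (or at the end of the string), while re.DOTALL lets the lazy body span several lines and still include dots inside sentences.

```python
import re

# Sample text; the ".Sub-Header" line is assumed from the expected output.
text = """.Main Header
This is the main paragraph in the text. Also this is another sentence.
.Sub-Header
This is secondary header and text.
.Last Header
And this is the last header in the text.
"""

# Hypothetical helper names, not from the original question.
# ^(\.[^\n]*)\n  -> a header: a line that starts with a dot
# (.*?)          -> the body, matched lazily across lines (DOTALL)
# (?=^\.|\Z)     -> stop before the next line starting with a dot, or at end of string
SECTION_RE = re.compile(r"^(\.[^\n]*)\n(.*?)(?=^\.|\Z)", re.MULTILINE | re.DOTALL)


def split_sections(s):
    """Return a list of (header, body) tuples, with surrounding whitespace stripped from the body."""
    return [(m.group(1), m.group(2).strip()) for m in SECTION_RE.finditer(s)]


sections = split_sections(text)
for i, (header, body) in enumerate(sections, start=1):
    print(f"Header{i} - {header}, Text{i} - {body}")
```

Note that with this approach any body line that itself begins with a dot would be treated as a new header, so it only fits input where headers are the only lines starting with a dot.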



