Regex to unwrap paragraphs: Remove returns and new lines at end of lines that have content, but not empty lines

I am using Text Soap by Unmarked Software on Mac OS, which is pretty much PCRE but uses ICU Regular Expression Syntax, as my regex find-and-replace tool. I am still new to regex, so I am still learning many of its intricacies. Please bear with me.

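I am trying to capture the new lines or returns at the end of lines that contain content, but not the new lines or returns of empty lines, nor those that are immediately followed by an empty line.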

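I have tried using positive lookbehind and positive lookahead in multiline mode, but have not been able to figure it out. After some trial and error, I did find that $ is after the newline/carriage return.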

I am essentially trying to unwrap paragraphs while keeping them as paragraphs.

I want input like this example:

"I need to unblock," someone may have breathed out.\n
\n
"I know how to do it," I may have responded, picking up\n
the cue. My life has always included strong internal directives.\n
Marching orders) I call them.\n
\n
In any case, I suddenly knew that I did know how to un-\n
block people and that I was meant to do so, starting then and\n
there with the lessons I myself had learned.\n
\n
Where did the lessons come from?\n
\n
In 1978, in January, I stopped drinking. I had never\n
thought drinking made me a writer, but now I suddenly\n
thought not drinking might make me stop. In my mind,\n
drinking and writing went together like, well, scotch and\n
soda. For me, the trick was always getting past the fear and\n
onto the page. I was playing beat the clock-trying to write be-\n
fore the booze closed in like fog and my window of creativity\n
was blocked again.\n

to output this:

"I need to unblock," someone may have breathed out.\n
\n
"I know how to do it," I may have responded, picking up the cue. My life has always included strong internal directives. Marching orders) I call them.\n
\n
In any case, I suddenly knew that I did know how to un-block people and that I was meant to do so, starting then and there with the lessons I myself had learned.\n
\n
Where did the lessons come from?\n
\n
In 1978, in January, I stopped drinking. I had never thought drinking made me a writer, but now I suddenly thought not drinking might make me stop. In my mind, drinking and writing went together like, well, scotch and soda. For me, the trick was always getting past the fear and onto the page. I was playing beat the clock-trying to write be-fore the booze closed in like fog and my window of creativity was blocked again.\n

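I cobbled together this basic regex, but I assume it may miss certain kinds of visually empty lines. I would appreciate feedback on ways this regex could fail, how it might be greedier than I expect, or how to improve it. Others are welcome to try my current solution on regex101.com and fork it to play around or to teach me something.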

(?<=.$)([\r\n\f\v]?)(?!^$)

replaced with \s

If I understand correctly, you can use this regex:

(?<!\n)\n(?!\n)

Replace each match with an empty string.
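As a rough illustration outside Text Soap, here is a minimal Python sketch of the same substitution. It assumes Python's `re` module behaves like ICU for this particular pattern. The `unwrap` helper and its `joiner` parameter are hypothetical names introduced only for this example. The `joiner` option is an addition, not part of the regex above: with an empty replacement, words on either side of a removed line break are glued together unless the line already ends in a space.

```python
import re

# Pattern from the answer: a newline with no newline directly before or after it.
SINGLE_NEWLINE = re.compile(r"(?<!\n)\n(?!\n)")


def unwrap(text: str, joiner: str = "") -> str:
    """Replace every lone newline with `joiner` (hypothetical helper).

    Newlines that touch another newline (blank-line separators) are kept.
    Note that a single newline at the very end of the text also counts as
    "lone", because nothing follows it, so it is replaced as well.
    """
    return SINGLE_NEWLINE.sub(joiner, text)


sample = (
    '"I know how to do it," I may have responded, picking up\n'
    "the cue. My life has always included strong internal directives.\n"
    "Marching orders) I call them.\n"
    "\n"
    "Where did the lessons come from?\n"
)

# Empty replacement, as suggested in the answer.
print(unwrap(sample))

# Variation (an assumption, not from the answer): join with a single space.
print(unwrap(sample, joiner=" "))
```

With the empty replacement, "picking up" and "the cue" come out as "picking upthe cue". The space variant avoids that but would also put a space after a trailing hyphen such as "un-", so neither choice reproduces the desired output in every case.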

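If your line break is something other than \n, you can replace every \n in the pattern with the character or string you want to match. For example, if your line breaks are \r\n, use: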

(?<!\r\n)\r\n(?!\r\n)

Essentially, the regex finds a line break that is neither preceded nor followed by another line break, and replacing it with an empty string removes it.
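The idea of swapping \n for another line-break string can be sketched in Python as below. This is only an illustration under the assumption that the line-break string is known in advance. `lone_break_pattern` and `unwrap_with` are hypothetical helper names, not part of any tool mentioned here.

```python
import re


def lone_break_pattern(newline: str) -> re.Pattern:
    """Build (?<!NL)NL(?!NL) for an arbitrary line-break string NL."""
    nl = re.escape(newline)  # treat the break as literal text, not regex syntax
    return re.compile(f"(?<!{nl}){nl}(?!{nl})")


def unwrap_with(text: str, newline: str) -> str:
    """Delete every line break that is not adjacent to another line break."""
    return lone_break_pattern(newline).sub("", text)


crlf_text = "first line\r\nsecond line\r\n\r\nnext paragraph"

# The single \r\n inside the paragraph is removed; the \r\n\r\n pair is kept.
print(repr(unwrap_with(crlf_text, "\r\n")))
```

Building the pattern from a parameter keeps the lookbehind, the match, and the lookahead in sync, which is easy to get wrong when editing the three copies of the line break by hand.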