How to remove only the newline between > [greater than] and < [less than] in Unix

Question

Suppose I have text in the following format:

"this\n is >\n<"

I want to remove only the newline between > and <, which would result in:

"this\n is ><"

How can I do this?

I tried the following:

echo "this\n is >\n<" | sed -e 's/>\n<//g'

and

echo "this\n is >\n<" | sed -e 's/>\n</></g'

But neither of them worked. Any suggestions from the brilliant minds here?

Answer 1

sed works on a line-by-line basis, but you can work around that:

printf 'this\n is >\n<\n' | sed ':a;N;$!ba;s/>\n</></g'

This is a well-known pattern that you can find in several other places.

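In essence: :a creates a label named a; N appends the next line of input to the pattern space; $!ba branches back to a unless the current line is the last one, so the loop keeps going until the entire input is in the pattern space. The substitution (s/>\n</></g) then runs once over all the lines together.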
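The same loop can be written with one command per -e, which makes each step easier to see. Some non-GNU sed implementations (BSD sed, for example) also need this form, because they read a label name up to the end of the line instead of stopping at the semicolon. A minimal sketch using the sample input from the question; the variable name joined is only for illustration:

```shell
# :a     define label "a"
# N      append the next input line to the pattern space
# $!ba   if this is not the last line, branch back to "a"
# s///g  once everything is loaded, delete each newline sitting between > and <
joined=$(printf 'this\n is >\n<\n' | sed \
  -e ':a' \
  -e 'N' \
  -e '$!ba' \
  -e 's/>\n</></g')
printf '%s\n' "$joined"
```

Note that this approach holds the entire input in memory, which is fine for small files but may matter for very large ones.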

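There are other options, but this is probably the most portable, since sed is more widely available than other tools that could do this, such as Perl. You could probably hack something together with awk, but I don't know how to do that without it being more verbose than this sed solution.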

Answer 2

With echo

You were very close:

$ echo "this\n is >\n<" | sed -e 's/>\\n</></g'
this\n is ><

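In sed, \n is a newline character. However, your string contains no newline characters: it has a backslash followed by n. So we need to tell sed to look for a literal backslash-n, which is done by doubling the backslash.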
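If it is unclear which of the two a given string contains, dumping its bytes can help. A small sketch using od -c; here printf '%s' stands in for an echo that does not interpret escapes, and the variable names literal and real are only for illustration:

```shell
# '%s' prints its argument as-is, so this string keeps a backslash followed by n.
literal=$(printf '%s' 'is >\n<' | od -c)
# Here the format string itself is interpreted, so \n becomes a real newline.
real=$(printf 'is >\n<' | od -c)
printf '%s\n' "$literal" "$real"
```

In the od output, a real newline is shown as a single \n cell, while the literal form shows \ and n as two separate cells.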

With echo and no options (as in bash), the string contains no newline characters:

$ echo "this\n is >\n<"
this\n is >\n<

However, if we use printf, the \n sequences are converted to newlines:

$ printf "this\n is >\n<"
this
 is >
<

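We can then remove the newline between the angle brackets using GNU sed. Its -z option separates input on NUL bytes rather than newlines, so for ordinary text the whole input is treated as one string and \n can be matched directly: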

$ printf "this\n is >\n<" | sed -z 's/>\n</></g'
this
 is ><

(On macOS, GNU sed is typically installed as gsed.)

Answer 3

This might work for you (GNU sed):

sed ':a;N;s/>\n</></;ta;P;D' file

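This reads two lines into the pattern space and, if the newline between them sits between > and <, removes it; ta then branches back to :a to append another line and try again. Otherwise, P prints the first line, D deletes it, and the cycle repeats with what is left.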
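Because ta loops back after every successful substitution, runs of consecutive >/< line breaks get joined as well, while unrelated lines pass through unchanged. A small sketch to illustrate, assuming GNU sed and a made-up four-line input; the variable name out is only for illustration:

```shell
# Lines 1-3 are chained by ">" at line end and "<" at line start; line 4 is unrelated.
out=$(printf 'a>\n<b>\n<c\nd\n' | sed ':a;N;s/>\n</></;ta;P;D')
printf '%s\n' "$out"
```

Unlike the slurp-everything loop, this variant keeps only the line being built plus the next one in the pattern space, which may be preferable for large files.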
