Remove specific character that repeats more than once, and leave only one

Question

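I've spent the last 20 minutes going through answers about removing duplicate words and trying to adapt them to my use case, without success.

I have a small command that removes special characters and converts spaces to dashes: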

echo "[[[h[el]lo - w{o}rld%^& -text" | tr -d '?$#@;:/\="<>%{}|^~[]&`' | tr ' ' '-'

which produces the following output:

hello---world--text

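This works perfectly, but I'd also like to add something to the command, perhaps another pipe, that removes the repeated dashes.

For example, I'd like the output above to become: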

hello-world-text

How can I do this in the most POSIX-compliant way?

PS: Please let me know if there is a more efficient way to do what I'm doing there.

Answer 1

You can use tr's -s flag for this:

echo "[[[h[el]lo - w{o}rld%^& -text" | tr -d '?$#@;:/\="<>%{}|^~[]&`' | tr -s ' ' '-'

-s, --squeeze-repeats
              replace each sequence of a repeated character that is listed in the  last
              specified SET, with a single occurrence of that character
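
If you would rather keep the original pipeline untouched and add one more stage, as the question suggests, -s also works with a single set and no translation. This is only a sketch; the function name squeeze_dashes is made up for illustration:

```shell
# squeeze_dashes is a hypothetical helper name, not part of the original command.
# With only -s and one set, tr squeezes each run of '-' into a single '-'
# and leaves every other character alone.
squeeze_dashes() {
    tr -s '-'
}

printf '%s\n' 'hello---world--text' | squeeze_dashes
# prints: hello-world-text
```

Appending `| tr -s '-'` to the command from the question should have the same effect.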

Answer 2

For this particular case, we can have tr delete everything except - and letters/digits ([:alnum:]):

$ echo "[[[h[el]lo - w{o}rld%^& -text" | tr -dc -- '-[:alnum:]'
hello-world-text

### or

$ tr -dc -- '-[:alnum:]' <<< "[[[h[el]lo - w{o}rld%^& -text"
hello-world-text

The key is the -c flag, which takes the complement of the given set (that is, everything except the listed characters).
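
To see why this fits only this particular input, here is a small sketch (keep_alnum_dash is a hypothetical helper name). Deleting the complement drops spaces instead of turning them into dashes, and it does not squeeze dashes that are already adjacent. It also deletes the trailing newline, hence the extra echo:

```shell
# keep_alnum_dash is a hypothetical helper: it deletes every character
# that is not '-' or alphanumeric, including spaces and newlines.
keep_alnum_dash() {
    printf '%s' "$1" | tr -dc -- '-[:alnum:]'
}

keep_alnum_dash 'hello world'; echo      # spaces are deleted:      helloworld
keep_alnum_dash 'hello -- world'; echo   # dashes are not squeezed: hello--world
```

The sample string from the question works because each group of spaces happens to surround exactly one dash.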

Answer 3

You can also use sed:

$ echo "[[[h[el]lo - w{o}rld%^& -text" | sed -E 's/([^-[:alnum:]]*)//g'
hello-world-text
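
Note that the command above deletes characters but does not collapse repeated dashes; it gives the desired result here because no two dashes end up adjacent. If the goal is specifically to squeeze runs of dashes, a sed stage along these lines might do it, using only a basic regular expression (squeeze_dashes_sed is a made-up name):

```shell
# squeeze_dashes_sed is a hypothetical helper name.
# '--*' matches one '-' followed by zero or more '-', i.e. any run of dashes,
# and replaces the whole run with a single '-'.
squeeze_dashes_sed() {
    sed 's/--*/-/g'
}

printf '%s\n' 'hello---world--text' | squeeze_dashes_sed
# prints: hello-world-text
```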

string

bash

posix