Need to remove duplicate lines using mawk (specifically)

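I have a gawk command that works well. However, one of my machines has mawk installed, and when I try to install gawk there it complains about broken dependencies. I would like to change this line to mawk syntax.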

awk -F '[|]{3}' 'BEGIN {OFS="|||"} !seen[]++ {print ,,,,,,,,}' 

Input file: this is a file delimited by triple pipes (|||)

A|||B|||C|||D|||E|||F|||G|||H|||I|||J|||K||||L|||M|||N|||O|||P|||Q|||R|||S||||T|||U
1|||2|||3|||4|||5|||6|||7|||8|||9|||10|||11|||12|||13|||14|||15|||16|||17|||18|||19

POSIX awk uses extended regular expressions, in which repetition of a character can be defined with an interval expression such as {m,n}:

When an ERE matching a single character or an ERE enclosed in parentheses is followed by an interval expression of the format {m}, {m,}, or {m,n}, together with that interval expression it shall match what repeated consecutive occurrences of the ERE would match. The values of m and n are decimal integers in the range 0 <= m<= n<= {RE_DUP_MAX}, where m specifies the exact or minimum number of occurrences and n specifies the maximum number of occurrences. The expression {m} matches exactly m occurrences of the preceding ERE, {m,} matches at least m occurrences, and {m,n} matches any number of occurrences between m and n, inclusive.

source: POSIX Regular Expressions
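
To make the three interval forms from the quote concrete, here is a small sketch. It uses grep -E rather than awk, on the assumption that your grep implements POSIX ERE intervals, because awk implementations differ on exactly this point. The sample lines are made up for illustration.

```shell
# Hypothetical sample lines with one to four pipes between "a" and "b".
lines='a|b
a||b
a|||b
a||||b'

# {3}: exactly three pipes
exact=$(printf '%s\n' "$lines" | grep -E '^a[|]{3}b$')

# {2,3}: between two and three pipes, inclusive
range=$(printf '%s\n' "$lines" | grep -E '^a[|]{2,3}b$')

# {2,}: at least two pipes (-c counts the matching lines)
at_least=$(printf '%s\n' "$lines" | grep -cE '^a[|]{2,}b$')

printf 'exact:\n%s\nrange:\n%s\nat least two: %s\n' "$exact" "$range" "$at_least"
```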

Unfortunately, mawk does not support this kind of repetition, as can be read in its manual (Section 3, Regular Expressions).

So instead of -F '[|]{3}', the field separator FS has to be defined as -F '[|][|][|]' or -F "\|\|\|"
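
A minimal sketch of the spelled-out separator in use. The field references in the question's command are not visible, so this example assumes the first field is the deduplication key and prints only two fields; the sample data is made up. Adjust the field numbers to match your real command.

```shell
# Hypothetical input: three |||-separated fields, first field used as the key.
input='a|||1|||x
b|||2|||y
a|||3|||z'

# [|][|][|] spells out three literal pipes, so no {3} interval is needed.
# !seen[$1]++ is true only the first time a given key is encountered.
result=$(printf '%s\n' "$input" |
  awk -F '[|][|][|]' 'BEGIN { OFS = "|||" } !seen[$1]++ { print $1, $2 }')

printf '%s\n' "$result"
```

Because the bracket form uses no interval expression, the same command should behave the same under gawk and mawk, which makes it the more portable choice.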