In awk or nawk, how do I use the last occurrence of pipe character as field delimiter, giving me 2 fields?

Question

I would rather not use gawk-only features, because I need to run this on a variety of UNIX flavors and not all of them have gawk. I have a file with lines like this:

^myfile\..*\.(pork|beef)$|send -d j
^myfile\..*\.(chicken|turkey|quail)$|send -d q
^myfile\..*\.cheese$|send -d u

Sometimes, but not always, the first field contains one or more pipe characters. Whatever follows the last pipe can reliably be treated as field 2.

Answer 1

I'm not sure whether this is fully portable, but I think it is:

awk '{
    # Find the position of the last "|" in the line.
    p=match($0, /\|[^|]*$/)

    # "Split" the line into two fields around that position.
    a[1]=substr($0, 1, p-1)
    a[2]=substr($0, p+1)

    printf "[%s] [%s]\n", a[1], a[2]
}' file.in

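As Ed Morton points out in the comments, p is not needed here: awk's match() function also sets the RSTART variable to the position in the string where the regex matched, so the above can also be written as: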

awk '{
    # Find the last "|" in the line.
    match($0, /\|[^|]*$/)

    # "Split" the line into two fields around that position (using the RSTART variable from the match() call).
    a[1]=substr($0, 1, RSTART-1)
    a[2]=substr($0, RSTART+1)

    printf "[%s] [%s]\n", a[1], a[2]
}' file.in

In fact, this exact task is the example given for match() in the awk Grymoire.
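One case the match() snippets do not address is a line with no pipe at all. There match() returns 0 and sets RSTART to 0, so the unguarded substr() calls would leave field 1 empty and put the whole line in field 2. A minimal sketch of a guarded version follows; the function name split_last_pipe and the policy of treating such a line as "field 1 only" are assumptions for illustration, not part of the original answer.

```shell
# Hypothetical helper: split each input line at its last "|".
split_last_pipe() {
    awk '{
        # match() returns 0 (and sets RSTART to 0) when the line has no "|".
        if (match($0, /\|[^|]*$/)) {
            left  = substr($0, 1, RSTART - 1)
            right = substr($0, RSTART + 1)
        } else {
            # Assumed policy: the whole line is field 1 and field 2 is empty.
            left  = $0
            right = ""
        }
        printf "[%s] [%s]\n", left, right
    }' "$@"
}

# Demo on one line from the question plus a line without any pipe.
printf '%s\n' '^myfile\..*\.(pork|beef)$|send -d j' 'no pipe here' | split_last_pipe
```

The function reads standard input when called without arguments, or the named files otherwise, for example split_last_pipe file.in.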

Answer 2

You can set FS to $|:

$ awk -F'[$][|]' '{printf "[%s$] [%s]\n", $1, $2}' file
[^myfile\..*\.(pork|beef)$] [send -d j]
[^myfile\..*\.(chicken|turkey|quail)$] [send -d q]
[^myfile\..*\.cheese$] [send -d u]

You can tack the $ back on to the end of $1 if you like:

$ awk -F'[$][|]' '{$1=$1"$"; printf "[%s] [%s]\n", $1, $2}' file
[^myfile\..*\.(pork|beef)$] [send -d j]
[^myfile\..*\.(chicken|turkey|quail)$] [send -d q]
[^myfile\..*\.cheese$] [send -d u]
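The FS approach relies on every first field ending in $, so that the two-character sequence $| occurs exactly once per line. The sketch below flags lines where that assumption does not hold by checking NF. The function name split_on_dollar_pipe and the choice to skip bad lines with a message on standard error are assumptions for illustration only.

```shell
# Hypothetical helper: split on the literal "$|" and verify the assumption
# that it occurs exactly once per line.
split_on_dollar_pipe() {
    awk -F'[$][|]' '
        NF != 2 {
            # Report to stderr through a pipe to cat, which avoids relying
            # on awk-specific handling of "/dev/stderr".
            printf "line %d: expected exactly one \"$|\", skipping\n", NR | "cat 1>&2"
            next
        }
        # FS consumed the "$", so print it back after field 1.
        { printf "[%s$] [%s]\n", $1, $2 }
    ' "$@"
}

# Demo: the second line has no "$|" and is reported on stderr instead.
printf '%s\n' '^myfile\..*\.cheese$|send -d u' 'nodollar|x' | split_on_dollar_pipe
```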

Another approach, if you prefer:

$ awk '{f1=f2=$0; sub(/\|[^|]*$/,"",f1); sub(/.*\|/,"",f2); printf "[%s] [%s]\n", f1, f2}' file
[^myfile\..*\.(pork|beef)$] [send -d j]
[^myfile\..*\.(chicken|turkey|quail)$] [send -d q]
[^myfile\..*\.cheese$] [send -d u]

Answer 3

You can also do it like this (here I chose a tab as the new delimiter):

awk -vRS='[|]' -vORS='' 'NR>1{printf /\n/?"\t":"|"}1' file
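The one-liner reads the file as pipe-separated records: a record that contains a newline begins right after the last pipe of some line, so the pipe in front of it is replaced with a tab, while every other pipe is printed back unchanged. Below is a commented sketch of the same idea using a single-character RS, which is the form POSIX specifies (a multi-character or regex RS is an extension). The function name last_pipe_to_tab is hypothetical, and the sketch assumes every line, including the last, ends with a newline.

```shell
# Hypothetical helper: replace the last "|" on each line with a tab.
last_pipe_to_tab() {
    awk -v RS='|' -v ORS='' '
        # Records are the pieces of text between pipes. Before each record
        # except the first, emit the separator that preceded it: a tab if
        # the record spans a line break, otherwise the original pipe.
        NR > 1 { printf "%s", ($0 ~ /\n/ ? "\t" : "|") }
        # Print the record itself; ORS is empty so nothing is appended.
        { print }
    ' "$@"
}

# Demo on two short lines.
printf 'a|b|c\nd|e\n' | last_pipe_to_tab
```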
