Transform a pretty-printed table to a single line with separators, using Awk

Question

I am trying to clean up the output of a Python client. Here is an example:

+--------------------------+-----------+
| Text                     | Test      |
+--------------------------+-----------+
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
+--------------------------+-----------+

I start by piping the output through tail and head to strip the top and bottom:

Command_Output | tail -n +4 | head -n -1

So now we have the following:

| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |

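Now I am trying to remove the pipes from the table and turn the table into a single comma-separated line. However, it is important to keep the association between the two numbers in each row, so perhaps I should use two separators. Maybe the final output should look like this: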

111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789

So this is where I am right now:

Command_Output | tail -n +4 | head -n -1 | awk '{ $3 = "~"; print $0;}'

Can someone help me with the last part? I need to turn the table into a single comma-separated line.

Answer 1

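That works, but:

- it is limited to a single group of input lines, all of which are output as one line.
  - If you do not need grouping logic, consider a simpler single-group solution.
- it uses several GNU-specific options that generally do not work on non-Linux platforms.
- it uses 4 external processes where 1 would do.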

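A generalized solution that outputs each block of lines sharing the same (conceptual) first-column value as a single line, using only one POSIX-compliant awk command (still assuming a 2-column layout):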

 ... | awk '
  NR <= 3 || /^\+/ { next }                          # skip header and footer
  prev != "" && prev != $2 { printf "\n"; fsep="" }  # see if new block is starting
  { printf "%s", fsep $2 "~" $4; fsep=","; prev=$2 } # print line at hand
  END { printf "\n" }                                # print final newline
'
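
To make the grouping behaviour concrete, here is a small sketch that feeds the command above a table with two distinct first-column values. The sample data (AAA, BBB) and the function name group_rows are made up for illustration and are not part of the original question.

```shell
# Hypothetical helper wrapping the awk command above; reads the table on stdin.
group_rows() {
  awk '
    NR <= 3 || /^\+/ { next }                          # skip header and footer
    prev != "" && prev != $2 { printf "\n"; fsep="" }  # new block: end the current line
    { printf "%s", fsep $2 "~" $4; fsep=","; prev=$2 } # append the current row
    END { printf "\n" }                                # final newline
  '
}

# Made-up sample input: two rows for AAA, one row for BBB.
printf '%s\n' \
  '+-----+---+' \
  '| Key | N |' \
  '+-----+---+' \
  '| AAA | 1 |' \
  '| AAA | 2 |' \
  '| BBB | 3 |' \
  '+-----+---+' | group_rows
```

This should print AAA~1,AAA~2 on the first line and BBB~3 on the second, since a new output line starts whenever the first-column value changes.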

Handling a variable number of columns:

... | awk -F ' *\| *' '
  NR <= 3 || /^\+/ { next }                          # skip header and footer
  {                                                  # process each data row
    fsep=""; first=1
    for (i=1; i<=NF; ++i) {                          # loop over all fields
      if ($i == "") continue                         # skip empty fields
      # See if a new block is starting and print the appropriate record
      # separator.      
      if (first) {  
        if (prev != "") printf (prev != $i ? "\n" : ",") 
        prev=$i                                      # save record's 1st nonempty field
        first=0                                      # done with 1st nonempty field
      }
      printf "%s", fsep $i                           # print field at hand.
      fsep="~"                                       # set separator for subsequent fields
    }
  }
  END { printf "\n" }                                # print trailing newline
'
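
As a usage sketch, the same logic can be tried on a 3-column table. Everything here is illustrative: the sample rows and the function name join_rows are invented, and the field separator is written as [|] rather than \| (an assumption on my part, chosen because a bracket expression avoids backslash-escape differences between awk implementations).

```shell
# Hypothetical wrapper around the variable-column logic; reads the table on stdin.
# Assumption: [|] is used in place of \| in the field separator.
join_rows() {
  awk -F ' *[|] *' '
    NR <= 3 || /^\+/ { next }                  # skip header and footer
    {
      fsep = ""; first = 1
      for (i = 1; i <= NF; ++i) {
        if ($i == "") continue                 # skip the empty edge fields
        if (first) {                           # first nonempty field is the block key
          if (prev != "") printf (prev != $i ? "\n" : ",")
          prev = $i; first = 0
        }
        printf "%s", fsep $i                   # print the field itself
        fsep = "~"                             # separator for the remaining fields
      }
    }
    END { printf "\n" }                        # trailing newline
  '
}

# Made-up 3-column sample input.
printf '%s\n' \
  '+-----+---+---+' \
  '| Key | N | M |' \
  '+-----+---+---+' \
  '| AAA | 1 | x |' \
  '| AAA | 2 | y |' \
  '| BBB | 3 | z |' \
  '+-----+---+---+' | join_rows
```

With this input the output should be AAA~1~x,AAA~2~y followed by BBB~3~z on its own line.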

Answer 2

Command_Output | tail -n +4 | head -n -1 | awk -vORS=, '{ print $2 "~" $4 }' | sed 's/,$/\n/'

Thanks for the help.

Answer 3

A simpler awk-based solution:

Command | awk -vORS=, '($1=="|" && NR>3 ) {print $2"~"$4}'

However, this leaves a trailing , at the end. To fix that:

Command | awk -vORS= '($1=="|" && NR>3 ) {if (NR>4) {print ","}; print $2"~"$4}'

This gives:

111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789

Answer 4

This will work in all awks for any number of input columns:

$ awk -F ' *[|] *' -v OFS='~' 'NF>1 && ++c>1 {$1=$1; gsub(/^~|~$/,""); printf "%s%s", (c>2?",":""), $0} END{print ""}' file
111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789
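
The step that does the work here is $1=$1: assigning to any field makes awk rebuild $0 with the fields joined by OFS, and the gsub then trims the separators left at both ends by the empty edge fields. Here is a minimal sketch of just that step on a single made-up row; the function name show_rebuild is hypothetical.

```shell
# Hypothetical one-row demo of the $1=$1 rebuild idiom.
show_rebuild() {
  printf '%s\n' '| a | b | c |' |
    awk -F ' *[|] *' -v OFS='~' '{
      $1 = $1                      # force awk to rebuild $0 using OFS
      print "rebuilt: " $0         # the empty edge fields leave a ~ at each end
      gsub(/^~|~$/, "")            # strip the leading and trailing ~
      print "trimmed: " $0
    }'
}

show_rebuild
```

This should print "rebuilt: ~a~b~c~" and then "trimmed: a~b~c".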
