Bash: reshape a dataset of many rows to dataset of many columns

Question

Suppose I have the following data:

# all the numbers are their own number.  I want to reshape exactly as below
0 a 
1 b
2 c
0 d
1 e
2 f
0 g
1 h
2 i
...

I want to reshape the data so that it becomes:

0 a d g ...
1 b e h ... 
2 c f i ...

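Without writing a complex composition of commands. Is this possible with the unix/bash toolkit?

Yes, I could do this trivially in a programming language. The idea is NOT to "just" do that. So if there is some cat X.csv | rs [magic options] type of solution (and rs, or the bash reshape command, would be great, except that it isn't working here on debian stretch), that is what I am looking for.

Otherwise, an equivalent answer that involves a composition of commands or a script is out of scope: I already have that, but would rather not use it.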

Answer 1

Using GNU datamash:

$ datamash -s -W -g 1 collapse 2 < file
0       a,d,g
1       b,e,h
2       c,f,i

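Options:

-s sort the input before grouping
-W use whitespace (spaces or tabs) as delimiters
-g 1 group on the first field
collapse 2 print a comma-separated list of the values of the second field

To convert the tabs and commas to space characters, pipe the output to tr: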

$ datamash -s -W -g 1 collapse 2 < file | tr '\t,' ' '
0 a d g
1 b e h
2 c f i
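
For readers who want to see what the options listed above do step by step, the same sort, group, and collapse sequence can be approximated with sort and awk. This is only a sketch, not part of the original answer: collapse_by_key is a hypothetical name, the numeric sort key is an assumption that fits the sample data, and -s (stable sort) is assumed to be available, as it is in GNU sort.

```shell
# collapse_by_key is a hypothetical helper that mimics
# "datamash -s -W -g 1 collapse 2 | tr" using only sort and awk.
collapse_by_key() {
    # Stable numeric sort on field 1, so values keep their input order
    # inside each group (assumes a sort that supports -s, e.g. GNU sort).
    sort -s -n -k1,1 | awk '
        # A new group starts on the first line or when the key changes.
        NR == 1 || $1 != key {
            if (NR > 1) print line   # flush the previous group
            key = $1
            line = $1
        }
        # Append the second field to the current group.
        { line = line " " $2 }
        # Flush the last group, if there was any input at all.
        END { if (NR > 0) print line }
    '
}

# Sample data from the question.
printf '0 a\n1 b\n2 c\n0 d\n1 e\n2 f\n0 g\n1 h\n2 i\n' | collapse_by_key
```

This is a composition of commands, which the question asks to avoid, so it is only useful as an illustration of the grouping or as a fallback where datamash cannot be installed.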

Answer 2

A bash version:

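function reshape {
    local index number key
    # associative array: row label -> space-prefixed list of values
    declare -A result
    # -r keeps any backslashes in the input literal
    while read -r index number; do
        result[$index]+=" $number"
    done
    # note: bash does not guarantee the order in which the keys come back
    for key in "${!result[@]}"; do
        echo "$key${result[$key]}"
    done
}
reshape < input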

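We only need to make sure the input is in unix format (LF line endings); with CRLF input, each value would carry a trailing carriage return.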
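
If the file may have been saved with Windows line endings, the carriage returns can be removed before the data reaches the loop. A minimal sketch, assuming tr is available; to_unix is a hypothetical helper name and not part of the original answer:

```shell
# to_unix is a hypothetical helper: it deletes every carriage return,
# turning CRLF line endings into plain LF.
to_unix() {
    tr -d '\r'
}

# Demonstration on CRLF sample data; the output has LF endings only.
printf '0 a\r\n1 b\r\n0 d\r\n1 e\r\n' | to_unix
```

With the function from this answer, the call would then look like to_unix < input | reshape.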
