Extract column 'x' from multiple files, and transpose file name with 'x'

Question

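I am trying to extract column "m" (the third column in the example below) from several txt files (file1.txt, file2.txt, etc.) and transpose each column into a row of a new file.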

Here is file1.txt:

contig_1    contig_1    geneX       ctg1_886;ctg1_887;ctg1_888
contig_2    contig_2    geneY       ctg1_886;ctg1_887;ctg1_888
contig_3    contig_3    geneZ       ctg1_886;ctg1_887;ctg1_888

I would like a summary.txt file that looks like this:

file1 geneX geneY geneZ
file2 geneA geneY
.
.
.
etc.

The number of lines may differ from file to file. I tried using awk but had no success.

Answer 1

Following glenn jackman's suggestion in the comments, a GNU AWK solution looks like this:

awk 'BEGIN {ORS=" "} BEGINFILE {print FILENAME} {print $3} ENDFILE {printf("\n")}' file*.txt

A plain awk solution might look like this (sorry, I only have GNU awk to test with):

awk 'BEGIN {ORS=" "} FNR==1 {printf("%s%s ", (NR>1 ? "\n" : ""), FILENAME)} {print $3} END {printf("\n")}' file*.txt

Explanation

There are a few special patterns:

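BEGIN: its action runs once, before any input is read. Here ORS (the output record separator) is set to a space, so each input line becomes a new column in the output. This is the transpose step.
END: its action runs once, after all input has been read.
BEGINFILE and ENDFILE (GNU awk only): their actions run once at the start and once at the end of each file. Here they print FILENAME and a newline, respectively.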
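
To see the effect of ORS and the per-file boundary in isolation, here is a small self-contained sketch. The scratch directory and the two sample files are made up for illustration (the contents of file2.txt are an assumption based on the desired output in the question). It uses FNR==1 rather than BEGINFILE so that it does not depend on GNU awk.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample data: column 3 holds the gene names.
printf 'contig_1 contig_1 geneX a;b\ncontig_2 contig_2 geneY a;b\n' > file1.txt
printf 'contig_1 contig_1 geneA a;b\n' > file2.txt

# ORS=" " makes every print end with a space instead of a newline,
# so the third fields of one file end up side by side on one line.
# FNR==1 is true on the first line of each file: end the previous
# output line (except before the very first file), then print the name.
result=$(awk 'BEGIN { ORS = " " }
     FNR == 1 { if (NR > 1) printf "\n"; printf "%s ", FILENAME }
     { print $3 }
     END { printf "\n" }' file1.txt file2.txt)

printf '%s\n' "$result"
```

Each output line starts with the file name and ends with a trailing space, because ORS is emitted after every field, including the last one of each file.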

Answer 2

Assuming the field separator is a run of spaces:

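for f in file*.txt; do
    # Squeeze runs of spaces into one, then take field 3.
    # The unquoted $(...) is deliberate: word splitting joins the lines.
    echo "$f" $(tr -s ' ' < "$f" | cut -d ' ' -f 3)
done > summary.txt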

If the data is tab-separated:

for f in file*.txt; do
    # cut splits on tabs by default.
    echo "$f" $(cut -f 3 "$f")
done > summary.txt
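
If it is not known whether a file uses tabs or runs of spaces, one possible variant lets awk split on any whitespace, which is its default behaviour, and strips the .txt suffix so the labels match the file1 / file2 form shown in the question. This is a sketch rather than part of the original answer: the sample files are made up, and it assumes a POSIX-style shell (sh or bash) and that column 3 contains no glob characters.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample data: one file separated by spaces, one by tabs.
printf 'contig_1    contig_1    geneX       x;y\ncontig_2    contig_2    geneY       x;y\n' > file1.txt
printf 'contig_1\tcontig_1\tgeneA\tx;y\n' > file2.txt

for f in file*.txt; do
    # ${f%.txt} drops the extension. The unquoted $(...) is deliberate:
    # word splitting joins the lines of column 3 into a single line.
    echo "${f%.txt}" $(awk '{ print $3 }' "$f")
done > summary.txt

cat summary.txt
```

Unlike the ORS approach, echo joins its arguments with single spaces, so the lines in summary.txt have no trailing space.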


unix

shell

awk

cut

find