Merging columns in sliding window steps

Question

I have a tab-delimited file

NC_044998.1     3778    0       CC      0       CC      0       CC      0       CC      1       CT      0       CC      0       CC      0       CC      0       CC      1       CT      0      CC       var     heterozygous    varvar  9       0.818182        2       0.181818        refref  refref  refref  refref  refdev  refref  refref  refref  refref  refdev  refref  homo    homo   homo     homo    het     homo    homo    homo    homo    het     homo    9       0       2       0       9       2
NC_044998.1     3787    0       CC      0       CC      1       CG      0       CC      0       CC      0       CC      0       CC      0       CC      0       CC      0       CC      0      CC       var     heterozygous    varvar  10      0.909091        1       0.0909091       refref  refref  refdev  refref  refref  refref  refref  refref  refref  refref  refref  homo    homo   het      homo    homo    homo    homo    homo    homo    homo    homo    10      0       1       0       10      1

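where columns 32-42 hold one kind of information (ref/dev) and columns 43-53 hold another (homo/het). I would like to merge them so that column i = 32 is merged with the column at i + 11 = 43 and appended to the end of the line, then move to i + 1 so that columns 33 and 44 are merged and appended, then 34 and 45, and so on. The output should look like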

NC_044998.1     3778    0       CC      0       CC      0       CC      0       CC      1       CT      0       CC      0       CC      0       CC      0       CC      1       CT      0      CC       var     heterozygous    varvar  9       0.818182        2       0.181818        refref  refref  refref  refref  refdev  refref  refref  refref  refref  refdev  refref  homo    homo   homo     homo    het     homo    homo    homo    homo    het     homo    9       0       2       0       9       2   refrefhomo  refrefhomo  refrefhomo  refrefhomo  refdevhet  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refdevhet refrefhomo
NC_044998.1     3787    0       CC      0       CC      1       CG      0       CC      0       CC      0       CC      0       CC      0       CC      0       CC      0       CC      0      CC       var     heterozygous    varvar  10      0.909091        1       0.0909091       refref  refref  refdev  refref  refref  refref  refref  refref  refref  refref  refref  homo    homo   het      homo    homo    homo    homo    homo    homo    homo    homo    10      0       1       0       10      1   refrefhomo  refrefhomo  refdevhet  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo

I can do it one pair at a time with

 cut -f32,43 file | sed 's/ //g' | paste file - > file.tmp

but that way I end up generating 11 tmp files.

Answer 1

You may try this awk:

awk 'BEGIN {
   FS=OFS="\t"
}
{
   # for each column 32..42, append it joined with the column 11 places later
   for (i=32; i<43; ++i)
      $0 = $0 OFS $i $(i+11)
} 1' file

NC_044998.1 3778    0   CC  0   CC  0   CC  0   CC  1   CT  0   CC  0   CC  0   CC  0   CC  1   CT  0   CC  var heterozygous    varvar  9   0.818182    2   0.181818    refref  refref  refref  refref  refdev  refref  refref  refref  refref  refdev  refref  homo    homo    homo    homo    het homo    homo    homo    homo    het homo    9   0   2   0   9   2   refrefhomo  refrefhomo  refrefhomo  refrefhomo  refdevhet   refrefhomo  refrefhomo  refrefhomo  refrefhomo  refdevhet   refrefhomo
NC_044998.1 3787    0   CC  0   CC  1   CG  0   CC  0   CC  0   CC  0   CC  0   CC  0   CC  0   CC  0   CC  var heterozygous    varvar  10  0.909091    1   0.0909091   refref  refref  refdev  refref  refref  refref  refref  refref  refref  refref  refref  homo    homo    het homo    homo    homo    homo    homo    homo    homo    homo    10  0   1   0   10  1   refrefhomo  refrefhomo  refdevhet   refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo  refrefhomo
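
To see the loop at a smaller scale, here is a minimal sketch on a made-up five-column file. The file name demo.tsv, its contents, and the column numbers (start at 2, offset 2) are assumptions for illustration only; the real file uses 32 and 11.

```shell
# Hypothetical toy input: columns 2-3 hold ref/dev labels, columns 4-5 hold homo/het labels.
printf 'id1\trefref\trefdev\thomo\thet\n' > demo.tsv

# Append column 2 joined with column 4, then column 3 joined with column 5.
result=$(awk 'BEGIN { FS = OFS = "\t" }
{
   for (i = 2; i < 4; ++i)
      $0 = $0 OFS $i $(i + 2)
} 1' demo.tsv)

printf '%s\n' "$result"
```

The original five fields are left in place and two merged fields (refrefhomo, refdevhet) are added at the end of the line, so no intermediate tmp files are needed.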

Answer 2

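The logic is the same as in @anubhava's answer. This version uses variables plus one small extra step: the offset between the two columns being merged is computed as the difference between the last field number and the starting field number.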

awk -v startFieldnum="32" -v tillFieldnum="43" '
BEGIN{
   FS=OFS="\t"
   diff=(tillFieldnum-startFieldnum)
}
{
   # append each field joined with the field diff places later
   for (i=startFieldnum; i<tillFieldnum ; ++i)
      $0 = $0 OFS $i $(i+diff)
} 1' Input_file
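
Both answers reassign $0 inside the loop, which makes awk split the record into fields again after each assignment. A possible variation, not taken from either answer, collects the merged values in a separate string and prints it once per line. The file name toy.tsv, its contents, and the variable names start, till and suffix are hypothetical.

```shell
# Hypothetical toy input: columns 2-3 are paired with columns 4-5; column 6 is extra data.
printf 'a\tx1\tx2\ty1\ty2\tn\n' > toy.tsv

merged=$(awk -v start=2 -v till=4 'BEGIN { FS = OFS = "\t"; diff = till - start }
{
   suffix = ""                              # merged fields for this line
   for (i = start; i < till; ++i)
      suffix = suffix OFS $i $(i + diff)    # column i joined with column i+diff
   print $0 suffix                          # original line first, merged fields last
}' toy.tsv)

printf '%s\n' "$merged"
```

For the file in the question the equivalent call would pass start=32 and till=43, matching the values used above.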


bash

awk

gsub