Join lines based on the first field pattern

I have a file with 200,000 lines. Each line starts with "IMAGE", "HISTO", or "FRAG". I need to join the HISTO and FRAG lines onto the IMAGE line that precedes them. Here is an example.

IMAGE Lots of Data on this line  
HISTO usually numbers 0 0 1 1 0 1 0  
FRAG Always at least 1 of these lines but can be more

The result needs to look like this:

>IMAGE Lots of Data on this line HISTO usually numbers 0 0 1 1 0 1 0 FRAG Always at least 1 of these lines but can be more

There can be many FRAG lines before the next IMAGE line starts a new record. I'm on a Mac, so I can use pretty much any tool, but I'm most familiar with vi.

AWK:

awk '/^IMAGE/&&NR>1 {print a; a=""} {a=a""$0" "} END{print a}' test.in

Explained:

/^IMAGE/ && NR>1 { # if the line starts with IMAGE and is not the first line
    print a        # empty buffer variable to output
    a=""           # reset the buffer after emptying
} 
{                  # for all records
  a=a""$0" "        # append to the buffer variable, prob. no need for ""
}
END {              # in the end
  print a          # empty the remaining buffer in the end
}
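
The one-liner above leaves a trailing space at the end of every joined line. If that matters, here is a minimal sketch of a variant that only puts the separator between lines. The file name `sample.in`, its contents, the function name `join_records`, and the variable name `buf` are made up for illustration, and it assumes the file starts with an IMAGE line:

```shell
# Hypothetical sample input (sample.in is an assumed name, not from the question).
cat > sample.in <<'EOF'
IMAGE img1 data
HISTO 0 0 1
FRAG a
FRAG b
IMAGE img2 data
HISTO 1 0 1
FRAG c
EOF

# join_records FILE: print one output line per IMAGE group.
join_records() {
  awk '
    /^IMAGE/ && NR > 1 { print buf; buf = "" }       # new group: flush the previous one
    { buf = ((buf == "") ? $0 : buf " " $0) }        # add a space only between lines
    END { if (NR > 0) print buf }                    # flush the last group, if any input
  ' "$1"
}

join_records sample.in
```

With the sample input this prints two lines, one per IMAGE group, with no trailing space:

    IMAGE img1 data HISTO 0 0 1 FRAG a FRAG b
    IMAGE img2 data HISTO 1 0 1 FRAG c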