How do you skip characters with FIELDWIDTHS in GNU Awk 4.2?

GNU awk 4.2 was released, and it includes many interesting features. One of them is:

  1. The FIELDWIDTHS parsing syntax has been enhanced to allow specifying how many characters to skip before a field starts. It also allows specifying '*' as the last character to mean "the rest of the record". Field splitting with FIELDWIDTHS now sets NF correctly. The documentation for FIELDWIDTHS in the manual has been considerably reorganized and improved as well.

I tested the * thingie and it works nicely to capture the last block into $NF:

# "*" catches in $NF from the 2+2+1=5th character and until the end
$ awk 'BEGIN {FIELDWIDTHS="2 2 *"} {print $NF}' <<< "1234567890"
567890

However, I cannot see how to use the first part of the feature, which is also described in the GNU Awk User's Guide → A.6 History of gawk Features → Version 4.2 of gawk introduced the following changes:

FIELDWIDTHS was enhanced to allow skipping characters before assigning a value to a field (see Splitting By Content).

I cannot find an example in the linked section either. So what exactly does this feature do, and how does it work?

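The GNU Awk documentation has a section where FIELDWIDTHS is defined.
That section also contains the notation that explains the new feature, "skipping characters before assigning a value to a field".

Here it is, from 7.5.1 Built-in Variables That Control awk: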

FIELDWIDTHS #

A space-separated list of columns that tells gawk how to split input with fixed columnar boundaries. Starting in version 4.2, each field width may optionally be preceded by a colon-separated value specifying the number of characters to skip before the field starts. Assigning a value to FIELDWIDTHS overrides the use of FS and FPAT for field splitting.


How it works in practice:

Suppose we want to skip 3 characters before the first field and 1 character before the second field.

awk 'BEGIN {FIELDWIDTHS="3:2 1:2 *"} {print $1, $2}' <<< "1234567890"

Output:

45 78

So 3:2 skips 123 and assigns 45 to $1, and 1:2 skips 6 and assigns 78 to $2.