Different results in awk when using different FS syntax

I have a sample file with the following contents.

logging.20160309.113.txt.log:  0 Rows successfully loaded.
logging.20160309.1180.txt.log:  0 Rows successfully loaded.
logging.20160309.1199.txt.log:  0 Rows successfully loaded.

I am familiar with two ways of setting the field separator in awk. However, they are currently giving me different results.

For the longest time I have used:

the "FS=" syntax when my FS is more than one character.

the "-f" flag when my FS is just one character.

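I would like to understand why the FS= syntax gives me an unexpected result, as shown below. Somehow the first record is not split on the new separator.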

$ head -3 reload_list | awk -F"\.log\:" '{ print $1 }'
awk: warning: escape sequence `\.' treated as plain `.'
awk: warning: escape sequence `\:' treated as plain `:'
logging.20160309.113.txt
logging.20160309.1180.txt
logging.20160309.1199.txt
$ head -3 reload_list | awk '{ FS="\.log\:" } { print $1 }'
awk: warning: escape sequence `\.' treated as plain `.'
awk: warning: escape sequence `\:' treated as plain `:'
logging.20160309.113.txt.log:
logging.20160309.1180.txt
logging.20160309.1199.txt

-f is for running a script from a file. -F and FS work the same way:

$ awk -F'.log' '{print $1}' logs
logging.20160309.113.txt
logging.20160309.1180.txt
logging.20160309.1199.txt

$ awk 'BEGIN{FS=".log"} {print $1}' logs
logging.20160309.113.txt
logging.20160309.1180.txt
logging.20160309.1199.txt
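One detail worth illustrating about the separator itself: an FS longer than one character is treated as a regular expression, so the unescaped dot in '.log' matches any character, not only a literal dot. The sketch below uses a made-up record (catalog.log: is hypothetical, not from the question's file) to show where that matters, and uses the bracket expression [.] as one way to match a literal dot without a backslash.

```shell
# Hypothetical record whose name happens to contain "alog" before ".log".
line='catalog.log:  0 Rows successfully loaded.'

# '.log' as a regex: "." matches any character, so "alog" inside
# "catalog" is taken as the first separator.
loose=$(printf '%s\n' "$line" | awk -F'.log' '{ print $1 }')

# '[.]log' only matches a literal dot followed by "log".
strict=$(printf '%s\n' "$line" | awk -F'[.]log' '{ print $1 }')

printf 'loose:  %s\nstrict: %s\n' "$loose" "$strict"
```

With the question's actual file names both forms give the same first field, so this only matters for inputs where some other character is followed by "log".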

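You get different results because, when you set FS inside the awk program, the assignment is not in a BEGIN block. By the time it runs, the first record has already been split into fields using the default separator, so the new FS only takes effect from the second record onward.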

Setting it with -F

 $ awk -F"\.log:" '{ print $1 }' b.txt
 logging.20160309.113.txt
 logging.20160309.1180.txt
 logging.20160309.1199.txt

Setting FS after the first record has been parsed

$ awk '{ FS= "\.log:"} { print $1 }' b.txt
logging.20160309.113.txt.log:
logging.20160309.1180.txt
logging.20160309.1199.txt

Setting FS before any record is parsed

$ awk 'BEGIN { FS= "\.log:"} { print $1 }' b.txt
logging.20160309.113.txt
logging.20160309.1180.txt
logging.20160309.1199.txt
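As a compact recap of the working variants, here is a minimal sketch that sets the separator before the first record is read in three ways: the -F flag, a -v FS= assignment, and a BEGIN block. It feeds one line copied from the question through printf rather than reading b.txt, and it writes the separator as [.]log: (my substitution, not the original command) so that no escape-sequence warning is triggered.

```shell
# One record in the same shape as the question's file.
line='logging.20160309.113.txt.log:  0 Rows successfully loaded.'

# Each form puts FS in place before awk reads the first record.
via_flag=$(printf '%s\n' "$line" | awk -F'[.]log:' '{ print $1 }')
via_var=$(printf '%s\n' "$line" | awk -v FS='[.]log:' '{ print $1 }')
via_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[.]log:" } { print $1 }')

# All three should print the same first field.
printf '%s\n' "$via_flag" "$via_var" "$via_begin"
```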

I noticed this in the awk manual. If you have seen different behavior before, or are using a different implementation, this may explain why:

According to the POSIX standard, awk is supposed to behave as if each record is split into fields at the time that it is read. In particular, this means that you can change the value of FS after a record is read, but before any of the fields are referenced. The value of the fields (i.e. how they were split) should reflect the old value of FS, not the new one.

However, many implementations of awk do not do this. Instead, they defer splitting the fields until a field reference actually happens, using the current value of FS! This behavior can be difficult to diagnose.
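If FS really has to change inside a main rule, one way to avoid depending on when an implementation splits the record is to assign $0 to itself after setting FS, which makes awk split the current record again with the FS in effect at that moment. This is a sketch of that idiom rather than something from the thread; the two input lines are copied from the question, and the [.]log: spelling of the separator is my own choice.

```shell
# Two records in the same shape as the question's file.
input='logging.20160309.113.txt.log:  0 Rows successfully loaded.
logging.20160309.1180.txt.log:  0 Rows successfully loaded.'

# FS is set in a main rule, then "$0 = $0" forces the current record
# to be re-split with the new FS before $1 is referenced.
resplit=$(printf '%s\n' "$input" |
  awk '{ FS = "[.]log:"; $0 = $0; print $1 }')

printf '%s\n' "$resplit"
```

Here the first record is split on the new separator as well, whether the implementation splits eagerly or lazily. Setting FS in BEGIN or with -F remains the simpler choice when the separator is known up front.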