Bash: Filter file by last word

Question

I have a log file that looks like this:

Sun Oct 14 03:38:28 2018 [pid 5922] command: Client "0.0.0.0", "USER macly"
Sun Oct 14 03:38:58 2018 [pid 5940] command: Client "0.0.0.0", "USER tredred"
Sun Oct 14 03:40:41 2018 [pid 6870] command: Client "0.0.0.0", "USER sweet"
Sun Oct 14 03:40:47 2018 [pid 7037] command: Client "0.0.0.0", "USER sweet"

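I am trying to edit the file so that it keeps only the first occurrence of each "USER" value and removes the rest. So basically the block above should end up looking like: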

Sun Oct 14 03:38:28 2018 [pid 5922] command: Client "0.0.0.0", "USER macly"
Sun Oct 14 03:38:58 2018 [pid 5940] command: Client "0.0.0.0", "USER tredred"
Sun Oct 14 03:40:41 2018 [pid 6870] command: Client "0.0.0.0", "USER sweet"

The lines are technically all "unique" because the timestamps differ. I thought I could use awk and then pipe to uniq: awk '{print $NF}' /home/user_logs | uniq

But that only gives me the last word of each line, not the whole line. What do I need to add to my command to keep the entire line?

Answer 1

You don't need uniq:

$ awk -F, '!a[$NF]++' file

Sun Oct 14 03:38:28 2018 [pid 5922] command: Client "0.0.0.0", "USER macly"
Sun Oct 14 03:38:58 2018 [pid 5940] command: Client "0.0.0.0", "USER tredred"
Sun Oct 14 03:40:41 2018 [pid 6870] command: Client "0.0.0.0", "USER sweet"

Explanation

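a[$NF]++ post-increments a counter of how many times the last field value has been seen, so the expression evaluates to zero for the first occurrence and to a non-zero number for every later one. Negating that value with ! (treated as a boolean: 0 is false, non-zero is true) is therefore true only for the first instance of each value. The default action is {print $0}, so it does not need to be written explicitly.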

This is a standard awk idiom for printing unique values, and it does not require the file to be sorted.
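The one-liner can be unpacked into a longer form that makes each step of the explanation visible. This is only a sketch: the temporary file and the array name seen are illustrative choices, and the sample lines are the ones from the question.

```shell
#!/bin/sh
# Sample data copied from the question; the temp file exists only for this demo.
log=$(mktemp)
cat > "$log" <<'EOF'
Sun Oct 14 03:38:28 2018 [pid 5922] command: Client "0.0.0.0", "USER macly"
Sun Oct 14 03:38:58 2018 [pid 5940] command: Client "0.0.0.0", "USER tredred"
Sun Oct 14 03:40:41 2018 [pid 6870] command: Client "0.0.0.0", "USER sweet"
Sun Oct 14 03:40:47 2018 [pid 7037] command: Client "0.0.0.0", "USER sweet"
EOF

# Long form of: awk -F, '!a[$NF]++'
# "seen" is a hypothetical name for the counter array.
result=$(awk -F, '{
    key = $NF              # last comma-separated field, e.g. "USER sweet"
    if (seen[key] == 0)    # an unset entry compares equal to 0: first sighting
        print $0           # print the whole line, not just the key
    seen[key]++            # count this occurrence
}' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

With the sample data this prints the macly, tredred and first sweet (pid 6870) lines, and drops the second sweet line (pid 7037).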

Answer 2

If the data is fixed width, you can use uniq:

$ uniq -s 63 file
Sun Oct 14 03:38:28 2018 [pid 5922] command: Client "0.0.0.0", "USER macly"
Sun Oct 14 03:38:58 2018 [pid 5940] command: Client "0.0.0.0", "USER tredred"
Sun Oct 14 03:40:41 2018 [pid 6870] command: Client "0.0.0.0", "USER sweet"
└──────────────────────────────63─────────────────────────────┘
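One caveat worth keeping in mind: uniq only compares adjacent lines, so this works on the sample because the two sweet entries are consecutive. The sketch below uses made-up log lines (same 63-character prefix layout, but with the repeated user separated by another entry) to compare the two approaches; the temp file and variable names are illustrative.

```shell
#!/bin/sh
# Hypothetical data: "sweet" appears twice, but NOT on adjacent lines.
log=$(mktemp)
cat > "$log" <<'EOF'
Sun Oct 14 03:38:28 2018 [pid 5922] command: Client "0.0.0.0", "USER sweet"
Sun Oct 14 03:38:58 2018 [pid 5940] command: Client "0.0.0.0", "USER macly"
Sun Oct 14 03:40:41 2018 [pid 6870] command: Client "0.0.0.0", "USER sweet"
EOF

# uniq skips the first 63 characters, then compares each line with the previous one only.
uniq_count=$(uniq -s 63 "$log" | grep -c 'USER')

# awk remembers every key it has seen, regardless of position in the file.
awk_count=$(awk -F, '!a[$NF]++' "$log" | grep -c 'USER')

echo "uniq -s 63 keeps $uniq_count lines"   # 3: no two adjacent lines match
echo "awk keeps $awk_count lines"           # 2: the second sweet line is dropped
rm -f "$log"
```

If repeated users may be scattered through the file, the awk form from Answer 1 is probably the safer choice.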


bash

awk

uniq