Remove immediately repeated duplicate lines, ignoring the first field

I am trying to remove duplicate lines (the asterisks are only here to mark the lines I want to keep):

*2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:20:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:30:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
*2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:50:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
*2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm

to get:

2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm

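I want to remove the repeated lines within each run of identical lines, without taking the date into account, but I don't know how to do it. I tried sorting from the second field, but that does not take the "groups" of lines into account: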

cat <my file> | sort -t";" -k2 -u

which gives me:

2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm

Does anyone have an idea?

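Given the -m (merge) flag, sort assumes the input is already sorted and does not sort it again, so -u only drops a line whose key (here -k2, the second field through the end of the line) equals that of the line just before it. That is exactly what you are looking for here.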

$ sort -m -t';' -k2 -u <file
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm

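This isn't really a sorting problem. You are trying to print a line only if it differs from the previous line, ignoring the date-time column. You can try this awk: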

awk -F ';' '{s=$0; sub(/^[^;]+;/, "", s)} p != s; {p=s}' file

2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
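
If you want to reuse this, here is a minimal sketch of the same idea wrapped in a shell function. The function name dedupe_ignoring_first_field and the shortened sample lines are made up for illustration. It uses index() and substr() instead of a regex to strip the first field, and NR == 1 so that the first line is always printed even when the rest of the line is empty.

```shell
#!/bin/sh
# Sketch: print a line only when everything after its first ';'-separated
# field differs from the previous line. The function name is hypothetical.
dedupe_ignoring_first_field() {
    awk '
        {
            pos = index($0, ";")          # position of the first separator (0 if none)
            rest = substr($0, pos + 1)    # the line without its first field
        }
        NR == 1 || rest != prev           # print the first line and every change
        { prev = rest }
    ' "$@"
}

# Illustrative input: shortened stand-ins for the log lines in the question.
printf '%s\n' \
    '19:10;a;1' \
    '19:20;a;1' \
    '19:30;b;1' \
    '19:40;b;1' \
    '19:50;a;1' |
    dedupe_ignoring_first_field
```

It reads standard input or any files given as arguments, so dedupe_ignoring_first_field file works as well.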

Just tell uniq to skip the date field, which is 19 characters plus the ';' separator:

$ uniq -s 20 file
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
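
The 20 is hard-coded to the width of this timestamp format. A rough sketch of how the skip count could be derived from the first line instead, assuming every line has a first field of the same width and the data is plain ASCII. The sample file below is a shortened, made-up stand-in for the real log.

```shell
#!/bin/sh
# Hypothetical sample; the real input would be the log file from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2020-12-15 19:10:00;normal;78.688ms
2020-12-15 19:20:00;normal;78.688ms
2020-12-15 19:40:00;normal;100.688ms
2020-12-15 20:00:00;normal;78.688ms
EOF

# Length of the first field on line 1, plus 1 for the ';' separator.
first_field=$(head -n 1 "$sample" | cut -d ';' -f 1)
skip=$(( ${#first_field} + 1 ))

# uniq ignores the first $skip characters when comparing adjacent lines.
deduped=$(uniq -s "$skip" "$sample")
printf '%s\n' "$deduped"
rm -f "$sample"
```

If the first field can vary in width from line to line, a character count will not work and the awk approach is the safer choice.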