Remove immediately repeating duplicate lines, ignoring the first field
I am trying to go from this (the asterisks are only here to mark the lines I want to keep):
*2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:20:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:30:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
*2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:50:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
*2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
to this:
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
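Within each group of consecutive identical lines, I want to remove the duplicates without taking the date into account, but I can't figure out how to do it.
I tried sorting from the second field onward, but that does not take the "groups" of lines into account: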
cat <my file> | sort -t";" -k2 -u
which gives me this:
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
Does anyone have an idea?
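Given the -m flag, sort assumes the input is already sorted and does not sort it again, which is exactly what you are looking for here. With -t';' -k2 the comparison key is everything from the second field onward, so -u drops only lines whose key repeats that of the line just before.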
$ sort -m -t';' -k2 -u <file
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
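To see what -m changes, here is a minimal sketch on made-up toy data (the function names and the four sample lines are illustrative, not from the question). It assumes GNU sort: using -m on input that is not really sorted on the key goes beyond what POSIX specifies, so other implementations may behave differently.

```shell
#!/bin/sh
# Toy input (hypothetical): the key after the first ';' runs A, A, B, A.
sample() {
    printf '%s\n' '1;A' '2;A' '3;B' '4;A'
}

# Plain sort reorders the lines first, so -u keeps one line per distinct key
# and the second run of A is lost.
global_unique() {
    LC_ALL=C sort -t';' -k2 -u
}

# With -m (merge only) the lines are not reordered, so -u only drops a line
# whose key equals that of the line just before it.
# Assumption: GNU sort behaviour on a single input that is not truly sorted.
run_unique() {
    LC_ALL=C sort -m -t';' -k2 -u
}

sample | global_unique
sample | run_unique
```

The first pipeline mirrors the attempt from the question, the second the command above.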
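This is not really a sorting problem. You are trying to print a line only if it differs from the previous line, ignoring the date-time column. You could try this awk: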
awk -F ';' '{s=$0; sub(/^[^;]+;/, "", s)} p != s; {p=s}' file
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
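For readability, the same logic can be spelled out with comments. This is only a sketch of that one-liner wrapped in a shell function; the name dedupe_runs is made up for illustration, and -F ';' is left out because the program never refers to individual fields.

```shell
#!/bin/sh
# dedupe_runs (hypothetical name): print a line only when everything after
# the first ';' differs from the same part of the previous line.
dedupe_runs() {
    awk '
    {
        s = $0                   # work on a copy so the line prints unchanged
        sub(/^[^;]+;/, "", s)    # strip the leading date-time field and its ";"
    }
    p != s                       # pattern without action: print the whole line
    { p = s }                    # remember this key for the next line
    ' "$@"
}

# Toy data: the key runs A, A, B, A, so lines 1, 3 and 4 are printed.
printf '%s\n' '1;A' '2;A' '3;B' '4;A' | dedupe_runs
```

Called as `dedupe_runs file`, it reads the file instead of standard input.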
Just tell uniq to skip the date field:
$ uniq -s 20 file
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
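Note that -s 20 counts characters, so this relies on every timestamp being exactly 19 characters followed by ';'. A minimal sketch of a guard for that assumption is shown here; the function name uniq_skip_timestamp is hypothetical, and it refuses to run uniq if any line has its first ';' somewhere other than column 20.

```shell
#!/bin/sh
# uniq_skip_timestamp FILE (hypothetical helper)
# Runs "uniq -s 20" only if the first ';' of every line is at column 20,
# i.e. the date field really is 19 characters wide on each line.
uniq_skip_timestamp() {
    if awk 'index($0, ";") != 20 { exit 1 }' "$1"; then
        uniq -s 20 "$1"
    else
        echo "first field is not 19 characters wide on every line" >&2
        return 1
    fi
}
```

Usage would be `uniq_skip_timestamp file`. If the width of the first field can vary, the awk approach is the safer choice because it keys on the separator rather than on a character count.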