Print only the lines that exist in all the input files
Print only the lines that are present in all four of the given input files. Of the input files shown below, only /dev/dev_sg2 and /dev/dev_sg3 are present in every one of them.
$ cat file1
/dev/dev_sg1
/dev/dev_sg2
/dev/dev_sg3
/dev/dev_sg4
$ cat file2
/dev/dev_sg8
/dev/dev_sg2
/dev/dev_sg3
/dev/dev_sg6
$ cat file3
/dev/dev_sg5
/dev/dev_sg2
/dev/dev_sg3
/dev/dev_sg6
$ cat file4
/dev/dev_sg2
/dev/dev_sg3
/dev/dev_sg1
/dev/dev_sg4
What I have tried:
cat file* | sort |uniq -c
1 /dev/dev_sg1
4 /dev/dev_sg2
4 /dev/dev_sg3
1 /dev/dev_sg4
1 /dev/dev_sg5
2 /dev/dev_sg6
1 /dev/dev_sg8
The following awk code may help you.
awk 'FNR==NR{a[$0];next} ($0 in a){++c[$0]} END{for(i in c){if(c[i]==3){print i,c[i]+1}}}' Input_file1 Input_file2 Input_file3 Input_file4
Output will be as follows.
/dev/dev_sg2 4
/dev/dev_sg3 4
EDIT: If you don't want the count and only want to print the lines that appear in all 4 Input_files, the following will do the trick:
awk 'FNR==NR{a[$0];next} ($0 in a){++c[$0]} END{for(i in c){if(c[i]==3){print i}}}' Input_file1 Input_file2 Input_file3 Input_file4
EDIT2: Adding an explanation of the code as well.
awk '
FNR==NR{                  ##FNR==NR is TRUE only while the very first Input_file (Input_file1) is being read.
  a[$0];                  ##Creating an array named a whose index is the current line $0.
  next                    ##next is a built-in awk keyword; it skips all remaining statements and moves on to the next line.
}
($0 in a){                ##Runs once awk has finished reading Input_file1: checking whether the current line $0 is in array a.
  ++c[$0]                 ##If the above condition is TRUE, increment the value in array c whose index is the current line.
}
END{                      ##Starting the END block of the awk code here.
  for(i in c){            ##Starting a for loop to iterate over array c.
    if(c[i]==3){          ##Checking whether the value in array c equals 3, which means the line appeared in all 4 Input_file(s).
      print i             ##If yes, printing i, which holds the line that appears in all 4 Input_file(s).
    }
  }
}
' Input_file1 Input_file2 Input_file3 Input_file4 ##Mentioning all 4 Input_file(s) here.
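Note that the code above hard-codes the count 3 (so exactly 4 files) and assumes a line never repeats within a single file. As a rough sketch only, the same counting idea can be made independent of the number of files by comparing against ARGC - 1. The function name common_lines_awk is made up for illustration, and the sketch assumes every argument is a distinct file name (no var=value arguments, no file listed twice).

```shell
# Hypothetical helper: print the lines that occur in every file passed as an argument.
common_lines_awk() {
  awk '
    !seen[FILENAME, $0]++ {            # first time this line is seen in the current file
      if (++count[$0] == ARGC - 1)     # ARGC - 1 is the number of input files
        print                          # the line has now turned up in every file
    }
  ' "$@"
}
```

Called as common_lines_awk file1 file2 file3 file4, this prints each common line at the moment it is found in the last file, so the output follows the order of the last file.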
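Using a comm pipeline:
comm -12 <(sort file1) <(sort file2) | comm -12 - <(sort file3) | comm -12 - <(sort file4)
-12 suppresses the lines unique to each of the two input files (columns 1 and 2), so only the lines common to both are printed.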
Output:
/dev/dev_sg2
/dev/dev_sg3
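If the number of files is not fixed, the same comm -12 step can be repeated in a loop instead of being written out once per file. The following is only a sketch under that assumption: common_lines_comm is a hypothetical name, and it uses temporary files rather than process substitution so that it also runs in a plain POSIX sh.

```shell
# Hypothetical helper: intersect any number of files by applying comm -12 repeatedly.
# The body runs in a subshell ( ... ) so its variables do not leak into the caller.
common_lines_comm() (
  [ "$#" -gt 0 ] || exit 1
  acc=$(mktemp) || exit 1              # running intersection, kept sorted
  tmp=$(mktemp) || exit 1              # scratch file for the next intersection
  sort -u -- "$1" > "$acc"
  shift
  for f in "$@"; do
    # keep only the lines present in both the running result and the next file
    sort -u -- "$f" | comm -12 "$acc" - > "$tmp"
    mv -- "$tmp" "$acc"
  done
  cat -- "$acc"
  rm -f -- "$acc" "$tmp"
)
```

For example, common_lines_comm file1 file2 file3 file4 prints the common lines in sorted order.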
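If you know in advance that there will never be more than 4 input files, you can simply append a grep to your existing solution, like this:
cat file* | sort | uniq -c | grep -E '^ *4 '
This shows only the lines whose leading count is the maximum (4). The pattern allows for the spaces that common uniq implementations (GNU, BSD) pad the count with.
If you need it to work with an arbitrary number of files, you will need a better solution.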
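One possible way to extend the sort | uniq -c idea to any number of files is sketched below: de-duplicate each file first, then keep the lines whose count equals the number of files. This is only a sketch, common_lines_uniq is a made-up name, and it assumes uniq -c prints the count, one space, then the line (optionally with leading padding), as the GNU and BSD versions do.

```shell
# Hypothetical helper: a line is common when its count equals the number of files.
common_lines_uniq() {
  for f in "$@"; do
    sort -u -- "$f"                    # drop repeats inside each file first
  done |
    sort | uniq -c |
    awk -v n="$#" '$1 == n { sub(/^ *[0-9]+ /, ""); print }'   # strip the count, keep the line
}
```

Usage would be common_lines_uniq file1 file2 file3 file4, with the result coming out in sorted order.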
If the order doesn't need to be maintained:
$ j() { join <(sort "$1") <(sort "$2"); }; j <(j file1 file2) <(j file3 file4)
/dev/dev_sg2
/dev/dev_sg3