Retrieve all rows from 2 columns matching from 2 different files

I need to retrieve all rows from one file based on certain columns that match columns in another file.

My first file is:

col1,col2,col3
1TF4,WP_110462952.1,AEV67733.1
1TF4,EGD45884.1,AEV67733.1
2BTO,NP_006073.2,XP_037953971.1
2BTO,XP_037953971.1,XP_037953971.1

The second is:

col1,col2,col3,col4,col5
BAA13425.1,SDD02770.1,38.176,296,175
BAA13425.1,WP_002465021.1,32.056,287,185
BBE42932.1,AEG17356.1,40.909,110,64
BBE42932.1,WP_048124638.1,40.367,109,64

I want to retrieve all rows from the second file where file2_col1=file1_col3 and file2_col2=file1_col1.

I tried this, but it does not print everything:

awk -F"," 'FILENAME=="file1"{A[$3$1]=$3$1}
FILENAME=="file2"{if(A[$1$2]){print $0}}' file1 file2  > test

You can use this two-pass awk solution:

awk -F, 'FNR == NR {seen[$3,$1]; next} FNR == 1 || ($1,$2) in seen' file1 file2

col1,col2,col3,col4,col5
BAA13425.1,2BTO,32.056,287,185
BAA13425.1,2BTO,12.410,641,123

Where the input files are:

cat file1

col1,col2,col3
1TF4,WP_110462952.1,AEV67733.1
1TF4,EGD45884.1,BAA13425.1
2BTO,NP_006073.2,BAA13425.1
2BTO,XP_037953971.1,BAA13425.1

cat file2

col1,col2,col3,col4,col5
BAA13425.1,SDD02770.1,38.176,296,175
BAA13425.1,2BTO,32.056,287,185
BBE42932.1,AEG17356.1,40.909,110,64
BBE42932.1,WP_048124638.1,40.367,109,64
BAA13425.1,2BTO,12.410,641,123
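
For readers who want to see how the one-liner works, here is a minimal, commented sketch of the same idea. It assumes a scratch directory created with `mktemp`, a shortened copy of the sample data, and an output file named `matched.csv`; none of these names come from the original post.

```shell
# Work in a throwaway directory (hypothetical location, not from the post).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Shortened copies of the sample inputs.
cat > file1 <<'EOF'
col1,col2,col3
1TF4,WP_110462952.1,AEV67733.1
2BTO,NP_006073.2,BAA13425.1
EOF

cat > file2 <<'EOF'
col1,col2,col3,col4,col5
BAA13425.1,SDD02770.1,38.176,296,175
BAA13425.1,2BTO,32.056,287,185
BBE42932.1,AEG17356.1,40.909,110,64
BAA13425.1,2BTO,12.410,641,123
EOF

awk -F, '
    # While reading the first file, the global record count NR
    # equals the per-file record count FNR.
    FNR == NR {
        seen[$3, $1]   # store the (col3, col1) pair as an array key
        next           # skip the rule below for file1 lines
    }
    # Second file: print the header line, then every row whose
    # (col1, col2) pair was stored while reading file1.
    FNR == 1 || ($1, $2) in seen
' file1 file2 > matched.csv

cat matched.csv
```

With this shortened data, `matched.csv` contains the header followed by the two `BAA13425.1,2BTO,...` rows. Writing the key as `seen[$3, $1]` joins the two fields with awk's `SUBSEP` character, so distinct pairs cannot collide the way they can when the fields are simply concatenated.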