Comparing two columns: if they match, print a value in a new column; if they do not match, print the second column's value in the new column

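I have a file with multiple columns. I want to compare A1 ($4) and A2 ($14). If the values do not match, I want to print the value of A2 ($14). If the values match, I want to print the value of the second A1 column ($15).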

File:

chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T

Desired output:

chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1 noneff
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T C
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T G

First I checked where columns 4 and 15 differ.

awk '$4!=$15{print $4,$15}' file > diff

Then I tried to write an if-else statement:

awk '{if($4=$14) print $16=$15 ; else print $16=$14}' file > new_file
awk '{$(++NF)=($4==$14)?$15:$14}1' file

Try this:

awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt

Output:

awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt | column -t
chr  SNP          BP     A1  TEST  N       OR      Z       P       chr  SNP          cm  BP     A2  A1  noneff
20   rs6078030    61098  T   ADD   421838  0.9945  -0.209  0.8344  20   rs6078030    0   61098  C   T   C
20   rs143291093  61270  G   ADD   422879  1.046   0.5966  0.5508  20   rs143291093  0   61270  G   A   A
20   rs4814683    61795  T   ADD   417687  1.015   0.6357  0.525   20   rs4814683    0   61795  G   T   G
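
If the one-liner is hard to read, the same logic can be spelled out with an explicit if-else, which is closer to what the question was attempting. This is only a sketch: the file names sample.txt and with_noneff.txt are made up for illustration, and it assumes whitespace-separated columns laid out as in the question. Note that a comparison needs `==`; a single `=` inside `if(...)` assigns to the field instead of testing it.

```shell
# Recreate the sample input from the question (hypothetical file name).
cat > sample.txt <<'EOF'
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T
EOF

awk '
NR == 1 {                      # header line: append the new column name
    print $0, "noneff"
    next
}
{
    if ($4 == $14)             # A1 ($4) equals A2 ($14): take the second A1 ($15)
        noneff = $15
    else                       # otherwise take A2 ($14)
        noneff = $14
    print $0, noneff           # original line plus the new value
}' sample.txt > with_noneff.txt
```

Unlike `$(++NF)=...`, which makes awk rebuild the record with single-space separators, `print $0, noneff` leaves the original spacing of each line untouched and only appends the new field. On input like the sample, where columns are already separated by single spaces, both approaches should produce the same lines.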