Comparing two columns: if they match, print a value in a new column; if they do not match, print the value of the second column in the new column
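I have a file with multiple columns. I want to compare A1 ($4) and A2 ($14). If the values do not match, I want to print the value of A2 ($14). If they match, I want to print the value of the second A1 column ($15).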
File:
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T
Desired output:
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1 noneff
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T C
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T G
I first checked the differences between column 4 and column 15.
awk '$4!=$15{print $4,$15}' file > diff
Then I tried to write an if-else statement:
awk '{if($4=$14) print $16=$15 ; else print $16=$14}' file > new_file
awk '{$(++NF)=($4==$14)?$15:$14}1' file
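One thing to watch with a one-liner of this shape is the header row. As a small sketch, assuming the comparison is $4 against $14 with $15 appended on a match and $14 otherwise, the header is treated like any data row: "A1" and "A2" differ, so "A2" is appended where a column name such as "noneff" would be expected. The variable names header and header_out are only for illustration.

```shell
# Header line only, copied from the question.
header='chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1'

# Assumed form of the one-liner: append $15 when $4 equals $14, else append $14.
header_out=$(printf '%s\n' "$header" | awk '{$(++NF)=($4==$14)?$15:$14}1')

# Show the header as the one-liner would emit it.
printf '%s\n' "$header_out"
```

Handling NR==1 separately avoids this.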
Try this:
awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt
Output:
awk 'NR==1{$(++NF)="noneff"}NR>1{$(++NF)=($4==$14)?$15:$14}1' so1186.txt | column -t
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1 noneff
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T C
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T G
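If the ternary form is hard to read, the same logic can be spelled out with an explicit if/else. This is only a sketch under the assumptions stated in the question (A1 is $4, A2 is $14, the second A1 is $15); sample.txt and the variable names result and noneff are made up for the example.

```shell
# Sample input copied from the question; sample.txt is a hypothetical file name.
cat > sample.txt <<'EOF'
chr SNP BP A1 TEST N OR Z P chr SNP cm BP A2 A1
20 rs6078030 61098 T ADD 421838 0.9945 -0.209 0.8344 20 rs6078030 0 61098 C T
20 rs143291093 61270 G ADD 422879 1.046 0.5966 0.5508 20 rs143291093 0 61270 G A
20 rs4814683 61795 T ADD 417687 1.015 0.6357 0.525 20 rs4814683 0 61795 G T
EOF

result=$(awk '
# Header row: append the name of the new column and skip the comparison.
NR == 1 { print $0, "noneff"; next }
{
    if ($4 == $14) {
        noneff = $15    # A1 ($4) equals A2 ($14): take the second A1 ($15)
    } else {
        noneff = $14    # no match: take A2 ($14)
    }
    print $0, noneff
}' sample.txt)

printf '%s\n' "$result"
```

Unlike assigning to $(++NF), print $0, noneff leaves the original line untouched and simply appends the new value after a space, so any existing spacing between columns is preserved.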