If else script in bash using grep and awk

Question

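I am trying to write a script that checks whether the snp value (column $4 of the test file) is present in another file (the map file). If it is, the script should print the snp value and the distance value taken from the map file (distance is column $4 of the map file). If the snp value from the test file is not in the map file, it should print the snp value but put 0 (zero) in the second column as the distance value.

The script is: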

for chr in {1..22}; 
do
for snp in awk '{print $4}' test$chr.bim
i=$(grep $snp map$chr.txt | wc -l | awk '{print $1}')
if [[ $i == "0" ]]
then 
echo "$snp 0" >> position.$chr
else
distance=$(grep $snp map$chr.txt | awk '{print $4}')
echo "$snp $distance" >> position.$chr
fi
done
done

My map file looks like this:

Chromosome  Position(bp)    Rate(cM/Mb) Map(cM)
chr22   16051347    8.096992    0.000000
chr22   16052618    8.131520    0.010291
chr22   16053624    8.131967    0.018472

and so on..

My test file looks like this:

22  16051347    0   16051347    C   A
22  16052618    0   16052618    G   T
22  17306184    0   17306184    T   G

and so on..

I get the following syntax error:

position.sh: line 6: syntax error near unexpected token `i=$(grep $snp map$chr.txt | wc -l | awk '{print $1}')'
position.sh: line 6: `i=$(grep $snp map$chr.txt | wc -l | awk '{print $1}')'

Any suggestions?

Answer 1

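Using awk directly as the argument to for is basically a syntax error: without $(...) the shell treats awk, the quoted program, and the file name as literal words, and the loop body is also missing its do. Beyond that, the script has a number of other syntax problems and inefficiencies.

Try this: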

for chr in {1..22}; do
    awk '{print $4}' "test$chr.bim" |
    while IFS="" read -r snp; do
        if ! grep -q "$snp" "map$chr.txt"; then
            echo "$snp 0"
        else
            awk -v snp="$snp" '
                $0 ~ snp { print snp, $4 }' "map$chr.txt"
        fi  >> "position.$chr"
    done
done
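One caveat with the loop above: both grep "$snp" and $0 ~ snp match the value as a pattern anywhere on the line, so a short ID such as 1605 would also match 16051347. The following is a minimal sketch of an exact comparison against column 2 of the map file, assuming the layout shown in the question; the helper name lookup_distance and the temporary file are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash
# lookup_distance is a hypothetical helper, not part of the original answer.
# It prints "<snp> <distance>" when column 2 of the map file equals the SNP
# exactly, and "<snp> 0" when no row matches.
lookup_distance() {
    local snp=$1 map=$2
    awk -v snp="$snp" '
        $2 == snp { print snp, $4; hit = 1; exit }   # exact field match
        END       { if (!hit) print snp, 0 }         # fallback when absent
    ' "$map"
}

# Small map file built from the rows quoted in the question.
demo_map=$(mktemp)
printf '%s\n' \
    'Chromosome  Position(bp)    Rate(cM/Mb) Map(cM)' \
    'chr22   16051347    8.096992    0.000000' \
    'chr22   16052618    8.131520    0.010291' > "$demo_map"

lookup_distance 16052618 "$demo_map"   # present in the map
lookup_distance 1605 "$demo_map"       # only a substring of a position
```

This still scans the map file once per SNP, so it addresses correctness only, not speed.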

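The whole thing could probably be refactored further into a single Awk script, which reads each file only once instead of rescanning the map file for every snp.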

for chr in {1..22}; do
    awk 'NR == FNR { ++a[$4]; next }           # first file: remember each snp
      $2 in a { print $2, $4; ++found[$2] }    # map row whose position is a known snp
      END { for(k in a) if (!found[k]) print k, 0 }' \
         "test$chr.bim"  "map$chr.txt" >> "position.$chr"
done
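To see what the single-pass approach produces, here is a small sketch run against the sample rows from the question. The scratch directory is an assumption made for the demo, and only chromosome 22 is covered. Note that snps missing from the map are printed from the END block, so they appear after all matched ones rather than in test-file order.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory, used only for this demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample rows copied from the question.
cat > map22.txt <<'EOF'
Chromosome  Position(bp)    Rate(cM/Mb) Map(cM)
chr22   16051347    8.096992    0.000000
chr22   16052618    8.131520    0.010291
chr22   16053624    8.131967    0.018472
EOF

cat > test22.bim <<'EOF'
22  16051347    0   16051347    C   A
22  16052618    0   16052618    G   T
22  17306184    0   17306184    T   G
EOF

chr=22
# First file (NR == FNR): collect snp IDs from column 4.
# Second file: print position and distance when the position is a known snp.
# END: report every snp that never matched, with distance 0.
awk 'NR == FNR { ++a[$4]; next }
     $2 in a { print $2, $4; ++found[$2] }
     END { for (k in a) if (!found[k]) print k, 0 }' \
    "test$chr.bim" "map$chr.txt" > "position.$chr"

cat "position.$chr"
# 16051347 0.000000
# 16052618 0.010291
# 17306184 0
```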

I guess the for syntax you were actually after would be something like

for snp in $(awk '{print $4}' "test$chr.bim"); do

but this still has other problems; see "don't read lines with for".
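As a rough illustration of why that pattern is fragile: the shell splits the output of a command substitution on whitespace (and expands glob characters) rather than on line boundaries. In this particular question column 4 contains no spaces, so the for loop would happen to work, but the sketch below, which uses made-up input, shows how the two loop styles diverge once a line contains a space.

```shell
#!/usr/bin/env bash
# Hypothetical sample input: two lines, the first containing a space.
input=$(mktemp)
printf '%s\n' 'rs1 extra' 'rs2' > "$input"

# for + command substitution: the output is word-split,
# so the first line turns into two separate items.
for_count=0
for item in $(cat "$input"); do
    for_count=$((for_count + 1))
done

# while read: exactly one iteration per input line.
read_count=0
while IFS= read -r line; do
    read_count=$((read_count + 1))
done < "$input"

echo "for: $for_count items, while read: $read_count lines"
# for: 3 items, while read: 2 lines
rm -f "$input"
```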

bash

awk

grep