Divide floats in awk

Question

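I wrote some code to calculate a z-score. It computes the mean and standard deviation from one file and applies them to values taken from the rows of another file, as follows: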

mean=$(awk '{total += $2; count++} END {print total/count}' ABC_avg.txt)
#calculating mean of the second column of the file
std=$(awk '{x[NR]=$2; s+=$2; n++} END{a=s/n; for (i in x){ss += (x[i]-a)^2} sd = sqrt(ss/n); print sd}' ABC_avg.txt)
#calculating standard deviation from the second column of the same file
awk '{if (std) print $2-$mean/$std}' ABC_splicedavg.txt > ABC.tmp
#calculate the zscore for each row and store it in a temporary file
zscore=$(awk '{total += $0; count++} END {if (count) print total/count}' ABC.tmp)
#calculate an average of all the zscores in the rows and store it in a variable 
echo $motif"  "$zscore
rm ABC.tmp

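However, when I run this code, I get the error "fatal: division by zero attempted" at the step that creates the temporary file. What is the correct way to implement this? TIA. I also tried bc with the -l option, but it prints a very long floating-point number.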

Answer 1

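Here is a script that computes the mean and standard deviation in a single pass. You may lose some precision this way; if that is not acceptable, there are alternatives.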

$ awk '{print rand()}' <(seq 100) |
    awk '{sum+=$1; sqsum+=$1^2}
         END{print mean=sum/NR, std=sqrt(sqsum/NR-mean^2), z=mean/std}'

0.486904 0.321789 1.51312
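
One possible alternative is a two-pass computation. It avoids subtracting two nearly equal quantities (sqsum/NR and mean^2), which is where the one-pass formula can lose precision. The following is only a minimal sketch, assuming one value per line; the sample data and the temporary file are made up for illustration.

```shell
# Hypothetical sample data, one value per line (mean 5, population std 2).
data=$(mktemp)
printf '%s\n' 2 4 4 4 5 5 7 9 > "$data"

# The same file is given twice so awk can read it in two passes.
# Pass 1 (NR == FNR): accumulate the sum and the count.
# Pass 2: accumulate squared deviations from the mean.
stats=$(awk '
    NR == FNR { sum += $1; n++; next }
    FNR == 1  { mean = sum / n }
              { d = $1 - mean; ss += d * d }
    END       { if (n) printf "%.6f %.6f\n", mean, sqrt(ss / n) }
' "$data" "$data")

echo "$stats"    # prints: 5.000000 2.000000
rm -f "$data"
```

The `if (n)` guard keeps awk from dividing by zero when the input is empty.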

Your per-sample z-score computation is wrong! You need to compute ($2-mean)/std.
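
A related point: shell variables are not expanded inside a single-quoted awk program, so awk never sees the shell's values for mean and std. One common way to hand them over is awk's -v option. The sketch below assumes the value of interest is in column 2 of both files; the file contents and the names ref, rows, m and sd are invented for illustration.

```shell
# Hypothetical inputs: a reference file and the rows to score (value in column 2).
ref=$(mktemp); rows=$(mktemp)
printf 'a 2\nb 4\nc 4\nd 4\ne 5\nf 5\ng 7\nh 9\n' > "$ref"
printf 'x 7\ny 9\n' > "$rows"

# Mean of column 2 of the reference file.
mean=$(awk '{ s += $2 } END { if (NR) print s / NR }' "$ref")

# Population standard deviation; the shell value of mean is passed in with -v.
std=$(awk -v m="$mean" '{ d = $2 - m; ss += d * d }
                        END { if (NR) print sqrt(ss / NR) }' "$ref")

# Average z-score over the rows; rows are skipped when sd is zero or empty.
zscore=$(awk -v m="$mean" -v sd="$std" '
    (sd + 0) != 0 { total += ($2 - m) / sd; n++ }
    END           { if (n) printf "%.3f\n", total / n }
' "$rows")

echo "$zscore"    # prints: 1.500
rm -f "$ref" "$rows"
```

Computing the average inside the last awk call also removes the need for a temporary file of per-row z-scores.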

Answer 2

You can control the precision of bc's output with the scale variable:

$ echo "4/7" | bc -l
.57142857142857142857
$ echo "scale=3; 4/7" | bc -l
.571
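
If you would rather stay in awk, printf with a precision specifier is one way to limit the number of digits. This is a small sketch rather than a drop-in replacement: bc truncates the quotient at the given scale, while printf rounds, so the two can differ in the last digit for other inputs.

```shell
# Format the quotient with three digits after the decimal point.
q=$(awk 'BEGIN { printf "%.3f\n", 4 / 7 }')
echo "$q"    # prints: 0.571
```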
