Using AWK to perform math on columns of data

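I am trying to use AWK to take a column of information as input, perform math on that input, and then write the result back into the same column. My input file is formatted as follows: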

1,  1,  6,  3,  2,   4,  1.5,  1, 0.5,    1,  2.75
2,  3,  7,  6,  4,   6,  4.5,  2, 0.5,    2,  2.75
3,  11, 3,  5,  4,   5,  2.5,  6, 2.5,    3,  7.75
4,  9,  9,  10, 7.5, 9,    7,  7, 3.5,    4,  7.75
5,  7,  4,  2,  1,   2,    1,  3,   1,    5,  8.25
6,  8,  5,  9,  7,   7,  5.5,  8, 5.5,    6,  11
7,  6,  2,  1,  1,   1,    1,  4, 1.5,    7,  11.25
8,  10, 1,  4,  3,   3,  1.5,  5, 1.5,    8,  11.25
9,  5,  10, 8,  5.5, 8,    6,  9, 11.5,   9,  25.25
10, 2,  8,  7,  4.5, 10,  12, 10, 11.75, 10,  25.75

In addition to the table above, there are several bash variables that need to be used in the equations. They are as follows:

Feet=4290
FinalTime=76.79
FirstDistance=1320
FirstTime=21.67
SecondDistance=2640
SecondTime=44.65
ThirdDistance=3146
ThirdTime=70.33

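I came up with a long one-liner, but it does not output accurate information. I would like something easier to read, ideally an AWK script that actually works. The one-liner is: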

 awk -F, -v Feet="$Feet" -v FinalTime="$FinalTime" -v ThirdDistance="$ThirdDistance" -v ThirdTime="$ThirdTime" -v FirstDistance="$FirstDistance" -v FirstTime="$FirstTime" -v SecondDistance="$SecondDistance" -v SecondTime="$SecondTime" '{ 

     $(NF-2)=(ThirdDistance-($NF-2)*8)/ThirdTime*.681818182;
     $(NF-4)=((SecondDistance-(($NF-4)*8))/SecondTime)*.681818182;
     $(NF-6)=((FirstDistance-(($NF-6)*8))/FirstTime)*.681818182;
     $NF=((Feet-($NF*8))/FinalTime)*.681818182;
     print $0
  }' OFS=";" "$race".csv65

The result looks like this:

1;  1;  6;  3;  42.3501;  4;  40.4663;  1;  30.4409; 1;  37.8956;
2;  3;  7;  6;  42.3501;  6;  40.4663;  2;  30.4409; 2;  37.8956;
3;  11; 3;  5;  41.0916;  5;  39.8554;  6;  30.0531; 3;  37.5404;
4;  9;  9;  10; 41.0916;  9;  39.8554;  7;  30.0531; 4;  37.5404;
5;  7;  4;  2;  40.9657;  2;  39.7944;  3;  30.0143; 5;  37.5049;
6;  8;  5;  9;  40.2735;  7;  39.4584;  8;  29.8011; 6;  37.3095;
7;  6;  2;  1;  40.2106;  1;  39.4279;  4;  29.7817; 7;  37.2918;
8;  10; 1;  4;  40.2106;  3;  39.4279;  5;  29.7817; 8;  37.2918;
9;  5;  10; 8;  36.6867;  8;  37.7176;  9;  28.6959; 9;  36.2973;
10; 2;  8;  7;  36.5608;  10; 37.6565;  10; 28.6571; 10; 36.2618;

The desired result should look like this:

1;  1;  6;  3;  41.0287;    4;  40.1303;    1;  30.2664;    1;  37.8956;
2;  3;  7;  6;  40.5252;    6;  39.7638;    2;  30.2664;    2;  37.8956;
3;  11; 3;  5;  40.5252;    5;  40.0081;    6;  30.1113;    3;  37.5404;
4;  9;  9;  10; 39.6443;    9;  39.4584;    7;  30.0337;    4;  37.5404;
5;  7;  4;  2;  41.2804;    2;  40.1914;    3;  30.2276;    5;  37.5049;
6;  8;  5;  9;  39.7701;    7;  39.6417;    8;  29.8786;    6;  37.3095;
7;  6;  2;  1;  41.2804;    1;  40.1914;    4;  30.1888;    7;  37.2918;
8;  10; 1;  4;  40.7769;    3;  40.1303;    5;  30.1888;    8;  37.2918;
9;  5;  10; 8;  40.1477;    8;  39.5806;    9;  29.4139;    9;  36.2973;
10; 2;  8;  7;  40.3994;    10; 38.8476;    10; 29.3939;    10; 36.2618;

I don't know what I am doing wrong. Basically, the equations I want to implement are:

 $(NF-2)=((ThirdDistance-($NF-2)*8))/ThirdTime*.681818182;
 $(NF-4)=((SecondDistance-(($NF-4)*8))/SecondTime)*.681818182;
 $(NF-6)=((FirstDistance-(($NF-6)*8))/FirstTime)*.681818182;
 $NF=((Feet-($NF*8))/FinalTime)*.681818182;

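But I have clearly messed something up here. Apart from the last column, I am not getting the right results, and all of the other computed columns come out in descending order. Any constructive criticism is welcome.

Thanks!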

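I think the problem here is that you have misplaced the parentheses used to access the fields on the right-hand side of each assignment. $(NF-2) means the value of the third-from-last field, whereas ($NF-2) means the value of the last field minus two. It looks like you actually meant to use the first form in the first three cases: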

$ awk -F, -v Feet="$Feet" -v FinalTime="$FinalTime" -v ThirdDistance="$ThirdDistance" -v ThirdTime="$ThirdTime" -v FirstDistance="$FirstDistance" -v FirstTime="$FirstTime" -v SecondDistance="$SecondDistance" -v SecondTime="$SecondTime" '{
    $(NF-2)=(ThirdDistance-$(NF-2)*8)/ThirdTime*.681818182;
    $(NF-4)=((SecondDistance-$(NF-4)*8)/SecondTime)*.681818182;
    $(NF-6)=((FirstDistance-$(NF-6)*8)/FirstTime)*.681818182;
    $NF=((Feet-$NF*8)/FinalTime)*.681818182;
    print
}' OFS=";" file
1;  1;  6;  3;41.0287;   4;40.1303;  1;30.4603;    1;37.8956
2;  3;  7;  6;40.5252;   6;39.7638;  2;30.4603;    2;37.8956
3;  11; 3;  5;40.5252;   5;40.0081;  6;30.3052;    3;37.5404
4;  9;  9;  10;39.6443; 9;39.4584;  7;30.2276;    4;37.5404
5;  7;  4;  2;41.2804;   2;40.1914;  3;30.4215;    5;37.5049
6;  8;  5;  9;39.7701;   7;39.6417;  8;30.0725;    6;37.3095
7;  6;  2;  1;41.2804;   1;40.1914;  4;30.3827;    7;37.2918
8;  10; 1;  4;40.7769;   3;40.1303;  5;30.3827;    8;37.2918
9;  5;  10; 8;40.1477; 8;39.5806;  9;29.6072;   9;36.2973
10; 2;  8;  7;40.3994; 10;38.8476; 10;29.5878; 10;36.2618
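As a quick sanity check, the two forms can be compared on the first input row for the $(NF-6) column alone. This is only a sketch: the variable names D and T are placeholders for FirstDistance and FirstTime, and the row is inlined so the snippet runs without the data file.

```shell
# First input row from the question: the 5th field is 2, the last field is 2.75.
row='1,  1,  6,  3,  2,   4,  1.5,  1, 0.5,    1,  2.75'

# Original form: ($NF-6) is the last field minus six, i.e. 2.75 - 6 = -3.25.
wrong=$(printf '%s\n' "$row" |
  awk -F, -v D=1320 -v T=21.67 '{ v = (D - ($NF-6)*8) / T * .681818182; print v }')

# Corrected form: $(NF-6) is the 5th of 11 fields, i.e. 2.
right=$(printf '%s\n' "$row" |
  awk -F, -v D=1320 -v T=21.67 '{ v = (D - $(NF-6)*8) / T * .681818182; print v }')

echo "wrong=$wrong right=$right"   # prints: wrong=42.3501 right=41.0287
```

The first value matches the unwanted output in the question and the second matches the desired output. It also suggests why the original columns came out in descending order: every one of them was computed from the last field, which increases from row to row.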

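In the assignment to $NF, the opposite mistake is the one to watch for: $(NF*8) would try to access field number NF*8, which does not exist. Because $ binds more tightly than the arithmetic operators, you can drop the parentheses in this case and still get the result you want.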
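The precedence rule is easy to try out on a small, made-up record. The sample line below is hypothetical and is not taken from the question's data.

```shell
# Hypothetical record with five fields; the last three are 10, 20, 30.
line='a,b,10,20,30'

# $(NF-2): the field at index NF-2, here field 3.
field=$(printf '%s\n' "$line" | awk -F, '{ print $(NF-2) }')

# $NF-2: the value of the last field minus two, since $ is applied first.
minus=$(printf '%s\n' "$line" | awk -F, '{ print $NF-2 }')

# $NF*8: the value of the last field times eight, no parentheses needed.
times=$(printf '%s\n' "$line" | awk -F, '{ print $NF*8 }')

# $(NF*8): field number 40, which does not exist and reads as empty.
beyond=$(printf '%s\n' "$line" | awk -F, '{ print "[" $(NF*8) "]" }')

echo "$field $minus $times $beyond"   # prints: 10 28 240 []
```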

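As for making the script more readable, some shorter variable names might help. You may also want to consider putting the program in an actual script file and invoking it with awk -f, rather than using a very long "one-liner". I also removed some unnecessary parentheses; personally I find them distracting, but others may disagree.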
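One possible shape for such a script is sketched below. The file name speed.awk, the short variable names (d1, t1, d2, t2, d3, t3, d, t, k) and the helper function speed are all made up for illustration, and the numeric values are the bash variables from the question written out literally. Two rows of the input are inlined so that the sketch runs on its own.

```shell
# Write the awk program to a file (the name speed.awk is hypothetical).
cat > speed.awk <<'EOF'
BEGIN { FS = ","; OFS = ";"; k = .681818182 }

# Apply the question's formula to one column value x.
function speed(dist, secs, x) { return (dist - x * 8) / secs * k }

{
    $(NF-6) = speed(d1, t1, $(NF-6))
    $(NF-4) = speed(d2, t2, $(NF-4))
    $(NF-2) = speed(d3, t3, $(NF-2))
    $NF     = speed(d,  t,  $NF)
    print
}
EOF

# Run it on the first two rows of the question's input.
out=$(printf '%s\n' \
  '1,  1,  6,  3,  2,   4,  1.5,  1, 0.5,    1,  2.75' \
  '2,  3,  7,  6,  4,   6,  4.5,  2, 0.5,    2,  2.75' |
  awk -v d1=1320 -v t1=21.67 -v d2=2640 -v t2=44.65 \
      -v d3=3146 -v t3=70.33 -v d=4290 -v t=76.79 -f speed.awk)
echo "$out"
```

Setting FS and OFS in a BEGIN block keeps the separators with the program instead of on the command line, and the helper function states the formula once instead of four times. Whether that reads better than four explicit assignments is a matter of taste.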