Sum values from various lines in a file

Question

structureA occurs multiple times in a file, and I have to sum the values of parameter1 across all occurrences of structureA, separately for numUnitsA, numUnitsB and numUnitsC.

structureA {
    numUnitsA {
        parameter1 = 2
    }    
    numUnitsB {
        parameter1 = 4
    }    
    numUnitsC {
        parameter1 = 3
    }    
}

I use the code below to get the values, but how do I sum them so that the result looks like this:

numUnitsA parameter1=6
numUnitsB parameter1=9
numUnitsC parameter1=9

Code:

while read -r line
do
if grep -q "parameter1" <<< "$line"; then
   # print the text after the equals sign of the current line
   echo $(awk 'BEGIN{FS="="}{print $2}' <<< "$line")
fi
done < "$filename"

Answer 1

Try this:

awk -F'= *' '/parameter1/ {
    # the 1st, 2nd and 3rd match in each structure belong to units A, B and C
    if (++numUnit % 3 == 1) {par1 += $2}
    else if (numUnit % 3 == 2) {par2 += $2}
    else {par3 += $2}
}
END {print "numUnitsA parameter1=" par1
     print "numUnitsB parameter1=" par2
     print "numUnitsC parameter1=" par3}' "$filename"

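There is really no reason for the loop. This takes the file as an argument and looks for lines containing "parameter1". Each match is added to the total for unit A, B or C in turn, following the order in which the units appear in the file. At the end it prints the totals.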
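If the three units are not guaranteed to appear in the same order in every structure, counting matches is fragile. As a variation (not the script above), here is a minimal sketch that remembers the most recent numUnits label and keys each total on it. It assumes every parameter1 line sits inside a block whose opening line starts with numUnits. The sample data and the names unit, kv and sum are made up for illustration.

```shell
# Build a small sample file with two structureA blocks (illustrative data).
filename=$(mktemp)
cat > "$filename" <<'EOF'
structureA {
    numUnitsA {
        parameter1 = 2
    }
    numUnitsB {
        parameter1 = 4
    }
    numUnitsC {
        parameter1 = 3
    }
}
structureA {
    numUnitsA {
        parameter1 = 4
    }
    numUnitsB {
        parameter1 = 5
    }
    numUnitsC {
        parameter1 = 6
    }
}
EOF

# Remember the latest numUnits label, then add each parameter1 value
# to the total stored under that label. "for (u in sum)" has no fixed
# order, so the result is piped through sort.
totals=$(awk '
    $1 ~ /^numUnits/   { unit = $1 }
    $1 ~ /^parameter1/ { split($0, kv, /= */); sum[unit] += kv[2] }
    END { for (u in sum) print u " parameter1=" sum[u] }
' "$filename" | sort)

printf '%s\n' "$totals"
rm -f "$filename"
```

With this sample the totals come out as 6, 9 and 9, the same figures as in the question.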

Alternative answer:

$ cols=$(($(grep -c parameter1 "$filename")/3))
$ grep parameter1 "$filename" | sed 's/.*= //' | pr -ts"+" --columns "$cols" | bc

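This extracts all the values, then lays out the values for units A, B and C on separate lines, joined by "+", and uses bc to compute the sums. The output is three lines holding the totals for units A, B and C respectively.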

Update: the answer now also works if the parameter does not immediately follow the numUnits tag.

Explanation

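awk is a program that splits a file into records (think of these as lines, even though a record can span several lines) and fields (think of these as columns, with the same caveat). Both separators can be set by the user. By default, records are separated by a newline and fields by any run of spaces or tabs. The layout below assumes the field separator has been set to a tab, which is why a field can contain spaces: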

record1: field1    field2 spaces allowed    field 3
record2: this record has only one field

record4: the previous line was an empty record
record5: in awk you can refer to fields using $1, $2, $3, like this:
$1 in your code means this field    $2 in code this field    $3
record7: $0 is the variable for the entire record!

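Fields are addressed as $1, $2 and so on, and the special $0 refers to the entire record. Two simple examples illustrate this. The first prints the whole file, which makes the script equivalent to cat: awk '{print}' file or awk '{print $0}' file. The second changes every record (a line, by default) to the literal string don't mock awk and prints it: awk '{$0 = "don'\''t mock awk"; print}' file. Note the special care needed to output a single quote from inside a single-quoted shell string.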

Builtins

awk provides some powerful built-in variables. A few of them are explained below.

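FS        field separator, default FS = " " (a single space, which splits on runs of spaces and tabs)
RS        record separator, default RS = "\n"
OFS       output field separator, default OFS = " "
ORS       output record separator, default ORS = "\n"
NR        number of the current record; in the END clause it holds the total number of records read
NF        number of fields in the current record
FILENAME  name of the file currently being processed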

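These are all very useful variables. The output field separator OFS is inserted automatically when printing. The following example prints the first two fields of the first line, separated by a single space (the comma inserts OFS): awk 'NR == 1 {print $1, $2}' file.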
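To make NR, NF and OFS concrete, here is a small sketch on made-up input. It sets OFS to a dash purely so the inserted separator is easy to spot. The variable names sample and report are illustrative.

```shell
# Three whitespace-separated records (illustrative data).
sample=$(printf 'alpha beta gamma\none\nx y')

# NR is the record number and NF the field count of the current record.
# Each comma in the print statement is replaced by OFS in the output.
report=$(printf '%s\n' "$sample" |
    awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }')

printf '%s\n' "$report"
```

Each output line holds the record number, the number of fields in that record, and its first field, joined by dashes.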

Structure

A basic awk program is structured as follows:

awk -F'= +' '
# this is a comment (starting with #)
# begin clause
BEGIN {
    # do stuff BEFORE parsing the file
    FS = "= +"    # this is also achieved using the -F flag above
    # ...
}
/some regex/ {
    # code here will be executed if the record matches "some regex"
    # example: count the number of lines that match this regex
    count++   # increment count by one
}
NR == 1 {
    # code here will only be executed on the first record
}
{
    # code right here will always be executed (i.e. for every record)
    # note the regex is missing => match every record
    # ...
}
# add more clauses to match certain records before the end clause:
END {
    # execute code AFTER all files (you can read multiple files) have been parsed
    print count   # print number of records matching our regex
}' path/to/some/file_to_parse /another/path/to/another/file

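Basically, the code in braces is executed when the condition in front of it evaluates to true, whether that condition is a regex found in the record (/regex notation/) or a logical comparison. When the condition is missing, the code is always executed.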
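A compact runnable sketch of these clause types, on made-up input, may help. The names first, count, total and summary are illustrative and not part of the answer's script.

```shell
# Six input records: two small blocks, each with one parameter1 line.
summary=$(printf 'numUnitsA {\n    parameter1 = 2\n}\nnumUnitsB {\n    parameter1 = 4\n}\n' |
    awk '
        NR == 1      { first = $0 }   # comparison: true for the first record only
        /parameter1/ { count++ }      # regex: true for records that match
                     { total++ }      # no condition: runs for every record
        END          { print first; print count " of " total }
    ')

printf '%s\n' "$summary"
```

The END clause runs once, after the last record, so it sees the final values of all three variables.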

Walking through the code

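As you can see, we have no BEGIN clause and only one record clause. We look for records (lines, in our case) that contain the literal string 'parameter1'. These are exactly the lines that hold the values we want to sum.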

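We set the field separator to the regex = *, meaning an equals sign followed by zero or more spaces. Note that for the records we are interested in, this gives us two fields:

    parameter1 = 4
    ^^^^^^^^^^^  ^
    field 1      field 2

This means $2 now refers to 4. Because the separator allows zero spaces, the same holds for the record parameter1=4. With the stricter separator = + (an equals sign followed by one or more spaces), $2 would be empty for that record, because there is no space after the equals sign.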
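The effect of the separator on field 2 can be checked directly. Below is a small sketch that splits the same two records with = * (the separator the script passes to -F) and with the stricter = +, printing field 2 in brackets so an empty field is visible. The variable names loose and strict are made up for illustration.

```shell
# Two records: one with spaces around the equals sign, one without.
records=$(printf 'parameter1 = 4\nparameter1=4')

# "= *": equals sign followed by zero or more spaces.
loose=$(printf '%s\n' "$records" | awk -F'= *' '{ print "[" $2 "]" }')

# "= +": equals sign followed by one or more spaces.
strict=$(printf '%s\n' "$records" | awk -F'= +' '{ print "[" $2 "]" }')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

With = + the second record contains no separator at all, so the whole line becomes field 1 and field 2 is empty.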

Now we switch over three cases:

numUnit is congruent to 1 modulo 3
numUnit is congruent to 2 modulo 3
numUnit is congruent to 0 modulo 3.

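Note that we start with if (++numUnit ..., which increments the variable numUnit before the expression is evaluated (so before the if checks its condition). As you can see, awk is not strongly typed, so there is no need to declare numUnit first. On the first increment, awk treats it as zero, because an uninitialized variable used in a numeric context has the value 0.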

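So numUnit is incremented each time we find a record containing parameter1. Since numUnit is 1 the first time it is evaluated, the remainders follow the pattern 1 2 0 1 2 0 ..., while the units follow the pattern numUnitsA numUnitsB numUnitsC numUnitsA numUnitsB .... Each case therefore handles all records of exactly one unit and no others, and adds the parameter value to that unit's total.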
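A short trace makes the remainder pattern visible. This sketch feeds four made-up parameter1 lines through the same increment-and-modulo expression and prints the remainder next to the value it would be paired with. The names r and trace are illustrative.

```shell
# Four matching records: units A, B, C of one structure, then A of the next.
trace=$(printf 'parameter1 = 2\nparameter1 = 4\nparameter1 = 3\nparameter1 = 4\n' |
    awk -F'= *' '/parameter1/ {
        r = ++numUnit % 3    # numUnit starts out unset, which counts as 0
        print r, $2          # remainder, then the value it selects
    }')

printf '%s\n' "$trace"
```

The fourth line gets remainder 1 again, so its value lands in the same total as the first line.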

Finally, we end the awk script by printing the totals. Remember that this happens only once, after all records have been read.

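I strongly recommend reading more about awk. It is a very powerful scripting language that supports many constructs found in high-level programming languages. It may look hard at first, but it is well worth the effort!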
