In a CSV file, subtotal 2 columns based on a third one, using AWK in KSH

Question

Disclaimers:

    1) English is my second language, so please forgive any grammatical horrors you may find. I am pretty confident you will be able to understand what I need despite these.
    2) I have found several examples in this site that address questions/problems similar to mine, though I was unfortunately not able to figure out the modifications that would need to be introduced to fit my needs.

"Problem":

I have a CSV file that looks like this:

c1,c2,c3,c4,c5,134.6,,c8,c9,SERVER1,c11
c1,c2,c3,c4,c5,0,,c8,c9,SERVER1,c11
c1,c2,c3,c4,c5,0.18,,c8,c9,SERVER2,c11
c1,c2,c3,c4,c5,0,,c8,c9,SERVER2,c11
c1,c2,c3,c4,c5,416.09,,c8,c9,SERVER3,c11
c1,c2,c3,c4,c5,0,,c8,c9,SERVER3,c11
c1,c2,c3,c4,c5,12.1,,c8,c9,SERVER3,c11
c1,c2,c3,c4,c5,480.64,,c8,c9,SERVER4,c11
c1,c2,c3,c4,c5,,83.65,c8,c9,SERVER5,c11
c1,c2,c3,c4,c5,,253.15,c8,c9,SERVER6,c11
c1,c2,c3,c4,c5,,18.84,c8,c9,SERVER7,c11
c1,c2,c3,c4,c5,,8.12,c8,c9,SERVER7,c11
c1,c2,c3,c4,c5,,22.45,c8,c9,SERVER7,c11
c1,c2,c3,c4,c5,,117.81,c8,c9,SERVER8,c11
c1,c2,c3,c4,c5,,96.34,c8,c9,SERVER9,c11

Additional facts:

    1) File has 11 columns.
    2) The data in columns 1, 2, 3, 4, 5, 8, 9 and 11 is irrelevant in this case. In other words, I will only work with columns 6, 7 and 10.
    3) Column 10 will typically contain alphanumeric strings (server names), though it may also contain "-" and/or "_".
    4) Columns 6 and 7 will contain only numbers, with up to two decimal places (0 is a possible value). Only one of the two will have data on any given line, never both.

The output I need:

    - A single occurrence of every string in column 10 (as column 1), then the sum (subtotal) of its values in column 6 (as column 2) and last, the sum (subtotal) of its values in column 7 (as column 3).
    - If the total for a field is "0" the field must be left empty, but it must still exist (its respective comma has to be printed).
    - **Note** that the strings in column 10 will be already alphabetically sorted, so there is no need to do that part of the processing with AWK.

Sample output, using the sample above as input:

SERVER1,134.6,,
SERVER2,0.18,,
SERVER3,428.19,,
SERVER4,480.64,,
SERVER5,,83.65
SERVER6,,253.15
SERVER7,,26.96

I have found on this site not one but two AWK one-liners that partially do what I need:

awk -F "," 'NR==1{last=$10; sum=0;}{if (last != $10) {print last "," sum; last=$10; sum=0;} sum += $6;}END{print last "," sum;}' inputfile


awk -F, '{a[$10]+=$6;}END{for(i in a)print i","a[i];}' inputfile

My "problems" are the same in both cases:

    - Subtotals of 0 are printed.
    - I can only handle the sum of one column at a time. Whenever I try to add the second one, I either get a syntax error or the third column is simply not printed at all.

Thanks in advance for your support! Regards, Martin

Answer 1

Something like this?

$ awk 'BEGIN{FS=OFS=","}
            {s6[$10]+=$6; s7[$10]+=$7}
         END{for(k in s6) print k,(s6[k]?s6[k]:""),(s7[k]?s7[k]:"")}' file | sort

SERVER1,134.6,
SERVER2,0.18,
SERVER3,428.19,
SERVER4,480.64,
SERVER5,,83.65
SERVER6,,253.15
SERVER7,,49.41
SERVER8,,117.81
SERVER9,,96.34

Note that your treatment of commas is not consistent: you add an extra comma when the last field is zero (count the commas).
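The comma counts can be checked mechanically. Here is a minimal sketch that uses two lines copied from the expected output in the question; the `expected` variable and the `count_fields` helper are names made up for this illustration:

```shell
# Two lines copied from the expected output posted in the question.
expected='SERVER1,134.6,,
SERVER5,,83.65'

# Print each server name followed by the number of comma-separated fields
# found on its line (count_fields is a hypothetical helper name).
count_fields() {
    awk -F, '{ print $1, NF }'
}

printf '%s\n' "$expected" | count_fields
```

The first line reports 4 fields and the second reports 3, which is the inconsistency: a line whose column 7 subtotal is empty carries one more comma than a line whose column 6 subtotal is empty.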

Answer 2

Your posted expected output doesn't seem to match your posted sample input, so we're guessing, but this might be what you're looking for:

$ cat tst.awk
BEGIN { FS=OFS="," }
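$10 != prev {                # server name changed: flush the previous group
    if (NR > 1) {
        print prev, sum6, sum7
    }
    sum6 = sum7 = ""         # reset to empty strings so untouched sums print as empty
    prev = $10
}
$6 { sum6 += $6 }            # only add when column 6 is non-empty and non-zero
$7 { sum7 += $7 }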
END { print prev, sum6, sum7 }

$ awk -f tst.awk file
SERVER1,134.6,
SERVER2,0.18,
SERVER3,428.19,
SERVER4,480.64,
SERVER5,,83.65
SERVER6,,253.15
SERVER7,,49.41
SERVER8,,117.81
SERVER9,,96.34
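Two awk behaviours are enough to get empty fields instead of zeros: a variable set to the empty string stays empty until something is added to it, and a bare field used as a pattern is false when that field is empty or numerically zero. A small sketch of both, with function names and test data invented for illustration:

```shell
# 1) A variable reset to "" prints as empty unless a value is added to it.
empty_demo() {
    awk 'BEGIN { a = b = ""; a += 0.18; print "[" a "]" "[" b "]" }'
}

# 2) A bare field used as a pattern is false when the field is empty
#    or numerically zero, so only field 3 triggers its action here.
pattern_demo() {
    printf '%s\n' '0,,5' | awk -F, '
        $1 { print "field 1 is true" }
        $2 { print "field 2 is true" }
        $3 { print "field 3 is true" }'
}

empty_demo     # prints [0.18][]
pattern_demo   # prints field 3 is true
```

Resetting the sums to 0 instead of "" would make every untouched subtotal print as 0, which is the behaviour the question wants to avoid.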

awk

ksh