Sum each row in a CSV file and sort by a specific value in bash

Question

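I have a question. Given the comma-separated CSV below, I want to run a script in bash that sums the values in columns 7, 8 and 9 of each row and, for each city, shows the row with the highest sum. The original dataset: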

Row,name,city,age,height,weight,good rates,bad rates,medium rates
1,john,New York,25,186,98,10,5,11
2,mike,New York,21,175,87,19,6,21
3,Sandy,Boston,38,185,88,0,5,6
4,Sam,Chicago,34,167,76,7,0,2
5,Andy,Boston,31,177,85,19,0,1
6,Karl,New York,33,189,98,9,2,1
7,Steve,Chicago,45,176,88,10,3,0

The desired output would be:

Row,name,city,age,height,weight,good rates,bad rates,medium rates,max rates by city
2,mike,New York,21,175,87,19,6,21,46
5,Andy,Boston,31,177,85,19,0,1,20
7,Steve,Chicago,45,176,88,10,3,0,13

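I'm trying the following, but it only gives me the single highest rate number, 46, whereas I need it to show the top row for every city. Any ideas on how to proceed?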

awk 'BEGIN {FS=OFS=","}{sum = 0; for (i=7; i<=9;i++) sum += $i} NR ==1 || sum >max {max = sum}

Answer 1

You can use this awk:

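awk '
BEGIN {FS=OFS=","}
# header line: append the name of the new column
NR==1 {
   print $0, "max rates by city"
   next
}
{
   # sum of good, bad and medium rates for this row
   s = $7+$8+$9
   # keep the row with the highest sum seen so far for this city ($3)
   if (s > max[$3]) {
      max[$3] = s
      rec[$3] = $0
   }
}
END {
   # note: for-in visits the cities in an unspecified order
   for (i in max)
      print rec[i], max[i]
}' file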

Row,name,city,age,height,weight,good rates,bad rates,medium rates,max rates by city
7,Steve,Chicago,45,176,88,10,3,0,13
2,mike,New York,21,175,87,19,6,21,46
5,Andy,Boston,31,177,85,19,0,1,20
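
The `for (i in max)` loop visits array keys in an unspecified order, which is why the cities above do not come out in the order of the desired output. Below is a minimal sketch that keeps the order in which each city first appears in the input, assuming the same column layout. The function name `max_by_city`, the `order` array and the `n` counter are names introduced here for illustration and are not part of the original answer. Testing `$3 in max` explicitly also records a city whose best sum is 0, which the `s > max[$3]` comparison on its own would skip.

```shell
# max_by_city is a hypothetical helper; it takes the CSV path as its argument
max_by_city() {
   awk '
   BEGIN {FS=OFS=","}
   NR==1 {
      print $0, "max rates by city"
      next
   }
   {
      s = $7+$8+$9
      if (!($3 in max)) {
         # first row for this city: remember its position and take the row as is
         order[++n] = $3
         max[$3] = s
         rec[$3] = $0
      } else if (s > max[$3]) {
         # later row with a higher sum replaces the stored one
         max[$3] = s
         rec[$3] = $0
      }
   }
   END {
      # walk the cities in first-seen order instead of for-in order
      for (k = 1; k <= n; k++)
         print rec[order[k]], max[order[k]]
   }' "$1"
}
```

Called as `max_by_city file` on the sample data, this should list New York, Boston and Chicago in that order, matching the desired output in the question.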

Or, to get tabular output:

awk 'BEGIN {FS=OFS=","} NR==1{print $0, "max rates by city"; next} {s=$7+$8+$9; if (s > max[$3]) {max[$3] = s; rec[$3] = $0}} END {for (i in max) print rec[i], max[i]}' file | column -s, -t

Row  name   city      age  height  weight  good rates  bad rates  medium rates  max rates by city
7    Steve  Chicago   45   176     88      10          3          0             13
2    mike   New York  21   175     87      19          6          21            46
5    Andy   Boston    31   177     85      19          0          1             20
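
If the rows should also be ordered by the computed maximum, as the title suggests, one option is to sort everything except the header line before handing the result to `column`. This is a sketch under the assumption that the maximum is the 10th comma-separated field, as in the output above; `sort_keep_header` is a hypothetical helper name, not something from the original answer.

```shell
# sort_keep_header is a hypothetical helper; it reads CSV on stdin
sort_keep_header() {
   # pass the header line through untouched
   IFS= read -r header
   printf '%s\n' "$header"
   # sort the remaining rows numerically, highest first, on field 10
   sort -t, -k10,10nr
}
```

It would sit between the two commands of the pipeline, for example `awk '...' file | sort_keep_header | column -s, -t`, where `awk '...'` stands for the awk command from this answer.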
