Using awk for a report by gathering counts of (sub)instances
For this data set (data.csv; in reality it has a few hundred lines)
Input data
mig|Lecture|12.00
mig|Other|1.681
mige|Research|20.026
mige|Other|4.32
mige|Lecture|0.120
migc|Research|12.83
migc|Lecture|2.170
migc|Other|70.719
done|Research|24.794
done|Lecture|23.123
done|Other|9.96
done|NoMigration|6.9
mig|Research|5.4
md|Required|0.169
md|Required|0.02
mdc|NoMigration|0.122
mdc|Research|0.019
md|Required|2.12
mdc|Research|1.23
mdc|Other|18.53
mdc|Other|2.08
mdc|Lecture|2.5
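I would like to get a report with the columns "status", "category", "nodes", "quota".
Data dictionary
- status: this can be done, md, mdc, mig, migc, mige
- category: Other, Lecture, Research, NoMigration (NoMigration applies only to status done and mdc)
- nodes: the number of instances of the category, counted per status (an integer is expected).
- quota: the sum of the third column over the counted nodes, per status and category.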
Wrong output
Currently I get this:
done|Lecture|4|64.777
mdc|Lecture|6|24.481
md|Lecture|3|2.309
migc|Lecture|3|85.719
mige|Lecture|3|24.466
mig|Lecture|3|19.081
awk code
This is the awk snippet:
awk 'BEGIN {FS="|";OFS="|" }{
nodes[$1]++; # Increment count of lines per status.
quota[$1] += $3; # Accumulate sum of third column.
}
END{for (x in nodes) {
printf("%s|%s|%.f|%.3f\n",x, $2, nodes[x], quota[x]) | "sort";}}' data.csv
The problem is getting the categories per status.
Desired output
The desired output should look like this (abbreviated):
done|Research|1|24.794
done|Lecture|1|23.123
done|Other|1|9.96
done|NoMigration|1|6.9
md|Required|3|2.309
mdc|NoMigration|1|0.122
mdc|Research|2|1.249
mdc|Other|2|20.61
mdc|Lecture|1|2.5
mig|Lecture|1|12
mig|Other|1|1.681
mig|Research|1|5.4
migc|Research|1|12.83
migc|Lecture|1|2.17
migc|Other|1|70.719
mige|Research|1|20.026
mige|Other|1|4.32
mige|Lecture|1|0.12
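The unexpected output in the question is consistent with two things: the arrays are keyed by the status alone, so every category of one status is merged into a single entry, and $2 inside END refers to the last record that was read. A minimal sketch on three lines taken from the input data; the variable names n, s and out are only illustrative, and it assumes an awk that keeps the last record available in END (POSIX specifies this, and gawk and mawk behave that way):

```shell
# Three lines taken from the input data above.
out=$(printf '%s\n' 'mdc|Other|18.53' 'mdc|Other|2.08' 'mdc|Lecture|2.5' |
  awk -F'|' '
    { n[$1]++ }                            # keyed by status only: categories are merged
    END { for (s in n) print s, $2, n[s] } # $2 still belongs to the last record read
  ')
echo "$out"
```

Under that assumption this prints mdc Lecture 3: one line for the status, labeled with the category of the final input line, which is the same pattern as the "Lecture" column in the wrong output.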
You can use a multidimensional array nodes[$1,$2], keyed by status and category,
and print the values in the END section.
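awk 'BEGIN {FS="|";OFS="|"}
{
    nodes[$1,$2] += $3   # running sum of column 3 per (status, category)
    quota[$1,$2]++       # number of lines per (status, category)
}
END {
    for (i in quota) {
        split(i, val, SUBSEP)   # recover status and category from the key
        # note the names: quota[] holds the count, nodes[] holds the sum
        print val[1] OFS val[2] OFS quota[i] OFS nodes[i] | "sort"
    }
}
' data.csv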
Output
done|Lecture|1|23.123
done|NoMigration|1|6.9
done|Other|1|9.96
done|Research|1|24.794
mdc|Lecture|1|2.5
mdc|NoMigration|1|0.122
mdc|Other|2|20.61
mdc|Research|2|1.249
md|Required|3|2.309
migc|Lecture|1|2.17
migc|Other|1|70.719
migc|Research|1|12.83
mige|Lecture|1|0.12
mige|Other|1|4.32
mige|Research|1|20.026
mig|Lecture|1|12
mig|Other|1|1.681
mig|Research|1|5.4
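The sorted output above differs from the desired output in row order: a plain sort compares whole lines, so mdc|... comes before md|... and the categories are alphabetized. If the order from the question matters, one possible variant is to emit the rows in first-seen order and then sort stably on the status field only. This is a sketch, not part of the original answer: the function name report and the arrays count, sum and order are hypothetical, and sort -s (stable sort) is assumed to be available, as it is in GNU and BSD sort.

```shell
# Hypothetical helper: aggregate per (status, category), keep first-seen order,
# then stable-sort on the status field only.
report() {
  awk 'BEGIN { FS = OFS = "|" }
  {
    key = $1 SUBSEP $2
    if (!(key in count)) order[++n] = key   # remember first appearance of the pair
    count[key]++                            # nodes: number of lines for the pair
    sum[key] += $3                          # quota: sum of the third column
  }
  END {
    for (i = 1; i <= n; i++) {
      split(order[i], part, SUBSEP)
      print part[1], part[2], count[order[i]], sum[order[i]]
    }
  }' "$1" | LC_ALL=C sort -s -t'|' -k1,1    # ties keep their input order
}
```

Called as report data.csv, this should group the rows by status (with md before mdc) while leaving the categories inside each status in the order they first appear in the file.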