Simplify an awk "nth column sum"
Could you help me simplify this:
awk 'BEGIN{FS=OFS=","}{rank=1/((1/$6)+(1/$10)+(1/$14)+(1/$18)+(1/$22));print $0,rank}' test.csv
I know the for loop should be:
for(i=6; i<=NF; i+=4)
But I don't know how to build the repeating pattern in AWK. I'm also not sure how awk handles division by zero.
Sample data:
04/12/10 01:15,1291425300,279,41,6,24,71,39,12,1,356,25,4,29,32,10,1,1,170,27,16,8
21/05/14 16:45,1400690700,147,28,80,13,99,7,121,11,107,19,132,12,119,24,40,10,154,25,161,20
09/10/07 09:45,1191923100,152,56,201,35,115,47,157,29,149,47,119,19,131,40,30,11,216,136,213,64
08/06/07 00:30,1181262600,133,47,268,41,93,26,282,40,151,30,249,39,160,46,191,45,164,64,216,42
13/11/09 06:15,1258092900,1043,1462,1163,1456,789,1111,930,1143,954,1460,1366,1469,831,891,728,954,1092,1316,1381,1492
10/03/98 19:30,889558200,789,1240,1176,1262,,,,,,,,,,,,,162,271,1006,283
Sample output:
04/12/10 01:15,1291425300,279,41,6,24,71,39,12,1,356,25,4,29,32,10,1,1,170,27,16,8,0.454308093994778
21/05/14 16:45,1400690700,147,28,80,13,99,7,121,11,107,19,132,12,119,24,40,10,154,25,161,20,2.49273678094131
09/10/07 09:45,1191923100,152,56,201,35,115,47,157,29,149,47,119,19,131,40,30,11,216,136,213,64,4.50004789527607
08/06/07 00:30,1181262600,133,47,268,41,93,26,282,40,151,30,249,39,160,46,191,45,164,64,216,42,8.2601610016789
13/11/09 06:15,1258092900,1043,1462,1163,1456,789,1111,930,1143,954,1460,1366,1469,831,891,728,954,1092,1316,1381,1492,252.467979545275
10/03/98 19:30,889558200,789,1240,1176,1262,,,,,,,,,,,,,162,271,1006,283,#DIV/0!
Like this:
BEGIN{FS=OFS=","}{rank=0;for(i=6;i<=22;i+=4)rank+=($i ? 1/$i : 0);print $0,rank}
$ awk '
BEGIN { FS=OFS="," }
{
    for(i=6;i<=NF;i+=4)          # every 4th column
        if($i+0==0) {            # if there is a 0 divisor
            rank="#DIV/0!"       # set rank to something static
            break                # break from for
        }
        else
            rank+=1/$i           # sum every 4th
    print $0,rank                # output
    rank=0                       # reset
}' file
Output (not checked for correctness):
04/12/10 01:15,1291425300,279,41,6,24,71,39,12,1,356,25,4,29,32,10,1,1,170,27,16,8,2.20115
21/05/14 16:45,1400690700,147,28,80,13,99,7,121,11,107,19,132,12,119,24,40,10,154,25,161,20,0.401166
09/10/07 09:45,1191923100,152,56,201,35,115,47,157,29,149,47,119,19,131,40,30,11,216,136,213,64,0.22222
08/06/07 00:30,1181262600,133,47,268,41,93,26,282,40,151,30,249,39,160,46,191,45,164,64,216,42,0.121063
13/11/09 06:15,1258092900,1043,1462,1163,1456,789,1111,930,1143,954,1460,1366,1469,831,891,728,954,1092,1316,1381,1492,0.0039609
10/03/98 19:30,889558200,789,1240,1176,1262,,,,,,,,,,,,,162,271,1006,283,#DIV/0!
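Note that the values listed above are the sum of the reciprocals (for the first row, 1/24 + 1/1 + 1/29 + 1/1 + 1/8 ≈ 2.20115), while the sample output in the question is the reciprocal of that sum (1/2.20115 ≈ 0.454308). The following is a minimal sketch of a variant that takes the final reciprocal. The function name `harmonic_rank` and the variables `sum` and `bad` are made up for illustration and do not come from the posts above. It also guards the division explicitly, because what awk does on a zero divisor depends on the implementation (gawk, for example, stops with a fatal "division by zero attempted" error).

```shell
#!/bin/sh
# harmonic_rank: hypothetical helper name, not from the original posts.
# Reads comma-separated lines from stdin (or the files given as arguments)
# and appends 1 / (1/$6 + 1/$10 + 1/$14 + ...) to each line.
harmonic_rank() {
    awk '
    BEGIN { FS = OFS = "," }
    {
        sum = 0; bad = 0
        for (i = 6; i <= NF; i += 4) {      # every 4th column, starting at 6
            if ($i + 0 == 0) {              # empty or zero field: cannot divide
                bad = 1
                break
            }
            sum += 1 / $i
        }
        # sum == 0 also covers lines with fewer than 6 fields
        rank = (bad || sum == 0) ? "#DIV/0!" : 1 / sum
        print $0, rank
    }' "$@"
}

# Usage with the first and last sample rows from the question
printf '%s\n' \
  '04/12/10 01:15,1291425300,279,41,6,24,71,39,12,1,356,25,4,29,32,10,1,1,170,27,16,8' \
  '10/03/98 19:30,889558200,789,1240,1176,1262,,,,,,,,,,,,,162,271,1006,283' |
  harmonic_rank
```

With awk's default output format the first row ends in 0.454308 and the last one in #DIV/0!. If more digits are wanted, setting `OFMT` in the `BEGIN` block (or using `printf`) should give them.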