How can I sort output from a variable?
I want to be able to sort a comma-separated input CSV file by a value that is computed into an extra column. Here is a sample of the input CSV file:
Timestamp,Email,Name,Year,Make,Model,Car_ID,Judge_ID,Judge_Name,Racer_Turbo,Racer_Supercharged,Racer_Performance,Racer_Horsepower,Car_Overall,Engine_Modifications,Engine_Performance,Engine_Chrome,Engine_Detailing,Engine_Cleanliness,Body_Frame_Undercarriage,Body_Frame_Suspension,Body_Frame_Chrome,Body_Frame_Detailing,Body_Frame_Cleanliness,Mods_Paint,Mods_Body,Mods_Wrap,Mods_Rims,Mods_Interior,Mods_Other,Mods_ICE,Mods_Aftermarket,Mods_WIP,Mods_Overall
8/5/2018 14:10,honoland13@japanpost.jp,Hernando,2015,Acura,TLX,48,J04,Bob,0,0,2,2,4,4,0,2,4,4,2,4,2,2,2,2,2,0,4,4,4,6,2,0,4
8/5/2018 15:11,nlighterness2q@umn.edu,Noel,2015,Jeep,Wrangler,124,J02,Carl,0,6,4,2,4,6,6,4,4,4,6,6,6,6,6,4,6,6,6,6,6,4,6,4,6
8/5/2018 17:10,eguest47@microsoft.com,Edan,2015,Lexus,Is250,222,J05,Adrian,0,0,0,0,0,0,0,0,6,6,6,0,0,6,6,6,0,0,0,0,0,0,0,0,4
8/5/2018 17:34,hchilley40@fema.gov,Hieronymus,1993,Honda,Civic eG,207,J06,Aaron,0,0,2,2,2,2,2,2,0,4,2,2,2,2,2,2,4,2,2,0,0,0,2,2,0
8/5/2018 14:30,nnowick3d@tuttocitta.it,Nickolas,2016,Ford,Mystang,167,J02,Carl,0,0,2,2,0,2,2,0,0,0,0,2,0,2,2,2,0,0,2,0,0,0,0,0,2
8/5/2018 16:12,mdearl39@amazon.co.uk,Martin,2013,Hyundai,Gen coupe,159,J04,Bob,0,0,2,0,0,0,2,0,0,0,0,2,0,2,2,0,2,0,2,0,0,0,0,0,0
8/5/2018 17:00,alynamg@blogtalkradio.com,Aldridge,2009,Infiniti,G37,20,J06,Aaron,2,0,2,2,0,0,2,0,0,2,2,2,2,2,2,2,2,2,4,2,2,0,2,0
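What my code currently does is scan the CSV file and pick out the Car_ID, Year, Make, and Model columns. It then loops over every column from Racer_Turbo to the last column and, for each row, adds the values in those columns into a total, which it prints alongside the other values (ID, make, model, and so on). A Ranking column is also printed ahead of the other five columns. My code is below.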
BEGIN {
    FS = ",";
    OFS = "\t";
    print "Ranking", "Car_ID", "Year", "Make", "Model", "Total";
}
{
    rank;
    total = 0;
    if(NR > 1) {
        for(i = 8; i < NF; i++) {
            total += $i;
        }
        print ++rank, $7, $4, $5, $6, total;
    }
    rows[$5][total][$0]
}
END {
    print "\n";
    print "Ranking", "Car_ID", "Year", "Make", "Model", "Total";
    ranking;
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (m in rows) {
        n = asorti(rows[m], t, "@ind_num_desc");
        n = (n>3) ? 3 : n
        for(i = 1; i <= n; i++) for(s in rows[m][t[i]]) {
            $0 = s;
            $1 = ++r;
            print ++ranking, $7, $4, $5, $6, total;
        }
    }
}
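What I want to do in the END block is print the output again, but this time ranking the top three cars of each make by the Total column computed in the previous block. However, when I run my code, the output currently looks like this: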
Ranking Car_ID Year Make Model Total
1 48 2015 Acura TLX 58
2 124 2015 Jeep Wrangler 118
3 222 2015 Lexus Is250 36
4 207 1993 Honda Civic eG 40
5 167 2016 Ford Mystang 18
6 159 2013 Hyundai Gen coupe 14
7 20 2009 Infiniti G37 36
...
Ranking Car_ID Year Make Model Total
1 113 2012 Acura Tsx sportwagon 10
2 112 2008 Acura TL 10
3 50 2015 Acura TLX 10
4 15 2014 Audi S4 10
5 18 2015 Audi S3 10
6 116 2008 Audi A4 10
7 2 2016 Bmw M2 10
8 172 2014 Bmw 4 10
9 28 1995 Bmw 318xi 10
...
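Notice that in the Total column of the second printed section, every car's total is 10 instead of the value shown for that car in the first section, and the cars listed for each make are not the three with the highest totals.
Below is the expected output: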
Ranking Car_ID Year Make Model Total
1 48 2015 Acura TLX 58
2 124 2015 Jeep Wrangler 118
3 222 2015 Lexus Is250 36
4 207 1993 Honda Civic eG 40
5 167 2016 Ford Mystang 18
6 159 2013 Hyundai Gen coupe 14
7 20 2009 Infiniti G37 36
8 178 2009 Honda Oddesy 66
...
Ranking Car_ID Year Make Model Total
1 112 2008 Acura TL 110
2 50 2015 Acura TLX 102
3 127 2013 Acura Tsx 86
4 15 2014 Audi S4 120
5 18 2015 Audi S3 38
6 116 2008 Audi A4 28
7 2 2016 Bmw M2 24
8 172 2014 Bmw 4 22
9 111 2007 Bmw 328i 10
10 218 2010 Chevy Camaro 64
11 170 2014 Chevy Cruze 50
12 0 2015 Chevy Camaro 0
...
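Can this be fixed with my current code? Or would it be better to create a separate awk file that sorts the generated output and produces another file ranked by the top three?
I am running GNU AWK v4.0.2.
Assuming Car_ID (hereafter id) is unique across rows, please try the following: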
BEGIN {
    FS = ","
    OFS = "\t"
    print "Ranking", "Car_ID", "Year", "Make", "Model", "Total"
}
{
    rank
    total = 0
    if (NR > 1) {
        for (i = 8; i < NF; i++) {
            total += $i
        }
        print ++rank, $7, $4, $5, $6, total
        ttl[$5][$7] = total     # total keyed by make, then by id
        row[$7] = $0            # full input line keyed by id
    }
}
END {
    print "\n"
    print "Ranking", "Car_ID", "Year", "Make", "Model", "Total"
    ranking
    id
    PROCINFO["sorted_in"] = "@ind_str_asc"      # visit makes in alphabetical order
    for (m in ttl) {
        # t receives the ids of make m, ordered by total, highest first
        n = asorti(ttl[m], t, "@val_num_desc")
        n = (n>3) ? 3 : n
        for (i = 1; i <= n; i++) {
            id = t[i]
            total = ttl[m][id]
            $0 = row[id]        # re-split the saved line into fields
            print ++ranking, $7, $4, $5, $6, total
        }
    }
}
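I modified the data structure slightly, making id the primary key. I then created a two-dimensional array ttl that holds the value of total keyed by make and id. In the END block, the input data can then be retrieved using id.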
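The keying idea can be tried in isolation. The following is only a sketch under assumptions: it is written for POSIX awk, which has no arrays of arrays, so it swaps in comma subscripts (joined with SUBSEP) in place of gawk's ttl[make][id]. The function name lookup_by_id, the helper array make, and the reduced three-column input (Car_ID,Make,Total) are hypothetical and chosen for illustration.

```shell
#!/bin/sh
# Store each total under (make, id) and each input row under id,
# then fetch one car again using only its id.
# Assumption: input rows are Car_ID,Make,Total (a reduced, hypothetical layout).
lookup_by_id() {
    # $1: the Car_ID to look up; rows are read from stdin.
    awk -F, -v want="$1" '
        {
            ttl[$2, $1] = $3    # total keyed by make and id
            row[$1]     = $0    # whole input line keyed by id
            make[$1]    = $2    # lets us rebuild the (make, id) key later
        }
        END {
            if (want in row)
                print row[want], ttl[make[want], want]
        }
    '
}

# Two rows whose ids and totals come from the expected output in the question.
printf '%s\n' '112,Acura,110' '15,Audi,120' | lookup_by_id 112
```

Because id alone identifies a row, the lookup does not depend on the total at all, which is what makes equal totals harmless in this layout.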
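As a side note, your original data structure uses total as an index. If several rows of the same make happen to have the same total, they all land under that one index, so taking the first three indexes no longer means taking three cars.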
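On the question's second option, a separate sorting step is also workable. The sketch below is not the gawk approach shown earlier. It assumes the first listing has already been written out as tab-separated rows of Car_ID, Year, Make, Model, Total, with no header and no Ranking column, and the function name top3_per_make is hypothetical. It relies only on POSIX sort and awk, and every car stays on its own line, so equal totals within a make are not merged.

```shell
#!/bin/sh
# Keep the three highest totals per make from tab-separated rows of
# Car_ID, Year, Make, Model, Total (assumed layout, no header line).
top3_per_make() {
    tab=$(printf '\t')
    # Group by make (field 3), then order by total (field 5), highest first.
    sort -t "$tab" -k3,3 -k5,5nr |
    # Count rows per make; print only the first three of each group,
    # prefixed with a running rank.
    awk -F '\t' -v OFS='\t' '++seen[$3] <= 3 { print ++rank, $0 }'
}

# Sample rows taken from the listings in the question.
printf '%s\t%s\t%s\t%s\t%s\n' \
    48  2015 Acura TLX 58 \
    15  2014 Audi  S4  120 \
    127 2013 Acura Tsx 86 \
    18  2015 Audi  S3  38 \
    112 2008 Acura TL  110 \
    50  2015 Acura TLX 102 |
top3_per_make
```

With these six sample rows, the Acura entry with total 58 is dropped and the remaining five rows are printed grouped by make, highest total first, with a fresh ranking in the first column.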