Delete CSV file rows based on the value of a column on the command line
This is what my dataset looks like. I am trying to keep only the countries whose 4th column is >= 1000.
Marshall Islands,53127,77,41
Vanuatu,276244,25,70
Solomon Islands,611343,23,142
Sao Tome and Principe,204327,72,147
Belize,374681,46,171
Maldives,436330,39,172
Guyana,777859,27,206
Eswatini,1367254,24,323
Timor-Leste,1296311,30,392
Lesotho,2233339,28,619
Guinea-Bissau,1861283,43,799
Namibia,2533794,49,1242
Gambia,2100568,61,1273
.
.
.
Zimbabwe,16529904,32,5329
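(77 rows of data in total)
I have tried running the following command in my terminal, but it writes only 1 row of the dataset to the new file.
awk -F, '$4 > 999' original.csv > new.csv
*Update: every line except the Zimbabwe line ends with ^M$.
This is the desired output: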
Namibia,2533794,49,1242
Gambia,2100568,61,1273
Burundi,10864245,13,1380
Armenia,2930450,63,1849
Rwanda,12208407,17,2091
Mongolia,3075647,68,2103
Kyrgyzstan,6045117,36,2184
Mauritania,4420184,53,2335
Lao People's Democratic Republic,6858160,34,2357
Liberia,4731906,51,2399
Tajikistan,8921343,27,2407
Sierra Leone,7557212,42,3147
Togo,7797694,41,3210
Chad,14899994,23,3406
Congo,5260750,66,3496
Cambodia,16005373,23,3678
Paraguay,6811297,61,4175
El Salvador,6377853,71,4546
Guinea,12717176,36,4552
Benin,11175692,47,5227
Zimbabwe,16529904,32,5329
Azerbaijan,9827589,55,5439
Burkina Faso,19193383,29,5517
Nepal,29304998,19,5666
Haiti,10981229,54,5968
Somalia,14742523,44,6544
Zambia,17094131,43,7346
Senegal,15850567,47,7409
Bolivia (Plurinational State of),11051600,69,7634
Mali,18541980,42,7708
Tunisia,11532127,69,7916
Guatemala,16913504,51,8572
Dominican Republic,10766998,80,8643
Cuba,11484636,77,8841
Afghanistan,35530082,25,8971
Syrian Arab Republic,18269867,54,9774
Uganda,42862957,23,9942
Yemen,28250420,36,10175
Kazakhstan,18204498,57,10438
Ecuador,16624857,64,10585
Côte d'Ivoire,24294750,50,12227
Kenya,49699863,27,13201
Cameroon,24053727,56,13416
Sudan,40533328,34,13931
Ghana,28833629,55,15976
Myanmar,53370609,30,16183
United Republic of Tanzania,57310020,33,18943
Angola,29784193,65,19312
Ethiopia,104957438,20,21317
Peru,32165484,78,24999
Iraq,38274617,70,26899
Algeria,41318141,72,29771
Viet Nam,95540797,35,33643
Thailand,69037516,49,33966
Democratic Republic of the Congo,81339984,44,35692
South Africa,56717156,66,37348
Colombia,49065613,80,39471
Egypt,97553148,43,41660
Philippines,104918094,47,48978
Bangladesh,164669750,36,59047
Pakistan,197015953,36,71797
Nigeria,190886313,50,94525
Mexico,129163273,80,103159
Indonesia,263991375,55,144295
India,1339180125,34,449965
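Does anyone have a suggestion for how to fix this?
It looks like the last field in your Input_file may contain whitespace. You can check by running cat -e Input_file, which shows where each line ends, including any hidden whitespace at the end of the line. If that is the case, try the following command.
awk 'BEGIN{FS=","} $4+0 > 999' Input_file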
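Since the update in the question shows lines ending in ^M (a carriage return), one possible variation is to strip that trailing character before comparing, so the rows that are kept are also written without it. This is only a sketch: the file names sample.csv and filtered.csv are made up for illustration, and the sub() call is an addition that is not part of the command above.

```shell
# Build a small sample with Windows-style line endings (\r\n) on all but the
# last line, mirroring what the question describes. File names are hypothetical.
printf 'Guinea-Bissau,1861283,43,799\r\nNamibia,2533794,49,1242\r\nZimbabwe,16529904,32,5329\n' > sample.csv

# Remove a trailing carriage return from each record, then keep the rows whose
# 4th field is numerically greater than 999. Modifying $0 makes awk re-split
# the fields, so $4 no longer carries the \r.
awk -F, '{ sub(/\r$/, "") } $4+0 > 999' sample.csv > filtered.csv

cat filtered.csv
# Namibia,2533794,49,1242
# Zimbabwe,16529904,32,5329
```

The $4+0 on its own already forces a numeric comparison, but the rows it prints would still end with the carriage return, which is why the sketch removes it first.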