Delete lines in CSV file with a column condition in bash
I have a big CSV file (5 GB). The header is:
run number,export,downerQ,coefUpQuality,chooseMode,demandF,nbPLots,standarDevPop,nbCitys,whatWord,priceMaxWineF,marketColor,[step],giniIndexReserve,giniIndexPatch,meanQualityTotal,meanQualityMountain,meanQualityPlain,DiffExtCentral,nbcentralPlots,meanPatchByNetwork,sum_q_viti_moutain,sum_q_viti_plaine
"3","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.07083333333333335","0","0","0","0","0","0","48","0"
"4","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.04285714285714286","0","0","0","0","0","0","42","0"
"2","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.05348837209302328","0","0","0","0","0","0","43","0"
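I want to keep only the lines that have "500" in the [step] field (the 13th field).
- I tried importing this CSV into sqlite ... but the delete crashed...
- R crashes too (even with fread from data.table)
Can anyone solve this with sed, awk, or any other command-line tool?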
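awk seems like the right tool:
awk -F, 'NR == 1 || $13 == "\"500\""' filename
Here NR == 1 keeps the first line (the header); after that, only the lines whose 13th field is exactly "500" (quotes included) are printed.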
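As a rough sketch of how this filter could be tried out before running it on the 5 GB file, the snippet below builds a tiny sample and filters it. The file names sample.csv and filtered.csv, the four-column layout, and the col variable are assumptions made for illustration only; for the real file col would be 13. It strips the double quotes before comparing, so it matches both 500 and "500", and it assumes no field contains an embedded comma, which holds for the rows shown in the question.

```shell
#!/bin/sh
# Hypothetical sample with the same quoting style as the question.
# Here [step] is column 3; in the real file it is column 13.
cat > sample.csv <<'EOF'
run number,export,[step],sum
"3","false","0","48"
"4","false","500","42"
"2","false","1500","43"
EOF

# Keep the header, then keep only rows whose chosen column equals 500.
# gsub removes the surrounding double quotes so the comparison is on the bare value.
awk -F, -v col=3 '
    NR == 1 { print; next }            # always keep the header line
    {
        v = $col
        gsub(/"/, "", v)               # "500" -> 500
        if (v == "500") print          # string comparison, so 1500 does not match
    }
' sample.csv > filtered.csv

cat filtered.csv
```

Because awk reads the input one line at a time, memory use should stay small regardless of file size; redirecting to a new file and renaming it afterwards avoids editing the large file in place.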