Remove all lines in a file containing a string from another file
I want to delete every line in a file that matches a string listed in another file. This is what I used, but it only removed some of the lines:
grep -vFf to_delete.csv inputfile.csv > output.csv
Here are sample lines from my input file (inputfile.csv):
Ata,Aqu,Ama3,Abe,0.053475,0.025,0.1,0.11275,0.1,0.15,0.83377
Ata135,Aru2,Aba301,A29,0.055525,0.025,0.1,0.082825,0.075,0.125
Ata135,Atb,Aca,Am54,0.14695,0.1,0.2,0.05255,0.025,0.075,0.8005,
Adc,Aru7,Ama301,Agr84,0.002075,0,0.025,0.240075,0.2,0.
My file "to_delete.csv" looks like this, for example:
Aqu
Aca
So any line containing one of these strings should be deleted; in this example, lines 1 and 3 should go. Desired output:
Ata135,Aru2,Aba301,A29,0.055525,0.025,0.1,0.082825,0.075,0.125
Adc,Aru7,Ama301,Agr84,0.002075,0,0.025,0.240075,0.2,0.
EDIT: Since the OP has carriage returns in their file, adding a solution for that as well.
cat -v Input_file ##To check if carriage returns are there or not.
tr -d '\r' < Input_file > temp_file && mv temp_file Input_file
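To illustrate why carriage returns would make the original grep command miss lines, here is a minimal sketch. It assumes the CRs are in to_delete.csv (the thread does not say which file had them), and the scratch directory plus the names before.csv, after.csv and to_delete.clean are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical scratch directory; file names mirror the question.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Shortened versions of the sample rows from the question.
printf 'Ata,Aqu,Ama3\nAta135,Aru2,Aba301\nAta135,Atb,Aca\n' > inputfile.csv

# Assumption: the pattern file was saved with Windows (CRLF) line endings.
printf 'Aqu\r\nAca\r\n' > to_delete.csv

# Each fixed-string pattern now ends in a CR ("Aqu\r", "Aca\r"), which never
# occurs in the middle of a data line, so no line is filtered out.
grep -vFf to_delete.csv inputfile.csv > before.csv

# Strip the CRs from the pattern file and run the same grep again.
tr -d '\r' < to_delete.csv > to_delete.clean
grep -vFf to_delete.clean inputfile.csv > after.csv

echo "lines kept with CRs present:"
wc -l < before.csv
echo "lines kept after stripping CRs:"
cat after.csv
```

If the CRs are in the data file instead, the same tr command applies to that file, as shown above.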
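Since your sample Input_file and expected output were not clear, I could not test this fully; please try the following (if you are OK with awk). The > temp_file && mv temp_file Input_file at the end of the command saves the output back into Input_file itself. Note that, unlike grep -F, which matches substrings anywhere in the line, this compares whole comma-separated fields.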
awk -F, 'FNR==NR{a[$0];next} {for(i=1;i<=NF;i++){if($i in a){next}}} 1' to_delete.csv Input_file > temp_file && mv temp_file Input_file
Explanation: adding a detailed explanation of the code above.
awk -F, '                        ##Set the field separator to a comma.
FNR==NR{                         ##FNR==NR is TRUE only while the first file (to_delete.csv) is being read.
  a[$0]                          ##Create an entry in array a whose index is the whole current line ($0).
  next                           ##Skip the remaining statements and move on to the next line.
}
{
  for(i=1;i<=NF;i++){            ##Loop over every field, from i=1 up to NF.
    if($i in a){                 ##Check whether field $i is an index of array a.
      next                       ##If so, skip this line (we do NOT want to print matching lines).
    }                            ##Close the if block.
  }                              ##Close the for loop.
}                                ##Close the block that runs for the second file (Input_file).
1                                ##A pattern that is always TRUE with no action, so awk prints the current line by default.
' to_delete.csv Input_file       ##The two input files, in this order.
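As a quick check, the awk approach can be run against the sample data from the question. This is only a sketch: the rows are shortened to five fields, the scratch directory is hypothetical, and the result is written to output.csv rather than back over Input_file so the input stays intact.

```shell
#!/bin/sh
# Hypothetical scratch directory so nothing in the current directory is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Sample rows from the question, shortened to five fields each.
cat > Input_file <<'EOF'
Ata,Aqu,Ama3,Abe,0.053475
Ata135,Aru2,Aba301,A29,0.055525
Ata135,Atb,Aca,Am54,0.14695
Adc,Aru7,Ama301,Agr84,0.002075
EOF

# Strings whose presence as a whole field should drop the line.
printf 'Aqu\nAca\n' > to_delete.csv

# First file fills array a; second file is printed unless some field is in a.
awk -F, 'FNR==NR{a[$0];next} {for(i=1;i<=NF;i++){if($i in a){next}}} 1' \
    to_delete.csv Input_file > output.csv

cat output.csv
```

Rows 1 and 3 contain Aqu and Aca as complete fields, so only rows 2 and 4 remain in output.csv.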