How to use awk to filter a .csv file
I'd like to know how to use awk, or some other CLI tool, to get the fruit names out of this .csv file.
I used a vim macro to edit the file, but I suspect a simple one-liner can do the same thing.
"1000","Apple","4","133"
"1028","Lemon","3","120"
"1029","Lime","3","165"
"1030","Lychee","6","120"
"1031","Mango","6","131"
"1032","Mangostine","1","181"
"1033","Melon","4","159"
"1034","Cantaloupe","4","138"
"1035","Honeydew melon","4","155"
"1036","Watermelon","5","176"
"1037","Rock melon","2","180"
"1038","Nectarine","1","128"
"1039","Orange","6","142"
"1040","Peach","6","179"
"1041","Pear","3","102"
"1042","Williams pear or Bartlett pear","1","164"
"1043","Pitaya","2","170"
"1044","Physalis","5","166"
"1045","Plum/prune (dried plum)","4","103"
"1046","Pineapple","3","120"
"1047","Pomegranate","5","112"
"1048","Raisin","4","111"
"1049","Raspberry","5","156"
"1050","Western raspberry (blackcap)","6","173"
The end result I want looks like this:
Apple
Lemon
Lime
Lychee
Mango
Mangostine
Melon
Cantaloupe
Honeydew melon
Watermelon
Rock melon
Nectarine
Orange
Peach
Pear
Williams pear or Bartlett pear
Pitaya
Physalis
Plum/prune (dried plum)
Pineapple
Pomegranate
Raisin
Raspberry
Western raspberry (blackcap)
I realize this is a duplicate of:
How to parse a CSV in a Bash script?
I suggest:
awk -F '","' '{print $2}' file
This uses "," (quote, comma, quote) as the field separator and prints the second column.
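To see why the second column comes out clean with this separator, it can help to print every field awk produces for one row. The following is only a minimal sketch: the helper name show_fields and the choice of sample row are assumptions made for illustration.

```shell
# show_fields: print each field awk sees when "," is the separator.
# (show_fields is a hypothetical helper name, not part of awk.)
show_fields() {
    awk -F '","' '{ for (i = 1; i <= NF; i++) printf "[%s]%s", $i, (i < NF ? " " : "\n") }' "$@"
}

# One row from the question, fed on standard input.
printf '%s\n' '"1035","Honeydew melon","4","155"' | show_fields
# prints: ["1035] [Honeydew melon] [4] [155"]
```

Only the first and last fields keep a stray quote, so any middle column can be printed as-is; the first or last column would still need its leftover quote removed.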
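Use this Perl one-liner:
perl -F',' -lane '$F[1] =~ tr/"//d; print $F[1];' in_file > out_file
The Perl one-liner uses these command-line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loops over the input one line at a time, assigning it to $_ by default.
-l : Strips the input line separator ("\n" on *NIX by default) before executing the code in-line, and appends it when printing.
-a : Splits $_ into the array @F on whitespace, or on the regex specified in the -F option.
-F',' : Splits into @F on comma, rather than on whitespace.
See also:
perldoc perlrun: how to execute the Perl interpreter: command line switches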
With GNU awk and gensub():
$ gawk '{print gensub(/^[^,]*,"|([^,])".*/,"\\1","g")}' file
Output:
Apple
...
Lemon
Lime
Using awk:
Remove the " characters from the second field only, and only at its beginning and end.
awk -F',' '{gsub(/^"|"$/,"",$2);print $2}' file
Apple
Lemon
Lime
Lychee
Mango
Mangostine
Melon
Cantaloupe
Honeydew melon
Watermelon
Rock melon
Nectarine
Orange
Peach
Pear
Williams pear or Bartlett pear
Pitaya
Physalis
Plum/prune (dried plum)
Pineapple
Pomegranate
Raisin
Raspberry
Western raspberry (blackcap)
Using sed and awk together:
sed -e 's/^"//;s/","/\t/g;s/"//g' Input.csv | awk -F'\t' '{print $2}'
or
awk -F, '{print $2}' Input.csv | sed 's/"//g'
Either one can print any column by changing the awk column number.
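If the column number changes often, it could be passed in as a variable instead of editing the awk program each time. This is a rough sketch, assuming no field contains a comma or an embedded quote; csv_col is a hypothetical helper name.

```shell
# csv_col N [FILE...]: print column N with its double quotes removed.
# (csv_col is a hypothetical helper; it splits on every comma, so it
# is only safe when no field contains a comma.)
csv_col() {
    col=$1
    shift
    awk -F, -v col="$col" '{ gsub(/"/, "", $col); print $col }' "$@"
}

# Reads standard input when no file is given.
printf '%s\n' '"1000","Apple","4","133"' | csv_col 2
# prints: Apple
```

Calling it as csv_col 4 Input.csv would print the last column of Input.csv in the same way.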