How can I replace the pattern ",," with <RETURN> in awk?
I am running an ldapsearch query that returns results like the following:
John Joe jjoe@company.com +1 916 662-4727 Ann Tylor Atylor@company.com (987) 654-3210 Steve Harvey sharvey@company.com 4567893210 (321) 956-3344 ...
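As you can see, there is a single blank space between the fields of each person's record, the phone number may or may not start with +1, there may be blanks between the digits or parentheses, and each person's record ends with two blank spaces. The output I want looks like this, for example: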
John,Joe,jjoe@company.com,(916) 662-4727
Ann,Tylor,Atylor@company.com,(987) 654-3210
Steve,Harvey,sharvey@company.com,(456) 789-3210,(321) 956-3344
I am trying awk and managed to replace each blank with ",", which turns
<blank><blank> into a double comma ",,".
But I can't figure out how to turn ",," into <RETURN>.
11/22/2017 ----****** UPDATE ******-------- 11/22/2017
I have made this thread too crowded. I will post a new question with more details.
If your Input_file is the same as the sample shown, then the following awk may help:
awk --re-interval '{gsub(/[0-9]{3}-[0-9]{4} +/,"&\n");print}' Input_file
I am using an old version of awk, so I pass --re-interval; newer versions of gawk do not need it. Here is the same command with comments:
awk --re-interval '{ ## --re-interval enables {n} interval expressions in the regex, since I have an old version of awk.
gsub(/[0-9]{3}-[0-9]{4} +/,"&\n"); ## gsub (global substitute): match 3 consecutive digits, a dash (-), 4 consecutive digits and the space(s) after them, then put back the match (&) followed by a NEW LINE.
print ## Print the modified line of Input_file.
}' Input_file ## Input_file is named here.
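For the literal question in the title, gsub can also replace ",," directly. This is only a minimal sketch, assuming the blanks have already been turned into commas so that ",," marks the end of each record; the function name split_records and the sample line are made up for illustration.

```shell
# split_records is a hypothetical helper name, not part of awk.
# It replaces every ",," in each input line with a newline.
split_records() {
    awk '{
        gsub(/,,/, "\n")   # every ",," becomes a line break
        print              # print the (now multi-line) record
    }' "$@"
}

# Hypothetical sample: two records joined by ",,"
printf '%s\n' 'John,Joe,jjoe@company.com,(916) 662-4727,,Ann,Tylor,Atylor@company.com,(987) 654-3210' |
    split_records
```

If a line ends with ",,", this leaves an empty line after the last record, so you may want to strip the trailing ",," first.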
Given your requirements, you need sed:
$ cat sed-script
s/\ \ ([A-Za-z])/\n\1/g; # two spaces followed by a letter: start a new line and keep the letter
s/\ \ /,/g; # replace the remaining double spaces with ','
s/([A-Za-z]) /\1,/g; # replace a space that follows a letter with ','
s/\+1//; # remove +1
s/[ ()-]//g; # remove spaces, parentheses, and dashes
s/([^0-9])([0-9]{3})/\1(\2) /g; # wrap the first 3 digits of each number in parentheses
s/([0-9]{4}[^0-9])/-\1/g; # put a '-' before the last 4 digits
$ sed -r -f sed-script file
John,Joe,jjoe@company.com,(916) 662-4727
Ann,Tylor,Atylor@company.com,(987) 654-3210
Steve,Harvey,sharvey@company.com,(456) 789-3210,(321) 956-3344,...
For your interest, you can also say it with Perl:
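perl -e '
while (<>) {
    s/ {2}/\n/g;    # two spaces end a record: turn them into a newline
    s/ /,/g;        # the remaining single spaces become commas
    s/(\+1,)?\(?(\d{3})\)?[-,]?(\d{3})[-,]?(\d{4})/($2) $3-$4/g;    # rewrite each phone number as (NNN) NNN-NNNN
    print;
}' file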