How to search and replace "," between double quotes in Unix

Input:

20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30

Desired output:

20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2     
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.224.213/30

How do I remove the commas between the quotes? Some lines have quoted fields with no comma in them.

I need to remove the comma in ,"JUDICIARY, STATE COURTS (STATE COURTS)" (it appears twice on the same line).

A few fields have a comma between the double quotes.

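Here is a script that shows how to do it; welcome to the world of sed goto. It is written for BSD sed, which uses -E to enable extended regular expressions; GNU sed uses -r for the same job (recent GNU sed releases also accept -E).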

sed -E -e 's/^/A: /p; s/^A: /B: /' \
       -e ':again' \
       -e 's/^(([^"]*|"[^",]*")*)("[^"]*),([^"]*")/\1\3\4/' \
       -e 't again' \
       data

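This assumes the data is in a file named data. The first -e simply echoes the original input with an A: prefix and then changes the prefix to B:. This is debugging material. The second -e creates a label, again, that can be jumped to. The fourth -e jumps back to the again label if the previous step made a substitution.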

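All the excitement is in the third -e. The pattern looks for the start of the line, followed by zero or more occurrences of either a run of characters that are not double quotes or a double quote followed by zero or more characters that are neither a double quote nor a comma and then a closing double quote. After that prefix comes a double quote, a run of characters that are not double quotes, a comma, more characters that are not double quotes, and a double quote. The match is replaced by the prefix (\1), the part between the double quotes before the comma (\3) and the part between the double quotes after the comma (\4). Each pass therefore removes one comma, which is why the loop is needed.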
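With the debugging -e dropped, the loop can be wrapped in a small shell function. This is a minimal sketch: the function name strip_quoted_commas is made up for illustration, it reads standard input rather than the data file, and it assumes a sed that accepts -E (BSD sed, or a recent GNU sed).

```shell
# Hypothetical helper (name is illustrative): reads lines on stdin and
# writes them to stdout with commas inside double quotes removed.
strip_quoted_commas() {
    # :again / t again repeat the substitution until it stops matching,
    # because each pass removes only one comma.
    sed -E \
        -e ':again' \
        -e 's/^(([^"]*|"[^",]*")*)("[^"]*),([^"]*")/\1\3\4/' \
        -e 't again'
}

printf '%s\n' '201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234' |
    strip_quoted_commas
# prints: 201,"xx",192.168.3.2,"yy","aaaaccccdddd",192,"zzzz",234
```

To process the file instead, redirect it into the function: strip_quoted_commas < data.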

Given a data file:

2000,"xxxx,xxxx",192.168.3.2
2000,"xx,xx,xx",192.16.3.2
2000,"xxxxxxxx",192.168.3.2
20000000,"xxxxxxxxxxxx,xxxxxxxxxxxx",192.168.3.2,"yyyyy,yyyyy"
20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234
201,"x,x",192.168.3.2,"yyy"
201,"xx",192.168.3.2,"yyy",2211
201,"xxx",192.168.3.2,"y,y"
201,"xxx",192.168.3.2,"yyy"
201,"x,x",192.168.3.2,"y,y"
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30 

The script produces the output:

A: 2000,"xxxx,xxxx",192.168.3.2
B: 2000,"xxxxxxxx",192.168.3.2
A: 2000,"xx,xx,xx",192.16.3.2
B: 2000,"xxxxxx",192.16.3.2
A: 2000,"xxxxxxxx",192.168.3.2
B: 2000,"xxxxxxxx",192.168.3.2
A: 20000000,"xxxxxxxxxxxx,xxxxxxxxxxxx",192.168.3.2,"yyyyy,yyyyy"
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2,"yyyyyyyyyy"
A: 20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
A: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
A: 201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234
B: 201,"xx",192.168.3.2,"yy","aaaaccccdddd",192,"zzzz",234
A: 201,"x,x",192.168.3.2,"yyy"
B: 201,"xx",192.168.3.2,"yyy"
A: 201,"xx",192.168.3.2,"yyy",2211
B: 201,"xx",192.168.3.2,"yyy",2211
A: 201,"xxx",192.168.3.2,"y,y"
B: 201,"xxx",192.168.3.2,"yy"
A: 201,"xxx",192.168.3.2,"yyy"
B: 201,"xxx",192.168.3.2,"yyy"
A: 201,"x,x",192.168.3.2,"y,y"
B: 201,"xx",192.168.3.2,"yy"
A: Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30 
B: Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.224.213/30 
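
The B: lines above can be cross-checked with a different technique, which is not the sed approach used in this answer: split each line on double quotes with awk, so that the even-numbered fields are the quoted ones, and delete the commas in those fields only. This is a rough sketch with a made-up function name, and it assumes every double quote on a line has a matching partner on the same line.

```shell
# Hypothetical helper (name is illustrative): same job as the sed loop,
# done by treating the double quote as awk's field separator.
strip_quoted_commas_awk() {
    awk -F'"' -v OFS='"' '{
        # Fields 2, 4, 6, ... lie between a pair of double quotes.
        for (i = 2; i <= NF; i += 2)
            gsub(/,/, "", $i)
        print
    }'
}

printf '%s\n' '201,"xxx",192.168.3.2,"y,y"' | strip_quoted_commas_awk
# prints: 201,"xxx",192.168.3.2,"yy"
```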

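Note: this is hard. If you have a choice, use a tool that understands the CSV format. For example, Python ships with a csv module; Perl has Text::CSV (and its companion modules Text::CSV_PP and Text::CSV_XS), which can handle this; and there are custom tools for manipulating CSV files.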

Note also that the notation Microsoft supports differs slightly from RFC 4180, which is the Internet world's attempt to rationalize what Microsoft uses (to a first approximation).