Is it possible to replace the value of a cell in a CSV file using grep, sed, or both?
I wrote the following command:
#!/bin/bash
awk -v value=$newvalue -v row=$rownum -v col=1 'BEGIN{FS=OFS=","} NR==row {$col=value}1' "${file}".csv >> temp.csv && mv temp.csv "${file}".csv
Sample input of file.csv:
Header,1
Field1,Field2,Field3
1,ABC,4567
2,XYZ,7890
Assuming $newvalue=3, $rownum=4, and col=1, the code above replaces the first field of the fourth line.
Required output:
Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
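So if I know the row and the column, is it possible to replace that value using grep or sed?
Edit1: Field3 always has a unique value in each row. (In case that information helps.)
Assuming your CSV file is as simple as the one you show (no commas inside quoted fields), and your newvalue does not contain characters that sed interprets specially (such as ampersands, slashes, or backslashes), the following should work with sed alone (tested with GNU sed):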
sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv
Demo:
$ cat file.csv
Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
$ rownum=3
$ col=2
$ newvalue="NEW"
$ sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv
$ cat file.csv
Header,1
Field1,Field2,Field3
1,NEW,4567
3,XYZ,7890
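Explanation: $rownum is the address (here a line number) of the line to which the following command applies. s is the sed substitute command. [^,]* is the regular expression to search for and replace: the longest possible string that contains no comma. $newvalue is the replacement string. $col is the occurrence to replace.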
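The one-liner explained above can be wrapped in a small function that checks its inputs before building the sed script. This is only a minimal sketch, assuming GNU sed and a simple CSV without quoted commas. The name `replace_cell` is hypothetical, and unlike the answer's command it reads stdin and writes stdout instead of editing the file in place.

```shell
# replace_cell ROW COL VALUE (hypothetical helper, not part of the answer above).
# Reads a simple CSV on stdin and writes the modified CSV to stdout.
# Assumes VALUE contains no '&', '/' or '\'.
replace_cell() {
    local row="$1" col="$2" value="$3" n
    for n in "$row" "$col"; do
        case $n in
            ''|0|*[!0-9]*)
                echo "replace_cell: row and col must be positive integers" >&2
                return 1 ;;
        esac
    done
    # ROW is the sed address; COL selects which match of [^,]* is replaced.
    sed -E "$row s/[^,]*/$value/$col"
}

printf '%s\n' 'Header,1' 'Field1,Field2,Field3' '1,ABC,4567' '2,XYZ,7890' |
    replace_cell 4 1 3
```

With the question's sample input, the last line comes out as 3,XYZ,7890 and the other lines are unchanged. Rejecting non-numeric row and column values keeps them from altering the sed script itself.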
If newvalue may contain ampersands, slashes, or backslashes, we must sanitize it first:
sanitizednewvalue=$(sed -E 's/([/\&])/\\\1/g' <<< "$newvalue")
sed -Ei "$rownum s/[^,]*/$sanitizednewvalue/$col" file.csv
Demo:
$ newvalue='NEW&\/&NEW'
$ sanitizednewvalue=$(sed -E 's/([/\&])/\\\1/g' <<< "$newvalue")
$ echo "$sanitizednewvalue"
NEW\&\\\/\&NEW
$ sed -Ei "$rownum s/[^,]*/$sanitizednewvalue/$col" file.csv
$ cat file.csv
Header,1
Field1,Field2,Field3
1,NEW&\/&NEW,4567
3,XYZ,7890
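The sanitizing step can also live in a function so that it is applied the same way everywhere. The following is a sketch under a few assumptions: the value is a single line, and the name `escape_sed_replacement` is hypothetical. It uses `|` as the `s` delimiter so that the slash inside the bracket expression cannot be mistaken for a delimiter by sed implementations other than GNU sed.

```shell
# escape_sed_replacement VALUE (hypothetical helper).
# Prefixes every '&', '/' and '\' with a backslash so that VALUE is taken
# literally in the replacement part of an s/// command.
# Assumes VALUE contains no newline.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's|[&/\\]|\\&|g'
}

newvalue='NEW&\/&NEW'
safe=$(escape_sed_replacement "$newvalue")
# Replace column 2 of line 1 with the literal value.
printf '%s\n' '1,ABC,4567' | sed -E "1 s/[^,]*/$safe/2"
```

The escaped value is only safe in an `s` command that uses `/` as its delimiter, which is what the commands in this answer use.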
With sed, how about:
#!/bin/bash
newvalue=3
rownum=4
col=1
sed -i -E "${rownum} s/(([^,]+,){$((col-1))})[^,]+/\1${newvalue}/" file.csv
Result of file.csv:
Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
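${rownum} matches the line number. (([^,]+,){n}) matches n repetitions of a group made of non-comma characters followed by a comma. Setting n to col - 1 makes this group match the substring that precedes the target (to-be-replaced) column.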
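Because the pattern above uses `+` and is not anchored, a line with an empty field before the target column may have a different column replaced. One possible variation, offered as a sketch and not as the answer's method, anchors the match with `^` and uses `*` so that empty fields still count as columns. The name `replace_col_anchored` is hypothetical, and the value is assumed to need no escaping.

```shell
# replace_col_anchored ROW COL VALUE (hypothetical variant of the command above).
# Reads a simple CSV on stdin and writes the result to stdout.
# Assumes VALUE contains no '&', '/' or '\' and no field contains a quoted comma.
replace_col_anchored() {
    local row="$1" col="$2" value="$3"
    # Group 1 captures the COL-1 leading "field," pieces, empty fields included.
    sed -E "$row s/^(([^,]*,){$((col - 1))})[^,]*/\\1$value/"
}

# Column 2 of line 1 is replaced even though column 1 is empty.
printf '%s\n' ',b,c' | replace_col_anchored 1 2 X
```

For non-empty fields this behaves like the original command; the difference only shows up when a field before the target is empty.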
Let's try doing this with the sed command.
Consider a sample CSV file with the following contents:
$ cat file
Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5
- To remove the first field or column:
$ sed 's/[^,]*,//' file
25,11
31,2
21,3
45,4
12,5
This regular expression matches a sequence of non-comma characters ([^,]*) and the comma that follows, and deletes them, which removes the 1st field.
- To print only the last field, or to remove all fields except the last:
$ sed 's/.*,//' file
11
2
3
4
5
This regex removes everything up to and including the last comma (.*,), which deletes all fields except the last one.
- To print only the first field:
$ sed 's/,.*//' file
Solaris
Ubuntu
Fedora
LinuxMint
RedHat
This regex (,.*) removes everything from the 1st comma to the end of the line, which deletes all fields except the first one.
- To delete the second field:
$ sed 's/,[^,]*,/,/' file
Solaris,11
Ubuntu,2
Fedora,3
LinuxMint,4
RedHat,5
The regex (,[^,]*,) matches a comma, a sequence of non-comma characters, and the following comma, which is the 2nd column together with its surrounding commas. Replacing this match with a single comma deletes the 2nd column.
Note: deleting fields in the middle gets harder in sed, because every preceding field has to be matched explicitly.
- To print only the second field:
$ sed 's/[^,]*,\([^,]*\).*/\1/' file
25
31
21
45
12
The regex matches the first field, the second field, and the rest of the line, but groups only the 2nd field. The whole line is then replaced with the 2nd field (\1), so only the 2nd field is displayed.
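The same grouping idea extends to any column by repeating the "field plus comma" piece a computed number of times. This is a sketch, assuming extended regular expressions (-E) and fields without quoted commas; the name `print_field` is hypothetical.

```shell
# print_field N (hypothetical helper): prints column N of a simple CSV on stdin.
print_field() {
    local n="$1"
    # Skip N-1 "field," pieces, capture the next field, drop the rest of the line.
    sed -E "s/^([^,]*,){$((n - 1))}([^,]*).*/\\2/"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | print_field 2

# cut does the same job more directly:
printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | cut -d, -f2
```

A line with fewer than N fields does not match the pattern, so sed prints it unchanged.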
- To print only the lines whose last column is a single digit:
$ sed -n '/.*,[0-9]$/p' file
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5
The regex (,[0-9]$) checks for a single digit in the last field and the p command prints the line which matches this condition.
- To number all the lines in the file:
$ sed = file | sed 'N;s/\n/ /'
1 Solaris,25,11
2 Ubuntu,31,2
3 Fedora,21,3
4 LinuxMint,45,4
5 RedHat,12,5
This simulates the cat -n command. awk does this easily with the special variable NR. The '=' command of sed prints the line number of every line, followed by the line itself on the next line. That output is piped to a second sed command, which joins every 2 lines.
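For comparison, here is what the awk version mentioned above could look like. It is a minimal sketch; the function name `number_lines` is hypothetical and only there to make the command reusable.

```shell
# number_lines (hypothetical helper): prefixes each input line with its line number.
# NR is awk's running record count; the comma in print inserts a single space.
number_lines() {
    awk '{ print NR, $0 }' "$@"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | number_lines
```

Unlike cat -n, this does not pad the numbers, so the format is the same as that of the two-sed pipeline.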
- To replace the last field with 99 if the first field is 'Ubuntu':
$ sed 's/\(Ubuntu\)\(,.*,\).*/\1\299/' file
Solaris,25,11
Ubuntu,31,99
Fedora,21,3
LinuxMint,45,4
RedHat,12,5
This regex matches 'Ubuntu', then everything up to and including the last comma, then the last column, and groups the first two parts. The replacement consists of the 1st and 2nd groups (\1\2) followed by the new number 99.
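The sed pattern above is not anchored, so it would also match a line such as Kubuntu,1,2. If an exact comparison on the first field is wanted, an awk version is one option. This is a sketch only; `set_last_if_first` is a hypothetical name, and note that awk processes backslash escape sequences in values passed with -v.

```shell
# set_last_if_first KEY VALUE (hypothetical helper).
# Sets the last field to VALUE on lines whose first field equals KEY exactly.
set_last_if_first() {
    awk -v key="$1" -v value="$2" '
        BEGIN { FS = OFS = "," }     # split and rejoin fields on commas
        $1 == key { $NF = value }    # NF is the number of fields, so $NF is the last one
        1                            # print every line, modified or not
    '
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | set_last_if_first Ubuntu 99
```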
- To delete the second field if the first field is 'RedHat':
$ sed 's/\(RedHat,\)[^,]*\(.*\)/\1\2/' file
Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,,5
The 1st field 'RedHat' (with its comma) and everything after the 2nd field are grouped, and the replacement uses only these two groups (\1\2), which empties the 2nd field.
- To insert a new column at the end (last column):
$ sed 's/.*/&,A/' file
Solaris,25,11,A
Ubuntu,31,2,A
Fedora,21,3,A
LinuxMint,45,4,A
RedHat,12,5,A
The regex (.*) matches the entire line and replaces it with the line itself (&) followed by the new field.
- To insert a new column at the beginning (1st column):
$ sed 's/.*/A,&/' file
A,Solaris,25,11
A,Ubuntu,31,2
A,Fedora,21,3
A,LinuxMint,45,4
A,RedHat,12,5
Same as the last example, except that the new column comes before the matched line.
Hope this helps. Let me know if you need to use Awk or any other command.
Thanks