Is it possible to replace the value of a cell in a CSV file using grep, sed, or both?

Question

I wrote the following command:

#!/bin/bash
awk -v value="$newvalue" -v row="$rownum" -v col=1 'BEGIN{FS=OFS=","} NR==row {$col=value}1' "${file}.csv" > temp.csv && mv temp.csv "${file}.csv"

Sample input, file.csv:

Header,1
Field1,Field2,Field3
1,ABC,4567
2,XYZ,7890

Assuming newvalue=3, rownum=4, and col=1, the code above replaces the first field of the fourth line.

Desired output:

Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890

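So, if I know the row and the column, is it possible to replace that value using grep or sed?

Edit 1: Each row always has a unique value in Field3 (in case that information helps).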

Answer 1

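Assuming your CSV file is as simple as the one you show (no commas inside quoted fields) and your newvalue contains no characters that sed treats specially (such as ampersands, slashes, or backslashes), the following should work with sed alone (tested with GNU sed):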

sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv

Demo:

$ cat file.csv
Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890
$ rownum=3
$ col=2
$ newvalue="NEW"
$ sed -Ei "$rownum s/[^,]*/$newvalue/$col" file.csv
$ cat file.csv
Header,1
Field1,Field2,Field3
1,NEW,4567
3,XYZ,7890

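Explanation: $rownum is the address (here, a line number) to which the following command applies. s is sed's substitute command. [^,]* is the regular expression to search for and replace: the longest possible string that contains no comma. $newvalue is the replacement string. $col is the number of the occurrence to replace.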
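
To watch the occurrence number at work without touching a file, here is a minimal sketch that applies the same substitution to a single line read from a pipe. The helper name replace_field and the sample line are made up for illustration, and the way [^,]* interacts with a numeric flag is assumed to be that of GNU sed, as in the demo above.

```shell
#!/bin/bash
# replace_field LINE COL VALUE (hypothetical helper): print LINE with its
# COL-th comma-separated field replaced by VALUE.
# VALUE must not contain &, / or \ in this simple form.
replace_field() {
    local line=$1 col=$2 value=$3
    # The trailing number tells s to replace only the col-th match of [^,]*.
    printf '%s\n' "$line" | sed "s/[^,]*/$value/$col"
}

replace_field '1,ABC,4567' 1 NEW
replace_field '1,ABC,4567' 2 NEW
replace_field '1,ABC,4567' 3 NEW
```

With GNU sed, the three calls should replace the first, second, and third field respectively and leave the other fields untouched.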

If newvalue may contain ampersands, slashes, or backslashes, we must sanitize it first:

sanitizednewvalue=$(sed -E 's|([/\&])|\\\1|g' <<< "$newvalue")
sed -Ei "$rownum s/[^,]*/$sanitizednewvalue/$col" file.csv

Demo:

$ newvalue='NEW&\/&NEW'
$ sanitizednewvalue=$(sed -E 's|([/\&])|\\\1|g' <<< "$newvalue")
$ echo "$sanitizednewvalue"
NEW\&\\\/\&NEW
$ sed -Ei "$rownum s/[^,]*/$sanitizednewvalue/$col" file.csv
$ cat file.csv
Header,1
Field1,Field2,Field3
1,NEW&\/&NEW,4567
3,XYZ,7890
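
The two steps (escape the value, then substitute) can be wrapped in one function. This is only a sketch under the same assumptions as the answer: GNU sed (for -i) and a simple CSV without quoted commas. The name set_cell and its argument order are hypothetical, and the value is assumed not to contain a newline.

```shell
#!/bin/bash
# set_cell FILE ROW COL VALUE (hypothetical helper): replace one cell of a
# simple CSV file in place. Assumes GNU sed and no commas inside fields.
set_cell() {
    local file=$1 row=$2 col=$3 value=$4 escaped
    # Put a backslash in front of /, \ and &, the characters that are
    # special in the replacement part of an s command.
    escaped=$(printf '%s\n' "$value" | sed -E 's|([/\&])|\\\1|g')
    sed -i "$row s/[^,]*/$escaped/$col" "$file"
}

# Try it on a throwaway copy of the sample data.
tmp=$(mktemp)
printf '%s\n' 'Header,1' 'Field1,Field2,Field3' '1,ABC,4567' '2,XYZ,7890' > "$tmp"
set_cell "$tmp" 4 1 3          # the change asked for in the question
set_cell "$tmp" 3 2 'A&B/C'    # a value that needs escaping
cat "$tmp"
rm -f "$tmp"
```

Using | as the delimiter of the escaping command avoids having to think about a literal slash inside the bracket expression.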

Answer 2

With sed, how about:

#!/bin/bash

newvalue=3
rownum=4
col=1

sed -i -E "${rownum} s/(([^,]+,){$((col-1))})[^,]+/\1${newvalue}/" file.csv

Resulting file.csv:

Header,1
Field1,Field2,Field3
1,ABC,4567
3,XYZ,7890

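${rownum} matches the line number.
(([^,]+,){n}) matches n repetitions of a group made of non-comma characters followed by a comma. With n set to col - 1, it matches the substring that precedes the target column (the one to be replaced).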
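
To make the expansion of the pattern concrete, the following sketch runs the same substitution on a single line for several column numbers. The function name replace_nth is hypothetical, and the example assumes a sed that supports -E and non-empty fields, as the [^,]+ in the pattern requires.

```shell
#!/bin/bash
# replace_nth LINE COL VALUE (hypothetical helper): apply the substitution
# from this answer to one line instead of a file.
replace_nth() {
    local line=$1 col=$2 value=$3
    # For col=3 the pattern expands to (([^,]+,){2})[^,]+ :
    # group 1 captures the two leading fields, and \1 writes them back
    # unchanged in front of the new value.
    printf '%s\n' "$line" | sed -E "s/(([^,]+,){$((col-1))})[^,]+/\1${value}/"
}

replace_nth '2,XYZ,7890' 1 3
replace_nth '2,XYZ,7890' 2 NEW
replace_nth '2,XYZ,7890' 3 NEW
```

For col=1 the repetition count is 0, so group 1 is empty and only the first field is replaced.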

Answer 3

Let's try out some sed commands.

Consider a sample CSV file with the following contents:

$ cat file

Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

To remove the first field or column:

$ sed 's/[^,]*,//' file

25,11
31,2
21,3
45,4
12,5

This regular expression matches a sequence of non-comma characters ([^,]*) followed by a comma and deletes it, which removes the 1st field.

To print only the last field, or to remove all fields except the last one:

$ sed 's/.*,//' file

11
2
3
4
5

This regex removes everything up to and including the last comma (.*,), which deletes all fields except the last one.

To print only the first field:

$ sed 's/,.*//' file

Solaris
Ubuntu
Fedora
LinuxMint
RedHat

This regex (,.*) removes everything from the 1st comma to the end of the line, which deletes all fields except the first one.

To delete the second field:

$ sed 's/,[^,]*,/,/' file

Solaris,11
Ubuntu,2
Fedora,3
LinuxMint,4
RedHat,5

The regex (,[^,]*,) matches a comma, a sequence of non-comma characters, and another comma, which covers the 2nd column together with its surrounding commas. Replacing that match with a single comma deletes the 2nd column.

Note: deleting a field in the middle gets harder in sed, because every field before it has to be matched explicitly.
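
One possible way around spelling out every preceding field, offered here as a sketch rather than as part of the original answer, is the numeric flag of the s command: removing the (N-1)-th occurrence of "a comma plus the field after it" drops field N. The helper name delete_field and the four-column sample lines are made up for illustration.

```shell
#!/bin/bash
# delete_field N (hypothetical helper): read CSV lines on stdin and
# drop field N, for N >= 2.
delete_field() {
    local n=$1
    # ",[^,]*" matches a comma plus the field that follows it; its
    # (N-1)-th occurrence on a line is field N with its leading comma.
    sed "s/,[^,]*//$((n-1))"
}

printf '%s\n' 'Solaris,25,11,x' 'Ubuntu,31,2,y' | delete_field 3
```

This prints Solaris,25,x and Ubuntu,31,y. Lines with fewer than N fields are left unchanged, and the first field still needs the s/[^,]*,// form shown earlier.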

To print only the second field:

$ sed 's/[^,]*,\([^,]*\).*/\1/' file

25
31
21
45
12

The regex matches the first field, the second field, and the rest of the line, but groups only the 2nd field. The whole line is then replaced with the 2nd field (\1), so only the 2nd field is displayed.

To print only the lines in which the last column is a single digit:

$ sed -n '/.*,[0-9]$/p' file

Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

The regex (,[0-9]$) checks for a single digit in the last field, and the p command prints each line that matches. The -n option suppresses sed's default output, so only the matching lines appear.

To number all lines in the file:

$ sed = file | sed 'N;s/\n/ /'

1 Solaris,25,11
2 Ubuntu,31,2
3 Fedora,21,3
4 LinuxMint,45,4
5 RedHat,12,5

This roughly simulates the cat -n command; awk does it easily using the special variable NR. sed's = command prints the number of every line on a line of its own, followed by the line itself. That output is piped to a second sed command, which joins every pair of lines.
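
For comparison, the awk variant mentioned above could look like the following sketch. The function name number_lines is hypothetical, and the input is read from a pipe so that the example is self-contained.

```shell
#!/bin/bash
# number_lines (hypothetical helper): prefix every input line with its
# line number, using awk's built-in record counter NR.
number_lines() {
    awk '{ print NR, $0 }'
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | number_lines
```

This prints "1 Solaris,25,11" and "2 Ubuntu,31,2", the same layout as the two-stage sed pipeline.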

To replace the last field with 99 if the first field is 'Ubuntu':

$ sed 's/\(Ubuntu\)\(,.*,\).*/\1\299/' file

Solaris,25,11
Ubuntu,31,99
Fedora,21,3
LinuxMint,45,4
RedHat,12,5

This regex matches 'Ubuntu' (group 1), then everything up to and including the last comma (group 2), and finally the last field. The replacement writes back groups 1 and 2 followed by the new number 99.
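
The same conditional edit can also be written with a regex address in front of the s command, so that no groups are needed. This is an alternative sketch, not the original answer's method. The function name set_last_if_first is hypothetical, and its first argument is inserted into a regex, so it is assumed to contain only plain characters.

```shell
#!/bin/bash
# set_last_if_first FIRST VALUE (hypothetical helper): on lines whose
# first field is FIRST, replace the last field with VALUE.
set_last_if_first() {
    local first=$1 value=$2
    # /^FIRST,/ selects the lines to edit; [^,]*$ is the last field,
    # that is, the run of non-comma characters at the end of the line.
    sed "/^$first,/ s/[^,]*\$/$value/"
}

printf '%s\n' 'Solaris,25,11' 'Ubuntu,31,2' | set_last_if_first Ubuntu 99
```

This leaves the Solaris line alone and prints Ubuntu,31,99 for the second line.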

To delete the value of the second field if the first field is 'RedHat':

$ sed 's/\(RedHat,\)[^,]*\(.*\)/\1\2/' file

Solaris,25,11
Ubuntu,31,2
Fedora,21,3
LinuxMint,45,4
RedHat,,5

The 1st field 'RedHat' (with its trailing comma) and everything after the 2nd field are grouped, and the replacement keeps only those two groups, which empties the 2nd field.

To insert a new column at the end (as the last column):

$ sed 's/.*/&,A/' file

Solaris,25,11,A
Ubuntu,31,2,A
Fedora,21,3,A
LinuxMint,45,4,A
RedHat,12,5,A

The regex (.*) matches the entire line, which is replaced with the line itself (&) followed by the new field.

To insert a new column at the beginning (as the 1st column):

$ sed 's/.*/A,&/' file

A,Solaris,25,11
A,Ubuntu,31,2
A,Fedora,21,3
A,LinuxMint,45,4
A,RedHat,12,5

Same as the previous example, except that the new column precedes the matched line (&).

Hope this helps. Let me know if you need to use awk or any other command. Thanks.
