Removing characters until a number is found in an array item in Bash

Question

I have a line of text in a text file. The line looks like this:

xxxx,xxxxx,xxxxxx,xxxxx,xxxx,NL-1111 xx,xxxx,xxx

NL- is a country identifier, so it can be anything. I want to remove the NL- part from the line so that it looks like this:

xxxx,xxxxx,xxxxxx,xxxxx,xxxx,1111 xx,xxxx,xxx

and then write the result back to the file.

Thanks in advance.

Answer 1

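sed might do the trick. This strips prefixes such as "NL-" or "BE-" that directly follow a comma, anywhere in the file (only the first match on each line; add the g flag to replace every match):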

sed -i 's/,[A-Z][A-Z]-/,/' file.txt
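
Before editing a file in place, it can help to preview the substitution on a single line. The following is a minimal sketch that feeds the sample line from the question through the same expression without -i; the variable names line and result are only illustrative.

```shell
# Sample line from the question; the x's stand in for real field values.
line='xxxx,xxxxx,xxxxxx,xxxxx,xxxx,NL-1111 xx,xxxx,xxx'

# Without -i, sed writes the result to stdout and touches no file.
result=$(printf '%s\n' "$line" | sed 's/,[A-Z][A-Z]-/,/')

printf '%s\n' "$result"
# prints: xxxx,xxxxx,xxxxxx,xxxxx,xxxx,1111 xx,xxxx,xxx
```

Once the output looks right, the same expression can be run with -i against the real file.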

Answer 2

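I think the simplest solution here is to read the file into a shell variable and then immediately write it back, using the pattern-substitution form of parameter expansion: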

line="$(<file)"; echo "${line/[a-zA-Z][a-zA-Z]-}" >|file;
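
Note that $(<file) reads the whole file into one variable and ${line/pattern} removes only the first match, so a file with several lines would need a loop. A possible sketch, assuming Bash and using a hypothetical helper name strip_country_prefix, processes standard input line by line:

```shell
#!/usr/bin/env bash
# strip_country_prefix is a hypothetical helper name, not part of the answer.
strip_country_prefix() {
    local line
    # Read line by line; the || clause also handles a last line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # Remove the first "two letters followed by a dash" found in this line.
        printf '%s\n' "${line/[a-zA-Z][a-zA-Z]-}"
    done
}

# Two made-up lines, one per country prefix.
result=$(printf '%s\n' 'a,NL-1111 xx,b' 'c,BE-2222,d' | strip_country_prefix)
printf '%s\n' "$result"
# prints:
# a,1111 xx,b
# c,2222,d
```

Unlike the sed expression, this pattern does not require a preceding comma, so an earlier field containing two letters and a dash would be the one that gets changed.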

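I would caution you against solutions that rely on sed's in-place feature. I have found that the behavior of the -i option differs between platforms. On a Mac you must give -i an empty argument (''), whereas on Cygwin you must not put an empty argument after -i. To be portable across platforms, you have to test which platform you are on.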
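
If sed -i is still wanted, the platform test could look roughly like the sketch below. The wrapper name sed_inplace is hypothetical, and the detection rests on the assumption that only GNU sed accepts --version, while BSD/macOS sed rejects it.

```shell
# sed_inplace is a hypothetical wrapper; usage: sed_inplace SCRIPT FILE
sed_inplace() {
    script=$1
    file=$2
    if sed --version >/dev/null 2>&1; then
        # Assumed GNU sed (Linux, Cygwin): no argument after -i.
        sed -i "$script" "$file"
    else
        # Assumed BSD/macOS sed: -i needs an (empty) backup suffix.
        sed -i '' "$script" "$file"
    fi
}

# Try it on a throwaway file containing one made-up line.
tmp=$(mktemp "${TMPDIR:-/tmp}/sedtest.XXXXXX")
printf '%s\n' 'xxxx,NL-1111 xx,xxx' > "$tmp"
sed_inplace 's/,[A-Z][A-Z]-/,/' "$tmp"
result=$(cat "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
# prints: xxxx,1111 xx,xxx
```

Writing to a temporary file and moving it over the original is another way to avoid -i altogether.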

Answer 3

Use sed like this:

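sed -i 's/,[A-Z][A-Z]-\([0-9]\+,\)/,\1/i' file.txt

,[A-Z][A-Z]-\([0-9]\+,\) matches a comma, a letter, a letter, -, digits, and a comma.

,\1 keeps only the comma and the captured digits (with their trailing comma).

i ignores the case of the letters (a GNU sed extension).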
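
A short sketch of how a capture group with the back-reference \1 in the replacement behaves, assuming GNU sed (both \+ and the i flag are GNU extensions). The first input line is made up so that a comma directly follows the digits. In the question's sample a space follows the digits, so the second command moves the comma out of the group as one possible adjustment.

```shell
# Hypothetical input where a comma directly follows the digits.
line='xxxx,nl-1111,xxxx'

# \( \) captures the digits and trailing comma; \1 puts them back.
result=$(printf '%s\n' "$line" | sed 's/,[A-Z][A-Z]-\([0-9]\+,\)/,\1/i')
printf '%s\n' "$result"
# prints: xxxx,1111,xxxx

# Sample line from the question: a space follows the digits, so the
# capture group here holds the digits only.
sample='xxxx,xxxxx,xxxxxx,xxxxx,xxxx,NL-1111 xx,xxxx,xxx'
result2=$(printf '%s\n' "$sample" | sed 's/,[A-Z][A-Z]-\([0-9]\+\)/,\1/i')
printf '%s\n' "$result2"
# prints: xxxx,xxxxx,xxxxxx,xxxxx,xxxx,1111 xx,xxxx,xxx
```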

Thanks to @chris for proofreading.

Answer 4

Another solution, close to the sed one but using perl:

perl -i -pe "s/(?<=,)[a-zA-Z]{2}-//g" file.txt

Because it uses a lookbehind expression, the comma does not need to be repeated in the replacement.
