Extract characters from each line in a specified order

Question

I want to use a bash command to extract the Nth characters of each line in a specific order.

For example, if sample.txt contains the following strings:

ABCDEFG
ABCDEFG
ABCDEFG
ABCDEFG

The output I want is:

BDC
BDC
BDC
BDC

However, when I use cut -c 2,4,3 < sample.txt, I get:

BCD
BCD
BCD
BCD

How can I preserve the order I gave in the command? Is there another command or script for this?

Answer 1

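In some popular AWKs^*, when the field separator is the empty string, every individual character becomes a field. With this feature you can easily extract selected characters in any order. For example: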

$ awk -v FS= '{print $2 $4 $3}' file
BDC
BDC
BDC
BDC

^* Such as GAWK, MAWK, busybox AWK, OpenBSD AWK, etc.
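
For an awk that does not split on an empty field separator, the same result can be had with substr, which is part of standard awk. This is only a sketch: pick_chars is a hypothetical helper name, and the "2 4 3" argument format is an assumption made for illustration.

```shell
# Hypothetical helper: pick_chars "2 4 3" [file...]
# Prints the characters at the given 1-based positions, in the order given.
pick_chars() {
    positions=$1
    shift
    awk -v pos="$positions" '
        BEGIN { n = split(pos, p, " ") }   # p[1..n] holds the requested positions
        {
            out = ""
            for (i = 1; i <= n; i++)
                out = out substr($0, p[i], 1)   # one character per position
            print out
        }
    ' "$@"
}

# Reads standard input when no file is given.
printf 'ABCDEFG\nABCDEFG\n' | pick_chars "2 4 3"
```

Because the positions are passed as data rather than hard-coded, the same function covers any order, for example pick_chars "7 1" file.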

Answer 2

With gawk (FPAT is a gawk extension, so other awks may not support it):

awk -v FPAT='.' '{print $2 $4 $3}' file

Output:

BDC
BDC
BDC
BDC

From man gawk:

FPAT: A regular expression describing the contents of the fields in a record. When set, gawk parses the input into fields, where the fields match the regular expression, instead of using the value of the FS variable as the field separator.

Answer 3

sed can do this conveniently with capture groups, back-references, and the standard s/find/replace/ operation. For example:

sed 's/.\(.\)\(.\)\(.\).*$/\1\3\2/' file

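Here sed uses \(stuff\) in the find part, in basic regular expression syntax, to capture "stuff", and then uses the numbered back-reference \1 to reinsert the captured text in the replace part of the expression (\2 is the second back-reference, for the second capture group, and so on). '.' matches any single character, and '*' repeats the preceding match zero or more times. '$' anchors the match at the end of the line.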
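
To see which back-reference holds which character, here is a small sketch that prints each captured group in brackets. show_groups is a hypothetical helper name used only for this illustration.

```shell
# Hypothetical helper: show what each capture group matched.
# The leading '.' skips character 1, so the three groups capture
# characters 2, 3 and 4 of every line.
show_groups() {
    sed 's/.\(.\)\(.\)\(.\).*$/[\1][\2][\3]/' "$@"
}

printf 'ABCDEFG\n' | show_groups
```

For the sample line this prints [B][C][D], so emitting the groups in the order \1\3\2 yields BDC.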

Example Use/Output

With the sample data in file, you would have:

$ sed 's/.\(.\)\(.\)\(.\).*$/\1\3\2/' file
BDC
BDC
BDC
BDC

The sed and awk solutions will be orders of magnitude faster than spawning a separate process/subshell on each iteration.
