When I use the cut command, I get an unexpected result

Question

Here is my script:

table_nm=
hive_db=$(echo $table_nm | cut -d'.' -f1)
hive_tb=$(echo $table_nm | cut -d'.' -f2)

At first, I got the correct result:

$ echo "dev.dmf_bird_cost_detail" | cut -d'.' -f1
dev   #correct
$ echo "dev.dmf_bird_cost_detail" | cut -d'.' -f2
dmf_bird_cost_detail   #correct

However, I ran into a problem: if $table_nm does not contain the delimiter character, I get this result:

$ echo "dmf_bird_cost_detail" | cut -d'.' -f1
dmf_bird_cost_detail
$ echo "dmf_bird_cost_detail" | cut -d'.' -f2
dmf_bird_cost_detail
$ echo "dmf_bird_cost_detail" | cut -d'.' -f3
dmf_bird_cost_detail

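This is not what I expected; I wanted the result to be empty. After some testing, I found that if the string does not contain the delimiter, "cut" returns the original value. Is that right?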

I eventually found that "awk" can solve my problem, but I would like to know why "cut" produces the result above. Thanks a lot!
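The question does not show the awk command that was used. As a minimal sketch of what such a command might look like (the exact invocation is an assumption), awk yields an empty string for a field that does not exist, rather than echoing the whole line:

```shell
# Hypothetical awk equivalent of the cut calls above; the asker's actual
# command is not shown in the question.
table_nm="dmf_bird_cost_detail"

# With -F'.', awk splits on a literal dot. A missing field expands to "".
awk_f1=$(printf '%s\n' "$table_nm" | awk -F'.' '{ print $1 }')
awk_f2=$(printf '%s\n' "$table_nm" | awk -F'.' '{ print $2 }')

printf 'field 1: [%s]\n' "$awk_f1"
printf 'field 2: [%s]\n' "$awk_f2"
```

Note that field 1 is still the whole string here, because a line without a delimiter is a single field; only field 2 and later come back empty.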

Answer 1

From the POSIX cut specification:

-f list

[...] Lines with no field delimiters shall be passed through intact, unless -s is specified. [...]
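To illustrate the quoted rule, here is a small sketch that reuses the table names from the question and adds -s, so that lines without the delimiter are suppressed instead of being passed through:

```shell
# -s tells cut to drop lines that contain no field delimiter.
with_dot=$(printf '%s\n' "dev.dmf_bird_cost_detail" | cut -s -d'.' -f1)
no_dot=$(printf '%s\n' "dmf_bird_cost_detail" | cut -s -d'.' -f1)

printf 'with dot: [%s]\n' "$with_dot"
printf 'no dot:   [%s]\n' "$no_dot"
```

With -s, the second command produces no output at all, so no_dot is empty. This applies to every field number, including -f1.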

why "cut" has the above result?

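My guess is that the first implementation of cut behaved this way (I suspect it was a bug), that the behavior was kept, and that POSIX then standardized the existing behavior and added the -s option. You can browse https://minnie.tuhs.org/cgi-bin/utree.pl to look for some old versions of cut.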

Answer 2

The proper solution is probably to use parameter expansion anyway.

hive_db=${table_nm%.*}
hive_tb=${table_nm#"$hive_db".}

If you expect there to be multiple dots, you will need some extra processing to extract the second field.

Because this uses shell built-ins, it is more efficient than spawning two processes for each field you want to extract (and even then, you should use proper quoting).
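One caveat: when table_nm contains no dot, the two expansions above leave both hive_db and hive_tb equal to the full name. A possible sketch that handles that case explicitly follows. The function name split_table_name is hypothetical, and two choices are assumptions rather than something stated in the question: a name without a dot is treated as a bare table name with an empty database, and names with several dots are split at the first dot.

```shell
# Hypothetical helper: sets hive_db and hive_tb from a "db.table" string
# using only shell built-ins (no external processes).
split_table_name() {
    table_nm=$1
    case $table_nm in
        *.*)
            # Assumption: split at the FIRST dot.
            hive_db=${table_nm%%.*}   # remove the longest suffix starting at a dot
            hive_tb=${table_nm#*.}    # remove the shortest prefix ending at a dot
            ;;
        *)
            # Assumption: no dot means no database part.
            hive_db=
            hive_tb=$table_nm
            ;;
    esac
}

split_table_name "dev.dmf_bird_cost_detail"
printf 'db=[%s] tb=[%s]\n' "$hive_db" "$hive_tb"

split_table_name "dmf_bird_cost_detail"
printf 'db=[%s] tb=[%s]\n' "$hive_db" "$hive_tb"
```

The case pattern *.* matches any string containing a literal dot, since a dot has no special meaning in shell patterns.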

linux

shell

cut