Ubuntu command line: extract part number and quantity from a converted Excel file

Question

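I have an Excel file listing 40 components, which I converted (online) to a txt file so that I can process it on the command line. I want to extract the part number from each line (a 6- or 7-digit number) and save the results in a txt file. Some of the lines follow a particular pattern.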

list.txt
        Product number 1  ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid   1
        Product number 2, its accessories  1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
        Product number 3  dip-40/dp/9761446       2

Expected output:

productnumber.txt
        1971904   
        3418376 
        9761446

My code:

grep -Po '/\K.[0-9]+[1-9]' hardware\ components_prashant.txt > serialnumber.txt

Current output:

Answer 1

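Judging from your sample data, I take it the column separator is the pipe character (|)?

Assuming the part number is in column 1 and the quantity is in column 8, you can get them like this:

awk -F'|' '{ print $1, $8 }' list.txt > quantity.txt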
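
Before settling on field numbers, it can help to check how the lines actually split. The sketch below is not from the question. It copies the three sample lines (indentation trimmed) into a scratch file, sample.txt, a name made up for this example, and counts the pipe-separated fields on each line.

```shell
# Sample lines copied from the question; sample.txt is a scratch file name used only here.
cat > sample.txt <<'EOF'
Product number 1  ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid   1
Product number 2, its accessories  1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
Product number 3  dip-40/dp/9761446       2
EOF

# NF is the number of fields awk found on the current line when splitting on "|".
field_counts=$(awk -F'|' '{ print NF }' sample.txt)
echo "$field_counts"

# On the first sample line the part number happens to sit in field 4.
awk -F'|' 'NR == 1 { print $4 }' sample.txt
```

With this sample the field count differs from line to line, and the third line contains no pipe at all, so a fixed field number only works if the real file is more regular than the excerpt shown.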

Answer 2

Is it any six- or seven-digit number with a non-alphanumeric character before and after it?

grep -Eo '\b[0-9]{6,7}\b' productnumber.txt
1971904
3418376
9761446

With -E (extended regular expressions), \b is a "word boundary" (cf. this tutorial). You can also use \< and \>, as I do below.

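[...] is a character class, which matches any one character from the given set. A dash (-) denotes a range, so [0-9] is any digit from zero through nine, inclusive. {...} specifies a repetition count, so {6,7} means a run of digits at least six and at most seven long.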
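
To see how the boundary and the length limit interact, here is a small sketch that feeds fragments of the sample data to the same pattern. It assumes GNU grep, as shipped with Ubuntu, since \b is a GNU extension. The helper name try is made up for this example.

```shell
# The same pattern as above, kept in a variable for reuse.
pattern='\b[0-9]{6,7}\b'

# try: hypothetical helper that prints the matches in one string, or "(no match)".
try() {
    printf '%s\n' "$1" | grep -Eo "$pattern" || echo "(no match)"
}

try 'product|1971904|pgrid'   # seven digits between pipes: prints 1971904
try 'ac162049-2/slid'         # digits directly follow a letter, so there is no boundary before them
try 'pgrid|119732683897|'     # twelve digits: too long for {6,7}, and no boundary inside the run
```

The second and third calls print "(no match)", which is why the longer IDs and the ac162049 prefix in the sample lines are not picked up.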

If you want the fields you mentioned earlier: (...) is a capturing group, and ^ at the start of a character class negates it, so:

sed -E 's/^ *([^0-9]+[0-9]+).*\<([0-9]{6,7})\>.* ([0-9]+) *$/\1|\2|\3/' productnumber.txt
Product number 1|1971904|1
Product number 2|3418376|10
Product number 3|9761446|2
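
If the goal is to end up with files rather than terminal output, the pipe-separated result can be saved and then sliced with cut. This is only a sketch under assumptions: it assumes GNU sed (for -E together with \< and \>), and the file names sample.txt, fields.txt and partnumbers.txt are made up for this example.

```shell
# Sample lines copied from the question (indentation trimmed); sample.txt is a scratch file.
cat > sample.txt <<'EOF'
Product number 1  ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid   1
Product number 2, its accessories  1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
Product number 3  dip-40/dp/9761446       2
EOF

# Rewrite each line as name|part number|quantity and save the result.
sed -E 's/^ *([^0-9]+[0-9]+).*\<([0-9]{6,7})\>.* ([0-9]+) *$/\1|\2|\3/' sample.txt > fields.txt

# Second column only: the part numbers, one per line.
cut -d'|' -f2 fields.txt > partnumbers.txt

# Second and third columns: part number and quantity.
cut -d'|' -f2,3 fields.txt
```

Keeping the intermediate fields.txt makes it easy to check the extraction line by line before trusting the derived files.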

bash

command-line

windows-subsystem-for-linux