来自拆分的 AWK 数组未排序

Question

我在变量 ArtTEXT 中有这个（演示）文本。

{1}: Reporting Problems and Bugs. 
{2}: Other freely available awk implementations. 
{3}: Summary of installation. 
{4}: How to disable certain gawk extensions. 
{5}: Making Additions To gawk. 
{6}: Accessing the Git repository. 
{7}: Adding code to the main body of gawk. 
{8}: Porting gawk to a new operating system.  
{9}: Why derived files are kept in the Git repository.

它是一个单变量，每行用缩进分隔。

indent = "\n\t\t\t";

我想遍历这些行并检查每行中的内容。

所以我使用缩进将它拆分成一个数组

split(ArtTEXT,lin, indent);

然后我遍历数组lin

l = 0;
for (l in lin) {
    print "l -- ", l, " lin[l] -- " ,lin[l] ;
}

我得到的是从第 4 行开始的 ArtTEXT 行

l --  4  lin[l] --  {3}: Summary of installation. 
l --  5  lin[l] --  {4}: How to disable certain gawk extensions. 
l --  6  lin[l] --  {5}: Making Additions To gawk. 
l --  7  lin[l] --  {6}: Accessing the Git repository. 
l --  8  lin[l] --  {7}: Adding code to the main body of gawk. 
l --  9  lin[l] --  {8}: Porting gawk to a new operating system.  
l --  10  lin[l] --  {9}: Why derived files are kept in the Git repository. 
l --  1  lin[l] --   
l --  2  lin[l] --  {1}: Reporting Problems and Bugs. 
l --  3  lin[l] --  {2}: Other freely available awk implementations.

(原文开头空行)

手册中关于拆分功能的描述：

The first piece is stored in array[1], the second piece in array[2], and so forth.

如何避免这个问题？

为什么会这样？

谢谢。

Answer 1

在 awk 中，数组是无序的。如果他们碰巧按顺序出来，那是偶然的。

在 GNU awk 中，可以控制顺序。例如，要按索引获取数字排序，请使用 PROCINFO["sorted_in"]="@ind_num_asc":

$ awk -v ArtTEXT="$(cat file)" 'BEGIN{PROCINFO["sorted_in"]="@ind_num_asc"; indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l in lin) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l --  1  lin[l] --  {1}: Reporting Problems and Bugs. 
l --  2  lin[l] --  {2}: Other freely available awk implementations. 
l --  3  lin[l] --  {3}: Summary of installation. 
l --  4  lin[l] --  {4}: How to disable certain gawk extensions. 
l --  5  lin[l] --  {5}: Making Additions To gawk. 
l --  6  lin[l] --  {6}: Accessing the Git repository. 
l --  7  lin[l] --  {7}: Adding code to the main body of gawk. 
l --  8  lin[l] --  {8}: Porting gawk to a new operating system.  
l --  9  lin[l] --  {9}: Why derived files are kept in the Git repository.

或者，由于数组索引是数字，我们可以使用 for (l=1;l<=length(lin);l++) print...:

进行数字循环

$ awk -v ArtTEXT="$(cat file)" 'BEGIN{indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l=1;l<=length(lin);l++) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l --  1  lin[l] --  {1}: Reporting Problems and Bugs. 
l --  2  lin[l] --  {2}: Other freely available awk implementations. 
l --  3  lin[l] --  {3}: Summary of installation. 
l --  4  lin[l] --  {4}: How to disable certain gawk extensions. 
l --  5  lin[l] --  {5}: Making Additions To gawk. 
l --  6  lin[l] --  {6}: Accessing the Git repository. 
l --  7  lin[l] --  {7}: Adding code to the main body of gawk. 
l --  8  lin[l] --  {8}: Porting gawk to a new operating system.  
l --  9  lin[l] --  {9}: Why derived files are kept in the Git repository.

多行版本

多行显示的 GNU 代码如下所示：

awk -v ArtTEXT="$(cat file)" '
BEGIN{
    PROCINFO["sorted_in"]="@ind_num_asc"
    indent="\n\t\t\t"
    split(ArtTEXT, lin, indent)
    for (l in lin)
        print "l -- ", l, " lin[l] -- " ,lin[l]
}'

而且，替代代码是：

awk -v ArtTEXT="$(cat file)" '
BEGIN{
    indent="\n\t\t\t"
    split(ArtTEXT, lin, indent)
    for (l=1;l<=length(lin);l++)
        print "l -- ", l, " lin[l] -- " ,lin[l]
}'

来自拆分的 AWK 数组未排序

AWK array from split not sorted

arrays

awk

split

gawk

多行版本