Indexing words in a file according to their line with AWK

Question

Suppose I have a file like the following:

hello
hello
hi
hi
hello
hey

I want to find the line numbers on which each word occurs and join them with commas. Ideally, the output would look like this:

hello 1,2,5
hi 3,4
hey 6

I managed to get the distinct line values using the following code:

{ arr[$0]++ }
END { for (i in arr) {
        print i
    }
}

which resulted in:

hey
hi
hello

Answer 1

Try this script:

{
  words[$0] = words[$0] == "" ? FNR : words[$0] "," FNR        # append the current line number to this word's comma-separated list
}

END {                                # once we are done reading the file
  for (w in words)                     # for each word; the traversal order is implementation-dependent
  {
    print w, words[w]             # prints the desired output
  }
}

See Controlling Array Traversal for more details on the order in which the words are printed and how to control it. For more details on FNR, see What are NR and FNR.
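
If the words should come out in the order they first appear in the file, one portable option is to record that order in a second array instead of relying on `for (w in words)`. The following is only a minimal sketch under that assumption: the names `order` and `n` and the temporary input file are introduced here for illustration and are not part of the script above.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' hello hello hi hi hello hey > "$input"

result=$(awk '
# First time a word is seen: remember its position and start its list.
!($0 in words) { order[++n] = $0; words[$0] = FNR; next }

# Later occurrences: append the current line number.
{ words[$0] = words[$0] "," FNR }

# Walk the words by numeric position, so the order is first-seen order.
END { for (i = 1; i <= n; i++) print order[i], words[order[i]] }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With the sample file this prints `hello 1,2,5`, `hi 3,4` and `hey 6`, in that order, because the END loop iterates over numeric indices rather than over the array keys.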
