Using awk, ignore case when summing lines that share the same pattern

Question

Using awk, I want to sum the counts of lines that share the same text, ignoring differences in case.

I have the following command (many thanks to Andrey, https://whosebug.com/users/3476320/andrey):

awk '{n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' testing.txt

File contents:

1 Used cars
12 Drivers
1 used cars
1 used  cars
14 drivers
2 Used Cars

The actual output is:

2  Used Cars
14  drivers
12  Drivers
2  used cars
1  Used cars

What I need:

26 drivers/Drivers (doesn't matter)
5 used cars/Used Cars (doesn't matter)

Thanks!

Answer 1

From the AWK manual:

One way to perform a case-insensitive match at a particular point in the program is to convert the data to a single case, using the tolower() or toupper() built-in string functions (which we haven’t discussed yet; see String Functions). For example:

tolower($1) ~ /foo/ { … }

Another method, specific to gawk, is to set the variable IGNORECASE to a nonzero value (see Built-in Variables). When IGNORECASE is not zero, all regexp and string operations ignore case.

Also note: in awk, $1 is the first column, $2 the second, and so on; $0 is the whole line. (You don't want to index the array by the whole line.)
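To make the field variables concrete, here is a small sketch. The function name show_fields is made up for this illustration. It prints $1 and $2 for each input line, then clears $1 and shows what $0 looks like afterwards. Assigning to a field makes awk rebuild $0 with single spaces between fields, so runs of blanks collapse and a leading space is left behind.

```shell
# Hypothetical helper, for illustration only: show how awk sees one line.
show_fields() {
  awk '{
    printf "count=%s first_word=%s\n", $1, $2   # $1 and $2 are the first two columns
    $1 = ""                                     # clearing a field rebuilds $0 using OFS (a single space)
    printf "rest=[%s]\n", $0                    # note the leading space and the collapsed blanks
  }'
}
```

For example, printf '1 used  cars\n' | show_fields prints count=1 first_word=used followed by rest=[ used cars].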

This works on my machine:

awk '{a[tolower($2) " " tolower($3)]+=$1;}END{for(i in a){print a[i], i}}' testing.txt

Output:

5 used cars
26 drivers

Answer 2

Perhaps the simplest way:

awk '{$0=tolower($0);n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' file
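A slightly more defensive sketch of the same idea, assuming you also want to drop the leading space that $1="" leaves at the front of $0. The function name sum_ci and the names total and key are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: sum the leading count per case-insensitive label.
sum_ci() {
  awk '{
    n = $1                  # the leading count
    $1 = ""                 # drop it; awk rebuilds $0 with single spaces
    key = tolower($0)       # fold case so "Used Cars" and "used cars" match
    sub(/^ +/, "", key)     # strip the separator left where $1 used to be
    total[key] += n
  }
  END { for (key in total) print total[key], key }' "$@"
}
```

Running sum_ci testing.txt on the file from the question should give 5 used cars and 26 drivers. The order of a for (key in total) loop is not specified, so pipe the result through sort if the order matters.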

