Using awk, ignore case when summarizing lines that share the same pattern

Using awk, I want to ignore case when summing up lines that share the same pattern.

I have the following line (many thanks to Andrey, https://whosebug.com/users/3476320/andrey):

awk '{n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' testing.txt

File contents:

1 Used cars
12 Drivers
1 used cars
1 used  cars
14 drivers
2 Used Cars

The actual output is:

2  Used Cars
14  drivers
12  Drivers
2  used cars
1  Used cars

What I need:

26 drivers/Drivers (doesn't matter)
5 used cars/Used Cars (doesn't matter)

Thanks!

From the AWK manual:

One way to perform a case-insensitive match at a particular point in the program is to convert the data to a single case, using the tolower() or toupper() built-in string functions (which we haven’t discussed yet; see String Functions). For example:

tolower($1) ~ /foo/ { … }

Another method, specific to gawk, is to set the variable IGNORECASE to a nonzero value (see Built-in Variables). When IGNORECASE is not zero, all regexp and string operations ignore case.
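As a small illustration of the tolower() approach quoted above, here is a sketch that sums the counts of every line whose second column is "used" in any letter case. The function name count_used and the variable sum are made up for this example; tolower() is available in POSIX awk, so this does not depend on gawk's IGNORECASE.

```shell
# Sum the first column of every line whose second column is "used",
# compared case-insensitively by lowercasing the field first.
count_used() {
  awk 'tolower($2) ~ /^used$/ { sum += $1 }   # "Used" and "used" both match
       END { print sum + 0 }' "$@"            # "+ 0" prints 0 if nothing matched
}

# Feed it the sample data from the question.
printf '%s\n' '1 Used cars' '12 Drivers' '1 used cars' \
              '1 used  cars' '14 drivers' '2 Used Cars' | count_used
```

With the question's sample data this should print 5.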

Also note: in awk, $1 is the first column, $2 the second, and so on; $0 is the whole line. (You don't want to index the array by the whole line.)
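To make the field variables concrete, the following sketch runs one line of the sample file through awk and prints what $1, $2 and $0 hold after the first field is emptied. The function name show_fields is made up for this example. Assigning to a field makes awk rebuild $0 from its fields joined by a single space, which is why the double space disappears and a leading space is left behind.

```shell
# Show how the fields and $0 look after clearing $1.
show_fields() {
  printf '1 used  cars\n' | awk '{
    n = $1      # first column: the count
    $1 = ""     # empty the first field; awk rebuilds $0 as " used cars"
    print n "|" $2 "|" $0 "|"
  }'
}

show_fields
```

This should print `1|used| used cars|`: the count, the second column, and the rebuilt line with its leading space.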

This works on my machine:

awk '{a[tolower($2) " " tolower($3)]+=$1;}END{for(i in a){print a[i], i}}' testing.txt

Output:

5 used cars
26 drivers

Perhaps the simplest way:

awk '{$0=tolower($0);n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' file
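For reference, here is the same idea written out as a small script that can be run as-is. It is only a sketch: the function name summarize and the array name total are made up here, and two things are added that the one-liner does not do, namely stripping the leading space left by emptying $1 and piping through sort, since awk's `for (k in ...)` order is unspecified.

```shell
# Sum the first column per lowercased remainder of the line.
summarize() {
  awk '{
    $0 = tolower($0)    # normalize case; awk re-splits the fields
    n = $1              # count from the first column
    $1 = ""             # drop the count; $0 is rebuilt as " used cars"
    sub(/^ +/, "")      # remove the leading separator
    total[$0] += n
  }
  END { for (k in total) print total[k], k }' "$@" | sort
}

# Sample data from the question, read from standard input.
printf '%s\n' '1 Used cars' '12 Drivers' '1 used cars' \
              '1 used  cars' '14 drivers' '2 Used Cars' | summarize
```

With the sample data this should print `26 drivers` followed by `5 used cars`. Passing a file name instead, as in `summarize testing.txt`, works the same way.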