Count occurrences and compute values from two files using awk

I have 2 files: file1 has about 1 million lines and file2 has about 1,000 lines.

file1 contains the fields tower_id, user_id and signal_strength, and looks like this:

"0001","00abcde","0.65"
"0002","00abcde","0.35"
"0005","00bcdef","1.0"
"0001","00cdefg","0.1"
"0003","00cdefg","0.4"
"0008","00cdefg","0.3"
"0009","00cdefg","0.2"

file2 contains the fields tower_id, x_position and y_position, and looks like this:

"0001","34","22"
"0002","78","56"
"0003","12","32"
"0004","79","45"
"0005","36","37"
"0006","87","99"
"0007","27","93"
"0008","55","04"
"0009","02","03"

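The signal_strength values for each user_id sum to 1. I need to compute each user's position from the signal strength of each tower: find the towers listed for each user, then sum the products of signal_strength and the tower position values, like this: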

"00abcde" --> 0.65*34+0.35*78, 0.65*22+0.35*56
"00bcdef" --> 1.0*36, 1.0*37
"00cdefg" --> 0.1*34+0.4*12+0.3*55+0.2*02, 0.1*22+0.4*32+0.3*04+0.2*03

So the output file should look like this (user_id, computed_x_position, computed_y_position):

00abcde,49.4,33.9
00bcdef,36,37
00cdefg,25.1,16.8

My idea was to use awk, somehow using its "read" feature with file1 and file2 as input files (something like awk 'NR==FNR {some commands} {print some values}' file1 file2 > outputfile), but I don't know how to do it. Can anyone help me?

This might be what you want:

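awk -F '[,"]+' '
    # file2 (read first): store the x and y position of each tower_id
    NR==FNR { towx[$2] = $3; towy[$2] = $4; next }
    # file1: add position * signal_strength to the per-user sums
            { usrx[$3] += towx[$2] * $4; usry[$3] += towy[$2] * $4 }
    END     { for (usr in usrx) printf "%s,%.1f,%.1f\n",
                                       usr, usrx[usr], usry[usr] }
' file2 file1 # file2 precedes file1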
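
If it helps to see why the data starts at the second field: every line begins with a quote, so the separator regex '[,"]+' matches at the very start of the line and awk produces an empty first field. A small sketch that prints each field of one sample line from file2 (split_fields is just a hypothetical helper name for this illustration, not part of the command above):

```shell
# Hypothetical helper: print every field awk sees when using -F '[,"]+'.
split_fields() {
    awk -F '[,"]+' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

# One sample line from file2; the leading quote makes $1 empty,
# so tower_id ends up in $2, x_position in $3 and y_position in $4.
printf '%s\n' '"0001","34","22"' | split_fields
```

The trailing quote likewise leaves an empty last field, which the command simply never uses. Note also that "%.1f" always prints one decimal, so a whole-number result such as 36 would appear as 36.0.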