Sort groups of 12 lines based on one value

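I am trying to optimize the search for the top-ranked polynomials (https://maths-people.anu.edu.au/~brent/pd/Murphy-thesis.pdf) in a list containing 500K lines of data. The list is organized in groups of 12 lines, each group in the following format: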

n: 533439167600904850230361756102700151678687933392166847323827307497363839257031077774321424872955045754669625577486179222154434651598903112919949771321416511589029559325246084363632977829645558547714072241
Y0: -2185827644152440194843077528225522129878
Y1: 119181810251841490251547
c0: 520196368294236390929241313007470334962
c1: 96360506527052960901419060941213412645
c2: 43791634664623702231347384357
c3: -9285559657533242039560613517
c4: 563452403603161952
c5: -21637936320
skew: 137792.000
lognorm 67.52, exp_E 62.03, alpha -1.81 (proj -2.68), 3 real roots

n: 533439167600904850230361756102700151678687933392166847323827307497363839257031077774321424872955045754669625577486179222154434651598903112919949771321416511589029559325246084363632977829645558547714072241
Y0: -2185827643535814056463203098120423438934
Y1: 1185320029877707674463
c0: 2018231558989478149929124495499518870153
c1: 877408379299126273318698618329767851376
c2: -103500370253681428439107986294
c3: -8603519648746439934492486528
c4: 220583232537944759
c5: -12839506680
skew: 431744.000
lognorm 68.01, exp_E 62.61, alpha 0.09 (proj -1.93), 3 real roots

How can I sort these groups by the value of a given parameter (lognorm or exp_E)?

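I don't think the sort command will do what you want without some "help".
So:

  • combine all 12 lines of a group into one super string
  • prepend the two sort fields to that string
  • sort as desired
  • convert back to the original format

The following is not the most efficient script, but it should be fairly easy to understand: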

#  combine 12 lines into one super string
#  precede each string with the two potential sort fields
#  (the list is read from standard input)
gawk '
BEGIN{del="^"}
$0==""{next}  ## skip blank line
{all=all $0 del}  ## build up combo string
/lognorm/{
  L=$2  ## lognorm value, e.g. "67.52,"
  E=$4  ## exp_E value, e.g. "62.03,"
  sub(",","",L)  ## strip the trailing comma
  sub(",","",E)
  print L,E,all  ## copy two potential sort fields to front of the string
  all=""
}'  |
sort -n -k1,1 | ## or -k2,2  ### now we sort on desired field
gawk '{
  gsub(/[\^]/, "\n")           # replace ^ with newline
  sub(/^[^ ]* [^ ]* /, "")  # strip first two fields (we added above)
  print $0
}'
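
For comparison, here is a sketch of the same decorate, sort, undecorate idea with the sort key passed as a parameter. It is a variation, not the script above: it uses awk's paragraph mode (RS="") to read each blank-line-separated group as one record. The function name sort_groups is made up for illustration, and the sketch assumes the data contains no "^" or tab characters and that each group carries a line of the form "lognorm <number>, exp_E <number>, ...".

```shell
# Hypothetical helper: sort_groups KEY FILE
#   KEY  is "lognorm" or "exp_E"
#   FILE holds groups of lines separated by blank lines
# Assumes the data contains neither "^" nor tab characters.
sort_groups() {
  key=$1
  file=$2
  awk -v key="$key" '
    BEGIN { RS = ""; FS = "\n" }            # paragraph mode: one record per group
    {
      val = ""
      for (i = 1; i <= NF; i++)             # each field is one line of the group
        if (match($i, key " [-0-9.]+")) {   # e.g. "exp_E 62.03"
          val = substr($i, RSTART + length(key) + 1, RLENGTH - length(key) - 1)
          break
        }
      rec = $0
      gsub(/\n/, "^", rec)                  # flatten the group onto one line
      print val "\t" rec                    # decorate: sort key, tab, group
    }' "$file" |
  LC_ALL=C sort -t "$(printf '\t')" -k1,1n |  # numeric sort on the key only
  cut -f2- |                                  # undecorate: drop the key
  awk 'NR > 1 { print "" } { gsub(/\^/, "\n"); print }'  # restore lines and blank separators
}
```

Calling it as sort_groups exp_E polys.txt (polys.txt being a placeholder file name) would print the groups in ascending order of exp_E. A group without the requested key gets an empty sort key, which sort -n treats as 0.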