sed: remove spaces from the log using regular expression

I am working with a log consisting of many lines in the following format:

06I: 31 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
 #1.1/? THR 26 N       #1.1/A UNL 1 O      #1.1/? THR 26 H       3.515  2.716
 #1.1/? ASN 142 ND2    #1.1/A UNL 1 O      #1.1/? ASN 142 2HD2   3.227  2.305
 #1.1/A UNL 1 N        #1.1/? THR 26 O     #1.1/A UNL 1 H        3.463  2.652
 #1.2/A UNL 1 N        #1.2/? PHE 140 O    #1.2/A UNL 1 H        2.987  2.200
 #1.4/? THR 26 N       #1.4/A UNL 1 S      #1.4/? THR 26 H       4.354  3.371
 #1.4/? HIS 163 NE2    #1.4/A UNL 1 N     no hydrogen                                          3.137  N/A
 #1.4/A UNL 1 N        #1.4/? ARG 188 O    #1.4/A UNL 1 H        3.000  2.081
 #1.5/? HIS 163 NE2    #1.5/A UNL 1 N     no hydrogen                                          3.330  N/A
 #1.5/? GLN 189 NE2    #1.5/A UNL 1 O      #1.5/? GLN 189 2HE2   3.029  2.132
 #1.6/A UNL 1 N        #1.6/? ARG 188 O    #1.6/A UNL 1 H        2.984  2.064
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 N      #1.8/? ASN 142 2HD2   3.164  2.395
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O      #1.8/? ASN 142 2HD2   3.031  2.180
 #1.8/? GLN 189 NE2    #1.8/A UNL 1 O      #1.8/? GLN 189 1HE2   3.276  2.553
 #1.8/A UNL 1 N        #1.8/? THR 190 O    #1.8/A UNL 1 H        3.257  2.407
 #1.9/A UNL 1 N        #1.9/? THR 190 O    #1.9/A UNL 1 H        2.913  2.037
 #1.10/? SER 144 OG    #1.10/A UNL 1 S     #1.10/? SER 144 HG    4.246  3.845
 #1.10/? HIS 163 NE2   #1.10/A UNL 1 S    no hydrogen                                          3.700  N/A
 #1.10/A UNL 1 N       #1.10/? THR 190 O   #1.10/A UNL 1 H       3.008  2.091
 #1.12/? GLN 189 NE2   #1.12/A UNL 1 O     #1.12/? GLN 189 1HE2  2.929  2.152
 #1.12/A UNL 1 N       #1.12/? PHE 140 O   #1.12/A UNL 1 H       2.912  2.012
 #1.13/? ASN 142 ND2   #1.13/A UNL 1 O     #1.13/? ASN 142 2HD2  3.063  2.291
 #1.14/? HIS 41 NE2    #1.14/A UNL 1 S    no hydrogen                                          3.919  N/A
 #1.14/? ASN 142 ND2   #1.14/A UNL 1 O     #1.14/? ASN 142 2HD2  2.802  1.872
 #1.14/A UNL 1 N       #1.14/? THR 190 O   #1.14/A UNL 1 H       2.927  1.987
 #1.16/? GLN 189 NE2   #1.16/A UNL 1 N     #1.16/? GLN 189 1HE2  3.456  2.669
 #1.16/? GLN 189 NE2   #1.16/A UNL 1 O     #1.16/? GLN 189 1HE2  3.079  2.177
 #1.16/A UNL 1 N       #1.16/? THR 190 O   #1.16/A UNL 1 H       2.967  1.987
 #1.17/? ASN 142 ND2   #1.17/A UNL 1 N     #1.17/? ASN 142 2HD2  3.218  2.294
 #1.17/A UNL 1 N       #1.17/? THR 190 O   #1.17/A UNL 1 H       3.364  2.469
 #1.18/? ASN 142 ND2   #1.18/A UNL 1 O     #1.18/? ASN 142 2HD2  3.117  2.142
 #1.20/? ASN 142 ND2   #1.20/A UNL 1 N     #1.20/? ASN 142 2HD2  3.245  2.560
-----------------------------------------------------------------------------
structure30R: 21 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
 #1.4/? GLN 189 NE2    #1.4/A UNL 1 O       #1.4/? GLN 189 1HE2   3.139  2.374
 #1.5/? GLN 189 NE2    #1.5/A UNL 1 N       #1.5/? GLN 189 2HE2   3.296  2.365
 #1.7/? CYS 145 SG     #1.7/A UNL 1 O       #1.7/? CYS 145 HG     3.466  2.762
 #1.7/A UNL 1 O        #1.7/? LEU 141 O     #1.7/A UNL 1 H        2.951  2.048
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O       #1.8/? ASN 142 2HD2   3.660  3.073
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O       #1.8/? ASN 142 1HD2   2.965  2.162
 #1.8/? CYS 145 SG     #1.8/A UNL 1 O       #1.8/? CYS 145 HG     3.480  2.556
 #1.9/? HIS 163 NE2    #1.9/A UNL 1 O      no hydrogen                                                   3.272  N/A
 #1.9/A UNL 1 O        #1.9/? GLN 189 OE1   #1.9/A UNL 1 H        2.915  2.341
 #1.10/? ASN 142 ND2   #1.10/A UNL 1 O      #1.10/? ASN 142 2HD2  3.100  2.185
 #1.10/? GLN 189 NE2   #1.10/A UNL 1 O      #1.10/? GLN 189 1HE2  3.180  2.408
 #1.10/A UNL 1 O       #1.10/? GLU 166 O    #1.10/A UNL 1 H       3.246  2.639
 #1.11/? ASN 142 ND2   #1.11/A UNL 1 O      #1.11/? ASN 142 2HD2  3.122  2.204
 #1.11/? HIS 163 NE2   #1.11/A UNL 1 O     no hydrogen                                                   3.313  N/A

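As you can see, some lines (those containing the pattern "no hydrogen" followed by a run of spaces) do not match the format: their last two values are shifted far to the right, e.g. no hydrogen 3.137 N/A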

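Since the number of spaces between these elements can vary, I could not find a simple sed expression that removes all the useless spaces. For example,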

sed -e "s/no hydrogen                     //g" 

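would only match specific lines. Could you suggest a regular expression that can be used with sed to match every line containing "no hydrogen" and remove the unused spaces?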

This is the expected output:

06I: 31 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
 #1.1/? THR 26 N       #1.1/A UNL 1 O      #1.1/? THR 26 H       3.515  2.716
 #1.1/? ASN 142 ND2    #1.1/A UNL 1 O      #1.1/? ASN 142 2HD2   3.227  2.305
 #1.1/A UNL 1 N        #1.1/? THR 26 O     #1.1/A UNL 1 H        3.463  2.652
 #1.2/A UNL 1 N        #1.2/? PHE 140 O    #1.2/A UNL 1 H        2.987  2.200
 #1.4/? THR 26 N       #1.4/A UNL 1 S      #1.4/? THR 26 H       4.354  3.371
 #1.4/? HIS 163 NE2    #1.4/A UNL 1 N     no hydrogen            3.137  N/A
 #1.4/A UNL 1 N        #1.4/? ARG 188 O    #1.4/A UNL 1 H        3.000  2.081
 #1.5/? HIS 163 NE2    #1.5/A UNL 1 N     no hydrogen            3.330  N/A
 #1.5/? GLN 189 NE2    #1.5/A UNL 1 O      #1.5/? GLN 189 2HE2   3.029  2.132
 #1.6/A UNL 1 N        #1.6/? ARG 188 O    #1.6/A UNL 1 H        2.984  2.064
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 N      #1.8/? ASN 142 2HD2   3.164  2.395
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O      #1.8/? ASN 142 2HD2   3.031  2.180
 #1.8/? GLN 189 NE2    #1.8/A UNL 1 O      #1.8/? GLN 189 1HE2   3.276  2.553
 #1.8/A UNL 1 N        #1.8/? THR 190 O    #1.8/A UNL 1 H        3.257  2.407
 #1.9/A UNL 1 N        #1.9/? THR 190 O    #1.9/A UNL 1 H        2.913  2.037
 #1.10/? SER 144 OG    #1.10/A UNL 1 S     #1.10/? SER 144 HG    4.246  3.845
 #1.10/? HIS 163 NE2   #1.10/A UNL 1 S    no hydrogen            3.700  N/A
 #1.10/A UNL 1 N       #1.10/? THR 190 O   #1.10/A UNL 1 H       3.008  2.091
 #1.12/? GLN 189 NE2   #1.12/A UNL 1 O     #1.12/? GLN 189 1HE2  2.929  2.152
 #1.12/A UNL 1 N       #1.12/? PHE 140 O   #1.12/A UNL 1 H       2.912  2.012
 #1.13/? ASN 142 ND2   #1.13/A UNL 1 O     #1.13/? ASN 142 2HD2  3.063  2.291
 #1.14/? HIS 41 NE2    #1.14/A UNL 1 S    no hydrogen            3.919  N/A
 #1.14/? ASN 142 ND2   #1.14/A UNL 1 O     #1.14/? ASN 142 2HD2  2.802  1.872
 #1.14/A UNL 1 N       #1.14/? THR 190 O   #1.14/A UNL 1 H       2.927  1.987
 #1.16/? GLN 189 NE2   #1.16/A UNL 1 N     #1.16/? GLN 189 1HE2  3.456  2.669
 #1.16/? GLN 189 NE2   #1.16/A UNL 1 O     #1.16/? GLN 189 1HE2  3.079  2.177
 #1.16/A UNL 1 N       #1.16/? THR 190 O   #1.16/A UNL 1 H       2.967  1.987
 #1.17/? ASN 142 ND2   #1.17/A UNL 1 N     #1.17/? ASN 142 2HD2  3.218  2.294
 #1.17/A UNL 1 N       #1.17/? THR 190 O   #1.17/A UNL 1 H       3.364  2.469
 #1.18/? ASN 142 ND2   #1.18/A UNL 1 O     #1.18/? ASN 142 2HD2  3.117  2.142
 #1.20/? ASN 142 ND2   #1.20/A UNL 1 N     #1.20/? ASN 142 2HD2  3.245  2.560
-----------------------------------------------------------------------------
structure30R: 21 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
 #1.4/? GLN 189 NE2    #1.4/A UNL 1 O       #1.4/? GLN 189 1HE2   3.139  2.374
 #1.5/? GLN 189 NE2    #1.5/A UNL 1 N       #1.5/? GLN 189 2HE2   3.296  2.365
 #1.7/? CYS 145 SG     #1.7/A UNL 1 O       #1.7/? CYS 145 HG     3.466  2.762
 #1.7/A UNL 1 O        #1.7/? LEU 141 O     #1.7/A UNL 1 H        2.951  2.048
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O       #1.8/? ASN 142 2HD2   3.660  3.073
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O       #1.8/? ASN 142 1HD2   2.965  2.162
 #1.8/? CYS 145 SG     #1.8/A UNL 1 O       #1.8/? CYS 145 HG     3.480  2.556
 #1.9/? HIS 163 NE2    #1.9/A UNL 1 O      no hydrogen            3.272  N/A
 #1.9/A UNL 1 O        #1.9/? GLN 189 OE1   #1.9/A UNL 1 H        2.915  2.341
 #1.10/? ASN 142 ND2   #1.10/A UNL 1 O      #1.10/? ASN 142 2HD2  3.100  2.185
 #1.10/? GLN 189 NE2   #1.10/A UNL 1 O      #1.10/? GLN 189 1HE2  3.180  2.408
 #1.10/A UNL 1 O       #1.10/? GLU 166 O    #1.10/A UNL 1 H       3.246  2.639
 #1.11/? ASN 142 ND2   #1.11/A UNL 1 O      #1.11/? ASN 142 2HD2  3.122  2.204
 #1.11/? HIS 163 NE2   #1.11/A UNL 1 O     no hydrogen            3.313  N/A

Using sed

$ sed 's/\(no hydrogen \{12\}\)[[:space:]]\+/\1/' input_file
06I: 31 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
 #1.1/? THR 26 N       #1.1/A UNL 1 O      #1.1/? THR 26 H       3.515  2.716
 #1.1/? ASN 142 ND2    #1.1/A UNL 1 O      #1.1/? ASN 142 2HD2   3.227  2.305
 #1.1/A UNL 1 N        #1.1/? THR 26 O     #1.1/A UNL 1 H        3.463  2.652
 #1.2/A UNL 1 N        #1.2/? PHE 140 O    #1.2/A UNL 1 H        2.987  2.200
 #1.4/? THR 26 N       #1.4/A UNL 1 S      #1.4/? THR 26 H       4.354  3.371
 #1.4/? HIS 163 NE2    #1.4/A UNL 1 N     no hydrogen            3.137  N/A
 #1.4/A UNL 1 N        #1.4/? ARG 188 O    #1.4/A UNL 1 H        3.000  2.081
 #1.5/? HIS 163 NE2    #1.5/A UNL 1 N     no hydrogen            3.330  N/A
 #1.5/? GLN 189 NE2    #1.5/A UNL 1 O      #1.5/? GLN 189 2HE2   3.029  2.132
 #1.6/A UNL 1 N        #1.6/? ARG 188 O    #1.6/A UNL 1 H        2.984  2.064
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 N      #1.8/? ASN 142 2HD2   3.164  2.395
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O      #1.8/? ASN 142 2HD2   3.031  2.180
 #1.8/? GLN 189 NE2    #1.8/A UNL 1 O      #1.8/? GLN 189 1HE2   3.276  2.553
 #1.8/A UNL 1 N        #1.8/? THR 190 O    #1.8/A UNL 1 H        3.257  2.407
 #1.9/A UNL 1 N        #1.9/? THR 190 O    #1.9/A UNL 1 H        2.913  2.037
 #1.10/? SER 144 OG    #1.10/A UNL 1 S     #1.10/? SER 144 HG    4.246  3.845
 #1.10/? HIS 163 NE2   #1.10/A UNL 1 S    no hydrogen            3.700  N/A
 #1.10/A UNL 1 N       #1.10/? THR 190 O   #1.10/A UNL 1 H       3.008  2.091
 #1.12/? GLN 189 NE2   #1.12/A UNL 1 O     #1.12/? GLN 189 1HE2  2.929  2.152
 #1.12/A UNL 1 N       #1.12/? PHE 140 O   #1.12/A UNL 1 H       2.912  2.012
 #1.13/? ASN 142 ND2   #1.13/A UNL 1 O     #1.13/? ASN 142 2HD2  3.063  2.291
 #1.14/? HIS 41 NE2    #1.14/A UNL 1 S    no hydrogen            3.919  N/A
 #1.14/? ASN 142 ND2   #1.14/A UNL 1 O     #1.14/? ASN 142 2HD2  2.802  1.872
 #1.14/A UNL 1 N       #1.14/? THR 190 O   #1.14/A UNL 1 H       2.927  1.987
 #1.16/? GLN 189 NE2   #1.16/A UNL 1 N     #1.16/? GLN 189 1HE2  3.456  2.669
 #1.16/? GLN 189 NE2   #1.16/A UNL 1 O     #1.16/? GLN 189 1HE2  3.079  2.177
 #1.16/A UNL 1 N       #1.16/? THR 190 O   #1.16/A UNL 1 H       2.967  1.987
 #1.17/? ASN 142 ND2   #1.17/A UNL 1 N     #1.17/? ASN 142 2HD2  3.218  2.294
 #1.17/A UNL 1 N       #1.17/? THR 190 O   #1.17/A UNL 1 H       3.364  2.469
 #1.18/? ASN 142 ND2   #1.18/A UNL 1 O     #1.18/? ASN 142 2HD2  3.117  2.142
 #1.20/? ASN 142 ND2   #1.20/A UNL 1 N     #1.20/? ASN 142 2HD2  3.245  2.560
-----------------------------------------------------------------------------
structure30R: 21 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
 #1.4/? GLN 189 NE2    #1.4/A UNL 1 O       #1.4/? GLN 189 1HE2   3.139  2.374
 #1.5/? GLN 189 NE2    #1.5/A UNL 1 N       #1.5/? GLN 189 2HE2   3.296  2.365
 #1.7/? CYS 145 SG     #1.7/A UNL 1 O       #1.7/? CYS 145 HG     3.466  2.762
 #1.7/A UNL 1 O        #1.7/? LEU 141 O     #1.7/A UNL 1 H        2.951  2.048
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O       #1.8/? ASN 142 2HD2   3.660  3.073
 #1.8/? ASN 142 ND2    #1.8/A UNL 1 O       #1.8/? ASN 142 1HD2   2.965  2.162
 #1.8/? CYS 145 SG     #1.8/A UNL 1 O       #1.8/? CYS 145 HG     3.480  2.556
 #1.9/? HIS 163 NE2    #1.9/A UNL 1 O      no hydrogen            3.272  N/A
 #1.9/A UNL 1 O        #1.9/? GLN 189 OE1   #1.9/A UNL 1 H        2.915  2.341
 #1.10/? ASN 142 ND2   #1.10/A UNL 1 O      #1.10/? ASN 142 2HD2  3.100  2.185
 #1.10/? GLN 189 NE2   #1.10/A UNL 1 O      #1.10/? GLN 189 1HE2  3.180  2.408
 #1.10/A UNL 1 O       #1.10/? GLU 166 O    #1.10/A UNL 1 H       3.246  2.639
 #1.11/? ASN 142 ND2   #1.11/A UNL 1 O      #1.11/? ASN 142 2HD2  3.122  2.204
 #1.11/? HIS 163 NE2   #1.11/A UNL 1 O     no hydrogen            3.313  N/A

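\(no hydrogen \{12\}\) - The parentheses \(..\) create a capture group, which sed's back-reference feature can return later with \1. The group could also be written as \(no hydrogen[[:space:]]\{12\}\) to make the space more visible. It captures the words no hydrogen together with the 12 spaces that follow them, and \1 in the replacement puts exactly that text back.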

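[[:space:]]\+ - Because this part is outside the capture group, it is not kept. It matches all the whitespace that remains after the 12 spaces we want to keep in the group, so the substitution drops it.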
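
The command above relies on \+, which is a GNU extension in basic regular expressions, and it only matches when at least 13 spaces follow no hydrogen. As a rough sketch of a more portable variant, assuming the goal is always exactly 12 spaces after no hydrogen, the whole run of spaces can be matched and replaced with a fixed pad. The function name squeeze_no_hydrogen and the variable pad are made up for this illustration and are not part of the original answer.

```shell
# Hypothetical helper (not from the original answer): force exactly 12 spaces
# after "no hydrogen", however many spaces the log line has at that point.
squeeze_no_hydrogen() {
  # Build a string of 12 spaces instead of typing them by hand.
  pad=$(printf '%12s' '')
  # "  *" is one space followed by zero or more spaces, i.e. one or more spaces.
  # \1 puts the captured words "no hydrogen" back, then the fixed pad follows.
  sed "s/\(no hydrogen\)  */\1${pad}/" "$@"
}

# Usage on two sample lines taken from the log: the first has no
# "no hydrogen" field and passes through unchanged.
printf '%s\n' \
  ' #1.4/? THR 26 N       #1.4/A UNL 1 S      #1.4/? THR 26 H       4.354  3.371' \
  ' #1.4/? HIS 163 NE2    #1.4/A UNL 1 N     no hydrogen                                          3.137  N/A' |
  squeeze_no_hydrogen
```

With a file instead of a pipe, the same helper can be called as squeeze_no_hydrogen input_file. Replacing the run of spaces with a fixed pad, rather than keeping 12 of the existing ones, also covers lines that happen to have fewer than 13 spaces after no hydrogen.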