How to isolate timestamp and expression from log file using regex

Question

I have a log file that looks like this:

04:26:24.664149 [PHY1 ] [I] [ 4198] PUCCH: cc=0; rnti=0x46, f=1a, n_pucch=12, dmrs_corr=0.995, snr=13.2 dB, corr=0.974, ack=1, ta=0.1 us

04:26:24.665067 [PHY0 ] [D] [ 4199] Worker 0 running

04:26:24.665166 [PHY0 ] [D] [ 4199] Sending to radio

04:26:24.666220 [PHY1 ] [I] [ 4200] PUCCH: cc=0; rnti=0x46, f=1, n_pucch=0, dmrs_corr=0.270, snr=-4.3 dB, corr=0.000, sr=no, ta=-9.0 us

04:26:24.666288 [PHY1 ] [D] [ 4200] Sending to radio

04:26:24.667305 [PHY0 ] [I] [ 4201] PUCCH: cc=0; rnti=0x46, f=2, n_pucch=0, dmrs_corr=0.989, snr=15.4 dB, corr=0.998, cqi=15 (cc=0), ta=0.2 us

04:26:24.667338 [MAC ] [D] [ 4201] ra_tbs=72/144, tbs_bytes=15, tbs=144, mcs=2

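I want to isolate the lines that contain an snr={value} entry and copy the timestamp associated with that entry. In the first line above, for example, the parts I want to extract with a regex are 04:26:24.664149 and snr=13.2.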

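I have tried many different regular expressions to extract these two pieces of information from my log file (on the lines where they occur). Note that the snr value can be positive or negative and can range from -999.9 dB to 999.9 dB. The timestamp appears on every line of the log file.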

An example of my expected output is: 04:26:24.664149 snr=13.2

Any help would be greatly appreciated!

Answer 1

Here is one approach using re.findall:

import re

inp = """04:26:24.664149 [PHY1   ] [I] [ 4198] PUCCH: cc=0; rnti=0x46, f=1a, n_pucch=12, dmrs_corr=0.995, snr=13.2 dB, corr=0.974, ack=1, ta=0.1 us
04:26:24.665067 [PHY0   ] [D] [ 4199] Worker 0 running
04:26:24.665166 [PHY0   ] [D] [ 4199] Sending to radio
04:26:24.666220 [PHY1   ] [I] [ 4200] PUCCH: cc=0; rnti=0x46, f=1, n_pucch=0, dmrs_corr=0.270, snr=-4.3 dB, corr=0.000, sr=no, ta=-9.0 us
04:26:24.666288 [PHY1   ] [D] [ 4200] Sending to radio"""

# Group 1: the timestamp. Group 2: the snr entry later on the same line.
# [^\r\n]* keeps the match from crossing a line boundary.
matches = re.findall(r"(\d{2}:\d{2}:\d{2}\.\d{6})[^\r\n]*(snr=-?\d+(?:\.\d+)?)", inp)
print(matches)

This prints:

[('04:26:24.664149', 'snr=13.2'), ('04:26:24.666220', 'snr=-4.3')]
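If you would rather process the log one line at a time and get output in exactly the "timestamp snr=value" form shown in the question, something like the following sketch may work. The names LINE_RE and extract_snr are made up for illustration, and the pattern assumes every line starts with an HH:MM:SS.ffffff timestamp and that snr has at most three integer digits, as described in the question.

```python
import re

# Timestamp anchored at the start of the line, then a lazy skip to the snr entry.
# \bsnr avoids matching a longer key that merely ends in "snr" (an assumption
# about what other fields might appear in the log).
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\b.*?\b(snr=-?\d{1,3}(?:\.\d+)?)")

def extract_snr(lines):
    """Yield 'timestamp snr=value' for each line that has an snr entry."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield f"{m.group(1)} {m.group(2)}"

sample = [
    "04:26:24.664149 [PHY1   ] [I] [ 4198] PUCCH: cc=0; rnti=0x46, f=1a, n_pucch=12, dmrs_corr=0.995, snr=13.2 dB, corr=0.974, ack=1, ta=0.1 us",
    "04:26:24.665067 [PHY0   ] [D] [ 4199] Worker 0 running",
    "04:26:24.666220 [PHY1   ] [I] [ 4200] PUCCH: cc=0; rnti=0x46, f=1, n_pucch=0, dmrs_corr=0.270, snr=-4.3 dB, corr=0.000, sr=no, ta=-9.0 us",
]

for row in extract_snr(sample):
    print(row)
```

Because extract_snr accepts any iterable of strings, an open file object can be passed in place of sample, so the whole log never has to be held in memory.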

