Log Analytics with Python

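I have the following text file. It is a log file. When I read it with Python, it is treated as a single line, but that line contains multiple transactions. I want to be able to read the individual rows inside the 'Line'. Here is the text file: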

[30-Apr-2020] [23:52:13:093] [[ACTIVE] ExecuteThread: '217' for queue: 'weblogic.kernel.Default (self-tuning)'] [27263170] [172.16.211.13][a70ce98f-7931-482a-8418-a4ccb4f43aaa][com.intellectdesign.cib.viewdefinition.hal.IntegratorListViewInstruction][INFO] {Entered into Method: {TOTAL_NUM_RECORDS=0, ALL_RECORDS=[{OD_MAKER_DATE=2020-04-20 14:21:53.0, OD_FUNCTION_ID=CRMOB, OD_STATUS=AH, BENE_ACC_NO=254723237762, OD_TXN_CY=KES, OD_AMOUNT=          55,000, TRANTYPEID=1, MODULE_DESCRIPTION=Safaricom M-Pesa B2C, OD_REF_NO=AAAAB5945920, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-20 13:22:47.0, OD_FUNCTION_ID=CRMOB, OD_STATUS=AH, BENE_ACC_NO=254723237762, OD_TXN_CY=KES, OD_AMOUNT=           5,000, TRANTYPEID=1, MODULE_DESCRIPTION=Safaricom M-Pesa B2C, OD_REF_NO=AAAAB592AE20, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-20 13:16:22.0, OD_FUNCTION_ID=CRMOB, OD_STATUS=AH, BENE_ACC_NO=0703761794, OD_TXN_CY=KES, OD_AMOUNT=           4,000, TRANTYPEID=1, MODULE_DESCRIPTION=Safaricom M-Pesa B2C, OD_REF_NO=AAAAB5924E20, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-18 14:43:54.0, OD_FUNCTION_ID=CRMOB, OD_STATUS=AH, BENE_ACC_NO=0703761794, OD_TXN_CY=KES, OD_AMOUNT=           3,600, TRANTYPEID=1, MODULE_DESCRIPTION=Safaricom M-Pesa B2C, OD_REF_NO=AAAAB5790A20, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-18 14:41:55.0, OD_FUNCTION_ID=PESAF, OD_STATUS=AH, BENE_ACC_NO=KCB 1169902251, OD_TXN_CY=KES, OD_AMOUNT=          55,000, TRANTYPEID=1, MODULE_DESCRIPTION=Pesalink, OD_REF_NO=AAAAB5790320, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-17 17:28:06.0, OD_FUNCTION_ID=PESAF, OD_STATUS=AH, BENE_ACC_NO=KCB 1169902251, OD_TXN_CY=KES, OD_AMOUNT=         200,000, TRANTYPEID=1, MODULE_DESCRIPTION=Pesalink, OD_REF_NO=AAAAB55EDE20, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-17 08:48:08.0, OD_FUNCTION_ID=CRIFT, OD_STATUS=AH, BENE_ACC_NO=01108076490100, OD_TXN_CY=KES, OD_AMOUNT=           5,126, TRANTYPEID=1, MODULE_DESCRIPTION=Internal Funds Transfer, 
OD_REF_NO=AAAAB5540820, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-17 04:22:26.0, OD_FUNCTION_ID=CRMOB, OD_STATUS=AH, BENE_ACC_NO=254723237762, OD_TXN_CY=KES, OD_AMOUNT=          25,000, TRANTYPEID=1, MODULE_DESCRIPTION=Safaricom M-Pesa B2C, OD_REF_NO=AAAAB552B020, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-17 04:20:34.0, OD_FUNCTION_ID=CRIFT, OD_STATUS=AH, BENE_ACC_NO=01108076490100, OD_TXN_CY=KES, OD_AMOUNT=         320,400, TRANTYPEID=1, MODULE_DESCRIPTION=Internal Funds Transfer, OD_REF_NO=AAAAB552A420, TRANSACTION_STATUS=success}, {OD_MAKER_DATE=2020-04-17 04:18:32.0, OD_FUNCTION_ID=CRIFT, OD_STATUS=RH, BENE_ACC_NO=01108076490100, OD_TXN_CY=KES, OD_AMOUNT=         330,866, TRANTYPEID=1, MODULE_DESCRIPTION=Internal Funds Transfer, OD_REF_NO=AAAAB5529820, TRANSACTION_STATUS=failed}], ENCODE_RESPONSE_IND=true, JSON_DATA={}}}

I want to split the line into separate rows, one per record of the form {OD_MAKER_DATE=2020-04-17 04:20:34.0 ..................... }.

My Python code is:

#Loading libraries
import re
import pandas as pd
import numpy as np
filename = r'C:\Users\xxxxx\Desktop\test.txt'  

with open(filename) as fn:
    # Print each physical line of the file with its 1-based line number
    for lncnt, ln in enumerate(fn, start=1):
        print("Line {}: {}".format(lncnt, ln.strip()))

If you want to split a string on another string, you can use the .split() method.

string = 'ABC splithere XYZ splithere COOL'
parts  = string.split('splithere')
print(parts)
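
Applied to the log from the question, the same idea might look like the sketch below. It assumes the records always sit between "ALL_RECORDS=[{" and "}]" and are separated by exactly "}, {". The `sample` string is a shortened, hypothetical stand-in for the real log line, not the full data.

```python
# Shortened stand-in for the log line (hypothetical sample, fewer fields per record)
sample = (
    "ALL_RECORDS=[{OD_REF_NO=AAAAB5945920, TRANSACTION_STATUS=success}, "
    "{OD_REF_NO=AAAAB592AE20, TRANSACTION_STATUS=success}, "
    "{OD_REF_NO=AAAAB5529820, TRANSACTION_STATUS=failed}], ENCODE_RESPONSE_IND=true"
)

# Keep only the text between "ALL_RECORDS=[{" and the closing "}]"
marker = "ALL_RECORDS=[{"
start = sample.index(marker) + len(marker)
end = sample.index("}]", start)

# Each record is separated from the next by "}, {"
records = sample[start:end].split("}, {")

for number, record in enumerate(records, start=1):
    print("Record {}: {}".format(number, record))
```

This only works while the separator text is exactly the same everywhere; a line break or extra space between records would prevent the split.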

If you need more complex string splitting or string matching, use Python's "re" module.
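
As a rough sketch of the "re" approach for this log: the pattern below assumes every record starts with "{OD_MAKER_DATE=" and contains no nested braces, and that field names consist only of capital letters and underscores. `log_text`, `parse_record`, and the shortened sample are illustrative names and data, not part of the original code; with the real file you would read the whole content with fn.read() instead.

```python
import re

# Shortened stand-in for the full log text (hypothetical sample; note the embedded newline)
log_text = (
    "[30-Apr-2020] [23:52:13:093] [INFO] {Entered into Method: "
    "{TOTAL_NUM_RECORDS=0, ALL_RECORDS=["
    "{OD_MAKER_DATE=2020-04-20 14:21:53.0, OD_FUNCTION_ID=CRMOB, "
    "OD_AMOUNT=          55,000, MODULE_DESCRIPTION=Safaricom M-Pesa B2C, \n"
    "OD_REF_NO=AAAAB5945920, TRANSACTION_STATUS=success}, "
    "{OD_MAKER_DATE=2020-04-17 04:18:32.0, OD_FUNCTION_ID=CRIFT, "
    "OD_AMOUNT=         330,866, MODULE_DESCRIPTION=Internal Funds Transfer, "
    "OD_REF_NO=AAAAB5529820, TRANSACTION_STATUS=failed}"
    "], ENCODE_RESPONSE_IND=true, JSON_DATA={}}}"
)

# A record runs from "{OD_MAKER_DATE=" to the next "}".
# [^}]* also matches newlines, so a record broken across two lines is still found.
record_pattern = re.compile(r"\{OD_MAKER_DATE=[^}]*\}")
records = record_pattern.findall(log_text)

# Split fields on a comma plus whitespace only when a KEY= follows,
# so the comma inside an amount such as "55,000" is left alone.
field_pattern = re.compile(r",\s+(?=[A-Z_]+=)")

def parse_record(record):
    """Turn one '{KEY=value, ...}' string into a dict of stripped values."""
    body = record.strip("{}")
    pairs = (field.split("=", 1) for field in field_pattern.split(body))
    return {key: value.strip() for key, value in pairs}

rows = [parse_record(record) for record in records]
for number, row in enumerate(rows, start=1):
    print("Record {}: {}".format(number, row))
```

Since pandas is already imported in the question, pd.DataFrame(rows) would then give one row per transaction.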