Structure Android LogCat Text File to Structured Pandas DF

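I want to convert a LogCat text file, one entry per line, into a structured Pandas DF. I can't seem to conceptualize how to do this properly. Here is my basic pseudocode: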

dateTime = []
processID = []
threadID = []
priority = []
application = []
tag = []
text = []

logFile = "xxxxxx.log"

for line in logFile:
     split the string according to the basic structure
     dateTime = [0]
     processID = [1]
     threadID = [2]
     priority = [3]
     application = [4]
     tag = [5]
     text = [6]
     append each to the empty list above

write the lists to pandas dataframe & add column names

The problem is that I don't know how to define the delimiters properly for lines with this structure:

08-01 14:28:35.947 1320 1320 D wpa_xxxx: wlan1: skip--ssid

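import re
import pandas as pd

# One capture group per column. Fields are separated by one or more
# whitespace characters, and the final group takes the rest of the line
# so that message text containing spaces is kept whole.
ROW_PATTERN = re.compile(r"(\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+(\d+)\s+(\d+)\s+([A-Z])\s+(\S+)\s+(\S+)\s+(.*)")

# logFile is the path defined in the question, e.g. "xxxxxx.log"
with open(logFile) as f:
    s = pd.Series(f.readlines())

# Series.str.extract returns a DataFrame with one column per capture group
df = s.str.extract(ROW_PATTERN)
df.columns = ['dateTime', 'processID', 'threadID', 'priority', 'application', 'tag', 'text']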

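This reads each line of logFile into one row of a Series, which can then be expanded into a DataFrame with one column per group in the regular expression. It assumes that a timestamp such as 08-01 14:28:35.947 is the first value on each line and that the subsequent values are separated by whitespace.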
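As a variation on the same idea, here is a minimal sketch that uses named groups so the column names come straight from the pattern, and that drops lines the pattern does not match. The in-memory `sample` text stands in for the real log file, and its second line is only an assumed example of a non-matching line. The column split follows the seven fields listed in the question, which may not match every LogCat output format.

```python
import io
import pandas as pd

# Named groups become the DataFrame column names directly.
ROW_PATTERN = (
    r"(?P<dateTime>\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<processID>\d+)\s+"
    r"(?P<threadID>\d+)\s+"
    r"(?P<priority>[A-Z])\s+"
    r"(?P<application>\S+)\s+"
    r"(?P<tag>\S+)\s+"
    r"(?P<text>.*)"
)

# Hypothetical stand-in for the log file: the sample line from the question
# plus one line that does not match the pattern.
sample = io.StringIO(
    "08-01 14:28:35.947 1320 1320 D wpa_xxxx: wlan1: skip--ssid\n"
    "--------- beginning of main\n"
)

lines = pd.Series(sample.read().splitlines())
df = lines.str.extract(ROW_PATTERN)

# Lines that do not match yield NaN in every column; drop them.
df = df.dropna(how="all").reset_index(drop=True)

# The IDs are extracted as strings; convert them to integers.
df = df.astype({"processID": int, "threadID": int})

print(df)
```

To run this against a real file, replace `sample` with `open(logFile)`. The timestamp has no year, so it is left as a string here. Converting it with `pd.to_datetime` would need an explicit format and an assumed year.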