Should I use regex or something else to extract the log ID from log files?
I have a list containing the following data:
["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]
I want to use a regex in Python to get a list of tuples like this:
[("asdf", "mkol ghth"),("dfcf", "5566 7766"),("7uy7", "jhjh ffvf")]
I tried re.split, but I get an error saying there are too many values to unpack. Here is my code:
logTuples = [()]
for log in logList:
    (logid, logcontent) = re.split(r"(\s)", log)
    logTuples.append((logid, logcontent))
A regex is overkill here:
l = ["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]
lst = [tuple(i.split(maxsplit=1)) for i in l]
print(lst)
Prints:
[('asdf', 'mkol ghth'), ('dfcf', '5566 7766'), ('7uy7', 'jhjh ffvf')]
From the docs:
https://docs.python.org/3/library/re.html
\s
For Unicode (str) patterns: Matches Unicode whitespace characters
(which includes [ \t\n\r\f\v], and also many other characters, for
example the non-breaking spaces mandated by typography rules in many
languages). If the ASCII flag is used, only [ \t\n\r\f\v] is matched.
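Each string contains 2 whitespace characters, so splitting on \s yields 3 items (5 with the capturing group in your pattern, because the separators are returned as well). That is too many values to unpack into two names.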
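To make the unpacking error concrete, here is a small sketch of what re.split returns for one of the sample strings. It assumes you still want a regex; the maxsplit=1 form is one way to stop after the first whitespace, not the only one.

```python
import re

log = "asdf mkol ghth"

# A capturing group makes re.split return the separators too.
with_group = re.split(r"(\s)", log)
# Without the group only the fields come back, but there are still three.
without_group = re.split(r"\s", log)
# maxsplit=1 stops after the first whitespace, leaving exactly two parts.
logid, logcontent = re.split(r"\s", log, maxsplit=1)

print(with_group)           # ['asdf', ' ', 'mkol', ' ', 'ghth']
print(without_group)        # ['asdf', 'mkol', 'ghth']
print((logid, logcontent))  # ('asdf', 'mkol ghth')
```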
If all of your log entries have 3 items separated by spaces, and you always arrange them as (1, 2 + ' ' + 3), you don't need a regex to format them:
logtuples = []
for log in loglist:
    splitlog = log.split(" ")  # 3 elements in total
    # append takes a single argument, so pass the pair as one tuple
    logtuples.append((splitlog[0], splitlog[1] + " " + splitlog[2]))
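If the entries do not always have exactly 3 items, indexing splitlog[2] would fail or drop data. A possible alternative is str.partition, sketched below. The helper name split_log and the extra sample lines (a four-item entry and an ID-only entry) are made up for illustration and are not from the question.

```python
def split_log(line):
    """Split a log line into (id, content) at the first space."""
    # str.partition always returns three parts, so this never raises,
    # even when the line has no space at all (content is then "").
    logid, _sep, content = line.strip().partition(" ")
    return logid, content.lstrip()


# Hypothetical sample data: the last two entries are not from the question.
log_list = ["asdf mkol ghth", "dfcf 5566 7766 8899", "7uy7"]
log_tuples = [split_log(line) for line in log_list]
print(log_tuples)
# [('asdf', 'mkol ghth'), ('dfcf', '5566 7766 8899'), ('7uy7', '')]
```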