Split by the delimiter that comes first, Python

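I'm trying to split some unpredictable log lines.

The one thing I can predict is that the first field always ends with either a . or a :.

Is there any way I can automatically split the string at whichever delimiter comes first?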

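Check the index of each of the two characters, then split the string at whichever index is lower.

Use the index() function to find the position of the . and : characters in the string.

Here's a simple implementation: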

def index_default(line, char):
    """Returns the index of a character in a line, or the length of the string
    if the character does not appear.
    """
    try:
        retval = line.index(char)
    except ValueError:
        retval = len(line)
    return retval

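def split_log_line(line):
    """Splits a line once at either a period or a colon, depending on which
    appears first in the line.
    """
    if index_default(line, ".") < index_default(line, ":"):
        # maxsplit=1: only split at the first delimiter, so any later
        # periods or colons stay inside the second field.
        return line.split(".", 1)
    else:
        return line.split(":", 1)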

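I wrapped the index() function in an index_default() function because index() raises a ValueError if the line doesn't contain the character, and I wasn't sure whether every line in your log would contain both a period and a colon.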
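To make the behavior the wrapper guards against concrete, here is a small sketch; the sample string is made up for illustration. str.index() raises ValueError when the substring is absent, while str.find() returns -1, which would compare as "earlier" than any real position. That is why falling back to len(line) is convenient for a "which comes first" comparison.

```python
sample = "line4-neither a colon nor a dot"  # hypothetical log line

# str.index() raises ValueError when the character is missing.
try:
    sample.index(":")
    index_raised = False
except ValueError:
    index_raised = True

# str.find() returns -1 instead, which is smaller than any real index,
# so it cannot be compared directly to decide which delimiter comes first.
find_result = sample.find(":")

print(index_raised)  # True
print(find_result)   # -1
```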

And here's a quick example:

mylines = [
    "line1.split at the dot",
    "line2:split at the colon",
    "line3:a colon preceded. by a dot",
    "line4-neither a colon nor a dot"
]

for line in mylines:
    print(split_log_line(line))

which returns:

['line1', 'split at the dot']
['line2', 'split at the colon']
['line3', 'a colon preceded. by a dot']
['line4-neither a colon nor a dot']
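
As an alternative, the same result can probably be had from the standard re module: re.split() with a character class and maxsplit=1 splits once, at whichever of the two delimiters appears first. This is a different technique from the index comparison above, offered only as a shorter sketch; the name split_first_delimiter is made up for illustration.

```python
import re

def split_first_delimiter(line):
    """Split line once, at the first period or colon (hypothetical helper)."""
    # [.:] matches either delimiter; maxsplit=1 stops after the first match.
    return re.split(r"[.:]", line, maxsplit=1)

for line in [
    "line1.split at the dot",
    "line3:a colon preceded. by a dot",
    "line4-neither a colon nor a dot",
]:
    print(split_first_delimiter(line))
```

A line with neither delimiter comes back as a single-element list, matching the behavior of split_log_line().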