Use Python to convert MM:SS time format to .ass format

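I have a multi-line text of subtitle lines from a video, each with times in MM:SS format. I want to convert the MM:SS format to the ass format, i.e. 00:MM:SS,000, and print the output tab-separated. I wrote this code: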

text = """02:42 02:47   And so that Wayne Gretzky method for sort of going into the future and
02:47   02:51   imagining what that future might look like, again, is a good idea for research."""
for line in text.splitlines():
    words_in_line = line.split('\t')
    for word in words_in_line:
        if ":" in word:
                ass= "00:"+word +",000"
                final_line = line.replace(word,ass)
                print(final_line)

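It converts the format, but it converts only one time per printed line and puts the other conversion on a separate line, giving output like this: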

00:02:42,000    02:47   And so that Wayne Gretzky method for sort of going into the future and
02:42   00:02:47,000    And so that Wayne Gretzky method for sort of going into the future and
00:02:47,000    02:51   imagining what that future might look like, again, is a good idea for research.
02:47   00:02:51,000    imagining what that future might look like, again, is a good idea for research.

How can I change the code to get output like this?

00:02:42,000    00:02:47,000    And so that Wayne Gretzky method for sort of going into the future and
00:02:47,000    00:02:51,000    imagining what that future might look like, again, is a good idea for research.

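Use a regex substitution (re.sub) to search and replace; \1 refers to the part captured in parentheses. Write the pattern and the replacement as raw strings so that Python does not interpret \1 as an escape sequence.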

import re
text = """02:42 02:47   And so that Wayne Gretzky method for sort of going into the future and
02:47   02:51   imagining what that future might look like, again, is a good idea for research."""
print(re.sub(r'(\d\d:\d\d)', r'00:\1,000', text))

You can make the regular expression more specific, for example with

# re.MULTILINE lets ^ match at the start of every line, not just the first
print(re.sub(r'^(\d\d:\d\d)\s+(\d\d:\d\d)', r'00:\1,000\t00:\2,000', text, flags=re.MULTILINE))

to avoid incorrect replacements. Try regex101.com to find a pattern that matches your data.
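To illustrate why the anchored pattern helps, here is a minimal sketch, assuming the two times always open a line and are separated by spaces or tabs. The names convert_times and LINE_START and the sample caption are made up for illustration. A time that appears inside the caption text is left alone.

```python
import re

# Two MM:SS stamps at the start of a line, separated by spaces or tabs.
# re.MULTILINE makes ^ match at the start of every line.
LINE_START = re.compile(r'^(\d\d:\d\d)[ \t]+(\d\d:\d\d)', flags=re.MULTILINE)


def convert_times(text):
    """Rewrite each leading 'MM:SS MM:SS' pair as '00:MM:SS,000<TAB>00:MM:SS,000'."""
    return LINE_START.sub(r'00:\1,000\t00:\2,000', text)


# Hypothetical caption that itself contains a time (10:30).
sample = "02:42\t02:47\tWe meet at 10:30 tomorrow"
print(convert_times(sample))
```

With the unanchored pattern from the first snippet, the 10:30 in the caption would also be rewritten.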

Something like this seems to do the trick:

text = """
02:42 02:47   And so that Wayne Gretzky method for sort of going into the future and
02:47   02:51   imagining what that future might look like, again, is a good idea for research.
"""


def convert_time(t):
    return f"00:{t},000"


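for line in text.splitlines():
    try:
        # split on any whitespace into start time, end time, and the rest;
        # use a separate name so the input variable "text" is not overwritten
        start, end, caption = line.split(None, 2)
    except ValueError:  # if the line is out of spec, just print it
        print(line)
        continue
    start = convert_time(start)
    end = convert_time(end)
    print(start, end, caption, sep="\t")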

The output is

00:02:42,000    00:02:47,000    And so that Wayne Gretzky method for sort of going into the future and
00:02:47,000    00:02:51,000    imagining what that future might look like, again, is a good idea for research.
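One caveat with the split approach is that it converts the first two words of any line with at least three words, whether or not they are times. A hedged sketch of a stricter variant follows, assuming a valid time is exactly two digits, a colon, and two digits. The names convert_line and MMSS are hypothetical.

```python
import re

MMSS = re.compile(r'\d\d:\d\d')  # assumed shape of a valid time stamp


def convert_line(line):
    """Return the line with its two leading MM:SS stamps converted, else unchanged."""
    parts = line.split(None, 2)
    if len(parts) < 2 or not all(MMSS.fullmatch(p) for p in parts[:2]):
        return line  # out of spec: leave the line untouched
    start, end = (f"00:{p},000" for p in parts[:2])
    # parts[2:] is empty when the line has no caption after the two times
    return "\t".join([start, end, *parts[2:]])


print(convert_line("02:42 02:47   And so that"))
print(convert_line("Not a subtitle line"))
```

Returning the string instead of printing it makes the function easier to reuse, for example when writing the converted lines to a file.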