Use Python to convert MM:SS time format to .ass format
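I have multi-line text with timestamps in MM:SS format, containing subtitle lines from a video. I want to convert the MM:SS times to the ass format, i.e. 00:MM:SS,000, and output the fields separated by tabs. I wrote this code: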
# fields are tab-separated: start, end, caption
text = """02:42\t02:47\tAnd so that Wayne Gretzky method for sort of going into the future and
02:47\t02:51\timagining what that future might look like, again, is a good idea for research."""
for line in text.splitlines():
    words_in_line = line.split('\t')
    for word in words_in_line:
        if ":" in word:
            ass = "00:" + word + ",000"
            final_line = line.replace(word, ass)
            print(final_line)
It converts the format, but it converts only one timestamp per line and then the other one on a separate line, giving output like this:
00:02:42,000 02:47 And so that Wayne Gretzky method for sort of going into the future and
02:42 00:02:47,000 And so that Wayne Gretzky method for sort of going into the future and
00:02:47,000 02:51 imagining what that future might look like, again, is a good idea for research.
02:47 00:02:51,000 imagining what that future might look like, again, is a good idea for research.
How can I change the code to get output like this?
00:02:42,000 00:02:47,000 And so that Wayne Gretzky method for sort of going into the future and
00:02:47,000 00:02:51,000 imagining what that future might look like, again, is a good idea for research.
Use regex sub to search and replace; \1 corresponds to the part in parentheses.
import re
text = """02:42\t02:47\tAnd so that Wayne Gretzky method for sort of going into the future and
02:47\t02:51\timagining what that future might look like, again, is a good idea for research."""
# raw strings keep \d and \1 intact for the regex engine
print(re.sub(r'(\d\d:\d\d)', r'00:\1,000', text))
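You can make the regex more specific, for example with
print(re.sub(r'^(\d\d:\d\d)\t(\d\d:\d\d)', r'00:\1,000\t00:\2,000', text, flags=re.M))
to avoid false replacements (re.M lets ^ match at the start of every line). Check regex101.com to find a pattern that matches your data.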
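Two details are easy to trip over with this approach: in a normal (non-raw) string literal '\1' is the control character chr(1), not a group reference, and without re.MULTILINE the ^ anchor matches only at the very start of the text. Here is a minimal sketch of the anchored version, assuming every subtitle line begins with two tab-separated MM:SS timestamps; the names TIMES, convert and sample are made up for illustration.

```python
import re

# Assumption: each line starts with two tab-separated MM:SS timestamps.
# re.MULTILINE makes ^ match at the start of every line, not just the text.
TIMES = re.compile(r"^(\d\d:\d\d)\t(\d\d:\d\d)", flags=re.MULTILINE)


def convert(text: str) -> str:
    # Raw replacement string: \1 and \2 are group references,
    # and re.sub itself turns \t into a tab character.
    return TIMES.sub(r"00:\1,000\t00:\2,000", text)


# The caption of the first line contains a time-like token (10:30)
# that must stay untouched.
sample = "02:42\t02:47\tMeet at 10:30 sharp\n02:47\t02:51\tSecond line"
print(convert(sample))
```

Because the pattern is anchored to the start of each line, a time-like token inside the caption text is left alone, which the unanchored pattern would not guarantee.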
Something like this seems to do the trick:
text = """
02:42 02:47 And so that Wayne Gretzky method for sort of going into the future and
02:47 02:51 imagining what that future might look like, again, is a good idea for research.
"""

def convert_time(t):
    # MM:SS -> 00:MM:SS,000
    return f"00:{t},000"

for line in text.splitlines():
    try:
        # split on any whitespace, at most twice: start, end, rest of the line
        start, end, caption = line.split(None, 2)
    except ValueError:  # if the line is out of spec, just print it
        print(line)
        continue
    start = convert_time(start)
    end = convert_time(end)
    print(start, end, caption, sep="\t")
The output is
00:02:42,000 00:02:47,000 And so that Wayne Gretzky method for sort of going into the future and
00:02:47,000 00:02:51,000 imagining what that future might look like, again, is a good idea for research.
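As for why the loop in the question misbehaves: it prints inside the inner loop and always calls replace on the unmodified line, so each print carries only one converted timestamp. Here is a minimal sketch that keeps the tab-split idea but rebuilds each line once. It assumes tab-separated fields with the two timestamps first; to_ass_time and convert_line are illustrative names, not taken from the posts above.

```python
def to_ass_time(mmss: str) -> str:
    # MM:SS -> 00:MM:SS,000
    return f"00:{mmss},000"


def convert_line(line: str) -> str:
    fields = line.split("\t")
    if len(fields) < 3:
        # Out-of-spec line (e.g. blank): leave it unchanged.
        return line
    # Convert only the first two fields, so a ':' inside the caption
    # is never mistaken for a timestamp.
    fields[:2] = [to_ass_time(f) for f in fields[:2]]
    return "\t".join(fields)


text = (
    "02:42\t02:47\tAnd so that Wayne Gretzky method for sort of going into the future and\n"
    "02:47\t02:51\timagining what that future might look like, again, is a good idea for research."
)
for line in text.splitlines():
    print(convert_line(line))
```

Converting the fields by position instead of calling replace on the whole line also avoids a double replacement when the start and end times happen to be identical.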