Replace every third line in a text file with content from another text file

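I'm trying to replace every third line with the actual subtitles.

Background: I'm making subtitles for music videos and movies with the help of Videosubfinder and an OCR API.

emptySub.srt (created automatically with Videosubfinder)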

1
00:00:10,076 --> 00:00:15,080
sub duration: 5,004

2
00:00:57,891 --> 00:01:01,694
sub duration: 3,803

subtitle.txt looks like this (I used an OCR API and looped through the images; you don't need to see that code)

I bought some eggs.
He bought some spam.

Code

# Open for reading ("r"); a file opened in append mode ("a") cannot be iterated
with open("emptySub.srt", "r") as file:
    for line in file:
        pass  # TODO

Expected output

1
00:00:10,076 --> 00:00:15,080
I bought some eggs.

2
00:00:57,891 --> 00:01:01,694
He bought some spam.

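I'm stuck. How do I replace those lines with my subtitles? Maybe I should use regex, which I don't know.

Edit: I finally solved it myself

You want a variation of the following: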

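# Read both files into lists of lines; "with" closes them afterwards
with open('subtitle.txt', 'r') as subtitleFile:
  subtitleLines = subtitleFile.readlines()
with open('srtfile.srt', 'r') as srtFile:
  srtLines = srtFile.readlines()

# Overwrite the third line of every 3-line group with a subtitle
for (i, line) in enumerate(subtitleLines):
  srtLines[3*i + 2] = line

# emit srtLines
print(''.join(srtLines), end='')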

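This performs fine for files in the KB to low-MB range, but if the files are huge, you'll want to stream them rather than load them whole, advancing the SRT file faster than the subtitle file. How do you advance an open file? By calling next():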

# Python 3: next(f) replaces the Python 2 method f.next()
with open('subtitle.txt') as subtitleFile, open('srtfile.srt') as srtFile:
  for line in subtitleFile:
    for _ in range(2):
      # Your "next" line will have a newline, so suppress print()'s
      # default newline.
      print(next(srtFile), end="")
    # advance past the placeholder line without printing it
    next(srtFile)
    print(line, end="")

You'll want to catch StopIteration and decide what to do once the SRT file "runs out"; whether to validate the input is up to you.
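
One way to handle that case is sketched below, assuming Python 3 and blocks of exactly three lines as in the snippet above. The function name `merge_stream` and the decision to raise a `ValueError` are illustrative assumptions, not part of the original answer; logging a warning or simply stopping early would be equally reasonable.

```python
import io

def merge_stream(srt_file, subtitle_file, out):
    """Copy two lines per block from srt_file, then write one subtitle
    line in place of the block's third (placeholder) line."""
    for count, line in enumerate(subtitle_file, start=1):
        try:
            for _ in range(2):
                out.write(next(srt_file))
            next(srt_file)  # skip the placeholder line
        except StopIteration:
            # The SRT file ran out before the subtitles did.
            raise ValueError(f"SRT file ended before subtitle {count}") from None
        # Make sure every subtitle ends with exactly one newline.
        out.write(line if line.endswith("\n") else line + "\n")

# In-memory stand-ins for the real files: one SRT block, two subtitles.
srt = io.StringIO("1\n00:00:10,076 --> 00:00:15,080\nsub duration: 5,004\n")
subs = io.StringIO("I bought some eggs.\nHe bought some spam.\n")
out = io.StringIO()
try:
    merge_stream(srt, subs, out)
except ValueError as err:
    message = str(err)
    print(message)
```

Because the function accepts any iterable file-like objects, the same call works with files opened via `open()`.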

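Note, however, that judging from your example, the subtitle line is actually every 4th line starting from line 3 (there is a blank line between SRT blocks).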
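
A minimal sketch of the list-based approach adjusted for that 4-line stride. The helper name `fill_subtitles` and its `block_size` and `offset` parameters are hypothetical, chosen for illustration; the sample data is taken from the question.

```python
def fill_subtitles(srt_lines, subtitle_lines, block_size=4, offset=2):
    """Return a copy of srt_lines in which the placeholder line of each
    block (index block_size * i + offset) is replaced by subtitle i."""
    result = list(srt_lines)
    for i, text in enumerate(subtitle_lines):
        result[block_size * i + offset] = text.rstrip("\n") + "\n"
    return result

srt_lines = [
    "1\n", "00:00:10,076 --> 00:00:15,080\n", "sub duration: 5,004\n", "\n",
    "2\n", "00:00:57,891 --> 00:01:01,694\n", "sub duration: 3,803\n",
]
subtitle_lines = ["I bought some eggs.\n", "He bought some spam.\n"]

merged = fill_subtitles(srt_lines, subtitle_lines)
print("".join(merged), end="")
```

If there are more subtitles than blocks, the index assignment raises `IndexError`, which doubles as a cheap sanity check on the two inputs.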

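# Collect the subtitle lines without their trailing newlines
subList = []
with open("subtitle.txt", "r") as subFile:
    for subLine in subFile:
        subList.append(subLine.rstrip())

print(subList)

i = 0
# Open the output once, in write mode, so a rerun does not append duplicates
with open("emptySub.srt", "r") as file, open("newFile.srt", "w") as resFile:
    for line in file:
        # Placeholder lines look like "sub duration: 5,004"
        if line.startswith("sub duration"):
            line = subList[i] + "\n"
            i = i + 1
        resFile.write(line)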