Removing specific blank lines from txt/srt files in a directory and its subdirectories with Python

I have many subtitle files in the following format.

1

00:00:01,000 --> 00:00:02,008
some dummy text

2

00:00:02,008 --> 00:00:05,006
some dummy text
some dummy text

3

00:00:05,006 --> 00:00:08,008
some dummy text
some dummy text

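I want to convert them to the form below by removing the blank line between each sequence number and the timestamp that follows it.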

1
00:00:01,000 --> 00:00:02,008
some dummy text

2
00:00:02,008 --> 00:00:05,006
some dummy text
some dummy text

3
00:00:05,006 --> 00:00:08,008
some dummy text
some dummy text

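Since there are many files, I need code that can be applied to every file in a directory and its subdirectories. Is it possible to overwrite the existing files?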

Here is how you can do it with os.walk() and re.sub():

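import os
import re

# Raw string, no trailing backslash: '\U' and a final '\' would otherwise be a syntax error
for root, dirs, files in os.walk(r'C:\Users\User\Desktop\Folder'):
    for file in files:
        # The question covers both .txt and .srt files
        if file.endswith(('.txt', '.srt')):
            fpath = os.path.join(root, file)
            with open(fpath, 'r') as f:
                # Collapse the blank line(s) between a number and the timestamp into one newline
                t = re.sub(r'(?<=\d)\n+(?=\d\d:\d\d:\d\d,\d\d\d)', '\n', f.read())
            # Reopening in 'w' mode overwrites the existing file
            with open(fpath, 'w') as f:
                f.write(t)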
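
Because this overwrites the files in place, you may want a safety net. Below is a minimal sketch of the same idea split into two functions, not a definitive implementation. The names close_gaps and fix_tree, the .bak backup suffix, and the UTF-8 encoding are my own assumptions, so adjust them to your files. It anchors the number to the start of a line, keeps a backup copy before overwriting, and skips files that need no change.

```python
import re
import shutil
from pathlib import Path

# A line holding only a cue number, then one or more newlines, then a timestamp line.
GAP = re.compile(r'^(\d+)\n+(?=\d\d:\d\d:\d\d,\d\d\d --> )', re.MULTILINE)


def close_gaps(text):
    """Return text with blank lines between a cue number and its timestamp removed."""
    return GAP.sub(r'\1\n', text)


def fix_tree(folder, suffixes=('.txt', '.srt'), backup=True, encoding='utf-8'):
    """Rewrite matching files under folder in place; return the paths that changed.

    encoding='utf-8' is an assumption; pass the real encoding of your subtitles.
    """
    changed = []
    for path in sorted(Path(folder).rglob('*')):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        old = path.read_text(encoding=encoding)
        new = close_gaps(old)
        if new == old:
            continue  # nothing to fix, leave the file untouched
        if backup:
            # Hypothetical naming scheme: movie.srt -> movie.srt.bak
            shutil.copy2(path, path.with_name(path.name + '.bak'))
        path.write_text(new, encoding=encoding)
        changed.append(path)
    return changed
```

You would call it as fix_tree(r'C:\Users\User\Desktop\Folder'). Running it a second time changes nothing, since files that are already in the target form are skipped.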