python regex: Parsing file name

Question

I have a text file (filename.txt) that contains file names together with their file extensions.

filename.txt

    [AW] One Piece - 629 [1080P][Dub].mkv
    EP.585.1080p.mp4
    EP609.m4v
    EP 610.m4v
    One Piece 0696 A Tearful Reunion! Rebecca and Kyros!.mp4
    One_Piece_0745_Sons'_Cups!.mp4
    One Piece - 591 (1080P Funi Web-Dl -Ks-)-1.m4v
    One Piece - 621 1080P.mkv
    One_Piece_S10E577_Zs_Ambition_A_Great_and_Desperate_Escape_Plan.mp4

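These are sample file names with their extensions. I need to rename each file to its episode number (without changing its extension).

Example: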

Input:
    EP609.m4v
    EP 610.m4v
    EP.585.1080p.mp4
    One Piece - 621 1080P.mkv
    [AW] One Piece - 629 [1080P][Dub].mkv 
    One_Piece_0745_Sons'_Cups!.mp4
    One Piece 0696 A Tearful Reunion! Rebecca and Kyros!.mp4
    One Piece - 591 (1080P Funi Web-Dl -Ks-)-1.m4v
    One_Piece_S10E577_Zs_Ambition_A_Great_and_Desperate_Escape_Plan.mp4

Expected Output:
    609.m4v
    610.m4v
    585.mp4
    621.mkv
    629.mkv
    745.mp4 (or) 0745.mp4
    696.mp4 (or) 0696.mp4
    591.m4v
    577.mp4

I hope someone can help me parse and rename these file names. Thanks in advance!

Answer 1

Since you tagged this python, I assume you are willing to use Python.

(Edit: I realized that the loop in my original code was unnecessary.)

import re

with open('filename.txt', 'r') as f:
    files = f.read().splitlines() # read filenames

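# assume: an episode number consists of 3 digits, possibly preceded by a 0
p = re.compile(r'0?(\d{3})')
for file in files:
    if m := p.search(file):  # assignment expression, needs Python 3.8+
        # keep the 3 digits and the text after the last dot (the extension)
        print(m.group(1) + '.' + file.split('.')[-1])
    else:
        print(file)  # no episode number found: print the name unchanged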

This outputs:

609.m4v
610.m4v
585.mp4
621.mkv
629.mkv
745.mp4
696.mp4
591.m4v
577.mp4

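Basically, it searches each name for the first run of three digits, optionally preceded by a 0, and keeps those three digits as the episode number.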
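One caveat with that pattern: it takes the first three consecutive digits it finds, so a name in which a resolution such as 1080 comes before the episode number would yield 108. The following is a minimal sketch, not part of the original answer, of a stricter variant that only accepts a three-digit run with no further digit after it and nothing but an optional leading 0 before it. The helper name `episode_number` is made up for illustration, and the sketch still assumes episode numbers below 1000.

```python
import re

# Hypothetical stricter pattern: a run of exactly three digits, optionally
# preceded by a single 0, with no other digit directly before or after it.
EPISODE = re.compile(r'(?<!\d)0?(\d{3})(?!\d)')


def episode_number(name):
    """Return the three-digit episode number found in name, or None."""
    m = EPISODE.search(name)
    return m.group(1) if m else None


if __name__ == '__main__':
    samples = [
        'EP.585.1080p.mp4',
        "One_Piece_0745_Sons'_Cups!.mp4",
        '1080p EP 610.m4v',  # resolution comes first in this made-up name
    ]
    for name in samples:
        print(name, '->', episode_number(name))
```

The lookarounds `(?<!\d)` and `(?!\d)` are what keep 1080 from being read as 108; the rest of the pattern is unchanged.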

I strongly recommend that you check the output; in particular, you may want to run sort OUTPUTFILENAME | uniq -d to see whether any target names are duplicated.
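The same duplicate check can be done in Python if that is more convenient. This is a minimal sketch using `collections.Counter`; the function name `duplicate_targets` is hypothetical, and it assumes you have collected the new names in a list rather than printing them.

```python
from collections import Counter


def duplicate_targets(new_names):
    """Return, in sorted order, every target name that appears more than once."""
    counts = Counter(new_names)
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == '__main__':
    # Two different source files that would both be renamed to 609.m4v.
    print(duplicate_targets(['609.m4v', '610.m4v', '609.m4v']))
```

An empty result means every file would get a distinct new name.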

(Original answer:)

p = re.compile(r'\d{3,4}')

for file in files:
    for m in p.finditer(file):
        ep = m.group(0)
        if int(ep) < 1000:
            print(ep.lstrip('0') + '.' + file.split('.')[-1])
            break # go to next file if ep found (avoid the else clause)
    else: # if ep not found, just print the filename as is
        print(file)

Answer 2

A program that parses the episode number from each file name and renames the file.

Modules used:

re - to parse the file names
os - to rename the files

full/path/to/folder is the path to the folder that contains your files.

import re
import os

folder = "full/path/to/folder/"

for file in os.listdir(path=folder):
    # Find the first 3- or 4-digit number below 1000 in each file name.
    for match_obj in re.finditer(r'\d{3,4}', file):
        episode = match_obj.group(0)
        if int(episode) < 1000:
            new_filename = episode.lstrip('0') + '.' + file.split('.')[-1]
            old_name = os.path.join(folder, file)
            new_name = os.path.join(folder, new_filename)
            os.rename(old_name, new_name)
            # Episode found: go to the next file (this skips the else clause).
            break
    else:
        # No episode number found: leave the file name as it is.
        pass
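Because `os.rename` silently replaces an existing file on Unix, it may be worth previewing the renames before running them. Below is a rough sketch under the same assumption that episode numbers are below 1000; the names `plan_renames` and `apply_renames` are invented for this example. It builds a mapping from old name to new name, leaves out any target that would collide with another file, and only renames when asked to.

```python
import os
import re


def plan_renames(names):
    """Map each old name to '<episode><ext>', leaving out colliding targets."""
    plan = {}
    taken = set(names)  # names that already exist count as taken
    for name in sorted(names):
        stem, ext = os.path.splitext(name)  # ext keeps its leading dot
        for m in re.finditer(r'\d{3,4}', stem):
            if int(m.group(0)) < 1000:
                target = str(int(m.group(0))) + ext  # drops leading zeros
                if target not in taken:  # skip anything that would collide
                    plan[name] = target
                    taken.add(target)
                break  # only the first number below 1000 is considered
    return plan


def apply_renames(folder, dry_run=True):
    """Print the planned renames in folder; perform them if dry_run is False."""
    plan = plan_renames(os.listdir(folder))
    for old, new in plan.items():
        print(old, '->', new)
        if not dry_run:
            os.rename(os.path.join(folder, old), os.path.join(folder, new))
    return plan
```

Calling `apply_renames("full/path/to/folder/")` only prints the plan; pass `dry_run=False` once the output looks right. Files that are skipped because of a collision keep their original names, so they are easy to spot afterwards.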
