How to move and rename documents placed in several nested folders into a new single folder with Python?

I have several files spread across several folders, like this:

dir
├── 0
│   ├── 103425.xml
│   ├── 105340.xml
│   └── 109454.xml
├── 1247
│   └── doc.xml
├── 14568
│   └── doc.xml
├── 1659
│   └── doc.xml
├── 10450
│   └── doc.xml
└── 10351
    └── doc.xml

How can I extract all the documents into a single folder, prefixing the name of each moved document with the name of its folder?

new_dir
├── 0_103425.xml
├── 0_105340.xml
├── 0_109454.xml
├── 1247_doc.xml
├── 14568_doc.xml
├── 1659_doc.xml
├── 10450_doc.xml
└── 10351_doc.xml

I tried to extract them with the following:

import os

for path, subdirs, files in os.walk('../dir/'):
    for name in files:
        print(os.path.join(path, name))

Update

In addition, I tried:

import os, shutil
from glob import glob

files = []
start_dir = os.getcwd()
pattern   = "*.xml"

for dir,_,_ in os.walk('../dir/'):
    files.extend(glob(os.path.join(dir,pattern))) 
for f in files:
    print(f)
    shutil.move(f, '../dir/')

The above gave me the path of each file. However, I don't understand how to rename and move them:

---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
<ipython-input-50-229e4256f1f3> in <module>()
     10 for f in files:
     11     print(f)
---> 12     shutil.move(f, '../dir/')

/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/shutil.py in move(src, dst, copy_function)
    540         real_dst = os.path.join(dst, _basename(src))
    541         if os.path.exists(real_dst):
--> 542             raise Error("Destination path '%s' already exists" % real_dst)
    543     try:
    544         os.rename(src, real_dst)

Error: Destination path '../data/230948.xml' already exists

The error above shows why I want to rename each file with the name of its folder.

Does this work for you?

import os
import pathlib

OLD_DIR = 'files'
NEW_DIR = 'new_dir'

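# Path.rename does not create the target folder, so make sure it exists
os.makedirs(NEW_DIR, exist_ok=True)

p = pathlib.Path(OLD_DIR)
for f in p.glob('**/*.xml'):
    # prefix the file name with the name of its parent folder
    new_name = '{}_{}'.format(f.parent.name, f.name)
    f.rename(os.path.join(NEW_DIR, new_name))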
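
To try this safely before touching real data, here is a minimal sketch that rebuilds part of the layout from the question in a temporary directory and runs the same rename logic on it. The helper name flatten_into and the sample files are only for illustration, and the sketch assumes the target folder should be created if it is missing.

```python
import tempfile
from pathlib import Path


def flatten_into(old_dir, new_dir):
    """Move every *.xml under old_dir into new_dir as <parent>_<name>."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    new_dir.mkdir(parents=True, exist_ok=True)  # rename() will not create it
    # Build the full listing first so nothing is renamed mid-iteration.
    for f in list(old_dir.glob('**/*.xml')):
        f.rename(new_dir / '{}_{}'.format(f.parent.name, f.name))


# Throwaway copy of part of the question's layout (hypothetical sample data).
root = Path(tempfile.mkdtemp())
for rel in ('0/103425.xml', '0/105340.xml', '1247/doc.xml', '14568/doc.xml'):
    target = root / 'dir' / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text('<doc/>')

flatten_into(root / 'dir', root / 'new_dir')
moved = sorted(p.name for p in (root / 'new_dir').iterdir())
print(moved)
```

If the printed names look right, the same call can be pointed at the real folders.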

If you don't have a modern version of Python (3.5+), you can also use just glob, os, and shutil:

import os
import glob
import shutil


for f in glob.glob('files/**/*.xml'):
    new_name = '{}_{}'.format(os.path.basename(os.path.dirname(f)), os.path.basename(f))
    shutil.move(f, os.path.join('new_dir', new_name))
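
One caveat worth spelling out: without recursive=True, glob.glob treats ** like a plain *, so the pattern above only reaches files exactly one folder below files. That is enough for the layout in the question. On Python 3.5+, a sketch along these lines should reach deeper levels as well; the function name move_xml_recursive is made up for this example, and creating the target folder up front is an assumption about what is wanted.

```python
import glob
import os
import shutil


def move_xml_recursive(old_dir, new_dir):
    """Move *.xml files at any depth under old_dir into new_dir."""
    os.makedirs(new_dir, exist_ok=True)
    pattern = os.path.join(old_dir, '**', '*.xml')
    moved = []
    # recursive=True (Python 3.5+) lets '**' match zero or more directories.
    for f in glob.glob(pattern, recursive=True):
        parent = os.path.basename(os.path.dirname(f))
        new_name = '{}_{}'.format(parent, os.path.basename(f))
        new_path = os.path.join(new_dir, new_name)
        shutil.move(f, new_path)
        moved.append(new_path)
    return moved
```

Only the immediate parent folder ends up in the new name, so files that sit in identically named folders at different depths could still clash.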

Use Python 3's new pathlib module for path operations, and then shutil.move to move the files into their correct places. Unlike os.rename, shutil.move works like the mv command and behaves correctly even for moves across file systems.

This code works for paths nested to any level - any / or \ in the path is replaced with _ in the target file name, so dir/foo/bar/baz/xyzzy.xml is moved to new_dir/foo_bar_baz_xyzzy.xml.

from pathlib import Path
from shutil import move

src = Path('dir')
dst = Path('new_dir')

# create the target directory if it doesn't exist
if not dst.is_dir():
    dst.mkdir()

# go through each file
for i in src.glob('**/*'):
    # skip directories and alike
    if not i.is_file():
        continue

    # calculate path relative to `src`,
    # this will make dir/foo/bar into foo/bar
    p = i.relative_to(src)

    # replace path separators with underscore, so foo/bar becomes foo_bar
    target_file_name = str(p).replace('/', '_').replace('\\', '_')

    # then do rename/move. shutil.move will always do the right thing
    # note that it *doesn't* accept Path objects in Python 3.5, so we
    # use str(...) here. `dst` is a path object, and `target_file_name`
    # is the name of the file to be placed there; we can use the / operator
    # instead of os.path.join.
    move(str(i), str(dst / target_file_name))
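
One thing the flattening can run into: two different source paths may map to the same target name (for example dir/a_b/c.xml and dir/a/b_c.xml both become a_b_c.xml), and shutil.move may then replace the file that was moved first. A possible guard, sketched here as a hypothetical helper flatten_tree, is to work out all target names first and stop before moving anything if any of them clash or already exist:

```python
from pathlib import Path
from shutil import move


def flatten_tree(src, dst):
    """Move all files under src into dst, joining path parts with '_'.

    Raises FileExistsError before moving anything if two files would end
    up with the same name, or if a target already exists in dst.
    """
    src, dst = Path(src), Path(dst)
    plan = {}       # source path -> target path
    taken = set()   # target paths already claimed
    for f in sorted(src.glob('**/*')):
        if not f.is_file():
            continue
        # parts of the path relative to src, e.g. ('foo', 'bar', 'x.xml')
        target = dst / '_'.join(f.relative_to(src).parts)
        if target in taken or target.exists():
            raise FileExistsError('name clash for {}'.format(target))
        taken.add(target)
        plan[f] = target

    dst.mkdir(parents=True, exist_ok=True)
    for f, target in plan.items():
        # str(...) keeps this usable on Python 3.5 as well
        move(str(f), str(target))
    return sorted(t.name for t in plan.values())
```

Joining relative_to(src).parts with '_' gives the same names as replacing the separators by hand, without having to care which separator the platform uses.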