Renaming folders and files while walking them with os.walk() misses some files after the directory name changes
I have a folder structure like this:
Template
- Template1
- Template2
TemplateTest
- TemplateTest1
Config
- TemplateConfig
I want to replace 'Template' with 'MyApp' in every file name and folder name.
Here is my code:
    for root, dirs, files in os.walk(path):
        for name in files:
            if name.startswith("Template"):
                replace = name.replace("Template", 'MyApp')
                os.rename(os.path.join(root, name), os.path.join(root, replace))
        for name in dirs:
            if name.startswith("Template"):
                replace = name.replace("Template", 'MyApp')
                os.rename(os.path.join(root, name), os.path.join(root, replace))
Strangely, this only renames the folders, plus the files whose parent folder did not need renaming. Like this:
MyApp
- Template1
- Template2
MyAppTest
- TemplateTest1
Config
- MyAppConfig
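But if I run this code twice, it does rename the files.
I would like to know why this happens, and how to change the code so that it renames everything I need.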
When in doubt, print it.
Create the data structure:
    import os

    for d in ["./Template", "./TemplateTest", "./Config"]:
        os.mkdir(d)
    for path in ["./Template/Template1.txt", "./Template/Template2.txt",
                 "./TemplateTest/TemplateTest1.txt", "./Config/TemplateConfig.txt"]:
        # Use a separate name for the file handle so the loop variable is not shadowed.
        with open(path, "w") as fh:
            fh.write(" ")
Test os.walk:
    for root, dirs, files in os.walk("./"):  # topdown defaults to True
        for name in files:
            if name.startswith("Template"):
                replace = name.replace("Template", 'MyApp')
                print("renaming: ", os.path.join(root, name), " to ", os.path.join(root, replace))
                # os.rename(os.path.join(root, name), os.path.join(root, replace))
        for name in dirs:
            if name.startswith("Template"):
                replace = name.replace("Template", 'MyApp')
                print("renaming: ", os.path.join(root, name), " to ", os.path.join(root, replace))
                # os.rename(os.path.join(root, name), os.path.join(root, replace))
If you comment out the inner for loops and only print(root, dirs, files), the output is:
./ ['Config', 'Template', 'TemplateTest'] ['main.py']
./Config [] ['TemplateConfig.txt']
./Template [] ['Template1.txt', 'Template2.txt']
./TemplateTest [] ['TemplateTest1.txt']
If you restore the for loops and replace the renames with print calls, you get:
renaming: ./Template to ./MyApp # aha - works
renaming: ./TemplateTest to ./MyAppTest # aha - works
renaming: ./Config/TemplateConfig.txt to ./Config/MyAppConfig.txt # works
renaming: ./Template/Template1.txt to ./Template/MyApp1.txt # folder not updated
renaming: ./Template/Template2.txt to ./Template/MyApp2.txt # folder also not updated
renaming: ./TemplateTest/TemplateTest1.txt to ./TemplateTest/MyAppTest1.txt # also not updated
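If you look at the documentation, it probably says that changes made while iterating over the results of os.walk() are not reflected in the data it yields.
You are basically "changing an iterable while iterating it" ;o)
From the linked docs: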
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again. Modifying dirnames when topdown is False has no effect on the behavior of the walk, because in bottom-up mode the directories in dirnames are generated before dirpath itself is generated.
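The quoted passage suggests one possible fix: keep the default top-down walk, but write each new directory name back into dirs so that walk() descends into the renamed folder. The following is only a sketch under that reading; the function name rename_top_down and its old/new parameters are made up for illustration and are not part of the original code.

```python
import os


def rename_top_down(top, old="Template", new="MyApp"):
    """Rename entries starting with `old` while walking top-down (hypothetical helper)."""
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        for name in files:
            if name.startswith(old):
                os.rename(os.path.join(root, name),
                          os.path.join(root, name.replace(old, new)))
        for i, name in enumerate(dirs):
            if name.startswith(old):
                renamed = name.replace(old, new)
                os.rename(os.path.join(root, name), os.path.join(root, renamed))
                # Update the list in place so walk() recurses into the new name.
                dirs[i] = renamed
```

Calling rename_top_down("./") on the test structure above should then rename the files inside Template and TemplateTest in the same pass, because walk() reads dirs again only after the loop body has finished.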
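(Note that the call signature of os.walk is:
os.walk = walk(top, topdown=True, onerror=None, followlinks=False)
so you are in effect passing True, None, and False.)
The problem has to do with the order in which os.walk traverses directories and files, and with which directories and files it traverses.
In particular, it starts by reading the directory at path. That yields the following: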
['Template', 'TemplateTest', 'Config']
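All of these are directories, so the list of subdirectories it will walk next is the same list, and there are no files. This is returned as three values in the first iteration: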
path
['Template', 'TemplateTest', 'Config']
[]
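Then your own code runs. It calls os.rename on Template, which is now named MyApp, and on TemplateTest, which is now named MyAppTest.
Next, the os.walk code tries to read the subdirectory Template. That fails, so nothing happens (onerror is None).
Next, the os.walk code tries to read the subdirectory TemplateTest. That fails too, so again nothing happens.
Finally, the os.walk code tries to read the subdirectory Config. That succeeds, and all is well.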
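One way to observe these silent failures, rather than infer them, is to pass an onerror callback, which os.walk calls with the OSError it would otherwise swallow. This is a small sketch only; walk_with_errors is a hypothetical name, and it deliberately repeats the mistake of renaming directories without updating dirs.

```python
import os


def walk_with_errors(top, old="Template", new="MyApp"):
    """Rename directories top-down WITHOUT updating dirs; return the errors walk() hit."""
    errors = []
    # onerror receives each OSError raised while walk() tries to list a directory.
    for root, dirs, files in os.walk(top, onerror=errors.append):
        for name in dirs:
            if name.startswith(old):
                os.rename(os.path.join(root, name),
                          os.path.join(root, name.replace(old, new)))
    return errors
```

On the example structure, the returned list should hold one error for Template and one for TemplateTest, each carrying the stale path in its filename attribute.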
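There are two different solutions: you can set topdown to False, or you can update the list named dirs so that os.walk knows the new directory names. (Edit: I was not sure topdown=False would fix it; that needed testing.)
(Edit: topdown=False really does fix it. This is described in the documentation: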
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again. Modifying dirnames when topdown is False has no effect on the behavior of the walk, because in bottom-up mode the directories in dirnames are generated before dirpath itself is generated.
)
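The topdown=False option could look roughly like the sketch below. The name rename_bottom_up and the old/new parameters are assumptions for illustration. Because a bottom-up walk yields each directory only after all of its subdirectories, every path is still valid at the moment its contents are renamed.

```python
import os


def rename_bottom_up(top, old="Template", new="MyApp"):
    """Rename entries starting with `old`, deepest directories first (hypothetical helper)."""
    for root, dirs, files in os.walk(top, topdown=False):
        # By the time `root` is yielded, walk() has finished with everything
        # below it, so renaming its files and subdirectories is safe.
        for name in files + dirs:
            if name.startswith(old):
                os.rename(os.path.join(root, name),
                          os.path.join(root, name.replace(old, new)))
```

Note that neither this sketch nor the original code renames the top directory itself, only what lies beneath it.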