Modifying a file in-place inside nested for loops
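I am iterating over directories and the files inside them, modifying each file in place as I go. I would like to read the newly modified file right afterwards.
Here is my code with descriptive comments: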
# go through each directory based on their ids
for id in id_list:
    id_dir = os.path.join(ouput_dir, id)
    os.chdir(id_dir)
    # go through all files (with a specific extension)
    for filename in glob('*' + ext):
        # modify the file by replacing all new-line characters with an empty space
        with fileinput.FileInput(filename, inplace=True) as f:
            for line in f:
                print(line.replace('\n', ' '), end='')
        # here I would like to read the NEW modified file
        with open(filename) as newf:
            content = newf.read()
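As it stands, newf is not the newly modified file but the original f. I think I understand why that happens, but I am finding it hard to get around.
I could always do two separate iterations (go through each directory by ID, go through all files with the specific extension and modify them, then repeat the iteration to read each one), but I was hoping there is a more efficient way. Perhaps it would be possible to restart the second for loop after the modification has taken place and then do the read (so as to at least avoid repeating the outer for loop).
Any ideas/designs for achieving the above in a clean and efficient way?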
I'm not saying the way you are doing this is incorrect, but I feel you are overcomplicating it. Here is my super simple solution.
from glob import glob

for filename in glob('*' + ext):
    # Instead of trying to modify in place, read the data in and strip the raw values.
    with open(filename, 'rb') as f:
        lines = [x.rstrip() for x in f]  # a list, so it is still usable after writing
    with open(filename, 'wb') as f_out:  # then write the data back out
        # Extra modification of the data can go here; I just remove the trailing \r and \n.
        for line in lines:
            f_out.write(line)
    # No need to read the data back in: `lines` already holds it.
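If you want the exact replacement from the question (each newline becomes a space) rather than stripping, the same read-then-write idea can be sketched in text mode. This is only a sketch: `replace_newlines` is a name made up for illustration, and it assumes each file is small enough to hold in memory.

```python
from pathlib import Path


def replace_newlines(path):
    """Rewrite the file at `path` with newlines replaced by spaces; return the new text."""
    p = Path(path)
    # Read the whole file and do the replacement in memory.
    content = p.read_text().replace('\n', ' ')
    # Write the result back over the original file.
    p.write_text(content)
    # The caller already has the new content, so no second read is needed.
    return content
```

Inside the inner loop this would be called as content = replace_newlines(filename).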
For me, it works with this code:
#!/usr/bin/env python3
import os
from glob import glob
import fileinput

id_list = ['1']
ouput_dir = '.'
ext = '.txt'

# go through each directory based on their ids
for id in id_list:
    id_dir = os.path.join(ouput_dir, id)
    os.chdir(id_dir)
    # go through all files (with a specific extension)
    for filename in glob('*' + ext):
        # modify the file by replacing all new-line characters with an empty space
        for line in fileinput.FileInput(filename, inplace=True):
            print(line.replace('\n', ' '), end="")
        # here I would like to read the NEW modified file
        with open(filename) as newf:
            content = newf.read()
        print(content)
Note how I iterate over the lines!
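To make the point concrete: as far as I know, fileinput only finishes the in-place rewrite once the FileInput object is closed, which happens when the for loop runs to the end or when a with block exits. So the read has to come after that, not inside it. A minimal sketch, where `flatten_in_place` is a hypothetical helper name and not something from the question:

```python
import fileinput


def flatten_in_place(filename):
    """Replace newlines with spaces in place, then return the modified content."""
    with fileinput.FileInput(filename, inplace=True) as f:
        for line in f:
            # While the block is active, print() writes into the file being rewritten.
            print(line.replace('\n', ' '), end='')
    # The with block has exited: the output file is closed and stdout is restored,
    # so opening the file now gives the modified content.
    with open(filename) as newf:
        return newf.read()
```

In the nested loops this would replace the body of the inner loop: content = flatten_in_place(filename).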