Combine output of multiple os.walk runs
I have multiple directories, dirs = [dir1, dir2, ...].
They are structured as follows:
dir1
    subdir1
        folder1
        file1
        file2
    subdir2
dir2
    subdir1
        folder2
            file3
        file4
    subdir2
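Note that the subdirectory names are identical: dir1 and dir2 contain subdirectories with the same names. What I need is to print an HTML table that combines the files and folders from dir1 and dir2, like this: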
subdir1
    folder1
    folder2
        file3
    file1
    file2
    file4
subdir2
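One more thing: I need to know the path of every file and folder so that I can link to it.
So far, I have used os.walk to build the tree for dir1 and created an HTML table from it, with each row stored in a list. Then I run os.walk on every other directory and, for each directory it visits, iterate over that list until the basename matches, then insert the files and folders. But this is very slow. I am sure there is a very clever five-line solution that achieves the same thing.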
def get_table(self, teams=['test1', 'test2']):
    paths = []
    table = []
    for team in teams:
        paths.append(config.basepath + '/' + team)
    for path in paths:
        if not table:
            for root, dirs, files in os.walk(path):
                dirs = sorted(dirs)
                files = sorted(files)
                team = self.get_team(path)  # extracts the 'dir' from path
                level = root.replace(path, '').count(os.sep)
                indent = ' ' * 4 * (level)
                subindent = ' ' * 4 * (level + 1)
                table.append('{0}<tr class="{2}"><td>{1}</td><td>{2}</td></tr>'.format(indent, os.path.basename(root), team))
                for f in files:
                    table.append('{0}<tr class="{2}"><td>{1}</td><td>{2}</td></tr>'.format(subindent, f, team))
        else:
            for root, dirs, files in os.walk(path):
                dirs = sorted(dirs)
                files = sorted(files)
                team = self.get_team(path)
                level = root.replace(path, '').count(os.sep)
                indent = ' ' * 4 * (level)
                subindent = ' ' * 4 * (level + 1)
                for idx, line in enumerate(table):
                    if os.path.basename(root) in line:
                        for f in files:
                            table.insert(idx+1, '{0}<tr class="{2}"><td>{1}</td><td>{2}</td></tr>'.format(subindent, f, team))
I have this running now, but I am sure there is a better solution:
for path in paths:
    for root, dirs, files in os.walk(path):
        dirs = sorted(dirs)
        files = sorted(files)
        team = self.get_team(path)
        level = root.replace(path, '').count(os.sep)
        indent = ' ' * 4 * (level)
        subindent = ' ' * 4 * (level + 1)
        basename = os.path.basename(root)
        if firstrun:
            table.append('{0}<tr class="{2}"><td>{1}</td><td>{2}</td></tr>'.format(indent, basename, team))
            coretasks[basename] = len(table) - 1
            for f in files:
                table.append('{0}<tr class="{2}"><td>{1}</td><td>{2}</td></tr>'.format(subindent, f, team))
        else:
            parsed_folders = []
            if basename in coretasks:
                inserted_files = 0
                for f in files:
                    table.insert(coretasks[basename] + 1, '{0}<tr class="{2}"><td>{1}</td><td>{2}</td></tr>'.format(subindent, f, team))
                    inserted_files += 1
                parsed_folders.append(basename)
                for coretask in coretasks.keys():
                    if coretask not in parsed_folders:
                        coretasks[coretask] += inserted_files
    firstrun = False
print('\n'.join(table))
I would separate the extraction part from the display part.
Extraction:
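import os

def process(path, d={}):
    # NOTE: the mutable default dict is shared between top-level calls;
    # successive calls therefore accumulate into the same tree.
    print('initial', d)  # debug output
    for i in os.scandir(path):
        if i.is_file():
            if i.name in d:
                raise Exception(i.path + " already present")
            d[i.name] = None  # files are marked with None
        elif i.is_dir():
            if i.name not in d:
                d[i.name] = {}  # directories hold a nested dict
            process(i.path, d[i.name])
    print('final', d)  # debug output
    return d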
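Because the question also asks for the path of each entry, a variant that takes the roots explicitly and records where every file came from may be easier to reason about than relying on the shared default argument. The following is only a sketch under assumptions: `merge_trees` and the "file name maps to a list of paths" layout are hypothetical choices, not part of the answer above.

```python
import os


def merge_trees(roots):
    """Merge several directory trees into one nested dict.

    Directories map to a dict of their children; files map to a list of
    full paths, one per root that contains a file with that name.
    """
    merged = {}

    def visit(path, node):
        for entry in os.scandir(path):
            if entry.is_dir():
                child = node.setdefault(entry.name, {})
                if not isinstance(child, dict):
                    raise ValueError(entry.path + " is a file in another root")
                visit(entry.path, child)
            else:
                paths = node.setdefault(entry.name, [])
                if not isinstance(paths, list):
                    raise ValueError(entry.path + " is a directory in another root")
                paths.append(entry.path)

    for root in roots:
        visit(root, merged)
    return merged
```

Each directory is scanned exactly once and every lookup is a dict access, so no list of rows has to be searched or shifted while merging. Directory paths are not stored in this sketch; they could be kept the same way if links to folders are needed.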
Display:
# A minimal display function consistent with the sample output below
# (a reconstruction, assuming files are stored as None and directories as dicts).
def display(d, indent=0):
    for name in sorted(d):
        print(' ' * indent + name)
        if d[name] is not None:
            display(d[name], indent + 4)
With the structure you describe, it gives:
>>> process('dir1')
>>> d = process('dir2')
>>> print(d)
{'subdir1': {'folder2': {'file3': None}, 'file1': None, 'file2': None, 'folder1': {}, 'file4': None}, 'subdir2': {}}
>>> display(d)
subdir1
    file1
    file2
    file4
    folder1
    folder2
        file3
subdir2
That way, you only need to change the display part to output HTML...
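As an illustration of what such an HTML display step might look like, here is a minimal sketch that walks a nested dict of the shape printed above (None for a file, a dict for a directory) and emits one table row per entry. The names `render_rows` and `render_table`, the row markup, and the use of a relative path as the link target are all assumptions made for the example; this structure does not record which of the original directories an entry came from.

```python
from html import escape


def render_rows(tree, parents=()):
    """Yield one <tr> per entry of a nested dict (None marks a file)."""
    # Directories first, then files, each group sorted by name,
    # which matches the ordering asked for in the question.
    for name in sorted(tree, key=lambda n: (tree[n] is None, n)):
        rel = "/".join(parents + (name,))
        pad = "&nbsp;" * 4 * len(parents)  # visual indentation per level
        yield '<tr><td>{0}<a href="{1}">{2}</a></td></tr>'.format(
            pad, escape(rel, quote=True), escape(name))
        if tree[name] is not None:
            yield from render_rows(tree[name], parents + (name,))


def render_table(tree):
    """Wrap the rows in a <table> element and return it as one string."""
    return "<table>\n" + "\n".join(render_rows(tree)) + "\n</table>"
```

Calling `print(render_table(d))` with the merged dict would print the whole table. Since the rows are generated in a single pass over the dict, nothing has to be inserted into the middle of a list.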