Check parallelism of files using their suffixes
Given a directory of files such as:
mydir/
test1.abc
set123.abc
jaja98.abc
test1.xyz
set123.xyz
jaja98.xyz
I need to check that every .abc file has an equivalent .xyz file. I could do it like this:
>>> import os
>>> filenames = ['test1.abc', 'set123.abc', 'jaja98.abc', 'test1.xyz', 'set123.xyz', 'jaja98.xyz']
>>> suffixes = ('.abc', '.xyz')
>>> assert all(os.path.splitext(_filename)[0]+suffixes[1] in filenames for _filename in filenames if _filename.endswith(suffixes[0]))
The code above should pass the assertion, while something like this fails:
>>> filenames = ['test1.abc', 'set123.abc', 'jaja98.abc', 'test1.xyz', 'set123.xyz']
>>> suffixes = ('.abc', '.xyz')
>>> assert all(os.path.splitext(_filename)[0]+suffixes[1] in filenames for _filename in filenames if _filename.endswith(suffixes[0]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
But that is a bit too verbose.
Is there a better way to do the same check?
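You can define a helper function that returns the set of filenames, stripped of their extension, that match a given suffix. Then you can easily check whether the files with suffix .abc are a subset of the files with suffix .xyz: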
filenames = ['test1.abc', 'set123.abc', 'jaja98.abc', 'test1.xyz', 'set123.xyz', 'jaja98.xyz']
filenames2 = ['test1.abc', 'set123.abc', 'jaja98.abc', 'test1.xyz', 'set123.xyz']
suffixes = ('.abc', '.xyz')

def filter_ext(names, ext):
    return {n[:-len(ext)] for n in names if n.endswith(ext)}

assert filter_ext(filenames, suffixes[0]) <= filter_ext(filenames, suffixes[1])
assert filter_ext(filenames2, suffixes[0]) <= filter_ext(filenames2, suffixes[1]) # fails
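The approach above is also more efficient: building the two sets and comparing them takes O(n) time on average, whereas the original does a linear list lookup for every .abc file, which is O(n^2). Of course, if the list is small, this doesn't matter.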
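If it helps to know which files lack a counterpart, rather than only that the check failed, the same set idea can be turned into a set difference. The following is a minimal sketch, not part of the original answer; the function name missing_counterparts and its parameters are made up for illustration.

```python
def missing_counterparts(names, src_ext, dst_ext):
    """Return the sorted stems that have src_ext but no matching dst_ext.

    Hypothetical helper for illustration; it reuses the set-building idea
    from filter_ext above.
    """
    names = list(names)  # allow generators, since names is scanned twice

    def stems(ext):
        # Set of filenames carrying the given suffix, with the suffix removed.
        return {n[:-len(ext)] for n in names if n.endswith(ext)}

    # Set difference: stems present for src_ext but absent for dst_ext.
    return sorted(stems(src_ext) - stems(dst_ext))


filenames = ['test1.abc', 'set123.abc', 'jaja98.abc', 'test1.xyz', 'set123.xyz', 'jaja98.xyz']
filenames2 = ['test1.abc', 'set123.abc', 'jaja98.abc', 'test1.xyz', 'set123.xyz']

print(missing_counterparts(filenames, '.abc', '.xyz'))   # []
print(missing_counterparts(filenames2, '.abc', '.xyz'))  # ['jaja98']
```

For a real directory, the result of os.listdir('mydir') could presumably be passed as names; an empty result means every .abc file has its .xyz partner.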