Glob pattern to match all files that do not start with a prefix

Question

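I want to use a glob pattern to match all files in the src/ folder that are not prefixed with two uppercase letters followed by a dot.

The glob should match the following files: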

foo.txt
foo-bar.txt
foo.bar.baz.txt
fo.txt

But it should not match the following files:

AB.foo.txt
AB.foo-bar.txt
XY.foo.bar.baz.txt
FO.fo.txt

The prefix is always two uppercase letters (A to Z) followed by a dot.

Answer 1

How about the following (using listdir)?

from os import listdir

# The len() check avoids an IndexError on names shorter than three characters.
file_list = [f for f in listdir('./src') if not (len(f) >= 3 and f[0].isupper() and f[1].isupper() and f[2] == '.')]

Answer 2

glob() can mostly do what you need, but with some limitations.

You can do this:

glob.glob("src/[!A-Z][!A-Z][!.]*")

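This excludes any file that starts with two uppercase letters followed by a dot. However, this particular pattern also excludes any file whose name is shorter than three characters. Because each bracket is tested on its own, it also rejects any name with an uppercase letter in either of the first two positions or a dot in the third position (fo.txt, for example). glob() follows shell filename globbing syntax, and in the shell this kind of task is usually done with find or grep.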
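To see what the pattern above actually accepts, it can be checked against the sample names from the question without touching the file system. This is only a sketch: it uses fnmatch.fnmatchcase, which applies the same bracket syntax to bare names, and the variable names are made up for illustration.

```python
from fnmatch import fnmatchcase

PATTERN = "[!A-Z][!A-Z][!.]*"  # file-name part of the glob shown above

# Sample names taken from the question.
wanted = ["foo.txt", "foo-bar.txt", "foo.bar.baz.txt", "fo.txt"]
unwanted = ["AB.foo.txt", "AB.foo-bar.txt", "XY.foo.bar.baz.txt", "FO.fo.txt"]

# fnmatchcase compares case-sensitively on every platform.
matched = [name for name in wanted + unwanted if fnmatchcase(name, PATTERN)]
missed = [name for name in wanted if name not in matched]

print(matched)  # ['foo.txt', 'foo-bar.txt', 'foo.bar.baz.txt']
print(missed)   # ['fo.txt']
```

None of the prefixed files match, but fo.txt is lost as well because its third character is a dot.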

If glob() is not flexible enough, you have to list all the files yourself and do the pattern matching on top.
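A minimal sketch of that fallback, assuming a plain wildcard glob followed by a regular-expression filter is acceptable. The helper name files_without_prefix and the constant PREFIX are hypothetical, not part of any library.

```python
import glob
import os
import re

# Two uppercase ASCII letters followed by a dot; re.match anchors at the start.
PREFIX = re.compile(r"[A-Z]{2}\.")


def files_without_prefix(directory):
    """Return sorted paths in `directory` whose file name lacks the prefix.

    Hypothetical helper for illustration only.
    """
    # glob.escape keeps special characters in the directory name literal.
    paths = glob.glob(os.path.join(glob.escape(directory), "*"))
    return sorted(p for p in paths if not PREFIX.match(os.path.basename(p)))
```

Calling files_without_prefix("src") would then return the unprefixed entries of src/, including names shorter than three characters.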

Answer 3

This solution also uses listdir, but relies on a regular expression to select what we need.

from os import listdir
import re

# Keep every file that lacks the two-capital-letter prefix. Non-text files in the directory are kept too.
file_list = [f for f in listdir('./src') if not re.match(r'[A-Z][A-Z]\.', f)]

# Same filter, but keep only files ending in .txt.
file_list = [f for f in listdir('./src') if not re.match(r'[A-Z][A-Z]\.', f) and f.endswith('.txt')]

Answer 4

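If you look at glob.py, you will see that it uses fnmatch.filter to filter paths. fnmatch.filter uses fnmatch.translate to build a regular expression from the pattern. So you can use glob.glob("[!A-Z][!A-Z]*"), which is translated into the following regular expression: '(?s:[^A-Z][^A-Z].*)\Z'.

Note that this ignores everything with an uppercase letter in either of the first two positions. The translate function is defined as follows: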

def translate(pat):
    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

So I believe there is no way to express a more complex regular expression in a single pattern.
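One possible workaround, offered only as a sketch, is to take the union of several simple patterns instead of looking for one complex pattern. The list PATTERNS and the helper lacks_prefix below are hypothetical names. The patterns split the names by where they first deviate from the "two uppercase letters, then a dot" prefix.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern list: a name lacks the prefix if it matches any of these.
PATTERNS = [
    "[!A-Z]*",          # first character is not an uppercase letter
    "[A-Z]",            # a single uppercase letter and nothing else
    "[A-Z][!A-Z]*",     # second character is not an uppercase letter
    "[A-Z][A-Z]",       # exactly two uppercase letters and nothing else
    "[A-Z][A-Z][!.]*",  # third character is not a dot
]


def lacks_prefix(name):
    """Return True if `name` does not start with two capitals and a dot."""
    return any(fnmatchcase(name, pattern) for pattern in PATTERNS)


print(lacks_prefix("fo.txt"))      # True
print(lacks_prefix("FO.fo.txt"))   # False
```

The same patterns could be passed to glob.glob one at a time (each joined onto src/) and the results concatenated. Keep in mind that glob by default skips names beginning with a dot unless the pattern itself starts with one, so the two approaches can differ on hidden files.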

python

linux

glob