My code is confusing an input file name for a regular expression

My regex does not explicitly include a dash in a character range, but my code fails when the input file name looks like this:

Rage Against The Machine - 1996 - Bulls On Parade [Maxi-Single]

Here is my code:

def find_cue_files(path):
  found_files = []
  for root, dirs, files in os.walk(path):
    if files:
      fcue = glob(os.path.join(root, '*.[Cc][Uu][Ee]')) # this is line 81 in my source file (mentioned in the traceback)
      # do a few other things...
  return found_files

Clearly this part of the file name is the problem: [Maxi-Single]

How can I handle file names like this so that they are treated as fixed strings rather than as part of a regular expression?

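(This is not my main question, but in case it is relevant, I am open to trying a different approach to the case-insensitive search. I have looked at several Stack Overflow questions on the topic, but so far I have not found a solution that seems to fit this situation.)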

Here is my error:

Traceback (most recent call last):
  File "/usr/bin/xonsh", line 33, in <module>
    sys.exit(load_entry_point('xonsh==0.10.0', 'console_scripts', 'xonsh')())
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21336, in main
    _failback_to_other_shells(args, err)
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21283, in _failback_to_other_shells
    raise err
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21334, in main
    sys.exit(main_xonsh(args))
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 21388, in main_xonsh
    run_script_with_cache(
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 3285, in run_script_with_cache
    run_compiled_code(ccode, glb, loc, mode)
  File "/usr/lib/python3.9/site-packages/xonsh/__amalgam__.py", line 3190, in run_compiled_code
    func(code, glb, loc)
  File "process_audio_files.xsh", line 160, in <module>
    cue_files = find_cue_files(dest_path)
  File "process_audio_files.xsh", line 81, in find_cue_files
    fcue = glob(os.path.join(root, '*.[Cc][Uu][Ee]'))
  File "/usr/lib/python3.9/glob.py", line 22, in glob
    return list(iglob(pathname, recursive=recursive))
  File "/usr/lib/python3.9/glob.py", line 74, in _iglob
    for dirname in dirs:
  File "/usr/lib/python3.9/glob.py", line 75, in _iglob
    for name in glob_in_dir(dirname, basename, dironly):
  File "/usr/lib/python3.9/glob.py", line 86, in _glob1
    return fnmatch.filter(names, pattern)
  File "/usr/lib/python3.9/fnmatch.py", line 58, in filter
    match = _compile_pattern(pat)
  File "/usr/lib/python3.9/fnmatch.py", line 52, in _compile_pattern
    return re.compile(res).match
  File "/usr/lib/python3.9/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.9/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.9/sre_parse.py", line 834, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
  File "/usr/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.9/sre_parse.py", line 598, in _parse
    raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range i-S at position 70

Edit: I tried using re.escape, as referenced here: https://docs.python.org/3/library/re.html

def find_cue_files(path):
  found_files = []
  for root, dirs, files in os.walk(path):
    if files:
      root2 = re.escape(root)
      fcue = glob(os.path.join(root2, '*.[Cc][Uu][Ee]')) 
      # do a few other things...
  return found_files

This handled the earlier file name, but it now fails on the input file name Aerosmith - Aerosmith (2014) [24-96 HD], producing the same error at the same point in my modified code.

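glob treats square brackets anywhere in the pattern as a character class, including the ones that come from root, which is why [Maxi-Single] ends up parsed as the invalid range i-S. Rather than using glob with a tricky file pattern that gets passed through root, just filter the names that os.walk already gives you and then prepend root. One possible one-liner: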

fcue = [os.path.join(root, f) for f in files if f.lower().endswith('.cue')]
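
For completeness, here is a minimal sketch of how that one-liner might fit into the whole function, assuming the goal is simply to collect every .cue path under a directory tree. The second function, find_cue_files_with_glob, is a hypothetical name added for illustration; it keeps the original glob pattern but passes root through glob.escape (not re.escape, since glob patterns are not regular expressions), so that brackets in directory names are matched literally.

```python
import glob
import os


def find_cue_files(path):
    """Collect .cue files under path, ignoring the case of the extension."""
    found_files = []
    for root, dirs, files in os.walk(path):
        # os.walk already lists the file names, so plain string comparison
        # is enough; root is never interpreted as a pattern.
        found_files.extend(
            os.path.join(root, name)
            for name in files
            if name.lower().endswith('.cue')
        )
    return found_files


def find_cue_files_with_glob(path):
    """Hypothetical glob-based variant: escape root so '[', ']', '*' and '?' in it are literal."""
    found_files = []
    for root, dirs, files in os.walk(path):
        pattern = os.path.join(glob.escape(root), '*.[Cc][Uu][Ee]')
        # Unlike the filter above, glob skips names starting with a dot
        # and would also match a directory whose name ends in .cue.
        found_files.extend(glob.glob(pattern))
    return found_files
```

Either function can replace the original call, for example cue_files = find_cue_files(dest_path). The filter-based version is probably the simpler choice here, since it avoids pattern matching on directory names altogether.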