os.scandir gives [WinError 3] The system cannot find the path specified
I have a set of (large) XML files in which I want to search for a set of strings. I am trying to do this with the following Python code:
import collections

thestrings = []
with open('Strings.txt') as f:
    for line in f:
        text = line.strip()
        thestrings.append(text)
print('Searching for:')
print(thestrings)
print('Results:')

try:
    from os import scandir
except ImportError:
    from scandir import scandir

def scantree(path):
    """Recursively yield DirEntry objects for given directory."""
    for entry in scandir(path):
        if entry.is_dir(follow_symlinks=False) and (not entry.name.startswith('.')):
            yield from scantree(entry.path)
        else:
            yield entry

if __name__ == '__main__':
    for entry in scantree('//path/to/folder'):
        if ('.xml' in entry.name) and ('.zip' not in entry.name):
            with open(entry.path) as f:
                data = f.readline()
                if (thestrings[0] in data):
                    print('')
                    print('****** Schema found in: ', entry.name)
                    print('')
                    data = f.read()
                    if (thestrings[1] in data) and (thestrings[2] in data) and (thestrings[3] in data):
                        print('Hit at:', entry.path)
    print("Done!")
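Here Strings.txt is a file containing the strings I am interested in finding, and its first line is the schema URI.
This seems to run fine at first, but after a few seconds it gives me: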
FileNotFoundError: [WinError 3] The system cannot find the path specified: //some/path
This confuses me, since the path is being built at run time?
Note that if I instrument the code, changing this:
with open(entry.path) as f:
    data = f.readline()
    if (thestrings[0] in data):
to this:
with open(entry.path) as f:
    print(entry.name)
    data = f.readline()
    if (thestrings[0] in data):
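then I see that some candidate files are found before the error occurs.
I realised that my script was finding some very long UNC path names, which seem to be too long for Windows, so I now also check the path length before attempting to open the file, as follows: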
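Printing each name shows how far the scan got, but not which path actually failed. A minimal sketch of a variant of scantree that records directories it cannot list instead of aborting is given below; the name scantree_reporting and the errors list are hypothetical, introduced only for illustration, and it assumes Python 3.6+ (where os.scandir can be used as a context manager).

```python
import os

def scantree_reporting(path, errors):
    """Like scantree, but record directories that cannot be listed.

    `errors` is a caller-supplied list (hypothetical helper argument);
    each failure is appended as (path, len(path), exception).
    """
    try:
        with os.scandir(path) as it:
            entries = list(it)
    except OSError as exc:
        # FileNotFoundError ([WinError 3]) is a subclass of OSError.
        errors.append((path, len(path), exc))
        return
    for entry in entries:
        if entry.is_dir(follow_symlinks=False) and not entry.name.startswith('.'):
            yield from scantree_reporting(entry.path, errors)
        else:
            yield entry
```

After the scan, printing the recorded lengths should make it obvious whether the failing paths are unusually long.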
# Runs inside: for root, dirs, files in os.walk(...): for name in files:
if name.endswith('.xml'):
    fullpath = os.path.join(root, name)
    if (len(fullpath) > 255):  # Too long for Windows!
        print('File-extension-based candidate: ', fullpath)
    else:
        if os.path.isfile(fullpath):
            with open(fullpath) as f:
                data = f.readline()
                if (thestrings[0] in data):
                    print('Schema-based candidate: ', fullpath)
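For context, the fragment above depends on root and name coming from an os.walk loop. A self-contained sketch of how the pieces might fit together follows; the function name find_candidates, its parameters, and the errors='replace' argument are assumptions for illustration and are not part of the original script.

```python
import os

def find_candidates(top, schema_uri, max_len=255):
    """Walk `top` and return (too_long, matches) lists of .xml paths.

    too_long: paths longer than max_len, reported without being opened.
    matches:  files whose first line contains schema_uri.
    """
    too_long, matches = [], []
    for root, dirs, files in os.walk(top):
        for name in files:
            if not name.endswith('.xml'):
                continue
            fullpath = os.path.join(root, name)
            if len(fullpath) > max_len:
                too_long.append(fullpath)
            elif os.path.isfile(fullpath):
                # errors='replace' is an assumption, to tolerate odd encodings
                with open(fullpath, errors='replace') as f:
                    if schema_uri in f.readline():
                        matches.append(fullpath)
    return too_long, matches
```

Returning the two lists rather than printing keeps the search separate from the reporting, so the long paths can be reviewed afterwards.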
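Note that I also decided to check that each entry really is a file, and I changed my code to use os.walk, as shown above, in addition to simplifying the check for the .xml file extension by using .endswith().
Everything now seems to work fine...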
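The length check reports long paths without opening them. If those files need to be read as well, one option that may work is the Windows extended-length path syntax: an absolute path prefixed with \\?\ (or \\?\UNC\ for a network share) is exempt from the usual 260-character MAX_PATH limit. The sketch below only builds such a string; the helper name to_extended_length is hypothetical, it assumes the input is already an absolute path, and it uses ntpath so that the conversion itself behaves the same on any platform.

```python
import ntpath

EXT_PREFIX = '\\\\?\\'        # the literal text \\?\
UNC_PREFIX = '\\\\?\\UNC\\'   # the literal text \\?\UNC\

def to_extended_length(path):
    """Return the extended-length form of an absolute Windows path.

    Assumes `path` is absolute (drive-letter or UNC); relative paths
    are not valid in extended-length form.
    """
    if path.startswith(EXT_PREFIX):
        return path  # already in extended-length form
    # Extended-length paths are not normalised by Windows, so convert
    # forward slashes to backslashes and collapse '.' and '..' first.
    path = ntpath.normpath(path)
    if path.startswith('\\\\'):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return UNC_PREFIX + path[2:]
    return EXT_PREFIX + path
```

On Windows, the result could then be passed to open() in place of fullpath, for example open(to_extended_length(fullpath)). Whether this resolves the error for a particular share is untested here.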