FileNotFoundError when iterating over files in a directory
import os
import pandas as pd

FILES = os.listdir("/CADEC/original")
for file in FILES:
    if file.startswith("ARTHROTEC."):
        print(file)
ARTHROTEC.1.ann
ARTHROTEC.10.ann
ARTHROTEC.100.ann
ARTHROTEC.101.ann
ARTHROTEC.102.ann
ARTHROTEC.103.ann
ARTHROTEC.104.ann
ARTHROTEC.105.ann
ARTHROTEC.106.ann
ARTHROTEC.107.ann
ARTHROTEC.108.ann
ARTHROTEC.109.ann
ARTHROTEC.11.ann
ARTHROTEC.110.ann
ARTHROTEC.111.ann
ARTHROTEC.112.ann
ARTHROTEC.113.ann
ARTHROTEC.114.ann
ARTHROTEC.115.ann
...
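I want to extract data from all files in a directory whose names start with a particular prefix. As shown above, when I iterate over the directory and print each matching file name, I get a list of file names (strings). Also, data = pd.read_csv("/CADEC/original/ARTHROTEC.1.ann", sep='\t', header=None) works fine. However, running the code below only raises an error. Why can't the file be found, and what should I do to fix it?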
for file in FILES:
    if file.startswith("ARTHROTEC."):
        data = pd.read_csv(file, sep='\t', header=None)
FileNotFoundError: [Errno 2] File ARTHROTEC.1.ann does not exist: 'ARTHROTEC.1.ann'
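os.listdir returns only the names of the files in the directory, not their paths. pandas needs a path to the file (absolute, or relative to the current working directory), so a bare file name works only when the file is in the directory the code is run from.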
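One way to apply this while keeping os.listdir is to join each file name back onto its directory before reading it. The following is a minimal sketch; the helper name matching_paths is hypothetical and does not appear in the original question or answer.

```python
import os


def matching_paths(directory, prefix):
    """Return full paths of the files in `directory` whose names start with `prefix`."""
    return [
        os.path.join(directory, name)       # turn the bare name into a usable path
        for name in sorted(os.listdir(directory))
        if name.startswith(prefix)
    ]
```

Each returned path could then be passed to pd.read_csv(path, sep='\t', header=None), in the same way as the single-file call that already worked.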
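- It is better to learn the pathlib module, which treats paths as objects with methods instead of strings.
- .glob produces a generator of path objects that match the pattern.
- Python 3's pathlib Module: Taming the File System
pathlib may take some getting used to, but the attributes for extracting specific parts of a path, such as .suffix for the file extension or .stem for the file name without its extension, make it worthwhile.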
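To make the path parts concrete, here is a small sketch applied to one of the file names from the question. It uses PurePosixPath, which only manipulates the path text, so the file does not have to exist on the machine running it.

```python
from pathlib import PurePosixPath

# PurePosixPath never touches the file system; it just parses the string.
path = PurePosixPath("/CADEC/original/ARTHROTEC.1.ann")

name = path.name      # final path component
stem = path.stem      # file name without its last extension
suffix = path.suffix  # last extension, including the leading dot
parent = path.parent  # directory that contains the file

print(name, stem, suffix, parent)
```

Note that .stem strips only the last extension, so the numeric part of the name is kept.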
import pandas as pd
from pathlib import Path

# create the path object and collect the matching files with .glob
# (.glob returns a generator; list() lets the result be reused below)
files = list(Path('/CADEC/original').glob('ARTHROTEC*.ann'))

# create a list of dataframes, 1 dataframe for each file
df_list = [pd.read_csv(file, sep='\t', header=None) for file in files]

# alternatively, create a dict of dataframes with the filename as the key
df_dict = {file.stem: pd.read_csv(file, sep='\t', header=None) for file in files}
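Since .glob yields its matches lazily, the result can be consumed only once: a second loop over the same generator finds nothing. Wrapping the call in list(...) is one way around that. The sketch below illustrates the behaviour on a throwaway temporary directory; the helper name names_from_two_passes and the two file names are made up for the demonstration.

```python
import tempfile
from pathlib import Path


def names_from_two_passes(directory, pattern):
    """Iterate the same .glob result twice and return the names seen in each pass."""
    matches = Path(directory).glob(pattern)  # lazy iterator, not a list
    first_pass = sorted(p.name for p in matches)
    second_pass = sorted(p.name for p in matches)  # nothing left to yield
    return first_pass, second_pass


# Demonstrate on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("ARTHROTEC.1.ann", "ARTHROTEC.2.ann"):
        (Path(tmp) / name).touch()
    first, second = names_from_two_passes(tmp, "ARTHROTEC*.ann")
    print(first)
    print(second)
```

The first pass sees both files, while the second pass comes back empty because the generator was already exhausted.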
Example
Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] on win32
import os
from pathlib import Path
os.listdir('e:/PythonProjects/stack_overflow/t-files')
Out[2]:
['.ipynb_checkpoints',
'03900169.txt',
'142233.0.txt',
'153431.2.txt',
'17371271.txt',
'274301.5.txt',
'42010316.txt',
'429237.7.txt',
'570651.4.txt',
'65500027.txt',
'688599.3.txt',
'740103.5.txt',
'742537.6.txt',
'87505504.txt',
'90950222.txt',
't1.txt',
't2.txt',
't3.txt']
list(Path('e:/PythonProjects/stack_overflow/t-files').glob('*'))
Out[3]:
[WindowsPath('e:/PythonProjects/stack_overflow/t-files/.ipynb_checkpoints'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/03900169.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/142233.0.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/153431.2.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/17371271.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/274301.5.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/42010316.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/429237.7.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/570651.4.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/65500027.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/688599.3.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/740103.5.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/742537.6.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/87505504.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/90950222.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/t1.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/t2.txt'),
WindowsPath('e:/PythonProjects/stack_overflow/t-files/t3.txt')]