Is my error due to an absolute path issue?
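I am trying to create a variable that stores a folder named TimeSeries, which sits in the directory I am working in. After that, I try to read every file in TimeSeries. Apparently my error comes from df = pd.read_csv(f) receiving a relative path instead of an absolute one. However, I can't confirm this, because when I check isabs(direct) I get True. I know the error is on that particular line, I just don't know what it is.
Code: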
import pandas as pd
import numpy as np
import os

direct = os.path.abspath('TimeSeries')
for f in direct:
    df = pd.read_csv(f)
    df = df.replace(np.nan, 'Other', regex=True)
    if df.columns[0] == ['FIPS']:
        print(df.columns)
        df = df.drop(['FIPS', 'Last_Update', 'Lat', 'Long_'], axis=1)
        df = df.rename(columns={'Admin2': 'County',
                                'Province_State': 'State',
                                'Country_Region': 'Country',
                                'Combined_Key': 'City'})
        df.to_csv(f)
    elif df.columns[0] == ['Province/State']:
        print(df.columns)
        df = df.drop(['Last Update'], axis=1)
        df = df.rename(columns={'Province/State': 'State',
                                'Country/Region': 'Country'})
        df.to_csv(f)
    else:
        pass
Result:
Traceback (most recent call last):
  File "C:/Users/USER/PycharmProjects/Corona Stats/Corona.py", line 9, in <module>
    df = pd.read_csv(f)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 448, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 880, in __init__
    self._make_engine(self.engine)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 1114, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 1891, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File C does not exist: 'C'
Process finished with exit code 1
This is what I get when I print direct:
C:\Users\USER\PycharmProjects\Corona Stats\TimeSeries
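When you call read_csv (pd.read_csv) with a relative path, pandas looks in the current working directory, which by default is the directory the Python process was started from. So you need to use the os module's chdir() to switch to the right directory and read the file from there.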
import pandas as pd
import os
print(os.getcwd())
os.chdir("<PATH TO DIRECTORY>")
print(os.getcwd())
df = pd.read_csv('<The Filename You want to read>')
print(df.head())
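As a small illustration of how relative paths get resolved (a sketch, not taken from the question), the snippet below compares os.path.abspath with the current working directory. The file name data.csv is a made-up placeholder, and the file does not need to exist.

```python
import os

# Hypothetical relative file name, used only for illustration.
relative = "data.csv"

# abspath() joins the current working directory with the relative path
# and normalizes the result; it does not check that the file exists.
resolved = os.path.abspath(relative)
expected = os.path.normpath(os.path.join(os.getcwd(), relative))

print(resolved == expected)
print(os.path.isabs(resolved))
```

If changing the working directory is undesirable, passing a full path built with os.path.join(directory, filename) to read_csv should work as well.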
Here, you are iterating over EACH letter in the path:
direct = 'C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries'
for f in direct:
    ...
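A quick check of this, using the same path string: the first value the loop produces is 'C', which is exactly the name reported in the FileNotFoundError.

```python
direct = "C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries"

# Iterating over a str yields one character at a time, not file names.
first_items = [f for f in direct][:3]
print(first_items)  # ['C', ':', '/']
```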
If you want to get the files in the directory, you should use something like:
for item in os.listdir(direct):
    ...
Personally, I would use pathlib:
from pathlib import Path
direct = Path('C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries')
for item in direct.glob('*'):
    ...
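To make the pathlib approach concrete, here is a minimal sketch that runs against a temporary directory instead of the real TimeSeries folder. The file names are invented for illustration, and the '*.csv' pattern is an assumption about which files are wanted.

```python
import tempfile
from pathlib import Path

# Build a throwaway directory so the example is self-contained;
# the file names below are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    direct = Path(tmp)
    (direct / "01-22-2020.csv").write_text("FIPS,Admin2\n1,A\n")
    (direct / "notes.txt").write_text("not a csv\n")
    (direct / "subdir").mkdir()

    # glob('*.csv') yields full Path objects for matching entries only;
    # is_file() additionally guards against a directory named like a CSV.
    csv_files = sorted(p for p in direct.glob("*.csv") if p.is_file())
    names = [p.name for p in csv_files]

print(names)
```

Each item produced by glob is already a full path, so it can be handed to pd.read_csv directly without joining it to the directory first.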
IIUC, try:
import os
import numpy as np
import pandas as pd

source = "C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries"
for filename in os.listdir(source):
    # os.listdir returns bare names, so build the full path first
    filepath = os.path.join(source, filename)
    # skip subdirectories and anything else that is not a regular file
    if not os.path.isfile(filepath):
        continue
    df = pd.read_csv(filepath)
    df = df.replace(np.nan, 'Other', regex=True)
    # compare against the column name itself, not a one-element list
    if df.columns[0] == 'FIPS':
        print(df.columns)
        df = df.drop(['FIPS', 'Last_Update', 'Lat', 'Long_'], axis=1)
        df = df.rename(columns={'Admin2': 'County',
                                'Province_State': 'State',
                                'Country_Region': 'Country',
                                'Combined_Key': 'City'})
        df.to_csv(filepath)
    elif df.columns[0] == 'Province/State':
        print(df.columns)
        df = df.drop(['Last Update'], axis=1)
        df = df.rename(columns={'Province/State': 'State',
                                'Country/Region': 'Country'})
        df.to_csv(filepath)
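One difference from the question's code is easy to miss: the comparisons here use 'FIPS' rather than ['FIPS']. A minimal sketch in plain Python shows why that matters; first_column is a hypothetical stand-in for df.columns[0], which is a single label rather than a list.

```python
# Stand-in for df.columns[0], which is a single label (a str here).
first_column = "FIPS"

# A str never compares equal to a list, so this branch would be skipped.
print(first_column == ["FIPS"])   # False

# Comparing against the bare string gives the intended result.
print(first_column == "FIPS")     # True
```

Separately, df.to_csv(filepath) writes the index as an unnamed first column by default, so passing index=False is probably worth considering if the same files may be processed again.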