Pandas DataFrame.to_csv() OSError: [Errno 22] Invalid argument and PermissionError: [Errno 13] Permission denied

Question

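I am writing a large amount of financial time-series data to individual CSV files. In one instance the to_csv method fails repeatedly, and I cannot for the life of me work out why. During the call to to_csv everything hangs for 10-15 minutes or more before crashing with this error: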

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 172, in save
    self._save()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 274, in _save
    self._save_header()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 242, in _save_header
    writer.writerow(encoded_labels)
OSError: [Errno 22] Invalid argument

During handling of the above exception, another exception occurred:

OSError: [Errno 22] Invalid argument

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "securitiesArchives.py", line 1072, in <module>
    out_df.to_csv("PRN.csv",mode='w',encoding='UTF-8' ,compression=None)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 3020, in to_csv
    formatter.save()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 187, in save
    f.close()
OSError: [Errno 22] Invalid argument

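It appears to hang while writing the header row of the CSV file. I wrote the same frame to HDF, loaded it back from HDF, and with the HDF-loaded frame reproduced the same (or very nearly the same) failure: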

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 172, in save
    self._save()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 274, in _save
    self._save_header()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 242, in _save_header
    writer.writerow(encoded_labels)
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bad_archive.py", line 12, in <module>
    #out_df.to_csv("PRN.csv",mode='w',encoding='UTF-8' ,compression=None)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 3020, in to_csv
    formatter.save()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\formats\csvs.py", line 187, in save
    f.close()
PermissionError: [Errno 13] Permission denied

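I'm not sure why the error changed from "OSError: [Errno 22] Invalid argument" to "PermissionError: [Errno 13] Permission denied" when I moved from the larger body of code to the small sample problem. I searched for these errors in connection with to_csv and found that earlier versions of pandas may have had a similar problem, but that it was supposed to have been resolved in later versions. My pandas is: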

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.2
pytest: 5.0.1
pip: 19.1.1
setuptools: 41.0.1
Cython: 0.29.12
numpy: 1.16.4
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.6.1
sphinx: 2.1.2
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: 1.2.1
tables: 3.5.2
numexpr: 2.6.9
feather: None
matplotlib: 3.1.0
openpyxl: 2.6.2
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.8
lxml.etree: 4.3.4
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.3.5
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.8.1
gcsfs: None

I am on a 64-bit Windows 10 machine, using Anaconda Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32.

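I have tried:

Converting the index to str with astype(str) before calling to_csv; same problem.
Calling dropna on the index (rows), since there are a large number of NaN entries (this frame was originally part of a much larger multi-index frame); the problem persists.
Passing the keyword arguments header=False and index=False; neither changed the behavior.
Slicing out only the first row with .loc[] and writing that: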

out_df.loc[out_df.index.values[0]].to_csv("PRN.csv",mode='w',encoding='UTF-8' ,compression=None)

This also failed, even though the result is now a Series and no longer a DataFrame, as the following warning shows:

FutureWarning: The signature of Series.to_csv was aligned to that of DataFrame.to_csv, and argument 'header' will change its default value from False to True: please pass an explicit value to suppress this warning.

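I tried the same approach again, this time slicing the first two rows so that the result stays a DataFrame instead of being converted to a Series: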

# The entire two-row DataFrame, which refuses to cooperate with to_csv
out_df.loc[out_df.index.values[0]:out_df.index.values[1]].to_csv("PRN.csv",mode='w',encoding='UTF-8' ,compression=None,index=False,header=False)

But this failed just as before. However, I was able to write each column's Series independently to its own CSV file without any problem:

for col_name in out_df.columns:
    print('Writing '+col_name+' as CSV')
    out_df[col_name].to_csv(col_name.replace(' ','_')+"_PRN.csv",mode='w',encoding='UTF-8' ,compression=None)
    print('Done.')

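Taking the successes and failures of the two-row write attempts above together, I don't think this is a problem with specific column values. The traceback also makes me think the problem is related to writing the column headers. But I have 3000+ other DataFrames with exactly the same column labels that write to CSV with to_csv without any trouble. At this point I am out of my depth.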

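The same set of data fails repeatedly, whether I use the data I wrote to HDF or fresh data fetched from Yahoo with yfinance. The following code reliably reproduces the problem on my system: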

import pandas as pd
import yfinance as yf

good_df = yf.download(tickers='AAPL',interval='1m',period='7d')
bad_df = yf.download(tickers='PRN',interval='1m',period='7d')
print('Writing test case AAPL as CSV')   
good_df.to_csv("AAPL.csv",mode='w',encoding='UTF-8' ,compression=None) 
print('Writing test case PRN as CSV')   
bad_df.to_csv("PRN.csv",mode='w',encoding='UTF-8' ,compression=None)

Does anyone have any ideas?

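PS - While re-reading this I decided to check whether the column labels are equal. As far as a boolean comparison goes, the 'good' DataFrame and the 'bad' DataFrame are identical.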

>>>print(good_df.columns)
Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object') 
>>>print(bad_df.columns)
Index(['Open', 'High', 'Low', 'Close', 'Adj Close','Volume'], dtype='object') 
>>>print(good_df.columns == bad_df.columns)
[ True  True  True  True  True  True]

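PPS - I also tried removing all the flags from to_csv, although they should be the default values anyway. They are a carry-over from other code, where I was trying different values to see whether anything would work. The most basic to_csv call fails as before: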

import pandas as pd
import yfinance as yf

good_df = yf.download(tickers='AAPL',interval='1m',period='7d')
bad_df = yf.download(tickers='PRN',interval='1m',period='7d')
print('Writing test case AAPL as CSV')   
good_df.to_csv("AAPL.csv") 
print('Writing test case PRN as CSV')   
bad_df.to_csv("PRN.csv")

Update in response to 程's reply

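I don't see any such file in Explorer or via dir in the console. But to test this I used a new file name that isn't the symbol "PRN", and lo and behold, it works.

I didn't think this was the problem, because I had already tried writing to different target folders in both the larger parent code and the toy problem. Neither worked.

It seems Windows is holding a stale reference to some old file named "PRN.csv" or something similar.... how frustrating. Hopefully a simple reboot will fix it.

Thanks!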
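A note on the observation that any name other than "PRN" works. Windows reserves a set of legacy device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9), and a file name whose base is one of them has traditionally been treated as the device even when it carries an extension such as .csv. That would be consistent with "PRN.csv" failing while "AAPL.csv" succeeds, although it is not confirmed here as the cause. The sketch below checks a target name before writing; is_reserved_windows_name and safe_csv_name are hypothetical helpers, not part of pandas, and the "_data" suffix is an arbitrary choice.

```python
from pathlib import PureWindowsPath

# Legacy DOS device names that Windows reserves (matched case-insensitively).
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_windows_name(path):
    """Return True if the last path component maps to a reserved device name."""
    name = PureWindowsPath(path).name
    # Only the part before the first dot matters: "PRN.csv" still means PRN.
    base = name.split(".", 1)[0].rstrip(" ").upper()
    return base in RESERVED_NAMES


def safe_csv_name(ticker):
    """Hypothetical helper: build a CSV file name that avoids reserved names."""
    name = f"{ticker}.csv"
    if is_reserved_windows_name(name):
        # Assumption: any suffix that changes the base name is acceptable.
        return f"{ticker}_data.csv"
    return name
```

With this, the failing call could become bad_df.to_csv(safe_csv_name("PRN")), while tickers such as AAPL keep their plain names.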

Answer 1

I ran into exactly the same problem earlier today, but since the data I was working with was much smaller, it was easier to find the solution.

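You cannot write or append to a file while it is open in another program. Check whether you forgot to close() it somewhere, or whether it is open for viewing in Microsoft Excel.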
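A quick way to test for this before a long write is to try opening the target in append mode and see whether the operating system refuses. This is only a sketch: can_open_for_append is a hypothetical helper, and it assumes the lock held by the other program surfaces as an OSError (PermissionError is a subclass) when the file is opened.

```python
def can_open_for_append(path):
    """Return True if `path` can be opened for appending right now.

    Note: opening in append mode creates the file if it does not exist.
    """
    try:
        with open(path, "a"):
            return True
    except OSError:
        # Covers PermissionError, FileNotFoundError (missing folder), etc.
        return False
```

If this returns False for the CSV path, close the file in the other program (or pick a different name) before calling to_csv.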

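Also, it is generally better to use open('file', 'a') for writing, in case any previously stored data is there. If the file does not exist, it creates a new one just as open('file','w') would.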
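To illustrate the difference between the two modes, here is a small sketch using only the standard library. The helper names append_line and overwrite_with are made up for this example.

```python
def append_line(path, line):
    """Append one line; the file is created if it does not exist yet."""
    with open(path, "a", encoding="utf-8", newline="") as f:
        f.write(line + "\n")


def overwrite_with(path, line):
    """Write one line with mode 'w', which discards any existing content."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(line + "\n")
```

Calling append_line twice on a new path leaves both lines in the file, while a later overwrite_with on the same path leaves only the last line written.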


python

errno

export-to-csv

pandas
