Download fails with "Authentication failed" when downloading 50+ files from SFTP serially using pysftp in Python
for remote_path in list_of_stfp_paths:
    with pysftp.Connection(HOSTNAME, username=USERNAME, password=PASSWORD) as sftp:
        sftp.get(remote_path, str(local_path))
    #checks distinct count of a column for the csv downloaded, deletes it later
    df = pd.read_csv(str(local_path))
    print(df['taken_time'].value_counts())
    os.remove(str(local_path))
The code I'm using is above. It simply runs in a for loop over multiple remote paths.
Sometimes it completes. Sometimes I get this error:
Exception: Authentication failed.
Do not reconnect for each file. Loop over the downloads only, not over the connection:
with pysftp.Connection(HOSTNAME, username=USERNAME, password=PASSWORD) as sftp:
    for remote_path in list_of_stfp_paths:
        sftp.get(remote_path, str(local_path))
        #checks distinct count of a column for the csv downloaded, deletes it later
        df = pd.read_csv(str(local_path))
        print(df['taken_time'].value_counts())
        os.remove(str(local_path))
Note, though, that you do not even have to download the files to the local disk; you can read them directly from the SFTP server:
with pysftp.Connection(HOSTNAME, username=USERNAME, password=PASSWORD) as sftp:
    for remote_path in list_of_stfp_paths:
        with sftp.open(remote_path) as f:
            f.prefetch()
            #checks distinct count of a column for the csv
            df = pd.read_csv(f)
            print(df['taken_time'].value_counts())
This might even be faster, since it lets the download and the parsing proceed in parallel rather than one after the other.
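The "connect once, read every file over that connection" pattern could also be wrapped in a small helper. The following is only a sketch under stated assumptions: the function name `count_column_values` is hypothetical, it uses the standard-library `csv` module and `Counter` in place of pandas so it runs without third-party packages, and it expects `sftp` to be an already-open connection (for example a `pysftp.Connection`) whose `open(path)` returns a readable file object.

```python
import csv
import io
from collections import Counter


def count_column_values(sftp, remote_paths, column):
    """Return {remote_path: Counter of the values in `column`}.

    Hypothetical helper: `sftp` is assumed to be an already-open connection
    object, so no new login happens per file.
    """
    counts = {}
    for remote_path in remote_paths:
        # Reuse the same connection for every file.
        with sftp.open(remote_path) as f:
            raw = f.read()  # read the whole remote file into memory
        # Remote reads may come back as bytes; decode so csv can parse text.
        text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        reader = csv.DictReader(io.StringIO(text))
        counts[remote_path] = Counter(row[column] for row in reader)
    return counts
```

With the names from the question, it would be called inside a single `with pysftp.Connection(HOSTNAME, username=USERNAME, password=PASSWORD) as sftp:` block as `count_column_values(sftp, list_of_stfp_paths, 'taken_time')`, so the server sees one authentication for the whole batch.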