FileNotFoundError: [WinError 2] The system cannot find the file specified while loading model from s3


I recently saved a model to S3 using joblib.

model_doc is the model object.

import subprocess
import joblib

save_d2v_to_s3_current_doc2vec_model(model_doc,"doc2vec_model")

def save_d2v_to_s3_current_doc2vec_model(model,fname):
    model_name = fname
    joblib.dump(model,model_name)
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path+'/'+model_name
    command = "aws s3 cp {} {}".format(model_name,path).split()
    print('saving...'+model_name)
    subprocess.call(command)

That worked, but afterwards, when I tried to load the model back from S3, it gave me an error.

model = load_d2v("doc2vec_model")

def load_d2v(fname):
    model_name = fname
    s3_base_path='s3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path+'/'+model_name  
    command = "aws s3 cp {} {}".format(path,model_name).split()
    print('loading...'+model_name)
    subprocess.call(command)
    model=joblib.load(model_name)
    return model

This is the error I get:

loading...doc2vec_model
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in load_d2v
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

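I don't even understand why it says the file can't be found. This is the same path I used to save the model, but now I can't get the model back from S3. Please help me!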

If you are sure the model file exists, try adding the model file's extension to fname, for example doc2vec_model.h3.

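Rather than a generic print() line that describes your intent, I'd suggest printing the actual command you've composed, so you can verify by observation that it makes sense.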
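As a minimal sketch of that suggestion, the command can be built in one place and printed exactly as it will be passed to subprocess. The helper name build_download_command is hypothetical, not something from the question; the bucket path is the one from the question.

```python
def build_download_command(fname,
                           s3_base_path="s3://sd-flikku/datalake/current_doc2vec_model"):
    """Return the argument list that copies `fname` from S3 into the current directory."""
    path = s3_base_path + "/" + fname
    # A list avoids the str.split() step, which would break on names containing spaces.
    return ["aws", "s3", "cp", path, fname]


command = build_download_command("doc2vec_model")
# Print the real argument list instead of a generic 'loading...' message.
print("about to run:", command)
```

The printed list is what subprocess.call() would receive, so it can be compared directly with what you would type at the prompt.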

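If it does, then also try that exact same aws ... command directly at the command prompt from which you launch your Python code, to make sure it runs that way. If it doesn't, you may get a more explicit error.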

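Note that the error you're getting doesn't look particularly like it's coming from the aws command or from the S3 service, either of which might talk about 'paths' or 'objects'. Rather, it comes from the Python subprocess system and its Popen call. I think those are reached via your call to subprocess.call(), but for some reason your line of code isn't shown. (How are you running the block of code with load_d2v()?)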
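To illustrate where such an error originates, here is a small sketch showing that subprocess itself raises FileNotFoundError when the program named in the first argument cannot be started, before any of that program's own code runs. The helper try_launch and the program name no-such-program-hopefully are made up for the demonstration.

```python
import subprocess


def try_launch(args):
    """Return None if the program could be started, else the FileNotFoundError raised."""
    try:
        subprocess.call(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except FileNotFoundError as exc:
        # Raised by Popen when the executable itself cannot be located.
        return exc
    return None


# Hypothetical program name, assumed not to exist on this machine.
error = try_launch(["no-such-program-hopefully", "s3", "cp", "a", "b"])
print(type(error).__name__, error)
```

The remaining arguments never matter here, because the failure happens while trying to start the executable.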

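This suggests the file that isn't being found may be the aws command itself. Are you sure it's installed and runnable from the exact working directory/environment your Python is running in, when invoked via subprocess.call()?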
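One way to check this from inside Python is to ask shutil.which() whether the executable is visible on the PATH of the running process, and to pass the resolved full path to subprocess. This is only a sketch: the helper names resolve_executable and run_checked are hypothetical, and the idea that the AWS CLI may be installed on Windows as a script such as aws.cmd, which a bare "aws" would not match without a shell, is an assumption to verify on your machine.

```python
import shutil
import subprocess


def resolve_executable(name):
    """Return the full path of `name` as seen by this Python process, or raise."""
    full_path = shutil.which(name)
    if full_path is None:
        raise FileNotFoundError(
            "{!r} was not found on PATH for this Python process".format(name))
    return full_path


def run_checked(args):
    """Run `args` using the resolved executable path; raise if it exits non-zero."""
    exe = resolve_executable(args[0])
    return subprocess.run([exe] + list(args[1:]), check=True)


# Intended use with the command from the question (not executed here):
# run_checked(["aws", "s3", "cp", path, model_name])
```

With check=True a failed copy raises CalledProcessError instead of silently continuing to joblib.load() on a file that was never downloaded.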

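(Separately, if my earlier answer helped you get past your sklearn.externals.joblib problem, it would be good to mark that answer as accepted, so other potential answerers don't think it's still an unsolved question blocking you.)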