Python Flask app runs locally, but returns AttributeError when hosted on Heroku

I'm developing an application for university. The web app uses joblib to load a given model, and for that to work it needs the class FlexibleScaler:

flexible.py

from sklearn.preprocessing import MinMaxScaler, StandardScaler, PowerTransformer, MaxAbsScaler, RobustScaler, Normalizer
from sklearn.base import BaseEstimator, TransformerMixin

class FlexibleScaler(BaseEstimator, TransformerMixin):
    def __init__(self, scaler=None):
        self.scaler = scaler
        self.check = False


    def __assign_scaler(self):
        if self.scaler == 'min-max':
            self.method = MinMaxScaler()
        elif self.scaler == 'standard':
            self.method = StandardScaler()
        elif self.scaler == 'yeo-johnson':
            self.method = PowerTransformer(method='yeo-johnson')
        elif self.scaler == 'box-cox':
            self.method = PowerTransformer(method='box-cox')
        elif self.scaler == 'max-abs':
            self.method = MaxAbsScaler()
        elif self.scaler == 'robust':
            self.method = RobustScaler()
        elif self.scaler == 'normalize':
            self.method = Normalizer()
        else:
            self.method = None
        self.check = True

    def fit_transform(self, X, y=None, **fit_params):
        if not self.check:
            self.__assign_scaler()
        if self.method is None:
            return X
        return self.method.fit_transform(X, y, **fit_params)

    def fit(self, X, y=None):
        if not self.check:
            self.__assign_scaler()
        if self.method is not None:
            self.method.fit(X, y)
        return self  # scikit-learn expects fit() to return the estimator

    def transform(self, X):
        if not self.check:
            self.__assign_scaler()
        if self.method is None:
            return X
        return self.method.transform(X)

flask_start.py

from flask import Flask, Response, render_template, request, flash, redirect, session, g
import joblib
import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MinMaxScaler, StandardScaler, PowerTransformer, MaxAbsScaler, RobustScaler, Normalizer
from sklearn.base import BaseEstimator, TransformerMixin
from flexible import FlexibleScaler


UPLOAD_FOLDER = '/tmp/'
ALLOWED_EXTENSIONS = {'csv'}
app = Flask(__name__)

app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
app.config['ALLOWED_EXTENSIONS'] = ALLOWED_EXTENSIONS


@app.route("/", methods=["POST", "GET"])
def home():

    if request.method == 'POST':

        # get data to process

        clf = joblib.load('ENS_fitted.joblib')

        prediction = clf.predict(features)
        pred_prob = clf.predict_proba(features)

        # do operations and return template

if __name__ == "__main__":
    app.run(debug = True)

This all works locally. Once I deploy on Heroku, I get the following error at joblib.load():

Traceback (most recent call last):

2020-09-24T21:27:30.117559+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app

2020-09-24T21:27:30.117559+00:00 app[web.1]:     response = self.full_dispatch_request()

2020-09-24T21:27:30.117560+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request

2020-09-24T21:27:30.117560+00:00 app[web.1]:     rv = self.handle_user_exception(e)

2020-09-24T21:27:30.117561+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception

2020-09-24T21:27:30.117561+00:00 app[web.1]:     reraise(exc_type, exc_value, tb)

2020-09-24T21:27:30.117561+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise

2020-09-24T21:27:30.117562+00:00 app[web.1]:     raise value

2020-09-24T21:27:30.117563+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request

2020-09-24T21:27:30.117563+00:00 app[web.1]:     rv = self.dispatch_request()

2020-09-24T21:27:30.117563+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request

2020-09-24T21:27:30.117564+00:00 app[web.1]:     return self.view_functions[rule.endpoint](**req.view_args)

2020-09-24T21:27:30.117564+00:00 app[web.1]:   File "/app/flask_start.py", line 138, in home

2020-09-24T21:27:30.117564+00:00 app[web.1]:     clf = joblib.load('ENS_fitted.joblib')

2020-09-24T21:27:30.117565+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 585, in load

2020-09-24T21:27:30.117565+00:00 app[web.1]:     obj = _unpickle(fobj, filename, mmap_mode)

2020-09-24T21:27:30.117566+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle

2020-09-24T21:27:30.117566+00:00 app[web.1]:     obj = unpickler.load()

2020-09-24T21:27:30.117567+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/pickle.py", line 1210, in load

2020-09-24T21:27:30.117570+00:00 app[web.1]:     dispatch[key[0]](self)

2020-09-24T21:27:30.117570+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/pickle.py", line 1526, in load_global

2020-09-24T21:27:30.117570+00:00 app[web.1]:     klass = self.find_class(module, name)

2020-09-24T21:27:30.117571+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.8/pickle.py", line 1581, in find_class

2020-09-24T21:27:30.117571+00:00 app[web.1]:     return getattr(sys.modules[module], name)

AttributeError: module '__main__' has no attribute 'FlexibleScaler'

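I don't understand why this happens. The import is there and it works locally. I tried copying the class FlexibleScaler directly into flask_start.py (which also works locally), but without success.

The only difference between local and Heroku is that on Heroku I start the app with gunicorn.

Any help would be appreciated.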

I would try putting

clf = joblib.load('ENS_fitted.joblib')

into a try-except block to see whether the exception raised there is the same one you are getting:

AttributeError: module '__main__' has no attribute 'FlexibleScaler'

For example:

try:

    clf = joblib.load('ENS_fitted.joblib')
    prediction = clf.predict(features)
    pred_prob = clf.predict_proba(features)

except Exception as e:

    print(f"Exception: {e}")

Also, I'd suggest checking whether gunicorn runs the program with __name__ == "__main__" by printing a notice to yourself:

if __name__ == "__main__":
    print("__name__ is __main__")
    app.run(debug = True)
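
Because a print inside the `if __name__ == "__main__":` guard stays silent whenever the file is imported rather than run directly, a variant that reports the name unconditionally may be more informative. This is only a sketch; `describe_entry_point` is a hypothetical helper, not part of Flask or gunicorn:

```python
import sys


def describe_entry_point(module_name):
    """Report how a module was loaded and which file is the real __main__."""
    main_module = sys.modules.get("__main__")
    return {
        "module_name": module_name,
        "is_main": module_name == "__main__",
        # Path of the script that actually started the process, if it has one.
        "main_file": getattr(main_module, "__file__", None),
    }


# Placed at module level, this runs on import as well as on direct execution.
print(describe_entry_point(__name__))
```

Under gunicorn, `is_main` would be expected to be False and `main_file` would likely point at gunicorn's launcher rather than flask_start.py.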

If the error still isn't resolved after doing this, I would look into how gunicorn is configured to run the Flask app.

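It looks like the joblib.dump() call that produced ENS_fitted.joblib happened while flask_start.py was being run directly by python. In that case, flask_start has a __name__ of "__main__". Then, when joblib.dump() pickles the model, it saves the FlexibleScaler instance as __main__.FlexibleScaler.

But when you deploy and run under gunicorn, flask_start has a __name__ of "flask_start", and __main__ refers to gunicorn's launcher instead of your file. This confuses joblib.load(), which expects to find __main__.FlexibleScaler and gives up as shown above.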
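
The mechanism can be reproduced with the standard pickle module alone. The sketch below uses stand-ins throughout: `saver_main` is a hypothetical module playing the role of `__main__` at save time, and `DemoScaler` plays the role of FlexibleScaler. It is not the asker's code.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script that was __main__ when the model was saved.
saver = types.ModuleType("saver_main")
sys.modules["saver_main"] = saver

# Stand-in class that reports "saver_main" as its home module.
DemoScaler = type("DemoScaler", (), {"__module__": "saver_main"})
saver.DemoScaler = DemoScaler

# Pickle stores only the reference "saver_main.DemoScaler", not the class body.
payload = pickle.dumps(DemoScaler())

# Simulate the server process: the module exists but does not define the class.
del saver.DemoScaler

try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc

print(type(load_error).__name__, load_error)
```

The load fails with an AttributeError because the unpickler looks the class up by module and name, which is the same lookup that fails in the traceback above.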

The way to fix this is to regenerate your saved model, but this time by starting flask_start via

% FLASK_APP=flask_start flask run

then calling joblib.dump(), and then re-deploying.
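
What matters when regenerating is which module name ends up recorded in the file. As a rough illustration with plain pickle, the sketch below writes a throwaway module (`flexible_demo`, a hypothetical stand-in for flexible.py), imports a class from it, and inspects the pickle stream. It assumes pickle protocol 4, where module and class names are stored as separate strings.

```python
import importlib
import pickle
import pickletools
import sys
import tempfile
from pathlib import Path

# Write a tiny importable module to a temporary directory (stand-in for flexible.py).
module_dir = tempfile.mkdtemp()
Path(module_dir, "flexible_demo.py").write_text("class DemoScaler:\n    pass\n")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
flexible_demo = importlib.import_module("flexible_demo")

# Pickle an instance of the imported class.
payload = pickle.dumps(flexible_demo.DemoScaler(), protocol=4)

# Collect every string stored in the pickle stream.
recorded = [arg for _op, arg, _pos in pickletools.genops(payload) if isinstance(arg, str)]
print(recorded)
```

The recorded module is `flexible_demo`, the module the class was imported from, not the script doing the saving. A model saved this way should load in any process that can import that module, however the process was started.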

Update

If you absolutely cannot regenerate the model, you can try this hack. In flask_start.py, after the imports, add

import __main__
__main__.FlexibleScaler = FlexibleScaler

You will either be able to joblib.load() the model, or you will run into a similar error with another class, in which case repeat the trick for that class.
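
If several classes need this treatment, the trick could be wrapped in a small helper. This is a sketch only; `expose_in_main` is a hypothetical name, and it deliberately leaves alone any attribute that `__main__` already defines.

```python
import sys


def expose_in_main(*classes):
    """Attach each class to the running __main__ module under its own name."""
    main_module = sys.modules["__main__"]
    for cls in classes:
        # Do not overwrite something the real entry script already defines.
        if not hasattr(main_module, cls.__name__):
            setattr(main_module, cls.__name__, cls)
```

In flask_start.py it would be called once, after the imports and before joblib.load(), for example as `expose_in_main(FlexibleScaler)`.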