PySpark+Flask+CherryPy - AttributeError: 'module' object has no attribute 'tree'
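I am trying to test how to integrate Flask with a Spark model, following this tutorial: https://www.codementor.io/spark/tutorial/building-a-web-service-with-apache-spark-flask-example-app-part2#/. CherryPy is used here as the WSGI server. The trouble is that when I launch the app via spark-submit, it shows this stack trace: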
Traceback (most recent call last):
  File "/home/roman/dev/python/flask-spark/cherrypy.py", line 43, in <module>
    run_server(app)
  File "/home/roman/dev/python/flask-spark/cherrypy.py", line 21, in run_server
    cherrypy.tree.graft(app_logged, '/')
AttributeError: 'module' object has no attribute 'tree'
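I have no idea where the problem is. I think it is caused by a new/old version mismatch or something like that, but I am not sure. I also tried Python 3 instead of Python 2, but it didn't help. Here is the WSGI config: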
import time, sys, cherrypy, os
from paste.translogger import TransLogger
from webapp import create_app
from pyspark import SparkContext, SparkConf

def init_spark_context():
    # load spark context
    conf = SparkConf().setAppName("movie_recommendation-server")
    # IMPORTANT: pass additional Python modules to each worker
    sc = SparkContext(conf=conf, pyFiles=['test.py', 'webapp.py'])
    return sc

def run_server(app):
    # Enable WSGI access logging via Paste
    app_logged = TransLogger(app)
    # Mount the WSGI callable object (app) on the root directory
    cherrypy.tree.graft(app_logged, '/')
    # Set the configuration of the web server
    cherrypy.config.update({
        'engine.autoreload.on': True,
        'log.screen': True,
        'server.socket_port': 5432,
        'server.socket_host': '0.0.0.0'
    })
    # Start the CherryPy WSGI web server
    cherrypy.engine.start()
    cherrypy.engine.block()

if __name__ == "__main__":
    # Init spark context and load libraries
    sc = init_spark_context()
    dataset_path = os.path.join('datasets', 'ml-latest-small')
    app = create_app(sc, dataset_path)
    # start web server
    run_server(app)
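The traceback you've provided clearly shows that your app is trying to use a local module called cherrypy (/home/roman/dev/python/flask-spark/cherrypy.py) rather than the actual cherrypy library (whose path should look something like /path/to/your/python/lib/python-version/site-packages/cherrypy).
To solve the problem, simply rename the local module to avoid the conflict.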
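As a quick way to confirm this kind of name clash, the sketch below lists the files in a directory that would shadow an installed package of the same name. It assumes the script's own directory comes first on the import path, which is the default when Python runs a script directly. The helper name `shadowing_candidates` and the throwaway directory in the demo are illustrative and not part of the original code; the helper only covers the two most common forms, a `<name>.py` file and a `<name>/__init__.py` package.

```python
import os
import tempfile


def shadowing_candidates(module_name, directory):
    """Return paths in `directory` that would be imported as `module_name`.

    Hypothetical helper: only checks `<name>.py` and `<name>/__init__.py`.
    """
    candidates = [
        os.path.join(directory, module_name + ".py"),
        os.path.join(directory, module_name, "__init__.py"),
    ]
    return [path for path in candidates if os.path.isfile(path)]


if __name__ == "__main__":
    # Demo in a temporary directory that mimics the layout from the traceback:
    # a project folder containing a script named cherrypy.py.
    with tempfile.TemporaryDirectory() as project_dir:
        open(os.path.join(project_dir, "cherrypy.py"), "w").close()
        for path in shadowing_candidates("cherrypy", project_dir):
            print("shadows the installed package:", path)
```

In the real project, running `import cherrypy; print(cherrypy.__file__)` gives the same information: if the printed path points into the project directory rather than site-packages, the local file is the one being imported. After renaming the script, it may also be worth deleting any leftover `cherrypy.pyc` next to it, since Python 2 can still import a stale compiled file.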