Unable to start Airflow worker/flower and need clarification on Airflow architecture to confirm that the installation is correct
Running the worker on a different machine results in the errors shown below. I have followed the configuration instructions and have synced the dags folder.
I would also like to confirm that RabbitMQ and PostgreSQL only need to be installed on the Airflow core machine and do not need to be installed on the workers (the workers only connect to the core).
The details of the setup are as follows:
Airflow core/server computer
Has the following installed:
- Python 2.7 with
- airflow (AIRFLOW_HOME = ~/airflow)
- celery
- psycopg2
- RabbitMQ
- PostgreSQL
Configurations made in airflow.cfg:
sql_alchemy_conn = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
executor = CeleryExecutor
broker_url = amqp://username:password@192.168.1.2:5672//
celery_result_backend = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
Tests performed:
- RabbitMQ is running
- Can connect to PostgreSQL and have confirmed that Airflow has created the tables
- Can start and view the webserver (including custom dags)
Airflow worker computer
Has the following installed:
- Python 2.7 with
- airflow (AIRFLOW_HOME = ~/airflow)
- celery
- psycopg2
Configurations made in airflow.cfg are exactly the same as on the server:
sql_alchemy_conn = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
executor = CeleryExecutor
broker_url = amqp://username:password@192.168.1.2:5672//
celery_result_backend = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
Output of the commands run on the worker machine:
When running airflow flower:
ubuntu@airflow_client:~/airflow$ airflow flower
[2016-06-13 04:19:42,814] {__init__.py:36} INFO - Using executor CeleryExecutor
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/bin/airflow", line 15, in <module>
args.func(args)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/airflow/bin/cli.py", line 576, in flower
os.execvp("flower", ['flower', '-b', broka, port, api])
File "/home/ubuntu/anaconda2/lib/python2.7/os.py", line 346, in execvp
_execvpe(file, args)
File "/home/ubuntu/anaconda2/lib/python2.7/os.py", line 382, in _execvpe
func(fullname, *argrest)
OSError: [Errno 2] No such file or directory
When running airflow worker:
ubuntu@airflow_client:~$ airflow worker
[2016-06-13 04:08:43,573] {__init__.py:36} INFO - Using executor CeleryExecutor
[2016-06-13 04:08:43,935: ERROR/MainProcess] Unrecoverable error: ImportError('No module named postgresql',)
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/worker/__init__.py", line 206, in start
self.blueprint.start(self)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
self.on_start()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/apps/worker.py", line 169, in on_start
string(self.colored.cyan(' \n', self.startup_info())),
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/apps/worker.py", line 230, in startup_info
results=self.app.backend.as_uri(),
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/kombu/utils/__init__.py", line 325, in __get__
value = obj.__dict__[self.__name__] = self.__get(obj)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/app/base.py", line 626, in backend
return self._get_backend()
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/app/base.py", line 444, in _get_backend
self.loader)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/backends/__init__.py", line 68, in get_backend_by_url
return get_backend_cls(backend, loader), url
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/celery/backends/__init__.py", line 49, in get_backend_cls
cls = symbol_by_name(backend, aliases)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/kombu/utils/__init__.py", line 96, in symbol_by_name
module = imp(module_name, package=package, **kwargs)
File "/home/ubuntu/anaconda2/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
ImportError: No module named postgresql
When celery_result_backend is changed to the default value db+mysql://airflow:airflow@localhost:3306/airflow and airflow worker is run again, the result is:
ubuntu@airflow_client:~/airflow$ airflow worker
[2016-06-13 04:17:32,387] {__init__.py:36} INFO - Using executor CeleryExecutor
-------------- celery@airflow_client2 v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Linux-3.19.0-59-generic-x86_64-with-debian-jessie-sid
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: airflow.executors.celery_executor:0x7f5cb65cb510
- ** ---------- .> transport: amqp://username:**@192.168.1.2:5672//
- ** ---------- .> results: mysql://airflow:**@localhost:3306/airflow
- *** --- * --- .> concurrency: 16 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> default exchange=default(direct) key=celery
[2016-06-13 04:17:33,385] {__init__.py:36} INFO - Using executor CeleryExecutor
Starting flask
[2016-06-13 04:17:33,737] {_internal.py:87} INFO - * Running on http://0.0.0.0:8793/ (Press CTRL+C to quit)
[2016-06-13 04:17:34,536: WARNING/MainProcess] celery@airflow_client2 ready.
What am I missing? How can I diagnose this further?
You need to make sure Celery Flower is installed, i.e. pip install flower. The OSError: [Errno 2] No such file or directory in the flower traceback is raised because os.execvp cannot find a flower executable on the PATH.
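A minimal way to check this on the worker, sketched here under the assumption that it is run in the same Anaconda environment that owns the airflow binary:
# Check whether a "flower" executable is visible on the PATH.
# os.execvp raises OSError: [Errno 2] when it is not.
# (Sketch only -- adjust for your own environment.)
from distutils.spawn import find_executable

path = find_executable("flower")
print(path or "flower not found on PATH -- run: pip install flower")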
The ImportError: No module named postgresql error is caused by an invalid prefix in your celery_result_backend. When a database is used as the Celery result backend, the connection URL must be prefixed with db+. See
https://docs.celeryproject.org/en/stable/userguide/configuration.html#conf-database-result-backend
So replace:
celery_result_backend = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
with something like:
celery_result_backend = db+postgresql://username:password@192.168.1.2:5432/airflow
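To confirm that the corrected URL actually resolves to Celery's database result backend before restarting the worker, a rough sketch like the following can be run on the worker machine (Celery 3.1 is assumed, and the host/credentials are the placeholders from the airflow.cfg above):
# "db+postgresql://" selects Celery's SQLAlchemy-based database backend, while
# "postgresql+psycopg2://" is parsed as a backend named "postgresql", which is
# exactly what produced the ImportError in the worker traceback.
from celery import Celery

app = Celery(
    "backend_check",
    broker="amqp://username:password@192.168.1.2:5672//",
    backend="db+postgresql://username:password@192.168.1.2:5432/airflow",
)
print(app.backend)  # expect a DatabaseBackend instance, not an ImportError
If that prints a DatabaseBackend, restart airflow worker and airflow flower with the updated airflow.cfg.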