Fresh install of Apache Airflow 2.2.3 .. Oops, something bad happened

I installed Apache Airflow locally with pip. It needed a few version pins:

pip3 install zipp==3.1.0
pip3 install sqlalchemy==1.3.24
python3 -m pip install virtualenv
pip3 install apache-airflow[cncf.kubernetes]

pip3 install apache-airflow

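I'm a n00b at all of this, so I started with the basics. I first tried airflow standalone, but I couldn't find anything in the docs saying what the default username and password are for that mode, so I went in, created a basic user, and started the services myself: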

airflow db init
airflow users create --role Admin --username admin --email admin --firstname admin --lastname admin --password admin

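Now this just needed to start. I realized I had to launch both the scheduler and the web app. For some reason my automation script didn't do that, so I had to do it manually: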

airflow scheduler &
airflow webserver

Now everything was up. I could see the GUI up and running and everything seemed fine, so I wanted to trigger the first DAG I found:

example_bash_operator

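The problem is that when I click the DAG name, or click start, it works about half the time. Usually, though, the first few times I click anything I get this error: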

Python version: 3.8.10
Airflow version: 2.2.3
Node: juju-2dd159-310.lxd
-------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/www/auth.py", line 51, in decorated
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/www/decorators.py", line 72, in wrapper
    return f(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/www/views.py", line 1732, in trigger
    if unpause and dag.is_paused:
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/models/dag.py", line 1081, in is_paused
    warnings.warn(
  File "/usr/lib/python3.8/warnings.py", line 109, in _showwarnmsg
    sw(msg.message, msg.category, msg.filename, msg.lineno,
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/settings.py", line 117, in custom_show_warning
    write_console.print(msg, soft_wrap=True)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 1642, in print
    self._buffer.extend(new_segments)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 842, in __exit__
    self._exit_buffer()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 800, in _exit_buffer
    self._check_buffer()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 1935, in _check_buffer
    self.file.flush()
BrokenPipeError: [Errno 32] Broken pipe

If I ignore it and wait a minute, or just try again, it suddenly works. Any clues on how to make this experience smooth?

EDIT: in case this helps answer the question

ubuntu@juju-2dd159-311:~$ pip --version
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)
ubuntu@juju-2dd159-311:~$ python3 --version
Python 3.8.10

EDIT #2

I followed these instructions and installed with constraints, as required: https://airflow.apache.org/docs/apache-airflow/stable/start/local.html
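For anyone following along, a constraint-based install along the lines of that page can be sketched as below. This is a minimal sketch, not the exact commands used here: the helper name constraint_url is made up for illustration, the version 2.2.3 is taken from the error page above, and the script only prints the pip command so it can be reviewed before running.

```shell
#!/bin/bash
# Sketch: build the constraints URL for a given Airflow and Python version.
AIRFLOW_VERSION="2.2.3"   # version shown in the error page above

# Hypothetical helper: prints the constraints file URL for the given
# Airflow version ($1) and Python major.minor version ($2).
constraint_url() {
  echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

# Reduce e.g. "Python 3.8.10" to "3.8" (empty if python3 is not installed).
PYTHON_VERSION="$(python3 --version 2>/dev/null | cut -d ' ' -f 2 | cut -d '.' -f 1-2)"
CONSTRAINT_URL="$(constraint_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")"

# Print the command instead of running it, so it can be checked first.
echo "pip3 install \"apache-airflow==${AIRFLOW_VERSION}\" --constraint \"${CONSTRAINT_URL}\""
```

With constraints in place, manual pins such as the zipp and sqlalchemy ones at the top should no longer be needed, since the constraints file fixes those versions.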

This greatly improved the stability of the GUI. But then I went on to connect a PostgreSQL database, and now I can't even log in without a broken pipe error:

Python version: 3.8.10
Airflow version: 2.2.3
Node: juju-2dd159-318.lxd
-------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/www/auth.py", line 51, in decorated
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/www/views.py", line 718, in index
    paging=wwwutils.generate_pages(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/www/utils.py", line 113, in generate_pages
    previous_node = Markup(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/jinja2/utils.py", line 838, in __new__
    warnings.warn(
  File "/usr/lib/python3.8/warnings.py", line 109, in _showwarnmsg
    sw(msg.message, msg.category, msg.filename, msg.lineno,
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/settings.py", line 117, in custom_show_warning
    write_console.print(msg, soft_wrap=True)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 1642, in print
    self._buffer.extend(new_segments)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 842, in __exit__
    self._exit_buffer()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 800, in _exit_buffer
    self._check_buffer()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/rich/console.py", line 1935, in _check_buffer
    self.file.flush()

Have you tried following the "Quick Start" instructions?

https://airflow.apache.org/docs/apache-airflow/stable/start/index.html

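Airflow has good, comprehensive instructions on how to get started; if you follow them step by step, you will have Airflow up and running. This can be done either with Docker Compose or with a local virtualenv.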

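Your problem is probably a lack of resources, most likely memory. Airflow needs a fair amount of memory (4 GB) to start because it is a complex system. This is listed as a prerequisite, especially in the Docker Compose quick start. Docker Compose will even warn you if you don't have enough resources, so that is the route I recommend if you want a really robust quick start.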

You need to look at the logs to see why the pipe error occurs. But the most likely cause is a lack of resources.
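A quick way to rule memory in or out on a Linux host is to compare total RAM against the 4 GB figure mentioned above. The following is a rough sketch under that assumption; the function name has_enough_memory and the exact threshold are illustrative, not something Airflow itself checks.

```shell
#!/bin/bash
# Sketch: compare total RAM against the 4 GB figure quoted above.
REQUIRED_KB=$((4 * 1024 * 1024))   # 4 GB expressed in kB (assumed threshold)

# Hypothetical helper: succeeds if the given amount of memory (in kB)
# is at least the threshold.
has_enough_memory() {
  [ "$1" -ge "$REQUIRED_KB" ]
}

# MemTotal in /proc/meminfo is reported in kB on Linux.
total_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)"

if has_enough_memory "${total_kb:-0}"; then
  echo "OK: ${total_kb} kB of RAM"
else
  echo "Low memory: ${total_kb:-unknown} kB of RAM, below ${REQUIRED_KB} kB"
fi
```

Note that a machine sold as "4 GB" usually reports slightly less than 4194304 kB in MemTotal, so treat a near miss as borderline rather than as a failure.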

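As for "standalone" mode and the user password: you probably missed what Airflow printed for you. It generates the password dynamically at startup and tells you exactly which password to use: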

standalone | 
standalone | Airflow is ready
standalone | Login with username: admin  password: 4hfH8mATcvMFmne9
standalone | Airflow Standalone is for development purposes only. Do not use this in production!
standalone | 
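If the startup output scrolls past too quickly, one option is to save it to a file and pull the login line back out. A minimal sketch, assuming the line has the same shape as the one shown above; the function name extract_standalone_password and the log file name standalone.log are hypothetical.

```shell
#!/bin/bash
# Sketch: pull the generated password out of saved `airflow standalone` output.

# Hypothetical helper: reads standalone output on stdin and prints the
# password from the first "Login with username: ... password: ..." line.
extract_standalone_password() {
  sed -n 's/.*Login with username: .*password: *\([^ ]*\).*/\1/p' | head -n 1
}

# Demonstration using the line shown above. In practice, feed in a saved
# log instead, e.g.: extract_standalone_password < standalone.log
sample='standalone | Login with username: admin  password: 4hfH8mATcvMFmne9'
echo "$sample" | extract_standalone_password
```

If the real output contains terminal color codes, the pattern may need adjusting.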

It turns out that a simple script to run Apache Airflow in the background:

#!/bin/bash
airflow webserver -D

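does not work well or play nicely on Ubuntu 20.04 LTS. I have since found that I should let systemd handle starting and stopping, and it now works very well. This is how I inject the required unit files on Ubuntu 20.04 LTS.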

This registers the webserver, scheduler, and triggerer services:

if [ ! -e /etc/systemd/system/airflow-scheduler.service ]; then
  cat <<EOT >> /etc/systemd/system/airflow-scheduler.service
[Unit]
Description=Airflow scheduler daemon

[Service]
Environment="PATH=/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
User=ubuntu
Type=simple
ExecStart=/home/ubuntu/.local/bin/airflow scheduler
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
EOT
fi

if [ ! -e /etc/systemd/system/airflow-webserver.service ]; then
  cat <<EOT >> /etc/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow webserver daemon

[Service]
Environment="PATH=/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
User=ubuntu
Type=simple
ExecStart=/home/ubuntu/.local/bin/airflow webserver
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOT
fi

if [ ! -e /etc/systemd/system/airflow-triggerer.service ]; then
  cat <<EOT >> /etc/systemd/system/airflow-triggerer.service
[Unit]
Description=Airflow triggerer daemon

[Service]
Environment="PATH=/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
User=ubuntu
Type=simple
ExecStart=/home/ubuntu/.local/bin/airflow triggerer
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOT
fi

systemctl daemon-reload
systemctl enable airflow-scheduler
systemctl enable airflow-webserver
systemctl enable airflow-triggerer

Then, to start them, I just run:

systemctl start airflow-webserver
systemctl start airflow-scheduler
systemctl start airflow-triggerer
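To confirm the three units actually came up, something like the following can help. It is only a sketch: the function name check_service and the DRY_RUN switch are made up for illustration, and by default the script just prints the commands; set DRY_RUN=0 to run them for real.

```shell
#!/bin/bash
# Sketch: check each Airflow unit registered above; dry-run by default.
SERVICES="airflow-webserver airflow-scheduler airflow-triggerer"

# Hypothetical helper: prints the check commands for one service, or runs
# them when DRY_RUN=0. When run for real, recent journal entries are shown
# only if the unit is not active.
check_service() {
  local svc="$1"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "systemctl is-active ${svc}"
    echo "journalctl -u ${svc} -n 50 --no-pager"
  else
    systemctl is-active "${svc}" || journalctl -u "${svc}" -n 50 --no-pager
  fi
}

for svc in $SERVICES; do
  check_service "$svc"
done
```

Because the units run under systemd, their output goes to the journal, so journalctl is the place to look for warnings and tracebacks.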