Apache Airflow and Apache Atlas Timeout
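I am running Apache Airflow in AWS ECS and Apache Atlas on EC2. I have been able to connect a local instance of Apache Airflow to Apache Atlas on EC2; however, I cannot connect my AWS ECS instance to the EC2 instance. When an Airflow task in a DAG tries to push information to Apache Atlas, I get the following error.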
[2021-02-18 18:49:37,301] {connectionpool.py:752} WARNING - Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fb1e2e87410>, 'Connection to <ip-address> timed out. (connect timeout=10)')': /api/atlas/v2/types/typedefs
[2021-02-18 18:49:47,302] {connectionpool.py:752} WARNING - Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fb1e2e87b10>, 'Connection to <ip-address> timed out. (connect timeout=10)')': /api/atlas/v2/types/typedefs
[2021-02-18 18:49:57,311] {connectionpool.py:752} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fb1e2e9f190>, 'Connection to <ip-address> timed out. (connect timeout=10)')': /api/atlas/v2/types/typedefs
[2021-02-18 18:50:07,319] {connectionpool.py:752} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fb1e2e9f7d0>, 'Connection to <ip-address> timed out. (connect timeout=10)')': /api/atlas/v2/types/typedefs
[2021-02-18 18:50:17,327] {connectionpool.py:752} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fb1e2e9fe10>, 'Connection to <ip-address> timed out. (connect timeout=10)')': /api/atlas/v2/types/typedefs
[2021-02-18 18:50:27,338] {taskinstance.py:1150} ERROR - HTTPConnectionPool(host='<ip-address>, port=21000): Max retries exceeded with url: /api/atlas/v2/types/typedefs (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fb1e2ea3490>, 'Connection to <ip-address> timed out. (connect timeout=10)'))
Edit: posting code as requested.
airflow.cfg configuration:
[lineage]
backend = airflow.lineage.backend.atlas.AtlasBackend
[atlas]
host = <ip-address>
port = 21000
username = admin
password = <password>
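I was able to solve the problem by setting the host to the private IP address of the EC2 instance running Atlas instead of its public IP address. In addition, I had to update the inbound rules of the security group attached to the EC2 instance running Apache Atlas so that traffic from the private IP address of the Airflow webserver is allowed in.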
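A connect timeout like the one in the log usually means packets never reach the port, which is what a missing security group rule or an unroutable address looks like. Before changing airflow.cfg again, a quick TCP check run from inside the Airflow container can confirm whether port 21000 is reachable at all. This is only a minimal sketch: the function name `can_connect` and the example address in the comment are hypothetical and not part of Airflow or Atlas.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False


# Example usage (hypothetical address; substitute the private IP of the
# EC2 instance running Atlas):
#     print(can_connect("10.0.0.12", 21000))
```

If this returns False for the private IP as well, the security group inbound rule for port 21000 is the next thing to check.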