Airflow 调度程序无法使用 kubernetes 执行程序启动
Airflow scheduler fails to start with kubernetes executor
我正在使用 https://github.com/helm/charts/tree/master/stable/airflow helm chart 并构建安装了 kubernetes 的 v1.10.8 puckle/docker-airflow
映像,并在 helm chart 中使用该映像,
但我不断得到
File "/usr/local/bin/airflow", line 37, in <module>
args.func(args)
File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 1140, in initdb
db.initdb(settings.RBAC)
File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 332, in initdb
dagbag = models.DagBag()
File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 95, in __init__
executor = get_default_executor()
File "/usr/local/lib/python3.7/site-packages/airflow/executors/__init__.py", line 48, in get_default_executor
DEFAULT_EXECUTOR = _get_executor(executor_name)
File "/usr/local/lib/python3.7/site-packages/airflow/executors/__init__.py", line 87, in _get_executor
return KubernetesExecutor()
File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 702, in __init__
self.kube_config = KubeConfig()
File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 283, in __init__
self.kube_client_request_args = json.loads(kube_client_request_args)
File "/usr/local/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
在我的调度程序中,也正如各种消息来源所建议的那样,
我试过设置:
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: {"_request_timeout" : [60,60] }
在我的掌舵价值观中。那也没有用任何人有任何想法我错过了什么?
这是我的 values.yaml
airflow:
image:
repository: airflow-docker-local
tag: 1.10.8
executor: Kubernetes
service:
type: LoadBalancer
config:
AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.8
AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
AIRFLOW__KUBERNETES__NAMESPACE: airflow
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: {"_request_timeout" : [60,60] }
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://postgres:airflow@airflow-postgresql:5432/airflow
persistence:
enabled: true
existingClaim: ''
workers:
enabled: false
postgresql:
enabled: true
redis:
enabled: false
编辑:
在helm中设置环境变量的各种尝试values.yaml都没有用,之后我添加了(注意双引号和单引号)
ENV AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS='{"_request_timeout" : [60,60] }'
至 Dockerfile 此处:https://github.com/puckel/docker-airflow/blob/1.10.9/Dockerfile#L19
之后,我的 airflow-scheduler
pod 启动,但我的调度程序 pod 不断出现以下错误。
Process KubernetesJobWatcher-9: Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 313,
in recv_into return self.connection.recv_into(*args, **kwargs) File "/usr/local/lib/python3.7/site-packages/OpenSSL/SSL.py",
line 1840, in recv_into self._raise_ssl_error(self._ssl, result) File "/usr/local/lib/python3.7/site-packages/OpenSSL/SSL.py",
line 1646, in _raise_ssl_error raise WantReadError() OpenSSL.SSL.WantReadError
对于 helm 值,模板使用放置 airflow.config
地图 into double quotes "
的循环。这意味着值中的任何 "
都需要转义才能使输出模板化 YAML 有效。
airflow:
config:
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: '{\"_request_timeout\":60}'
部署并运行(但我还没有完成端到端测试)
根据 this github issue,python 调度程序 SSL 超时可能不是问题,因为观察程序会在 60 秒连接超时后再次启动。
我正在使用 https://github.com/helm/charts/tree/master/stable/airflow helm chart 并构建安装了 kubernetes 的 v1.10.8 puckle/docker-airflow
映像,并在 helm chart 中使用该映像,
但我不断得到
File "/usr/local/bin/airflow", line 37, in <module>
args.func(args)
File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 1140, in initdb
db.initdb(settings.RBAC)
File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 332, in initdb
dagbag = models.DagBag()
File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 95, in __init__
executor = get_default_executor()
File "/usr/local/lib/python3.7/site-packages/airflow/executors/__init__.py", line 48, in get_default_executor
DEFAULT_EXECUTOR = _get_executor(executor_name)
File "/usr/local/lib/python3.7/site-packages/airflow/executors/__init__.py", line 87, in _get_executor
return KubernetesExecutor()
File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 702, in __init__
self.kube_config = KubeConfig()
File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 283, in __init__
self.kube_client_request_args = json.loads(kube_client_request_args)
File "/usr/local/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
在我的调度程序中,也正如各种消息来源所建议的那样, 我试过设置:
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: {"_request_timeout" : [60,60] }
在我的掌舵价值观中。那也没有用任何人有任何想法我错过了什么?
这是我的 values.yaml
airflow:
image:
repository: airflow-docker-local
tag: 1.10.8
executor: Kubernetes
service:
type: LoadBalancer
config:
AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.8
AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
AIRFLOW__KUBERNETES__NAMESPACE: airflow
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: {"_request_timeout" : [60,60] }
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://postgres:airflow@airflow-postgresql:5432/airflow
persistence:
enabled: true
existingClaim: ''
workers:
enabled: false
postgresql:
enabled: true
redis:
enabled: false
编辑:
在helm中设置环境变量的各种尝试values.yaml都没有用,之后我添加了(注意双引号和单引号)
ENV AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS='{"_request_timeout" : [60,60] }'
至 Dockerfile 此处:https://github.com/puckel/docker-airflow/blob/1.10.9/Dockerfile#L19
之后,我的 airflow-scheduler
pod 启动,但我的调度程序 pod 不断出现以下错误。
Process KubernetesJobWatcher-9: Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 313,
in recv_into return self.connection.recv_into(*args, **kwargs) File "/usr/local/lib/python3.7/site-packages/OpenSSL/SSL.py",
line 1840, in recv_into self._raise_ssl_error(self._ssl, result) File "/usr/local/lib/python3.7/site-packages/OpenSSL/SSL.py",
line 1646, in _raise_ssl_error raise WantReadError() OpenSSL.SSL.WantReadError
对于 helm 值,模板使用放置 airflow.config
地图 into double quotes "
的循环。这意味着值中的任何 "
都需要转义才能使输出模板化 YAML 有效。
airflow:
config:
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: '{\"_request_timeout\":60}'
部署并运行(但我还没有完成端到端测试)
根据 this github issue,python 调度程序 SSL 超时可能不是问题,因为观察程序会在 60 秒连接超时后再次启动。