调度 Airflow DAG 作业
Scheduling AirfFlow DAG job
我写了一个 AirFlow DAG 如下 -
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'start_date': datetime(2016, 7, 5),
'email': ['airflow@airflow.com'],
'email_on_failure': False,
'email_on_retry': False,
'retries': 1,
'retry_delay': timedelta(seconds=30),
# 'queue': 'bash_queue',
# 'pool': 'backfill',
# 'priority_weight': 10,
# 'end_date': datetime(2016, 1, 1),
}
dag = DAG(
'test-air', default_args=default_args, schedule_interval='*/2 * * * *')
.................
.................
{{Tasks}}
根据上面的配置,作业应该 运行 每偶数分钟。但它显示在输出下方
airflow scheduler -d test-air
[2016-07-05 15:24:02,168] {jobs.py:574} INFO - Prioritizing 0 queued jobs
[2016-07-05 15:24:02,177] {jobs.py:726} INFO - Starting 0 scheduler jobs
[2016-07-05 15:24:02,177] {jobs.py:741} INFO - Done queuing tasks, calling the executor's heartbeat
[2016-07-05 15:24:02,177] {jobs.py:744} INFO - Loop took: 0.012636 seconds
[2016-07-05 15:24:02,256] {models.py:305} INFO - Finding 'running' jobs without a recent heartbeat
[2016-07-05 15:24:02,256] {models.py:311} INFO - Failing jobs without heartbeat after 2016-07-05 15:21:47.256816
[2016-07-05 15:24:07,177] {jobs.py:574} INFO - Prioritizing 0 queued jobs
[2016-07-05 15:24:07,182] {jobs.py:726} INFO - Starting 0 scheduler jobs
[2016-07-05 15:24:07,182] {jobs.py:741} INFO - Done queuing tasks, calling the executor's heartbeat
[2016-07-05 15:24:07,182] {jobs.py:744} INFO - Loop took: 0.007725 seconds
[2016-07-05 15:24:07,249] {models.py:305} INFO - Finding 'running' jobs without a recent heartbeat
[2016-07-05 15:24:07,249] {models.py:311} INFO - Failing jobs without heartbeat after 2016-07-05 15:21:52.249706
有人可以带我过去吗?
谢谢
帕里
默认情况下,创建的每个 dag 都处于 "pause" 模式。这是在您的 "airflow.cfg" 文件中定义的。
您可以通过
取消暂停您的 dag
$ airflow unpause test-air
然后使用调度程序重试。
您还可以从 Airflow webUI 切换您的 dag on/off(默认情况下它是关闭的)
我写了一个 AirFlow DAG 如下 -
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'start_date': datetime(2016, 7, 5),
'email': ['airflow@airflow.com'],
'email_on_failure': False,
'email_on_retry': False,
'retries': 1,
'retry_delay': timedelta(seconds=30),
# 'queue': 'bash_queue',
# 'pool': 'backfill',
# 'priority_weight': 10,
# 'end_date': datetime(2016, 1, 1),
}
dag = DAG(
'test-air', default_args=default_args, schedule_interval='*/2 * * * *')
.................
.................
{{Tasks}}
根据上面的配置,作业应该 运行 每偶数分钟。但它显示在输出下方
airflow scheduler -d test-air
[2016-07-05 15:24:02,168] {jobs.py:574} INFO - Prioritizing 0 queued jobs
[2016-07-05 15:24:02,177] {jobs.py:726} INFO - Starting 0 scheduler jobs
[2016-07-05 15:24:02,177] {jobs.py:741} INFO - Done queuing tasks, calling the executor's heartbeat
[2016-07-05 15:24:02,177] {jobs.py:744} INFO - Loop took: 0.012636 seconds
[2016-07-05 15:24:02,256] {models.py:305} INFO - Finding 'running' jobs without a recent heartbeat
[2016-07-05 15:24:02,256] {models.py:311} INFO - Failing jobs without heartbeat after 2016-07-05 15:21:47.256816
[2016-07-05 15:24:07,177] {jobs.py:574} INFO - Prioritizing 0 queued jobs
[2016-07-05 15:24:07,182] {jobs.py:726} INFO - Starting 0 scheduler jobs
[2016-07-05 15:24:07,182] {jobs.py:741} INFO - Done queuing tasks, calling the executor's heartbeat
[2016-07-05 15:24:07,182] {jobs.py:744} INFO - Loop took: 0.007725 seconds
[2016-07-05 15:24:07,249] {models.py:305} INFO - Finding 'running' jobs without a recent heartbeat
[2016-07-05 15:24:07,249] {models.py:311} INFO - Failing jobs without heartbeat after 2016-07-05 15:21:52.249706
有人可以带我过去吗?
谢谢 帕里
默认情况下,创建的每个 dag 都处于 "pause" 模式。这是在您的 "airflow.cfg" 文件中定义的。 您可以通过
取消暂停您的 dag$ airflow unpause test-air
然后使用调度程序重试。
您还可以从 Airflow webUI 切换您的 dag on/off(默认情况下它是关闭的)