Azkaban flow running a Selenium automation Python script fails after about twenty minutes, and the system becomes very slow

I am running a Python script in Azkaban.

Environment:
CentOS 8.1
Azkaban 3.90.0
Python 3.6.8
ChromeDriver 84.0.4147.30

In the test.flow file:

nodes:
  - name: job_test
    type: command
    config:
      command: python3 /home/azkaban/python_codes/pyib/activity/pickgoods.py

About twenty minutes after this flow starts executing, the system becomes very slow and the execution fails. The job log shows:

28-07-2020 18:30:40 CST job_test INFO - Process with id 1403 completed unsuccessfully in 1727 seconds.
28-07-2020 18:30:40 CST job_test ERROR - Job run failed!
java.lang.RuntimeException: azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1
    at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:312)
    at azkaban.execapp.JobRunner.runJob(JobRunner.java:830)
    at azkaban.execapp.JobRunner.doRun(JobRunner.java:607)
    at azkaban.execapp.JobRunner.run(JobRunner.java:568)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1
    at azkaban.jobExecutor.utils.process.AzkabanProcess.run(AzkabanProcess.java:125)
    at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:304)
    ... 8 more
28-07-2020 18:30:40 CST job_test ERROR - azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1 cause: azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1
28-07-2020 18:30:40 CST job_test INFO - Finishing job job_test at 1595932240480 with status FAILED

And azkaban-webserver.log on the azkaban-web-server shows:

2020/07/28 21:00:34.127 +0800  INFO [ExecutorManager] [AzkabanWebServer-QueueProcessor-Thread] [Azkaban] Successfully refreshed executor: iZbp1hb3esnbp3levrcg05Z:36037 (id: 16), active=true with executor info : ExecutorInfo{remainingMemoryPercent=45.705342424456234, remainingMemoryInMB=835, remainingFlowCapacity=30, numberOfAssignedFlows=0, lastDispatchedTime=1595936723440, cpuUsage=0.01}
2020/07/28 21:00:34.128 +0800 ERROR [ExecutorManager] [AzkabanWebServer-QueueProcessor-Thread] [Azkaban] Failed to update ExecutorInfo for executor : iZbp1hb3esnbp3levrcg05Z:44085 (id: 17), active=true
java.util.concurrent.ExecutionException: org.apache.http.conn.HttpHostConnectException: Connect to iZbp1hb3esnbp3levrcg05Z:44085 [iZbp1hb3esnbp3levrcg05Z/172.16.184.105] failed: Connection refused (Connection refused)

Can anyone help me solve this?

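Your job process crashed ("Process exited with code 1" means the python3 command itself returned a failure status). You can find its error log in the web UI to debug further; see https://azkaban.readthedocs.io/en/latest/useAzkaban.html#job-logs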
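
If the job log only shows the exit code, one option is to make the script itself report why it failed. The following is a minimal sketch, assuming Azkaban captures the command's stdout/stderr in the job log; `run_and_report` and `pick_goods` are hypothetical names for illustration and are not taken from pickgoods.py.

```python
import logging
import sys
import traceback


def run_and_report(main):
    """Run main(); on any exception, log the full traceback and return 1.

    Hypothetical helper: it only makes the failure reason visible on stderr
    and turns it into a process exit status.
    """
    logging.basicConfig(
        stream=sys.stderr,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    try:
        main()
    except Exception:
        # format_exc() returns the traceback of the exception being handled.
        logging.error("job failed:\n%s", traceback.format_exc())
        return 1
    logging.info("job finished normally")
    return 0


def pick_goods():
    # Hypothetical placeholder for the real Selenium logic in pickgoods.py.
    raise RuntimeError("example failure")


# At the bottom of the real script one would call, for example:
#     sys.exit(run_and_report(pick_goods))
```

With this in place, a failing run would still exit with code 1, but the traceback written just before exit should appear in the job log and point at the line that failed.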