Spark exit status 134: what does it mean?

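When I run my job, some of its tasks fail with the error below, yet the job as a whole completes successfully and exits. What does this mean? Can I trust the results?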

ExecutorLostFailure (executor 8 exited caused by one of the running tasks) Reason: Container from a bad node: container_1610292825631_0097_01_000013 on host: ip-xx-xxx-xx-xx.us.aws.xxxx.com. Exit status: 134. Diagnostics: e 44.0 (TID 16633)

Container exited with a non-zero exit code 134. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
/bin/bash: line 1: 16507 Aborted
Last 4096 bytes of stderr :
 task 422.0 in stage 44.0 (TID 16633)
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Getting 56 non-empty blocks including 12 local blocks and 44 remote blocks
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 2 ms
21/01/25 17:25:50 INFO Executor: Finished task 422.0 in stage 44.0 (TID 16633). 6435 bytes result sent to driver
21/01/25 17:25:50 INFO CoarseGrainedExecutorBackend: Got assigned task 16639
21/01/25 17:25:50 INFO Executor: Running task 433.0 in stage 44.0 (TID 16639)
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Getting 95 non-empty blocks including 9 local blocks and 86 remote blocks
21/01/25 17:25:50 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:51 INFO Executor: Finished task 383.0 in stage 44.0 (TID 16579). 6478 bytes result sent to driver
21/01/25 17:25:51 INFO CoarseGrainedExecutorBackend: Got assigned task 16661
21/01/25 17:25:51 INFO Executor: Running task 471.0 in stage 44.0 (TID 16661)
21/01/25 17:25:51 INFO ShuffleBlockFetcherIterator: Getting 200 non-empty blocks including 30 local blocks and 170 remote blocks
21/01/25 17:25:51 INFO ShuffleBlockFetcherIterator: Started 6 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 319.0 in stage 44.0 (TID 16555). 6478 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16675
21/01/25 17:25:52 INFO Executor: Running task 482.0 in stage 44.0 (TID 16675)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 25 non-empty blocks including 5 local blocks and 20 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 482.0 in stage 44.0 (TID 16675). 6435 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16679
21/01/25 17:25:52 INFO Executor: Running task 491.0 in stage 44.0 (TID 16679)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 138 non-empty blocks including 19 local blocks and 119 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 433.0 in stage 44.0 (TID 16639). 6521 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16684
21/01/25 17:25:52 INFO Executor: Running task 493.0 in stage 44.0 (TID 16684)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 190 non-empty blocks including 29 local blocks and 161 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:52 INFO Executor: Finished task 491.0 in stage 44.0 (TID 16679). 6435 bytes result sent to driver
21/01/25 17:25:52 INFO CoarseGrainedExecutorBackend: Got assigned task 16685
21/01/25 17:25:52 INFO Executor: Running task 500.0 in stage 44.0 (TID 16685)
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Getting 51 non-empty blocks including 12 local blocks and 39 remote blocks
21/01/25 17:25:52 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:54 INFO Executor: Finished task 500.0 in stage 44.0 (TID 16685). 6478 bytes result sent to driver
21/01/25 17:25:54 INFO CoarseGrainedExecutorBackend: Got assigned task 16714
21/01/25 17:25:54 INFO Executor: Running task 524.0 in stage 44.0 (TID 16714)
21/01/25 17:25:54 INFO ShuffleBlockFetcherIterator: Getting 114 non-empty blocks including 17 local blocks and 97 remote blocks
21/01/25 17:25:54 INFO ShuffleBlockFetcherIterator: Started 7 remote fetches in 1 ms
21/01/25 17:25:59 INFO Executor: Finished task 471.0 in stage 44.0 (TID 16661). 6478 bytes result sent to driver
21/01/25 17:25:59 INFO CoarseGrainedExecutorBackend: Got assigned task 16767
21/01/25 17:25:59 INFO Executor: Running task 536.0 in stage 44.0 (TID 16767)
21/01/25 17:25:59 INFO ShuffleBlockFetcherIterator: Getting 110 non-empty blocks including 16 local blocks and 94 remote blocks
21/01/25 17:25:59 INFO ShuffleBlockFetcherIterator: Started 5 remote fetches in 1 ms

TL;DR: You can trust the results.

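Spark has built-in fault tolerance: it retries failed tasks on other available nodes. Your failed tasks were retried on another node/executor, and their output is included in your final result. So yes, you can trust the results.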

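As for the error, exit status 134 means the process was terminated by the SIGABRT signal (134 = 128 + 6, and 6 is SIGABRT on Linux), which matches the "Aborted" line in prelaunch.err. As the error message says, this may be because the container was launched on a blacklisted (bad) node. A blacklisted node is one that YARN has marked as unfit to run containers.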
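
To show where the number 134 comes from, here is a minimal Python sketch of the common shell convention that a process killed by signal N is reported with exit status 128 + N. The helper name `describe_exit_status` is made up for this illustration and is not part of Spark or YARN. Signal numbers are platform-dependent, so the signal name is looked up on the machine running the script.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper, not a Spark/YARN API)."""
    if status > 128:
        # Shell convention: 128 + N means "terminated by signal N".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # No signal with this number on the current platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(134))
```

On Linux, signal 6 is SIGABRT, so the status from the question decodes to an abort of the executor process. This explains only how the process died, not why it aborted.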