How can I access the first attempt's YARN log?

If I append an _attemptid suffix, will I get the logs for that attempt? Like this:

yarn logs -applicationId application_11112222333333_444444_1

Strangely, I have not found an answer to this question online.

Update: let me rephrase my question: how can I access the YARN logs for a given attempt?

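Here is a somewhat ugly solution that takes a few steps (for hadoop-2.6). Basically, each attempt runs in its own container. To get the logs for a specific container, you need to know the applicationId, the containerId, and the node manager address. For example, suppose you need the logs of appattempt_1: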
  1. Get the appattempt details (containerId, host URL): yarn applicationattempt -list application_ID_1. You will get something like this:
 ======================== ======== ==================== ===========================
  ApplicationAttempt-Id    State    AM-Container-Id            Tracking-URL
 ======================== ======== ==================== ===========================
  appattempt_1             FAILED   container_1          https://host1:8090/blabla
  appattempt_2             KILLED   container_2          https://host2:8090/blabla
 ======================== ======== ==================== ===========================
  2. Convert the Tracking-URL host into a node address: $ yarn node -list -all | grep host1 | awk '{print $1}' prints host1:8041

  3. Fetch the container log: yarn logs -applicationId application_ID_1 -containerId container_1 -nodeAddress host1:8041
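The three steps above could be chained in a small script. This is only a sketch: the function names are made up for illustration, it assumes the whitespace-separated table layout shown above, and it assumes (as the steps do) that the Tracking-URL host is the node that ran the AM container.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the yarn CLI.

# Reads `yarn applicationattempt -list <appId>` output on stdin and prints
# "<AM-Container-Id> <host>" for the attempt id given as $1.
attempt_container_and_host() {
  awk -v attempt="$1" '
    $1 == attempt {
      host = $4
      sub("^[a-z]+://", "", host)   # drop the URL scheme
      sub("[:/].*$", "", host)      # drop the port and path
      print $3, host
    }'
}

# Reads `yarn node -list -all` output on stdin and prints the Node-Id
# (host:port) whose host part equals $1.
node_address_for_host() {
  awk -v host="$1" 'index($1, host ":") == 1 { print $1; exit }'
}

# $1: application id, $2: attempt id. Runs in a subshell to keep variables local.
fetch_attempt_log() (
  info=$(yarn applicationattempt -list "$1" | attempt_container_and_host "$2")
  [ -n "$info" ] || { echo "attempt not found: $2" >&2; exit 1; }
  container=${info%% *}
  host=${info##* }
  node=$(yarn node -list -all | node_address_for_host "$host")
  [ -n "$node" ] || { echo "no node manager found for host: $host" >&2; exit 1; }
  yarn logs -applicationId "$1" -containerId "$container" -nodeAddress "$node"
)
```

With the example values above, the call would be: fetch_attempt_log application_ID_1 appattempt_1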

In hadoop-2.7 you can simply use:

yarn logs -applicationId <application ID> [OPTIONS]

general options are:
 -am <AM Containers>      Prints the AM Container logs for
                          this application. Specify
                          comma-separated value to get logs
                          for related AM Container. For
                          example, If we specify -am 1,2,
                          we will get the logs for the
                          first AM Container as well as the
                          second AM Container. To get logs
                          for all AM Containers, use -am
                          ALL. To get logs for the latest
                          AM Container, use -am -1. By
                          default, it will print all
                          available logs. Work with
                          -log_files to get only specific
                          logs.
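Going by the help text above, the first attempt's AM container log would then be a single call with -am 1. A minimal sketch: the function name and the YARN_CMD override are hypothetical, added only so the command can be dry-run without a cluster.

```shell
#!/bin/sh
# Sketch only: first_attempt_am_log and YARN_CMD are illustrative names.
# $1: application id, e.g. application_ID_1
first_attempt_am_log() {
  # -am 1 selects the first AM container, as described in the help text.
  ${YARN_CMD:-yarn} logs -applicationId "$1" -am 1
}
```

Setting YARN_CMD=echo before calling first_attempt_am_log application_ID_1 prints the arguments that would be passed to yarn instead of running it.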