Hadoop 2.7.5 - AM Container launch error

I have seen multiple questions related to this AM Container launch error, but none of them solved the problem for me.

I have installed Hadoop 2.7.5 on my macOS High Sierra laptop and am trying the Pi example MapReduce job:

hadoop jar /usr/local/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 2 4

I have all the services running:

$ jps
69555 NameNode
69954 NodeManager
69750 SecondaryNameNode
70806 JobHistoryServer
69643 DataNode
71194 Jps
69866 ResourceManager

This is the output I get:

$ hadoop jar /usr/local/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi  2 4
Number of Maps  = 2
Samples per Map = 4
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/tez/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/03/25 13:30:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
18/03/25 13:30:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/03/25 13:30:43 INFO input.FileInputFormat: Total input paths to process : 2
18/03/25 13:30:43 INFO mapreduce.JobSubmitter: number of splits:2
18/03/25 13:30:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1521963635636_0004
18/03/25 13:30:44 INFO impl.YarnClientImpl: Submitted application application_1521963635636_0004
18/03/25 13:30:44 INFO mapreduce.Job: The url to track the job: http://AbdealiJK-Mac.local:8088/proxy/application_1521963635636_0004/
18/03/25 13:30:44 INFO mapreduce.Job: Running job: job_1521963635636_0004
18/03/25 13:30:51 INFO mapreduce.Job: Job job_1521963635636_0004 running in uber mode : false
18/03/25 13:30:51 INFO mapreduce.Job:  map 0% reduce 0%
18/03/25 13:30:51 INFO mapreduce.Job: Job job_1521963635636_0004 failed with state FAILED due to: Application application_1521963635636_0004 failed 2 times due to AM Container for appattempt_1521963635636_0004_000002 exited with  exitCode: -1
For more detailed output, check application tracking page:http://AbdealiJK-Mac.local:8088/cluster/app/application_1521963635636_0004Then, click on links to logs of each attempt.
Diagnostics: File /Users/abdealijk/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004/container_1521963635636_0004_02_000001 does not exist
Failing this attempt. Failing the application.
18/03/25 13:30:51 INFO mapreduce.Job: Counters: 0
Job Finished in 7.986 seconds
java.io.FileNotFoundException: File does not exist: hdfs://localhost/user/abdealijk/QuasiMonteCarlo_1521964841970_1162968685/out/reduce-out
    at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1309)
    at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1820)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1843)
    at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
    at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:355)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

The error seems to be:

File /Users/abdealijk/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004/container_1521963635636_0004_02_000001 does not exist

But when I check it:

$ ls -lh ~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004
total 0
drwxr-xr-x  6 abdealijk  staff   192B Mar 25 13:30 filecache

I have write permissions, I own that folder, and so on, but the container folder is still never created there.
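
To double-check writability, a simple test of the form below (path taken from the diagnostics above; the canary file name is just an example) completes without errors:

$ touch ~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004/canary
$ rm ~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004/canary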

Edit 1: Logs via the YARN-RM webUI / yarn command

I have tried checking the logs in the YARN-RM webUI as well as with yarn logs -applicationId, but both say that no logs could be found because the AM container never started.
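
For reference, the log-retrieval command I mean is of the form below, with the application ID taken from the run above:

$ yarn logs -applicationId application_1521963635636_0004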

Edit 2: This is what I get inside that folder

$ tree ~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1522077498598_0003
~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1522077498598_0003
└── filecache
    ├── 10
    │   └── job.splitmetainfo
    ├── 11
    │   └── job.jar
    │       └── job.jar
    ├── 12
    │   └── job.split
    └── 13
        └── job.xml

6 directories, 4 files

There is no folder for the container :(

Edit 3: My core-site.xml has:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <!-- <value>hdfs://localhost/</value> -->
        <value>hdfs://localhost:8020/</value>
    </property>
</configuration>

I have tried both hdfs://localhost/ and hdfs://localhost:8020/.

I think this may be an issue with the URI.
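
A quick way to confirm which filesystem URI is actually in effect is hdfs getconf, which prints the resolved value from core-site.xml:

$ hdfs getconf -confKey fs.defaultFS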

Here is how I fixed it:

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>${user.home}/hadoop/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>${user.home}/hadoop/hdfs/namenode</value>
  </property>
</configuration>

core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-tmpdir</value>
  </property>
</configuration>
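
(Setting hadoop.tmp.dir matters here because, unless overridden, yarn.nodemanager.local-dirs defaults to ${hadoop.tmp.dir}/nm-local-dir, i.e. the very directory the diagnostics complained about.)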

And then running:

$ rm -rf ~/hadoop  # To delete my previous folders
$ mkdir -p ~/hadoop/hdfs/namenode
$ mkdir -p ~/hadoop/hdfs/datanode
$ hdfs namenode -format
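
After reformatting, the daemons also have to be restarted so they pick up the new settings; with the stock sbin scripts on the PATH, that is roughly:

$ stop-yarn.sh && stop-dfs.sh
$ start-dfs.sh && start-yarn.sh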

Running the command now lets the container launch successfully.