Hadoop 2.7.5 - AM Container launch error
I have seen multiple questions about AM container launch errors, but none of them solved this problem for me.
I have installed Hadoop 2.7.5 on my macOS High Sierra laptop and am trying the Pi example MapReduce job:
hadoop jar /usr/local/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 2 4
I have all the services running:
$ jps
69555 NameNode
69954 NodeManager
69750 SecondaryNameNode
70806 JobHistoryServer
69643 DataNode
71194 Jps
69866 ResourceManager
Here is the output I get:
$ hadoop jar /usr/local/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 2 4
Number of Maps = 2
Samples per Map = 4
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/tez/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/03/25 13:30:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
18/03/25 13:30:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/03/25 13:30:43 INFO input.FileInputFormat: Total input paths to process : 2
18/03/25 13:30:43 INFO mapreduce.JobSubmitter: number of splits:2
18/03/25 13:30:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1521963635636_0004
18/03/25 13:30:44 INFO impl.YarnClientImpl: Submitted application application_1521963635636_0004
18/03/25 13:30:44 INFO mapreduce.Job: The url to track the job: http://AbdealiJK-Mac.local:8088/proxy/application_1521963635636_0004/
18/03/25 13:30:44 INFO mapreduce.Job: Running job: job_1521963635636_0004
18/03/25 13:30:51 INFO mapreduce.Job: Job job_1521963635636_0004 running in uber mode : false
18/03/25 13:30:51 INFO mapreduce.Job: map 0% reduce 0%
18/03/25 13:30:51 INFO mapreduce.Job: Job job_1521963635636_0004 failed with state FAILED due to: Application application_1521963635636_0004 failed 2 times due to AM Container for appattempt_1521963635636_0004_000002 exited with exitCode: -1
For more detailed output, check application tracking page:http://AbdealiJK-Mac.local:8088/cluster/app/application_1521963635636_0004Then, click on links to logs of each attempt.
Diagnostics: File /Users/abdealijk/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004/container_1521963635636_0004_02_000001 does not exist
Failing this attempt. Failing the application.
18/03/25 13:30:51 INFO mapreduce.Job: Counters: 0
Job Finished in 7.986 seconds
java.io.FileNotFoundException: File does not exist: hdfs://localhost/user/abdealijk/QuasiMonteCarlo_1521964841970_1162968685/out/reduce-out
at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1820)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1843)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:355)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
The error seems to be:
File /Users/abdealijk/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004/container_1521963635636_0004_02_000001 does not exist
But when I check it:
$ ls -lh ~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1521963635636_0004
total 0
drwxr-xr-x 6 abdealijk staff 192B Mar 25 13:30 filecache
I have write permissions, I own that folder, and so on. But the container folder is still never created there.
Edit 1: Logs via the YARN-RM web UI / yarn command
I have tried checking the logs in the YARN-RM web UI as well as with `yarn logs -applicationId`,
but both say no logs were found, because the AM container never launched.
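For reference, the log-retrieval commands I tried were along these lines (a sketch; the `HADOOP_HOME` guard is only there so it is a safe no-op on a machine without Hadoop installed):

```shell
# Pull AM/container logs for the failing application.
# The app id comes from the job output above.
APP_ID=application_1521963635636_0004
if [ -n "${HADOOP_HOME:-}" ]; then
  yarn logs -applicationId "$APP_ID"       # aggregated container logs
  yarn application -status "$APP_ID"       # RM-side status and diagnostics
else
  echo "HADOOP_HOME not set; skipping"
fi
```

Both commands come back empty here, since log aggregation only has something to collect once a container has actually started.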
Edit 2: Here is what I get in that folder
$ tree ~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1522077498598_0003
~/hadoop/nm-local-dir/usercache/abdealijk/appcache/application_1522077498598_0003
└── filecache
    ├── 10
    │   └── job.splitmetainfo
    ├── 11
    │   └── job.jar
    │       └── job.jar
    ├── 12
    │   └── job.split
    └── 13
        └── job.xml

6 directories, 4 files
No folder for the container :(
Edit 3: My core-site.xml has:
<configuration>
<property>
<name>fs.defaultFS</name>
<!-- <value>hdfs://localhost/</value> -->
<value>hdfs://localhost:8020/</value>
</property>
</configuration>
I have tried both `hdfs://localhost/` and `hdfs://localhost:8020/`.
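A quick way to see which default-filesystem URI is actually in effect: on a live install, `hdfs getconf -confKey fs.defaultFS` reports it directly; the `sed` below is a portable fallback that reads it out of core-site.xml, shown here against a scratch copy so the sketch is self-contained (point `CONF` at `$HADOOP_CONF_DIR/core-site.xml` on a real machine):

```shell
# Extract the fs.defaultFS value from a core-site.xml.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020/</value>
  </property>
</configuration>
EOF
# Print the <value> on the line after the fs.defaultFS <name> element.
sed -n '/<name>fs.defaultFS<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$CONF"
# prints: hdfs://localhost:8020/
```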
I think it may have been an issue with the URI.
Here is what fixed it:
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>${user.home}/hadoop/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>${user.home}/hadoop/hdfs/namenode</value>
</property>
</configuration>
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000/</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-tmpdir</value>
</property>
</configuration>
Run:
$ rm -rf ~/hadoop # To delete my previous folders
$ mkdir -p ~/hadoop/hdfs/namenode
$ mkdir -p ~/hadoop/hdfs/datanode
$ hdfs namenode -format
And now running the command lets the container launch successfully.
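The reset sequence above can be scripted; this sketch uses a scratch prefix so it can be dry-run safely (set `PREFIX="$HOME"` to apply it for real), and the format/restart steps are left as comments since they need a live install:

```shell
# Recreate the local storage layout referenced by hdfs-site.xml above.
# PREFIX is a throwaway directory here; use PREFIX="$HOME" on the real machine.
PREFIX=$(mktemp -d)
rm -rf "$PREFIX/hadoop"                      # wipe any stale state
mkdir -p "$PREFIX/hadoop/hdfs/namenode" \
         "$PREFIX/hadoop/hdfs/datanode"
ls "$PREFIX/hadoop/hdfs"                     # both directories should now exist
# On the real install, follow up with:
#   hdfs namenode -format
#   stop-dfs.sh && start-dfs.sh    # restart so the daemons pick up the new dirs
```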