YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register

Question

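I am new to the Hadoop ecosystem.

I recently tried Hadoop (2.7.1) on a single-node cluster without any problems, and decided to move on to a multi-node cluster with 1 namenode and 2 datanodes.

However, I am facing a weird issue. Whatever job I try to run gets stuck with the following message:

On the web interface: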

YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register

On the command line:

16/01/05 17:52:53 INFO mapreduce.Job: Running job: job_1451083949804_0001

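The jobs don't even start, and at this point I'm not sure what I need to change to make this work.

Here's what I have tried so far:

- Disabling the firewall on all nodes
- Setting lower resource limits
- Configuring under different machines, routers and distros

I would really appreciate any help (even a minute hint) in the right direction.

I have followed these instructions (configuration):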

Answer 1

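I finally got this solved. Posting the detailed steps for future reference. (Only for a test environment.)

Hadoop (2.7.1) Multi-Node cluster configuration

Make sure that you have a reliable network without host isolation. Static IP assignment is preferable, or at least a very long DHCP lease. Additionally, all nodes (Namenode/master & Datanodes/slaves) should have a common user account with the same password; if they don't, create such a user account on all nodes. Having the same username and password on all nodes makes things a bit less complicated.
[On all machines] First configure all nodes for a single-node cluster. You can use what I posted here.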

Execute these commands in a new terminal

[On all machines] ↴

stop-dfs.sh;stop-yarn.sh;jps
rm -rf /tmp/hadoop-$USER

[On Namenode/master only] ↴

rm -rf ~/hadoop_store/hdfs/datanode

[On Datanodes/slaves only] ↴

rm -rf ~/hadoop_store/hdfs/namenode

[On all machines] Add the IP addresses and corresponding hostnames of all nodes in the cluster.

sudo nano /etc/hosts

hosts

xxx.xxx.xxx.xxx master
xxx.xxx.xxx.xxy slave1
xxx.xxx.xxx.xxz slave2
# Additionally you may need to remove lines like "xxx.xxx.xxx.xxx localhost", "xxx.xxx.xxx.xxy localhost", "xxx.xxx.xxx.xxz localhost" etc if they exist.
# However, it's okay to keep lines like "127.0.0.1 localhost" and others.
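
The rule in the comments above can be checked mechanically. Below is a minimal sketch, assuming a standard whitespace-separated hosts file; the function name `suspicious_localhost_lines` is made up for illustration and is not part of Hadoop. It prints every non-loopback address that the file maps to `localhost`.

```shell
# Print each non-loopback address that a hosts file maps to "localhost".
# Usage: suspicious_localhost_lines HOSTS_FILE
# (hypothetical helper name, not a Hadoop tool)
suspicious_localhost_lines() (
    # Drop comments first, then inspect the remaining "IP name name..." lines.
    sed 's/#.*//' "$1" | awk '
        $1 !~ /^127\./ && $1 != "::1" {
            for (i = 2; i <= NF; i++)
                if ($i == "localhost") { print $1; break }
        }'
)

# Example:
#   suspicious_localhost_lines /etc/hosts
```

An empty result means there is nothing to remove; any address printed points at a line worth deleting or editing.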

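[On all machines] Configure iptables

Allow the default or custom ports that you plan to use for the various Hadoop daemons through the firewall

OR

Much easier, disable iptables
- On RedHat-like distros (Fedora, CentOS)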
```
sudo systemctl disable firewalld
sudo systemctl stop firewalld
```
- On Debian-like distros (Ubuntu)
```
sudo ufw disable
```
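
If switching the firewall off completely feels too blunt, a middle ground (a different technique from the two options above, and only a sketch) is to keep ufw enabled and accept all traffic from the other cluster nodes. The helper name `ufw_rules_for_peers` and the addresses in the example are placeholders. The function only prints the commands so that they can be reviewed before running them.

```shell
# Print (do not run) one "ufw allow from" rule per peer address.
# Usage: ufw_rules_for_peers IP...
# (hypothetical helper; the IPs you pass are your own cluster addresses)
ufw_rules_for_peers() (
    for ip in "$@"; do
        printf 'sudo ufw allow from %s\n' "$ip"
    done
)

# Example with placeholder addresses; pipe to sh once the output looks right:
#   ufw_rules_for_peers 192.168.0.10 192.168.0.11
#   ufw_rules_for_peers 192.168.0.10 192.168.0.11 | sh
```

Allowing by source address rather than by port avoids having to track the ports that YARN picks dynamically, at the cost of trusting every process on those hosts.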
[On Namenode/master only] Gain ssh access to all Datanodes (slaves) from the Namenode (master).
```
ssh-copy-id -i ~/.ssh/id_rsa.pub $USER@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub $USER@slave2
```
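Confirm by running ping slave1, ssh slave1, ping slave2, ssh slave2, etc. You should get a proper response. (Remember to exit each ssh session by typing exit or closing the terminal. To be on the safer side, I also made sure that all nodes were able to reach each other, not just the Namenode/master.)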
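
The manual ping/ssh round can be scripted. This is a rough sketch, assuming the hostnames slave1 and slave2 from the hosts file and key-based login already set up with ssh-copy-id; `check_node` is a made-up helper name. `BatchMode=yes` makes ssh fail instead of prompting for a password, so a missing key shows up as an error rather than a hang.

```shell
# Check that a host answers ping and accepts passwordless ssh.
# Usage: check_node HOSTNAME
# (hypothetical helper name)
check_node() (
    host=$1
    ping -c 1 "$host" >/dev/null 2>&1 \
        || { echo "$host: ping failed"; exit 1; }
    # BatchMode=yes: never prompt for a password, fail instead.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1 \
        || { echo "$host: passwordless ssh failed"; exit 1; }
    echo "$host: ok"
)

# Example, run from the master:
#   for h in slave1 slave2; do check_node "$h"; done
```

Running the same loop on each slave (with the other hostnames) covers the "all nodes can reach each other" check as well.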

[On all machines] Edit the core-site.xml file

nano /usr/local/hadoop/etc/hadoop/core-site.xml

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
        <description>NameNode URI</description>
    </property>
</configuration>
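
fs.defaultFS is normally written as a full URI that includes the hdfs:// scheme, for example hdfs://master:9000. A small sanity check, as a sketch: `is_hdfs_uri` is a made-up helper, and the pattern only covers the simple host:port form used in this setup. `hdfs getconf -confKey` prints the value Hadoop actually resolved from the config files.

```shell
# Succeed only if the argument looks like hdfs://host:port
# (hypothetical helper; deliberately strict, simple host:port form only)
is_hdfs_uri() {
    printf '%s\n' "$1" | grep -Eq '^hdfs://[A-Za-z0-9.-]+:[0-9]+/?$'
}

# Example, on any node after editing core-site.xml:
#   value=$(hdfs getconf -confKey fs.defaultFS)
#   is_hdfs_uri "$value" || echo "unexpected fs.defaultFS: $value"
```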

[On all machines] Edit the yarn-site.xml file

nano /usr/local/hadoop/etc/hadoop/yarn-site.xml

yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
        <description>The hostname of the RM.</description>
    </property>
    <property>
         <name>yarn.nodemanager.aux-services</name>
         <value>mapreduce_shuffle</value>
    </property>
    <property>
         <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
         <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

[On all machines] Modify the slaves file: remove the text "localhost" and add the slave hostnames
```
nano /usr/local/hadoop/etc/hadoop/slaves
```
slaves
```
slave1
slave2
```
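(I guess having this only on the Namenode/master would also work, but I did it on all machines anyway. Also note that in this configuration the master behaves only as the resource manager, which is how I intended it to be.)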
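[On all machines] Modify the hdfs-site.xml file to change the value of the property dfs.replication to something > 1 (up to the number of slaves in the cluster; here I have two slaves, so I set it to 2)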
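
HDFS cannot place more replicas of a block than there are DataNodes, so the slaves file gives a natural upper bound for dfs.replication. A minimal sketch of that check follows; the helper names `count_slaves` and `replication_fits` are made up, and the slaves path is the one used in this answer.

```shell
# Count hostnames in a slaves file, ignoring blank lines and # comments.
# (hypothetical helper names throughout)
count_slaves() (
    grep -Ev '^[[:space:]]*(#|$)' "$1" | wc -l | tr -d ' '
)

# Succeed if 1 <= REPLICATION <= number of slaves.
# Usage: replication_fits REPLICATION SLAVES_FILE
replication_fits() (
    [ "$1" -ge 1 ] && [ "$1" -le "$(count_slaves "$2")" ]
)

# Example:
#   r=$(hdfs getconf -confKey dfs.replication)
#   replication_fits "$r" /usr/local/hadoop/etc/hadoop/slaves \
#       || echo "dfs.replication=$r does not fit the slaves file"
```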
[On Namenode/master only] (Re)format the HDFS through the namenode
```
hdfs namenode -format
```
[Optional]
- Remove the dfs.datanode.data.dir property from the master's hdfs-site.xml file.
- Remove the dfs.namenode.name.dir property from all the slaves' hdfs-site.xml files.

Testing (execute on Namenode/master only)

start-dfs.sh;start-yarn.sh

echo "hello world hello Hello" > ~/Downloads/test.txt

hadoop fs -mkdir /input

hadoop fs -put ~/Downloads/test.txt /input

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /input /output

Wait for a few seconds and the mappers and reducers should begin.
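
If the job still sits in ACCEPTED at this point, one thing worth checking is whether any NodeManager has registered with the ResourceManager, since the AM container has to be placed on one of them. `yarn node -list` prints the registered nodes. The sketch below counts the RUNNING ones; it assumes the node state is the second whitespace-separated column of that output, and `count_running_nodes` is a made-up helper name.

```shell
# Count nodes reported as RUNNING in "yarn node -list" output read from stdin.
# Assumption: the node state is the second whitespace-separated column.
# (hypothetical helper name)
count_running_nodes() {
    awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Example, on the master:
#   yarn node -list 2>/dev/null | count_running_nodes
```

With two slaves configured as above, I would expect this to report 2; a result of 0 suggests looking at the NodeManager logs on the slaves and at the hosts and firewall steps again.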

These links helped me with the issue:

Hadoop YARN Installation: The definitive guide#Cluster Installation

Answer 2

I ran into the same problem when running

"hadoop jar hadoop-mapreduce-examples-2.6.4.jar wordcount /calculateCount/ /output"

The command hung at that point.

I tracked the job and found "there are 15 missing blocks, and they are all corrupted"

Then I did the following:
1) ran "hdfs fsck /"
2) ran "hdfs fsck / -delete" (note that this deletes the corrupted files)
3) added "-A INPUT -p tcp -j ACCEPT" to /etc/sysconfig/iptables on both datanodes
4) ran "stop-all.sh" and then "start-all.sh"

After that, everything went well.

I think the firewall was the key point.

linux

hadoop

mapreduce

distributed-computing

hadoop-yarn